PDtoMSstatsTMTFormat.Rd
Convert Proteome Discoverer output into the required input format for MSstatsTMT.
PDtoMSstatsTMTFormat( input, annotation, which.proteinid = "Protein.Accessions", useNumProteinsColumn = TRUE, useUniquePeptide = TRUE, rmPSM_withMissing_withinRun = FALSE, rmPSM_withfewMea_withinRun = TRUE, rmProtein_with1Feature = FALSE, summaryforMultipleRows = sum )
input | Name of the data object holding the Proteome Discoverer PSM output. |
---|---|
annotation | Data frame that contains the columns Run, Fraction, TechRepMixture, Mixture, Channel, BioReplicate, Condition. Refer to the example 'annotation.pd' for the meaning of each column. |
which.proteinid | Column to use for the protein name. 'Protein.Accessions' is the default. 'Master.Protein.Accessions' can be used instead to get a protein name that refers to a single protein. |
useNumProteinsColumn | TRUE (default) removes shared peptides using the information in the # Proteins column of the PSM sheet. |
useUniquePeptide | TRUE (default) removes peptides that are assigned to more than one protein. Each protein is assumed to be quantified from unique peptides. |
rmPSM_withMissing_withinRun | TRUE removes PSMs with any missing value within each Run. Default is FALSE. |
rmPSM_withfewMea_withinRun | Applies only when rmPSM_withMissing_withinRun = FALSE. TRUE (default) removes the features that have only 1 or 2 measurements within each Run. |
rmProtein_with1Feature | TRUE removes the proteins that have only 1 feature, that is, a single peptide and charge combination. Default is FALSE. |
summaryforMultipleRows | sum (default) or max. When there are multiple measurements for a feature in a run, select the row with the largest sum (sum) or the largest maximal value (max). |
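The row-level filters described in the table can be illustrated with a small base R sketch. This is not the package's implementation: the toy table `psm`, its column names, and the helper `pick_row` are hypothetical, and the sketch only assumes that each row holds one PSM with one intensity column per channel.

```r
# Toy wide PSM table (hypothetical names and values, for illustration only).
psm <- data.frame(
  Run     = c("run1", "run1", "run1", "run1"),
  Feature = c("pepA_2", "pepA_2", "pepB_3", "pepC_2"),
  ch126   = c(100, 300, 50, NA),
  ch127N  = c(110, 250, NA, NA),
  ch127C  = c(120, 400, NA, 70),
  ch128N  = c(500, 200, NA, 80),
  stringsAsFactors = FALSE
)
channels <- c("ch126", "ch127N", "ch127C", "ch128N")

# Number of non-missing channel intensities in each row.
n_measured <- rowSums(!is.na(psm[, channels]))

# Analogue of rmPSM_withMissing_withinRun = TRUE:
# keep only rows without any missing value.
complete_only <- psm[n_measured == length(channels), ]

# Analogue of rmPSM_withfewMea_withinRun = TRUE:
# drop rows that have only 1 or 2 measurements.
enough <- psm[n_measured > 2, ]

# Analogue of summaryforMultipleRows: among the rows of one feature in one
# run, keep the row whose summary (sum or max) over the channels is largest.
pick_row <- function(d, channels, summary_fun = sum) {
  score <- apply(d[, channels, drop = FALSE], 1, summary_fun, na.rm = TRUE)
  d[which.max(score), , drop = FALSE]
}
groups <- split(enough, list(enough$Run, enough$Feature), drop = TRUE)
by_sum <- do.call(rbind, lapply(groups, pick_row,
                                channels = channels, summary_fun = sum))
by_max <- do.call(rbind, lapply(groups, pick_row,
                                channels = channels, summary_fun = max))
```

In this toy data the two `pepA_2` rows are duplicates of one feature in one run. `sum` keeps the row with the larger total intensity, while `max` keeps the row containing the single largest channel value, so the two settings can select different rows.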
Input for the proteinSummarization function.
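The returned table is in long format, with one row per PSM, run, and channel, and the annotation columns attached. The base R sketch below shows that reshaping idea on toy data. It is an illustration under assumptions, not the package's code: the tables `wide` and `annotation` and their values are hypothetical, and only the `Abundance..` column prefix and the Run, Channel, Condition, and Intensity column names are taken from the example output on this page.

```r
# Toy wide table with channel columns named like the Proteome Discoverer
# export shown in the example output (hypothetical values).
wide <- data.frame(
  ProteinName     = c("P1", "P2"),
  PSM             = c("pepA_2", "pepB_3"),
  Run             = c("run1.raw", "run1.raw"),
  Abundance..126  = c(10, 30),
  Abundance..127N = c(20, 40),
  stringsAsFactors = FALSE
)

# Toy annotation: one row per Run and Channel (hypothetical values).
annotation <- data.frame(
  Run       = c("run1.raw", "run1.raw"),
  Channel   = c("126", "127N"),
  Condition = c("Norm", "0.667"),
  stringsAsFactors = FALSE
)

# Identify the channel columns and the identifier columns.
channel_cols <- grep("^Abundance\\.\\.", names(wide), value = TRUE)
id_cols <- setdiff(names(wide), channel_cols)

# Stack the channel columns: one block of rows per channel.
long <- do.call(rbind, lapply(channel_cols, function(col) {
  out <- wide[, id_cols, drop = FALSE]
  out$Channel <- sub("^Abundance\\.\\.", "", col)
  out$Intensity <- wide[[col]]
  out
}))

# Attach the experimental design by Run and Channel.
long <- merge(long, annotation, by = c("Run", "Channel"))
```

Each PSM now contributes one row per channel, which matches the shape of the Run, Channel, Condition, and Intensity columns in the converted example data.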
head(raw.pd)
#>    Checked Confidence Identifying.Node PSM.Ambiguity
#> 1:   FALSE       High      Mascot (O4)   Unambiguous
#> 2:   FALSE       High      Mascot (K2)   Unambiguous
#> 3:   FALSE       High      Mascot (K2)   Unambiguous
#> 4:   FALSE       High      Mascot (F2)      Selected
#> 5:   FALSE       High      Mascot (K2)   Unambiguous
#> 6:   FALSE       High      Mascot (K2)   Unambiguous
#>                        Annotated.Sequence
#> 1: [K].gFQQILAGEYDHLPEQAFYMVGPIEEAVAk.[A]
#> 2:          [R].qYPWGVAEVENGEHcDFTILr.[N]
#> 3:              [R].dkPSVEPVEEYDYEDLk.[E]
#> 4:                      [R].hEHQVMLmr.[Q]
#> 5:       [R].dNLTLWTADNAGEEGGEAPQEPQS.[-]
#> 6:         [R].aLVAIGTHDLDTLSGPFTYTAk.[R]
#>                                                      Modifications Marked.as
#> 1:                                 N-Term(TMT6plex); K30(TMT6plex)        NA
#> 2: N-Term(TMT6plex); C15(Carbamidomethyl); R21(Label:13C(6)15N(4))        NA
#> 3:                         N-Term(TMT6plex); K2(Label); K17(Label)        NA
#> 4:         N-Term(TMT6plex); M8(Oxidation); R9(Label:13C(6)15N(4))        NA
#> 5:                                                N-Term(TMT6plex)        NA
#> 6:                                    N-Term(TMT6plex); K22(Label)        NA
#>    X..Protein.Groups X..Proteins Master.Protein.Accessions
#> 1:                 1           1                    P06576
#> 2:                 1           1                    Q16181
#> 3:                 1           1                    Q9Y450
#> 4:                 1           1                    Q15233
#> 5:                 1           1                    P31947
#> 6:                 1           1                    Q9NSD9
#>                                                            Master.Protein.Descriptions
#> 1:         ATP synthase subunit beta, mitochondrial OS=Homo sapiens GN=ATP5B PE=1 SV=3
#> 2:                                         Septin-7 OS=Homo sapiens GN=SEPT7 PE=1 SV=2
#> 3:                                HBS1-like protein OS=Homo sapiens GN=HBS1L PE=1 SV=1
#> 4: Non-POU domain-containing octamer-binding protein OS=Homo sapiens GN=NONO PE=1 SV=4
#> 5:                               14-3-3 protein sigma OS=Homo sapiens GN=SFN PE=1 SV=1
#> 6:          Phenylalanine--tRNA ligase beta subunit OS=Homo sapiens GN=FARSB PE=1 SV=3
#>    Protein.Accessions
#> 1:             P06576
#> 2:             Q16181
#> 3:             Q9Y450
#> 4:             Q15233
#> 5:             P31947
#> 6:             Q9NSD9
#>                                                                   Protein.Descriptions
#> 1:         ATP synthase subunit beta, mitochondrial OS=Homo sapiens GN=ATP5B PE=1 SV=3
#> 2:                                         Septin-7 OS=Homo sapiens GN=SEPT7 PE=1 SV=2
#> 3:                                HBS1-like protein OS=Homo sapiens GN=HBS1L PE=1 SV=1
#> 4: Non-POU domain-containing octamer-binding protein OS=Homo sapiens GN=NONO PE=1 SV=4
#> 5:                               14-3-3 protein sigma OS=Homo sapiens GN=SFN PE=1 SV=1
#> 6:          Phenylalanine--tRNA ligase beta subunit OS=Homo sapiens GN=FARSB PE=1 SV=3
#>    X..Missed.Cleavages Charge DeltaScore DeltaCn Rank Search.Engine.Rank
#> 1:                   0      3     1.0000       0    1                  1
#> 2:                   0      3     1.0000       0    1                  1
#> 3:                   1      3     0.9730       0    1                  1
#> 4:                   0      4     0.5250       0    1                  1
#> 5:                   0      3     1.0000       0    1                  1
#> 6:                   0      3     0.9783       0    1                  1
#>     m.z..Da. MH...Da. Theo..MH...Da. DeltaM..ppm. Deltam.z..Da. Activation.Type
#> 1: 1270.3249 3808.960       3808.966        -1.51      -0.00192             CID
#> 2:  920.4493 2759.333       2759.332         0.31       0.00028             CID
#> 3:  920.1605 2758.467       2758.461         2.08       0.00192             CID
#> 4:  359.6898 1435.737       1435.738        -0.04      -0.00002             CID
#> 5:  920.0943 2758.268       2758.264         1.53       0.00141             CID
#> 6:  919.8502 2757.536       2757.532         1.48       0.00136             CID
#>    MS.Order Isolation.Interference.... Average.Reporter.S.N
#> 1:      MS2                  47.955590                  8.7
#> 2:      MS2                   9.377507                  8.1
#> 3:      MS2                  38.317050                 17.8
#> 4:      MS2                  21.390040                 36.5
#> 5:      MS2                   0.000000                 16.7
#> 6:      MS2                  30.619960                 26.7
#>    Ion.Inject.Time..ms. RT..min. First.Scan
#> 1:               50.000 212.2487     112815
#> 2:                3.242 164.7507      87392
#> 3:               13.596 143.4534      74786
#> 4:               50.000  21.6426       6458
#> 5:                6.723 174.1863      92950
#> 6:                8.958 176.4863      94294
#>                                   Spectrum.File File.ID Abundance..126
#> 1: 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_03.raw      F1       2548.326
#> 2: 161117_SILAC_HeLa_UPS1_TMT10_Mixture3_03.raw      F5      22861.765
#> 3: 161117_SILAC_HeLa_UPS1_TMT10_Mixture3_03.raw      F5      25504.083
#> 4: 161117_SILAC_HeLa_UPS1_TMT10_Mixture4_02.raw     F10      13493.228
#> 5: 161117_SILAC_HeLa_UPS1_TMT10_Mixture3_03.raw      F5      64582.786
#> 6: 161117_SILAC_HeLa_UPS1_TMT10_Mixture3_03.raw      F5      35404.709
#>    Abundance..127N Abundance..127C Abundance..128N Abundance..128C
#> 1:        3231.929        2760.839        4111.639        3127.254
#> 2:       25817.946       23349.498       29449.609       25995.929
#> 3:       27740.450       25144.974       25754.579       29923.176
#> 4:       14674.490       11187.900       12831.495       13839.426
#> 5:       50576.417       47126.037       56285.129       46257.310
#> 6:       31905.852       30993.941       36854.351       37506.001
#>    Abundance..129N Abundance..129C Abundance..130N Abundance..130C
#> 1:        1874.163        2831.423        2298.401        3798.876
#> 2:       22955.769       30578.971       30660.488       38728.853
#> 3:       34097.637       31650.255       27632.692       23886.881
#> 4:       12441.353       13450.885       14777.844       13039.995
#> 5:       52634.885       49716.850       60660.574       55830.488
#> 6:       25703.444       38626.598       35447.942       33788.409
#>    Abundance..131 Quan.Info Ions.Score Identity.Strict Identity.Relaxed
#> 1:       3739.067        NA         90              28               21
#> 2:      25047.280        NA         76              24               17
#> 3:      35331.092        NA         74              30               23
#> 4:      12057.121        NA         40              25               18
#> 5:      40280.577        NA         38              21               14
#> 6:      32031.516        NA         46              29               22
#>    Expectation.Value Percolator.q.Value Percolator.PEP
#> 1:      7.038672e-09                  0      1.396e-05
#> 2:      6.298627e-08                  0      3.349e-07
#> 3:      4.318385e-07                  0      9.922e-07
#> 4:      3.351211e-04                  0      1.175e-04
#> 5:      2.152501e-04                  0      1.383e-05
#> 6:      2.060469e-04                  0      7.198e-05
head(annotation.pd)
#>                                            Run Fraction TechRepMixture Channel
#> 1 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw        1              1     126
#> 2 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw        1              1    127N
#> 3 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw        1              1    127C
#> 4 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw        1              1    128N
#> 5 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw        1              1    128C
#> 6 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw        1              1    129N
#>   Condition  Mixture   BioReplicate
#> 1      Norm Mixture1  Mixture1_Norm
#> 2     0.667 Mixture1 Mixture1_0.667
#> 3     0.125 Mixture1 Mixture1_0.125
#> 4       0.5 Mixture1   Mixture1_0.5
#> 5         1 Mixture1     Mixture1_1
#> 6     0.125 Mixture1 Mixture1_0.125
input.pd <- PDtoMSstatsTMTFormat(raw.pd, annotation.pd)
head(input.pd)
#>   ProteinName               PeptideSequence Charge
#> 1      P04406        [K].lISWYDNEFGYSNR.[V]      2
#> 2      Q9NSD9           [K].irPFAVAAVLr.[N]      3
#> 3      P04406    [K].lVINGNPITIFQErDPSk.[I]      3
#> 4      P04406          [R].vVDLmAHMASkE.[-]      3
#> 5      P06576      [R].dQEGQDVLLFIDNIFR.[F]      3
#> 6      P06576 [R].iPSAVGYQPTLATDMGTMQEr.[I]      3
#>                               PSM  Mixture TechRepMixture
#> 1        [K].lISWYDNEFGYSNR.[V]_2 Mixture1              1
#> 2           [K].irPFAVAAVLr.[N]_3 Mixture1              1
#> 3    [K].lVINGNPITIFQErDPSk.[I]_3 Mixture1              1
#> 4          [R].vVDLmAHMASkE.[-]_3 Mixture1              1
#> 5      [R].dQEGQDVLLFIDNIFR.[F]_3 Mixture1              1
#> 6 [R].iPSAVGYQPTLATDMGTMQEr.[I]_3 Mixture1              1
#>                                            Run Channel Condition  BioReplicate
#> 1 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw     126      Norm Mixture1_Norm
#> 2 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw     126      Norm Mixture1_Norm
#> 3 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw     126      Norm Mixture1_Norm
#> 4 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw     126      Norm Mixture1_Norm
#> 5 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw     126      Norm Mixture1_Norm
#> 6 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw     126      Norm Mixture1_Norm
#>     Intensity
#> 1    8348.351
#> 2   28327.492
#> 3 1275010.965
#> 4   80589.877
#> 5    2231.389
#> 6  144854.307