SpectroMinetoMSstatsTMTFormat.Rd
Convert SpectroMine output into the required input format for MSstatsTMT.
SpectroMinetoMSstatsTMTFormat(
  input,
  annotation,
  filter_with_Qvalue = TRUE,
  qvalue_cutoff = 0.01,
  useUniquePeptide = TRUE,
  rmPSM_withMissing_withinRun = FALSE,
  rmPSM_withfewMea_withinRun = TRUE,
  rmProtein_with1Feature = FALSE,
  summaryforMultipleRows = sum
)
input | Name of the data object holding the SpectroMine PSM output. Read it from the PSM sheet. |
---|---|
annotation | Data frame containing the columns Run, Fraction, TechRepMixture, Mixture, Channel, BioReplicate, Condition. Refer to the example 'annotation.mine' for the meaning of each column. |
filter_with_Qvalue | TRUE (default) filters out intensities whose value in the EG.Qvalue column is greater than qvalue_cutoff. Those intensities are replaced with NA and treated as censored missing values for imputation purposes. |
qvalue_cutoff | Cutoff for EG.Qvalue. Default is 0.01. |
useUniquePeptide | TRUE (default) removes peptides that are assigned to more than one protein. Each protein is assumed to be quantified from its unique peptides. |
rmPSM_withMissing_withinRun | TRUE removes any PSM that has a missing value within a Run. Default is FALSE. |
rmPSM_withfewMea_withinRun | Applies only when rmPSM_withMissing_withinRun = FALSE. TRUE (default) removes features that have only 1 or 2 measurements within a Run. |
rmProtein_with1Feature | TRUE removes proteins that have only one feature (a single peptide and charge combination). Default is FALSE. |
summaryforMultipleRows | sum (default) or max. When there are multiple measurements for a feature in a run, select the one with the largest sum (sum) or the largest maximal value (max). |
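The effect of filter_with_Qvalue and summaryforMultipleRows can be pictured with a small base R sketch on a toy table. This is only an illustration of the behaviour described in the arguments above, not the package's implementation: the data frame, its column names (Feature, Qvalue, ch126, ch127) and the helper pick_one_row are hypothetical, and the sketch assumes that sum and max are taken over the channel intensities of one row.

```r
# Toy PSM table in wide format: one row per PSM, one column per channel.
# All names and numbers here are made up for illustration.
psm <- data.frame(
  Feature = c("pepA_2", "pepA_2", "pepB_3", "pepC_2"),
  Run     = c("1_1", "1_1", "1_1", "1_1"),
  Qvalue  = c(0.001, 0.002, 0.050, 0.003),
  ch126   = c(900, 500, 33554.2, 17143.5),
  ch127   = c(10, 600, 33671.5, 4284.9),
  stringsAsFactors = FALSE
)
channels <- c("ch126", "ch127")
qvalue_cutoff <- 0.01

# Q-value filter: intensities of PSMs above the cutoff become NA.
# The rows themselves stay in the table as missing values.
filtered <- psm
filtered[filtered$Qvalue > qvalue_cutoff, channels] <- NA

# Duplicate handling: within one Feature/Run pair, keep the row whose
# channel intensities give the largest summary (sum by default).
pick_one_row <- function(d, summary_fun = sum) {
  score <- apply(d[, channels, drop = FALSE], 1, summary_fun, na.rm = TRUE)
  d[which.max(score), , drop = FALSE]
}

groups <- split(filtered, list(filtered$Feature, filtered$Run), drop = TRUE)
deduplicated <- do.call(rbind, lapply(groups, pick_one_row))
rownames(deduplicated) <- NULL
print(deduplicated)
```

In this toy table the two pepA_2 rows are resolved differently by the two rules: sum keeps the row with intensities 500 and 600, while passing summary_fun = max to pick_one_row would keep the row containing 900.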
Input for the proteinSummarization function.
#>   R.MS3.Used R.BlockName R.QuantificationMethod
#> 1       True    All Runs               TMT6Plex
#> 2       True    All Runs               TMT6Plex
#> 3       True    All Runs               TMT6Plex
#> 4       True    All Runs               TMT6Plex
#> 5       True    All Runs               TMT6Plex
#> 6       True    All Runs               TMT6Plex
#>                                              R.FileName PG.Genes PG.Organisms
#> 1 ch_19Jan2017_SM-1-1_Sp-6-2_CID-OT-MS3-Short_HpH_1.raw    EGLN1 Homo sapiens
#> 2 ch_19Jan2017_SM-1-1_Sp-6-2_CID-OT-MS3-Short_HpH_1.raw   SEPT11 Homo sapiens
#> 3 ch_19Jan2017_SM-1-1_Sp-6-2_CID-OT-MS3-Short_HpH_1.raw   SEPT11 Homo sapiens
#> 4 ch_19Jan2017_SM-1-1_Sp-6-2_CID-OT-MS3-Short_HpH_1.raw   SEPT11 Homo sapiens
#> 5 ch_19Jan2017_SM-1-1_Sp-6-2_CID-OT-MS3-Short_HpH_1.raw   SEPT11 Homo sapiens
#> 6 ch_19Jan2017_SM-1-1_Sp-6-2_CID-OT-MS3-Short_HpH_1.raw   NAP1L4 Homo sapiens
#>   PG.ProteinAccessions               PG.ProteinDescriptions PG.ProteinNames
#> 1               Q9GZT9                   Egl nine homolog 1     EGLN1_HUMAN
#> 2               Q9NVA2                            Septin-11     SEP11_HUMAN
#> 3               Q9NVA2                            Septin-11     SEP11_HUMAN
#> 4               Q9NVA2                            Septin-11     SEP11_HUMAN
#> 5               Q9NVA2                            Septin-11     SEP11_HUMAN
#> 6               Q99733 Nucleosome assembly protein 1-like 4     NP1L4_HUMAN
#>   PG.UniprotIds PG.Coverage PG.QValue PEP.IsProteinGroupSpecific
#> 1        Q9GZT9        5.6%         0                       TRUE
#> 2        Q9NVA2        9.1%         0                       TRUE
#> 3        Q9NVA2        9.1%         0                      FALSE
#> 4        Q9NVA2        9.1%         0                       TRUE
#> 5        Q9NVA2        9.1%         0                       TRUE
#> 6        Q99733        5.1%         0                       TRUE
#>   PEP.IsProteotypic     PEP.StrippedSequence   PEP.QValue
#> 1              TRUE AAAGGQGSAVAAEAEPGKEEPPAR 0.0000000000
#> 2              TRUE             KELEEEVNNFQK 0.0001986492
#> 3             FALSE                 SLDLVTMK 0.0000000000
#> 4              TRUE      AAAQLLQSQAQQSGAQQTK 0.0000000000
#> 5              TRUE      AAAQLLQSQAQQSGAQQTK 0.0000000000
#> 6              TRUE                 VLAALQER 0.0024595109
#>   PEP.IsUsedForQuantification PP.Charge
#> 1                        True         3
#> 2                        True         3
#> 3                        True         2
#> 4                        True         3
#> 5                        True         2
#> 6                        True         2
#>                                    P.MoleculeID PSM.TMT6_126..Raw.
#> 1 _[TMT_Nter]AAAGGQGSAVAAEAEPGK[TMT_Lys]EEPPAR_           382.1107
#> 2    _[TMT_Nter]K[TMT_Lys]ELEEEVNNFQK[TMT_Lys]_         33554.1900
#> 3                 _[TMT_Nter]SLDLVTMK[TMT_Lys]_         44713.6300
#> 4      _[TMT_Nter]AAAQLLQSQAQQSGAQQTK[TMT_Lys]_         20877.8700
#> 5      _[TMT_Nter]AAAQLLQSQAQQSGAQQTK[TMT_Lys]_           506.1669
#> 6                          _[TMT_Nter]VLAALQER_         17143.5200
#>   PSM.TMT6_127..Raw. PSM.TMT6_128..Raw. PSM.TMT6_129..Raw. PSM.TMT6_130..Raw.
#> 1           392.9243           477.7064           989.6951           695.7537
#> 2         33671.5400         40525.3900         43739.8600         40697.4400
#> 3         51052.8400         58457.6400         47213.1300         58004.6700
#> 4         19930.0100         24963.5400         29021.8300         19510.8600
#> 5          1019.8020           543.2091           555.3629           501.8123
#> 6          4284.8930         26957.7100         15610.6900         21208.9500
#>   PSM.TMT6_131..Raw. PSM.IsUsedForQuantification PSM.NrOfMatchedChannelIons
#> 1           107.6282                        True                          6
#> 2         17299.1300                        True                          6
#> 3         23622.9500                        True                          6
#> 4         15952.0600                        True                          6
#> 5           268.5246                        True                          6
#> 6          7701.3510                        True                          6
#>     PSM.Qvalue
#> 1 0.000000e+00
#> 2 2.146015e-04
#> 3 0.000000e+00
#> 4 5.613247e-05
#> 5 0.000000e+00
#> 6 1.951193e-03

#>                                                     Run TechRepMixture Fraction
#> 1 ch_19Jan2017_SM-1-1_Sp-6-2_CID-OT-MS3-Short_HpH_1.raw              1        1
#> 2 ch_19Jan2017_SM-1-1_Sp-6-2_CID-OT-MS3-Short_HpH_2.raw              1        2
#> 3 ch_19Jan2017_SM-1-1_Sp-6-2_CID-OT-MS3-Short_HpH_3.raw              1        3
#> 4 ch_19Jan2017_SM-1-1_Sp-6-2_CID-OT-MS3-Short_HpH_4.raw              1        4
#> 5 ch_19Jan2017_SM-1-1_Sp-6-2_CID-OT-MS3-Short_HpH_5.raw              1        5
#> 6 ch_19Jan2017_SM-1-1_Sp-6-2_CID-OT-MS3-Short_HpH_6.raw              1        6
#>    Channel Condition Mixture BioReplicate
#> 1 TMT6_126         3       1            1
#> 2 TMT6_126         3       1            1
#> 3 TMT6_126         3       1            1
#> 4 TMT6_126         3       1            1
#> 5 TMT6_126         3       1            1
#> 6 TMT6_126         3       1            1

input.mine <- SpectroMinetoMSstatsTMTFormat(raw.mine, annotation.mine)

#>   ProteinName                               PeptideSequence Charge
#> 1      Q9GZT9 _[TMT_Nter]AAAGGQGSAVAAEAEPGK[TMT_Lys]EEPPAR_      3
#> 2      Q9NVA2    _[TMT_Nter]K[TMT_Lys]ELEEEVNNFQK[TMT_Lys]_      3
#> 3      Q9NVA2                 _[TMT_Nter]SLDLVTMK[TMT_Lys]_      2
#> 4      Q9NVA2      _[TMT_Nter]AAAQLLQSQAQQSGAQQTK[TMT_Lys]_      3
#> 5      Q9NVA2      _[TMT_Nter]AAAQLLQSQAQQSGAQQTK[TMT_Lys]_      2
#> 6      P06753                    _[TMT_Nter]AADAEAEVASLNRR_      3
#>                                               PSM Mixture TechRepMixture Run
#> 1 _[TMT_Nter]AAAGGQGSAVAAEAEPGK[TMT_Lys]EEPPAR__3       1              1 1_1
#> 2    _[TMT_Nter]K[TMT_Lys]ELEEEVNNFQK[TMT_Lys]__3       1              1 1_1
#> 3                 _[TMT_Nter]SLDLVTMK[TMT_Lys]__2       1              1 1_1
#> 4      _[TMT_Nter]AAAQLLQSQAQQSGAQQTK[TMT_Lys]__3       1              1 1_1
#> 5      _[TMT_Nter]AAAQLLQSQAQQSGAQQTK[TMT_Lys]__2       1              1 1_1
#> 6                    _[TMT_Nter]AADAEAEVASLNRR__3       1              1 1_1
#>    Channel BioReplicate Condition  Intensity
#> 1 TMT6_126            1         3   382.1107
#> 2 TMT6_126            1         3 33554.1900
#> 3 TMT6_126            1         3 44713.6300
#> 4 TMT6_126            1         3 20877.8700
#> 5 TMT6_126            1         3   506.1669
#> 6 TMT6_126            1         3 10065.2800