Convert output of a TMT-labeled MaxQuant experiment into MSstatsPTM format

Takes the output of a TMT experiment processed with MaxQuant and converts it into the format required by MSstatsPTM. For the PTM data, only the modified-site file from MaxQuant (for example, Phospho(STY)Sites) and an annotation file are required. To adjust modified peptides for changes in global protein level, the unmodified (global protein) TMT data must also be provided.

MaxQtoMSstatsPTMFormat(
  sites.data,
  annotation,
  evidence = NULL,
  proteinGroups = NULL,
  mod.num = "Single",
  keyword = "phos",
  which.proteinid.ptm = "Protein",
  which.proteinid.protein = "Leading.razor.protein",
  removeMpeptides = FALSE
)

Arguments

sites.data	modified peptide output from MaxQuant. For example, a phosphorylation experiment would require the Phospho(STY)Sites.txt file
annotation	data frame that contains the columns Run, Fraction, TechRepMixture, Mixture, Channel, BioReplicate, and Condition.
evidence	For the global protein dataset. Name of the 'evidence.txt' data, which contains feature-level data.
proteinGroups	For the global protein dataset. Name of the 'proteinGroups.txt' data.
mod.num	For the modified peptide dataset. The number of modifications per peptide to be used. If "Single" (the default), only peptides with one modification are used. If "Total", the number of modifications per peptide is not capped; note that this may confound the effects of different modifications.
keyword	the sub-name of columns in the sites.data file. For phosphorylation data, this value should be "phos". The default is "phos".
which.proteinid.ptm	For the PTM dataset, which column to use for the protein name. The default is 'Protein', as shown in the usage above. 'Leading.proteins', 'Leading.razor.protein', or 'Gene.names' can be used instead to obtain a single protein ID per entry. However, those columns can still include shared peptides.
which.proteinid.protein	For the global protein dataset, which column to use for the protein name. Same options as above; the default is 'Leading.razor.protein'.
removeMpeptides	Whether peptides with Oxidation (M) modifications should be removed. Default is FALSE.
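
As an illustration of the annotation argument, the following sketch builds a small data frame with the seven required columns and checks that none is missing. The run names, mixture names, channel labels, and condition labels are hypothetical placeholders; in a real analysis they must match the experiment's own design and the labels used in the MaxQuant output.

```r
# Hypothetical design: two runs (one per mixture), two channels per run.
# Every value below is a placeholder, not taken from a real experiment.
annotation <- data.frame(
  Run            = rep(c("run_1", "run_2"), each = 2),
  Fraction       = 1,
  TechRepMixture = 1,
  Mixture        = rep(c("Mixture_1", "Mixture_2"), each = 2),
  Channel        = rep(c("126C", "127C"), times = 2),
  BioReplicate   = c("Condition_1_1", "Condition_2_1",
                     "Condition_1_2", "Condition_2_2"),
  Condition      = rep(c("Condition_1", "Condition_2"), times = 2),
  stringsAsFactors = FALSE
)

# Columns listed in the annotation argument description.
required_cols <- c("Run", "Fraction", "TechRepMixture", "Mixture",
                   "Channel", "BioReplicate", "Condition")

# Stop early if any required column is absent.
missing_cols <- setdiff(required_cols, colnames(annotation))
if (length(missing_cols) > 0) {
  stop("annotation is missing columns: ",
       paste(missing_cols, collapse = ", "))
}
```

Checking the column names before conversion gives a clearer error message than a failure partway through the converter.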

Value

A list of two data.tables, named 'PTM' and 'PROTEIN', in the format required by MSstatsPTM.
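
A minimal sketch of a full call, assuming the MaxQuant output files and an annotation table sit in the working directory. The file paths, the annotation file name annotation.csv, and the use of read.delim and read.csv to load them are assumptions made for illustration. The conversion runs only if MSstatsPTM is installed and all four files exist; otherwise the block does nothing.

```r
# Hypothetical file locations; adjust to the actual MaxQuant output folder.
sites_path      <- "Phospho (STY)Sites.txt"
evidence_path   <- "evidence.txt"
groups_path     <- "proteinGroups.txt"
annotation_path <- "annotation.csv"

# Only attempt the conversion when the package and all inputs are available.
inputs_ready <- requireNamespace("MSstatsPTM", quietly = TRUE) &&
  all(file.exists(c(sites_path, evidence_path, groups_path, annotation_path)))

if (inputs_ready) {
  # MaxQuant text output is tab-delimited; read.delim is assumed to suffice.
  sites          <- read.delim(sites_path, stringsAsFactors = FALSE)
  evidence       <- read.delim(evidence_path, stringsAsFactors = FALSE)
  protein_groups <- read.delim(groups_path, stringsAsFactors = FALSE)
  annotation     <- read.csv(annotation_path, stringsAsFactors = FALSE)

  # Supplying evidence and proteinGroups adds the global protein data,
  # which is needed to adjust PTM changes for protein-level changes.
  converted <- MSstatsPTM::MaxQtoMSstatsPTMFormat(
    sites.data    = sites,
    annotation    = annotation,
    evidence      = evidence,
    proteinGroups = protein_groups,
    mod.num       = "Single",
    keyword       = "phos"
  )

  # Per the Value section, the elements are named 'PTM' and 'PROTEIN'.
  print(names(converted))
}
```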

Examples


head(raw.input.tmt$PTM)
#>       ProteinName PeptideSequence Charge           PSM Mixture TechRepMixture
#> 1 Protein_12_S703      Peptide491      3 Peptide_491_3       1              1
#> 2 Protein_12_S703      Peptide491      3 Peptide_491_3       1              1
#> 3 Protein_12_S703      Peptide491      3 Peptide_491_3       1              1
#> 4 Protein_12_S703      Peptide491      3 Peptide_491_3       1              1
#> 5 Protein_12_S703      Peptide491      3 Peptide_491_3       1              1
#> 6 Protein_12_S703      Peptide491      3 Peptide_491_3       1              1
#>   Run Channel   Condition  BioReplicate Intensity
#> 1 1_1    128N Condition_2 Condition_2_1   48030.0
#> 2 1_1    129C Condition_4 Condition_4_2  100224.4
#> 3 1_1    131C Condition_3 Condition_3_2   66804.6
#> 4 1_1    130N Condition_1 Condition_1_2   46779.8
#> 5 1_1    128C Condition_6 Condition_6_1   77497.9
#> 6 1_1    126C Condition_4 Condition_4_1   81559.7
head(raw.input.tmt$PROTEIN)
#>   ProteinName PeptideSequence Charge             PSM Mixture TechRepMixture Run
#> 1  Protein_12     Peptide9121      3  Peptide_9121_3       1              1 1_1
#> 2  Protein_12    Peptide27963      5 Peptide_27963_5       1              1 1_1
#> 3  Protein_12    Peptide28482      4 Peptide_28482_4       1              1 1_1
#> 4  Protein_12    Peptide10940      2 Peptide_10940_2       2              1 2_1
#> 5  Protein_12     Peptide4900      2  Peptide_4900_2       2              1 2_1
#> 6  Protein_12     Peptide4900      3  Peptide_4900_3       2              1 2_1
#>   Channel   Condition  BioReplicate  Intensity
#> 1    126C Condition_4 Condition_4_1 10996116.9
#> 2    127C Condition_5 Condition_5_1    56965.1
#> 3    131N Condition_2 Condition_2_2   286121.7
#> 4    131N Condition_2 Condition_2_4   534806.0
#> 5    126C Condition_4 Condition_4_3  1134908.7
#> 6    126C Condition_4 Condition_4_3  1605773.2