Import MaxQuant files

MaxQtoMSstatsFormat(
  evidence,
  annotation,
  proteinGroups,
  proteinID = "Proteins",
  useUniquePeptide = TRUE,
  summaryforMultipleRows = max,
  removeFewMeasurements = TRUE,
  removeMpeptides = FALSE,
  removeOxidationMpeptides = FALSE,
  removeProtein_with1Peptide = FALSE,
  use_log_file = TRUE,
  append = FALSE,
  verbose = TRUE,
  log_file_path = NULL,
  ...
)

Arguments

evidence	name of 'evidence.txt' data, which includes feature-level data.
annotation	name of 'annotation.txt' data, which includes Raw.file, Condition, BioReplicate, Run, IsotopeLabelType information.
proteinGroups	name of 'proteinGroups.txt' data. It is used to match protein group IDs. If proteinGroups=NULL, the 'Proteins' column in 'evidence.txt' is used.
proteinID	'Proteins' (default) or 'Leading.razor.protein' for Protein ID.
useUniquePeptide	TRUE (default) removes peptides that are assigned to more than one protein. Unique peptides are assumed for each protein.
summaryforMultipleRows	max (default) or sum. When there are multiple measurements for a given feature and run, use the highest intensity or the sum of the intensities.
removeFewMeasurements	TRUE (default) removes features that have only 1 or 2 measurements across runs.
removeMpeptides	TRUE removes peptides whose sequence includes 'M'. FALSE is the default.
removeOxidationMpeptides	TRUE removes peptides whose modifications include 'oxidation (M)'. FALSE is the default.
removeProtein_with1Peptide	TRUE removes proteins that have only 1 peptide and charge. FALSE is the default.
use_log_file	logical. If TRUE, information about data processing will be saved to a file.
append	logical. If TRUE, information about data processing will be added to an existing log file.
verbose	logical. If TRUE, information about data processing will be printed to the console.
log_file_path	character. Path to a file to which information about data processing will be saved. If not provided, such a file will be created automatically. If `append = TRUE`, it has to be a valid path to a file.
...	additional parameters to `data.table::fread`.
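
The effect of `summaryforMultipleRows` and `removeFewMeasurements` can be illustrated on a toy table. The following is a minimal base R sketch of the behavior described above, not the package's implementation; the table `toy`, its column names, and its values are hypothetical.

```r
# Hypothetical feature-level table: one row per measurement.
# PEPA_2 was measured twice in run1; PEPB_2 has only two runs.
toy <- data.frame(
  Feature   = c("PEPA_2", "PEPA_2", "PEPA_2", "PEPA_2", "PEPB_2", "PEPB_2"),
  Run       = c("run1", "run1", "run2", "run3", "run1", "run2"),
  Intensity = c(100, 250, 300, 400, 50, 60),
  stringsAsFactors = FALSE
)

# Collapse duplicate (Feature, Run) pairs, as summaryforMultipleRows describes:
# either keep the highest intensity or add the intensities up.
by_max <- aggregate(Intensity ~ Feature + Run, data = toy, FUN = max)
by_sum <- aggregate(Intensity ~ Feature + Run, data = toy, FUN = sum)

# Mimic removeFewMeasurements = TRUE: keep only features with at least
# 3 measurements across runs (features with 1 or 2 are dropped).
n_runs <- table(by_max$Feature)
kept <- by_max[by_max$Feature %in% names(n_runs)[n_runs >= 3], ]

print(by_max)
print(by_sum)
print(kept)
```

In this toy table only the duplicated (PEPA_2, run1) pair differs between `max` and `sum`, and PEPB_2 is dropped because it was measured in only two runs.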

Value

data.frame in the MSstats required format.
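
As a rough check of the returned table, one could compare its column names against the columns printed in the Examples section. The sketch below assumes that those eleven columns are the ones of interest; the helper `missing_msstats_cols` is hypothetical and not part of MSstats or MSstatsConvert.

```r
# Column names as they appear in the printed example output on this page.
required_cols <- c("ProteinName", "PeptideSequence", "PrecursorCharge",
                   "FragmentIon", "ProductCharge", "IsotopeLabelType",
                   "Condition", "BioReplicate", "Run", "Fraction", "Intensity")

# Hypothetical helper: returns the names of expected columns absent from x.
missing_msstats_cols <- function(x, required = required_cols) {
  setdiff(required, colnames(x))
}

# One row copied from the printed example output, used as a stand-in
# for a real conversion result.
example_row <- data.frame(
  ProteinName = "P06959", PeptideSequence = "AEAPAAAPAAK",
  PrecursorCharge = 2, FragmentIon = NA, ProductCharge = NA,
  IsotopeLabelType = "L", Condition = 1, BioReplicate = 1,
  Run = "121219_S_CCES_01_01_LysC_Try_1to10_Mixt_1_1",
  Fraction = 1, Intensity = 4023100,
  stringsAsFactors = FALSE
)

complete_check <- missing_msstats_cols(example_row)
partial_check  <- missing_msstats_cols(example_row[, names(example_row) != "Intensity"])
print(complete_check)
print(partial_check)
```

An empty result means every listed column is present; otherwise the missing names are returned.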

Note

Warning: MSstats does not support metabolic labeling or iTRAQ experiments.

Examples

mq_ev = data.table::fread(system.file("tinytest/raw_data/MaxQuant/mq_ev.csv",
                                      package = "MSstatsConvert"))
mq_pg = data.table::fread(system.file("tinytest/raw_data/MaxQuant/mq_pg.csv",
                                      package = "MSstatsConvert"))
annot = data.table::fread(system.file("tinytest/raw_data/MaxQuant/annotation.csv",
                                      package = "MSstatsConvert"))
maxq_imported = MaxQtoMSstatsFormat(mq_ev, annot, mq_pg, use_log_file = FALSE)
#> INFO  [2021-07-05 20:05:31] ** Raw data from MaxQuant imported successfully.
#> INFO  [2021-07-05 20:05:31] ** Rows with values of Potentialcontaminant equal to + are removed 
#> INFO  [2021-07-05 20:05:31] ** Rows with values of Reverse equal to + are removed 
#> INFO  [2021-07-05 20:05:31] ** Rows with values of Potentialcontaminant equal to + are removed 
#> INFO  [2021-07-05 20:05:31] ** Rows with values of Reverse equal to + are removed 
#> INFO  [2021-07-05 20:05:31] ** Rows with values of Onlyidentifiedbysite equal to + are removed 
#> INFO  [2021-07-05 20:05:31] ** + Contaminant, + Reverse, + Potential.contaminant, + Only.identified.by.site proteins are removed.
#> INFO  [2021-07-05 20:05:31] ** Raw data from MaxQuant cleaned successfully.
#> INFO  [2021-07-05 20:05:31] ** Using provided annotation.
#> INFO  [2021-07-05 20:05:31] ** Run labels were standardized to remove symbols such as '.' or '%'.
#> INFO  [2021-07-05 20:05:31] ** The following options are used:
#>   - Features will be defined by the columns: PeptideSequence, PrecursorCharge
#>   - Shared peptides will be removed.
#>   - Proteins with single feature will not be removed.
#>   - Features with less than 3 measurements across runs will be removed.
#> INFO  [2021-07-05 20:05:31] ** Features with all missing measurements across runs are removed.
#> INFO  [2021-07-05 20:05:31] ** Shared peptides are removed.
#> INFO  [2021-07-05 20:05:31] ** Multiple measurements in a feature and a run are summarized by summaryforMultipleRows: max
#> INFO  [2021-07-05 20:05:31] ** Features with one or two measurements across runs are removed.
#> INFO  [2021-07-05 20:05:31] ** Run annotation merged with quantification data.
#> INFO  [2021-07-05 20:05:31] ** Features with one or two measurements across runs are removed.
#> INFO  [2021-07-05 20:05:31] ** Fractionation handled.
#> INFO  [2021-07-05 20:05:31] ** Updated quantification data to make balanced design. Missing values are marked by NA
#> INFO  [2021-07-05 20:05:31] ** Finished preprocessing. The dataset is ready to be processed by the dataProcess function.
head(maxq_imported)
#>   ProteinName PeptideSequence PrecursorCharge FragmentIon ProductCharge
#> 1      P06959     AEAPAAAPAAK               2          NA            NA
#> 2      P06959     AEAPAAAPAAK               2          NA            NA
#> 3      P06959     AEAPAAAPAAK               2          NA            NA
#> 4      P06959     AEAPAAAPAAK               2          NA            NA
#> 5      P06959     AEAPAAAPAAK               2          NA            NA
#> 6      P06959     AEAPAAAPAAK               2          NA            NA
#>   IsotopeLabelType Condition BioReplicate
#> 1                L         1            1
#> 2                L         1            1
#> 3                L         1            1
#> 4                L         2            2
#> 5                L         2            2
#> 6                L         2            2
#>                                           Run Fraction Intensity
#> 1 121219_S_CCES_01_01_LysC_Try_1to10_Mixt_1_1        1   4023100
#> 2 121219_S_CCES_01_02_LysC_Try_1to10_Mixt_1_2        1   5132500
#> 3 121219_S_CCES_01_03_LysC_Try_1to10_Mixt_1_3        1   2761600
#> 4 121219_S_CCES_01_04_LysC_Try_1to10_Mixt_2_1        1   2932900
#> 5 121219_S_CCES_01_05_LysC_Try_1to10_Mixt_2_2        1   4091800
#> 6 121219_S_CCES_01_06_LysC_Try_1to10_Mixt_2_3        1   4727000
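
A quick way to inspect such output is to count measurements per feature and average intensities per condition. The sketch below rebuilds the six rows printed above by hand (only the columns it needs) rather than calling the converter, so it runs without the package; the object names `shown`, `n_measured`, and `mean_by_condition` are illustrative.

```r
# The six rows printed by head(maxq_imported), re-entered by hand.
shown <- data.frame(
  PeptideSequence = rep("AEAPAAAPAAK", 6),
  PrecursorCharge = rep(2, 6),
  Condition       = c(1, 1, 1, 2, 2, 2),
  Intensity       = c(4023100, 5132500, 2761600, 2932900, 4091800, 4727000),
  stringsAsFactors = FALSE
)

# Number of rows per feature (PeptideSequence + PrecursorCharge).
# The formula interface of aggregate() drops rows with missing Intensity,
# so this counts non-missing measurements.
n_measured <- aggregate(Intensity ~ PeptideSequence + PrecursorCharge,
                        data = shown, FUN = length)

# Mean intensity per condition for the rows shown.
mean_by_condition <- aggregate(Intensity ~ Condition, data = shown, FUN = mean)

print(n_measured)
print(mean_by_condition)
```

With `removeFewMeasurements = TRUE`, every feature that remains in the full result would be expected to have at least 3 such measurements across runs.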
