ProgenesistoMSstatsFormat.Rd
Import Progenesis files
ProgenesistoMSstatsFormat(
  input,
  annotation,
  useUniquePeptide = TRUE,
  summaryforMultipleRows = max,
  removeFewMeasurements = TRUE,
  removeOxidationMpeptides = FALSE,
  removeProtein_with1Peptide = FALSE,
  use_log_file = TRUE,
  append = FALSE,
  verbose = TRUE,
  log_file_path = NULL,
  ...
)
Argument | Description |
---|---|
input | name of Progenesis output, which is in wide format. The columns 'Accession', 'Sequence', 'Modification', 'Charge' and one column for each run are required. |
annotation | name of 'annotation.txt' or 'annotation.csv' data which includes Condition, BioReplicate and Run information. The runs are matched with the column names of input for MS runs. |
useUniquePeptide | TRUE (default) removes peptides that are assigned to more than one protein. Each peptide is assumed to be unique to a single protein. |
summaryforMultipleRows | max (default) or sum. When a feature has multiple measurements in a run, use the highest intensity (max) or the sum of the intensities (sum). |
removeFewMeasurements | TRUE (default) removes features that have only 1 or 2 measurements across runs. |
removeOxidationMpeptides | TRUE removes peptides that include 'oxidation (M)' in their modification. FALSE is the default. |
removeProtein_with1Peptide | TRUE removes proteins that have only 1 peptide and charge. FALSE is the default. |
use_log_file | logical. If TRUE, information about data processing will be saved to a file. |
append | logical. If TRUE, information about data processing will be added to an existing log file. |
verbose | logical. If TRUE, information about data processing will be printed to the console. |
log_file_path | character. Path to a file to which information about data processing will be saved. If not provided, such a file will be created automatically. If `append = TRUE`, this has to be a valid path to a file. |
... | additional parameters to `data.table::fread`. |
data.frame in the MSstats required format.
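To make the effect of `summaryforMultipleRows` and `removeFewMeasurements` more concrete, here is a minimal base R sketch on a toy long-format table. It is only an illustration of the two rules as described in the arguments above, not the package's implementation; the data frame `toy`, its column names and all intensity values are hypothetical.

```r
# Hypothetical toy data: one row per measurement of a feature in a run.
# PEPTIDEA_2 is measured twice in run1, so it needs summarizing.
toy <- data.frame(
  Feature   = c("PEPTIDEA_2", "PEPTIDEA_2", "PEPTIDEA_2", "PEPTIDEA_2",
                "PEPTIDEB_3", "PEPTIDEB_3"),
  Run       = c("run1", "run1", "run2", "run3", "run1", "run2"),
  Intensity = c(100, 250, 300, 400, 50, 60),
  stringsAsFactors = FALSE
)

# Collapse repeated feature/run pairs, analogous to
# summaryforMultipleRows = max and summaryforMultipleRows = sum.
by_max <- aggregate(Intensity ~ Feature + Run, data = toy, FUN = max)
by_sum <- aggregate(Intensity ~ Feature + Run, data = toy, FUN = sum)

# Analogous to removeFewMeasurements = TRUE: keep only features
# that are measured in at least 3 runs.
n_measured    <- table(by_max$Feature)
kept_features <- names(n_measured)[n_measured >= 3]
filtered      <- by_max[by_max$Feature %in% kept_features, ]
```

In this sketch the duplicated run1 measurement of PEPTIDEA_2 collapses to a single row under either rule, and PEPTIDEB_3 is dropped because it has only two measurements across runs.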
progenesis_raw = system.file("tinytest/raw_data/Progenesis/progenesis_input.csv",
                             package = "MSstatsConvert")
annot = system.file("tinytest/raw_data/Progenesis/progenesis_annot.csv",
                    package = "MSstatsConvert")
progenesis_raw = data.table::fread(progenesis_raw)
annot = data.table::fread(annot)
progenesis_imported = ProgenesistoMSstatsFormat(progenesis_raw, annot,
                                                use_log_file = FALSE)
#> INFO  [2021-07-05 20:05:32] ** Raw data from Progenesis imported successfully.
#> INFO  [2021-07-05 20:05:32] ** Raw data from Progenesis cleaned successfully.
#> INFO  [2021-07-05 20:05:32] ** Using provided annotation.
#> INFO  [2021-07-05 20:05:32] ** Run labels were standardized to remove symbols such as '.' or '%'.
#> INFO  [2021-07-05 20:05:32] ** The following options are used:
#>   - Features will be defined by the columns: PeptideSequence, PrecursorCharge
#>   - Shared peptides will be removed.
#>   - Proteins with single feature will not be removed.
#>   - Features with less than 3 measurements across runs will be removed.
#> INFO  [2021-07-05 20:05:32] ** Features with all missing measurements across runs are removed.
#> INFO  [2021-07-05 20:05:32] ** Shared peptides are removed.
#> INFO  [2021-07-05 20:05:32] ** Multiple measurements in a feature and a run are summarized by summaryforMultipleRows: max
#> INFO  [2021-07-05 20:05:32] ** Features with one or two measurements across runs are removed.
#> INFO  [2021-07-05 20:05:32] ** Run annotation merged with quantification data.
#> INFO  [2021-07-05 20:05:32] ** Features with one or two measurements across runs are removed.
#> INFO  [2021-07-05 20:05:32] ** Fractionation handled.
#> INFO  [2021-07-05 20:05:32] ** Updated quantification data to make balanced design. Missing values are marked by NA
#> INFO  [2021-07-05 20:05:32] ** Finished preprocessing. The dataset is ready to be processed by the dataProcess function.
head(progenesis_imported)
#>            ProteinName PeptideModifiedSequence PrecursorCharge FragmentIon
#> 1 sp|P33767|OSTB_YEAST    DYFSPSSEELVVSSNHLLNK               3          NA
#> 2 sp|P33767|OSTB_YEAST    DYFSPSSEELVVSSNHLLNK               3          NA
#> 3 sp|P33767|OSTB_YEAST    DYFSPSSEELVVSSNHLLNK               3          NA
#> 4 sp|P33767|OSTB_YEAST    DYFSPSSEELVVSSNHLLNK               3          NA
#> 5 sp|P33767|OSTB_YEAST    DYFSPSSEELVVSSNHLLNK               3          NA
#> 6 sp|P33767|OSTB_YEAST    DYFSPSSEELVVSSNHLLNK               3          NA
#>   ProductCharge IsotopeLabelType  Condition BioReplicate                   Run
#> 1            NA                L Condition1            1 JD_06232014_sample1-A
#> 2            NA                L Condition1            1 JD_06232014_sample1_B
#> 3            NA                L Condition1            1 JD_06232014_sample1_C
#> 4            NA                L Condition2            2 JD_06232014_sample2_A
#> 5            NA                L Condition2            2 JD_06232014_sample2_B
#> 6            NA                L Condition2            2 JD_06232014_sample2_C
#>   Fraction Intensity
#> 1        1   3787692
#> 2        1   3747125
#> 3        1   3214139
#> 4        1   5353473
#> 5        1   4064855
#> 6        1   3270403