DIAUmpiretoMSstatsFormat.Rd
Import DIA-Umpire files
DIAUmpiretoMSstatsFormat(
  raw.frag,
  raw.pep,
  raw.pro,
  annotation,
  useSelectedFrag = TRUE,
  useSelectedPep = TRUE,
  removeFewMeasurements = TRUE,
  removeProtein_with1Feature = FALSE,
  summaryforMultipleRows = max,
  use_log_file = TRUE,
  append = FALSE,
  verbose = TRUE,
  log_file_path = NULL,
  ...
)
Argument | Description |
---|---|
raw.frag | name of FragSummary_date.xls data, which includes feature-level data. |
raw.pep | name of PeptideSummary_date.xls data, which includes selected fragments information. |
raw.pro | name of ProteinSummary_date.xls data, which includes selected peptides information. |
annotation | name of annotation data which includes Condition, BioReplicate, Run information. |
useSelectedFrag | TRUE will use the selected fragment for each peptide. 'Selected_fragments' column is required. |
useSelectedPep | TRUE will use the selected peptide for each protein. 'Selected_peptides' column is required. |
removeFewMeasurements | TRUE (default) will remove the features that have 1 or 2 measurements across runs. |
removeProtein_with1Feature | TRUE will remove proteins that have only 1 feature, where a feature is the combination of peptide, precursor charge, fragment, and charge. FALSE is the default. |
summaryforMultipleRows | max (default) or sum. When a feature has multiple measurements in a run, use either the highest intensity (max) or the sum of the intensities (sum). |
use_log_file | logical. If TRUE, information about data processing will be saved to a file. |
append | logical. If TRUE, information about data processing will be added to an existing log file. |
verbose | logical. If TRUE, information about data processing will be printed to the console. |
log_file_path | character. Path to a file to which information about data processing will be saved. If not provided, such a file will be created automatically. If `append = TRUE`, has to be a valid path to a file. |
... | additional parameters to `data.table::fread`. |
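
To make the effect of `summaryforMultipleRows` and `removeFewMeasurements` concrete, here is a minimal sketch on toy data. It is not the package's internal code: the helpers `collapse_rows` and `drop_sparse`, the column name `Feature`, and all values are hypothetical and only illustrate the behaviour described in the table above.

```r
library(data.table)

# Toy data (hypothetical values): feature "f1" is measured twice in run "r1".
toy <- data.table(
  Feature   = c("f1", "f1", "f1", "f1", "f2", "f2"),
  Run       = c("r1", "r1", "r2", "r3", "r1", "r2"),
  Intensity = c(100, 250, 80, 60, 40, 30)
)

# Hypothetical helper: collapse multiple rows per feature and run
# with the chosen summary function (max or sum).
collapse_rows <- function(dt, fun) {
  dt[, .(Intensity = fun(Intensity, na.rm = TRUE)), by = .(Feature, Run)]
}

# Hypothetical helper: keep only features with at least `min_n`
# non-missing measurements across runs.
drop_sparse <- function(dt, min_n = 3) {
  counts <- dt[, .(n_obs = sum(!is.na(Intensity))), by = Feature]
  keep <- counts[n_obs >= min_n, Feature]
  dt[Feature %in% keep]
}

by_max <- collapse_rows(toy, max)   # f1/r1 keeps the higher of 100 and 250
by_sum <- collapse_rows(toy, sum)   # f1/r1 becomes 100 + 250
filtered <- drop_sparse(by_max)     # f2 has only two measurements and is dropped
```

In this sketch the choice between `max` and `sum` only matters for feature/run pairs that appear more than once; all other rows pass through unchanged.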
data.frame in the MSstats required format.
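
A quick way to sanity-check a converted table is to compare its column names against the columns that appear in the example output on this page. The sketch below assumes that list of columns; it is taken from the printed `head()` output here, not from an official specification, and the helper `missing_msstats_columns` is hypothetical.

```r
# Column names as they appear in the example output on this page (assumption:
# this is the complete set expected by downstream processing).
expected_cols <- c(
  "ProteinName", "PeptideSequence", "PrecursorCharge", "FragmentIon",
  "ProductCharge", "IsotopeLabelType", "Condition", "BioReplicate",
  "Run", "Fraction", "Intensity"
)

# Hypothetical helper: return the expected columns that are absent from `df`.
missing_msstats_columns <- function(df) {
  setdiff(expected_cols, colnames(df))
}

# Toy input that lacks the Fraction and Intensity columns.
toy <- data.frame(
  ProteinName = "P1", PeptideSequence = "PEPTIDE_2", PrecursorCharge = NA,
  FragmentIon = "b5_1", ProductCharge = NA, IsotopeLabelType = "L",
  Condition = "A", BioReplicate = 1, Run = "run_1"
)

missing_cols <- missing_msstats_columns(toy)
```

An empty result means every expected column is present.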
diau_frag = system.file("tinytest/raw_data/DIAUmpire/dia_frag.csv", package = "MSstatsConvert")
diau_pept = system.file("tinytest/raw_data/DIAUmpire/dia_pept.csv", package = "MSstatsConvert")
diau_prot = system.file("tinytest/raw_data/DIAUmpire/dia_prot.csv", package = "MSstatsConvert")
annot = system.file("tinytest/annotations/annot_diau.csv", package = "MSstats")
diau_frag = data.table::fread(diau_frag)
diau_pept = data.table::fread(diau_pept)
diau_prot = data.table::fread(diau_prot)
annot = data.table::fread(annot)
diau_frag = diau_frag[, lapply(.SD, function(x) if (is.integer(x)) as.numeric(x) else x)]
# In case numeric columns are not interpreted correctly
diau_imported = DIAUmpiretoMSstatsFormat(diau_frag, diau_pept, diau_prot, annot, use_log_file = FALSE)
#> INFO [2021-07-05 20:05:23] ** Raw data from DIAUmpire imported successfully.
#> INFO [2021-07-05 20:05:23] ** Using selected fragments and peptides.
#> INFO [2021-07-05 20:05:23] ** Extracted the data from selected fragments and/or peptides.
#> INFO [2021-07-05 20:05:23] ** Raw data from DIAUmpire cleaned successfully.
#> INFO [2021-07-05 20:05:23] ** Using provided annotation.
#> INFO [2021-07-05 20:05:23] ** Run labels were standardized to remove symbols such as '.' or '%'.
#> INFO [2021-07-05 20:05:23] ** The following options are used:
#> - Features will be defined by the columns: PeptideSequence, FragmentIon
#> - Shared peptides will be removed.
#> - Proteins with single feature will not be removed.
#> - Features with less than 3 measurements across runs will be removed.
#> INFO [2021-07-05 20:05:23] ** Features with all missing measurements across runs are removed.
#> INFO [2021-07-05 20:05:23] ** Shared peptides are removed.
#> Warning: no non-missing arguments to max; returning -Inf
#> INFO [2021-07-05 20:05:23] ** Multiple measurements in a feature and a run are summarized by summaryforMultipleRows: max
#> INFO [2021-07-05 20:05:23] ** Features with one or two measurements across runs are removed.
#> INFO [2021-07-05 20:05:23] ** Run annotation merged with quantification data.
#> INFO [2021-07-05 20:05:23] ** Features with one or two measurements across runs are removed.
#> INFO [2021-07-05 20:05:23] ** Fractionation handled.
#> INFO [2021-07-05 20:05:23] ** Updated quantification data to make balanced design. Missing values are marked by NA
#> INFO [2021-07-05 20:05:23] ** Finished preprocessing. The dataset is ready to be processed by the dataProcess function.
head(diau_imported)
#>             ProteinName      PeptideSequence PrecursorCharge FragmentIon
#> 1 sp|P11310|ACADM_HUMAN AFTGFIVEADTPGIQIGR_3              NA        b5_1
#> 2 sp|P11310|ACADM_HUMAN AFTGFIVEADTPGIQIGR_3              NA        b5_1
#> 3 sp|P11310|ACADM_HUMAN AFTGFIVEADTPGIQIGR_3              NA        b5_1
#> 4 sp|P11310|ACADM_HUMAN AFTGFIVEADTPGIQIGR_3              NA        b5_1
#> 5 sp|P11310|ACADM_HUMAN AFTGFIVEADTPGIQIGR_3              NA        b5_1
#> 6 sp|P11310|ACADM_HUMAN AFTGFIVEADTPGIQIGR_3              NA        b5_1
#>   ProductCharge IsotopeLabelType Condition BioReplicate                 Run
#> 1            NA                L         A            1 lgillet_I150211_008
#> 2            NA                L         B            2 lgillet_I150211_009
#> 3            NA                L         A            1 lgillet_I150211_010
#> 4            NA                L         B            2 lgillet_I150211_011
#> 5            NA                L         A            1 lgillet_I150211_012
#> 6            NA                L         B            2 lgillet_I150211_013
#>   Fraction Intensity
#> 1        1       299
#> 2        1       254
#> 3        1       138
#> 4        1       241
#> 5        1       255
#> 6        1       127
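
The `max` warning in the log above is base R's standard message when `max` receives no non-missing values. The sketch below reproduces that behaviour in isolation. That the converter reaches this case for feature/run groups without any measured intensity is an assumption, and `safe_max` is a hypothetical helper, not part of MSstats.

```r
# max() on an empty vector returns -Inf and raises a warning.
empty <- numeric(0)
result <- suppressWarnings(max(empty))

# Confirm that a warning is actually raised.
warned <- tryCatch({
  max(empty)
  FALSE
}, warning = function(w) TRUE)

# Hypothetical guard: return NA instead of -Inf when nothing is measured.
safe_max <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) NA_real_ else max(x)
}

all_missing <- safe_max(c(NA_real_, NA_real_))  # NA, no warning
some_values <- safe_max(c(1, 5, NA))            # 5
```

The same reasoning applies to `max(x, na.rm = TRUE)` when every element of `x` is `NA`, since removing the missing values leaves an empty vector.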