Import DIA-Umpire files

DIAUmpiretoMSstatsFormat(
  raw.frag,
  raw.pep,
  raw.pro,
  annotation,
  useSelectedFrag = TRUE,
  useSelectedPep = TRUE,
  removeFewMeasurements = TRUE,
  removeProtein_with1Feature = FALSE,
  summaryforMultipleRows = max,
  use_log_file = TRUE,
  append = FALSE,
  verbose = TRUE,
  log_file_path = NULL,
  ...
)

Arguments

raw.frag	name of FragSummary_date.xls data, which includes feature-level data.
raw.pep	name of PeptideSummary_date.xls data, which includes selected fragments information.
raw.pro	name of ProteinSummary_date.xls data, which includes selected peptides information.
annotation	name of annotation data which includes Condition, BioReplicate, Run information.
useSelectedFrag	TRUE will use the selected fragment for each peptide. 'Selected_fragments' column is required.
useSelectedPep	TRUE will use the selected peptide for each protein. 'Selected_peptides' column is required.
removeFewMeasurements	TRUE (default) will remove features that have only 1 or 2 measurements across runs.
removeProtein_with1Feature	TRUE will remove proteins that have only 1 feature, where a feature is the combination of peptide, precursor charge, fragment and fragment charge. FALSE is the default.
summaryforMultipleRows	max (default) or sum. When a feature has multiple measurements in a run, use either the highest intensity (max) or the sum of the intensities (sum).
use_log_file	logical. If TRUE, information about data processing will be saved to a file.
append	logical. If TRUE, information about data processing will be added to an existing log file.
verbose	logical. If TRUE, information about data processing will be printed to the console.
log_file_path	character. Path to a file to which information about data processing will be saved. If not provided, such a file will be created automatically. If `append = TRUE`, it has to be a valid path to a file.
...	additional parameters to `data.table::fread`.
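
The effect of `summaryforMultipleRows` and `removeFewMeasurements` can be illustrated on a toy table. The base R sketch below is not the package's internal code: the data frame `toy`, its values, and the helper `collapse_rows` are made up for illustration. It collapses duplicated feature/run rows with `max` or `sum`, then drops features measured in fewer than three runs.

```r
# Hypothetical toy data: feature "f1" has two rows for run "r1"
toy <- data.frame(
  Feature   = c("f1", "f1", "f1", "f1", "f2", "f2"),
  Run       = c("r1", "r1", "r2", "r3", "r1", "r2"),
  Intensity = c(100, 150, 200, 300, 50, 60)
)

# Collapse multiple rows per feature and run
# (an analogue of summaryforMultipleRows, not the package implementation)
collapse_rows <- function(data, fun = max) {
  aggregate(Intensity ~ Feature + Run, data = data, FUN = fun)
}
by_max <- collapse_rows(toy, max)  # f1 in r1 keeps the larger value
by_sum <- collapse_rows(toy, sum)  # f1 in r1 keeps the total

# Keep features with at least 3 measurements across runs
# (an analogue of removeFewMeasurements = TRUE)
n_runs <- table(by_max$Feature)
kept <- by_max[by_max$Feature %in% names(n_runs)[n_runs >= 3], ]
```

In this toy case `f2` is measured in only two runs, so it is dropped, while `f1` is retained with one row per run.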

Value

data.frame in the MSstats required format.
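
As a rough way to check a table against this format, the sketch below compares column names with those shown in the `head()` output in the Examples section. The helper `missing_msstats_cols` and the one-row table are hypothetical and are not part of MSstatsConvert; the column list is assumed from that printed output rather than from a formal specification.

```r
# Column names taken from the head() output in the Examples section
msstats_cols <- c("ProteinName", "PeptideSequence", "PrecursorCharge",
                  "FragmentIon", "ProductCharge", "IsotopeLabelType",
                  "Condition", "BioReplicate", "Run", "Fraction", "Intensity")

# Hypothetical helper: returns the expected columns that are missing from `x`
missing_msstats_cols <- function(x) {
  setdiff(msstats_cols, colnames(x))
}

# Made-up one-row table that lacks the Fraction and Intensity columns
example_row <- data.frame(
  ProteinName = "sp|P11310|ACADM_HUMAN",
  PeptideSequence = "AFTGFIVEADTPGIQIGR_3",
  PrecursorCharge = NA, FragmentIon = "b5_1", ProductCharge = NA,
  IsotopeLabelType = "L", Condition = "A", BioReplicate = 1,
  Run = "lgillet_I150211_008"
)
missing_msstats_cols(example_row)
```

An empty result from the helper would indicate that all of the listed columns are present.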

Examples

diau_frag = system.file("tinytest/raw_data/DIAUmpire/dia_frag.csv",
                             package = "MSstatsConvert")
diau_pept = system.file("tinytest/raw_data/DIAUmpire/dia_pept.csv",
                             package = "MSstatsConvert")
diau_prot = system.file("tinytest/raw_data/DIAUmpire/dia_prot.csv",
                             package = "MSstatsConvert")
annot = system.file("tinytest/annotations/annot_diau.csv",
                    package = "MSstats")
diau_frag = data.table::fread(diau_frag)
diau_pept = data.table::fread(diau_pept)
diau_prot = data.table::fread(diau_prot)
annot = data.table::fread(annot)
# Convert integer columns to numeric in case numeric columns are not interpreted correctly
diau_frag = diau_frag[, lapply(.SD, function(x) if (is.integer(x)) as.numeric(x) else x)]

diau_imported = DIAUmpiretoMSstatsFormat(diau_frag, diau_pept, diau_prot,
                                         annot, use_log_file = FALSE)
#> INFO  [2021-07-05 20:05:23] ** Raw data from DIAUmpire imported successfully.
#> INFO  [2021-07-05 20:05:23] ** Using selected fragments and peptides.
#> INFO  [2021-07-05 20:05:23] ** Extracted the data from selected fragments and/or peptides.
#> INFO  [2021-07-05 20:05:23] ** Raw data from DIAUmpire cleaned successfully.
#> INFO  [2021-07-05 20:05:23] ** Using provided annotation.
#> INFO  [2021-07-05 20:05:23] ** Run labels were standardized to remove symbols such as '.' or '%'.
#> INFO  [2021-07-05 20:05:23] ** The following options are used:
#>   - Features will be defined by the columns: PeptideSequence, FragmentIon
#>   - Shared peptides will be removed.
#>   - Proteins with single feature will not be removed.
#>   - Features with less than 3 measurements across runs will be removed.
#> INFO  [2021-07-05 20:05:23] ** Features with all missing measurements across runs are removed.
#> INFO  [2021-07-05 20:05:23] ** Shared peptides are removed.
#> Warning: no non-missing arguments to max; returning -Inf
#> (the warning above is emitted 33 times in total)
#> INFO  [2021-07-05 20:05:23] ** Multiple measurements in a feature and a run are summarized by summaryforMultipleRows: max
#> INFO  [2021-07-05 20:05:23] ** Features with one or two measurements across runs are removed.
#> INFO  [2021-07-05 20:05:23] ** Run annotation merged with quantification data.
#> INFO  [2021-07-05 20:05:23] ** Features with one or two measurements across runs are removed.
#> INFO  [2021-07-05 20:05:23] ** Fractionation handled.
#> INFO  [2021-07-05 20:05:23] ** Updated quantification data to make balanced design. Missing values are marked by NA
#> INFO  [2021-07-05 20:05:23] ** Finished preprocessing. The dataset is ready to be processed by the dataProcess function.
head(diau_imported)
#>             ProteinName      PeptideSequence PrecursorCharge FragmentIon
#> 1 sp|P11310|ACADM_HUMAN AFTGFIVEADTPGIQIGR_3              NA        b5_1
#> 2 sp|P11310|ACADM_HUMAN AFTGFIVEADTPGIQIGR_3              NA        b5_1
#> 3 sp|P11310|ACADM_HUMAN AFTGFIVEADTPGIQIGR_3              NA        b5_1
#> 4 sp|P11310|ACADM_HUMAN AFTGFIVEADTPGIQIGR_3              NA        b5_1
#> 5 sp|P11310|ACADM_HUMAN AFTGFIVEADTPGIQIGR_3              NA        b5_1
#> 6 sp|P11310|ACADM_HUMAN AFTGFIVEADTPGIQIGR_3              NA        b5_1
#>   ProductCharge IsotopeLabelType Condition BioReplicate                 Run
#> 1            NA                L         A            1 lgillet_I150211_008
#> 2            NA                L         B            2 lgillet_I150211_009
#> 3            NA                L         A            1 lgillet_I150211_010
#> 4            NA                L         B            2 lgillet_I150211_011
#> 5            NA                L         A            1 lgillet_I150211_012
#> 6            NA                L         B            2 lgillet_I150211_013
#>   Fraction Intensity
#> 1        1       299
#> 2        1       254
#> 3        1       138
#> 4        1       241
#> 5        1       255
#> 6        1       127
