OpenSWATHtoMSstatsFormat.Rd
Import OpenSWATH files
OpenSWATHtoMSstatsFormat(
  input,
  annotation,
  filter_with_mscore = TRUE,
  mscore_cutoff = 0.01,
  useUniquePeptide = TRUE,
  removeFewMeasurements = TRUE,
  removeProtein_with1Feature = FALSE,
  summaryforMultipleRows = max,
  use_log_file = TRUE,
  append = FALSE,
  verbose = TRUE,
  log_file_path = NULL,
  ...
)
Argument | Description |
---|---|
input | name of MSstats input report from OpenSWATH, which includes feature-level data. |
annotation | name of 'annotation.txt' data which includes Condition, BioReplicate, Run. Run should be the same as filename. |
filter_with_mscore | TRUE (default) removes the features whose value in the m_score column is greater than mscore_cutoff. |
mscore_cutoff | Cutoff for m_score. Default is 0.01. |
useUniquePeptide | TRUE (default) removes peptides that are assigned to more than one protein. Each peptide is assumed to be unique to a single protein. |
removeFewMeasurements | TRUE (default) will remove the features that have 1 or 2 measurements across runs. |
removeProtein_with1Feature | TRUE removes proteins that have only one feature, where a feature is the combination of peptide, precursor charge, fragment and charge. FALSE is the default. |
summaryforMultipleRows | max (default) or sum - when there are multiple measurements for a given feature in a given run, use the highest intensity or the sum of the intensities. |
use_log_file | logical. If TRUE, information about data processing will be saved to a file. |
append | logical. If TRUE, information about data processing will be added to an existing log file. |
verbose | logical. If TRUE, information about data processing will be printed to the console. |
log_file_path | character. Path to a file to which information about data processing will be saved. If not provided, such a file will be created automatically. If `append = TRUE`, has to be a valid path to a file. |
... | additional parameters to `data.table::fread`. |
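To make the filtering arguments concrete, here is a minimal base R sketch of the semantics described in the table, applied to a toy table. It is not the MSstatsConvert implementation: the column names `feature`, `run`, `m_score` and `intensity`, the toy values, and the order of the steps are assumptions made for illustration only.

```r
# Toy feature-level table (hypothetical values, not real OpenSWATH output).
toy <- data.frame(
  feature   = c("f1", "f1", "f1", "f1", "f2", "f2", "f3", "f3", "f3"),
  run       = c("r1", "r1", "r2", "r3", "r1", "r2", "r1", "r2", "r3"),
  m_score   = c(0.001, 0.002, 0.001, 0.005, 0.001, 0.002, 0.001, 0.5, 0.002),
  intensity = c(100, 150, 200, 300, 50, 60, 10, 20, 30),
  stringsAsFactors = FALSE
)

mscore_cutoff <- 0.01

# filter_with_mscore = TRUE: drop rows whose m_score exceeds the cutoff.
kept <- toy[toy$m_score <= mscore_cutoff, ]

# summaryforMultipleRows = max: keep one intensity per feature and run.
summarized <- aggregate(intensity ~ feature + run, data = kept, FUN = max)

# removeFewMeasurements = TRUE: drop features measured in only 1 or 2 runs.
n_runs <- table(summarized$feature)
result <- summarized[summarized$feature %in% names(n_runs[n_runs >= 3]), ]
result <- result[order(result$feature, result$run), ]
print(result)
```

In this toy case only `f1` survives: `f2` has two runs to begin with, and `f3` drops to two runs once its high-m_score row is filtered out. Swapping `FUN = max` for `FUN = sum` mirrors the other documented choice for `summaryforMultipleRows`.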
data.frame in the MSstats required format.
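As a quick sanity check on a returned table, the sketch below tests whether a data.frame carries the columns that appear in the example output on this page. The helper `missing_msstats_columns` is hypothetical (it is not part of MSstatsConvert), and the column list is taken from that printed output rather than from a formal specification.

```r
# Column names as printed by head(os_imported) in the example on this page.
expected_cols <- c(
  "ProteinName", "PeptideSequence", "PrecursorCharge", "FragmentIon",
  "ProductCharge", "IsotopeLabelType", "Condition", "BioReplicate",
  "Run", "Fraction", "Intensity"
)

# Hypothetical helper: returns the expected columns that are missing from x.
missing_msstats_columns <- function(x, expected = expected_cols) {
  setdiff(expected, colnames(x))
}

# Toy one-row table with made-up values, used only to exercise the helper.
toy_output <- data.frame(
  ProteinName = "P1", PeptideSequence = "IAVAVLEETTR", PrecursorCharge = 2,
  FragmentIon = "y4", ProductCharge = NA, IsotopeLabelType = "L",
  Condition = "A", BioReplicate = 1, Run = "run1", Fraction = 1,
  Intensity = 2360.5
)

missing_msstats_columns(toy_output)         # character(0)
missing_msstats_columns(toy_output[, -11])  # "Intensity"
```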
os_raw = system.file("tinytest/raw_data/OpenSWATH/openswath_input.csv",
                     package = "MSstatsConvert")
annot = system.file("tinytest/annotations/annot_os.csv",
                    package = "MSstats")
os_raw = data.table::fread(os_raw)
annot = data.table::fread(annot)
os_imported = OpenSWATHtoMSstatsFormat(os_raw, annot, use_log_file = FALSE)
#> INFO  [2021-07-05 20:05:31] ** Raw data from OpenSWATH imported successfully.
#> INFO  [2021-07-05 20:05:31] ** Raw data from OpenSWATH cleaned successfully.
#> INFO  [2021-07-05 20:05:31] ** Using provided annotation.
#> INFO  [2021-07-05 20:05:31] ** Run labels were standardized to remove symbols such as '.' or '%'.
#> INFO  [2021-07-05 20:05:31] ** The following options are used:
#>   - Features will be defined by the columns: PeptideSequence, PrecursorCharge, FragmentIon
#>   - Shared peptides will be removed.
#>   - Proteins with single feature will not be removed.
#>   - Features with less than 3 measurements across runs will be removed.
#> INFO  [2021-07-05 20:05:31] ** Rows with values of decoy equal to 1 are removed
#> INFO  [2021-07-05 20:05:31] ** Rows with values smaller than 0.01 in m_score are removed
#> INFO  [2021-07-05 20:05:31] ** Features with all missing measurements across runs are removed.
#> INFO  [2021-07-05 20:05:31] ** Shared peptides are removed.
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> INFO  [2021-07-05 20:05:31] ** Multiple measurements in a feature and a run are summarized by summaryforMultipleRows: max
#> INFO  [2021-07-05 20:05:31] ** Features with one or two measurements across runs are removed.
#> INFO  [2021-07-05 20:05:31] ** Run annotation merged with quantification data.
#> INFO  [2021-07-05 20:05:31] ** Features with one or two measurements across runs are removed.
#> INFO  [2021-07-05 20:05:31] ** Fractionation handled.
#> INFO  [2021-07-05 20:05:31] ** Updated quantification data to make balanced design. Missing values are marked by NA
#> INFO  [2021-07-05 20:05:31] ** Finished preprocessing. The dataset is ready to be processed by the dataProcess function.
head(os_imported)
#>                                                                                                      ProteinName
#> 1 5/sp|P0DJD0|RGPD1_HUMAN/sp|Q7Z3J3|RGPD4_HUMAN/sp|A6NKT7|RGPD3_HUMAN/sp|P0DJD1|RGPD2_HUMAN/sp|P49792|RBP2_HUMAN
#> 2 5/sp|P0DJD0|RGPD1_HUMAN/sp|Q7Z3J3|RGPD4_HUMAN/sp|A6NKT7|RGPD3_HUMAN/sp|P0DJD1|RGPD2_HUMAN/sp|P49792|RBP2_HUMAN
#> 3 5/sp|P0DJD0|RGPD1_HUMAN/sp|Q7Z3J3|RGPD4_HUMAN/sp|A6NKT7|RGPD3_HUMAN/sp|P0DJD1|RGPD2_HUMAN/sp|P49792|RBP2_HUMAN
#> 4 5/sp|P0DJD0|RGPD1_HUMAN/sp|Q7Z3J3|RGPD4_HUMAN/sp|A6NKT7|RGPD3_HUMAN/sp|P0DJD1|RGPD2_HUMAN/sp|P49792|RBP2_HUMAN
#> 5 5/sp|P0DJD0|RGPD1_HUMAN/sp|Q7Z3J3|RGPD4_HUMAN/sp|A6NKT7|RGPD3_HUMAN/sp|P0DJD1|RGPD2_HUMAN/sp|P49792|RBP2_HUMAN
#> 6 5/sp|P0DJD0|RGPD1_HUMAN/sp|Q7Z3J3|RGPD4_HUMAN/sp|A6NKT7|RGPD3_HUMAN/sp|P0DJD1|RGPD2_HUMAN/sp|P49792|RBP2_HUMAN
#>   PeptideSequence PrecursorCharge              FragmentIon ProductCharge
#> 1     IAVAVLEETTR               2 75349_y4_1_IAVAVLEETTR_2          <NA>
#> 2     IAVAVLEETTR               2 75349_y4_1_IAVAVLEETTR_2          <NA>
#> 3     IAVAVLEETTR               2 75349_y4_1_IAVAVLEETTR_2          <NA>
#> 4     IAVAVLEETTR               2 75349_y4_1_IAVAVLEETTR_2          <NA>
#> 5     IAVAVLEETTR               2 75349_y4_1_IAVAVLEETTR_2          <NA>
#> 6     IAVAVLEETTR               2 75349_y4_1_IAVAVLEETTR_2          <NA>
#>   IsotopeLabelType Condition BioReplicate
#> 1                L         B            2
#> 2                L         A            1
#> 3                L         B            2
#> 4                L         B            2
#> 5                L         A            1
#> 6                L         A            1
#>                                               Run Fraction Intensity
#> 1 scratch94460284tmpdirlgillet_I150211_009mzXMLgz        1    2881.5
#> 2 scratch94460285tmpdirlgillet_I150211_008mzXMLgz        1    2360.5
#> 3 scratch94460286tmpdirlgillet_I150211_013mzXMLgz        1     383.0
#> 4 scratch94460287tmpdirlgillet_I150211_011mzXMLgz        1    2506.5
#> 5 scratch94460288tmpdirlgillet_I150211_010mzXMLgz        1    2639.0
#> 6 scratch94460289tmpdirlgillet_I150211_012mzXMLgz        1    2730.0
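The log above reports that the quantification data were updated "to make balanced design", with missing values marked by NA. One plausible reading of that step, sketched here in base R on a toy table, is that every feature ends up with one row per run and any absent combination receives an `NA` intensity. The column names and values are assumptions, and this is not the package's own code.

```r
# Toy long-format table: feature f2 was not measured in run r2.
obs <- data.frame(
  feature   = c("f1", "f1", "f2"),
  run       = c("r1", "r2", "r1"),
  intensity = c(10, 20, 30),
  stringsAsFactors = FALSE
)

# Enumerate every feature x run combination.
grid <- expand.grid(
  feature = unique(obs$feature),
  run     = unique(obs$run),
  stringsAsFactors = FALSE
)

# Left join the observations onto the grid; unmatched rows get NA intensity.
balanced <- merge(grid, obs, by = c("feature", "run"), all.x = TRUE)
balanced <- balanced[order(balanced$feature, balanced$run), ]
print(balanced)
```

The toy result has four rows, with the `f2`/`r2` combination present but carrying `NA` in `intensity`.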