Planning future experimental designs of Selected Reaction Monitoring (SRM), Data-Dependent Acquisition (DDA or shotgun), and Data-Independent Acquisition (DIA or SWATH-MS) experiments in sample size calculation

Calculate sample size for future experiments of a Selected Reaction Monitoring (SRM), Data-Dependent Acquisition (DDA or shotgun), and Data-Independent Acquisition (DIA or SWATH-MS) experiment based on intensity-based linear model. Two options of the calculation: (1) number of biological replicates per condition, (2) power.

designSampleSize(
  data,
  desiredFC,
  FDR = 0.05,
  numSample = TRUE,
  power = 0.9,
  use_log_file = TRUE,
  append = FALSE,
  verbose = TRUE,
  log_file_path = NULL
)

Arguments

data	'FittedModel' in testing output from function groupComparison.
desiredFC	the range of a desired fold change which includes the lower and upper values of the desired fold change.
FDR	a pre-specified false discovery ratio (FDR) to control the overall false positive rate. Default is 0.05
numSample	minimal number of biological replicates per condition. TRUE represents you require to calculate the sample size for this category, else you should input the exact number of biological replicates.
power	a pre-specified statistical power which defined as the probability of detecting a true fold change. TRUE represent you require to calculate the power for this category, else you should input the average of power you expect. Default is 0.9
use_log_file	logical. If TRUE, information about data processing will be saved to a file.
append	logical. If TRUE, information about data processing will be added to an existing log file.
verbose	logical. If TRUE, information about data processing wil be printed to the console.
log_file_path	character. Path to a file to which information about data processing will be saved. If not provided, such a file will be created automatically. If `append = TRUE`, has to be a valid path to a file.

Value

data.frame - sample size calculation results including varibles: desiredFC, numSample, FDR, and power.

Details

The function fits the model and uses variance components to calculate sample size. The underlying model fitting with intensity-based linear model with technical MS run replication. Estimated sample size is rounded to 0 decimal. The function can only obtain either one of the categories of the sample size calculation (numSample, numPep, numTran, power) at the same time.

Examples

# Consider quantitative data (i.e. QuantData) from yeast study.
# A time course study with ten time points of interests and three biological replicates.
QuantData <- dataProcess(SRMRawData)
#> INFO  [2021-07-05 20:05:40] ** Features with one or two measurements across runs are removed.
#> INFO  [2021-07-05 20:05:40] ** Fractionation handled.
#> INFO  [2021-07-05 20:05:40] ** Updated quantification data to make balanced design. Missing values are marked by NA
#> INFO  [2021-07-05 20:05:40] ** Log2 intensities under cutoff = 3.776  were considered as censored missing values.
#> INFO  [2021-07-05 20:05:40] ** Log2 intensities = NA were considered as censored missing values.
#> INFO  [2021-07-05 20:05:40] ** Use all features that the dataset originally has.
#> INFO  [2021-07-05 20:05:40] 
#>  # proteins: 2
#>  # peptides per protein: 2-2
#>  # features per peptide: 3-3
#> INFO  [2021-07-05 20:05:40] 
#>                     1 2 3 4 5 6 7 8 9 10
#>              # runs 3 3 3 3 3 3 3 3 3  3
#>     # bioreplicates 3 3 3 3 3 3 3 3 3  3
#>  # tech. replicates 1 1 1 1 1 1 1 1 1  1
#> INFO  [2021-07-05 20:05:40]  == Start the summarization per subplot...
#> 
  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |===================================                                   |  50%
  |                                                                            
  |======================================================================| 100%
#> INFO  [2021-07-05 20:05:40]  == Summarization is done.
head(QuantData$FeatureLevelData)
#>   PROTEIN         PEPTIDE TRANSITION               FEATURE LABEL GROUP RUN
#> 1    IDHC ATDVIVPEEGELR_2      y7_NA ATDVIVPEEGELR_2_y7_NA     H     0   1
#> 2    IDHC ATDVIVPEEGELR_2      y7_NA ATDVIVPEEGELR_2_y7_NA     L     1   1
#> 3    IDHC ATDVIVPEEGELR_2      y7_NA ATDVIVPEEGELR_2_y7_NA     H     0   2
#> 4    IDHC ATDVIVPEEGELR_2      y7_NA ATDVIVPEEGELR_2_y7_NA     L     1   2
#> 5    IDHC ATDVIVPEEGELR_2      y7_NA ATDVIVPEEGELR_2_y7_NA     H     0   3
#> 6    IDHC ATDVIVPEEGELR_2      y7_NA ATDVIVPEEGELR_2_y7_NA     L     1   3
#>   SUBJECT FRACTION originalRUN censored  INTENSITY ABUNDANCE newABUNDANCE
#> 1       0        1           1    FALSE 84361.0835 15.855859    15.855859
#> 2       1        1           1    FALSE   215.1353  7.240669     7.240669
#> 3       0        1           2    FALSE 62109.5876 15.801179    15.801179
#> 4       2        1           2    FALSE  1205.2252 10.113738    10.113738
#> 5       0        1           3    FALSE 65114.3646 15.755022    15.755022
#> 6       3        1           3    FALSE  1476.3046 10.292109    10.292109
#>   predicted
#> 1        NA
#> 2        NA
#> 3        NA
#> 4        NA
#> 5        NA
#> 6        NA
## based on multiple comparisons  (T1 vs T3; T1 vs T7; T1 vs T9)
comparison1<-matrix(c(-1,0,1,0,0,0,0,0,0,0),nrow=1)
comparison2<-matrix(c(-1,0,0,0,0,0,1,0,0,0),nrow=1)
comparison3<-matrix(c(-1,0,0,0,0,0,0,0,1,0),nrow=1)
comparison<-rbind(comparison1,comparison2, comparison3)
row.names(comparison)<-c("T3-T1","T7-T1","T9-T1")
colnames(comparison)<-unique(QuantData$ProteinLevelData$GROUP)

testResultMultiComparisons<-groupComparison(contrast.matrix=comparison,data=QuantData)
#> INFO  [2021-07-05 20:05:40]  == Start to test and get inference in whole plot ...
#> 
  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |===================================                                   |  50%
  |                                                                            
  |======================================================================| 100%
#> INFO  [2021-07-05 20:05:41]  == Comparisons for all proteins are done.

## Calculate sample size for future experiments:
#(1) Minimal number of biological replicates per condition
designSampleSize(data=testResultMultiComparisons$FittedModel, numSample=TRUE,
                 desiredFC=c(1.25,1.75), FDR=0.05, power=0.8)
#>    desiredFC numSample  FDR power    CV
#> 1      1.250        31 0.05   0.8 0.004
#> 2      1.275        26 0.05   0.8 0.005
#> 3      1.300        23 0.05   0.8 0.006
#> 4      1.325        20 0.05   0.8 0.006
#> 5      1.350        17 0.05   0.8 0.007
#> 6      1.375        15 0.05   0.8 0.008
#> 7      1.400        14 0.05   0.8 0.009
#> 8      1.425        12 0.05   0.8 0.010
#> 9      1.450        11 0.05   0.8 0.011
#> 10     1.475        10 0.05   0.8 0.012
#> 11     1.500         9 0.05   0.8 0.013
#> 12     1.525         9 0.05   0.8 0.012
#> 13     1.550         8 0.05   0.8 0.014
#> 14     1.575         8 0.05   0.8 0.013
#> 15     1.600         7 0.05   0.8 0.015
#> 16     1.625         7 0.05   0.8 0.015
#> 17     1.650         6 0.05   0.8 0.017
#> 18     1.675         6 0.05   0.8 0.017
#> 19     1.700         6 0.05   0.8 0.017
#> 20     1.725         5 0.05   0.8 0.020
#> 21     1.750         5 0.05   0.8 0.019
#(2) Power calculation
designSampleSize(data=testResultMultiComparisons$FittedModel, numSample=2,
                 desiredFC=c(1.25,1.75), FDR=0.05, power=TRUE)
#>    desiredFC numSample  FDR power    CV
#> 1      1.250         2 0.05  0.01 0.068
#> 2      1.275         2 0.05  0.01 0.067
#> 3      1.300         2 0.05  0.01 0.065
#> 4      1.325         2 0.05  0.01 0.064
#> 5      1.350         2 0.05  0.01 0.063
#> 6      1.375         2 0.05  0.01 0.062
#> 7      1.400         2 0.05  0.01 0.061
#> 8      1.425         2 0.05  0.01 0.060
#> 9      1.450         2 0.05  0.01 0.059
#> 10     1.475         2 0.05  0.01 0.058
#> 11     1.500         2 0.05  0.01 0.057
#> 12     1.525         2 0.05  0.01 0.056
#> 13     1.550         2 0.05  0.01 0.055
#> 14     1.575         2 0.05  0.01 0.054
#> 15     1.600         2 0.05  0.02 0.053
#> 16     1.625         2 0.05  0.03 0.052
#> 17     1.650         2 0.05  0.04 0.051
#> 18     1.675         2 0.05  0.05 0.051
#> 19     1.700         2 0.05  0.06 0.050
#> 20     1.725         2 0.05  0.08 0.049
#> 21     1.750         2 0.05  0.10 0.048

Planning future experimental designs of Selected Reaction Monitoring (SRM), Data-Dependent Acquisition (DDA or shotgun), and Data-Independent Acquisition (DIA or SWATH-MS) experiments in sample size calculation

Arguments

Value

Details

Examples

Contents

Author