designSampleSize.Rd
Calculate sample size for future experiments of a Selected Reaction Monitoring (SRM), Data-Dependent Acquisition (DDA or shotgun), and Data-Independent Acquisition (DIA or SWATH-MS) experiment based on intensity-based linear model. Two options of the calculation: (1) number of biological replicates per condition, (2) power.
designSampleSize( data, desiredFC, FDR = 0.05, numSample = TRUE, power = 0.9, use_log_file = TRUE, append = FALSE, verbose = TRUE, log_file_path = NULL )
data | 'FittedModel' in testing output from function groupComparison. |
---|---|
desiredFC | the range of a desired fold change which includes the lower and upper values of the desired fold change. |
FDR | a pre-specified false discovery ratio (FDR) to control the overall false positive rate. Default is 0.05 |
numSample | minimal number of biological replicates per condition. TRUE represents you require to calculate the sample size for this category, else you should input the exact number of biological replicates. |
power | a pre-specified statistical power which defined as the probability of detecting a true fold change. TRUE represent you require to calculate the power for this category, else you should input the average of power you expect. Default is 0.9 |
use_log_file | logical. If TRUE, information about data processing will be saved to a file. |
append | logical. If TRUE, information about data processing will be added to an existing log file. |
verbose | logical. If TRUE, information about data processing wil be printed to the console. |
log_file_path | character. Path to a file to which information about data processing will be saved. If not provided, such a file will be created automatically. If `append = TRUE`, has to be a valid path to a file. |
data.frame - sample size calculation results including varibles: desiredFC, numSample, FDR, and power.
The function fits the model and uses variance components to calculate sample size. The underlying model fitting with intensity-based linear model with technical MS run replication. Estimated sample size is rounded to 0 decimal. The function can only obtain either one of the categories of the sample size calculation (numSample, numPep, numTran, power) at the same time.
# Consider quantitative data (i.e. QuantData) from yeast study. # A time course study with ten time points of interests and three biological replicates. QuantData <- dataProcess(SRMRawData)#> INFO [2021-07-05 20:05:40] ** Features with one or two measurements across runs are removed. #> INFO [2021-07-05 20:05:40] ** Fractionation handled. #> INFO [2021-07-05 20:05:40] ** Updated quantification data to make balanced design. Missing values are marked by NA #> INFO [2021-07-05 20:05:40] ** Log2 intensities under cutoff = 3.776 were considered as censored missing values. #> INFO [2021-07-05 20:05:40] ** Log2 intensities = NA were considered as censored missing values. #> INFO [2021-07-05 20:05:40] ** Use all features that the dataset originally has. #> INFO [2021-07-05 20:05:40] #> # proteins: 2 #> # peptides per protein: 2-2 #> # features per peptide: 3-3 #> INFO [2021-07-05 20:05:40] #> 1 2 3 4 5 6 7 8 9 10 #> # runs 3 3 3 3 3 3 3 3 3 3 #> # bioreplicates 3 3 3 3 3 3 3 3 3 3 #> # tech. replicates 1 1 1 1 1 1 1 1 1 1 #> INFO [2021-07-05 20:05:40] == Start the summarization per subplot... #> | | | 0% | |=================================== | 50% | |======================================================================| 100% #> INFO [2021-07-05 20:05:40] == Summarization is done.#> PROTEIN PEPTIDE TRANSITION FEATURE LABEL GROUP RUN #> 1 IDHC ATDVIVPEEGELR_2 y7_NA ATDVIVPEEGELR_2_y7_NA H 0 1 #> 2 IDHC ATDVIVPEEGELR_2 y7_NA ATDVIVPEEGELR_2_y7_NA L 1 1 #> 3 IDHC ATDVIVPEEGELR_2 y7_NA ATDVIVPEEGELR_2_y7_NA H 0 2 #> 4 IDHC ATDVIVPEEGELR_2 y7_NA ATDVIVPEEGELR_2_y7_NA L 1 2 #> 5 IDHC ATDVIVPEEGELR_2 y7_NA ATDVIVPEEGELR_2_y7_NA H 0 3 #> 6 IDHC ATDVIVPEEGELR_2 y7_NA ATDVIVPEEGELR_2_y7_NA L 1 3 #> SUBJECT FRACTION originalRUN censored INTENSITY ABUNDANCE newABUNDANCE #> 1 0 1 1 FALSE 84361.0835 15.855859 15.855859 #> 2 1 1 1 FALSE 215.1353 7.240669 7.240669 #> 3 0 1 2 FALSE 62109.5876 15.801179 15.801179 #> 4 2 1 2 FALSE 1205.2252 10.113738 10.113738 #> 5 0 1 3 FALSE 65114.3646 15.755022 15.755022 #> 6 3 1 3 FALSE 1476.3046 10.292109 10.292109 #> predicted #> 1 NA #> 2 NA #> 3 NA #> 4 NA #> 5 NA #> 6 NA## based on multiple comparisons (T1 vs T3; T1 vs T7; T1 vs T9) comparison1<-matrix(c(-1,0,1,0,0,0,0,0,0,0),nrow=1) comparison2<-matrix(c(-1,0,0,0,0,0,1,0,0,0),nrow=1) comparison3<-matrix(c(-1,0,0,0,0,0,0,0,1,0),nrow=1) comparison<-rbind(comparison1,comparison2, comparison3) row.names(comparison)<-c("T3-T1","T7-T1","T9-T1") colnames(comparison)<-unique(QuantData$ProteinLevelData$GROUP) testResultMultiComparisons<-groupComparison(contrast.matrix=comparison,data=QuantData)#> INFO [2021-07-05 20:05:40] == Start to test and get inference in whole plot ... #> | | | 0% | |=================================== | 50% | |======================================================================| 100% #> INFO [2021-07-05 20:05:41] == Comparisons for all proteins are done.## Calculate sample size for future experiments: #(1) Minimal number of biological replicates per condition designSampleSize(data=testResultMultiComparisons$FittedModel, numSample=TRUE, desiredFC=c(1.25,1.75), FDR=0.05, power=0.8)#> desiredFC numSample FDR power CV #> 1 1.250 31 0.05 0.8 0.004 #> 2 1.275 26 0.05 0.8 0.005 #> 3 1.300 23 0.05 0.8 0.006 #> 4 1.325 20 0.05 0.8 0.006 #> 5 1.350 17 0.05 0.8 0.007 #> 6 1.375 15 0.05 0.8 0.008 #> 7 1.400 14 0.05 0.8 0.009 #> 8 1.425 12 0.05 0.8 0.010 #> 9 1.450 11 0.05 0.8 0.011 #> 10 1.475 10 0.05 0.8 0.012 #> 11 1.500 9 0.05 0.8 0.013 #> 12 1.525 9 0.05 0.8 0.012 #> 13 1.550 8 0.05 0.8 0.014 #> 14 1.575 8 0.05 0.8 0.013 #> 15 1.600 7 0.05 0.8 0.015 #> 16 1.625 7 0.05 0.8 0.015 #> 17 1.650 6 0.05 0.8 0.017 #> 18 1.675 6 0.05 0.8 0.017 #> 19 1.700 6 0.05 0.8 0.017 #> 20 1.725 5 0.05 0.8 0.020 #> 21 1.750 5 0.05 0.8 0.019#(2) Power calculation designSampleSize(data=testResultMultiComparisons$FittedModel, numSample=2, desiredFC=c(1.25,1.75), FDR=0.05, power=TRUE)#> desiredFC numSample FDR power CV #> 1 1.250 2 0.05 0.01 0.068 #> 2 1.275 2 0.05 0.01 0.067 #> 3 1.300 2 0.05 0.01 0.065 #> 4 1.325 2 0.05 0.01 0.064 #> 5 1.350 2 0.05 0.01 0.063 #> 6 1.375 2 0.05 0.01 0.062 #> 7 1.400 2 0.05 0.01 0.061 #> 8 1.425 2 0.05 0.01 0.060 #> 9 1.450 2 0.05 0.01 0.059 #> 10 1.475 2 0.05 0.01 0.058 #> 11 1.500 2 0.05 0.01 0.057 #> 12 1.525 2 0.05 0.01 0.056 #> 13 1.550 2 0.05 0.01 0.055 #> 14 1.575 2 0.05 0.01 0.054 #> 15 1.600 2 0.05 0.02 0.053 #> 16 1.625 2 0.05 0.03 0.052 #> 17 1.650 2 0.05 0.04 0.051 #> 18 1.675 2 0.05 0.05 0.051 #> 19 1.700 2 0.05 0.06 0.050 #> 20 1.725 2 0.05 0.08 0.049 #> 21 1.750 2 0.05 0.10 0.048