modelBasedQCPlots.Rd
To check the assumptions of the linear model used for whole plot inference,
modelBasedQCPlots takes the fitted-model results from function groupComparison
as input and automatically generates two types of figures, saved as pdf files:
(1) normal quantile-quantile plots (specify "QQPlots" in option type), for checking
that the errors are normally distributed;
(2) residual plots (specify "ResidualPlots" in option type), for checking that the errors have constant variance.
modelBasedQCPlots( data, type, axis.size = 10, dot.size = 3, width = 10, height = 10, which.Protein = "all", address = "" )
Argument | Description |
---|---|
data | output from function groupComparison. |
type | choice of visualization. "QQPlots" represents normal quantile-quantile plot for each protein after fitting models. "ResidualPlots" represents a plot of residuals versus fitted values for each protein in the dataset. |
axis.size | size of axes labels. Default is 10. |
dot.size | size of points in the graph for residual plots and QQ plots. Default is 3. |
width | width of the saved file. Default is 10. |
height | height of the saved file. Default is 10. |
which.Protein | Proteins to plot. Either protein names or order numbers taken from levels(testResultOneComparison$ComparisonResult$Protein). Default is "all", which generates plots for every protein. |
address | name that will serve as a prefix to the name of the output file. |
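The two forms accepted by which.Protein (names or order numbers) can be illustrated with base R alone. The sketch below is not MSstats code: resolve_proteins is a hypothetical helper, and the factor is made up for illustration (IDHC appears in the example output on this page, PROT_B is invented). It only shows how an order number maps to a name through levels().

```r
# Hypothetical helper (not part of MSstats): turn a which.Protein-style
# selection into protein names, given a factor of proteins.
resolve_proteins <- function(protein_factor, which_protein = "all") {
  all_names <- levels(protein_factor)
  if (identical(which_protein, "all")) {
    return(all_names)
  }
  if (is.numeric(which_protein)) {
    # Order numbers index into levels(), which are sorted by default,
    # not kept in order of first appearance.
    if (any(which_protein < 1 | which_protein > length(all_names))) {
      stop("Order number out of range")
    }
    return(all_names[which_protein])
  }
  unknown <- setdiff(which_protein, all_names)
  if (length(unknown) > 0) {
    stop("Unknown protein(s): ", paste(unknown, collapse = ", "))
  }
  which_protein
}

# Made-up protein column; PROT_B comes first in the data but second in levels().
proteins <- factor(c("PROT_B", "IDHC", "PROT_B", "IDHC"))

resolve_proteins(proteins)            # every protein
resolve_proteins(proteins, 1)         # selection by order number
resolve_proteins(proteins, "PROT_B")  # selection by name
```

Because order numbers follow levels() rather than row order, it is worth printing the levels before selecting proteins by number.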
Produces a pdf file.
Results based on statistical models for whole plot level inference are accurate as long as the assumptions of the model are met. The model assumes that the measurement errors are normally distributed with mean 0 and constant variance. The assumption of a constant variance can be checked by examining the residuals from the model.
QQPlots : a normal quantile-quantile plot for each protein is generated in order to check whether the errors are well approximated by a normal distribution. If points fall approximately along a straight line, then the assumption is appropriate for that protein. Only large deviations from the line are problematic.
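The idea behind the QQ plot check can be reproduced with base R on simulated data. This is a minimal sketch under stated assumptions, not the code modelBasedQCPlots runs: an ordinary lm fit stands in for the model fitted by groupComparison, the data and group labels are simulated, and the output file name is arbitrary.

```r
set.seed(1)

# Simulated data (assumption): two groups, normally distributed errors.
group <- factor(rep(c("T1", "T7"), each = 15))
abundance <- 10 + 2 * (group == "T7") + rnorm(30, mean = 0, sd = 0.5)

# Stand-in model: a plain linear model with group as the only effect.
fit <- lm(abundance ~ group)
res <- residuals(fit)

# Draw the normal QQ plot of the residuals into a pdf file.
out_file <- file.path(tempdir(), "qq_sketch.pdf")
pdf(out_file, width = 10, height = 10)
qqnorm(res, main = "Normal Q-Q plot of residuals")
qqline(res)  # reference line through the first and third quartiles
invisible(dev.off())

# Numeric companion to the visual check: correlation between theoretical
# and sample quantiles. Values close to 1 mean the points lie near a line.
qq <- qqnorm(res, plot.it = FALSE)
qq_cor <- cor(qq$x, qq$y)
```

With normally distributed errors, as simulated here, the points should track the reference line closely. Heavy-tailed or skewed errors would bend away from it at the ends.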
ResidualPlots : a plot of residuals against predicted (fitted) values is generated for each protein. If the points show a random scatter, then the assumption of constant variance is appropriate.
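A residuals-versus-fitted plot can likewise be sketched with base R. As before, this is an illustration under assumptions rather than the MSstats implementation: the data are simulated with equal error variance in every group, lm stands in for the fitted model, and the file name is arbitrary.

```r
set.seed(2)

# Simulated data (assumption): three groups with the same error variance.
group <- factor(rep(c("T1", "T4", "T7"), each = 10))
abundance <- c(10, 11, 12)[as.integer(group)] + rnorm(30, mean = 0, sd = 0.5)

# Stand-in model: a plain linear model with group as the only effect.
fit <- lm(abundance ~ group)

# Plot residuals against fitted values into a pdf file.
out_file <- file.path(tempdir(), "residual_sketch.pdf")
pdf(out_file, width = 10, height = 10)
plot(fitted(fit), residuals(fit),
     xlab = "Predicted abundance", ylab = "Residuals",
     main = "Residuals vs. fitted values")
abline(h = 0, lty = 2)  # residuals should scatter evenly around zero
invisible(dev.off())

# Numeric companion: residual spread per group. Similar values are
# consistent with constant variance.
resid_sd <- tapply(residuals(fit), group, sd)
```

A funnel shape, where the vertical spread grows or shrinks with the fitted value, would suggest that the constant variance assumption does not hold.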
#> INFO [2021-07-05 20:06:04] ** Features with one or two measurements across runs are removed.
#> INFO [2021-07-05 20:06:04] ** Fractionation handled.
#> INFO [2021-07-05 20:06:04] ** Updated quantification data to make balanced design. Missing values are marked by NA
#> INFO [2021-07-05 20:06:04] ** Log2 intensities under cutoff = 3.776 were considered as censored missing values.
#> INFO [2021-07-05 20:06:04] ** Log2 intensities = NA were considered as censored missing values.
#> INFO [2021-07-05 20:06:04] ** Use all features that the dataset originally has.
#> INFO [2021-07-05 20:06:04]
#> # proteins: 2
#> # peptides per protein: 2-2
#> # features per peptide: 3-3
#> INFO [2021-07-05 20:06:04]
#>                     1  2  3  4  5  6  7  8  9 10
#> # runs              3  3  3  3  3  3  3  3  3  3
#> # bioreplicates     3  3  3  3  3  3  3  3  3  3
#> # tech. replicates  1  1  1  1  1  1  1  1  1  1
#> INFO [2021-07-05 20:06:04] == Start the summarization per subplot...
#> INFO [2021-07-05 20:06:04] == Summarization is done.
#>   PROTEIN         PEPTIDE TRANSITION               FEATURE LABEL GROUP RUN
#> 1    IDHC ATDVIVPEEGELR_2      y7_NA ATDVIVPEEGELR_2_y7_NA     H     0   1
#> 2    IDHC ATDVIVPEEGELR_2      y7_NA ATDVIVPEEGELR_2_y7_NA     L     1   1
#> 3    IDHC ATDVIVPEEGELR_2      y7_NA ATDVIVPEEGELR_2_y7_NA     H     0   2
#> 4    IDHC ATDVIVPEEGELR_2      y7_NA ATDVIVPEEGELR_2_y7_NA     L     1   2
#> 5    IDHC ATDVIVPEEGELR_2      y7_NA ATDVIVPEEGELR_2_y7_NA     H     0   3
#> 6    IDHC ATDVIVPEEGELR_2      y7_NA ATDVIVPEEGELR_2_y7_NA     L     1   3
#>   SUBJECT FRACTION originalRUN censored  INTENSITY ABUNDANCE newABUNDANCE
#> 1       0        1           1    FALSE 84361.0835 15.855859    15.855859
#> 2       1        1           1    FALSE   215.1353  7.240669     7.240669
#> 3       0        1           2    FALSE 62109.5876 15.801179    15.801179
#> 4       2        1           2    FALSE  1205.2252 10.113738    10.113738
#> 5       0        1           3    FALSE 65114.3646 15.755022    15.755022
#> 6       3        1           3    FALSE  1476.3046 10.292109    10.292109
#>   predicted
#> 1        NA
#> 2        NA
#> 3        NA
#> 4        NA
#> 5        NA
#> 6        NA
#> [1] "0" "1" "10" "2" "3" "4" "5" "6" "7" "8" "9"

comparison <- matrix(c(-1,0,0,0,0,0,1,0,0,0),nrow=1)
row.names(comparison) <- "T7-T1"
colnames(comparison) <- unique(QuantData$ProteinLevelData$GROUP)

# Tests for differentially abundant proteins with models:
# label-based SRM experiment with expanded scope of biological replication.
testResultOneComparison <- groupComparison(contrast.matrix=comparison, data=QuantData, use_log_file = FALSE)
#> INFO [2021-07-05 20:06:04] == Start to test and get inference in whole plot ...
#> INFO [2021-07-05 20:06:04] == Comparisons for all proteins are done.

# normal quantile-quantile plots
modelBasedQCPlots(data=testResultOneComparison, type="QQPlots", address="")
#> pdf
#>   2

# residual plots
modelBasedQCPlots(data=testResultOneComparison, type="ResidualPlots", address="")
#> pdf
#>   2