To check the assumptions of the linear model for whole plot inference, modelBasedQCPlots takes the fitted-model results from groupComparison as input and automatically generates two types of figures as pdf files: (1) normal quantile-quantile plots (specify "QQPlots" in option type) for checking that the errors are normally distributed; (2) residual plots (specify "ResidualPlots" in option type) for checking that the variance is constant.

modelBasedQCPlots(
  data,
  type,
  axis.size = 10,
  dot.size = 3,
  width = 10,
  height = 10,
  which.Protein = "all",
  address = ""
)

Arguments

data

output from function groupComparison.

type

choice of visualization. "QQPlots" represents normal quantile-quantile plot for each protein after fitting models. "ResidualPlots" represents a plot of residuals versus fitted values for each protein in the dataset.

axis.size

size of axes labels. Default is 10.

dot.size

size of points in the graph for residual plots and QQ plots. Default is 3.

width

width of the saved file. Default is 10.

height

height of the saved file. Default is 10.

which.Protein

Proteins for which to draw plots. Can be given as protein names or as order numbers from levels(testResultOneComparison$ComparisonResult$Protein). Default is "all", which generates plots for every protein.

address

name that will serve as a prefix to the name of output file.
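
The way order numbers map to protein names for which.Protein can be illustrated with a small base R sketch. It only demonstrates factor-level indexing and does not call MSstats; the factor and the names PROT_A and PROT_B are hypothetical placeholders, not part of the example dataset.

```r
# Hypothetical Protein column; PROT_A and PROT_B are placeholder names.
protein <- factor(c("PROT_B", "IDHC", "PROT_A", "IDHC"))

# Order numbers refer to positions in the sorted factor levels,
# not to the order in which proteins appear in the rows.
protein_levels <- levels(protein)
print(protein_levels)

# Order numbers 1 and 3 select the first and third level.
selected_by_number <- protein_levels[c(1, 3)]
print(selected_by_number)
```

Checking levels() on the real ComparisonResult$Protein column before passing numbers is a cheap way to confirm which proteins will be plotted.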

Value

produce a pdf file

Details

Results based on statistical models for whole plot level inference are accurate as long as the assumptions of the model are met. The model assumes that the measurement errors are normally distributed with mean 0 and constant variance. The assumption of a constant variance can be checked by examining the residuals from the model.

  • QQPlots : a normal quantile-quantile plot for each protein is generated in order to check whether the errors are well approximated by a normal distribution. If points fall approximately along a straight line, then the assumption is appropriate for that protein. Only large deviations from the line are problematic.

  • ResidualPlots : a plot of residuals against predicted (fitted) values is generated for each protein. If the plot shows a random scatter, then the constant-variance assumption is appropriate.
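
The two diagnostics described above can be sketched with base R alone. This is a minimal illustration on simulated data with an ordinary lm fit, not the model that groupComparison fits internally; the group labels, sample sizes, and the output file name qc_sketch.pdf are assumptions made for the example.

```r
# Simulated log-abundances for two hypothetical conditions (assumption:
# normally distributed errors with constant variance, so both plots
# should look well behaved).
set.seed(1)
group <- factor(rep(c("T1", "T7"), each = 15))
abundance <- 10 + 2 * (group == "T7") + rnorm(30, sd = 0.5)

# Ordinary linear model as a stand-in for the per-protein model.
fit <- lm(abundance ~ group)
res <- residuals(fit)
fitted_vals <- fitted(fit)

# Write both diagnostic plots to one pdf in a temporary directory.
out_file <- file.path(tempdir(), "qc_sketch.pdf")
pdf(out_file, width = 10, height = 10)

# Normal Q-Q plot: points close to the line support normality.
qqnorm(res, main = "Normal Q-Q plot")
qqline(res)

# Residual plot: a random scatter around zero supports constant variance.
plot(fitted_vals, res,
     xlab = "Predicted abundance", ylab = "Residuals",
     main = "Residual plot")
abline(h = 0, lty = 2)

dev.off()
```

Changing the simulation so that one group has a much larger sd shows the kind of uneven spread in the residual plot that would call the constant-variance assumption into question.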

Examples

QuantData <- dataProcess(SRMRawData, use_log_file = FALSE)
#> INFO [2021-07-05 20:06:04] ** Features with one or two measurements across runs are removed.
#> INFO [2021-07-05 20:06:04] ** Fractionation handled.
#> INFO [2021-07-05 20:06:04] ** Updated quantification data to make balanced design. Missing values are marked by NA
#> INFO [2021-07-05 20:06:04] ** Log2 intensities under cutoff = 3.776 were considered as censored missing values.
#> INFO [2021-07-05 20:06:04] ** Log2 intensities = NA were considered as censored missing values.
#> INFO [2021-07-05 20:06:04] ** Use all features that the dataset originally has.
#> INFO [2021-07-05 20:06:04]
#>  # proteins: 2
#>  # peptides per protein: 2-2
#>  # features per peptide: 3-3
#> INFO [2021-07-05 20:06:04]
#>                     1 2 3 4 5 6 7 8 9 10
#>             # runs  3 3 3 3 3 3 3 3 3  3
#>    # bioreplicates  3 3 3 3 3 3 3 3 3  3
#>  # tech. replicates 1 1 1 1 1 1 1 1 1  1
#> INFO [2021-07-05 20:06:04] == Start the summarization per subplot...
#> | | | 0%
#> | |=================================== | 50%
#> | |======================================================================| 100%
#> INFO [2021-07-05 20:06:04] == Summarization is done.
head(QuantData$FeatureLevelData)
#>   PROTEIN         PEPTIDE TRANSITION               FEATURE LABEL GROUP RUN
#> 1    IDHC ATDVIVPEEGELR_2      y7_NA ATDVIVPEEGELR_2_y7_NA     H     0   1
#> 2    IDHC ATDVIVPEEGELR_2      y7_NA ATDVIVPEEGELR_2_y7_NA     L     1   1
#> 3    IDHC ATDVIVPEEGELR_2      y7_NA ATDVIVPEEGELR_2_y7_NA     H     0   2
#> 4    IDHC ATDVIVPEEGELR_2      y7_NA ATDVIVPEEGELR_2_y7_NA     L     1   2
#> 5    IDHC ATDVIVPEEGELR_2      y7_NA ATDVIVPEEGELR_2_y7_NA     H     0   3
#> 6    IDHC ATDVIVPEEGELR_2      y7_NA ATDVIVPEEGELR_2_y7_NA     L     1   3
#>   SUBJECT FRACTION originalRUN censored  INTENSITY ABUNDANCE newABUNDANCE
#> 1       0        1           1    FALSE 84361.0835 15.855859    15.855859
#> 2       1        1           1    FALSE   215.1353  7.240669     7.240669
#> 3       0        1           2    FALSE 62109.5876 15.801179    15.801179
#> 4       2        1           2    FALSE  1205.2252 10.113738    10.113738
#> 5       0        1           3    FALSE 65114.3646 15.755022    15.755022
#> 6       3        1           3    FALSE  1476.3046 10.292109    10.292109
#>   predicted
#> 1        NA
#> 2        NA
#> 3        NA
#> 4        NA
#> 5        NA
#> 6        NA
levels(QuantData$FeatureLevelData$GROUP)
#> [1] "0" "1" "10" "2" "3" "4" "5" "6" "7" "8" "9"
comparison <- matrix(c(-1,0,0,0,0,0,1,0,0,0),nrow=1)
row.names(comparison) <- "T7-T1"
colnames(comparison) <- unique(QuantData$ProteinLevelData$GROUP)

# Tests for differentially abundant proteins with models:
# label-based SRM experiment with expanded scope of biological replication.
testResultOneComparison <- groupComparison(contrast.matrix=comparison, data=QuantData,
                                           use_log_file = FALSE)
#> INFO [2021-07-05 20:06:04] == Start to test and get inference in whole plot ...
#> | | | 0%
#> | |=================================== | 50%
#> | |======================================================================| 100%
#> INFO [2021-07-05 20:06:04] == Comparisons for all proteins are done.

# normal quantile-quantile plots
modelBasedQCPlots(data=testResultOneComparison, type="QQPlots", address="")
#> | | | 0%
#> | |=================================== | 50%
#> | |======================================================================| 100%
#> pdf
#>   2

# residual plots
modelBasedQCPlots(data=testResultOneComparison, type="ResidualPlots", address="")
#> | | | 0%
#> | |=================================== | 50%
#> | |======================================================================| 100%
#> pdf
#>   2