Visualization for model-based quality control in fitting model

To check the assumptions of the linear model used for whole plot inference, modelBasedQCPlots takes the fitted-model results returned by the function groupComparison as input and automatically generates two types of figures as PDF files: (1) a normal quantile-quantile plot (specify "QQPlots" in option type) for checking that the errors are normally distributed; (2) a residual plot (specify "ResidualPlots" in option type) for checking that the errors have constant variance.

modelBasedQCPlots(
  data,
  type,
  axis.size = 10,
  dot.size = 3,
  width = 10,
  height = 10,
  which.Protein = "all",
  address = ""
)

Arguments

data	output from function groupComparison.
type	choice of visualization. "QQPlots" represents normal quantile-quantile plot for each protein after fitting models. "ResidualPlots" represents a plot of residuals versus fitted values for each protein in the dataset.
axis.size	size of axes labels. Default is 10.
dot.size	size of points in the graph for residual plots and QQ plots. Default is 3.
width	width of the saved file. Default is 10.
height	height of the saved file. Default is 10.
which.Protein	Proteins to plot, given either as protein names or as order numbers from levels(testResultOneComparison$ComparisonResult$Protein). Default is "all", which generates plots for every protein.
address	name that will serve as a prefix to the name of output file.
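
The two ways of specifying which.Protein (by name or by order number) can be illustrated with a small base R sketch. It only shows how order numbers relate to factor levels, which R sorts alphabetically by default; it is not the code MSstats runs internally. The helper select_proteins and the protein name PROTEIN_B are hypothetical, introduced here for illustration only.

```r
# Hypothetical protein column; "PROTEIN_B" is a placeholder name, not real data.
proteins <- factor(c("PROTEIN_B", "IDHC", "PROTEIN_B", "IDHC"))

# Factor levels are sorted, so order number 1 refers to "IDHC" here,
# regardless of the order in which the proteins appear in the data.
protein_levels <- levels(proteins)

# Hypothetical helper (not part of MSstats): resolve a selection given
# as "all", as order numbers, or as protein names.
select_proteins <- function(which.Protein, protein_levels) {
  if (identical(which.Protein, "all")) {
    return(protein_levels)
  }
  if (is.numeric(which.Protein)) {
    # Order numbers index into the sorted levels.
    return(protein_levels[which.Protein])
  }
  unknown <- setdiff(which.Protein, protein_levels)
  if (length(unknown) > 0) {
    stop("Unknown protein name(s): ", paste(unknown, collapse = ", "))
  }
  which.Protein
}

by_all    <- select_proteins("all", protein_levels)
by_number <- select_proteins(2, protein_levels)
by_name   <- select_proteins("IDHC", protein_levels)
```

Checking levels() on the Protein column before passing order numbers avoids plotting a different protein than intended.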

Value

Produces a PDF file containing the requested plots.

Details

Results based on statistical models for whole plot level inference are accurate as long as the assumptions of the model are met. The model assumes that the measurement errors are normally distributed with mean 0 and constant variance. The assumption of a constant variance can be checked by examining the residuals from the model.

QQPlots : A normal quantile-quantile plot is generated for each protein in order to check whether the errors are well approximated by a normal distribution. If the points fall approximately along a straight line, then the normality assumption is appropriate for that protein. Only large deviations from the line are problematic.
ResidualPlots : A plot of residuals against predicted (fitted) values is generated for each protein. If the plot shows a random scatter with no trend or change in spread, then the constant variance assumption is appropriate for that protein.
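
The two diagnostics described above can be reproduced on any fitted linear model with base R alone. The following is a minimal sketch on simulated data, assuming a simple two-group model; the variable names, the simulated abundances, and the output file name are illustrative assumptions, and this is not the plotting code used inside modelBasedQCPlots.

```r
# Simulated data for illustration only; not MSstats output.
set.seed(1)
group <- factor(rep(c("T1", "T7"), each = 15))
abundance <- 10 + 1.5 * (group == "T7") + rnorm(30, sd = 0.5)

# Fit a linear model and extract residuals and fitted values.
fit <- lm(abundance ~ group)
res <- residuals(fit)
fitted_vals <- fitted(fit)

# Write both diagnostic plots to one PDF in a temporary directory.
out_file <- file.path(tempdir(), "qc_sketch.pdf")
pdf(out_file, width = 10, height = 10)

# Normal Q-Q plot: points close to the reference line support normality.
qqnorm(res, main = "Normal Q-Q plot of residuals")
qqline(res)

# Residual plot: a random scatter around zero supports constant variance.
plot(fitted_vals, res,
     xlab = "Predicted abundance", ylab = "Residuals",
     main = "Residuals vs. fitted values")
abline(h = 0, lty = 2)

invisible(dev.off())
```

A funnel shape in the residual plot, where the spread grows or shrinks with the fitted value, would suggest that the constant variance assumption does not hold.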

Examples

QuantData <- dataProcess(SRMRawData, use_log_file = FALSE)
#> INFO  [2021-07-05 20:06:04] ** Features with one or two measurements across runs are removed.
#> INFO  [2021-07-05 20:06:04] ** Fractionation handled.
#> INFO  [2021-07-05 20:06:04] ** Updated quantification data to make balanced design. Missing values are marked by NA
#> INFO  [2021-07-05 20:06:04] ** Log2 intensities under cutoff = 3.776  were considered as censored missing values.
#> INFO  [2021-07-05 20:06:04] ** Log2 intensities = NA were considered as censored missing values.
#> INFO  [2021-07-05 20:06:04] ** Use all features that the dataset originally has.
#> INFO  [2021-07-05 20:06:04] 
#>  # proteins: 2
#>  # peptides per protein: 2-2
#>  # features per peptide: 3-3
#> INFO  [2021-07-05 20:06:04] 
#>                     1 2 3 4 5 6 7 8 9 10
#>              # runs 3 3 3 3 3 3 3 3 3  3
#>     # bioreplicates 3 3 3 3 3 3 3 3 3  3
#>  # tech. replicates 1 1 1 1 1 1 1 1 1  1
#> INFO  [2021-07-05 20:06:04]  == Start the summarization per subplot...
#> INFO  [2021-07-05 20:06:04]  == Summarization is done.
head(QuantData$FeatureLevelData)
#>   PROTEIN         PEPTIDE TRANSITION               FEATURE LABEL GROUP RUN
#> 1    IDHC ATDVIVPEEGELR_2      y7_NA ATDVIVPEEGELR_2_y7_NA     H     0   1
#> 2    IDHC ATDVIVPEEGELR_2      y7_NA ATDVIVPEEGELR_2_y7_NA     L     1   1
#> 3    IDHC ATDVIVPEEGELR_2      y7_NA ATDVIVPEEGELR_2_y7_NA     H     0   2
#> 4    IDHC ATDVIVPEEGELR_2      y7_NA ATDVIVPEEGELR_2_y7_NA     L     1   2
#> 5    IDHC ATDVIVPEEGELR_2      y7_NA ATDVIVPEEGELR_2_y7_NA     H     0   3
#> 6    IDHC ATDVIVPEEGELR_2      y7_NA ATDVIVPEEGELR_2_y7_NA     L     1   3
#>   SUBJECT FRACTION originalRUN censored  INTENSITY ABUNDANCE newABUNDANCE
#> 1       0        1           1    FALSE 84361.0835 15.855859    15.855859
#> 2       1        1           1    FALSE   215.1353  7.240669     7.240669
#> 3       0        1           2    FALSE 62109.5876 15.801179    15.801179
#> 4       2        1           2    FALSE  1205.2252 10.113738    10.113738
#> 5       0        1           3    FALSE 65114.3646 15.755022    15.755022
#> 6       3        1           3    FALSE  1476.3046 10.292109    10.292109
#>   predicted
#> 1        NA
#> 2        NA
#> 3        NA
#> 4        NA
#> 5        NA
#> 6        NA
levels(QuantData$FeatureLevelData$GROUP)
#>  [1] "0"  "1"  "10" "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9" 
comparison <- matrix(c(-1,0,0,0,0,0,1,0,0,0),nrow=1)
row.names(comparison) <- "T7-T1"
colnames(comparison) <- unique(QuantData$ProteinLevelData$GROUP)
# Tests for differentially abundant proteins with models:
# label-based SRM experiment with expanded scope of biological replication.
testResultOneComparison <- groupComparison(contrast.matrix=comparison, data=QuantData,
use_log_file = FALSE)
#> INFO  [2021-07-05 20:06:04]  == Start to test and get inference in whole plot ...
#> INFO  [2021-07-05 20:06:04]  == Comparisons for all proteins are done.
# normal quantile-quantile plots
modelBasedQCPlots(data=testResultOneComparison, type="QQPlots", address="")
#> pdf 
#>   2 
# residual plots
modelBasedQCPlots(data=testResultOneComparison, type="ResidualPlots", address="")
#> pdf 
#>   2
