Get robust protein-level summary based on unique and shared peptides

Usage

getWeightedProteinSummary(
  feature_data,
  norm = "p_norm",
  norm_parameter = 1,
  weights_mode = "contributions",
  tolerance = 0.1,
  max_iter = 10,
  initial_summary = "unique",
  weights_penalty = FALSE,
  weights_penalty_param = 0.1,
  save_weights_history = FALSE,
  save_convergence_history = FALSE
)

Arguments

feature_data: data.table in MSstatsTMT format. See also the Details section
norm: "p_norm" or "Huber"
norm_parameter: p for norm=="p_norm", M for norm=="Huber"
weights_mode: "contributions" to require weights that are non-negative and sum to one, "probabilities" to require only non-negative weights.
tolerance: tolerance used to decide whether the weights have converged
max_iter: maximum number of iterations of the procedure
initial_summary: "unique", "flat" or "flat unique"
weights_penalty: logical, if TRUE, weights will be penalized for deviating from an equal value across all proteins matched to a given PSM
weights_penalty_param: penalty parameter
save_weights_history: logical, if TRUE, weights from all iterations will be returned
save_convergence_history: logical, if TRUE, the differences between consecutive weight estimates from all iterations will be returned
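
The roles of norm, norm_parameter and weights_mode can be illustrated with a small base R sketch. It assumes the textbook definitions of the p-norm and Huber losses; the exact objective used by the package is described in the vignette and may differ. The helper names p_norm_loss, huber_loss and is_valid_weights are hypothetical and are not part of the package.

```r
# Illustration only: textbook loss definitions, not the package's internal code.

# p-norm style loss: sum of |r|^p over residuals (p plays the role of norm_parameter).
p_norm_loss <- function(residuals, p = 1) {
  sum(abs(residuals)^p)
}

# Huber loss: quadratic for |r| <= M, linear beyond (M plays the role of norm_parameter).
huber_loss <- function(residuals, M = 1) {
  a <- abs(residuals)
  sum(ifelse(a <= M, 0.5 * a^2, M * (a - 0.5 * M)))
}

# Check the constraints named in the weights_mode argument:
# "contributions": non-negative and summing to one; "probabilities": non-negative only.
is_valid_weights <- function(w, weights_mode = "contributions") {
  non_negative <- all(w >= 0)
  if (weights_mode == "contributions") {
    non_negative && isTRUE(all.equal(sum(w), 1))
  } else {
    non_negative
  }
}

residuals <- c(-0.2, 0.1, 3.0)        # one large residual acts as an outlier
p_norm_loss(residuals, p = 1)
huber_loss(residuals, M = 1)

is_valid_weights(c(0.5, 0.25, 0.25), "contributions")
is_valid_weights(c(0.9, 0.8), "probabilities")
```

With p = 1 or a Huber threshold, the large residual contributes linearly rather than quadratically, which is the usual motivation for calling such a summary robust.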

Value

An S4 object of class "MSstatsWeightedSummary" that holds the protein-level summary and related tables. See the Details section for more information

Details

1. Input format: this function takes as input data in MSstatsTMT format, which is a data frame with columns ProteinName, PeptideSequence, Charge, PSM (equal to PeptideSequence and Charge separated by an underscore), Channel, Intensity, Run and the annotation columns BioReplicate, Condition, Mixture, and TechRepMixture. Two additional columns are used: log2IntensityNormalized and Cluster. The log2IntensityNormalized column stores log-transformed normalized intensities, which can be obtained with the normalizeSharedPeptides function. If this column is not provided, the data will be normalized before summarization. The Cluster column stores information about connected sub-graphs of the peptide-protein graph. It can be added with the addClusterMembership function. If it is omitted, this information will be added before summarization.
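
As a rough illustration of the input layout described above, the following base R sketch builds a toy table, derives the PSM column, and labels connected sub-graphs of the peptide-protein graph. All values are made up, and the helper assign_clusters is hypothetical: it only mimics the idea behind the Cluster column and is not the implementation of addClusterMembership.

```r
# Toy feature-level data; every value here is invented for illustration.
feature_data <- data.frame(
  ProteinName     = c("P1", "P1", "P2", "P2", "P3"),
  PeptideSequence = c("AAK", "CCR", "CCR", "DDK", "EEK"),
  Charge          = c(2, 2, 2, 3, 2),
  Channel         = "126",
  Intensity       = c(1000, 2000, 2000, 1500, 800),
  Run             = "run1",
  BioReplicate    = "B1",
  Condition       = "A",
  Mixture         = "M1",
  TechRepMixture  = "M1_1",
  stringsAsFactors = FALSE
)

# PSM is PeptideSequence and Charge separated by an underscore.
feature_data$PSM <- paste(feature_data$PeptideSequence,
                          feature_data$Charge, sep = "_")

# Hypothetical helper: label connected components of the peptide-protein graph
# by repeatedly propagating the smallest label across shared PSMs and proteins.
assign_clusters <- function(proteins, psms) {
  cluster <- as.integer(factor(proteins))   # start with one label per protein
  repeat {
    previous <- cluster
    cluster <- ave(cluster, psms, FUN = min)       # rows sharing a PSM
    cluster <- ave(cluster, proteins, FUN = min)   # rows sharing a protein
    if (identical(cluster, previous)) break
  }
  as.integer(factor(cluster))               # relabel as 1, 2, ...
}

feature_data$Cluster <- assign_clusters(feature_data$ProteinName,
                                        feature_data$PSM)

# Check that the columns named in the input format are present.
required <- c("ProteinName", "PeptideSequence", "Charge", "PSM", "Channel",
              "Intensity", "Run", "BioReplicate", "Condition", "Mixture",
              "TechRepMixture")
missing_columns <- setdiff(required, names(feature_data))
```

In this toy table, P1 and P2 share the peptide CCR and therefore fall into one cluster, while P3 has only a unique peptide and forms its own.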

2. Output format: an S4 object of class "MSstatsWeightedSummary" which consists of the following items:

FeatureLevelData: feature-level (input) data
ProteinLevelData: protein-level (summarized) output data
Weights: a table of final peptide-protein weights
ConvergenceSummary: a table with information about convergence for each Cluster and Run
WeightsHistory: optional data.table of weights from all iterations of the fitting algorithm
ConvergenceHistory: optional data.table with sums of absolute values of differences between weights from consecutive iterations

Elements of this object can be accessed with the functions featureData, proteinData, featureWeights, convergenceSummary, weightsHistory, and convergenceHistory.
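
The quantity described for ConvergenceHistory, the sum of absolute differences between weights from consecutive iterations, can be sketched in base R as follows. The weights are invented, and the stopping rule shown (stop once the sum falls below tolerance) is an assumption about how tolerance and max_iter interact. The package's exact rule is not spelled out on this page.

```r
# Invented weights for one PSM shared by two proteins, one vector per iteration.
weights_history <- list(
  c(0.50, 0.50),
  c(0.70, 0.30),
  c(0.74, 0.26),
  c(0.75, 0.25)
)
tolerance <- 0.1   # same default as in the Usage section

# Sum of absolute differences between weights from consecutive iterations.
convergence_history <- vapply(
  seq_along(weights_history)[-1],
  function(i) sum(abs(weights_history[[i]] - weights_history[[i - 1]])),
  numeric(1)
)

# Assumed rule: the first iteration whose change is below the tolerance.
# The "+ 1" accounts for differences starting at the second iteration.
converged_at <- which(convergence_history < tolerance)[1] + 1
```

Under this assumed rule, the toy sequence would stop at the third iteration, well within the default max_iter of 10.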

For statistical details about the method, please consult the vignette.