Get robust protein-level summary based on unique and shared peptides
getWeightedProteinSummary.Rd
Usage
getWeightedProteinSummary(
  feature_data,
  norm = "p_norm",
  norm_parameter = 1,
  weights_mode = "contributions",
  tolerance = 0.1,
  max_iter = 10,
  initial_summary = "unique",
  weights_penalty = FALSE,
  weights_penalty_param = 0.1,
  save_weights_history = FALSE,
  save_convergence_history = FALSE
)
Arguments
- feature_data
data.table in MSstatsTMT format. See the Details section for the required columns.
- norm
"p_norm" or "Huber"
- norm_parameter
parameter of the chosen norm: p for norm == "p_norm", M for norm == "Huber"
- weights_mode
"contributions" to require weights that are non-negative and sum to one, "probabilities" to require only non-negative weights.
- tolerance
tolerance used to decide that the weights have converged
- max_iter
maximum number of iterations of the procedure
- initial_summary
"unique", "flat" or "flat unique"
- weights_penalty
if TRUE, weights are penalized for deviating from equal values across all proteins that match a given PSM
- weights_penalty_param
penalty parameter
- save_weights_history
logical, if TRUE, weights from all iterations will be returned
- save_convergence_history
logical, if TRUE, the differences between consecutive weight estimates from all iterations will be returned
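To make the roles of norm and norm_parameter more concrete, the sketch below writes out the textbook p-norm and Huber losses in base R. The function names p_norm_loss and huber_loss are hypothetical and the formulas are the standard definitions, assumed here for illustration. The exact objective used by getWeightedProteinSummary may differ, so consult the vignette.

```r
# Illustrative loss functions (standard definitions, NOT taken from the
# package source). Function names are hypothetical.

# p-norm style loss: sum of |r|^p; p plays the role of norm_parameter
p_norm_loss <- function(residuals, p = 1) {
  sum(abs(residuals)^p)
}

# Huber loss: quadratic for |r| <= M, linear beyond M;
# M plays the role of norm_parameter
huber_loss <- function(residuals, M = 1) {
  abs_r <- abs(residuals)
  sum(ifelse(abs_r <= M, 0.5 * abs_r^2, M * (abs_r - 0.5 * M)))
}

# Toy residuals with one outlying value
residuals <- c(-0.2, 0.1, 3.0)

loss_p1    <- p_norm_loss(residuals, p = 1)
loss_p2    <- p_norm_loss(residuals, p = 2)
loss_huber <- huber_loss(residuals, M = 1)
```

With these toy residuals the outlier dominates the squared (p = 2) loss much more than the absolute (p = 1) or Huber loss, which is the usual motivation for choosing a robust norm.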
Value
list of data frames with summary and other information. See the Details section for more information
Details
1. Input format: this function takes as input data in MSstatsTMT format,
which is a data frame with columns ProteinName, PeptideSequence, Charge,
PSM (equal to PeptideSequence and Charge separated by an underscore),
Channel, Intensity, Run and annotation columns: BioReplicate, Condition,
Mixture, and TechRepMixture. Two additional columns are used:
log2IntensityNormalized and Cluster. The first stores log-transformed
normalized intensities (which can be obtained with the normalizeSharedPeptides
function). If this column is not provided, data will be normalized before
summarization. The second stores information about connected sub-graphs of the
peptide-protein graph. It can be added with the addClusterMembership function
or omitted, in which case it will be added before summarization.
2. Output format: an S4 object of class "MSstatsWeightedSummary", which consists of the following items:
- FeatureLevelData: feature-level (input) data
- ProteinLevelData: protein-level (summarized) output data
- Weights: a table of final peptide-protein Weights
- ConvergenceSummary: table with information about convergence for each Cluster and Run
- WeightsHistory: optional data.table of Weights from all iterations of the fitting algorithm
- ConvergenceHistory: optional data.table with sums of absolute values of differences between Weights from consecutive iterations
Elements of this object can be accessed with the functions featureData, proteinData, featurWeights, convergenceSummary, weightsHistory, and convergenceHistory.
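The input format described in item 1 can be sketched with a toy table in base R. All values below are made up, and find_clusters is a hypothetical helper that only illustrates what "connected sub-graphs of the peptide-protein graph" means. It is not the package's addClusterMembership function, and it assumes the graph is built on PeptideSequence rather than PSM. The log2IntensityNormalized column is left out, which the text above says is allowed.

```r
# Toy feature-level table with the columns named in the Details section.
# All values are invented for illustration.
feature_data <- data.frame(
  ProteinName     = c("P1", "P1", "P2", "P2", "P3"),
  PeptideSequence = c("AAK", "SHAREDK", "SHAREDK", "CCK", "DDK"),
  Charge          = c(2, 2, 2, 3, 2),
  Channel         = "126",
  Intensity       = c(1000, 2000, 2000, 1500, 800),
  Run             = "run1",
  BioReplicate    = "b1",
  Condition       = "control",
  Mixture         = "m1",
  TechRepMixture  = "m1_1",
  stringsAsFactors = FALSE
)

# PSM is PeptideSequence and Charge separated by an underscore
feature_data$PSM <- paste(feature_data$PeptideSequence,
                          feature_data$Charge, sep = "_")

# Hypothetical helper: label connected components of the bipartite
# peptide-protein graph by repeatedly propagating the smallest label
# across rows that share a peptide or a protein.
find_clusters <- function(proteins, peptides) {
  cluster <- as.integer(factor(proteins))  # start with one label per protein
  repeat {
    previous <- cluster
    cluster <- ave(cluster, peptides, FUN = min)  # merge via shared peptides
    cluster <- ave(cluster, proteins, FUN = min)  # spread within each protein
    if (all(cluster == previous)) break
  }
  as.integer(factor(cluster))  # relabel as 1, 2, ...
}

feature_data$Cluster <- find_clusters(feature_data$ProteinName,
                                      feature_data$PeptideSequence)
```

In this toy table P1 and P2 share the peptide SHAREDK, so all of their rows end up in one cluster, while P3 forms a cluster of its own.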
For statistical details about the method, please consult the vignette.
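As a rough illustration of how tolerance, max_iter and the ConvergenceHistory item relate, the sketch below iterates a generic weight update until the sum of absolute differences between consecutive weights falls below the tolerance. The names iterate_until_converged and toy_update are hypothetical, the update rule is a stand-in, and the strict "<" comparison is an assumption. The fitting algorithm inside the package is not reproduced here.

```r
# Generic fixed-point iteration with a convergence check (illustration only).
# update_weights is any function that maps current weights to new weights.
iterate_until_converged <- function(update_weights, initial_weights,
                                    tolerance = 0.1, max_iter = 10) {
  weights <- initial_weights
  differences <- numeric(0)  # analogue of a convergence history
  converged <- FALSE
  for (iter in seq_len(max_iter)) {
    new_weights <- update_weights(weights)
    # sum of absolute differences between consecutive weight estimates
    difference <- sum(abs(new_weights - weights))
    differences <- c(differences, difference)
    weights <- new_weights
    if (difference < tolerance) {  # assumed stopping rule
      converged <- TRUE
      break
    }
  }
  list(weights = weights, converged = converged, differences = differences)
}

# Stand-in update: move halfway toward fixed target weights c(0.8, 0.2)
toy_update <- function(w) (w + c(0.8, 0.2)) / 2

result <- iterate_until_converged(toy_update, initial_weights = c(0.5, 0.5))
```

Here the differences shrink by half at each step, so the loop stops at the third iteration, well before max_iter is reached. If the tolerance were never met, the loop would stop after max_iter iterations with converged left at FALSE.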