Annotate Protein Information from Indra
annotateProteinInfoFromIndra.Rd
This function standardizes entity identifiers from protein, compound, or gene inputs to a unified namespace, using ID conversion from INDRA CoGEx or grounding through Gilda.
Arguments
- df
Output of the groupComparison function's comparisonResult table. Must contain a Protein column whose values are interpreted according to proteinIdType.
- proteinIdType
A character string specifying the type of analyte identifier in the Protein column. One of "Uniprot", "Uniprot_Mnemonic", "Hgnc_Name", or "Metabolite". The "Metabolite" value treats inputs as metabolite names and grounds them through Gilda, keeping whatever namespace Gilda returns (CHEBI / PUBCHEM / CHEMBL / ...).
Value
A data frame with the following columns:
- Protein
Character. The original identifier from the input.
- GlobalProtein
Character. The input identifier with the PTM site suffix (typically _<amino acid><site number>, e.g. _S148) stripped, used as the grounding key.
- UniprotId
Character. The Uniprot ID of the protein, or NA for "Hgnc_Name" and "Metabolite" inputs.
- EntityNamespace
Character. The grounding namespace (e.g. "HGNC", "CHEBI"). When a single input grounds to multiple candidates, namespaces are semicolon-joined and positionally aligned with EntityId and EntityName.
- EntityId
Character. The bare grounding identifier within its namespace (e.g. "1097" for HGNC, "28748" for CHEBI). Semicolon-joined when multi-grounded.
- EntityName
Character. The canonical display name from the grounding source. Semicolon-joined when multi-grounded.
- IsTranscriptionFactor
Logical. NA for proteinIdType == "Metabolite".
- IsKinase
Logical. NA for proteinIdType == "Metabolite".
- IsPhosphatase
Logical. NA for proteinIdType == "Metabolite".
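Because EntityNamespace, EntityId, and EntityName are semicolon-joined and positionally aligned for multi-grounded inputs, downstream code may want one row per candidate. The following is a minimal base R sketch of that expansion, not part of the package. The data frame `annotated` and all of its values are hypothetical placeholders, and the sketch assumes that no identifier or name itself contains a semicolon.

```r
# Hypothetical annotated output: the second row is multi-grounded.
# All values are illustrative placeholders, not real grounding results.
annotated <- data.frame(
  Protein = c("input_1", "input_2"),
  EntityNamespace = c("HGNC", "CHEBI;PUBCHEM"),
  EntityId = c("id_a", "id_b;id_c"),
  EntityName = c("name_a", "name_b;name_c"),
  stringsAsFactors = FALSE
)

# Split each semicolon-joined column; position i in one column
# corresponds to position i in the other two.
split_col <- function(x) strsplit(x, ";", fixed = TRUE)
ns <- split_col(annotated$EntityNamespace)
ids <- split_col(annotated$EntityId)
nms <- split_col(annotated$EntityName)

# Sanity check: all three columns list the same number of candidates per row.
stopifnot(
  identical(lengths(ns), lengths(ids)),
  identical(lengths(ns), lengths(nms))
)

# Repeat each input identifier once per candidate and flatten the lists.
long <- data.frame(
  Protein = rep(annotated$Protein, lengths(ns)),
  EntityNamespace = unlist(ns),
  EntityId = unlist(ids),
  EntityName = unlist(nms),
  stringsAsFactors = FALSE
)
```

Rows whose grounding columns are NA pass through as a single NA row, while rows holding an empty string would be dropped, so check for those first if they can occur in your data.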
Examples
df <- data.frame(Protein = c("CLH1_HUMAN"))
annotated_df <- annotateProteinInfoFromIndra(df, "Uniprot_Mnemonic")
head(annotated_df)
#>      Protein GlobalProtein UniprotId EntityNamespace EntityId EntityName
#> 1 CLH1_HUMAN    CLH1_HUMAN    Q00610            HGNC     2092       CLTC
#>   IsTranscriptionFactor IsKinase IsPhosphatase
#> 1                 FALSE    FALSE         FALSE
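The GlobalProtein column described under Value is the input identifier with its PTM site suffix removed. As a rough illustration of that idea only, the sketch below strips a trailing suffix of the typical form `_<amino acid><site number>`. The helper `strip_ptm_site` and its regular expression are assumptions made for this example; the rule the package actually applies may differ, for instance for identifiers that carry several sites.

```r
# Hypothetical helper: remove one trailing "_<letter><digits>" suffix.
# The pattern is an assumption based on the typical suffix form (e.g. "_S148").
strip_ptm_site <- function(x) {
  sub("_[A-Z][0-9]+$", "", x)
}

ids <- c("CLH1_HUMAN_S148", "CLH1_HUMAN", "Q00610_T394")
stripped <- strip_ptm_site(ids)
stripped
#> [1] "CLH1_HUMAN" "CLH1_HUMAN" "Q00610"
```

An identifier without a site suffix, such as "CLH1_HUMAN", is returned unchanged because "_HUMAN" is not an underscore and a single letter followed only by digits.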