AI Biomarker Analysis Specialist
An AI Biomarker Analysis Specialist applies machine learning, deep learning, and advanced computational methods to discover, valid…
Skill Guide
The computational analysis of biological molecules (genes, proteins) in the context of predefined functional groupings (Gene Ontology) and interconnected metabolic/signaling pathways (KEGG, Reactome) to interpret high-throughput omics data.
Scenario
You are given a list of 200 differentially expressed genes (DEGs) from a public breast cancer dataset (e.g., TCGA). The task is to identify the primary biological themes.
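For a plain gene list like this one, over-representation analysis (ORA) is the usual starting point. The sketch below shows the underlying hypergeometric test using only the Python standard library. The gene identifiers, gene sets, and background are toy placeholders, not TCGA data, and the function names are illustrative rather than taken from any package.

```python
from math import comb

def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from a background of N genes,
    K of which belong to the gene set."""
    upper = min(K, n)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total

def ora(deg_list, gene_sets, background):
    """Score each gene set for over-representation among the DEGs."""
    background = set(background)
    degs = set(deg_list) & background          # ignore genes outside the background
    results = []
    for name, members in gene_sets.items():
        members = set(members) & background
        overlap = degs & members
        p = hypergeom_upper_tail(len(overlap), len(members), len(degs), len(background))
        results.append((name, len(overlap), p))
    return sorted(results, key=lambda r: r[2])  # smallest p-value first

# Toy inputs (hypothetical identifiers)
background = [f"G{i}" for i in range(100)]
degs = [f"G{i}" for i in range(10)]
gene_sets = {
    "set_enriched": [f"G{i}" for i in range(8)],          # all 8 members are DEGs
    "set_random": ["G5", "G50", "G60", "G70", "G80"],     # 1 of 5 members is a DEG
}

results = ora(degs, gene_sets, background)
for name, overlap, p in results:
    print(f"{name}: overlap={overlap}, p={p:.3g}")
```

The raw p-values still need multiple-testing correction before any term is reported, and the choice of background (all genes measured in the experiment rather than the whole genome) changes the result.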
Scenario
Perform Gene Set Enrichment Analysis (GSEA) on a ranked list of genes from an RNA-seq experiment comparing treated vs. control samples, using hallmark gene sets (MSigDB).
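To make the ranked-list idea concrete, here is a minimal sketch of the weighted running-sum enrichment score that GSEA is built on. It is an illustration under simplifying assumptions: it omits the permutation-based significance testing and normalization that a real GSEA run performs, and the gene names and scores are made up.

```python
def enrichment_score(ranked, gene_set, weight=1.0):
    """Running-sum enrichment score.

    ranked: list of (gene, score) pairs sorted by score, descending.
    Hits raise the running sum in proportion to |score| ** weight;
    misses lower it by a constant step. The score is the largest
    deviation of the running sum from zero.
    """
    gene_set = set(gene_set)
    hit_weights = [abs(s) ** weight if g in gene_set else 0.0 for g, s in ranked]
    n_hits = sum(1 for g, _ in ranked if g in gene_set)
    n_miss = len(ranked) - n_hits
    total_hit = sum(hit_weights)
    if n_hits == 0 or n_miss == 0 or total_hit == 0:
        return 0.0
    running, es = 0.0, 0.0
    for (gene, _), w in zip(ranked, hit_weights):
        running += w / total_hit if gene in gene_set else -1.0 / n_miss
        if abs(running) > abs(es):
            es = running
    return es

# Toy ranked list (hypothetical genes, e.g. ranked by a signed test statistic)
ranked = [("A", 3.0), ("B", 2.5), ("C", 2.0), ("D", 0.5),
          ("E", -0.5), ("F", -1.0), ("G", -2.0), ("H", -3.0)]

es_top = enrichment_score(ranked, {"A", "B", "C"})      # set concentrated at the top
es_bottom = enrichment_score(ranked, {"F", "G", "H"})   # set concentrated at the bottom
print(es_top, es_bottom)
```

A positive score indicates the set clusters among up-regulated genes, a negative score among down-regulated genes. In practice a maintained implementation would be used, since significance depends on the permutation null.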
Scenario
A pharmaceutical team has phosphoproteomics and transcriptomics data from a drug-treated cancer cell line. Goal: Identify the primary mechanism of action by integrating kinase activity predictions with transcriptional pathway responses.
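One possible way to approach this integration is sketched below, assuming a KSEA-style z-score for kinase activity (mean log2 fold change of a kinase's known substrate sites compared with all measured sites) and a precomputed pathway enrichment score from the transcriptomics side. Every identifier here is hypothetical: the site names, the kinase-substrate map, the kinase-to-pathway map, the pathway scores, and the cutoffs are placeholders for illustration only.

```python
from math import sqrt
from statistics import mean, stdev

def kinase_z_scores(site_log2fc, kinase_substrates, min_sites=2):
    """KSEA-style score: (mean substrate change - global mean) * sqrt(m) / global sd."""
    values = list(site_log2fc.values())
    mu, sigma = mean(values), stdev(values)
    scores = {}
    for kinase, sites in kinase_substrates.items():
        observed = [site_log2fc[s] for s in sites if s in site_log2fc]
        if len(observed) < min_sites or sigma == 0:
            continue  # too few measured substrates to score this kinase
        scores[kinase] = (mean(observed) - mu) * sqrt(len(observed)) / sigma
    return scores

def concordant_hits(z_scores, pathway_scores, kinase_to_pathway, z_cut=1.5, path_cut=1.5):
    """Kinases whose activity change agrees in sign with their pathway's score."""
    hits = []
    for kinase, z in z_scores.items():
        score = pathway_scores.get(kinase_to_pathway.get(kinase))
        if score is None:
            continue
        if abs(z) >= z_cut and abs(score) >= path_cut and (z > 0) == (score > 0):
            hits.append(kinase)
    return sorted(hits)

# Hypothetical phosphosite log2 fold changes (treated vs. control)
site_log2fc = {"S1": -2.0, "S2": -1.8, "S3": -2.2, "S4": 2.1,
               "S5": 1.9, "S6": 2.0, "S7": 0.0, "S8": 0.0}
kinase_substrates = {"KINASE_A": ["S1", "S2", "S3"],
                     "KINASE_B": ["S4", "S5", "S6"],
                     "KINASE_C": ["S7"]}
# Hypothetical transcriptional pathway scores (e.g. normalized enrichment scores)
pathway_scores = {"PATHWAY_X": -1.9, "PATHWAY_Y": 0.3}
kinase_to_pathway = {"KINASE_A": "PATHWAY_X", "KINASE_B": "PATHWAY_Y"}

scores = kinase_z_scores(site_log2fc, kinase_substrates)
concordant = concordant_hits(scores, pathway_scores, kinase_to_pathway)
print(scores, concordant)
```

Kinases that are both strongly inhibited and linked to a suppressed transcriptional pathway are candidate mechanisms of action. Agreement between layers is supporting evidence rather than proof, and it depends heavily on the quality of the kinase-substrate annotations.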
clusterProfiler is a widely used R/Bioconductor package for ORA and GSEA. GSEApy covers similar ground in Python, with GSEA implementations and an interface to Enrichr. Cytoscape is used for biological network visualization and topological analysis. STRING provides protein-protein interaction data for network building. DAVID and Enrichr are web tools suited to quick, exploratory analysis.
GO provides a controlled vocabulary for gene function, organized into three ontologies: biological process, molecular function, and cellular component. KEGG offers manually drawn pathway maps for metabolism, signaling, and disease. Reactome provides a curated, peer-reviewed pathway database with detailed reaction-level information. MSigDB is a comprehensive collection of gene sets for GSEA, including the hallmark collection.
Answer Strategy
The question tests methodological rigor and communication skills. Strategy: Emphasize the distinction between a gene list (ORA) vs. ranked data (GSEA). Mention setting a proper background gene list. Recommend a reproducible script (clusterProfiler) over a web tool. For presentation, highlight the top 5-10 non-redundant, biologically interpretable pathways with their FDR values, and suggest linking hits to known drug targets in those pathways.
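The reporting step in this strategy (FDR values and non-redundant pathways) can be sketched as follows. This is a minimal illustration using the Benjamini-Hochberg adjustment and a simple Jaccard-overlap filter. The filter and its 0.5 threshold are assumptions chosen for the example, not a standard, and the term names and genes are placeholders.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values (FDR) in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def drop_redundant(ranked_terms, max_jaccard=0.5):
    """Keep a term only if its genes do not overlap heavily with a better-ranked kept term.

    ranked_terms: list of (name, genes) pairs, best FDR first, with non-empty gene lists.
    """
    kept = []
    for name, genes in ranked_terms:
        genes = set(genes)
        if all(len(genes & k) / len(genes | k) <= max_jaccard for _, k in kept):
            kept.append((name, genes))
    return [name for name, _ in kept]

pvalues = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(pvalues)
print(fdr)

# Hypothetical enriched terms, already sorted by FDR
ranked_terms = [
    ("term_a", ["G1", "G2", "G3", "G4"]),
    ("term_b", ["G1", "G2", "G3", "G5"]),   # Jaccard with term_a = 3/5, dropped
    ("term_c", ["G7", "G8", "G9"]),
]
non_redundant = drop_redundant(ranked_terms)
print(non_redundant)
```

Filtering on gene overlap is one of several ways to reduce redundancy. Semantic-similarity methods that use the GO hierarchy are an alternative when the terms come from GO.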
Answer Strategy
This tests critical thinking and database knowledge. The answer should show understanding of database ontology differences. Strategy: Explain that GO annotates genes to broad, process-oriented terms, while KEGG requires genes to be part of a specific, connected molecular map. Investigate by: 1) Checking if the genes in the GO term are actually part of a KEGG pathway. 2) Using Reactome, which may have a different structure for the same process. 3) Visually inspecting the KEGG pathway map for the immune system to see if the hits are scattered across multiple maps without reaching a single map's enrichment threshold.
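The first and third checks above amount to asking how the genes behind the GO hit are distributed across individual pathway maps. A minimal sketch, assuming the gene-to-map memberships have already been downloaded into a dictionary (the map names and genes below are invented for illustration):

```python
def spread_across_pathways(hit_genes, pathway_maps):
    """Report which hit genes fall in each pathway map, and which fall in none."""
    hits = set(hit_genes)
    per_map = {}
    for name, genes in pathway_maps.items():
        shared = sorted(hits & set(genes))
        if shared:
            per_map[name] = shared
    covered = set().union(*per_map.values()) if per_map else set()
    unmapped = sorted(hits - covered)
    return per_map, unmapped

# Hypothetical genes driving an enriched immune-related GO term
go_hits = ["G1", "G2", "G3", "G4", "G5", "G6"]
# Hypothetical pathway maps (name -> member genes)
pathway_maps = {
    "map_1": ["G1", "G2", "G20", "G21"],
    "map_2": ["G3", "G30", "G31"],
    "map_3": ["G4", "G40"],
    "map_4": ["G50", "G51"],
}

per_map, unmapped = spread_across_pathways(go_hits, pathway_maps)
for name, genes in per_map.items():
    print(f"{name}: {len(genes)} hit(s) {genes}")
print("not in any map:", unmapped)
```

If the hits are split one or two per map, as in this toy example, no single map is likely to pass an enrichment threshold even though the broader GO term does. Genes absent from every map point to annotation coverage differences between the databases.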