AI Rare Disease AI Specialist
An AI Rare Disease Specialist leverages artificial intelligence to accelerate diagnosis, drug discovery, and personalized treatmen…
Skill Guide
The integrated computational analysis of data from genomics, transcriptomics, proteomics, metabolomics, and other high-throughput biological assays to derive holistic biological insights and predictive models.
Scenario
Analyze a public RNA-seq dataset (e.g., TCGA or GEO) to identify genes differentially expressed between cancer subtypes.
Scenario
Integrate somatic mutation data (WES) with phospho-proteomics data to identify mutations that dysregulate signaling pathways.
Scenario
Build a predictive model using pre-treatment tumor RNA-seq, copy number variation, and clinical data to predict response to immunotherapy.
Primary ecosystems for statistical analysis and visualization. Bioconductor is the gold standard for many omics methods. Python excels in machine learning integration. Galaxy provides accessible pipelines without deep coding.
Nextflow and Snakemake enable scalable, reproducible, and portable pipeline development. Docker/Singularity ensure software dependency consistency across environments, critical for collaboration and publication.
MOFA identifies latent factors of variation across omics layers. SNF fuses patient similarity networks from different data types. WGCNA constructs gene co-expression networks to find functional modules. Selection depends on data types and biological question.
Answer Strategy
The answer should demonstrate a structured, hypothesis-driven integration approach. Sample: 'I would first perform quality control and independent analysis of each dataset-differential expression for RNA-seq and differential accessibility for ATAC-seq. Then, I'd integrate them by testing for correlation between gene expression changes and chromatin accessibility at proximal promoter or distal enhancer regions using a tool like GREAT or a custom linear model. Significant overlap would suggest direct regulatory links. I would validate top candidates by looking for known transcription factor motifs in the accessible regions.'
Answer Strategy
Tests debugging, statistical reasoning, and communication skills. Sample: 'I would follow a systematic debugging framework: 1) Data Integrity: Check for batch effects, technical artifacts, or differences in sample preprocessing between cohorts. 2) Statistical Validation: Re-examine the model's feature selection; were the features stable? Is the validation cohort fundamentally different demographically? 3) Biological Plausibility: Do the model's key features have a coherent biological story? If not, it may be overfit. 4) I would communicate findings to the team with a clear report on whether the issue is technical, statistical, or biological, proposing a mitigation plan.'
1 career found
Try a different search term.