AI Semantic Search Engineer
An AI Semantic Search Engineer designs and builds search systems that understand intent and meaning rather than mere keywords, lev…
Skill Guide
The systematic process of selecting, adapting, and assessing pre-trained embedding models (e.g., OpenAI text-embedding-3, or open models such as E5, BGE, and GTE loaded through Sentence-Transformers) to generate high-quality vector representations for downstream tasks like retrieval, classification, or clustering.
Scenario
Given a CSV of 1000 customer support Q&A pairs, create a system that retrieves the most relevant answer to a user's free-text question.
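A minimal sketch of this scenario's retrieval loop, using only the Python standard library. The `embed` function here is a stand-in (a bag-of-words count vector), not a real embedding model; in practice it would be swapped for a model such as those named above. The column names `question` and `answer` and the sample rows are assumptions made for illustration.

```python
import csv
import io
import math
import re
from collections import Counter


def embed(text):
    """Stand-in embedding: sparse bag-of-words counts (NOT a trained model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(csv_text):
    """Embed every stored question once; keep its answer alongside."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [(embed(row["question"]), row["answer"]) for row in rows]


def retrieve(index, query, k=1):
    """Return the answers whose questions are most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[0]), reverse=True)
    return [answer for _, answer in ranked[:k]]


# Hypothetical sample data standing in for the 1000-row CSV.
SAMPLE_CSV = """question,answer
How do I reset my password?,Use the 'Forgot password' link on the login page.
Where can I download my invoice?,Invoices are under Billing in your account settings.
How do I cancel my subscription?,Go to Billing and choose Cancel subscription.
"""

index = build_index(SAMPLE_CSV)
top = retrieve(index, "I forgot my password, how can I reset it?", k=1)
print(top[0])
```

At 1000 rows a brute-force scan like this is usually fast enough; a vector database becomes relevant mainly as the corpus grows.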
Scenario
Improve recall for legal clause retrieval by fine-tuning a model (e.g., `BAAI/bge-base-en-v1.5`) on (query, relevant clause) pairs drawn from a corpus of contract sections.
Scenario
Design and deploy a two-stage retrieval system for a production e-commerce search that must handle 100 QPS with <200ms latency, combining dense embeddings with a sparse model (e.g., SPLADE).
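One common way to combine a dense ranking with a sparse one is reciprocal rank fusion (RRF). The scenario does not name a fusion method, so RRF is an assumption here, as are the document ids and the constant `k=60` (a conventional default). The sketch below merges two ranked lists into a single candidate list that a second-stage reranker could then score.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc ids; higher fused score ranks first.

    Each list contributes 1 / (k + rank) for every document it contains.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical first-stage outputs for one query.
dense_ranking = ["p3", "p1", "p7", "p2"]   # e.g. from an embedding index
sparse_ranking = ["p1", "p9", "p3", "p4"]  # e.g. from a sparse model

candidates = reciprocal_rank_fusion([dense_ranking, sparse_ranking])[:3]
print(candidates)
```

Because RRF uses only ranks, it avoids having to calibrate dense and sparse scores against each other, which keeps the fusion step cheap relative to the latency budget.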
Core libraries for model loading, fine-tuning, and inference. ONNX is critical for production optimization. Vector databases are essential for scalable similarity search.
MTEB/BEIR provide standardized comparisons across models and tasks. Always supplement with custom metrics that reflect your specific business use case (e.g., exact match for known queries).
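As an illustration of a custom metric, the sketch below computes recall@k and mean reciprocal rank (MRR) over a small labeled query set. The `runs` and `qrels` structures and their ids are hypothetical; a real evaluation would load them from your own judged queries.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(runs, qrels, k=5):
    """Average both metrics over every query that has relevance labels."""
    query_ids = [q for q in runs if q in qrels]
    n = len(query_ids)
    if n == 0:
        return {f"recall@{k}": 0.0, "mrr": 0.0}
    recall = sum(recall_at_k(runs[q], qrels[q], k) for q in query_ids) / n
    mrr = sum(reciprocal_rank(runs[q], qrels[q]) for q in query_ids) / n
    return {f"recall@{k}": recall, "mrr": mrr}


# Hypothetical system output (ranked doc ids) and relevance judgments.
runs = {"q1": ["d1", "d2", "d3"], "q2": ["d9", "d4", "d5"]}
qrels = {"q1": {"d1"}, "q2": {"d4", "d8"}}

metrics = evaluate(runs, qrels, k=2)
print(metrics)
```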
Containerization (Docker) and orchestration (Kubernetes) are the standard route to scalable serving. Monitor embedding quality drift and latency with Prometheus, and track experiments rigorously with Weights & Biases (W&B).
Answer Strategy
This question tests systematic problem-solving. The answer should follow the STAR method (Situation, Task, Action, Result), focusing on data collection, loss function choice, evaluation, and iteration. Sample: 'In a prior role, our generic model had 60% recall on medical Q&A. I collected 10k domain-specific (query, passage) pairs from expert reviews. I fine-tuned a bi-encoder with in-batch negatives and hard negatives mined via BM25. After three iterations focused on improving negative sampling, we achieved 85% recall@5, validated by a 15% increase in user satisfaction with search results.'
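The in-batch negatives objective mentioned in the sample answer can be sketched in plain Python as follows. For each query, the passage at the same batch position is treated as the positive and every other passage in the batch as a negative; the loss is the cross-entropy over temperature-scaled dot products. The toy vectors and the temperature of 0.05 are assumptions for illustration, and a real training loop would compute this on model outputs inside a deep learning framework.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def in_batch_negatives_loss(query_vecs, passage_vecs, temperature=0.05):
    """Mean cross-entropy where passage i is the positive for query i.

    All other passages in the batch act as negatives for that query.
    """
    losses = []
    for i, q in enumerate(query_vecs):
        logits = [dot(q, p) / temperature for p in passage_vecs]
        # Numerically stable log-sum-exp over the batch.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(
            sum(math.exp(x - max_logit) for x in logits)
        )
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy unit vectors: queries match passages at the same index.
queries = [[1.0, 0.0], [0.0, 1.0]]
passages = [[1.0, 0.0], [0.0, 1.0]]

aligned_loss = in_batch_negatives_loss(queries, passages)
# Swapping the passages makes every "positive" the wrong one.
misaligned_loss = in_batch_negatives_loss(queries, passages[::-1])
print(aligned_loss < misaligned_loss)
```

Hard negatives mined with BM25 would typically be appended to `passage_vecs` as extra columns, so each query is contrasted against both the in-batch passages and its mined near-misses.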