AI Infrastructure Engineer
AI Infrastructure Engineers design, build, and maintain the foundational systems that power machine learning workloads at scale - …
Skill Guide
The engineering discipline of designing, building, and maintaining automated systems that continuously ingest, transform, store, and serve data, specifically real-time event streams for analytics, curated feature sets for machine learning models, and high-dimensional vector embeddings for similarity search applications.
Scenario
A startup needs to analyze user click events from its website in near real time to monitor engagement, rather than waiting hours for batch reports.
Scenario
An ML team is building a 'customers who bought this also bought...' model but is plagued by training-serving skew and inconsistent feature definitions across notebooks and production APIs.
Scenario
A media company wants to build an internal search engine that finds relevant articles, images, and video frames using natural language queries, requiring a unified embedding space and low-latency retrieval at scale.
The backbone for decoupling producers and consumers of real-time event streams. Use Kafka for high-throughput, durable log-based streaming; use a cloud-native service (Amazon Kinesis or Google Cloud Pub/Sub) for a managed option that integrates with the rest of that provider's ecosystem.
Used to perform stateful computations (e.g., windowed aggregations, joins, pattern detection) on streaming data in real time. Flink processes events one at a time and suits low-latency, complex event processing; Spark Structured Streaming, which runs micro-batches by default, suits teams that want a single codebase for batch and streaming.
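As a rough illustration of the windowed aggregation mentioned above, here is a minimal sketch of a tumbling (fixed, non-overlapping) window count in plain Python. The function name, event shape, and sample data are hypothetical, and the sketch assumes events carry an integer epoch-second timestamp. It leaves out what a real engine such as Flink or Spark adds: out-of-order events, watermarks, and fault-tolerant state.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count events per (window_start, key) using fixed, non-overlapping windows."""
    counts = defaultdict(int)
    for timestamp, key in events:
        # Align each event to the start of the window that contains it.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


# Hypothetical click events: (epoch seconds, page name)
clicks = [(0, "home"), (12, "home"), (59, "pricing"), (60, "home"), (125, "pricing")]
per_minute = tumbling_window_counts(clicks, window_seconds=60)
```

Each key in `per_minute` pairs a window start time with a page, so a dashboard could read one minute's engagement without rescanning the raw events.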
Solves the operational challenge of consistent feature engineering, storage, and serving across training and inference. Feast is open-source and composable; Tecton is a managed service with advanced transformation capabilities.
Specialized storage and retrieval engines for high-dimensional vector embeddings. Pinecone is a fully managed service; Weaviate is open source and also available as a managed cloud offering. pgvector adds vector search to an existing PostgreSQL stack. Choice depends on latency, scale, cost, and integration needs.
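To make the retrieval step concrete, the following is a minimal sketch of exact (brute-force) cosine-similarity search in plain Python. It is not how any of the products above work internally: they rely on approximate indexes to avoid scanning every vector. The function names, document IDs, and three-dimensional vectors are hypothetical and kept small for readability.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat a zero vector as dissimilar to everything.
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k):
    """Return the IDs of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in vectors.items()]
    scored.sort(reverse=True)  # Highest similarity first.
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical embeddings for items of different media types
corpus = {
    "article-1": [1.0, 0.0, 0.0],
    "image-7": [0.9, 0.1, 0.0],
    "video-3": [0.0, 1.0, 0.0],
}
results = top_k([1.0, 0.0, 0.0], corpus, k=2)
```

A full scan like this costs time proportional to the corpus size on every query, which is the cost that index structures in a vector database are meant to reduce.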
Answer Strategy
Demonstrate understanding of the 'dual-write' or 'unified view' pattern with a feature store. The strategy is to define the feature logic once, materialize it to both an offline store (e.g., data lake) for batch training and an online low-latency store (e.g., Redis) for real-time serving. The feature store handles the orchestration.
Sample Answer: 'I would use a feature store like Feast. I'd define the feature view (e.g., user_24h_purchase_count) once in the feature registry, with a transformation that computes it from a batch source (daily snapshot) and a streaming source (Kafka). The store's materialization job would backfill the offline store for daily model retraining while a streaming consumer updates the online store in near real-time for the live model.'
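The "define once, materialize twice" idea can be sketched in plain Python as below. This is a toy illustration, not the Feast API: `ToyFeatureStore`, its `materialize` method, and the sample purchases are all hypothetical, with an in-memory list standing in for the offline store and a dict standing in for the online store.

```python
from datetime import datetime, timedelta


def user_24h_purchase_count(purchases, user_id, as_of):
    """Single feature definition shared by the training and serving paths."""
    window_start = as_of - timedelta(hours=24)
    return sum(
        1 for p in purchases
        if p["user_id"] == user_id and window_start < p["ts"] <= as_of
    )


class ToyFeatureStore:
    """Hypothetical stand-in for a feature store with two backing stores."""

    def __init__(self):
        self.offline = []  # Append-only history, used to build training sets.
        self.online = {}   # Latest value per user, used for low-latency serving.

    def materialize(self, purchases, user_ids, as_of):
        for user_id in user_ids:
            # Compute the value once, then write the same value to both stores.
            value = user_24h_purchase_count(purchases, user_id, as_of)
            self.offline.append(
                {"user_id": user_id, "as_of": as_of, "user_24h_purchase_count": value}
            )
            self.online[user_id] = value


now = datetime(2024, 1, 2, 12, 0)
purchases = [
    {"user_id": "u1", "ts": now - timedelta(hours=1)},
    {"user_id": "u1", "ts": now - timedelta(hours=30)},  # Outside the 24h window.
    {"user_id": "u2", "ts": now - timedelta(hours=5)},
]
store = ToyFeatureStore()
store.materialize(purchases, ["u1", "u2"], as_of=now)
```

Because both stores are filled from the same function, the value a model sees at training time and at serving time cannot drift apart through duplicated logic, which is the skew the scenario describes.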
Answer Strategy
Tests systematic troubleshooting and deep knowledge of vector database internals. The answer should follow a diagnostic flow: infrastructure, query patterns, index configuration.
Sample Answer: 'First, I'd rule out infrastructure issues: check DB resource metrics (CPU, RAM, IOPS) and network latency. Second, I'd analyze query patterns: check whether metadata filters are too unselective, so that we're scanning too many vectors. Third, I'd examine index configuration: verify the index type (e.g., HNSW vs. IVF), its build parameters (ef_construction and m for HNSW), and whether it needs rebuilding. For a managed service like Pinecone, I'd check pod type and replica count; for pgvector, I'd check the hnsw.ef_search setting and index bloat.'