Skill Guide

SIEM engineering: custom log sources, AI inference telemetry parsing

SIEM engineering for custom log sources and AI inference telemetry parsing is the practice of ingesting, normalizing, and analyzing non-standard data streams, particularly those from AI/ML model inference pipelines, in a Security Information and Event Management (SIEM) platform to detect threats, surface operational anomalies, and support compliance.

This skill matters because AI systems are becoming both attack surfaces and operational dependencies. It lets organizations detect attacks on models (such as data poisoning or inference abuse) and put MLOps security into practice, which protects intellectual property and AI system integrity.

1 Careers

1 Categories

9.2 Avg Demand

15% Avg AI Risk

How to Learn SIEM engineering: custom log sources, AI inference telemetry parsing

Focus on core SIEM concepts (log sources, parsers, correlation rules), the structure of common log formats (JSON, CEF, LEEF), and the basics of the AI/ML inference lifecycle (model serving, endpoints, input/output schemas). Start by parsing simple application logs.

Practice designing parsers for complex, nested JSON structures typical of AI inference logs (e.g., containing feature vectors, model versions, latency metrics). Learn to differentiate between security events (prompt injection attempts) and operational metrics (GPU utilization). Avoid the mistake of treating all AI telemetry as generic application logs.
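As a minimal sketch of the parsing practice described above, the Python snippet below flattens a nested inference log into dotted field names and routes fields into a security view and an operational view. The log layout and every field name (`request.user_prompt`, `metrics.latency_ms`, and so on) are assumptions made for illustration; real serving frameworks emit different schemas.

```python
import json

# Hypothetical field names chosen for illustration only.
SECURITY_FIELDS = {"request.user_prompt", "request.client_ip"}
OPERATIONAL_FIELDS = {"metrics.latency_ms", "metrics.gpu_utilization"}


def flatten(obj, prefix=""):
    """Flatten nested dicts into dotted keys; lists are kept as plain values."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def split_event(raw_line):
    """Parse one JSON log line and split it into security and operational views."""
    flat = flatten(json.loads(raw_line))
    security = {k: v for k, v in flat.items() if k in SECURITY_FIELDS}
    operational = {k: v for k, v in flat.items() if k in OPERATIONAL_FIELDS}
    return flat, security, operational


# Assumed sample record, not taken from any real framework.
sample = json.dumps({
    "model": {"name": "fraud-scorer", "version": "3"},
    "request": {"user_prompt": "score this", "client_ip": "10.0.0.5",
                "features": [0.1, 0.9]},
    "metrics": {"latency_ms": 42, "gpu_utilization": 0.71},
})
flat, security, operational = split_event(sample)
print(sorted(security))
print(sorted(operational))
```

Keeping the two views separate makes it easier to send security-relevant fields to detection rules while operational metrics go to dashboards.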

Architect scalable ingestion pipelines for high-volume, low-latency inference telemetry. Design correlation rules that cross-reference AI inference logs with traditional infrastructure and identity logs to detect sophisticated attacks (e.g., model theft via excessive querying). Mentor others on translating threat intelligence for AI systems into detection logic.
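The cross-referencing idea above can be sketched outside a SIEM in plain Python. The example flags users whose query count in a fixed time window exceeds a threshold, then keeps only those who also appear in identity logs with a risky sign-in. The event shapes, the `new_device` flag, and the threshold values are all assumptions for illustration, not a recommended baseline.

```python
from collections import defaultdict


def excessive_queriers(inference_events, window_s=60, threshold=100):
    """Return users whose query count in any fixed window exceeds the threshold."""
    buckets = defaultdict(int)
    for ev in inference_events:
        # Integer division assigns each event to a fixed, non-overlapping window.
        buckets[(ev["user_id"], ev["ts"] // window_s)] += 1
    return {user for (user, _), count in buckets.items() if count > threshold}


def correlate(inference_events, identity_events, window_s=60, threshold=100):
    """Intersect heavy queriers with users flagged in identity logs."""
    suspects = excessive_queriers(inference_events, window_s, threshold)
    risky = {ev["user_id"] for ev in identity_events if ev.get("new_device")}
    return sorted(suspects & risky)


# Assumed demo data: alice sends 50 queries in one minute, bob sends 2 per minute.
inference = [{"user_id": "alice", "ts": t} for t in range(50)]
inference += [{"user_id": "bob", "ts": t * 30} for t in range(50)]
identity = [
    {"user_id": "alice", "new_device": True},
    {"user_id": "bob", "new_device": True},
]
print(correlate(inference, identity, window_s=60, threshold=20))
```

In a production SIEM the same logic would normally be written as a correlation search or join over indexed data; fixed windows are used here only to keep the sketch short.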

Practice Projects

Beginner

Project

Build a Basic AI Inference Log Parser

Scenario

You receive sample log files from a PyTorch TorchServe endpoint in raw JSON format, containing fields like 'model_name', 'input_tensor_shape', 'inference_time_ms', and 'prediction_confidence'.

How to Execute

1. Use a log generator (like a Python script) to create sample logs.
2. Configure a parser in a SIEM (e.g., Splunk's `props.conf` or Elastic's Ingest Pipeline) to extract these fields.
3. Create a basic dashboard showing inference latency over time and a table of model usage.
4. Test by injecting a sample log with a malformed input to verify parsing failure detection.
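Steps 1 and 4 can be prototyped before touching the SIEM. The sketch below generates sample logs with the four fields named in the scenario and validates them, returning a reason string when a line fails to parse. The value ranges, model names, and the validation rules are assumptions for illustration; they are not TorchServe's actual log format.

```python
import json
import random

# Expected fields from the scenario, with assumed types.
REQUIRED = {
    "model_name": str,
    "input_tensor_shape": list,
    "inference_time_ms": (int, float),
    "prediction_confidence": (int, float),
}


def generate_log(rng):
    """Produce one synthetic inference log line as a JSON string."""
    return json.dumps({
        "model_name": rng.choice(["resnet50", "bert-base"]),  # hypothetical names
        "input_tensor_shape": [1, 3, 224, 224],
        "inference_time_ms": round(rng.uniform(5, 250), 2),
        "prediction_confidence": round(rng.random(), 4),
    })


def parse_log(line):
    """Return (event, None) on success or (None, reason) on a parsing failure."""
    try:
        event = json.loads(line)
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON: {exc.msg}"
    if not isinstance(event, dict):
        return None, "not a JSON object"
    for field, expected in REQUIRED.items():
        if field not in event:
            return None, f"missing field: {field}"
        if not isinstance(event[field], expected):
            return None, f"bad type for field: {field}"
    return event, None


rng = random.Random(0)
for line in [generate_log(rng), '{"model_name": "resnet50"']:
    event, error = parse_log(line)
    print("ok" if error is None else f"parse failure ({error})")
```

Counting the failure reasons over time gives a simple signal for the parsing failure detection mentioned in step 4.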

Intermediate

Project

Detect Prompt Injection via Inference Telemetry

Scenario

A large language model (LLM) serving endpoint logs include the full user prompt and model response. You need to detect and alert on potential prompt injection attempts that try to override system instructions.

How to Execute

1. Enrich the parser to extract 'user_prompt' and 'system_prompt' fields.
2. Develop a detection rule using regex or keyword matching (e.g., 'ignore previous instructions', 'you are now') in the 'user_prompt' field.
3. Correlate these events with high error rates or unusual token count spikes.
4. Create an incident playbook for SOC analysts on how to investigate and contain a compromised model session.
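Steps 2 and 3 might look like the following sketch, which matches the two example phrases against 'user_prompt' and raises severity when the same event also shows a token count spike. The `tokens_out` field, the baseline, and the spike factor are assumptions for illustration, and keyword matching of this kind is easy to evade, so treat it as a starting point rather than complete coverage.

```python
import re

# Patterns built from the two example phrases in the steps above.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?previous\s+instructions", re.IGNORECASE),
    re.compile(r"\byou\s+are\s+now\b", re.IGNORECASE),
]


def match_injection(user_prompt):
    """Return the patterns that match the prompt text."""
    return [p.pattern for p in INJECTION_PATTERNS if p.search(user_prompt)]


def evaluate(event, token_baseline=500, spike_factor=3):
    """Score one event; token_baseline and spike_factor are assumed values."""
    hits = match_injection(event.get("user_prompt", ""))
    spike = event.get("tokens_out", 0) > token_baseline * spike_factor
    if hits and spike:
        severity = "high"
    elif hits:
        severity = "medium"
    else:
        severity = None
    return {"severity": severity, "matched": hits}


print(evaluate({"user_prompt": "Please IGNORE previous instructions.",
                "tokens_out": 2000}))
print(evaluate({"user_prompt": "What is the capital of France?",
                "tokens_out": 40}))
```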

Advanced

Project

Enterprise AI Threat Detection Pipeline

Scenario

You are responsible for monitoring a multi-model ML platform (e.g., NVIDIA Triton, Seldon Core) serving dozens of models. You need to detect model reconnaissance (probing), data exfiltration via repeated inference, and model poisoning through feedback loops.

How to Execute

1. Architect a normalized data schema (e.g., OCSF for AI) to standardize logs from different model serving frameworks.
2. Implement ML-based anomaly detection (e.g., Isolation Forest) within the SIEM or a downstream data lake on features like query rate per user, input entropy, and output variance.
3. Build a threat model mapping MITRE ATLAS techniques to specific log patterns.
4. Integrate alerts with SOAR playbooks to automatically throttle suspicious API keys or quarantine a model version.
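The features named in step 2 can be computed with the standard library before they are handed to an anomaly detector. The sketch below derives query rate, mean input entropy, and output variance per user; it stops short of the model itself (an Isolation Forest would come from a separate library). The event fields `input` and `output_score` are assumptions for illustration.

```python
import math
from collections import Counter, defaultdict
from statistics import pvariance


def shannon_entropy(text):
    """Character-level Shannon entropy in bits; 0.0 for empty input."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(text).values())


def user_features(events, window_s):
    """Build one feature row per user from events observed in a window."""
    grouped = defaultdict(list)
    for ev in events:
        grouped[ev["user_id"]].append(ev)
    features = {}
    for user, evs in grouped.items():
        entropies = [shannon_entropy(e["input"]) for e in evs]
        scores = [e["output_score"] for e in evs]
        features[user] = {
            "query_rate": len(evs) / window_s,           # queries per second
            "mean_input_entropy": sum(entropies) / len(entropies),
            "output_variance": pvariance(scores),        # population variance
        }
    return features


# Assumed demo events, not real telemetry.
events = [
    {"user_id": "u1", "input": "ab", "output_score": 0.2},
    {"user_id": "u1", "input": "ab", "output_score": 0.8},
    {"user_id": "u2", "input": "aaaa", "output_score": 0.5},
]
for user, row in user_features(events, window_s=60).items():
    print(user, row)
```

Rows like these can be exported to the data lake as a feature table, one row per user per window, for whichever detector is chosen.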

Tools & Frameworks

SIEM & Log Management Platforms

Splunk (with Heavy Forwarder / Universal Forwarder), Elastic Stack (with Logstash / Ingest Nodes), Microsoft Sentinel, Google Chronicle

Core platforms for ingestion, parsing, and analysis. Expertise involves writing efficient parsers and queries (Splunk SPL, Elasticsearch Painless, KQL) and managing data pipelines at scale.

Telemetry & Observability Tools

OpenTelemetry (OTel) Collector, Prometheus, Grafana Loki

Used to capture, buffer, and export inference metrics and traces. Understanding how to instrument ML frameworks (e.g., via `torch.utils.tensorboard` or custom exporters) is key.

AI/ML Serving Frameworks & Formats

TensorFlow Serving, Triton Inference Server, Seldon Core, KServe, Standardized schemas (OCSF, CEF)

The source systems generating telemetry. Knowledge of their specific log formats and configuration options is necessary to build accurate parsers.

Detection & Analysis Libraries

Sigma (for generic detection rules), YARA-L (for Chronicle), Python (Pandas/NumPy for log analysis)

For developing, testing, and sharing detection logic. Sigma allows rules to be translated across multiple SIEM platforms.

Interview Questions

Answer Strategy

Structure the answer around the data flow: Ingestion -> Normalization (extract core fields: user_id, prompt, response, model_id, tokens_in/out, latency, cost) -> Enrichment (add user risk score, model version) -> Detection. Sample answer: 'I'd first ensure we capture the full prompt/response for forensic review, then normalize token counts and latency for performance baselines. For detection, I'd create a tiered alert system: 1) Behavioral anomalies (e.g., a single user's token usage spiking 10x), 2) Content-based rules (e.g., regex for common injection patterns in prompts), and 3) Resource-based alerts (e.g., sustained high latency indicating a DoS attempt).'
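To make the normalization step and the behavioral tier of that answer concrete, here is a minimal sketch. It maps a raw record to a subset of the core fields and flags a user whose current token usage is at least 10x their median baseline. The raw record layout (`user`, `model`, `usage.prompt_tokens`, and so on) is hypothetical, and the median is only one possible choice of baseline.

```python
from statistics import median


def normalize(raw):
    """Map a raw inference record (hypothetical layout) to core fields."""
    usage = raw.get("usage", {})
    return {
        "user_id": raw.get("user"),
        "model_id": raw.get("model"),
        "tokens_in": usage.get("prompt_tokens", 0),
        "tokens_out": usage.get("completion_tokens", 0),
        "latency_ms": raw.get("latency_ms"),
    }


def token_spike(history, current, factor=10):
    """True when current usage is at least `factor` times the median baseline."""
    if not history:
        return False  # no baseline yet, so do not alert
    baseline = median(history)
    return baseline > 0 and current >= factor * baseline


# Assumed sample record for illustration.
raw = {"user": "u42", "model": "chat-v1", "latency_ms": 180,
       "usage": {"prompt_tokens": 1200, "completion_tokens": 300}}
event = normalize(raw)
total_tokens = event["tokens_in"] + event["tokens_out"]
print(event["user_id"], token_spike([100, 120, 110], total_tokens))
```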

Answer Strategy

The interviewer is testing systematic troubleshooting and the ability to correlate data across domains. Sample answer: 'I would immediately pivot on the error codes and input signatures. I'd correlate error logs with infrastructure metrics (CPU/GPU load, memory) from the same timeframe. If errors cluster around a specific input pattern (e.g., very long sequences, specific Unicode characters), it's likely adversarial. If errors are random and coincide with infrastructure instability (OOM kills, pod restarts), it's operational. I'd also check deployment logs for recent model or config changes.'
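The triage logic in that sample answer can be expressed as a rough heuristic. The sketch below labels a batch of errors as likely adversarial when most share one input signature, and as likely operational when most fall close in time to infrastructure events such as OOM kills or pod restarts. The field names, the 60% ratio, the 30-second window, and the minimum batch size are all assumptions for illustration.

```python
from collections import Counter


def triage(error_events, infra_events, window_s=30, cluster_ratio=0.6,
           min_errors=5):
    """Classify an error burst using two simple, assumed heuristics."""
    if len(error_events) < min_errors:
        return "insufficient data"
    total = len(error_events)
    # Heuristic 1: most errors share a single input signature.
    _, top_count = Counter(
        e["input_signature"] for e in error_events
    ).most_common(1)[0]
    if top_count / total >= cluster_ratio:
        return "likely adversarial"
    # Heuristic 2: most errors occur near an infrastructure event.
    infra_ts = [i["ts"] for i in infra_events]
    near = sum(
        1 for e in error_events
        if any(abs(e["ts"] - t) <= window_s for t in infra_ts)
    )
    if near / total >= cluster_ratio:
        return "likely operational"
    return "inconclusive"


# Assumed demo data: four of five errors share one long-sequence signature.
errors = [{"ts": t, "input_signature": "long-seq"} for t in (10, 20, 30, 40)]
errors.append({"ts": 50, "input_signature": "other"})
print(triage(errors, infra_events=[]))
```

A result of "inconclusive" is the cue to check deployment logs for recent model or config changes, as the answer suggests.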