AI Responsible Disclosure Specialist
An AI Responsible Disclosure Specialist identifies, documents, and coordinates the ethical reporting of vulnerabilities, safety fa…
Skill Guide
The engineering discipline of creating, maintaining, and operating Python-based software systems that systematically discover, test, and report vulnerabilities in AI/ML models and their serving infrastructure as a continuous, automated process.
Scenario
You have a deployed image classification model (e.g., a PyTorch model). You need to create a simple script that checks it for susceptibility to basic evasion attacks and verifies its input/output format.
Scenario
Your team uses a Git-based workflow. You must ensure that every new model commit is automatically scanned for code vulnerabilities, dependency issues, and basic performance regressions before deployment.
Scenario
As a security lead, you need to implement a rigorous, automated suite that stress-tests all production models against a battery of state-of-the-art adversarial attacks and provides a quantifiable robustness score.
Essential for understanding the structure, loading, and inference of the models you are securing. You must be proficient in the framework your organization uses.
Used to generate adversarial examples, conduct poisoning attacks, and audit model robustness. ART is particularly comprehensive for both attack and defense evaluation.
For automating security checks (SAST, SCA) within CI/CD workflows. Docker is used to containerize models and scan the resulting images for OS and library vulnerabilities.
For monitoring data drift, concept drift, and model performance in production, which are precursors to potential security issues. They help detect anomalies that may indicate an attack.
Answer Strategy
The answer should demonstrate a structured approach covering multiple attack vectors. Use a framework: 1) Threat Modeling for LLMs (e.g., prompt injection, data extraction, hallucinations). 2) Pipeline Stages: Static analysis of prompt templates, dynamic fuzz testing with malicious prompts, differential privacy analysis on training data queries, and performance load testing for denial-of-service. 3) Tooling: Mention using something like `langchain`'s testing tools or building custom red-teaming scripts integrated into the CI/CD process.
Answer Strategy
This tests real-world experience and problem-solving. Use the STAR method. Example: 'Situation: A production recommendation model was underperforming. Task: I needed to investigate. Action: I used anomaly detection on input features and discovered a data poisoning attack where malicious users were injecting skewed ratings. Remediation involved implementing input sanitization filters and retraining the model on a clean dataset. The result was a 20% performance recovery and the implementation of ongoing input validation checks.'
1 career found
Try a different search term.