AI Responsible Disclosure Specialist
An AI Responsible Disclosure Specialist identifies, documents, and coordinates the ethical reporting of vulnerabilities, safety fa…
Skill Guide
Risk severity scoring for AI failures is the systematic process of quantifying the potential impact and exploitability of machine learning model failures. It adapts cybersecurity scoring frameworks such as CVSS (Common Vulnerability Scoring System) and uses the OWASP LLM Top 10 to classify findings, so that each failure receives an actionable risk rating.
Scenario
A sentiment analysis model trained on a publicly available dataset has been compromised by maliciously injected training samples (data poisoning), causing it to misclassify negative reviews as positive with high confidence.
Scenario
A customer service LLM is exploited via a multi-step prompt injection (OWASP LLM01) that first extracts internal API keys (Confidentiality loss), then uses them to manipulate a connected database (Integrity loss).
Scenario
As the lead AI Security Architect, design and operationalize a risk scoring framework for all production ML/LLM systems at a financial institution subject to strict regulatory oversight.
Use CVSS calculators for consistent scoring, OWASP LLM Top 10 as the primary vulnerability taxonomy for LLMs, and NIST AI RMF to contextualize scores within broader governance and risk management processes.
Integrate risk scores into experiment tracking (MLflow, W&B) for audit trails. Use offensive tools like Garak and Counterfit to simulate attacks and generate empirical data to inform scoring metrics.
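As a rough illustration of what a scored finding could look like before it is attached to an experiment run, the sketch below maps a CVSS v3.x base score to its qualitative severity rating and builds a flat audit record. The field names, the model name, the finding ID, and the example vector are hypothetical and not taken from any particular tool. The resulting dictionary is plain JSON, so it could in principle be attached to a run through a tracker's metric and tag logging calls; that wiring is left out here.

```python
import json
from datetime import datetime, timezone


def severity_rating(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS scores range from 0.0 to 10.0")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


def build_audit_record(model_name, finding_id, taxonomy_id, vector, score, source_tool):
    """Assemble a flat, JSON-serializable record for one scored finding.

    All field names are illustrative assumptions, not a standard schema.
    """
    return {
        "model_name": model_name,
        "finding_id": finding_id,
        "taxonomy_id": taxonomy_id,      # e.g. an OWASP LLM Top 10 identifier
        "cvss_vector": vector,
        "cvss_base_score": score,
        "severity": severity_rating(score),
        "source_tool": source_tool,      # tool that produced the empirical evidence
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


# Hypothetical finding: prompt injection observed during a red-team scan.
record = build_audit_record(
    model_name="support-chatbot",        # hypothetical model name
    finding_id="FND-0001",               # hypothetical identifier
    taxonomy_id="LLM01",
    vector="CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:N/A:N",
    score=8.6,
    source_tool="garak",
)
print(json.dumps(record, indent=2))
```

Keeping the record flat makes it straightforward to store the numeric score as a metric and the remaining fields as tags, which keeps the audit trail queryable by severity or taxonomy ID.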
Answer Strategy
The interviewer is testing understanding of inter-system dependencies in CVSS scoring. The candidate should explain that, in CVSS v3.x, when a vulnerability in one component (the LLM) impacts resources managed by a different security authority (the database), Scope is Changed (S:C). A sample answer: 'If the LLM's failure directly leads to database corruption via an exploited API call, the Scope metric is Changed (S:C), which raises the base score for the same impact metrics. If the failure is confined to the LLM's internal state or output, Scope remains Unchanged (S:U).'
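To make the effect of Scope concrete, here is a minimal sketch of the CVSS v3.1 base score equations, using the metric weights and Roundup rule from the published specification. The two vectors are illustrative assumptions for the LLM-to-database scenario (network-reachable, no privileges, high Confidentiality and Integrity impact); they are not scores assigned to any real finding. Temporal and environmental metrics are not covered.

```python
import math

# CVSS v3.1 base metric weights.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "CIA": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required depends on Scope: (weight if Unchanged, weight if Changed).
PR_WEIGHTS = {"N": (0.85, 0.85), "L": (0.62, 0.68), "H": (0.27, 0.5)}


def roundup(value: float) -> float:
    """Round up to one decimal place, following the v3.1 Roundup definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 base score from a vector like 'AV:N/AC:L/...'."""
    m = dict(part.split(":") for part in vector.split("/"))
    changed = m["S"] == "C"
    cia = WEIGHTS["CIA"]

    # Impact Sub-Score, then the Scope-dependent Impact formula.
    iss = 1 - (1 - cia[m["C"]]) * (1 - cia[m["I"]]) * (1 - cia[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss * 0.9731 - 0.02) ** 13
    else:
        impact = 6.42 * iss

    pr = PR_WEIGHTS[m["PR"]][1 if changed else 0]
    exploitability = (
        8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]]
        * pr * WEIGHTS["UI"][m["UI"]]
    )

    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08  # Changed Scope applies a 1.08 multiplier before capping
    return roundup(min(total, 10))


# Failure confined to the LLM itself (Scope Unchanged).
contained = "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:N"
# Same failure reaching the connected database (Scope Changed).
chained = "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:N"

print(base_score(contained))  # 9.1
print(base_score(chained))    # 10.0
```

With every other metric held constant, switching S:U to S:C moves this example from 9.1 to the 10.0 ceiling. In practice an official CVSS calculator should be the source of record; a sketch like this is mainly useful for checking how individual metrics move the score.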
Answer Strategy
This tests the candidate's ability to translate technical risk into business terms. The strategy is to use a structured communication framework (e.g., Problem-Impact-Solution). Sample response: 'I presented a model inversion attack risk with a CVSS base score of 8.6. I framed it as: Problem (an attacker can extract private training data), Impact (a potential GDPR fine of up to 4% of global annual turnover, plus reputational damage), Solution (implement differential privacy, cost $X). Leadership approved the mitigation budget within a week.'
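The Impact figure in a brief like this is simple arithmetic, and a small sketch may help when preparing numbers for leadership. It assumes the higher GDPR fine tier (Article 83(5)), where the ceiling is the greater of EUR 20 million or 4% of total worldwide annual turnover. Which tier actually applies depends on the infringement, and the turnover values below are hypothetical.

```python
def gdpr_max_fine_eur(annual_turnover_eur: int) -> int:
    """Upper bound under the higher GDPR tier: the greater of
    EUR 20 million or 4% of worldwide annual turnover.

    This is a ceiling for risk framing, not a prediction of an actual fine.
    """
    four_percent = annual_turnover_eur * 4 // 100  # integer math avoids float drift
    return max(20_000_000, four_percent)


# Hypothetical turnover figures, for illustration only.
print(gdpr_max_fine_eur(2_000_000_000))  # 80000000
print(gdpr_max_fine_eur(100_000_000))    # 20000000
```

For smaller organizations the fixed EUR 20 million ceiling dominates, so quoting only the 4% figure can understate the exposure.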