AI Phishing Detection Specialist
An AI Phishing Detection Specialist designs, trains, and deploys machine learning and NLP-based systems that identify phishing ema…
Skill Guide
Adversarial machine learning is the discipline of understanding, crafting, and defending against intentionally malicious inputs that exploit vulnerabilities in ML models and cause them to make incorrect predictions or otherwise misbehave.
Scenario
A standard image classifier (ResNet-18) trained on CIFAR-10 is vulnerable to imperceptible perturbations. Your goal is to measure and improve its robustness.
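Measuring robustness usually starts by comparing clean accuracy with accuracy under a bounded attack. The following is a minimal sketch of that measurement, assuming NumPy is available. A logistic regression on synthetic data stands in for ResNet-18 on CIFAR-10, and the model, the data, and the value of eps are illustrative assumptions rather than part of the scenario. The attack shown is single-step FGSM; a real evaluation would more likely use a stronger iterative attack such as PGD.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-class data standing in for CIFAR-10 (assumption for illustration).
n, d = 400, 20
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = (X @ w_true > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logreg(X, y, steps=500, lr=0.5):
    """Plain gradient descent on the mean cross-entropy loss."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = sigmoid(X @ w)
        w -= lr * X.T @ (p - y) / len(y)
    return w

def fgsm(X, y, w, eps):
    """One-step L-infinity attack: x + eps * sign(dLoss/dx)."""
    grad_x = (sigmoid(X @ w) - y)[:, None] * w[None, :]
    return X + eps * np.sign(grad_x)

def accuracy(X, y, w):
    return float(np.mean((X @ w > 0) == (y == 1)))

w = train_logreg(X, y)
clean_acc = accuracy(X, y, w)
robust_acc = accuracy(fgsm(X, y, w, eps=0.1), y, w)
print(f"clean accuracy: {clean_acc:.3f}")
print(f"FGSM accuracy:  {robust_acc:.3f}")
```

Improving robustness would then typically mean retraining on such perturbed inputs (adversarial training) and repeating the same measurement.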
Scenario
You have access only to the confidence scores of a commercial cloud image classification API (e.g., Google Vision, Azure Computer Vision). Design a query-efficient attack to misclassify a stop sign as a speed limit sign.
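One family of score-based black-box attacks is greedy random search in the style of SimBA: try a small step along one coordinate, query the scores, and keep the step only if the target-class score goes up. The sketch below assumes NumPy and simulates the remote API with a hypothetical `query_scores` function backed by a small linear softmax model. The class list, the 64-feature "image", eps, and the query budget are all made-up values for illustration, and none of this reflects the real interface of any cloud vision service.

```python
import numpy as np

rng = np.random.default_rng(0)
CLASSES = ["stop sign", "speed limit", "yield"]
TARGET = 1  # index of "speed limit"

# Hypothetical stand-in for the cloud API. The attacker never reads _W;
# only the scores returned by query_scores() are used.
_W = rng.normal(size=(len(CLASSES), 64))

def query_scores(x):
    """Return class confidence scores for input x (simulated remote call)."""
    z = _W @ x
    e = np.exp(z - z.max())
    return e / e.sum()

def simba_targeted(x0, target, eps=0.2, max_queries=500):
    """Visit each coordinate once in random order, try +eps then -eps,
    and keep a step only if it raises the target-class score."""
    x = x0.copy()
    scores = query_scores(x)
    queries = 1
    for i in rng.permutation(x.size):
        for sign in (1.0, -1.0):
            if np.argmax(scores) == target or queries >= max_queries:
                return x, scores, queries
            cand = x.copy()
            cand[i] += sign * eps
            cand_scores = query_scores(cand)
            queries += 1
            if cand_scores[target] > scores[target]:
                x, scores = cand, cand_scores
                break  # step accepted, move on to the next coordinate
    return x, scores, queries

# Draw a hypothetical flattened 8x8 "image" that the oracle labels as a stop sign.
x0 = rng.uniform(0.0, 1.0, size=64)
while np.argmax(query_scores(x0)) != 0:
    x0 = rng.uniform(0.0, 1.0, size=64)

x_adv, adv_scores, n_queries = simba_targeted(x0, TARGET)
print("queries used:", n_queries)
print("final label:", CLASSES[int(np.argmax(adv_scores))])
```

Because each coordinate is visited at most once, the perturbation stays within eps in the L-infinity norm. A real attack would also clip to the valid pixel range and would usually search in a lower-dimensional basis (for example low-frequency DCT components) to save queries.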
Scenario
Deploy an image classifier for medical imaging (e.g., tumor detection) where provable robustness guarantees are required for regulatory approval.
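Provable guarantees come from certification methods rather than from attacks. One such method is interval bound propagation (IBP), sketched below under the assumption of a small fully connected ReLU network and an L-infinity threat model. The two-layer network and its weights are hypothetical and chosen only so the arithmetic is easy to follow. Randomized smoothing is a common alternative for larger models.

```python
import numpy as np

def ibp_bounds(lower, upper, layers):
    """Propagate an input box through affine layers with ReLU between them.
    `layers` is a list of (W, b); ReLU follows every layer except the last."""
    for k, (W, b) in enumerate(layers):
        center = (lower + upper) / 2.0
        radius = (upper - lower) / 2.0
        out_center = W @ center + b
        out_radius = np.abs(W) @ radius
        lower, upper = out_center - out_radius, out_center + out_radius
        if k < len(layers) - 1:
            lower, upper = np.maximum(lower, 0.0), np.maximum(upper, 0.0)
    return lower, upper

def is_certified(x, label, eps, layers):
    """Sound but conservative check. True means no perturbation with
    L-infinity norm <= eps can change the prediction; False only means
    the bound was too loose to prove it."""
    lower, upper = ibp_bounds(x - eps, x + eps, layers)
    worst_other = np.delete(upper, label).max()
    return bool(lower[label] > worst_other)

# Hypothetical two-layer network, for illustration only.
layers = [
    (np.array([[1.0, -1.0], [-1.0, 1.0]]), np.zeros(2)),
    (np.array([[1.0, 0.0], [0.0, 1.0]]), np.zeros(2)),
]
x = np.array([1.0, 0.0])  # logits are [1, 0], so the predicted class is 0

cert_small = is_certified(x, label=0, eps=0.1, layers=layers)
cert_large = is_certified(x, label=0, eps=0.6, layers=layers)
print("certified at eps=0.1:", cert_small)
print("certified at eps=0.6:", cert_large)
```

The certified accuracy reported to a regulator would be the fraction of test inputs that are both correctly classified and certified at the chosen eps.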
Use ART (the Adversarial Robustness Toolbox) for its comprehensive suite of attacks, defenses, and certification methods. Use Foolbox for its clean API and integration with PyTorch/TensorFlow/JAX. Both are well suited to standardized benchmarking and research.
Core frameworks for implementing models and custom adversarial attacks/defenses. Hugging Face is critical for attacking and defending transformer-based models.
AutoAttack is a widely used standard for evaluating empirical robustness. Use the 'A' and 'O' datasets (e.g., ImageNet-A and ImageNet-O) to test failure modes on natural images, and the 'C' datasets (e.g., CIFAR-10-C and ImageNet-C) to assess generalization under distribution shift, which often correlates with adversarial robustness.
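AutoAttack evaluates an ensemble of attacks and counts a sample as robust only if every attack in the ensemble fails on it. The sketch below shows that worst-case aggregation with NumPy. The function name and the boolean arrays are hypothetical placeholders, not output from a real evaluation, and this is not the AutoAttack library's own API.

```python
import numpy as np

def ensemble_robust_accuracy(clean_correct, attack_success):
    """clean_correct: shape (n_samples,), True where the clean prediction is right.
    attack_success: shape (n_attacks, n_samples), True where that attack
    found an adversarial example for that sample."""
    clean_correct = np.asarray(clean_correct, dtype=bool)
    attack_success = np.asarray(attack_success, dtype=bool)
    broken_by_any = attack_success.any(axis=0)
    return float(np.mean(clean_correct & ~broken_by_any))

# Placeholder values for five samples and two attacks (illustration only).
clean_correct = [True, True, True, True, False]
attack_success = [
    [False, True, False, False, False],   # hypothetical attack 1
    [False, False, True, False, False],   # hypothetical attack 2
]

per_attack = [ensemble_robust_accuracy(clean_correct, [row]) for row in attack_success]
combined = ensemble_robust_accuracy(clean_correct, attack_success)
print("per-attack robust accuracy:", per_attack)
print("ensemble robust accuracy:  ", combined)
```

The ensemble number can be lower than the robust accuracy under any single attack, which is why reporting only the weakest individual attack tends to overstate robustness.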
Answer Strategy
Tests the candidate's understanding of a core historical problem. Strategy: Define obfuscated gradients (shattered, stochastic, exploding/vanishing) and their effect on attack optimization. Sample Answer: 'Obfuscated gradients occur when a defense introduces non-differentiable components or noise, causing gradient-based attacks like PGD to fail. However, this doesn't imply true robustness. I would break it using a black-box attack (e.g., Square Attack) that doesn't rely on gradients, or replace the non-differentiable component with a differentiable approximation on the backward pass (BPDA). If the defense is randomized, I would average gradients over that randomness (EOT).'
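The differentiable-approximation idea can be illustrated with a toy case. In the sketch below, a quantization defense makes the true input gradient zero almost everywhere, so a gradient-sign attack makes no progress. Treating the quantizer as the identity on the backward pass (the BPDA idea) recovers a usable gradient. The linear classifier, its weights, the input, and eps are hypothetical values chosen for illustration, and NumPy is assumed.

```python
import numpy as np

w = np.array([1.0, -2.0, 0.5, 1.5])  # hypothetical linear classifier weights

def quantize(x, levels=4):
    """Defense: snap each feature to a coarse grid. Its derivative is zero
    almost everywhere, which 'shatters' the gradient."""
    return np.round(x * levels) / levels

def defended_logit(x):
    """Positive logit means the positive class."""
    return float(w @ quantize(x))

def true_input_gradient(x):
    """Chain rule through quantize(): zero almost everywhere (hard-coded
    here for simplicity instead of using an autodiff library)."""
    return np.zeros_like(x)

def bpda_input_gradient(x):
    """BPDA: keep quantize() on the forward pass but treat it as the
    identity on the backward pass, so d(logit)/dx is approximated by w."""
    return w.copy()

def signed_step(x, grad, eps):
    """Step against the logit to push a positive prediction negative."""
    return x - eps * np.sign(grad)

x = np.array([0.5, 0.0, 0.5, 0.25])
eps = 0.3
naive_adv = signed_step(x, true_input_gradient(x), eps)
bpda_adv = signed_step(x, bpda_input_gradient(x), eps)

print("clean logit:        ", defended_logit(x))
print("naive attack logit: ", defended_logit(naive_adv))
print("BPDA attack logit:  ", defended_logit(bpda_adv))
```

The naive attack leaves the input unchanged, which could be misread as robustness, while the BPDA step flips the sign of the logit within the same eps budget.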
Answer Strategy
Tests operational incident response and strategic thinking. Core competency: Balancing immediate mitigation with systemic improvement. Sample Answer: 'First, I would contain the issue by triggering a manual review for transactions with features near the decision boundary. Simultaneously, I'd use an attack-agnostic detector like an autoencoder on the latent space to flag anomalous feature patterns. Long-term, I'd implement a defense-in-depth strategy: retrain the model with the new attack data using adversarial training, and deploy an ensemble with diverse architectures to increase attack cost.'
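The attack-agnostic detector mentioned in the sample answer can be sketched with reconstruction error. To keep the example dependency-free apart from NumPy, PCA is used here as a linear autoencoder in place of a trained neural autoencoder. The synthetic feature data, the latent size k, and the 99th-percentile threshold are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_linear_autoencoder(X, k):
    """PCA as a linear autoencoder: returns the mean and top-k components."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k]

def reconstruction_error(X, mean, components):
    Z = (X - mean) @ components.T    # encode into the latent space
    X_hat = Z @ components + mean    # decode back to feature space
    return np.linalg.norm(X - X_hat, axis=1)

# Hypothetical clean feature vectors lying near a 3-dimensional subspace of R^10.
basis = rng.normal(size=(3, 10))
clean = rng.normal(size=(500, 3)) @ basis + 0.05 * rng.normal(size=(500, 10))

mean, comps = fit_linear_autoencoder(clean, k=3)
threshold = np.quantile(reconstruction_error(clean, mean, comps), 0.99)

def is_anomalous(X):
    """Flag inputs whose reconstruction error exceeds the clean threshold."""
    return reconstruction_error(X, mean, comps) > threshold

# Inputs pushed off the clean data manifold, standing in for manipulated features.
suspicious = clean[:20] + 1.0 * rng.normal(size=(20, 10))

print("flagged clean fraction:     ", float(is_anomalous(clean).mean()))
print("flagged suspicious fraction:", float(is_anomalous(suspicious).mean()))
```

Flagged inputs would be routed to manual review rather than rejected outright, since the threshold trades false positives on clean traffic against missed attacks.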