AI Default Prediction Specialist
An AI Default Prediction Specialist designs, trains, and operationalizes machine-learning models that forecast the probability of …
Skill Guide
Gradient-boosted tree methods (XGBoost, LightGBM, CatBoost) are ensemble machine-learning algorithms that build decision trees sequentially, with each new tree fitted to correct the errors of the trees before it. They are well suited to the structured, often high-dimensional tabular datasets common in finance, such as those used for credit scoring, fraud detection, and algorithmic trading.
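To make the "sequentially build trees to minimize prediction error" idea concrete, here is a minimal sketch of gradient boosting with squared-error loss and depth-1 trees (stumps). It is not how XGBoost, LightGBM, or CatBoost are implemented; those libraries add regularization, more sophisticated split finding, and many other refinements. The toy data (debt-to-income ratio against an observed default rate) and all function names are assumptions made up for illustration.

```python
# Toy gradient boosting: each round fits a stump to the current residuals.
# Data and names are hypothetical; this is a teaching sketch only.

def fit_stump(xs, residuals):
    """Return (threshold, left_mean, right_mean) minimizing squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # keep both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1], best[2], best[3]

def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

def fit_boosted(xs, ys, n_rounds=20, learning_rate=0.3):
    base = sum(ys) / len(ys)           # start from the mean prediction
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return {"base": base, "learning_rate": learning_rate, "stumps": stumps}

def predict_boosted(model, x):
    correction = sum(predict_stump(s, x) for s in model["stumps"])
    return model["base"] + model["learning_rate"] * correction

def mean_squared_error(preds, ys):
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)

# Hypothetical data: debt-to-income ratio vs. observed default rate.
xs = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60]
ys = [0.02, 0.03, 0.05, 0.10, 0.20, 0.35]

model = fit_boosted(xs, ys)
baseline_mse = mean_squared_error([model["base"]] * len(ys), ys)
boosted_mse = mean_squared_error([predict_boosted(model, x) for x in xs], ys)
print(f"baseline MSE: {baseline_mse:.5f}, boosted MSE: {boosted_mse:.5f}")
```

The learning rate shrinks each tree's contribution, which is why many small corrections are needed; the same trade-off between learning rate and number of trees shows up when tuning the real libraries.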
Scenario
Build a binary classifier to predict loan defaults using a dataset with features like income, debt-to-income ratio, and credit history.
Scenario
Develop a real-time fraud detection system for credit card transactions with imbalanced data and evolving patterns.
Scenario
Create an ensemble of XGBoost, LightGBM, and CatBoost for predicting asset returns, incorporating macroeconomic indicators and alternative data.
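One simple way to combine the three models in this scenario is to average their predictions, optionally with per-model weights. The sketch below assumes each model has already produced return forecasts for the same assets; the numbers, the `blend` helper, and the weights are hypothetical, and more elaborate schemes such as stacking are equally valid choices.

```python
# Weighted blending of per-model predictions (illustrative sketch).

def blend(predictions_by_model, weights=None):
    """Average predictions across models; equal weights if none are given."""
    names = list(predictions_by_model)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total_weight = sum(weights[name] for name in names)
    n_items = len(predictions_by_model[names[0]])
    return [
        sum(weights[name] * predictions_by_model[name][i] for name in names)
        / total_weight
        for i in range(n_items)
    ]

# Hypothetical next-period return forecasts for two assets.
predictions = {
    "xgboost": [0.010, -0.020],
    "lightgbm": [0.020, -0.010],
    "catboost": [0.030, 0.000],
}

equal_blend = blend(predictions)
# Assumed weights, e.g. chosen from validation performance.
tilted_blend = blend(predictions, {"xgboost": 2.0, "lightgbm": 1.0, "catboost": 1.0})
print(equal_blend, tilted_blend)
```

If weights are chosen from validation results, they should come from data that none of the base models were trained on, otherwise the blend tends to favor whichever model overfits most.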
Use XGBoost for robust, regularized performance, LightGBM for speed on large datasets (it also supports categorical features), and CatBoost for native handling of categorical features. Use Optuna for Bayesian hyperparameter tuning and MLflow for experiment tracking and a model registry in collaborative environments.
Use pandas_ta for technical indicators, featuretools for automated feature engineering on transactional data, SHAP for the model interpretability that regulated finance requires, and Alphalens for alpha factor analysis in quantitative strategies.
Answer Strategy
Focus on data preprocessing and model configuration. Start by addressing class imbalance: use CatBoost's auto_class_weights='Balanced', or the scale_pos_weight parameter that XGBoost, LightGBM, and CatBoost all provide. Engineer features such as transaction velocity and device fingerprints. Validate with stratified k-fold cross-validation, or with time-based splits when the data is time-ordered so that future information does not leak into training, and optimize for precision-recall AUC rather than accuracy. Sample answer: 'I'd set auto_class_weights to Balanced in CatBoost, engineer temporal and behavioral features, and validate with time-based splits to avoid leakage, focusing on PR-AUC as the primary metric.'
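The two quantities in this answer can be computed by hand on a toy example. The sketch below assumes the common heuristic of setting scale_pos_weight to the ratio of negative to positive examples, and computes PR-AUC as average precision, which is one standard way to summarize the precision-recall curve. The labels, scores, and helper names are hypothetical, and the ranking assumes there are no tied scores.

```python
# Class-imbalance weight and PR-AUC (average precision) on toy data.

def scale_pos_weight(labels):
    """Heuristic weight for the positive class: negatives / positives."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return n_neg / n_pos

def average_precision(labels, scores):
    """Mean of the precision values at the rank of each true positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / sum(labels)

# Hypothetical fraud labels (1 = fraud) and model scores.
labels = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
scores = [0.10, 0.20, 0.90, 0.30, 0.05, 0.15, 0.40, 0.60, 0.25, 0.12]

weight = scale_pos_weight(labels)
pr_auc = average_precision(labels, scores)
# Accuracy of a model that never flags fraud, for comparison.
naive_accuracy = labels.count(0) / len(labels)

print(f"scale_pos_weight={weight}, PR-AUC={pr_auc:.4f}, "
      f"always-negative accuracy={naive_accuracy}")
```

The always-negative model scores 80% accuracy here while catching no fraud at all, which is the reason to prefer PR-AUC on imbalanced data.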
Answer Strategy
This question tests communication and domain translation. Use SHAP force plots or summary plots to visualize feature contributions. Link predictions to business outcomes (e.g., 'This high-risk score is driven by recent late payments and high utilization, increasing expected loss by $500'). Sample answer: 'I used SHAP to show that recent late payments accounted for roughly 60% of the feature attribution behind the elevated default score, which aligns with our risk policy. I then quantified the expected loss reduction if we applied this model to approve 10% more loans.'
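A short worked example can help when explaining where such numbers come from. SHAP values are additive: a base value plus the per-feature contributions equals the model output, which for tree classifiers is typically in log-odds rather than probability. The sketch below assumes that log-odds setting, uses entirely hypothetical contribution values, and applies the standard expected-loss formula (probability of default times loss given default times exposure at default) with made-up LGD and EAD figures.

```python
import math

# From additive log-odds contributions to a default probability and an
# expected loss. All numbers are hypothetical illustrations.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

base_value = -3.0  # assumed average model output in log-odds
contributions = {  # assumed per-feature SHAP values in log-odds
    "recent_late_payments": 1.2,
    "credit_utilization": 0.5,
    "income": -0.2,
    "account_age": 0.1,
}

# Additivity: base value plus contributions gives this applicant's log-odds.
log_odds = base_value + sum(contributions.values())
default_prob = sigmoid(log_odds)

# Share of total absolute attribution carried by one feature.
total_attribution = sum(abs(v) for v in contributions.values())
late_payment_share = abs(contributions["recent_late_payments"]) / total_attribution

# Expected loss = PD * LGD * EAD, with assumed LGD and EAD.
loss_given_default = 0.45
exposure_at_default = 10_000.0
expected_loss = default_prob * loss_given_default * exposure_at_default

print(f"log-odds={log_odds:.2f}, PD={default_prob:.3f}, "
      f"late-payment share={late_payment_share:.0%}, "
      f"expected loss=${expected_loss:,.0f}")
```

Because the contributions add up in log-odds, a feature's share of the attribution is not the same as its share of the probability; it is worth stating which one is meant when presenting to stakeholders.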