Skill Guide

Statistical hypothesis testing for disparate impact analysis

The application of statistical tests (e.g., z-tests, chi-square, Fisher's exact test) to determine whether the selection rate for a protected group (e.g., race, gender) is significantly less than the rate for a favored group, often using the 4/5ths rule as a threshold for preliminary analysis.

Organizations use this to proactively identify and mitigate legal risk under anti-discrimination laws (e.g., Title VII, ECOA), ensuring employment and lending practices are fair. It provides a defensible, data-driven methodology to validate fairness in automated systems, which is critical for regulatory compliance and maintaining public trust.

1 Careers

1 Categories

8.5 Avg Demand

20% Avg AI Risk

How to Learn Statistical hypothesis testing for disparate impact analysis

1. Master the 4/5ths (80%) rule from the Uniform Guidelines on Employee Selection Procedures as a baseline heuristic.
2. Understand the basic hypothesis framework: H0 (no disparate impact) vs. H1 (disparate impact exists), and the concepts of p-value and significance level (alpha).
3. Learn the z-test for proportions as the workhorse test for comparing selection rates between two groups.
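The quantities named in these three steps can be written out as follows. The notation is an assumption introduced here for illustration: group 1 is the protected group and group 2 the reference group, with x selected out of n applicants in each.

```latex
\[
\text{IR} = \frac{\hat{p}_1}{\hat{p}_2}, \qquad
\hat{p}_1 = \frac{x_1}{n_1}, \quad \hat{p}_2 = \frac{x_2}{n_2}
\]
\[
H_0 : p_1 = p_2 \qquad \text{vs.} \qquad H_1 : p_1 \neq p_2
\]
\[
z = \frac{\hat{p}_1 - \hat{p}_2}
         {\sqrt{\hat{p}\,(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}},
\qquad
\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}
\]
```

An impact ratio IR below 0.8 fails the 4/5ths check. H0 is rejected when the p-value obtained from z under the standard normal distribution falls below alpha. A one-sided alternative (p_1 < p_2) is also sometimes used when only a shortfall for the protected group is of interest.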

1. Progress to handling small sample sizes with Fisher's exact test and analyzing multi-category outcomes with chi-square tests of independence.
2. Apply these tests to real HR datasets (hiring, promotion) or loan approval data, focusing on correct data segmentation and interpreting practical vs. statistical significance.
3. Avoid the common mistake of confusing correlation with causation; disparate impact analysis identifies a disparity, not necessarily intentional discrimination.
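For small samples, Fisher's exact test on a 2x2 table can be sketched with the standard library alone. This is a minimal illustration, not a replacement for a tested library routine such as `scipy.stats.fisher_exact`; the function name `fisher_exact_2x2` and the counts are hypothetical.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Fisher's exact test for the table
           selected  not selected
    protected   a         b
    reference   c         d
    Returns (two_sided_p, one_sided_p), where the one-sided p-value is the
    probability of seeing `a` or fewer protected-group selections."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def pmf(k):
        # Hypergeometric probability of k protected-group selections
        # given fixed row and column totals.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = pmf(a)
    # Two-sided: sum every table that is no more likely than the observed one.
    two_sided = sum(pmf(k) for k in range(lo, hi + 1) if pmf(k) <= p_obs * (1 + 1e-9))
    one_sided = sum(pmf(k) for k in range(lo, a + 1))
    return min(two_sided, 1.0), one_sided

# Hypothetical counts: 1 of 10 protected-group applicants selected,
# 11 of 14 reference-group applicants selected.
p_two, p_less = fisher_exact_2x2(1, 9, 11, 3)
print(f"two-sided p = {p_two:.6f}, one-sided p = {p_less:.6f}")
```

The two-sided definition used here (summing tables no more probable than the observed one) is one common convention; other software may use a different two-sided rule, so p-values can differ slightly between tools.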

1. Architect a continuous monitoring pipeline for disparate impact across the employment lifecycle (hiring, performance, attrition).
2. Master and implement multiple comparison corrections (e.g., Bonferroni) when testing impact across numerous subgroups or job categories to control the family-wise error rate.
3. Integrate disparate impact analysis into the MLOps lifecycle, developing retraining triggers and model fairness dashboards for algorithmic decision systems.
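A multiple comparison correction can be sketched in a few lines. The example below shows Bonferroni, which the text names, alongside the Holm step-down procedure, which is added here as a less conservative alternative that also controls the family-wise error rate. The p-values are hypothetical.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply each by the number of tests."""
    m = len(p_values)
    return [min(p * m, 1.0) for p in p_values]

def holm(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on;
        # the running maximum keeps the adjusted values monotone.
        running_max = max(running_max, (m - rank) * p_values[idx])
        adjusted[idx] = min(running_max, 1.0)
    return adjusted

# Hypothetical raw p-values from four subgroup comparisons.
raw = [0.01, 0.04, 0.03, 0.005]
alpha = 0.05
for name, adj in (("Bonferroni", bonferroni(raw)), ("Holm", holm(raw))):
    flagged = [i for i, p in enumerate(adj) if p < alpha]
    print(name, [round(p, 4) for p in adj], "significant tests:", flagged)
```

A comparison is reported as significant only when its adjusted p-value is below alpha, so the correction is applied once, to the whole family of tests, before any individual result is interpreted.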

Practice Projects

Beginner

Project

Applicant Flow Analysis for a Single Job Opening

Scenario

You are given a spreadsheet with 6-month hiring data for a 'Software Engineer' role. It has columns for applicant ID, race (Caucasian, African American, Hispanic), gender, and hire status (Yes/No). Your task is to determine if the selection process has a disparate impact on African American applicants versus Caucasian applicants.

How to Execute

1. Isolate the data for the two groups and calculate the selection rate for each.
2. Apply the 4/5ths rule for a quick preliminary check.
3. Conduct a two-proportion z-test (using Python's statsmodels or scipy) to determine if the observed difference is statistically significant at alpha=0.05.
4. Document the procedure, calculations, p-value, and conclusion in a one-page memo for HR.
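The first three steps can be sketched as below. To keep the example dependency-free it implements the pooled two-proportion z-test directly; in practice `statsmodels.stats.proportion.proportions_ztest` performs the same test. The applicant rows, the column names `race` and `hired`, and all counts are hypothetical stand-ins for the spreadsheet.

```python
from collections import Counter
from math import sqrt
from statistics import NormalDist

# Hypothetical applicant rows; the counts are invented for illustration.
rows = (
    [{"race": "African American", "hired": "Yes"}] * 12
    + [{"race": "African American", "hired": "No"}] * 68
    + [{"race": "Caucasian", "hired": "Yes"}] * 45
    + [{"race": "Caucasian", "hired": "No"}] * 105
    + [{"race": "Hispanic", "hired": "Yes"}] * 9
    + [{"race": "Hispanic", "hired": "No"}] * 41
)

def group_counts(rows, group):
    """Step 1: return (hired, applicants) for one group."""
    outcomes = Counter(r["hired"] for r in rows if r["race"] == group)
    return outcomes["Yes"], outcomes["Yes"] + outcomes["No"]

def two_proportion_ztest(x1, n1, x2, n2):
    """Step 3: pooled two-proportion z-test; returns (z, two-sided p-value)."""
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    return z, 2 * (1 - NormalDist().cdf(abs(z)))

x_prot, n_prot = group_counts(rows, "African American")
x_ref, n_ref = group_counts(rows, "Caucasian")

# Step 2: the 4/5ths check compares the ratio of selection rates to 0.80.
impact_ratio = (x_prot / n_prot) / (x_ref / n_ref)
z, p_value = two_proportion_ztest(x_prot, n_prot, x_ref, n_ref)

print(f"selection rates: {x_prot / n_prot:.3f} vs {x_ref / n_ref:.3f}")
print(f"impact ratio: {impact_ratio:.3f}, fails 4/5ths check: {impact_ratio < 0.8}")
print(f"z = {z:.2f}, two-sided p = {p_value:.4f}, significant at 0.05: {p_value < 0.05}")
```

The printed lines map directly onto the memo in step 4: the rates and impact ratio document the preliminary check, and the z statistic and p-value document the formal test.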

Intermediate

Project

Multi-Group Disparity Analysis in Promotion Decisions

Scenario

You are provided with promotion data for 500 employees across four departments. The data includes gender (Male/Female/Non-Binary) and promotion outcome (Promoted/Not Promoted) over two years. The goal is to assess if gender is a factor in promotion rates, controlling for department.

How to Execute

1. Segment the data by department. For each department, create a 3x2 contingency table (Gender x Promotion Outcome).
2. For departments where every expected cell count is adequate (a common rule of thumb is at least 5), use a chi-square test of independence. For departments with a small expected count in any cell, use Fisher's exact test (for tables larger than 2x2, its Freeman-Halton extension).
3. Apply a multiple comparison correction (e.g., Benjamini-Hochberg FDR) to the set of p-values from each department's test.
4. Report which specific department-gender combinations show significant disparities after correction, avoiding broad generalizations.
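A sketch of this workflow follows, with one table per department (three gender rows, two outcome columns). It uses only the standard library, relying on the fact that the chi-square upper tail with 2 degrees of freedom is exp(-x/2). Departments with small expected counts are only flagged here, not tested exactly, since the exact test for a table of this shape is better left to a statistics package. The department names and all counts are hypothetical.

```python
from math import exp

def chi_square_3x2(table):
    """Chi-square test of independence for three rows of
    [promoted, not_promoted]. Returns (statistic, p_value, min_expected)."""
    if len(table) != 3 or any(len(row) != 2 for row in table):
        raise ValueError("expected three rows of two counts")
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic, min_expected = 0.0, float("inf")
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            min_expected = min(min_expected, expected)
            statistic += (observed - expected) ** 2 / expected
    # (3 - 1) * (2 - 1) = 2 degrees of freedom, so the upper tail is exp(-x / 2).
    return statistic, exp(-statistic / 2), min_expected

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted, running_min = [0.0] * m, 1.0
    for pos, idx in enumerate(order):
        rank = m - pos  # 1-based rank when sorted ascending
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical tables; rows are Male, Female, Non-Binary.
departments = {
    "Engineering": [[30, 90], [18, 92], [2, 8]],
    "Sales": [[20, 60], [15, 45], [1, 4]],
    "Operations": [[10, 30], [4, 36], [1, 4]],
    "Finance": [[5, 10], [3, 10], [1, 1]],
}

results = {name: chi_square_3x2(table) for name, table in departments.items()}
adjusted = benjamini_hochberg([p for _, p, _ in results.values()])
for (name, (stat, p, min_exp)), q in zip(results.items(), adjusted):
    note = " (small expected count: prefer an exact test)" if min_exp < 5 else ""
    print(f"{name}: chi2 = {stat:.2f}, p = {p:.4f}, BH-adjusted p = {q:.4f}{note}")
```

A significant department-level result says only that promotion rates differ somewhere among the three gender groups in that department; identifying which group differs requires follow-up pairwise comparisons, which add further tests to the correction.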

Advanced

Project

Audit of an AI-Powered Resume Screening Tool

Scenario

Your company's AI resume screening tool has been in production for a year. You must conduct a formal disparate impact audit to comply with NYC Local Law 144. You have access to the historical applicant pool data (demographic info from voluntary self-identification) and the tool's recommended interview rates.

How to Execute

1. Define the 'favored' and 'protected' groups for analysis (e.g., by gender, race/ethnicity) based on legal and business context.
2. Calculate the impact ratio (selection rate of protected group / selection rate of favored group) for each protected category.
3. Perform a battery of statistical tests: a two-proportion z-test for each pairwise comparison and a chi-square test for overall group differences.
4. Analyze results in conjunction with a review of the model's feature importance and any disparate performance metrics (e.g., false positive rates).
5. Draft a formal audit report with findings, root-cause hypotheses (e.g., proxy variables in resume text), and a remediation plan (e.g., retraining, removing features, human-in-the-loop).
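The impact ratio calculation in step 2 can be sketched as follows, assuming the reference ('favored') group is taken to be the category with the highest selection rate. The category labels, counts, and the helper name `impact_ratios` are hypothetical.

```python
def impact_ratios(counts):
    """counts maps category -> (selected, total).
    Returns category -> (selection_rate, impact_ratio), where the ratio is
    taken against the category with the highest selection rate."""
    rates = {group: selected / total for group, (selected, total) in counts.items()}
    best = max(rates.values())
    return {group: (rate, rate / best) for group, rate in rates.items()}

# Hypothetical counts of candidates recommended for interview by the tool.
counts = {
    "Group A": (60, 200),
    "Group B": (45, 200),
    "Group C": (20, 100),
}

ratios = impact_ratios(counts)
# Categories whose impact ratio falls below the 4/5ths threshold.
flagged = [group for group, (_, ratio) in ratios.items() if ratio < 0.8]

for group, (rate, ratio) in ratios.items():
    print(f"{group}: selection rate = {rate:.3f}, impact ratio = {ratio:.3f}")
print("below 0.80:", flagged)
```

Flagged categories are the natural starting point for the pairwise tests in step 3; categories with very few applicants produce unstable ratios and deserve a note in the audit report.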

Tools & Frameworks

Statistical Software & Libraries

Python (statsmodels, scipy.stats, numpy); R (stats package, infer, rstatix); Excel (Data Analysis ToolPak)

Used for executing the core statistical tests (e.g., R's prop.test, chisq.test, and fisher.test), calculating p-values, and performing corrections. Python/R are preferred for reproducibility and scalability; Excel is useful for quick, auditable calculations with small datasets.

Legal & Compliance Frameworks

Uniform Guidelines on Employee Selection Procedures (4/5ths Rule); NYC Local Law 144 (Automated Employment Decision Tools); EEOC Compliance Manual Section 15

These provide the legal definitions, thresholds, and procedural requirements that define what constitutes 'disparate impact' and how analysis must be documented for defensibility.

Methodological Frameworks

Adverse Impact (AI) Analysis; Four-Fifths Rule; Standard Deviation (SD) Analysis (e.g., 2-SD rule)

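These are the structured approaches for conducting the analysis. The 4/5ths rule is a common initial test, while SD analysis asks whether the difference in selection rates exceeds roughly two standard deviations (approximately the two-sided 0.05 significance level). Because it accounts for sample size, SD analysis is often used in conjunction with formal hypothesis testing.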

Interview Questions

Answer Strategy

Structure your response around the formal steps: 1) Preliminary 4/5ths test, 2) Formulating hypotheses, 3) Choosing and executing the appropriate statistical test, 4) Interpreting the p-value in context, 5) Distinguishing statistical from practical significance. Sample Answer: 'First, I'd apply the 4/5ths rule: 18.75%/30% = 0.625, which is less than 0.80, triggering a need for further analysis. I would then set up a two-proportion z-test with H0: p_male - p_female = 0. Using the pooled proportion, I calculate a test statistic and p-value. If the p-value is below our chosen alpha (typically 0.05), I would reject H0 and conclude there is statistically significant evidence of disparate impact. I would report this result alongside the impact ratio and note that while the 4/5ths rule is a guideline, the z-test provides a more rigorous, defensible conclusion for legal proceedings.'
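The distinction between the 4/5ths check and the z-test in this answer depends heavily on sample size, which a short sketch can make concrete. The counts below are hypothetical, chosen only to reproduce the 30% and 18.75% selection rates quoted in the sample answer; the original question's actual counts are not given here.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_ztest(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    return z, 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical counts giving 30% (male) and 18.75% (female) selection rates.
small = {"male": (30, 100), "female": (15, 80)}
# The same rates with ten times as many applicants.
large = {"male": (300, 1000), "female": (150, 800)}

for label, data in (("small sample", small), ("large sample", large)):
    (xm, nm), (xf, nf) = data["male"], data["female"]
    impact_ratio = (xf / nf) / (xm / nm)
    z, p = two_proportion_ztest(xm, nm, xf, nf)
    print(f"{label}: impact ratio = {impact_ratio:.3f}, z = {z:.2f}, p = {p:.4g}")

z_small, p_small = two_proportion_ztest(30, 100, 15, 80)
z_large, p_large = two_proportion_ztest(300, 1000, 150, 800)
```

Both samples have the same impact ratio and fail the 4/5ths check, but with these hypothetical counts only the larger sample reaches significance at alpha = 0.05. This is why the answer reports the impact ratio and the p-value together rather than relying on either alone.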

Answer Strategy

Tests the candidate's ability to navigate conflict, communicate risk, and separate statistical findings from business and legal decisions. The answer should focus on process, risk communication, and documentation. Sample Answer: 'I would facilitate a meeting with legal, compliance, and the data science team. My role is to present the objective statistical findings clearly. I would emphasize that statistical significance is a factual finding, but the decision on how to proceed is a business risk assessment. I would recommend we document the analysis, the model's business justification, and the agreed-upon risk tolerance. If we proceed with the model, I would insist on implementing enhanced monitoring, an appeals process for applicants, and a plan to review alternative models with less impact as part of our ongoing responsibility.'