Local Law 144: NYC Bias Audits for AEDTs
- Local Law 144 is a pioneering NYC statute mandating annual independent bias audits of Automated Employment Decision Tools used in hiring and promotions.
- It establishes clear audit metrics like the impact ratio and operational protocols to evaluate racial and gender fairness, ensuring transparency in employment decisions.
- Practical challenges include limited tool coverage, difficulties in accessing public audit reports, and a need for more diversified, distribution-sensitive fairness metrics.
Local Law 144 (LL 144) is a pioneering New York City statute that mandates annual, independent bias audits for Automated Employment Decision Tools (AEDTs) used by employers in hiring and promotion decisions. Passed in 2021 and in effect since July 2023, LL 144 represents the first regulatory regime globally to institutionalize mandatory algorithm audits with public transparency requirements for AI/ML-based employment systems. Its statutory architecture, operational metrics, definitions, and practical challenges make LL 144 a bellwether for research and policy development on algorithmic accountability in the labor market (Groves et al., 2024, Clavell et al., 2024, Filippi et al., 2023, Wright et al., 2024).
1. Statutory Origins, Objectives, and Scope
LL 144 was passed by the New York City Council in late 2021 and, after two rounds of rule-making by the Department of Consumer and Worker Protection (DCWP), took effect in July 2023. The law applies to any private employer or employment agency with a primary workplace in NYC (including remote-first organizations headquartered in any of the five boroughs) that uses an AEDT in employment decisions. Both hiring and promotion are in scope.
The law was enacted to:
- Protect job applicants from discriminatory outcomes in automated hiring and promotion,
- Incentivize a market for financially independent third-party auditors,
- Enhance transparency, enabling job seekers to make informed decisions and exert reputational or legal pressure on employers and vendors employing biased tools (Groves et al., 2024, Wright et al., 2024).
Employers must:
- Engage an independent auditor (with no financial stake in the employer or tool) for an annual bias audit covering at least the race and gender features (and their intersections) of all AEDTs in use,
- Publicly post the audit report on an accessible website,
- Provide a transparency notice to each candidate, disclosing AEDT use and offering a right to opt out in favor of a human-only review (Groves et al., 2024, Wright et al., 2024).
2. Key Definitions and Audit Coverage
The term "Automated Employment Decision Tool" is defined as any system using machine learning or AI that "substantially assists or replaces" discretionary hiring or promotion decisions, by generating a simplified score, ranking, or classification from multiple input features. The law operationalizes “substantially assists” as the tool being the primary or predominant basis for selection, thereby excluding decision-support or rule-based systems regarded as merely advisory (Groves et al., 2024, Wright et al., 2024).
An "independent auditor" is defined solely by the requirement that they be a third-party expert with no direct or indirect financial stake in the tool, employer, or business outcomes. No professional credentials, accreditation, training, or formal codes of practice are mandated beyond arms-length payment on a fee-for-service basis (Groves et al., 2024).
The explicit focus on race and gender (using Equal Employment Opportunity Commission [EEOC] definitions), intersectional subgroups, and the requirement that all audit results be publicly posted are central features. However, auditors are permitted to omit demographic subgroups representing fewer than 2% of the relevant applicant population—a point that has produced significant operational and equity concerns (Clavell et al., 2024, Wright et al., 2024).
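To make the effect of the 2% threshold concrete, the following minimal sketch computes each intersectional category's share of an applicant pool and lists the categories an auditor would be permitted to omit. The record layout, function names, and counts are hypothetical and chosen only for illustration; they are not drawn from any published audit.

```python
from collections import Counter

def subgroup_shares(records):
    """Share of the audited population held by each (race, sex) category."""
    counts = Counter((r["race"], r["sex"]) for r in records)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

def excludable_groups(records, threshold=0.02):
    """Categories whose share falls below the threshold (2% by default)."""
    shares = subgroup_shares(records)
    return sorted(group for group, share in shares.items() if share < threshold)

# Hypothetical pool of 1,000 applicants (illustrative data only).
records = (
    [{"race": "White", "sex": "Male"}] * 400
    + [{"race": "White", "sex": "Female"}] * 350
    + [{"race": "Black or African American", "sex": "Female"}] * 235
    + [{"race": "Native Hawaiian or Pacific Islander", "sex": "Female"}] * 15
)

# The 15-person category holds 1.5% of the pool, so it may be left out.
print(excludable_groups(records))
```

In this toy pool the smallest category is the only one that drops out of the audit, which is the pattern the equity concerns above point to.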
3. Formal Audit Metrics and Computational Protocols
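LL 144 mandates that the core audit metric is the impact ratio (also called the adverse-impact ratio or "impact factor") for each protected group relative to the group with the highest selection or scoring rate. For binary outcomes (e.g., offer made/not made), for any group $g$ and reference group $g^{*}$ (the group with the highest selection rate):

$$\mathrm{IR}_g = \frac{\mathrm{SR}_g}{\mathrm{SR}_{g^{*}}}, \qquad \mathrm{SR}_g = \frac{\text{number of candidates selected in group } g}{\text{number of candidates in group } g}$$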
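For regression-based AEDTs outputting continuous scores $Y$, DCWP introduced metrics such as the following, where $g^{*}$ denotes the reference group with the highest value:
- Mean Difference (MeanDI): $\mathrm{MeanDI}_g = \dfrac{\mathbb{E}[Y \mid G = g]}{\mathbb{E}[Y \mid G = g^{*}]}$, the ratio of group mean scores
- Median Difference (MedDI): $\mathrm{MedDI}_g = \dfrac{P(Y > m \mid G = g)}{P(Y > m \mid G = g^{*})}$, where $m$ is the overall sample median of $Y$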
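As an illustration of the median-based approach for continuous scores, the sketch below assumes that a group's scoring rate is the share of its members scoring strictly above the pooled sample median, and that each rate is then divided by the highest group's rate. The function names and score lists are hypothetical and are not taken from DCWP materials or any audit.

```python
from statistics import median

def scoring_rates(scores_by_group):
    """Share of each group scoring strictly above the pooled sample median."""
    pooled = [s for scores in scores_by_group.values() for s in scores]
    m = median(pooled)
    return {
        group: sum(s > m for s in scores) / len(scores)
        for group, scores in scores_by_group.items()
    }

def impact_ratios(rates):
    """Each group's rate divided by the highest group's rate."""
    best = max(rates.values())
    if best == 0:
        raise ValueError("no group has a positive rate")
    return {group: rate / best for group, rate in rates.items()}

# Illustrative scores only; the pooled median of these 16 values is 69.
scores = {
    "group_a": [55, 60, 65, 70, 75, 80, 85, 90],
    "group_b": [40, 50, 60, 62, 68, 72, 78, 88],
}

rates = scoring_rates(scores)
for group, ratio in impact_ratios(rates).items():
    print(f"{group}: scoring rate {rates[group]:.3f}, impact ratio {ratio:.2f}")
# group_a: scoring rate 0.625, impact ratio 1.00
# group_b: scoring rate 0.375, impact ratio 0.60
```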
The "four-fifths rule" from EEOC guidance is institutionalized: IR < 0.8 for any group is considered prima facie evidence of adverse impact, though LL 144 does not itself mandate specific remediation at that threshold (Clavell et al., 2024, Wright et al., 2024, Filippi et al., 2023).
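The binary-outcome impact ratio and the 0.8 threshold can be sketched in a few lines. This is a minimal illustration under stated assumptions: the input is a mapping from group to hypothetical (selected, total) counts, and the helper names are invented for this example rather than taken from any audit tool.

```python
def selection_rates(outcomes):
    """outcomes maps group -> (number selected, number of candidates)."""
    return {group: sel / total for group, (sel, total) in outcomes.items()}

def four_fifths_flags(outcomes, threshold=0.8):
    """Return group -> (impact ratio, True if the ratio is below the threshold)."""
    rates = selection_rates(outcomes)
    reference = max(rates.values())  # highest selection rate is the reference
    if reference == 0:
        raise ValueError("no group has a positive selection rate")
    return {
        group: (rate / reference, rate / reference < threshold)
        for group, rate in rates.items()
    }

# Hypothetical counts, for illustration only.
outcomes = {
    "Male": (60, 200),    # 30% selected
    "Female": (42, 200),  # 21% selected
}

for group, (ratio, flagged) in four_fifths_flags(outcomes).items():
    print(f"{group}: impact ratio {ratio:.2f}, below 0.8: {flagged}")
# Male: impact ratio 1.00, below 0.8: False
# Female: impact ratio 0.70, below 0.8: True
```

A flag here only marks a ratio under 0.8; as noted above, LL 144 attaches no required remediation to that result.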
Notably, the statute does not prescribe sample sizes, mandate the collection of demographic variables beyond race and gender, or stipulate protocols for handling missing or ambiguous data. This regulatory openness leads to variability in audit practices and a risk of under-detecting bias, especially for small subgroups (Clavell et al., 2024, Filippi et al., 2023).
4. Implementation, Auditor Roles, and Empirical Outcomes
In the first months post-enactment, practical implementation revealed several limitations and ambiguities (Groves et al., 2024, Wright et al., 2024). Narrow interpretation of AEDT scope following industry lobbying reduced the set of tools subject to audit—many widely used screening tools remain outside the law’s reach. Auditors frequently face barriers to data access, including proprietary restrictions from vendors, incompleteness of demographic attributes, and employer reluctance. Sampling protocols are undefined, enabling cherry-picking and small-sample artifacts (Groves et al., 2024).
LL 144’s definition of "independent auditor" produced a wide spectrum of practice. Auditors have assumed four functional roles:
| Role | Description |
|---|---|
| Audit-readiness | Consultancy for pre-audit compliance and data governance |
| Compliance audit | Formal testing and reporting as prescribed by LL 144 |
| Remediation advisory | Guidance for bias mitigation and design improvement |
| Certification | Attestation or seal of conformity to a code of practice |
This four-part spectrum is referred to here as the “auditor role taxonomy”, an editor’s label for the roles described by Groves et al. (2024).
An empirical investigation of 391 NYC employers found only 4.6% posted audit reports and 3.3% posted transparency notices. Full compliance (both audit and notice) was 4%. Ambiguity in scoping (“null compliance”) allows employers broad discretion to self-determine whether LL 144 applies, confounding both enforcement and measurement of compliance (Wright et al., 2024).
5. Limitations of the Mandated Metrics and Critiques
In cases where AEDTs output continuous scores, LL 144 permits group mean or median-based tests. Research shows that both the mean and proportion-above-median metrics often fail to detect distributional biases, as they collapse entire predictive distributions to single points, missing disparities elsewhere (e.g., different variances, bimodal structures, or operational cutoffs that do not align with the mean/median) (Filippi et al., 2023). Alternative, distribution-sensitive metrics such as the area-under-the-curve disparate impact (AucDI) and probability of fair disparate impact (PfDI) are recommended to capture fairness across all thresholds, evaluating disparate impact at cut-offs placed at the quantiles of the pooled score distribution rather than at a single point (Filippi et al., 2023).
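The blind spot of single-point metrics can be shown with a small threshold sweep. The sketch below is not the AucDI or PfDI definition from Filippi et al. (2023); it is a simpler stand-in that recomputes the impact ratio with the cut-off placed at each decile of the pooled scores. The synthetic groups are constructed to share the same mean and median while differing in spread, and all names are hypothetical.

```python
from statistics import mean, quantiles

def rate_above(scores, threshold):
    """Share of scores strictly above the threshold."""
    return sum(s > threshold for s in scores) / len(scores)

def impact_ratio_curve(scores_by_group, n_quantiles=10):
    """Impact ratio (lowest rate / highest rate) at each pooled-quantile cut-off."""
    pooled = [s for scores in scores_by_group.values() for s in scores]
    curve = []
    for threshold in quantiles(pooled, n=n_quantiles):
        rates = [rate_above(s, threshold) for s in scores_by_group.values()]
        top = max(rates)
        ratio = min(rates) / top if top > 0 else float("nan")
        curve.append((threshold, ratio))
    return curve

# Synthetic scores (illustrative only): equal mean and median, different spread.
scores = {
    "group_a": list(range(0, 100)),      # wide spread: 0..99
    "group_b": list(range(25, 75)) * 2,  # narrow spread: 25..74, each twice
}

print("equal means:", mean(scores["group_a"]) == mean(scores["group_b"]))
for threshold, ratio in impact_ratio_curve(scores):
    print(f"cut-off {threshold:5.1f}: impact ratio {ratio:.2f}")
```

With this data the ratio at the pooled median is 1.0, yet it falls well below 0.8 at the upper deciles, where the narrow-spread group has few or no members. A mean-based or median-based test alone would report no disparity.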
Field experience indicates nearly all published audit impact ratios cluster above 0.8, which may reflect publication bias, self-selection by employers, and selective omission of small or outlier subgroups. The law’s “2 percent rule” in practice excludes many vulnerable populations (e.g., American Indian & Alaska Native, Pacific Islander, “two or more races”) from bias quantification (Wright et al., 2024, Clavell et al., 2024). LL 144 neither prescribes any corrective action for IR < 0.8 nor supports random regulator-initiated audits, limiting its capacity to drive substantive fairness improvements.
6. Transparency-Driven Theory of Change and Empirical Critique
LL 144 implements a transparency-driven “theory of change”: mandatory publication of audits and candidate notices is meant to foster accountability by empowering job seekers to make informed decisions and by facilitating reputational scrutiny. In practice, this theory has underperformed because most job seekers are unable to locate the relevant disclosures. Student investigators required, on average, 17–19 clicks and substantial time to find audit reports or notices, with some hidden in nonstandard interfaces, download links, or ambiguous disclaimers (Wright et al., 2024). The law assumes that candidate action, employer reputational incentives, and marketplace pressure will control bias, but evidence suggests these mechanisms are weak when audits and notices are difficult to access and hard to interpret, and when there is no requirement for remediation (Wright et al., 2024, Groves et al., 2024).
7. Policy Recommendations and Forward Directions
Synthesizing recommendations across primary studies (Groves et al., 2024, Clavell et al., 2024, Filippi et al., 2023, Wright et al., 2024), key proposals for improving statutory algorithmic auditing regimes include:
- Clarify scope: Define AEDTs and bias metrics with language that prevents late-stage carve-outs and ensures maximal coverage of all impactful automated systems.
- Metric augmentation: Require multiple, distribution-sensitive fairness metrics (e.g., AucDI, PfDI, counterfactual tests) alongside or in place of the single-point impact ratio.
- Reduce exclusions: Remove arbitrary minimum group size thresholds and require techniques (e.g., Bayesian smoothing) to handle small subgroups, ensuring coverage of historically marginalized populations.
- Centralize compliance artifacts: Institute a DCWP-maintained central repository for all audits and notices; require standardized, plain-language reporting templates to maximize accessibility and discoverability for job seekers and researchers.
- Tie audits to enforcement: Mandate corrective action when adverse impact is detected; define a safe harbor or discrimination floor to convert audits into substantive accountability mechanisms.
- Accreditation and audit ecosystem: Develop auditor accreditation, enforce clearer independence standards, and grant DCWP investigatory and enforcement powers analogous to financial or consumer safety regulators.
- Data handling and integrity: Standardize data curation, mandate audit trails and in situ verifications, and benchmark audit data against city or federal census standards for ongoing validity (Groves et al., 2024, Clavell et al., 2024, Wright et al., 2024).
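The "reduce exclusions" proposal above mentions Bayesian smoothing for small subgroups. One possible reading, offered only as a sketch, is a Beta-binomial shrinkage of each group's selection rate toward the pooled rate before impact ratios are computed. The Beta prior, the `prior_strength` pseudo-count, and the example counts are all assumptions made for illustration; neither LL 144 nor the cited studies prescribe this particular form.

```python
def smoothed_rate(selected, total, prior_rate, prior_strength=20):
    """Posterior mean selection rate under a Beta prior centred on prior_rate.

    prior_strength is the prior's weight in pseudo-observations
    (a hypothetical tuning choice, not a value from any regulation).
    """
    alpha = prior_rate * prior_strength
    beta = (1 - prior_rate) * prior_strength
    return (selected + alpha) / (total + alpha + beta)

def smoothed_impact_ratios(outcomes, prior_strength=20):
    """outcomes maps group -> (selected, total); returns smoothed impact ratios."""
    pooled_selected = sum(sel for sel, _ in outcomes.values())
    pooled_total = sum(total for _, total in outcomes.values())
    prior_rate = pooled_selected / pooled_total
    rates = {
        group: smoothed_rate(sel, total, prior_rate, prior_strength)
        for group, (sel, total) in outcomes.items()
    }
    reference = max(rates.values())
    return {group: rate / reference for group, rate in rates.items()}

# Hypothetical counts: one large group and one very small subgroup.
outcomes = {
    "group_a": (300, 1000),  # raw rate 30%
    "group_b": (1, 10),      # raw rate 10%, based on only ten candidates
}

for group, ratio in smoothed_impact_ratios(outcomes).items():
    print(f"{group}: smoothed impact ratio {ratio:.2f}")
```

Shrinkage stabilizes estimates for tiny groups, but it also pulls them toward the pooled rate and can understate a real disparity, so a report using it would reasonably show the raw ratio alongside the smoothed one.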
A coherent audit ecosystem and robust enforcement infrastructure, aligned with federal EEOC disparate impact doctrines and capable of evolving with demographic and technological changes, are critical for translating algorithmic transparency into actual fairness within automated hiring (Groves et al., 2024, Clavell et al., 2024, Wright et al., 2024, Filippi et al., 2023).