How employers and HR tech vendors should evaluate any bias auditor under NYC LL 144 — three approaches, six tests, and the governance discipline around them
Citizens Bank’s most recent annual AI bias audit — published openly on their careers site, as the law requires — discloses up to 640 statistical comparisons across 16 hiring algorithms, broken out by gender, ethnicity, and intersectional categories.
If that sounds like a data-science problem, you’re half right. It is also a governance problem — and treating it as only the first is where most companies get into trouble. NYC Local Law 144 compliance has quietly become a real strategic choice along two dimensions at once: which provider you engage to produce the audit, and how your organization operates the rest of the responsibility around it. And the stakes are wider than external hiring alone: LL 144 covers any AEDT used for hiring, promotion, or internal mobility decisions tied to a New York City–based role.
Key Takeaways
- Three approaches, not three boxes to check. Buyers can engage an independent bias audit firm, adopt an AI assurance platform with embedded audit delivery, or use AI governance tooling — but the strongest providers blend the first two. Pick based on context, not category label.
- Independence is necessary, not sufficient. NYC LL 144’s legal independence rule is a narrow bar. Methodology, sign-off, and structural independence — applied as six concrete questions to any provider — are what separate rigorous audits from PDFs dressed as audits.
- Passing an audit does not equal trustworthy AI. The audit is the legal floor. The harder questions — who is accountable, can you explain a decision, do you govern the data lifecycle, is oversight cross-functional — sit one level up.
- Enforcement has changed your risk calculus. The December 2025 Comptroller audit flagged 17 potential violations among 32 publicly posted audits DCWP had cleared. Tools implemented under soft enforcement should be re-examined under the new posture.
Audit and Certify AI Systems for Bias and Compliance
Learn how Warden AI audits and certifies AI recruitment tools for fairness and compliance, giving vendors, staffing firms, and enterprises third-party assurance. Book a 30-minute demo →
From Legal Checkbox To Governance Discipline
When NYC LL 144 took effect in July 2023, most companies treated it as a one-time legal hurdle: run an audit, post the summary, move on. Three years in, the market has shifted. Enterprise HR buyers now request audit evidence during procurement. Candidates and advocates scrutinize public summaries. Boards ask what’s behind the public-facing PDF.
The question employers and HR tech vendors are quietly wrestling with is no longer “how do we pass the audit?” — it’s “how do we operate compliance as a durable governance capability, not a once-a-year scramble?”
Workforce AI is where this distinction matters most. The systems shaping hiring, compensation, promotion, scheduling, and performance management touch livelihoods, equity, compliance exposure, and corporate reputation directly. They are, by any honest read of the risk landscape — including the EU AI Act’s high-risk classification — the most consequential AI in most organizations. They are also the systems most often treated as “just an HR tool” rather than as enterprise infrastructure deserving cross-functional governance.
Static PDFs published once a year are no longer enough — and as of December 2025, that’s no longer just market opinion.
Why This Matters Now: Enforcement Is Catching Up
In December 2025, the New York State Comptroller’s office released a critical audit of DCWP’s enforcement of Local Law 144 covering July 2023 through June 2025. The audit concluded that current enforcement is “ineffective” and surfaced a 17-to-1 review discrepancy that should focus every buyer’s mind: DCWP reviewed 32 companies’ publicly posted bias audits and flagged exactly one issue of non-compliance. The Comptroller’s office reviewed the same 32 companies and flagged at least 17 potential issues — evidence, the Comptroller said, of systemically inadequate review.
DCWP agreed to implement most of the Comptroller’s recommendations: strengthened complaint routing, cross-divisional staff training, formal written policies and procedures, and an enhanced enforcement approach that may include direct interviews with employers and demonstrations of AEDT tools in use. Employment-law analyses from major firms in early 2026, including DLA Piper’s, framed the audit as a roadmap for how DCWP is now expected to identify non-compliant companies, with civil penalties of up to $1,500 per violation and each day of non-compliant use counted as a separate violation.
The DLA Piper analysis is pointed: companies with existing audits should not assume a vendor-provided or prior audit is compliant, and may want a privileged review to identify gaps before they become enforcement liabilities.
The implication for buyers is direct. If the Comptroller’s read holds, the 32 publicly posted audits in the sample carried at least 17 potential non-compliance issues, at least 16 more than DCWP’s own review caught. Choosing the wrong approach, or the right approach with the wrong provider, is no longer a paperwork risk. It is a real enforcement risk, on a schedule.
Two Truths About the Bias Audit
The deliverable is statistical. What NYC LL 144 actually requires is data: selection rates by sex, race/ethnicity, and intersectional categories; impact ratios against the four-fifths threshold; scoring rates where applicable. That’s data engineering, statistical methodology, and chart production. It is not lawyering. It is not generic GRC. It is not internal HR work.
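As a minimal sketch of the arithmetic behind that deliverable, the Python below computes a selection rate per category and each category's impact ratio relative to the highest-rate category, then flags ratios under the four-fifths threshold. The function names, category labels, and counts are illustrative assumptions, not figures from any published audit. A real audit also involves methodological choices (small-sample handling, intersectional breakdowns, scoring-rate variants) that this sketch leaves out.

```python
from typing import Dict, Tuple

# Hypothetical input shape: category -> (applicants, selected).
Counts = Dict[str, Tuple[int, int]]


def selection_rates(counts: Counts) -> Dict[str, float]:
    """Selection rate = selected / applicants, per category."""
    return {
        category: selected / applicants
        for category, (applicants, selected) in counts.items()
        if applicants > 0  # skip empty categories instead of dividing by zero
    }


def impact_ratios(counts: Counts) -> Dict[str, float]:
    """Impact ratio = category selection rate / highest selection rate."""
    rates = selection_rates(counts)
    if not rates or max(rates.values()) == 0:
        raise ValueError("no selections to compare")
    top_rate = max(rates.values())
    return {category: rate / top_rate for category, rate in rates.items()}


def below_threshold(counts: Counts, threshold: float = 0.8) -> Dict[str, float]:
    """Categories whose impact ratio falls under the four-fifths threshold."""
    return {c: r for c, r in impact_ratios(counts).items() if r < threshold}


# Illustrative counts only.
sample: Counts = {
    "group_a": (200, 100),  # selection rate 0.50
    "group_b": (150, 60),   # selection rate 0.40
    "group_c": (100, 30),   # selection rate 0.30
}
ratios = impact_ratios(sample)
flagged = below_threshold(sample)
```

The calculation is the easy part. What the categories are, which time window the counts cover, and how thin categories are treated are the choices an auditor has to defend.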
The discipline is governance. But the audit deliverable, taken alone, is the legal floor — not the ceiling. The audit answers did this tool produce disparate outcomes during the period we measured? It does not, on its own, answer the harder questions an organization needs to answer to deploy workforce AI responsibly: who is accountable for the systems we use, how do we govern the data lifecycle, can we explain a single decision to a candidate or employee, do we have cross-functional oversight, and are we measuring business value and risk simultaneously?
The two truths sit in tension only if you let them. The right way to hold them together: the audit is the technical core of a broader governance system. You need both layers. Optimizing for one at the expense of the other is the most common failure mode.
That layered reality is why buyers face three distinct approaches to satisfying the audit requirement — and why the most credible providers increasingly blend them.
The Three Approaches To Getting Your LL 144 Bias Audit
A note before the categories: these are approaches a buyer can take, not boxes that all need to be checked. The market has converged on hybrid models, and most credible providers ship a software-driven workflow with human auditor sign-off, regardless of whether they describe themselves as a firm, a platform, or a service. Buyers don’t need all three — they choose based on context, methodology fit, and procurement requirements.
1. The independent bias auditor approach (services-led, methodology-deep)
Engage an independent bias audit firm whose core deliverable is the audit itself. The law bars vendors of the AEDT from auditing their own tool, which is why even data-science-rich companies cannot self-certify. Providers in this lane range from established consultancies rooted in I/O psychology and adverse-impact statistics to AI-audit specialists purpose-built for algorithmic auditing. What unites them is that the audit is the core product, not a feature added to something else.
Best fit for: buyers prioritizing methodology rigor and audit-discipline lineage.
Tradeoffs: longer engagement cycles and, for the most services-led firms, limited integration with the buyer’s tech stack.
2. The AI Assurance platform approach (continuous, hybrid)
Adopt a platform that produces the audit as part of an embedded, software-driven workflow with human auditor sign-off — and that continues to produce evidence between annual audits. Where a traditional consultancy delivers an annual snapshot, an AI assurance platform integrates with the AEDT itself and updates disparate impact and counterfactual analysis as data flows in, surfaces public-facing trust dashboards, and maintains audit trails that survive procurement scrutiny.
The strongest providers in this lane are independent bias auditors that have built a continuous, embedded delivery model on top of the core audit service. That combination is what lets them serve as the auditor of record and deliver ongoing evidence — a structural pattern the market has converged on for buyers who can’t afford to be blind for 364 days out of 365.
Best fit for: HR tech vendors and enterprise employers who need ongoing evidence rather than annual snapshots.
Tradeoffs: platform integration overhead and the need to scrutinize how independence is operationalized between the audit and platform sides of the business.
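To make "updates as data flows in" concrete, here is a hedged sketch of between-audit monitoring: it recomputes the lowest impact ratio for each reporting period from a decision log and lists the periods that fall under a threshold. The `Decision` record, the monthly period labels, and the 0.8 default are assumptions made for illustration. This is not a description of how any particular assurance platform works.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class Decision(NamedTuple):
    period: str     # hypothetical reporting bucket, e.g. "2026-01"
    category: str   # demographic category label
    selected: bool  # whether the AEDT outcome was a selection


def lowest_ratio_by_period(decisions: Iterable[Decision]) -> Dict[str, float]:
    """Per period: lowest selection rate divided by highest selection rate."""
    # period -> category -> [applicants, selected]
    tallies = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for d in decisions:
        cell = tallies[d.period][d.category]
        cell[0] += 1
        cell[1] += int(d.selected)

    result: Dict[str, float] = {}
    for period, by_category in tallies.items():
        rates = [selected / n for n, selected in by_category.values()]
        top = max(rates)
        if top > 0:  # a period with no selections has no defined ratio
            result[period] = min(rates) / top
    return result


def drift_alerts(decisions: Iterable[Decision], threshold: float = 0.8) -> List[str]:
    """Periods whose lowest impact ratio is under the threshold."""
    ratios = lowest_ratio_by_period(decisions)
    return sorted(p for p, r in ratios.items() if r < threshold)


def make_rows(period: str, category: str, applicants: int, selected: int) -> List[Decision]:
    """Helper that expands aggregate counts into individual log rows."""
    return [Decision(period, category, i < selected) for i in range(applicants)]


# Illustrative log: the ratio slips from 0.9 in January to 0.6 in February.
log = (
    make_rows("2026-01", "group_a", 100, 50) + make_rows("2026-01", "group_b", 100, 45)
    + make_rows("2026-02", "group_a", 100, 50) + make_rows("2026-02", "group_b", 100, 30)
)
period_ratios = lowest_ratio_by_period(log)
alerts = drift_alerts(log)
```

An annual snapshot taken in January would have reported a ratio of 0.9 and missed the February drop, which is the gap continuous delivery is meant to close.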
3. The governance tooling approach (operational, sometimes audit-capable)
Use AI governance infrastructure — AI inventory, risk registers, model cards, vendor management — that may include audit features as part of a broader suite. This is the most contested of the three approaches. Some governance vendors have built credible audit practices on top of their tooling; others have bolted on a “compliance” feature that auto-generates a PDF without the methodology depth, qualified personnel, or audit-discipline rigor that LL 144 actually requires.
Best fit for: buyers who want compliance integrated with broader AI risk management.
Tradeoffs: the audit deliverable needs to be qualified separately. The parent product wasn’t designed for the auditor role, and the audit team’s structural independence from the broader governance product is the test that matters.
The Independence Question, Answered Honestly
All three approaches can plausibly produce the audit deliverable — but with very different levels of methodological depth and structural independence. The legal definition of “independence” under LL 144 (not having developed the AEDT, not having a direct or material indirect financial interest with the employer or AEDT vendor) is narrower than financial-audit independence and doesn’t, on its own, distinguish a rigorous provider from a casual one.
Category membership is therefore a poor proxy for credibility. The better screen is methodology, sign-off, and structural independence — applied as a checklist to any provider in any lane.
Six Questions To Ask Any Potential Auditor
G2’s own inclusion criterion for the AI Bias Audit Services category captures the right bar: a credible audit involves “humans examining the datasets, processes, and implementation” — not just an algorithmic output rendered as a PDF. Six questions operationalize that bar:
- Did you participate in developing, testing, or training the AEDT being audited? A “yes” disqualifies the provider under LL 144’s independence rule.
- Who on your team will perform the disparate impact analysis, and what are their qualifications? Look for data scientists with adverse-impact experience, I/O psychologists, or credentialed audit professionals — not generalists running a script.
- Will you publish your methodology, or only the results? A defensible audit can stand behind its choices in public. Methodology opacity is a red flag.
- Is your fee structure flat, or tied to audit outcomes in any way? Contingent fees compromise independence, full stop.
- If you also sell platform or governance products, how is the audit team separated from the product team? The honest answer involves a separate methodology, separate personnel, and a documented operating procedure — not just a marketing distinction.
- Will a named individual sign the audit summary as auditor of record, and stand behind it under regulatory or litigation scrutiny? Reports without a named human signatory are the clearest tell that the deliverable is software output dressed as an audit.
These six questions cut through the three approaches. They work whether the provider calls itself an audit firm, an assurance platform, or a governance tool.
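For teams that track vendor due diligence in a structured form, the six questions can be recorded as a simple screening record. The sketch below is one hypothetical way to do that in Python. The field names and the wording of each concern are assumptions, and the output is a prompt for follow-up, not a legal determination.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProviderAnswers:
    helped_build_aedt: bool               # Q1: developed, tested, or trained the AEDT
    analysts_qualified: bool              # Q2: adverse-impact qualifications on the team
    publishes_methodology: bool           # Q3: methodology public, not only results
    fee_tied_to_outcome: bool             # Q4: any contingent fee component
    audit_team_separated: Optional[bool]  # Q5: None if the provider sells no other products
    named_signatory: bool                 # Q6: named auditor of record signs the summary


def screen(answers: ProviderAnswers) -> List[str]:
    """Return the questions that need follow-up, in question order."""
    concerns: List[str] = []
    if answers.helped_build_aedt:
        concerns.append("Q1: disqualified under the LL 144 independence rule")
    if not answers.analysts_qualified:
        concerns.append("Q2: analyst qualifications unclear")
    if not answers.publishes_methodology:
        concerns.append("Q3: methodology not published")
    if answers.fee_tied_to_outcome:
        concerns.append("Q4: fee tied to audit outcome")
    if answers.audit_team_separated is False:
        concerns.append("Q5: audit team not separated from product team")
    if not answers.named_signatory:
        concerns.append("Q6: no named auditor of record")
    return concerns


# Illustrative provider: contingent fee and no named signatory.
candidate = ProviderAnswers(
    helped_build_aedt=False,
    analysts_qualified=True,
    publishes_methodology=True,
    fee_tied_to_outcome=True,
    audit_team_separated=None,
    named_signatory=False,
)
concerns = screen(candidate)
```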
A Tale of Two Disclosures
The clearest way to see how the three approaches operate in practice is to compare two real, public LL 144 disclosures.
Citizens Bank publishes an annual snapshot. Sixteen scoring algorithms, up to 640 comparisons, third-party auditor, point-in-time tables sorted by gender, ethnicity, and intersectional categories. That’s the audit floor done well. It satisfies the law. It demonstrates governance maturity. It also requires a fresh effort every twelve months, with the methodological choices re-defended each cycle — and, in the post-Comptroller enforcement environment, with external scrutiny that prior cycles didn’t face. A specialized-audit-firm engagement.
Beamery, an AI talent platform, took a different path. As an HR tech vendor rather than an employer, Beamery was not directly obligated under LL 144 — the law puts the audit duty on the employer using the AEDT, not on the builder of the tool. Beamery audited anyway. Their first published third-party bias audit, in November 2022, predated LL 144 enforcement. Beamery has since moved to a continuously-updated AI Assurance Dashboard. The current disclosure covers both disparate impact analysis (the LL 144 standard) and counterfactual analysis (which goes beyond it). The methodology is public. The data updates live. An assurance-platform engagement.
Both companies are compliant. Both are doing the work credibly. The difference isn’t whether they meet the law — it’s what kind of evidence they can put in front of a buyer, a candidate, or a regulator on any given day, and which of the three approaches fits their operating model.
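The two analysis types in the Beamery disclosure answer different questions. Disparate impact analysis compares outcomes across groups. Counterfactual analysis, in its common form, changes one protected-attribute signal on an otherwise identical profile and checks whether the score moves. The Python below is a toy sketch of that idea under stated assumptions: `toy_score` is a deliberately biased stand-in for an AEDT, and the profile fields are hypothetical. It does not describe Beamery's or any auditor's actual methodology.

```python
from typing import Callable, Dict, List

Profile = Dict[str, str]


def counterfactual_gaps(
    score: Callable[[Profile], float],
    profiles: List[Profile],
    attribute: str,
    swap_to: str,
) -> List[float]:
    """Score change when only `attribute` is swapped, one gap per profile."""
    gaps: List[float] = []
    for profile in profiles:
        variant = {**profile, attribute: swap_to}  # copy with one field changed
        gaps.append(score(variant) - score(profile))
    return gaps


def toy_score(profile: Profile) -> float:
    """Hypothetical scorer that wrongly penalises one first name."""
    base = 50.0 + 10.0 * int(profile["years_experience"])
    penalty = 20.0 if profile["first_name"] == "Name B" else 0.0
    return base - penalty


# Illustrative profiles that differ only in experience.
profiles = [
    {"first_name": "Name A", "years_experience": "3"},
    {"first_name": "Name A", "years_experience": "5"},
]
gaps = counterfactual_gaps(toy_score, profiles, "first_name", "Name B")
```

A non-zero gap on profiles that are otherwise identical points at the swapped signal itself, something group-level selection rates cannot isolate.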
Passing An Audit Does Not Equal Trustworthy AI
Here is the lesson that NYC LL 144’s first three years have surfaced most clearly: a passed audit and trustworthy AI are not the same thing.
The audit answers a narrow, technical question: did this tool produce statistically disparate outcomes across protected categories during the measurement period? That is necessary. It is not sufficient. The harder set of questions sits one level up:
- Do we understand how these systems actually influence hiring, compensation, promotion, and performance decisions — not just what the model outputs, but what the humans on top of the model do with those outputs?
- Can we explain a specific outcome to a specific candidate or employee, on demand, in language they can act on?
- Are we governing the data lifecycle — collection, retention, transfer, deletion — and not just the model layer?
- Do we have genuine cross-functional oversight that includes HR, Legal, Compliance, Security, Procurement, and IT — not just whichever function owns the contract with the auditor?
- Are we measuring business value and risk in the same forum, with the same seniority, on the same cadence?
None of those questions is answered by the audit deliverable itself. All of them are required for an organization to honestly claim that its workforce AI is trustworthy — and increasingly, all of them are what regulators, boards, candidates, and enterprise customers are starting to probe.
The Governance Operating Model Around the Audit
Treating bias auditing as a technology or legal problem produces the failure modes the Comptroller’s audit surfaced — sign-off without scrutiny, PDFs without methodology, “compliance” without operations. Treating it as an enterprise operating-model problem is what produces durable defensibility. In practice, that means operationalizing governance into six places:
- Workflows — the audit findings have to land somewhere that triggers action. If a finding sits in a static PDF that nobody reads between audits, the audit hasn’t done its operational job.
- Procurement — vendor selection criteria should require auditability as a baseline, not a nice-to-have. The tools you can’t audit are the tools you cannot defend.
- Model oversight — a named owner for each AEDT, with authority to pause it. Not an inventory line item.
- HR and IT partnership structures — most workforce AI lives in the seam between HR and IT. Governance only works if both functions share the same definitions, the same risk register, and the same escalation paths.
- Executive accountability — somebody at the leadership table owns AI risk in the same way they own financial risk or security risk. Not a working-group co-chair.
- Ongoing monitoring and red teaming — between formal audits, somebody is actively probing the system for drift and adverse impact. Audits are episodic; governance is continuous.
This is what separates organizations that survive enforcement and procurement scrutiny from organizations that pass an audit and discover, twelve months later, that the audit’s findings were never integrated into anything that actually changed.
What This Means for HR and Talent Leaders Deploying AEDTs
NYC LL 144 puts the audit obligation squarely on the employer using the AEDT — not on the vendor that built it, not on the auditor you hire. That covers any AEDT making hiring, promotion, or internal-mobility decisions for an NYC-connected role — so the scope is wider than the typical “hiring AI” framing implies. The obligation is both a constraint and a leverage point. Four implications for the HR, talent, and AI-program leaders inside enterprises deploying these tools:
The obligation is non-delegable. You can engage the most credible auditor in the market, but the legal duty to commission the audit, publish the summary, notify candidates, and respond to data requests still belongs to your organization. A vendor’s “we’re compliant” statement is not, on its own, evidence that you are compliant — and DCWP enforcement is moving toward direct verification (interviews with employers, demonstrations of AEDT tools in use) where vendor claims won’t substitute for your records.
The Comptroller’s audit changed your risk calculus, even if your tools didn’t change. What was a soft-enforcement environment in 2023–2024 is a meaningfully sharper one in 2026. Tools you implemented under the old posture should be re-examined under the new one. The privileged review that DLA Piper recommended for existing audits applies broadly: if your current bias audit predates the December 2025 enforcement shift, it’s worth a second look — not because the audit itself was wrong, but because the bar it was measured against has moved.
Your public disclosure is long-tail evidence. The audit summary you publish lives on your careers site indefinitely — and lives in archive.org permanently. It is referenceable by candidates, advocates, plaintiffs’ counsel, and regulators years after publication. That is the point of the law’s transparency requirement, but it also means a thin or methodologically weak summary becomes a long-term liability. Treat the published artifact with the same care you’d treat any other public-facing disclosure with legal weight.
Procurement is your governance leverage. You can require audit-readiness from your HR tech vendors as a precondition for selection, renewal, or expansion. Doing so transfers part of the compliance burden upstream — vendors with an existing third-party audit save you weeks of work — and signals to the market that audit infrastructure is a buying criterion, not a nice-to-have. The vendors that respond to this signal are the ones easiest to operate compliantly. The vendors that resist it are the ones to walk away from.
What This Means For HR Tech Vendors Specifically
NYC LL 144 puts the audit obligation on the employer, not the AI builder — but that doesn’t mean vendors should treat the audit as someone else’s problem. There are four reasons builders should put their AI through independent bias audits even though they aren’t directly named in the law.
Customer enablement. Enterprise customers are subject to NYC LL 144 — and to the wave of similar laws coming behind it. They need credible audit evidence at the system level. A vendor that arrives with a defensible, independent audit reduces the customer’s compliance burden and removes a real friction point in the buying cycle.
Regulatory risk that does apply to builders. LL 144 doesn’t reach vendors directly, but other regimes do. The EU AI Act classifies AI systems used in employment as high-risk and places obligations on the provider, not just the deployer. Discrimination litigation has begun reaching the builder side, not only the employer side. State AI laws are layering additional duties on the builder side too: Illinois HB 3773 took effect January 2026, California’s ADMT and FEHA regulations are in force, and Colorado rewrote its 2024 framework in May 2026 (SB 26-189 replaced SB 24-205, swapping bias-audit-aligned risk assessments for a disclosure-and-human-review model effective January 2027). Note that only NYC LL 144 mandates a bias audit by name; the others create the discrimination liability that an independent bias audit is the strongest evidentiary defense against. The vendor with audit infrastructure today is positioned for the regulations that are coming, not just the one that’s already here.
Trust and reputation. Public bias incidents in workforce AI — across hiring, promotion, and internal mobility — generate disproportionate scrutiny from candidates, advocacy groups, journalists, and regulators. A vendor with a documented, public audit history is meaningfully more defensible across all four audiences than one without.
Procurement velocity. HR tech RFPs increasingly include questions about bias audits, methodology disclosure, and ongoing assurance. The vendor that can hand over a live trust page — methodology disclosed, data current, third-party-audited, named signatory — closes faster than the vendor that emails over last year’s PDF. That is already happening in real RFPs.
Compliance has shifted from a cost center to a sales asset. Choosing the right approach — and operating it under the six-question test, inside a governance discipline that goes beyond it — is how mature builders get there.
The Diagnostic That Matters
Most HR tech and HR teams have made implicit choices on both dimensions — provider and governance — without examining either. The diagnostic worth running has two parts.
On the provider: Which of the three approaches are you using? Who is actually doing the work? Would their methodology survive the six-question test? Could you defend it in front of a regulator or a sophisticated enterprise buyer tomorrow?
On the governance: Where does the audit’s output land in your organization? Who is accountable for acting on it? Do HR, Legal, Compliance, Security, Procurement, and IT share a single risk register and a single escalation path? Can you explain a specific AI-influenced employment decision — hiring, promotion, or internal mobility — to the person it affected, today, on request?
If you have an annual auditor and nothing else, you have the legal floor and a thin one — rigorous on audit day, blind for the 364 that follow. If you have a governance platform but no independent audit, you’ve inverted the priority. If you have an assurance platform without published methodology or a named auditor of record, you have continuity without credibility — which won’t survive the first regulatory or procurement challenge. And if you have any of those without the cross-functional operating model around it, you have technical compliance without operational defensibility.
For buyers who need ongoing evidence — most HR tech vendors and any enterprise employer running multiple AEDTs — a strong position is an assurance platform that is an independent bias auditor: audit rigor combined with continuous, embedded delivery, both meeting the six-question bar. Done well, that’s the technical foundation. The governance discipline around it is what turns that foundation into something a board, a regulator, and a procurement committee can actually trust.
The three-approach framework isn’t a product pitch. It’s a diagnostic. The six questions are how you operate it. And the governance model around it is what separates organizations that pass audits from organizations that earn trust.
Bias Auditor FAQs
Does my company need a bias auditor if we're not based in New York City?
If you use an AEDT to evaluate candidates or employees for any role connected to New York City — external hiring, internal promotion, or mobility decisions, including remote positions filled by NYC residents — yes. The law follows the job, not the employer's headquarters, and applies to both hiring and promotion AEDTs. Beyond NYC, other regimes (EU AI Act, California FEHA, Illinois HB 3773, Colorado's new SB 26-189) create discrimination liability that an independent bias audit is the strongest evidentiary defense against, even where a bias audit isn't mandated by name.
Can our AEDT vendor perform the bias audit?
No. NYC LL 144 explicitly bars vendors of the AEDT from auditing their own tool. The auditor cannot have been involved in developing, testing, or training the system, and cannot have a direct or material indirect financial interest with either the employer or the AEDT vendor. This is the most-cited reason a vendor “compliance certificate” doesn’t satisfy the law on its own.
What's the difference between an annual bias audit and continuous AI assurance?
An annual audit is a snapshot — methodologically rigorous on the day it runs, then static for twelve months while the model and the data both move. Continuous assurance tracks fairness metrics between formal audits, catching model drift or performance degradation before it becomes a compliance issue or a discrimination claim. The legal minimum under LL 144 is annual; the operational floor for serious buyers is increasingly continuous.
Who should own AI bias auditing inside our organization?
The right answer is shared. The audit deliverable itself sits naturally with the team that owns the AEDT (often HR or Talent, sometimes IT), but the governance around it requires cross-functional ownership: HR for workforce impact, Legal for regulatory interpretation, Compliance for policy enforcement, Security for data governance, Procurement for vendor selection criteria, and IT for system integration. The organizations that get this right have a single forum where all six functions see the same risk register and the same audit findings — not a working group that meets quarterly.
What happens if our bias audit reveals adverse impact?
The law does not prohibit using a tool that shows adverse impact — it requires you to disclose the results. However, ongoing adverse impact creates exposure under broader discrimination law, so the practical response is to investigate the source of the disparity, document the investigation, implement mitigations where feasible, and monitor outcomes after any changes. A credible auditor’s report will include remediation guidance alongside the statistical findings.



