A practical dictionary for understanding fairness, risk, and accountability in AI systems
Overview
Bias auditing is no longer a niche concern. It now sits at the centre of everyday conversations across HR tech, talent platforms, enterprise procurement, and AI governance.
Yet many of those conversations are still happening without a shared language.
The same terms appear in sales decks, policies, audit reports, and regulatory guidance, often meaning slightly different things depending on who is using them. That ambiguity creates risk. It also makes meaningful oversight harder than it needs to be.
The aim of this lexicon is simple: to make that language precise.
This dictionary provides plain-English definitions of the most commonly used terms in AI bias auditing, grounded in how audits are conducted in practice, not how fairness is marketed. Where terminology varies across vendors or regulations, that is made explicit.
If you are responsible for deploying AI, buying AI, or explaining how AI hiring systems work to regulators, clients, or candidates, this is the language you need to be comfortable using and, just as importantly, challenging.
Bias that manifests in the outputs of an automated or algorithmic system, where observed differences in outcomes are attributable to how the system processes data, applies rules, or learns from patterns.
Why it matters in practice:
AI bias is observable and testable through outputs. It does not require insight into intent or source code.
May also be called:
Algorithmic bias, model bias
The documented framework defining metrics, datasets, assumptions, thresholds, and review processes used in an audit.
Why it matters in practice:
Findings are only defensible if the methodology is transparent and repeatable.
May also be called:
Audit framework
The formal artefact summarising audit scope, methodology, findings, and limitations.
Why it matters in practice:
This is the document regulators and clients expect to review.
May also be called:
Audit documentation
The legal and regulatory frameworks under which an audit is performed, such as NYC LL 144, SB-205, EU AI Act, or FEHA.
Why it matters in practice:
Scope determines what must be tested and how findings can be used.
May also be called:
Regulatory scope
A documented record of data sources, methods, assumptions, and findings associated with an audit.
Why it matters in practice:
Essential for regulatory defence and governance.
May also be called:
Evidence trail
A system that uses automated computation or algorithms to make or materially assist decisions about individuals, including scoring, ranking, filtering, or prioritisation. Terminology varies by regulation.
Why it matters in practice:
If a system shapes access to opportunity, it attracts regulatory scrutiny regardless of label.
May also be called:
AEDT, ADM system, automated decision system
A systematic pattern of difference in outcomes or treatment between groups or individuals that is not explained by relevant, legitimate criteria. Bias describes observed patterns, not intent or legality.
Why it matters in practice:
Bias is the signal that triggers investigation. It can exist in human processes, data, or automated systems.
May also be called:
Outcome disparity, differential treatment
An independent, structured assessment of whether a system’s outputs show statistically meaningful differences across protected characteristics using defined methods and datasets.
Why it matters in practice:
Produces evidence about risk. It does not itself determine legality or compliance.
May also be called:
Fairness audit, algorithmic audit
An evaluation approach that assesses a system based on observed inputs and outputs, without access to internal logic or source code.
Why it matters in practice:
Behavioral testing is often the only practical way to evaluate proprietary systems.
May also be called:
Outcome-based testing
A metric used to assess how stable or repeatable an AI system’s outputs are when presented with the same or similar inputs. Similar inputs may include paraphrased text, reordered information, or other non-material variations. In some audit contexts, deliberately modified inputs such as demographic changes may also be treated as similar for specific analyses.
Why it matters in practice:
Low consistency can indicate sensitivity to irrelevant changes, increasing the risk of unpredictable behaviour. Depending on test design, consistency metrics may be used to assess robustness, bias sensitivity, or both.
May also be called:
Output stability metric, repeatability metric
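One simple way such a metric might be operationalised is the share of output pairs that agree within a tolerance. This is an illustrative sketch, not a standardised formula; the scores passed in are assumed to come from some system under test.

```python
def consistency_score(outputs, tolerance=0.0):
    """Fraction of output pairs that agree within `tolerance`.

    `outputs` holds the system's scores for one underlying input
    presented with non-material variations (paraphrase, reordered
    information). 1.0 means perfectly repeatable; lower values
    indicate sensitivity to irrelevant changes.
    """
    pairs = [(a, b) for i, a in enumerate(outputs) for b in outputs[i + 1:]]
    if not pairs:
        return 1.0
    return sum(abs(a - b) <= tolerance for a, b in pairs) / len(pairs)

# Three paraphrases of the same CV, scored by a hypothetical system:
consistency_score([0.71, 0.71, 0.74], tolerance=0.02)  # → 1/3
```

Whether a given score is acceptable depends on the test design: the same calculation can serve a robustness check or a bias-sensitivity check, as the entry above notes.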
A modified instance of the same individual or data point in which only selected attributes are changed. For bias auditing, this is usually a demographic counterfactual.
Why it matters in practice:
Counterfactuals allow controlled testing of what influences decisions.
May also be called:
Synthetic variant, altered input
A method that tests whether altering selected attributes while holding other inputs constant changes system outputs.
Why it matters in practice:
Helps identify sensitivity to specific attributes, including but not limited to protected characteristics.
May also be called:
Sensitivity analysis (context-specific)
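The core move can be sketched in a few lines. The `model` callable and the toy scoring rule below are hypothetical, standing in for whatever system an audit actually probes:

```python
def counterfactual_delta(model, record, attribute, alternative):
    """Change one attribute, hold everything else constant, and
    measure how far the model's output moves."""
    variant = dict(record)            # shallow copy; original untouched
    variant[attribute] = alternative
    return model(variant) - model(record)

# Toy model that improperly keys on a name-derived signal:
toy = lambda r: 0.8 if r["name"] == "James" else 0.6
delta = counterfactual_delta(
    toy, {"name": "James", "years_exp": 5}, "name", "Jamila"
)
# delta ≈ -0.2: a non-zero delta flags sensitivity to the changed
# attribute and warrants investigation.
```

In practice the same pattern is run over many records and attribute swaps, and the distribution of deltas, not a single value, is what gets reported.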
A pattern where outcomes disproportionately disadvantage one group compared to another, regardless of intent.
Why it matters in practice:
A core concept in employment discrimination analysis and AI governance.
May also be called:
Adverse impact
A statistical comparison of outcome rates across demographic groups to identify potential adverse impact.
Why it matters in practice:
One of the most widely recognised methods used in bias audits.
May also be called:
Adverse impact analysis
The ability to describe how a system produces outputs in terms understandable to stakeholders.
Why it matters in practice:
Supports accountability, challenge, and informed oversight.
May also be called:
Interpretability
Bias identified through aggregated outcome patterns across demographic groups.
Why it matters in practice:
Reveals systemic disparities that individual cases may not show.
May also be called:
Population-level bias
A system used in contexts where decisions have legal or similarly significant effects on individuals, such as hiring or pay. Definitions vary by regulation.
Why it matters in practice:
High-risk classification triggers stronger governance and oversight expectations.
May also be called:
Regulated AI system
A ratio comparing a group’s outcome rate to that of the highest-performing group.
Why it matters in practice:
Used to quantify disparity and assess whether further investigation is required.
May also be called:
Selection rate ratio
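A minimal sketch of the calculation, assuming per-group selection rates are already in hand. The 0.8 default mirrors the widely cited four-fifths rule of thumb; actual thresholds, and the legal weight they carry, vary by jurisdiction:

```python
def impact_ratios(rates, threshold=0.8):
    """Compare each group's selection rate to the highest rate and
    flag ratios below `threshold` for further review."""
    best = max(rates.values())
    return {g: {"ratio": r / best, "flagged": r / best < threshold}
            for g, r in rates.items()}

ratios = impact_ratios({"A": 0.60, "B": 0.45})
# B's ratio is 0.45 / 0.60 = 0.75, below the 0.8 threshold, so it
# is flagged for investigation (a flag is not a legal conclusion).
```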
Differential treatment affecting specific individuals, even when group-level statistics appear balanced.
Why it matters in practice:
Individual harm can exist without obvious population-level signals.
May also be called:
Case-level bias
When a system meaningfully contributes to a decision by shaping how options are ranked, filtered, prioritised, or visually presented.
Why it matters in practice:
Human involvement does not remove responsibility if the system structures the decision space.
May also be called:
Decision support with material effect
Repeated audits over time to account for system changes, updates, and data drift.
Why it matters in practice:
AI systems evolve. Static evaluations do not capture this evolution.
May also be called:
Continuous auditing
Ongoing evaluation of a live system using real operational data in a production environment.
Why it matters in practice:
Many risks only emerge once systems interact with real users and data distributions.
May also be called:
In-market monitoring
Testing conducted before a system is placed into operational use, typically using historical or synthetic test data.
Why it matters in practice:
Useful for early risk identification but insufficient on its own.
May also be called:
Model validation, pre-market testing
Attributes that receive legal or ethical protection, such as sex, race, age, disability, religion, or national origin.
Why it matters in practice:
Bias audits assess outcomes across these attributes to identify discrimination risk.
May also be called:
Protected classes
Bias arising from variables that correlate strongly with protected characteristics, even if those characteristics are not explicitly used.
Why it matters in practice:
Removing protected attributes does not eliminate risk if proxies remain.
May also be called:
Indirect discrimination
Stress-testing a system under varying conditions to assess stability and sensitivity to irrelevant changes.
Why it matters in practice:
Identifies fragility before it becomes operational or compliance risk.
May also be called:
Stress testing
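One common shape such a test can take is applying deliberately irrelevant perturbations and measuring the worst-case shift in output. Everything here is a sketch: the `model` callable, the word-count scorer, and the perturbations are hypothetical stand-ins.

```python
def max_deviation(model, record, perturbations):
    """Apply each perturbation (a function mapping record -> record)
    and report the largest shift in the model's output. Large shifts
    under irrelevant changes indicate fragility."""
    base = model(record)
    return max(abs(model(p(record)) - base) for p in perturbations)

# Toy model scoring a CV summary by word count; the perturbations
# below should be immaterial to any reasonable scoring logic:
toy = lambda r: len(r["summary"].split())
perturbs = [
    lambda r: {**r, "summary": r["summary"] + "  "},   # trailing spaces
    lambda r: {**r, "summary": r["summary"].upper()},  # casing change
]
dev = max_deviation(toy, {"summary": "five years in sales"}, perturbs)
# dev == 0 here: this toy scorer is insensitive to both changes.
```

A real robustness suite would sweep many records and perturbation families and compare the deviation distribution against an agreed tolerance.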
The proportion of individuals in a group receiving a favourable outcome.
Why it matters in practice:
Forms the basis of many outcome-based fairness metrics.
May also be called:
Selection rate
Any method, tool, or process used to evaluate, screen, or rank individuals in employment-related decisions.
Why it matters in practice:
Many AI hiring tools are legally treated as selection procedures, bringing established fairness standards into scope.
May also be called:
Assessment method, screening tool
A comparison of outcome rates across groups to assess balance.
Why it matters in practice:
Parity alone does not establish fairness. Context and relevance matter.
May also be called:
Demographic parity
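The comparison itself is simple, which is part of the point the entry makes: a small gap is easy to compute but says nothing on its own about relevance or context. A minimal sketch:

```python
def parity_difference(rates):
    """Gap between the highest and lowest group outcome rates.
    Zero means statistical parity on this metric, but parity alone
    does not establish fairness; context and relevance matter."""
    return max(rates.values()) - min(rates.values())

parity_difference({"A": 0.60, "B": 0.45})  # ≈ 0.15
```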
An audit conducted by an independent entity with no commercial interest in the system being assessed.
Why it matters in practice:
Independence is critical for credibility with regulators and enterprise buyers.
May also be called:
Independent audit, external audit