AB 1008: CCPA Data Deletion for AI Models — How to Honor Right-to-Delete Requests Against AI Systems

California AB 1008, signed September 28, 2024 and effective January 1, 2025, amended California Consumer Privacy Act §1798.140 to confirm that "personal information" includes information that exists in any format, including in artificial intelligence systems that are capable of outputting personal information. This closed an interpretive gap that some AI vendors had used to argue that personal information embedded in trained model weights, rather than stored in conventional databases, fell outside CCPA's scope. The substantive consumer rights existed before AB 1008; the amendment makes clear that AI vendors and developers cannot escape those rights by pointing at the storage format. The resulting compliance problem is operationally hard: how do you honor a right-to-delete request when the personal information is embedded in trained weights that cost millions of dollars to produce? This article walks through what AB 1008 changed, the three deletion approaches available in 2026 (full retraining, filter-based suppression, and machine unlearning), the deletion request workflow most AI developers are building, and how AB 1008 and AB 2013 together make a unified training data inventory the underlying compliance substrate.

What AB 1008 actually changed

AB 1008's operative text is short and structurally surgical. It amends CCPA §1798.140's definition of "personal information" to add a clause confirming that personal information includes information "that exists in any format, including, but not limited to, physical, digital, abstract, or in an artificial intelligence system that is capable of outputting personal information." The amendment does not create new substantive consumer rights — the right to know, the right to delete, the right to correct, and the right to limit sensitive personal information processing all existed under CCPA before AB 1008. What AB 1008 does is foreclose a particular interpretive argument that AI vendors had occasionally used in pre-amendment compliance discussions.

The argument went something like this: CCPA's definition of personal information was drafted with conventional databases in mind, where personal information is stored as discrete records that can be queried, retrieved, and deleted. When personal information is used to train an AI model, it does not remain in that discrete-record form. It becomes part of the model's learned weights, which are statistical generalizations rather than copies of the original data. Some vendors argued that this transformation took the information outside CCPA's scope, on the theory that trained weights are not personal information in the same sense as the original training data. The argument was always weak: California regulators had signaled through their enforcement priorities that they rejected similar interpretations, and privacy advocacy groups in the EPIC mold had treated it as untenable. AB 1008 makes the rejection explicit.

The practical consequence is that AI vendors holding personal information in any form — training corpora, fine-tuning datasets, model weights, embedding stores, retrieval-augmented-generation indexes, output logs, or any other AI-system component — now operate under the same CCPA obligations as a vendor holding personal information in a conventional database. The substantive rights that consumers have always had are now unambiguously enforceable against the AI-stored data layer.

The three deletion approaches and when each is appropriate

Honoring a CCPA right-to-delete request against an AI system is operationally harder than honoring one against a conventional database, because AI systems do not have the conventional "delete this row" primitive that databases provide. The three approaches that have emerged are full retraining, filter-based suppression, and machine unlearning, and each fits different deployment contexts.

Full retraining is the conceptually cleanest approach: remove the affected personal information from the training corpus and retrain the model from scratch. This produces a model whose weights have never been exposed to the deleted information, which is the strongest possible compliance posture. The cost, however, makes it impractical to retrain for each individual deletion request. For foundation models, retraining costs run into tens of millions of dollars per cycle, and even fine-tuned application models can cost hundreds of thousands. Most production AI systems therefore do not retrain on every deletion request; they batch deletion requests into scheduled retraining cycles that occur every few months.

Filter-based suppression is the operationally cheap approach: maintain a deletion-request list and filter the model's outputs to suppress generation of personal information about anyone on the list. This is computationally cheap, can be implemented quickly, and provides immediate response to deletion requests. The technical limitation is that filter-based suppression does not actually remove the information from model weights — the model still "knows" the information, and a sufficiently determined attacker using indirect prompts may be able to extract it. For most consumer deletion requests, where the underlying threat model is not adversarial extraction, filter-based suppression provides reasonable practical protection. For deletion requests in higher-stakes contexts (revoked consent for deeply personal data, deletion in connection with restraining orders or witness protection), the limitations of filter-based suppression matter more.
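
As a rough illustration of the filter-based approach, the sketch below keeps a deletion-request list of literal identifiers and redacts them from model output before it is returned. The class name, the `[removed]` placeholder, and the choice to match on literal strings are all assumptions made for this example; a production filter would more plausibly combine entity resolution with a PII classifier.

```python
import re
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class SuppressionFilter:
    """Redacts listed identifiers from model output before it is returned."""

    # Hypothetical deletion-request list: literal strings from verified requests.
    identifiers: Set[str] = field(default_factory=set)

    def add(self, identifier: str) -> None:
        # Collapse whitespace and lowercase so formatting differences do not matter.
        normalized = " ".join(identifier.split()).lower()
        if normalized:
            self.identifiers.add(normalized)

    def _pattern(self) -> Optional[re.Pattern]:
        if not self.identifiers:
            return None
        parts = []
        # Longest identifiers first so a full name wins over a shorter entry.
        for ident in sorted(self.identifiers, key=len, reverse=True):
            # Allow any run of whitespace between the words of an identifier.
            parts.append(r"\s+".join(re.escape(word) for word in ident.split(" ")))
        # Lookarounds keep "Jane Doe" from matching inside "Jane Doering".
        return re.compile(r"(?<!\w)(?:" + "|".join(parts) + r")(?!\w)", re.IGNORECASE)

    def apply(self, model_output: str) -> str:
        pattern = self._pattern()
        if pattern is None:
            return model_output
        return pattern.sub("[removed]", model_output)


suppressor = SuppressionFilter()
suppressor.add("Jane Doe")
print(suppressor.apply("Jane Doe lives in Fresno."))  # prints "[removed] lives in Fresno."
```

A literal-match filter like this one shows the limitation described above in miniature: it catches the listed strings, but a paraphrase or an indirect prompt that elicits the same facts without the listed identifier passes straight through.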

Machine unlearning is a research-active technical area developing algorithms that selectively remove specific information from trained models without full retraining. The 2024-2026 literature on machine unlearning has produced techniques that work in restricted settings — gradient-based unlearning, certified unlearning with formal guarantees, and parameter-efficient unlearning approaches — but the field is not yet at the production-readiness level where machine unlearning can be routinely deployed against arbitrary deletion requests on large foundation models. Some specific applications (smaller domain models, classification systems, embedding stores) can support machine unlearning today; foundation-model unlearning at scale remains an active research problem.

The deletion request workflow most AI developers are building

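The defensible compliance posture combines these approaches at different stages of a unified workflow: filter-based suppression for the immediate response, scheduled retraining for substantive removal from the weights, and machine unlearning where a system can support it. The five-stage workflow that has emerged in production environments operates as follows.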

The first stage is intake: a clear consumer-facing channel for submitting deletion requests with reasonable identity verification. Most large AI vendors expose a privacy portal at a discoverable URL where consumers can submit deletion requests, with identity verification appropriate to the type of personal information involved. Identity verification calibration matters: too much friction creates barriers to legitimate deletion requests; too little friction enables impersonation attacks where bad actors submit deletion requests on behalf of others.

The second stage is scoping: identifying every system within the organization that holds the affected personal information. For an AI vendor, this scope is typically broader than for a conventional vendor — it includes conventional databases, AI training corpora, AI model weights, AI output logs, embedding stores, retrieval indexes, fine-tuning datasets, and any downstream caches or analytics systems. Each of these may need a separate deletion action.
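
One way to picture the scoping stage is as a lookup over a registry of systems, each mapped to the kind of deletion action it needs. The sketch below is illustrative only: the system names, the three action labels, and the `holds_subject_data` flag (which in practice would come from a per-system search) are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class DeletionAction(Enum):
    HARD_DELETE = "delete records now"
    SUPPRESS_OUTPUT = "add to suppression list now"
    QUEUE_FOR_RETRAINING = "exclude at next retraining cycle"


@dataclass(frozen=True)
class SystemEntry:
    name: str
    holds_subject_data: bool  # assumed to be the result of a per-system search
    action: DeletionAction


def scope_request(systems: List[SystemEntry]) -> Dict[DeletionAction, List[str]]:
    """Group the systems that hold the subject's data by the action each needs."""
    plan: Dict[DeletionAction, List[str]] = {}
    for system in systems:
        if system.holds_subject_data:
            plan.setdefault(system.action, []).append(system.name)
    return plan


# Hypothetical registry for a single deletion request.
registry = [
    SystemEntry("crm_database", True, DeletionAction.HARD_DELETE),
    SystemEntry("embedding_store", True, DeletionAction.HARD_DELETE),
    SystemEntry("output_logs", False, DeletionAction.HARD_DELETE),
    SystemEntry("chat_model_outputs", True, DeletionAction.SUPPRESS_OUTPUT),
    SystemEntry("chat_model_weights", True, DeletionAction.QUEUE_FOR_RETRAINING),
]

for action, names in scope_request(registry).items():
    print(f"{action.value}: {', '.join(names)}")
```

Grouping by action rather than by system makes it easier to see which parts of the request can be closed immediately and which depend on the retraining cadence.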

The third stage is immediate action: removing the personal information from conventional storage and applying filter-based suppression to AI outputs. This is the consumer-facing response that makes the deletion functionally effective in the short term. The information is removed from any conventional databases, the deletion-request list is updated to suppress relevant generation, and any output caches are invalidated.

The fourth stage is retraining incorporation: aggregating accumulated deletion requests into the next scheduled retraining cycle so that the model's weights eventually no longer reflect the deleted information. This is where the substantive removal happens for the trained-weights component of the AI system. The retraining cadence is typically driven by other engineering considerations (model improvement cycles, data freshness requirements) rather than by deletion request volume, but each retraining cycle incorporates the accumulated deletion list as a prerequisite.
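
The retraining-incorporation step can be sketched as a filter applied to the corpus before each cycle starts. The example assumes, hypothetically, that every training item already carries the set of data-subject identifiers detected in it; the `TrainingItem` shape and the function name are inventions for illustration.

```python
from typing import FrozenSet, Iterable, List, NamedTuple, Set, Tuple


class TrainingItem(NamedTuple):
    item_id: str
    subject_ids: FrozenSet[str]  # hypothetical: data subjects detected in the item
    text: str


def prepare_retraining_corpus(
    items: Iterable[TrainingItem], deleted_subjects: Set[str]
) -> Tuple[List[TrainingItem], List[str]]:
    """Split the corpus into items to keep and IDs dropped for deletion requests."""
    kept: List[TrainingItem] = []
    dropped_ids: List[str] = []
    for item in items:
        if item.subject_ids.isdisjoint(deleted_subjects):
            kept.append(item)
        else:
            dropped_ids.append(item.item_id)  # recorded for the audit trail
    return kept, dropped_ids


corpus = [
    TrainingItem("doc-1", frozenset({"subj-17"}), "..."),
    TrainingItem("doc-2", frozenset(), "..."),
    TrainingItem("doc-3", frozenset({"subj-42", "subj-17"}), "..."),
]
kept, dropped = prepare_retraining_corpus(corpus, deleted_subjects={"subj-17"})
print([item.item_id for item in kept], dropped)
```

Returning the dropped item IDs alongside the filtered corpus gives each retraining cycle a record of which deletions it incorporated.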

The fifth stage is response: confirming completion of the immediate actions to the requesting consumer and explaining the timeline for the retraining-based effectuation. The CCPA's response timelines (45 days, extendable to 90 days for complex requests) apply to the immediate actions; the retraining cycle effectuation is communicated as a separate timeline because it depends on the engineering cadence. Documentation of each step is the audit trail that defends compliance posture if a regulator inquiry or private action arises.
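
The response timelines above reduce to simple date arithmetic. The sketch below counts calendar days from receipt and treats the 45- and 90-day windows exactly as stated in this article; the function names are hypothetical, and the conditions under which an extension is available are not modeled.

```python
from datetime import date, timedelta

INITIAL_WINDOW_DAYS = 45   # standard response window, as stated above
EXTENDED_WINDOW_DAYS = 90  # total window when an extension applies


def response_deadline(received: date, extended: bool = False) -> date:
    """Return the last day for responding to a request received on `received`."""
    days = EXTENDED_WINDOW_DAYS if extended else INITIAL_WINDOW_DAYS
    return received + timedelta(days=days)


def days_remaining(received: date, today: date, extended: bool = False) -> int:
    """Days left before the deadline; negative once the deadline has passed."""
    return (response_deadline(received, extended) - today).days


received = date(2026, 3, 2)
print(response_deadline(received))                 # 2026-04-16
print(response_deadline(received, extended=True))  # 2026-05-31
print(days_remaining(received, date(2026, 4, 1)))  # 15
```

The retraining-based effectuation date is deliberately absent from this calculation, since it follows the engineering cadence rather than the statutory window.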

Training data lineage as the underlying compliance substrate

The workflow above presupposes something that many AI developers do not currently have: a comprehensive inventory of what personal information exists in their training data and AI systems. AB 1008 effectively requires this inventory as a prerequisite for compliance — without knowing what personal information is in your training data, you cannot honor deletion requests against it, and a deletion request you cannot process is a CCPA violation regardless of how the inventory gap arose.

Most large AI developers are now building training data inventories that track at minimum the source of each data item or batch (where it came from), whether the item contains personal information about identifiable individuals (an automated classification step using PII detection models), the consent or legal basis for inclusion in training, and the model versions that have been trained on the item. This inventory is operationally substantial — for foundation models trained on billions of items, tracking is necessarily statistical rather than per-item — but it produces the data substrate that makes deletion requests processable.
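
A minimal sketch of such an inventory, tracked at batch level, might look like the following. The four tracked attributes mirror the list above; the `subject_ids` field and the lookup methods are assumptions added to make deletion requests resolvable, and at foundation-model scale they would likely be replaced by statistical or index-based lookups.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class InventoryRecord:
    batch_id: str
    source: str         # where the batch came from
    contains_pi: bool   # result of an automated PII-detection pass
    legal_basis: str    # consent or other basis for inclusion in training
    model_versions: Set[str] = field(default_factory=set)  # models trained on it
    subject_ids: Set[str] = field(default_factory=set)     # assumption: resolvable subjects


class TrainingDataInventory:
    def __init__(self) -> None:
        self._records: Dict[str, InventoryRecord] = {}

    def register(self, record: InventoryRecord) -> None:
        self._records[record.batch_id] = record

    def batches_for_subject(self, subject_id: str) -> List[str]:
        """Batches that must be cleaned before the next retraining cycle."""
        return sorted(
            r.batch_id for r in self._records.values()
            if r.contains_pi and subject_id in r.subject_ids
        )

    def models_for_subject(self, subject_id: str) -> Set[str]:
        """Model versions whose weights may reflect the subject's data."""
        affected: Set[str] = set()
        for r in self._records.values():
            if r.contains_pi and subject_id in r.subject_ids:
                affected |= r.model_versions
        return affected


inventory = TrainingDataInventory()
inventory.register(InventoryRecord("b1", "support_tickets", True, "consent", {"v1", "v2"}, {"subj-17"}))
inventory.register(InventoryRecord("b2", "public_docs", False, "public_source", {"v2"}))
print(inventory.batches_for_subject("subj-17"), sorted(inventory.models_for_subject("subj-17")))
```

The same records that answer "which models saw this person's data" can also feed a high-level summary of sources, which is why one inventory can serve both purposes.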

The inventory also serves AB 2013's training data transparency obligations, which is the integration point between the two California AI privacy regimes. AB 2013 requires public disclosure of training data characteristics in a high-level summary; AB 1008 requires the underlying lookup mechanism that lets specific deletion requests be processed. Both regimes draw on the same underlying training data inventory. Building the inventory once and using it for both regimes is the efficient compliance posture; building separate inventories for each regime is wasteful. Our companion AB 2013 vs trade secrets article addresses the disclosure layer; this article addresses the deletion layer; both are layers on the same training data inventory.

Where the existing CCPA exemptions still apply

AB 1008 does not override the CCPA's existing exemptions. Personal information processed for a "business purpose" under CCPA §1798.105(d) — including security incident detection, fraud prevention, debugging, internal use consistent with consumer expectations, legal compliance, and certain other categories — remains exempt from deletion obligations even when it exists in AI systems. This matters because some AI use cases align with the business-purpose exemptions and consequently fall outside the deletion right. An AI fraud detection model trained on transactional data including personal information may be exempt from deletion requests as to the data used for fraud detection, even though the same personal information would be deletable from a conventional CRM database.

The exemptions still require that the underlying business purpose actually qualify; the AI-storage-format does not by itself trigger an exemption. Compliance teams should not assume that AI use cases automatically benefit from any business-purpose exemption — the analysis is fact-specific and requires applying the §1798.105(d) categories to the specific use case. Where multiple use cases coexist (the same model is used for fraud detection and customer recommendations, for instance), the deletion obligation may apply to one use case while the other remains exempt, requiring the deletion mechanism to be granular rather than model-wide.
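
The granularity point can be sketched as a per-use-case decision rather than a per-model one. In the example below, the purpose labels and the contents of `EXEMPT_PURPOSES` are placeholders standing in for the outcome of a fact-specific legal review; nothing in the code decides whether a purpose actually qualifies under §1798.105(d).

```python
from dataclasses import dataclass
from typing import Dict, List

# Placeholder labels for purposes that counsel has determined qualify for an
# exemption. These strings are hypothetical and carry no legal meaning.
EXEMPT_PURPOSES = {"fraud_prevention", "security_incident_detection", "legal_compliance"}


@dataclass(frozen=True)
class UseCase:
    name: str
    purpose: str


def deletion_plan(use_cases: List[UseCase]) -> Dict[str, str]:
    """Decide per use case whether the subject's data is deleted or retained."""
    return {
        uc.name: "retain_under_exemption" if uc.purpose in EXEMPT_PURPOSES else "delete"
        for uc in use_cases
    }


# One model, two use cases: the deletion obligation differs between them.
shared_model_uses = [
    UseCase("transaction_fraud_scoring", "fraud_prevention"),
    UseCase("product_recommendations", "personalization"),
]
print(deletion_plan(shared_model_uses))
```

Keying the decision on the use case means the data pipeline feeding each use case needs to be separable, which is a design constraint worth settling before the first deletion request arrives.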

How AB 1008 fits with the broader California AI privacy regime

AB 1008 is one of three CCPA-aligned California AI privacy amendments enacted in the 2024 cycle. AB 1008 covers AI-system-stored personal information generally. SB 1223 amends CCPA to add neural data as a category of sensitive personal information, specifically targeting brain-computer interface and neural-prosthetic data. SB 1120 (which applies to health care service plans and disability insurers, under the oversight of the Department of Managed Health Care and the Department of Insurance) addresses AI use in healthcare insurance utilization decisions. The three together create a substantive privacy regime that operates alongside the AB 2013 training-data-transparency regime, the SB 942 content-provenance regime, and the SB 53 frontier-AI-safety regime.

For AI deployers in regulated sectors — healthcare, financial services, employment — AB 1008 stacks with sector-specific privacy regimes (HIPAA, GLBA, employment privacy laws) to create overlapping deletion obligations that may have different timelines, different exemptions, and different identity-verification requirements. The integrated compliance posture maps each regime's requirements to the unified deletion workflow rather than building separate workflows per regime. For broader context on the California AI compliance picture, see our 2026 California AI Compliance Roadmap.

Sources

The primary statute is AB 1008 on California Legislative Information. EPIC's legislative session roundup places AB 1008 in the broader 2024 California AI legislation cycle. The California Attorney General's legal advisory provides the regulator-side framing on AI privacy. For practitioner-grade analysis, ZwillGen's overview and Pillsbury's analysis are useful starting references. Watch the California Privacy Protection Agency for any guidance on what deletion methodology satisfies AB 1008 in AI contexts, and watch the technical literature on machine unlearning as the field matures toward production-readiness.

Frequently Asked Questions

What is California AB 1008?
AB 1008 is a California privacy law signed by Governor Newsom on September 28, 2024 and effective January 1, 2025. Authored by Assemblymember Rebecca Bauer-Kahan, it amends California Consumer Privacy Act §1798.140 to clarify that 'personal information' includes information that exists in any format, including in artificial intelligence systems that are capable of outputting personal information. The amendment closes an interpretive gap that AI vendors had sometimes used to argue that personal information embedded in trained model weights — rather than stored in conventional databases — was outside CCPA's scope. After AB 1008, that argument is unavailable.

Does AB 1008 create new privacy rights for consumers?
No. AB 1008 does not create new substantive privacy rights. It confirms that the existing CCPA rights (right to know, right to delete, right to correct, right to limit sensitive personal information processing) apply to personal information in AI systems the same way they apply to personal information in conventional storage. The substantive consumer rights existed before AB 1008; the amendment makes clear that AI vendors and developers are subject to those rights when they hold personal information in their AI systems. AB 1008 also creates no new private right of action; the only one available remains the CCPA's existing data-breach private right of action under §1798.150.

How can an AI 'forget' personal information?
There are three operationally distinct approaches, and the right choice depends on the deployment context. The first is full retraining: removing the affected personal information from training data and retraining the model from scratch. This is the cleanest deletion but is computationally expensive — for foundation models the cost can be tens of millions of dollars per retrain — making it impractical to perform per individual deletion request. The second is filter-based suppression: maintaining a deletion-request list and filtering model outputs to suppress generation of personal information appearing on the list. This is computationally cheap but does not actually remove the information from model weights. The third is machine unlearning: a research-active technical area developing algorithms that selectively remove specific information from trained models without full retraining. As of 2026, machine unlearning techniques exist but are not yet routinely deployed at production scale.

What deletion approach is acceptable under AB 1008?
AB 1008 itself does not prescribe a deletion methodology, and CCPA enforcement guidance has not directly addressed which AI deletion approaches satisfy the right-to-delete obligation. The general regulatory expectation, drawing on EU GDPR Article 17 'right to erasure' guidance and California Privacy Protection Agency commentary, is that the deletion mechanism must be effective in practice — meaning the personal information cannot be subsequently retrieved or output by the AI system in response to relevant queries. Filter-based suppression alone may be insufficient if the underlying model can still generate the personal information through alternative prompting paths; machine unlearning is more defensible when the technique can be demonstrated to actually remove the information; full retraining is the gold standard but is operationally expensive. The defensible posture is to combine approaches: filter-based suppression for immediate response, with periodic retraining cycles that incorporate accumulated deletion requests as the substantive removal mechanism.

Does the right to delete reach training data that a model has already been trained on?
AB 1008 covers personal information at every stage of the AI lifecycle, including in training data, in model weights, and in outputs. A consumer's right-to-delete request can therefore be directed at training data even when the AI system has already been trained on that data, with the obligation extending to the model that contains the trained-in information. Most AI developers handle this by building deletion request workflows that have two stages: removing the data from any retained training corpus or pipeline storage immediately, and incorporating the deletion into the next model retraining cycle so that the trained weights eventually no longer reflect the deleted information. The compliance posture explains the sequence to the consumer and provides a reasonable timeline for full effectuation.

What about the existing CCPA exemptions: do they apply under AB 1008?
Yes. AB 1008 confirms that AI-system-stored personal information is CCPA-covered, but it does not override the CCPA's existing exemptions. Personal information processed for a 'business purpose' under CCPA §1798.105(d) — including security incident detection, fraud prevention, debugging, internal use consistent with consumer expectations, legal compliance, and certain other categories — remains exempt from deletion obligations even when it exists in AI systems. The exemptions still require that the underlying purpose actually qualify; the AI-storage-format does not by itself trigger an exemption.

How should I structure a deletion request workflow for AI systems?
Five steps. First, intake: a clear consumer-facing channel for submitting deletion requests with reasonable identity verification. Second, scoping: identifying every system within your organization that holds the affected personal information, including conventional databases, AI training corpora, AI model weights, AI output logs, and downstream caches. Third, immediate action: removing the personal information from conventional storage and applying filter-based suppression to AI outputs. Fourth, retraining incorporation: aggregating accumulated deletion requests into scheduled retraining cycles so that the model's underlying weights eventually reflect the deletions. Fifth, response: confirming completion of the immediate actions to the requesting consumer and explaining the timeline for the retraining-based effectuation. Documentation of each step is the audit trail that defends compliance posture.

What about training data lineage tracking?
AB 1008 effectively requires training data lineage tracking as a prerequisite for compliance — without knowing what personal information is in your training data, you cannot honor deletion requests against it. Most AI developers are now building training data inventories that track at minimum: the source of each data item, whether the item contains personal information about identifiable individuals, the consent or legal basis for inclusion, and the model versions that have been trained on the item. This inventory is operationally substantial — for foundation models trained on billions of items, tracking is necessarily statistical rather than per-item — but it produces the data substrate that lets deletion requests be processed reliably.

How does AB 1008 fit with AB 2013's training data transparency?
AB 1008 and AB 2013 are complementary regimes targeting different layers of the same underlying problem. AB 2013 requires public disclosure of training data characteristics in a high-level summary; AB 1008 confirms that personal information in training data and model weights is subject to CCPA rights. A developer building compliance for both runs the AB 2013 training data summary as the public-disclosure layer and the AB 1008 deletion request workflow as the consumer-rights layer. The shared substrate is the training data inventory, which serves both regimes — the inventory provides the source material for the AB 2013 summary and the lookup mechanism for AB 1008 deletion requests. Build one inventory; satisfy two statutes.
