AB 1008: CCPA Data Deletion for AI Models — How to Honor Right-to-Delete Requests Against AI Systems
California AB 1008, signed September 28, 2024 and effective January 1, 2025, amended California Consumer Privacy Act §1798.140 to confirm that "personal information" includes information in any format, including in artificial intelligence systems that are capable of outputting personal information. The amendment closes an interpretive gap that some AI vendors had used to argue that personal information embedded in trained model weights, rather than stored in conventional databases, fell outside CCPA's scope. The substantive consumer rights existed before AB 1008; the amendment makes clear that AI vendors and developers cannot escape those rights by pointing at the storage format.
The compliance problem this creates is operationally hard: how do you honor a right-to-delete request when the personal information is embedded in trained weights that cost millions of dollars to produce? This article walks through what AB 1008 changed, the three deletion approaches available in 2026 (full retraining, filter-based suppression, and machine unlearning), the deletion request workflow most AI developers are building, and how AB 1008 sits with AB 2013 to require a unified training data inventory as the underlying compliance substrate.
What AB 1008 actually changed
AB 1008's operative text is short and surgical. It amends the definition of "personal information" in CCPA §1798.140 to state that personal information can exist in various formats, including physical formats, digital formats, and abstract digital formats, and it lists artificial intelligence systems that are capable of outputting personal information among the abstract digital formats. The amendment does not create new substantive consumer rights: the right to know, the right to delete, the right to correct, and the right to limit sensitive personal information processing all existed under CCPA before AB 1008. What AB 1008 does is foreclose a particular interpretive argument that AI vendors had occasionally used in pre-amendment compliance discussions.
The argument went something like this: CCPA's definition of personal information was historically drafted with conventional databases in mind, where personal information is stored as discrete records that can be queried, retrieved, and deleted. When personal information is used to train an AI model, it does not remain in that discrete-record form; it becomes part of the model's learned weights, which are statistical generalizations rather than copies of the original data. Some vendors argued that this transformation took the information outside CCPA's scope, on the theory that the trained weights are not personal information in the same sense that the original training data was. The argument was always weak. California regulators had signaled through their enforcement priorities that they rejected similar interpretations, and privacy advocacy organizations in the mold of EPIC had treated it as untenable. AB 1008 makes the rejection explicit.
The practical consequence is that AI vendors holding personal information in any form — training corpora, fine-tuning datasets, model weights, embedding stores, retrieval-augmented-generation indexes, output logs, or any other AI-system component — now operate under the same CCPA obligations as a vendor holding personal information in a conventional database. The substantive rights that consumers have always had are now unambiguously enforceable against the AI-stored data layer.
The three deletion approaches and when each is appropriate
Honoring a CCPA right-to-delete request against an AI system is operationally harder than honoring one against a conventional database, because AI systems do not have the conventional "delete this row" primitive that databases provide. The three approaches that have emerged are full retraining, filter-based suppression, and machine unlearning, and each fits different deployment contexts.
Full retraining is the conceptually cleanest approach: remove the affected personal information from the training corpus and retrain the model from scratch. This produces a model whose weights have never been exposed to the deleted information, which is the strongest possible compliance posture. The cost, however, makes it impractical to retrain for each individual deletion request: for foundation models, retraining costs run into tens of millions of dollars per cycle, and even fine-tuned application models can cost hundreds of thousands. Most production AI systems therefore do not retrain on every deletion request; they batch deletion requests into scheduled retraining cycles that occur every few months.
Filter-based suppression is the operationally cheap approach: maintain a deletion-request list and filter the model's outputs to suppress generation of personal information about anyone on the list. This is computationally cheap, can be implemented quickly, and provides immediate response to deletion requests. The technical limitation is that filter-based suppression does not actually remove the information from model weights — the model still "knows" the information, and a sufficiently determined attacker using indirect prompts may be able to extract it. For most consumer deletion requests, where the underlying threat model is not adversarial extraction, filter-based suppression provides reasonable practical protection. For deletion requests in higher-stakes contexts (revoked consent for deeply personal data, deletion in connection with restraining orders or witness protection), the limitations of filter-based suppression matter more.
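The suppression idea can be illustrated with a minimal sketch. It assumes the deletion-request list holds literal identifiers (names, email addresses) and that outputs are filtered by string matching; the class name SuppressionFilter and the "[removed]" placeholder are hypothetical, and a production filter would likely need entity resolution rather than literal matching.

```python
import re


class SuppressionFilter:
    """Hypothetical output filter driven by a deletion-request list."""

    def __init__(self):
        self._terms = set()

    def add(self, identifier):
        # Normalize case and whitespace so trivial variants still match.
        term = " ".join(identifier.split()).lower()
        if term:
            self._terms.add(term)

    def apply(self, text):
        if not self._terms:
            return text
        # Longest first so a full name wins over a shorter term it contains.
        ordered = sorted(self._terms, key=len, reverse=True)
        # Allow any run of whitespace between the words of a term.
        parts = [r"\s+".join(re.escape(w) for w in t.split()) for t in ordered]
        pattern = re.compile(
            r"(?<!\w)(?:" + "|".join(parts) + r")(?!\w)", re.IGNORECASE
        )
        return pattern.sub("[removed]", text)


suppress = SuppressionFilter()
suppress.add("Jane Doe")
suppress.add("jane.doe@example.com")
filtered = suppress.apply("Contact JANE  DOE at jane.doe@example.com today.")
```

Literal matching of this kind does nothing about paraphrases, nicknames, or indirect descriptions, which is the same limitation the paragraph above describes: the weights are unchanged and only the surface output is filtered.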
Machine unlearning is a research-active technical area developing algorithms that selectively remove specific information from trained models without full retraining. The 2024-2026 literature on machine unlearning has produced techniques that work in restricted settings — gradient-based unlearning, certified unlearning with formal guarantees, and parameter-efficient unlearning approaches — but the field is not yet at the production-readiness level where machine unlearning can be routinely deployed against arbitrary deletion requests on large foundation models. Some specific applications (smaller domain models, classification systems, embedding stores) can support machine unlearning today; foundation-model unlearning at scale remains an active research problem.
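For intuition about why small classification systems are easier targets, here is a toy sketch of exact unlearning in a model whose parameters are additive statistics. It is not one of the gradient-based or certified techniques from the literature, and it says nothing about deep networks; the CentroidModel class is purely illustrative.

```python
class CentroidModel:
    """Toy nearest-centroid classifier that keeps per-class sums and counts."""

    def __init__(self, dim):
        self.dim = dim
        self.sums = {}    # label -> per-dimension running sums
        self.counts = {}  # label -> number of examples behind the sums

    def learn(self, vector, label):
        total = self.sums.setdefault(label, [0.0] * self.dim)
        for i, value in enumerate(vector):
            total[i] += value
        self.counts[label] = self.counts.get(label, 0) + 1

    def unlearn(self, vector, label):
        """Subtract one example's contribution from its class statistics."""
        if self.counts.get(label, 0) == 0:
            raise KeyError(f"no examples recorded for label {label!r}")
        total = self.sums[label]
        for i, value in enumerate(vector):
            total[i] -= value
        self.counts[label] -= 1
        if self.counts[label] == 0:
            # Nothing left for this class, so drop it entirely.
            del self.sums[label]
            del self.counts[label]

    def centroid(self, label):
        n = self.counts[label]
        return [s / n for s in self.sums[label]]

    def predict(self, vector):
        def squared_distance(label):
            return sum((a - b) ** 2 for a, b in zip(vector, self.centroid(label)))
        return min(self.counts, key=squared_distance)
```

Because each example contributes only a sum and a count, subtracting it leaves the same statistics (up to floating-point rounding) as a model that never saw it. The sketch does not check that the vector passed to unlearn was actually learned, so a real system would pair it with an inventory lookup. Neural network weights have no such additive structure, which is why unlearning at foundation-model scale remains open.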
The deletion request workflow most AI developers are building
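The defensible compliance posture combines these approaches at different stages of a unified workflow: filter-based suppression for immediate effect, scheduled retraining for eventual removal from the weights, and machine unlearning where the model type supports it. The five-stage workflow that has emerged in production environments operates as follows.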
The first stage is intake: a clear consumer-facing channel for submitting deletion requests with reasonable identity verification. Most large AI vendors expose a privacy portal at a discoverable URL where consumers can submit deletion requests, with identity verification appropriate to the type of personal information involved. Identity verification calibration matters: too much friction creates barriers to legitimate deletion requests; too little friction enables impersonation attacks where bad actors submit deletion requests on behalf of others.
The second stage is scoping: identifying every system within the organization that holds the affected personal information. For an AI vendor, this scope is typically broader than for a conventional vendor — it includes conventional databases, AI training corpora, AI model weights, AI output logs, embedding stores, retrieval indexes, fine-tuning datasets, and any downstream caches or analytics systems. Each of these may need a separate deletion action.
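One way to picture the scoping stage is as a lookup against a registry of systems, each paired with the kind of deletion action it needs. The sketch below is an assumption-laden illustration: the system names, the Action categories, and the holds_subject callback are all hypothetical, and a real registry would be generated from the organization's own data inventory.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    DELETE_RECORDS = "delete records"
    SUPPRESS_OUTPUT = "add to suppression list"
    QUEUE_FOR_RETRAINING = "exclude at next retraining"
    REBUILD_INDEX = "remove entries and rebuild index"


@dataclass(frozen=True)
class System:
    name: str
    action: Action


# Hypothetical registry covering the system types listed above.
REGISTRY = [
    System("crm_database", Action.DELETE_RECORDS),
    System("training_corpus", Action.QUEUE_FOR_RETRAINING),
    System("fine_tuning_dataset", Action.QUEUE_FOR_RETRAINING),
    System("model_weights", Action.SUPPRESS_OUTPUT),
    System("embedding_store", Action.REBUILD_INDEX),
    System("retrieval_index", Action.REBUILD_INDEX),
    System("output_logs", Action.DELETE_RECORDS),
    System("analytics_cache", Action.DELETE_RECORDS),
]


def scope_request(subject_id, holds_subject):
    """Return (system name, action) for every system holding the subject's data.

    holds_subject is a caller-supplied lookup: (system_name, subject_id) -> bool.
    """
    return [
        (system.name, system.action)
        for system in REGISTRY
        if holds_subject(system.name, subject_id)
    ]
```

Keeping the action type next to the system name makes the point of the paragraph above concrete: one request fans out into several different deletion actions, not a single delete call.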
The third stage is immediate action: removing the personal information from conventional storage and applying filter-based suppression to AI outputs. This is the consumer-facing response that makes the deletion functionally effective in the short term. The information is removed from any conventional databases, the deletion-request list is updated to suppress relevant generation, and any output caches are invalidated.
The fourth stage is retraining incorporation: aggregating accumulated deletion requests into the next scheduled retraining cycle so that the model's weights eventually no longer reflect the deleted information. This is where the substantive removal happens for the trained-weights component of the AI system. The retraining cadence is typically driven by other engineering considerations (model improvement cycles, data freshness requirements) rather than by deletion request volume, but each retraining cycle incorporates the accumulated deletion list as a prerequisite.
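The "deletion list as a prerequisite" step can be sketched as a filter applied while the retraining corpus is assembled. The record shape here (dicts with "item_id" and "subject_ids" keys) is an assumption made for illustration; it presumes the inventory can link items to data subjects, which will not hold for every corpus.

```python
def build_retraining_corpus(corpus, deletion_list):
    """Yield only items not linked to a subject on the accumulated deletion list.

    corpus: iterable of dicts with hypothetical keys "item_id" and "subject_ids".
    deletion_list: set of subject identifiers collected since the last cycle.
    """
    for item in corpus:
        if deletion_list.isdisjoint(item.get("subject_ids", ())):
            yield item


def corpus_is_clean(corpus, deletion_list):
    """Pre-flight check: True when no item references a deleted subject."""
    return all(
        deletion_list.isdisjoint(item.get("subject_ids", ())) for item in corpus
    )


corpus = [
    {"item_id": "a", "subject_ids": ["s1"]},
    {"item_id": "b", "subject_ids": []},
    {"item_id": "c", "subject_ids": ["s2", "s3"]},
]
kept = list(build_retraining_corpus(corpus, {"s3"}))
```

Running the pre-flight check on the filtered corpus before training starts gives a simple, loggable gate that each cycle incorporated the deletion list.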
The fifth stage is response: confirming completion of the immediate actions to the requesting consumer and explaining the timeline for the retraining-based effectuation. The CCPA's response timelines (45 days from receipt of the request, extendable once by a further 45 days, for 90 days in total, where reasonably necessary) apply to the immediate actions; the retraining cycle effectuation is communicated as a separate timeline because it depends on the engineering cadence. Documenting each step creates the audit trail that supports the compliance posture if a regulator inquiry or private action arises.
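The deadline arithmetic is simple enough to encode directly in the request tracker. This is a minimal sketch that assumes calendar days counted from the date of receipt; the function name and constants are illustrative, and counsel should confirm how the counting rule applies to a given request.

```python
from datetime import date, timedelta

INITIAL_WINDOW_DAYS = 45  # base response window described above
EXTENSION_DAYS = 45       # extension, bringing the total to 90 days


def response_deadline(received, extended=False):
    """Return the date by which the immediate actions must be confirmed."""
    days = INITIAL_WINDOW_DAYS + (EXTENSION_DAYS if extended else 0)
    return received + timedelta(days=days)


base = response_deadline(date(2026, 1, 5))
with_extension = response_deadline(date(2026, 1, 5), extended=True)
```

For a request received on January 5, 2026, this gives February 19, 2026 for the base window and April 5, 2026 with the extension. The retraining-cycle date would be tracked as a separate field, since it follows the engineering cadence instead of these windows.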
Training data lineage as the underlying compliance substrate
The workflow above presupposes something that many AI developers do not currently have: a comprehensive inventory of what personal information exists in their training data and AI systems. AB 1008 effectively requires this inventory as a prerequisite for compliance — without knowing what personal information is in your training data, you cannot honor deletion requests against it, and a deletion request you cannot process is a CCPA violation regardless of how the inventory gap arose.
Most large AI developers are now building training data inventories that track at minimum the source of each data item or batch (where it came from), whether the item contains personal information about identifiable individuals (an automated classification step using PII detection models), the consent or legal basis for inclusion in training, and the model versions that have been trained on the item. This inventory is operationally substantial — for foundation models trained on billions of items, tracking is necessarily statistical rather than per-item — but it produces the data substrate that makes deletion requests processable.
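The four tracked attributes map naturally onto a record type. The sketch below assumes batch-level tracking with known subject identifiers, which suits smaller corpora; all class, field, and value names are hypothetical, and a foundation-model inventory would replace the per-subject sets with statistical estimates.

```python
from dataclasses import dataclass, field


@dataclass
class InventoryRecord:
    batch_id: str
    source: str          # where the batch came from
    contains_pii: bool   # result of an automated PII classification step
    legal_basis: str     # consent or other basis for inclusion in training
    model_versions: set = field(default_factory=set)  # models trained on it
    subject_ids: set = field(default_factory=set)     # assumed linkable subjects


class TrainingDataInventory:
    def __init__(self):
        self._records = {}

    def register(self, record):
        self._records[record.batch_id] = record

    def batches_for_subject(self, subject_id):
        """Batches a deletion request for this subject must touch."""
        return sorted(
            r.batch_id for r in self._records.values() if subject_id in r.subject_ids
        )

    def models_exposed_to(self, subject_id):
        """Model versions whose weights may reflect the subject's data."""
        versions = set()
        for r in self._records.values():
            if subject_id in r.subject_ids:
                versions |= r.model_versions
        return sorted(versions)
```

The two lookups correspond to the two questions a deletion request raises: which data to remove from the corpus, and which deployed models need suppression until the next retraining cycle.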
The inventory also serves AB 2013's training data transparency obligations, which is the integration point between the two California AI privacy regimes. AB 2013 requires public disclosure of training data characteristics in a high-level summary; AB 1008 in practice requires the underlying lookup mechanism that lets specific deletion requests be processed. Both regimes draw on the same training data inventory, so building it once and using it for both is the efficient compliance posture, and building a separate inventory per regime is wasteful. Our companion AB 2013 vs trade secrets article addresses the disclosure layer; this article addresses the deletion layer.
Where the existing CCPA exemptions still apply
AB 1008 does not override the CCPA's existing exemptions. Personal information that a business needs to retain for one of the purposes listed in CCPA §1798.105(d), referred to in this article as the business-purpose exemptions, remains exempt from deletion obligations even when it exists in AI systems. Those purposes include security incident detection, fraud prevention, debugging, internal use consistent with consumer expectations, legal compliance, and certain other categories. This matters because some AI use cases align with these exemptions and consequently fall outside the deletion right. An AI fraud detection model trained on transactional data including personal information may be exempt from deletion requests as to the data used for fraud detection, even though the same personal information would be deletable from a conventional CRM database.
The exemptions still require that the underlying business purpose actually qualify; the AI-storage-format does not by itself trigger an exemption. Compliance teams should not assume that AI use cases automatically benefit from any business-purpose exemption — the analysis is fact-specific and requires applying the §1798.105(d) categories to the specific use case. Where multiple use cases coexist (the same model is used for fraud detection and customer recommendations, for instance), the deletion obligation may apply to one use case while the other remains exempt, requiring the deletion mechanism to be granular rather than model-wide.
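Granularity of this kind implies that exemption decisions are recorded per use case, not per model. The sketch below is illustrative only: the UseCase type and the exemption labels are hypothetical, and the label itself stands in for a fact-specific legal determination that code cannot make.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UseCase:
    name: str
    # Label assigned after legal review; None means no exemption is claimed.
    claimed_exemption: Optional[str] = None


def split_by_obligation(use_cases):
    """Separate use cases where deletion applies from those retained under an exemption."""
    must_delete = [u.name for u in use_cases if u.claimed_exemption is None]
    retained = {
        u.name: u.claimed_exemption
        for u in use_cases
        if u.claimed_exemption is not None
    }
    return must_delete, retained


shared_model_uses = [
    UseCase("fraud_detection", claimed_exemption="security and fraud prevention"),
    UseCase("customer_recommendations"),
]
must_delete, retained = split_by_obligation(shared_model_uses)
```

Storing the claimed exemption alongside each retained use case also produces the documentation a regulator would ask for when a deletion request is only partially honored.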
How AB 1008 fits with the broader California AI privacy regime
AB 1008 is one of three California AI privacy measures enacted in the 2024 cycle. AB 1008 covers AI-system-stored personal information generally. SB 1223 amends CCPA to add neural data as a category of sensitive personal information, specifically targeting brain-computer interface and neural-prosthetic data. SB 1120, which amends the Health and Safety Code and the Insurance Code rather than the CCPA and is overseen by the Department of Managed Health Care and the Department of Insurance, addresses AI use in health insurance utilization review decisions. The three together create a substantive privacy regime that operates alongside the AB 2013 training-data-transparency regime, the SB 942 content-provenance regime, and the SB 53 frontier-AI-safety regime.
For AI deployers in regulated sectors — healthcare, financial services, employment — AB 1008 stacks with sector-specific privacy regimes (HIPAA, GLBA, employment privacy laws) to create overlapping deletion obligations that may have different timelines, different exemptions, and different identity-verification requirements. The integrated compliance posture maps each regime's requirements to the unified deletion workflow rather than building separate workflows per regime. For broader context on the California AI compliance picture, see our 2026 California AI Compliance Roadmap.
Sources
The primary statute is AB 1008 on California Legislative Information. EPIC's legislative session roundup places AB 1008 in the broader 2024 California AI legislation cycle. The California Attorney General's legal advisory provides the regulator-side framing on AI privacy. For practitioner-grade analysis, ZwillGen's overview and Pillsbury's analysis are useful starting references. Watch the California Privacy Protection Agency for any guidance on what deletion methodology satisfies AB 1008 in AI contexts, and watch the technical literature on machine unlearning as the field matures toward production-readiness.