GDPR and AI Automation: A Practical Compliance Guide
How to run AI automation under UK GDPR: lawful basis, DPIAs, Article 22, transparency, vendor contracts and ICO expectations
Most teams introducing AI automation hit the same wall around month three. Legal asks for a DPIA, procurement asks where the data goes, and someone realises the chatbot has been logging customer messages to a US vendor for six weeks. The technology is the easy part. The compliance work is where projects stall, and where the Information Commissioner's Office (ICO) has started paying attention.
This guide covers what UK GDPR actually requires when you automate work with AI, what the ICO has said publicly, and the practical controls that hold up under audit. It is written for operations, engineering and data protection leads who have to make this work in production, not just on paper.
What UK GDPR actually says about AI automation
UK GDPR (the retained version of EU GDPR after Brexit, sitting alongside the Data Protection Act 2018) does not mention AI by name. It applies the same way it always has: if your automation processes personal data, you need a lawful basis, you need to meet the data protection principles in Article 5, and you owe data subjects a set of rights including information, access, rectification, erasure, and in some cases the right not to be subject to a solely automated decision.
Three articles do most of the work in AI cases:
- Article 5 - the principles: lawfulness, fairness, transparency, purpose limitation, data minimisation, accuracy, storage limitation, integrity and confidentiality, and accountability.
- Article 22 - the right not to be subject to a decision based solely on automated processing that produces legal or similarly significant effects, with carve-outs for contract necessity, explicit consent, and authorisation by law.
- Article 35 - the requirement to carry out a Data Protection Impact Assessment (DPIA) for processing likely to result in a high risk to individuals, which the ICO explicitly says includes most AI processing of personal data.
The ICO's Guidance on AI and Data Protection (updated March 2023) is the single most useful document for UK practitioners. It is not law, but it is how the regulator will assess you. Read it before you scope your first DPIA, not after.
Lawful basis: where AI projects most often fail
You need a lawful basis under Article 6 for any processing of personal data, and for special category data (health, ethnicity, biometrics, political views) an additional condition under Article 9. AI automation projects regularly get this wrong in three ways.
Reusing consent for a new purpose. If you collected email addresses to send newsletters, you cannot quietly feed them into a lookalike-modelling system without further work. Purpose limitation (Article 5(1)(b)) means new purposes need their own lawful basis or a compatibility assessment.
Relying on legitimate interests without a Legitimate Interests Assessment (LIA). Legitimate interests is a flexible basis but it requires a three-part test: purpose, necessity, balancing. For AI training and inference, the balancing test is where most weight falls, particularly when the data subjects would not reasonably expect their data to be used to train a model. The ICO published a consultation series on generative AI in 2024 covering this in detail, including the application of legitimate interests to web-scraped training data.
Confusing the basis for training with the basis for inference. Training a model on historic CRM data and using a deployed model to score new leads are two different processing activities, often with different lawful bases. Document them separately.
For automation that calls a third-party LLM API (OpenAI, Anthropic, Google), the practical effect is that customer data sent in prompts is being processed by the provider as your processor (or as a sub-processor, if you are yourself a processor acting for your own customers). Your lawful basis covers the act of sending it; your contracts cover what the provider can do with it.
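One way to keep training and inference documented separately is a small register with one record per processing activity. The sketch below is illustrative only: the class, its field names, the example document ID and the two checks it runs are assumptions made for this example, not an ICO-prescribed format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProcessingActivity:
    """One row in a hypothetical register of AI processing activities."""
    name: str
    purpose: str
    lawful_basis: str                      # Article 6 basis, e.g. "legitimate_interests"
    special_category: bool = False         # health, ethnicity, biometrics, political views
    article9_condition: Optional[str] = None
    lia_reference: Optional[str] = None    # ID or link of the completed LIA

    def gaps(self) -> list[str]:
        """Return the documentation gaps this sketch knows how to detect."""
        problems = []
        if self.lawful_basis == "legitimate_interests" and not self.lia_reference:
            problems.append("legitimate interests relied on without an LIA")
        if self.special_category and not self.article9_condition:
            problems.append("special category data without an Article 9 condition")
        return problems


# Training and inference are recorded as two separate activities.
training = ProcessingActivity(
    name="lead-scoring: training",
    purpose="train a scoring model on historic CRM data",
    lawful_basis="legitimate_interests",
)
inference = ProcessingActivity(
    name="lead-scoring: inference",
    purpose="score new leads with the deployed model",
    lawful_basis="legitimate_interests",
    lia_reference="LIA-EXAMPLE-001",  # hypothetical document ID
)

for activity in (training, inference):
    print(activity.name, "->", activity.gaps() or "no gaps detected")
```

A register like this does not replace the legal analysis. It only makes it harder for one activity to borrow another's paperwork.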
When you need a DPIA, and what it should contain
Article 35 of UK GDPR requires a DPIA where processing is "likely to result in a high risk" to individuals. The ICO's list of operations that always require a DPIA includes use of innovative technology, profiling on a large scale, automated decision-making with legal or significant effects, and processing of biometric data. Most production AI automation hits at least one of these.
A DPIA that holds up has seven sections:
- Description of processing - data flows, categories of data, categories of data subject, volumes, retention, sub-processors and their locations.
- Necessity and proportionality - why this processing is needed, why automation rather than manual review, what less intrusive alternatives were considered.
- Consultation - data subjects, DPO, internal stakeholders. If you cannot consult data subjects directly, document why.
- Risk assessment - likelihood and severity of risks to rights and freedoms. For AI, include bias, inaccuracy, security of training data, model inversion, prompt injection, and re-identification.
- Mitigations - the controls that reduce each identified risk, with named owners and review dates.
- Residual risk - what remains after mitigation. If residual risk is high, you must consult the ICO before processing (Article 36).
- Sign-off - DPO opinion, decision-maker approval, review schedule.
DPIAs are living documents. A model retrained on new data, a vendor change, or a new use case all warrant a refresh. The ICO has criticised organisations in enforcement decisions for treating the DPIA as a one-time form.
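To make the risk, mitigation and residual-risk sections easier to keep current, some teams hold them as structured data rather than prose. The following is a minimal sketch under stated assumptions: the 1 to 3 scales, the multiplication of likelihood by severity, the threshold of 6, and all names and dates are invented for illustration. UK GDPR does not prescribe a scoring method, so agree your own with your DPO.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Risk:
    description: str
    likelihood: int           # 1 (remote) to 3 (probable): assumed scale
    severity: int             # 1 (minimal) to 3 (severe): assumed scale
    mitigation: str
    owner: str
    review_date: date
    residual_likelihood: int  # after mitigation
    residual_severity: int    # after mitigation

    @property
    def residual_score(self) -> int:
        return self.residual_likelihood * self.residual_severity


HIGH_RISK_THRESHOLD = 6  # assumed cut-off for "high" residual risk


def needs_prior_consultation(risks: list[Risk]) -> bool:
    """True if any residual risk is still high after mitigation (Article 36)."""
    return any(r.residual_score >= HIGH_RISK_THRESHOLD for r in risks)


def overdue_reviews(risks: list[Risk], today: date) -> list[str]:
    """Descriptions of risks whose review date has passed."""
    return [r.description for r in risks if r.review_date < today]


risks = [
    Risk("bias in CV screening", 3, 3, "periodic fairness testing",
         "ML lead", date(2025, 1, 15), 2, 2),
    Risk("prompt injection exposes records", 2, 3, "input filtering",
         "security lead", date(2025, 3, 1), 2, 3),
]

print("Consult ICO before processing:", needs_prior_consultation(risks))
print("Reviews overdue:", overdue_reviews(risks, today=date(2025, 2, 1)))
```

Storing a review date per risk, rather than one date for the whole DPIA, is a design choice that makes the "living document" expectation checkable by a scheduled job.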
Article 22 and solely automated decisions
Article 22 is the rule everyone misquotes. It does not ban automated decisions. It gives data subjects the right not to be subject to a decision based solely on automated processing where it produces legal or similarly significant effects, unless one of three exceptions applies: contract necessity, explicit consent, or authorisation by UK law with suitable safeguards.
Two terms do the work. "Solely" means without meaningful human review. A loan decision rubber-stamped by a human who never opens the case file is still solely automated. The Court of Justice of the European Union confirmed in the SCHUFA judgment (C-634/21, December 2023) that producing a credit score that lenders then rely on is itself an Article 22 decision, even if the lender makes the formal call. UK courts are not bound by that judgment but the ICO has signalled alignment.
"Legal or similarly significant effects" covers credit, employment, insurance pricing, access to essential services, and increasingly targeted advertising where it affects opportunities. Routing a support ticket to a queue is not in scope. Auto-rejecting a job application is.
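The Article 22 test described above reduces to three questions, which can be encoded as a screening function in a deployment pipeline. This is a rough sketch, not legal advice in code: the function name, enum and return strings are hypothetical, and whether human review is "meaningful" or an effect is "significant" remains a judgment that a person has to make and record.

```python
from enum import Enum
from typing import Optional


class Article22Exception(Enum):
    CONTRACT_NECESSITY = "contract necessity"
    EXPLICIT_CONSENT = "explicit consent"
    AUTHORISED_BY_LAW = "authorised by UK law with suitable safeguards"


def article22_outcome(
    meaningful_human_review: bool,
    significant_effect: bool,
    exception: Optional[Article22Exception] = None,
) -> str:
    """Classify a decision flow against the Article 22 test."""
    # A rubber stamp does not count: only meaningful review breaks "solely".
    solely_automated = not meaningful_human_review
    if not (solely_automated and significant_effect):
        return "out of scope"
    if exception is None:
        return "blocked: add meaningful human review or identify an exception"
    return f"permitted with safeguards ({exception.value})"


# Routing a support ticket: automated, but no significant effect.
print(article22_outcome(meaningful_human_review=False, significant_effect=False))

# Auto-rejecting a job application with no exception identified.
print(article22_outcome(meaningful_human_review=False, significant_effect=True))
```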
Where Article 22 applies, you must provide meaningful information about the logic, the significance and the envisaged consequences, and offer the data subject a route to human intervention, to express their view, and to contest the decision. "Meaningful information about the logic" does not mean publishing model weights. The ICO's guidance suggests explaining the input features used, their relative importance, and the decision threshold.
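As a rough illustration of an explanation built from input features, relative importance and a threshold, the sketch below uses a simple additive scoring model. Everything in it is assumed for the example: the feature names, the weights and the threshold are invented. Real models are rarely this simple and usually need a dedicated explanation technique.

```python
def explain_decision(weights: dict[str, float],
                     features: dict[str, float],
                     threshold: float) -> dict:
    """Explain an additive score: inputs, their relative weight, the threshold."""
    contributions = {name: weights[name] * value for name, value in features.items()}
    score = sum(contributions.values())
    total = sum(abs(c) for c in contributions.values()) or 1.0
    # Rank features by the absolute size of their contribution to the score.
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    importance = {name: round(abs(c) / total, 2) for name, c in ranked}
    return {
        "score": score,
        "threshold": threshold,
        # Design choice in this sketch: never auto-reject, refer to a person.
        "outcome": "accept" if score >= threshold else "refer to human review",
        "relative_importance": importance,
    }


# Hypothetical CV-screening weights and one candidate's features.
weights = {"years_experience": 2.0, "skills_matched": 3.0, "gaps_in_history": -2.0}
candidate = {"years_experience": 5, "skills_matched": 2, "gaps_in_history": 2}

explanation = explain_decision(weights, candidate, threshold=10.0)
for key, value in explanation.items():
    print(f"{key}: {value}")
```

Sending below-threshold cases to a reviewer instead of rejecting them is one way to give data subjects the route to human intervention that Article 22 expects.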
Data minimisation and retention in AI pipelines
Article 5(1)(c) requires personal data to be adequate, relevant and limited to what is necessary. AI systems pull in the opposite direction: more data usually means better performance. The compliance work is to find the smallest dataset that delivers acceptable performance, not the largest dataset you can technically access.
Practical patterns that hold up:
- Pseudonymise before processing. Replace direct identifiers with tokens before data hits the model. Keep the mapping table in a separate, access-controlled store. This is not anonymisation under GDPR but it materially reduces risk.
- Redact at the prompt boundary. For LLM automation, run a PII detection pass before the prompt leaves your environment. Tools like Microsoft Presidio or Cloud DLP do this server-side. Log the redactions.
- Set retention on vector stores. Embeddings derived from personal data are themselves personal data under most readings. Vector databases need the same retention policies as the source documents.
- Separate training, evaluation and inference data. Each has different retention needs. Training data may need long retention to retrain; inference logs usually do not.
- Document the model lifecycle. When the model is retired, the personal data baked into its weights through training is a deletion question. The ICO has not given firm guidance on model deletion but the direction of travel is clear.
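The first two patterns in the list above, pseudonymising identifiers and redacting at the prompt boundary, can be sketched with the standard library alone. Treat this as a toy under loud assumptions: the regular expressions catch only simple email addresses and UK mobile numbers, the key is a placeholder, and the in-memory dictionary stands in for a separate access-controlled store. A production system would use a dedicated PII detector such as the tools named above.

```python
import hashlib
import hmac
import re

# Assumption: in practice the key comes from a secrets manager, not source code.
SECRET_KEY = b"replace-with-a-managed-secret"


def pseudonymise(identifier: str, mapping: dict[str, str]) -> str:
    """Replace a direct identifier with a keyed token and record the mapping."""
    digest = hmac.new(SECRET_KEY, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
    token = f"tok_{digest[:16]}"
    mapping[token] = identifier  # stand-in for a separate, access-controlled store
    return token


# Deliberately narrow patterns, for illustration only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
UK_MOBILE = re.compile(r"(?:\+44\s?7\d{3}|\b07\d{3})\s?\d{3}\s?\d{3}\b")


def redact(prompt: str) -> tuple[str, list[str]]:
    """Redact matches before the prompt leaves your environment.

    The log records counts per category, never the redacted values.
    """
    log = []
    for label, pattern in (("EMAIL", EMAIL), ("PHONE", UK_MOBILE)):
        prompt, count = pattern.subn(f"[{label}]", prompt)
        if count:
            log.append(f"{label}: {count}")
    return prompt, log


mapping: dict[str, str] = {}
customer_ref = pseudonymise("jane.doe@example.com", mapping)
safe_prompt, redaction_log = redact(
    "Customer jane.doe@example.com called from 07700 900123 about a refund."
)
print(customer_ref)
print(safe_prompt)
print(redaction_log)
```

A keyed HMAC is used rather than a plain hash so that someone without the key cannot rebuild tokens by hashing guessed identifiers. As the list notes, this is still pseudonymisation, not anonymisation.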
Storage limitation also applies to chat transcripts, audio recordings, and agent traces. Default retentions of 30 to 90 days work for most operational automation; longer retention needs documented justification.
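Retention only works if something enforces it. Below is a minimal sketch of a purge selector that a scheduled job could call. The record shape, the category names and the specific day counts are assumptions for the example, chosen within the 30 to 90 day range mentioned above, and embeddings are given their own entry so that vector stores follow a documented policy too.

```python
from datetime import datetime, timedelta, timezone

# Assumed retention periods in days, per record kind.
RETENTION_DAYS = {
    "chat_transcript": 30,
    "agent_trace": 30,
    "audio_recording": 90,
    "embedding": 90,  # assumed to match the retention of its source document
}


def expired(records: list[dict], now: datetime) -> list[str]:
    """Return the IDs of records that have outlived their retention period."""
    to_delete = []
    for record in records:
        limit = timedelta(days=RETENTION_DAYS[record["kind"]])
        if now - record["created_at"] > limit:
            to_delete.append(record["id"])
    return to_delete


now = datetime(2025, 6, 1, tzinfo=timezone.utc)
records = [
    {"id": "t1", "kind": "chat_transcript",
     "created_at": datetime(2025, 4, 1, tzinfo=timezone.utc)},
    {"id": "a1", "kind": "audio_recording",
     "created_at": datetime(2025, 4, 1, tzinfo=timezone.utc)},
    {"id": "e1", "kind": "embedding",
     "created_at": datetime(2025, 1, 1, tzinfo=timezone.utc)},
]

print(expired(records, now))
```

A record kind missing from the table raises a `KeyError` rather than being silently kept, which forces every new data store to declare a retention period before it ships.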
Transparency: what to tell users, and where
Articles 13 and 14 set out what data subjects must be told when you collect their data. For AI automation the additional requirements are:
- The existence of automated decision-making, including profiling, where Article 22 applies.
- Meaningful information about the logic involved.
- The significance and envisaged consequences for the data subject.
- The sub-processors involved, particularly where data is transferred outside the UK.
In practice this means updating your privacy notice and, for AI features that materially change the user experience, adding contextual disclosure at the point of interaction. A chatbot should identify itself as automated. A lead-scoring system that affects sales prioritisation should be disclosed in the privacy notice. A CV-screening tool needs disclosure to candidates before they apply.
The ICO's enforcement against Clearview AI (£7.5m fine, May 2022, later overturned on jurisdictional grounds but the substantive findings remain instructive) and its public commentary on emotion-recognition AI both centre on transparency failures. The pattern is clear: the regulator will treat opaque AI as a transparency breach first, before considering other principles.
International transfers and vendor contracts
Most production AI stacks involve a US-based LLM provider. The UK-US Data Bridge (in force since October 2023) provides a transfer mechanism for organisations certified under the UK Extension to the EU-US Data Privacy Framework. For non-certified vendors, the International Data Transfer Agreement (IDTA) or the UK Addendum to the EU Standard Contractual Clauses remain the route, paired with a Transfer Risk Assessment (TRA).
Practical contract requirements when procuring an AI vendor:
- Article 28 processor terms. Standard data processing agreement language: process only on instructions, confidentiality, security, sub-processor authorisation, assistance with data subject rights, breach notification, audit rights.
- Training opt-out. Explicit confirmation that your data will not be used to train the vendor's foundation models. OpenAI, Anthropic and Google all offer this on business and enterprise tiers; verify it is in your specific contract.
- Sub-processor list and change notification. You need to know who else touches the data and have a route to object.
- Retention and deletion. Specific commitments on log retention, including prompt and completion logs, and a deletion process on contract termination.
- Geographic processing commitments. Where data is processed, including failover regions.
The ICO's international transfers guidance is the authoritative reference. Do not rely on a vendor's marketing page claiming GDPR compliance; read the contract, including the latest version of their DPA, which often changes silently.
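Two of the points above lend themselves to lightweight automation: checking that a contract review covered every required term, and noticing when a vendor's published DPA no longer matches the version you reviewed. The sketch below assumes you keep a text copy of the DPA you signed off. The term names, function names and sample sentences are hypothetical.

```python
import hashlib

# Contract terms from the procurement list above, as hypothetical keys.
REQUIRED_TERMS = [
    "article_28_processor_terms",
    "training_opt_out",
    "sub_processor_change_notification",
    "retention_and_deletion",
    "geographic_processing_commitments",
]


def missing_terms(contract_review: dict[str, bool]) -> list[str]:
    """Terms the reviewer has not confirmed are present in the contract."""
    return [term for term in REQUIRED_TERMS if not contract_review.get(term, False)]


def fingerprint(dpa_text: str) -> str:
    """Hash the DPA text, ignoring whitespace-only differences."""
    normalised = " ".join(dpa_text.split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def dpa_changed(reviewed_fingerprint: str, current_dpa_text: str) -> bool:
    """True if the currently published DPA differs from the reviewed version."""
    return fingerprint(current_dpa_text) != reviewed_fingerprint


review = {"article_28_processor_terms": True, "training_opt_out": True}
print("Still to confirm:", missing_terms(review))

reviewed = fingerprint("Vendor will retain prompt logs for 30 days.")
print(dpa_changed(reviewed, "Vendor will retain prompt logs for 90 days."))
```

A changed fingerprint says only that something changed, not whether it matters. The point is to trigger a human re-read.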
A practical compliance checklist for AI automation projects
Run this list before any AI automation goes to production:
- Lawful basis identified and documented for each processing activity (training, inference, logging).
- LIA completed if relying on legitimate interests.
- DPIA completed, signed off, and dated. Review date set.
- Article 22 assessment: is the decision solely automated and significant? If yes, exception identified and safeguards in place.
- Data minimisation review: redaction, pseudonymisation, and retention policies documented.
- Privacy notice updated. Contextual disclosure added where material.
- Vendor contracts in place with Article 28 terms, training opt-out, and sub-processor controls.
- Transfer mechanism documented for any non-UK processing.
- Data subject rights process tested: access, rectification, erasure, objection, human review.
- Security controls: access logs, prompt and completion logs with retention, secrets management, prompt injection mitigations.
- Incident response plan covers AI-specific scenarios: model leak, prompt injection, output errors causing harm.
- Owner named for ongoing compliance review. Review cadence set (quarterly is typical for production AI).
None of this stops you shipping. It changes the order in which you ship. Compliance work done in week one of a project costs a fraction of the same work bolted on after launch, and it removes the ambiguity that kills internal momentum.
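If your team already gates releases in CI, the checklist can become one more gate. The following is a sketch, not a recommended standard: the item keys paraphrase the list above, and the evidence format (a link or document ID per item) and example values are assumptions.

```python
# Checklist items from this section, as hypothetical keys.
CHECKLIST = [
    "lawful_basis_documented",
    "lia_completed_if_legitimate_interests",
    "dpia_signed_off_with_review_date",
    "article_22_assessment",
    "data_minimisation_review",
    "privacy_notice_updated",
    "vendor_contracts_in_place",
    "transfer_mechanism_documented",
    "data_subject_rights_process_tested",
    "security_controls",
    "incident_response_plan",
    "named_owner_and_review_cadence",
]


def release_blockers(evidence: dict[str, str]) -> list[str]:
    """Checklist items that have no evidence (link or document ID) recorded."""
    return [item for item in CHECKLIST if not evidence.get(item, "").strip()]


# Hypothetical evidence for a project that is part-way through.
evidence = {
    "lawful_basis_documented": "register/lead-scoring",
    "dpia_signed_off_with_review_date": "DPIA-EXAMPLE-001",
    "privacy_notice_updated": "",  # an empty entry counts as missing
}

blockers = release_blockers(evidence)
print(f"{len(blockers)} item(s) block release:")
for item in blockers:
    print(" -", item)
```

Requiring a pointer to evidence, rather than a tick box, keeps the gate honest: someone has to be able to open the document.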
Where the ICO is heading
The ICO's regulatory direction for 2024-2026 includes a focused programme on generative AI, biometric recognition, and the use of AI in recruitment. The regulator has published a series of consultation responses and is increasingly willing to use its Article 58 investigation powers proactively, not just reactively. The Data Protection and Digital Information Bill (still in flux at time of writing) may adjust some of the detail but not the direction.
The Department for Science, Innovation and Technology's pro-innovation AI regulation framework runs alongside, not instead of, UK GDPR. Sector regulators (FCA, MHRA, Ofcom) are adding AI-specific expectations within their existing remits. For most mid-market AI automation, UK GDPR remains the binding constraint.
FAQs
Does using ChatGPT or Claude at work count as processing personal data under GDPR?
If you paste personal data into a prompt, yes. The act of sending it to the LLM provider is processing, and the provider becomes a processor or sub-processor under your data flows. Consumer tiers of ChatGPT and Claude typically use prompts for training by default, which is rarely compatible with a UK GDPR lawful basis for customer or employee data. Business and enterprise tiers offer training opt-out and proper data processing agreements. The practical answer is: ban consumer tiers for any work data, procure the business tier with a signed DPA, and train staff on what they can and cannot paste.
Do we need a DPIA for every AI automation we build?
Not every one, but most. The ICO's position is that AI processing of personal data is presumptively high-risk and therefore a DPIA is expected. Internal automations that touch only non-personal data (document classification on technical specs, code review tooling, financial reporting on aggregated figures) may not need one. Anything involving customer data, employee data, or decisions about individuals will need a DPIA. A pragmatic approach is to use a short screening questionnaire to decide, and run a full DPIA whenever any answer indicates personal data, automated decision-making, or vulnerable data subjects.
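The screening questionnaire can be very small. Here is one possible shape, with three questions taken from the triggers named above. The question wording and function name are assumptions, and a real questionnaire would likely ask more.

```python
SCREENING_QUESTIONS = {
    "personal_data": "Does the automation process personal data?",
    "automated_decisions": "Does it make or inform decisions about individuals?",
    "vulnerable_subjects": "Could any data subjects be children or otherwise vulnerable?",
}


def full_dpia_required(answers: dict[str, bool]) -> bool:
    """Run a full DPIA if any screening answer is yes.

    Unanswered questions raise an error instead of defaulting to "no".
    """
    unanswered = SCREENING_QUESTIONS.keys() - answers.keys()
    if unanswered:
        raise ValueError(f"unanswered screening questions: {sorted(unanswered)}")
    return any(answers[key] for key in SCREENING_QUESTIONS)


# Code review tooling that only sees source code.
print(full_dpia_required({
    "personal_data": False,
    "automated_decisions": False,
    "vulnerable_subjects": False,
}))

# A support chatbot handling customer messages.
print(full_dpia_required({
    "personal_data": True,
    "automated_decisions": False,
    "vulnerable_subjects": False,
}))
```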
Can we train a model on customer data without explicit consent?
Sometimes, with care. Consent is one lawful basis but not the only one. Legitimate interests can support model training where the use is within the reasonable expectations of the data subject, the data minimisation principle is respected, and the LIA balancing test favours processing. Special category data (health, biometrics) raises the bar significantly and usually requires explicit consent or another Article 9 condition. The harder question is purpose limitation: if data was collected for service delivery, using it for model training is a new purpose that needs its own assessment. Most organisations underestimate this work.
What's the risk if we get it wrong?
UK GDPR fines reach the higher of £17.5m or 4% of global annual turnover. ICO fines have so far been more modest in the AI space, but enforcement notices, mandatory processing stops, and reputational damage are at least as material. The bigger commercial risk is buyer pushback: enterprise procurement now routinely asks AI suppliers for DPIAs, model cards, and evidence of GDPR controls. Failing those questions costs deals. Individual data subject claims, while rare, are also rising, and class-action-style group claims are emerging in the UK courts.
How do we handle data subject access requests for AI systems?
Standard subject access rights apply: a data subject can ask what personal data you hold, including data in AI training sets and inference logs. The hard parts are vector embeddings (treat as personal data where they derive from identifiable source data), model weights (the ICO has not required disclosure of these), and prompt and completion logs (in scope, so design retention accordingly). Build the access process to include AI data stores from the start. For erasure requests, you may need to delete from logs and retraining queues; deleting from a deployed model's weights is generally not feasible, which is a reason to minimise what personal data enters training in the first place.
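Building AI data stores into the access process can be as simple as a registry of search functions, one per store, that the DSAR workflow calls in turn. The sketch below uses in-memory lists as stand-ins. The store names, record shapes and subject IDs are invented, and it assumes each store keeps a subject identifier alongside the data, which is itself a design decision to make early.

```python
from typing import Callable

SearchFn = Callable[[str], list[str]]

# In-memory stand-ins for real stores (hypothetical data).
inference_logs = [
    {"subject": "cust-42", "text": "prompt: refund query"},
    {"subject": "cust-07", "text": "prompt: address change"},
]
embeddings = [{"subject": "cust-42", "source_doc": "ticket-981"}]


def search_inference_logs(subject_id: str) -> list[str]:
    return [row["text"] for row in inference_logs if row["subject"] == subject_id]


def search_embeddings(subject_id: str) -> list[str]:
    return [f"embedding of {row['source_doc']}"
            for row in embeddings if row["subject"] == subject_id]


# Every AI data store registers a search function here when it is created.
AI_DATA_STORES: dict[str, SearchFn] = {
    "inference_logs": search_inference_logs,
    "vector_store": search_embeddings,
}


def build_dsar_report(subject_id: str,
                      stores: dict[str, SearchFn]) -> dict[str, list[str]]:
    """Collect what each registered store holds about one data subject."""
    return {name: search(subject_id) for name, search in stores.items()}


report = build_dsar_report("cust-42", AI_DATA_STORES)
for store, hits in report.items():
    print(store, "->", hits)
```

The same registry can carry a delete function per store for erasure requests. Model weights are left out here for the reason given above.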
Is self-hosting an LLM enough to solve our GDPR problems?
It helps but does not solve everything. Running an open-weights model (Llama, Mistral) on your own infrastructure removes the international transfer question and the sub-processor risk, which are two of the larger compliance burdens. It does not remove the need for a lawful basis, a DPIA, transparency, data subject rights, or Article 22 analysis. It also shifts responsibility onto you: with no vendor acting as processor for the model, you own the security of the weights, and you carry responsibility for output errors that would otherwise sit partly with the vendor. Self-hosting is a strong option for sensitive use cases but it is a trade, not a free pass.
Who owns AI compliance internally - DPO, legal, or engineering?
All three, with clear interfaces. The DPO owns the framework, the DPIA process, and the regulator relationship. Legal owns contracts, lawful basis sign-off, and Article 22 calls. Engineering owns the controls in the system: redaction, retention, logging, access management, audit trails. The pattern that works is a small AI governance group meeting monthly, with a single named owner per production AI system who escalates to the group when a model, vendor, or use case changes. Without a named system owner, compliance drift is almost guaranteed within six months of launch.
Building for compliance from day one
UK GDPR is not the obstacle to AI automation that some teams treat it as. It is a design constraint that, met properly, produces systems that survive procurement, audit, and the inevitable awkward question from a customer. The organisations getting this right are not the ones with the largest legal teams. They are the ones treating compliance as a build problem with engineering, legal and operations in the same room from week one.
If you want help designing AI automation that holds up to ICO scrutiny without slowing the build, AI Advisory works with UK mid-market teams to scope, build and operate compliant AI systems end to end. Get in touch to discuss your project.