DPDP-compliant LLM inference: keeping data in India

India’s Digital Personal Data Protection (DPDP) Act turns a vague best practice into a legal obligation: personal data of Indian data principals must be handled under defined conditions, and cross-border transfer is constrained. For teams running LLM inference, that makes a previously ignored question load-bearing: where did this prompt’s data go?

The naive responses both fail. Routing everything to one in-region provider throws away cost, latency, and model choice. Routing everything to the best global model ignores the law. The workable answer is per-request: classify what’s regulated, lock only that to an in-region provider, and let the rest route freely.

Why "deploy in Mumbai" isn’t enough

Hosting your application in an Indian region says nothing about the provider that actually serves a completion. If a request carrying a customer’s PAN or Aadhaar number is sent to a model endpoint in us-east, the data has left the country regardless of where your app runs. DPDP cares about the data flow, not the deployment diagram.

Per-request classification, then region-locked routing

A DPDP-aware gateway does three things inline, before egress:

Detect Indian personal-data signals, including India-specific identifiers (PAN, Aadhaar) alongside the usual PII, deterministically on the request.
Constrain routing for those requests to providers with an eligible in-region serving location (for example, an Azure OpenAI deployment in centralindia), overriding the client’s preferred chain if needed.
Fail closed if no eligible provider is available, returning an error rather than silently sending regulated data abroad.
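The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not any gateway's actual implementation: the `Provider` type, `route` function, `IN_REGIONS` set, and provider names are all hypothetical, and the two regexes check identifier format only (no checksum validation, no broader PII detection).

```python
import re
from dataclasses import dataclass

# Format-only patterns, assumed for illustration (no checksum validation).
# PAN: five letters, four digits, one letter. Aadhaar: 12 digits, often grouped 4-4-4.
PAN_RE = re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b")
AADHAAR_RE = re.compile(r"\b[2-9][0-9]{3}[ -]?[0-9]{4}[ -]?[0-9]{4}\b")

# Hypothetical set of serving regions treated as in-country.
IN_REGIONS = frozenset({"centralindia"})


@dataclass(frozen=True)
class Provider:
    name: str
    region: str


class NoEligibleProvider(Exception):
    """Raised instead of sending regulated data out of region."""


def is_regulated(prompt: str) -> bool:
    # Step 1: deterministic detection on the request text.
    return bool(PAN_RE.search(prompt) or AADHAAR_RE.search(prompt))


def route(prompt: str, preferred: list[Provider], available: list[Provider]) -> list[Provider]:
    if not is_regulated(prompt):
        # No personal-data signal: the normal cost/latency chain applies.
        return list(preferred)
    # Step 2: constrain to in-region providers, overriding the client's chain.
    in_region = [p for p in available if p.region in IN_REGIONS]
    if not in_region:
        # Step 3: fail closed rather than egress regulated data.
        raise NoEligibleProvider("no in-region provider available")
    # Keep the client's ordering where it already names in-region providers.
    ranked = [p for p in preferred if p in in_region]
    ranked += [p for p in in_region if p not in ranked]
    return ranked
```

Under these assumptions, a prompt without identifiers returns the client's chain unchanged, a prompt containing a PAN-shaped token returns only in-region providers, and the same prompt with no in-region provider configured raises `NoEligibleProvider`.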

Crucially, a request with no personal data still routes by your normal cost/latency policy. You don’t pay a sovereignty tax on traffic that doesn’t need it.

Redaction is the second line

Classification decides routing; redaction reduces exposure. Deterministic PII and secret redaction on both the request and the response means that even where data is permitted to flow, identifiers can be masked before they reach the model, and leaked secrets (API keys, tokens) never egress at all. Both run in microseconds on the hot path, not as a hosted callback.
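As a rough sketch of deterministic redaction, the snippet below masks matches with a label and counts what it replaced. The pattern list is an assumption for illustration: the `sk-`/`pk-` secret shape in particular is a made-up stand-in for whatever key formats a real deployment would need to recognise, and the identifier patterns are format-only.

```python
import re

# Illustrative recognizers only; a real recognizer set would be broader.
PATTERNS = [
    ("PAN", re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b")),
    ("AADHAAR", re.compile(r"\b[2-9][0-9]{3}[ -]?[0-9]{4}[ -]?[0-9]{4}\b")),
    ("EMAIL", re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")),
    # Hypothetical API-key shape: "sk-" or "pk-" followed by 16+ alphanumerics.
    ("SECRET", re.compile(r"\b(?:sk|pk)-[A-Za-z0-9]{16,}\b")),
]


def redact(text: str) -> tuple[str, dict[str, int]]:
    """Replace each match with a [LABEL] placeholder and count replacements."""
    counts: dict[str, int] = {}
    for label, pattern in PATTERNS:
        text, n = pattern.subn(f"[{label}]", text)
        if n:
            counts[label] = n
    return text, counts
```

The same function can be applied to the request before egress and to the response before it is returned; the counts give a cheap per-request signal of how much was masked.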

Auditor-ready by default. Because each routing decision is recorded, the data-flow inventory a DPDP review asks for (which data reached which provider and region, how much was personal, how much stayed in-region) generates itself instead of being reconstructed from logs.
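One way to picture the recorded decisions is as small structured records that can be aggregated on demand. The `RoutingDecision` fields and the `inventory` function below are hypothetical names chosen for this sketch; a real gateway's record format may differ.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class RoutingDecision:
    # Hypothetical per-request record written at routing time.
    request_id: str
    provider: str
    region: str
    personal_data: bool


def inventory(decisions: list[RoutingDecision],
              in_regions: frozenset = frozenset({"centralindia"})) -> dict:
    """Summarise which provider/region saw traffic and how much was personal."""
    personal = [d for d in decisions if d.personal_data]
    return {
        "requests": len(decisions),
        "by_destination": dict(Counter((d.provider, d.region) for d in decisions)),
        "personal": len(personal),
        "personal_in_region": sum(1 for d in personal if d.region in in_regions),
    }
```

If `personal` and `personal_in_region` ever differ, that gap is exactly the set of requests an auditor would ask about.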

India-first, global by design

DPDP is the sharpest near-term driver, but the mechanism generalises. The same per-request classification and region-locking enforces GDPR (EU), HIPAA (US sectoral), and Gulf data-protection rules: you swap the recognizer set and the eligible-region map, not the architecture. Building for India’s requirement first happens to build the global one.
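To make the "swap the recognizer set and the eligible-region map" idea concrete, here is a small sketch in which each regime is just data. The `Regime` type, the `REGIMES` table, the region names, and the choice of an email pattern as the GDPR recognizer are placeholders for illustration, not a statement of what any regulation requires.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Regime:
    recognizers: tuple            # compiled patterns that mark a request as regulated
    eligible_regions: frozenset   # serving regions allowed for regulated requests


# Hypothetical profiles; patterns and region names are placeholders.
REGIMES = {
    "dpdp": Regime(
        recognizers=(
            re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b"),                          # PAN-shaped
            re.compile(r"\b[2-9][0-9]{3}[ -]?[0-9]{4}[ -]?[0-9]{4}\b"),        # Aadhaar-shaped
        ),
        eligible_regions=frozenset({"centralindia"}),
    ),
    "gdpr": Regime(
        recognizers=(re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),),
        eligible_regions=frozenset({"westeurope", "northeurope"}),
    ),
}


def allowed_regions(regime_name: str, prompt: str) -> Optional[frozenset]:
    """Return the region allow-list for a regulated prompt, or None if unrestricted."""
    regime = REGIMES[regime_name]
    if any(r.search(prompt) for r in regime.recognizers):
        return regime.eligible_regions
    return None
```

The routing code that consumes `allowed_regions` stays the same across regimes; only the table changes.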

If you’re scoping DPDP compliance for an LLM workload, the deciding capability isn’t deployment region; it’s whether your gateway can classify and lock per request, and prove it. That’s what a sovereign AI gateway is for.

See it in the enforce-residency task, or compare approaches across gateways.

Route your first sovereign request this week.