The Data Is There. Getting to It Shouldn't Require an Engineer: A Framework for AI-Powered Operational Self-Service
When operational data is locked behind engineering-only query interfaces, business users generate a hidden backlog of data requests. A framework for adding AI-powered natural language query capability to operational platforms without touching production systems.
Arko IT Services
The hidden cost of data request backlogs
Every operational platform generates data: transaction records, balance summaries, usage analytics, reconciliation figures. The data exists. Its value to business users is real.
The problem is access.
In most operational platforms, the data is reachable only by engineers who can write SQL, run a query, and read the raw output. For business users (the operators, partners, and clients who need the data to make decisions), the access path is a ticket to engineering. "Can you pull last month's redemption totals by partner?" "What's my current balance?" "Which transactions are pending settlement?"
Each request is small. Together they add up to a recurring engineering tax: hours spent on data retrieval that has nothing to do with building or improving the platform. And that tax compounds over the life of the system.
The AI-powered sidecar pattern goes straight at this.
The pattern: a read-only AI layer over operational data
The pattern is a separate, read-only system that sits next to the production platform and answers natural language questions about operational data.
```mermaid
graph TD
    subgraph PRODUCTION["Production Platform"]
        SOURCE_DB[Source Database]
        BACKEND[Application Backend]
    end
    subgraph SIDECAR["AI Query Sidecar (read-only)"]
        ETL[ETL Sync - scheduled]
        REPLICA[Read Replica Database]
        ROUTER[Intent Classifier - LLM]
        SQL_AGENT[SQL Agent]
        RAG_AGENT[RAG Agent - knowledge base]
        GUARDRAILS[Security Layer]
    end
    subgraph USERS["Business Users"]
        PROVIDER[Partner / Provider]
        MERCHANT[Operator / Merchant]
        ADMIN[Platform Admin]
    end
    SOURCE_DB -->|sync| ETL
    ETL --> REPLICA
    PROVIDER -->|natural language query| ROUTER
    MERCHANT -->|natural language query| ROUTER
    ADMIN -->|natural language query| ROUTER
    ROUTER --> SQL_AGENT
    ROUTER --> RAG_AGENT
    SQL_AGENT --> REPLICA
    SQL_AGENT --> GUARDRAILS
    RAG_AGENT --> GUARDRAILS
    GUARDRAILS --> PROVIDER
    GUARDRAILS --> MERCHANT
    GUARDRAILS --> ADMIN
```
The sidecar has no write path to the production system. It cannot modify data and it cannot create side effects. The worst a bug in the sidecar can do is return a wrong answer, not corrupt data or kill a transaction. That choice is not incidental. It is what makes it safe to iterate fast on the AI layer with no production risk.
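That constraint can be enforced mechanically rather than by convention. A minimal sketch in Python, assuming a Postgres production database: the only credential the sidecar holds against production opens sessions the server forces to read-only. The helper below is illustrative; a dedicated read-only database role achieves the same thing.

```python
# Sketch: the sidecar's only production credential opens sessions that the
# server forces into read-only mode, so even a buggy query path cannot
# turn into a write. DSN and option handling are illustrative.
import psycopg2

def connect_production_readonly(dsn: str):
    # The libpq `options` parameter applies the setting for the whole session.
    return psycopg2.connect(dsn, options="-c default_transaction_read_only=on")
```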
The five components
1. ETL sync
The sidecar keeps its own read replica, synchronized from the production database on a defined schedule. The sync is incremental: only changed records move. That replica is the only database the AI layer ever touches.
The sync interval sets the data freshness the system can offer. For most operational queries, an interval of minutes is plenty. For real-time balance queries, the interval has to match the latency tolerance.
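A minimal sketch of one sync step, assuming a Postgres source, an `updated_at` column on each synced table, and a `sync_state` bookkeeping table on the replica. Every name here is illustrative; the shape is what matters: read the changes since the last watermark, upsert them into the replica, advance the watermark.

```python
# Incremental sync sketch. Assumes an `updated_at` column on the source
# table and a `sync_state` table on the replica -- illustrative names only.
from datetime import datetime, timezone

import psycopg2
from psycopg2.extras import execute_values

def sync_table(source_dsn: str, replica_dsn: str, table: str, columns: list[str]) -> int:
    sync_started = datetime.now(timezone.utc)  # new watermark, taken before the read
    src = psycopg2.connect(source_dsn)
    dst = psycopg2.connect(replica_dsn)
    try:
        with dst.cursor() as cur:
            cur.execute("SELECT last_synced_at FROM sync_state WHERE table_name = %s", (table,))
            row = cur.fetchone()
            watermark = row[0] if row else datetime(1970, 1, 1, tzinfo=timezone.utc)

        with src.cursor() as cur:
            # Only changed records move: everything touched since the last run.
            cur.execute(
                f"SELECT {', '.join(columns)} FROM {table} WHERE updated_at > %s",
                (watermark,),
            )
            rows = cur.fetchall()

        with dst.cursor() as cur:
            if rows:
                # Upsert into the replica, keyed on the primary key `id`.
                updates = ", ".join(f"{c} = EXCLUDED.{c}" for c in columns if c != "id")
                execute_values(
                    cur,
                    f"INSERT INTO {table} ({', '.join(columns)}) VALUES %s "
                    f"ON CONFLICT (id) DO UPDATE SET {updates}",
                    rows,
                )
            cur.execute(
                "INSERT INTO sync_state (table_name, last_synced_at) VALUES (%s, %s) "
                "ON CONFLICT (table_name) DO UPDATE SET last_synced_at = EXCLUDED.last_synced_at",
                (table, sync_started),
            )
        dst.commit()
        return len(rows)
    finally:
        src.close()
        dst.close()
```

Running a step like this on a scheduler every few minutes is what sets the freshness bound the rest of the system can promise.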
2. Intent classification
Before any query runs, the system classifies the user's natural language input into an intent. The intent determines which query template applies and what data access scope the requesting user's role permits.
The classification uses an LLM to map free-form language onto a structured intent taxonomy. That handles the phrasing variation a rule-based classifier would choke on. "What did I earn this week?" and "Show me my settlement for the last seven days" both land on the same intent.
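A sketch of what that classifier can look like. The intent names are illustrative, and the OpenAI client stands in for whichever LLM provider the platform actually uses; the important part is that the model can only ever return an intent from the closed taxonomy, and anything else is rejected.

```python
# Sketch of intent classification: map free-form text onto a closed intent
# taxonomy and refuse anything outside it. Intent names are illustrative.
import json
from openai import OpenAI

INTENTS = [
    "settlement_summary",   # "what did I earn this week?"
    "balance_inquiry",      # "what's my current balance?"
    "transaction_status",   # "which transactions are pending settlement?"
    "card_expiry",          # "which of my cards are expiring this week?"
]

client = OpenAI()

def classify_intent(user_query: str) -> str | None:
    prompt = (
        "Classify the user's question into exactly one of these intents: "
        f"{', '.join(INTENTS)}. If none fit, answer with 'unknown'.\n"
        f"Question: {user_query}\n"
        'Respond as JSON: {"intent": "<name>"}'
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},
        temperature=0,
    )
    intent = json.loads(response.choices[0].message.content).get("intent")
    # Never trust the model blindly: only intents from the taxonomy pass through.
    return intent if intent in INTENTS else None
```

With this shape, "What did I earn this week?" and "Show me my settlement for the last seven days" both resolve to `settlement_summary`, and a question the taxonomy does not cover falls out as `None` instead of producing an unsupported query.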
3. Role-based data isolation
The intent classification result is checked against the user's role before any query runs. Each role can reach only a defined set of intents, and the queries generated for those intents are automatically scoped to the user's own data.
| Role | Accessible Data |
| --- | --- |
| Provider | Own gift cards, redemptions, balances |
| Merchant | Own outlet transactions, settlements, dues |
| Admin | Full platform - all tenants |
A Provider asking "what's the platform GMV this month" gets a sensible answer for their role: not an error, and not the platform-wide figure, but GMV scoped to their own volume. An Admin asking the same question gets platform-wide numbers.
This is not only authorization. It is a UX decision. Each role gets answers that are meaningful for where they sit, not a filtered slice of one universal query.
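A sketch of how that gate and that scoping can fit together. Roles, intents, table and column names are illustrative; what matters is that the role check runs before any SQL is generated and the tenant filter is baked into the template rather than left to the model.

```python
# Sketch of the role/intent gate and tenant scoping. Only one query
# template is shown; all names are illustrative.
ROLE_INTENTS = {
    "provider": {"settlement_summary", "balance_inquiry", "card_expiry"},
    "merchant": {"settlement_summary", "transaction_status"},
    "admin":    {"settlement_summary", "balance_inquiry", "transaction_status", "card_expiry"},
}

# The tenant filter is never optional for non-admin roles, so "platform GMV"
# silently becomes "my GMV" for a Provider.
QUERY_TEMPLATES = {
    "settlement_summary": (
        "SELECT date_trunc('week', settled_at) AS week, sum(amount) AS total "
        "FROM settlements WHERE (%(is_admin)s OR tenant_id = %(tenant_id)s) "
        "GROUP BY 1 ORDER BY 1 DESC LIMIT 12"
    ),
}

def build_query(intent: str, role: str, tenant_id: str):
    if intent not in ROLE_INTENTS.get(role, set()):
        raise PermissionError(f"role {role!r} cannot run intent {intent!r}")
    sql = QUERY_TEMPLATES[intent]
    params = {"is_admin": role == "admin", "tenant_id": tenant_id}
    return sql, params
```

This is also what makes the Provider's "platform GMV" question come back scoped to their own volume: the template, not the model, decides the filter.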
4. Query execution with guardrails
The SQL Agent turns the classified intent into a database query, runs it against the read replica, and hands the result to the response generator.
The guardrails layer wraps that process. Input validation blocks prompt injection attempts and SQL injection patterns before the intent classifier ever sees them. Output scrubbing masks PII (email addresses, phone numbers, personal identifiers) before the response leaves the sidecar. Rate limiting applies per-user request limits so nobody can turn the AI layer into an uncontrolled query engine.
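A sketch of those three guardrails, with illustrative patterns and limits. A production deployment would treat these as a starting point, not a complete security layer.

```python
# Sketch of the guardrails layer: input screening, PII masking on output,
# and a per-user rate limit. Patterns and limits are illustrative.
import re
import time
from collections import defaultdict, deque

BLOCKED_INPUT = re.compile(
    r"(;|--|\bdrop\b|\bdelete\b|\bupdate\b|\binsert\b|ignore previous instructions)",
    re.IGNORECASE,
)
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s-]{8,}\d")

_request_log: dict[str, deque] = defaultdict(deque)

def validate_input(text: str) -> None:
    # Runs before the intent classifier ever sees the text.
    if BLOCKED_INPUT.search(text):
        raise ValueError("query rejected by input validation")

def scrub_output(text: str) -> str:
    # Runs on every response before it leaves the sidecar.
    text = EMAIL.sub("[redacted email]", text)
    return PHONE.sub("[redacted phone]", text)

def check_rate_limit(user_id: str, max_per_minute: int = 20) -> None:
    now = time.monotonic()
    window = _request_log[user_id]
    while window and now - window[0] > 60:
        window.popleft()
    if len(window) >= max_per_minute:
        raise RuntimeError("rate limit exceeded")
    window.append(now)
```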
5. Response generation
The raw query result goes to an LLM with instructions to write a natural language response. The user gets an answer in the same shape they asked the question: a plain-language summary of what the data shows, not a table of database rows.
For large result sets or streaming responses, server-sent events let the response start rendering before the full query finishes.
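A sketch of that streaming path using FastAPI and server-sent events. The `summarise_result` generator is a stand-in for the LLM call that turns query rows into the plain-language answer.

```python
# Sketch of SSE streaming: the answer starts rendering as soon as the
# first chunk is ready. The summariser is a placeholder for the LLM call.
from typing import Iterator

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

def summarise_result(question: str, rows: list[dict]) -> Iterator[str]:
    # Placeholder: in the real sidecar this streams tokens from the LLM
    # prompted with the question and the raw query result.
    yield f"You asked: {question}. "
    yield f"The query returned {len(rows)} matching records."

@app.get("/ask")
def ask(question: str):
    rows = [{"amount": 120.0}, {"amount": 75.5}]  # stand-in for the SQL Agent result
    def event_stream() -> Iterator[str]:
        for chunk in summarise_result(question, rows):
            # SSE framing: each chunk is one `data:` event.
            yield f"data: {chunk}\n\n"
        yield "data: [DONE]\n\n"
    return StreamingResponse(event_stream(), media_type="text/event-stream")
```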
What this enables
Here is what the sidecar pattern unlocks.
Self-service data access for non-technical users. A partner can ask "which of my cards are expiring this week?" at 9pm and get an answer right away, with no ticket to engineering.
A shorter engineering queue. The "can you pull the data for X" category of work comes off the engineering backlog entirely. Engineering capacity moves to work that actually needs engineering.
Role-appropriate operational visibility. Each user type sees the slice of data that matters for their role. Platform partners, outlet operators, and platform admins all have different operational questions, and the sidecar answers all of them from one interface.
Faster decision cycles. When data is available on demand, decisions that used to wait on a retrieval cycle can be made on the spot.
What this does not do
The pattern has real limits, and getting the scope right means being honest about them.
It does not replace analytics. The sidecar answers operational questions about specific entities, transactions, and balances. It is not a data warehouse or a BI platform. Complex trend analysis, cross-cohort segmentation, and forecasting belong in a dedicated analytics layer.
It does not handle writes. By design. The read-only constraint is exactly where the sidecar's safety properties come from.
Data freshness is bounded by the sync interval. Real-time accuracy needs an interval that matches the use case. Some questions ("is this specific card currently active?") may mean accepting a delay or routing to a live API call.
LLM-generated SQL needs validation. The SQL Agent generates queries, and generated queries have to be validated before they run. Templates scoped to known intent patterns are far safer than open-ended SQL generation.
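A sketch of what that validation gate can look like when the system does generate SQL rather than fill templates: only a single SELECT against an allow-listed set of tables ever reaches the replica. The allow-list and patterns are illustrative.

```python
# Sketch of a validation gate for generated SQL: single statement, SELECT
# only, no write/DDL keywords, tables restricted to an allow-list.
import re

ALLOWED_TABLES = {"settlements", "transactions", "cards", "balances"}
FORBIDDEN = re.compile(r"\b(insert|update|delete|drop|alter|truncate|grant|copy)\b", re.I)

def validate_generated_sql(sql: str) -> str:
    statement = sql.strip().rstrip(";")
    if ";" in statement:
        raise ValueError("multiple statements are not allowed")
    if not statement.lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    if FORBIDDEN.search(statement):
        raise ValueError("write or DDL keyword in generated SQL")
    referenced = {t.lower() for t in re.findall(r"\b(?:from|join)\s+([a-zA-Z_]+)", statement, re.I)}
    if not referenced <= ALLOWED_TABLES:
        raise ValueError(f"query touches tables outside the allow-list: {referenced - ALLOWED_TABLES}")
    return statement
```

With templates scoped to known intents, a gate like this is nearly redundant; with open-ended generation, it is the last line of defence.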
When to build this
The sidecar pattern is the right call when:
- Business users currently depend on engineering for routine data retrieval
- The production system cannot be modified to add a query layer directly
- Multi-tenant isolation requirements make open query access unsafe without a controlled interface
- The operational questions are bounded and can be classified into a defined intent taxonomy
It is the wrong call when the questions are so open-ended they cannot be classified, when every query needs sub-second real-time data, or when the data model is too complex for LLM-generated SQL to be reliable without heavy validation.
The pattern pays off most in platforms with multiple user roles, each with its own bounded data access needs and its own set of recurring operational questions.