Introduction
In the past, whenever I joined a new team or switched analytics platforms, the first week was always the same: figuring out table structures, adjusting to whatever SQL dialect the new data used, and writing throwaway queries just to understand what the data looked like.
This was the job. If you were a PM who wanted to have the flexibility and reactivity that comes with pulling your own data, you accepted that a good chunk of your time would be spent on the mechanical side of analysis. Writing queries, debugging joins, building charts, formatting reports. The actual thinking about what the numbers meant often came last, squeezed into whatever time was left.
I now work in a lean team. No data engineers, no Looker or Tableau instance, no analyst to hand requests to. In that environment, AI has proven to be an incredible power tool for productivity.
Over the last few months, I’ve found three levels to how fully I’ve been able to integrate AI into my data analysis workflow. I think of them as good, better, and best. Most people I talk to are somewhere around “good” and haven’t realised how much further it can go.
Good: AI as Your Helper
This is the entry point, and honestly, if you’re not doing at least this much, you’re making your life harder than it needs to be.
At this level, you’re still writing your own queries and doing your own analysis. But when you get stuck, you ask an LLM instead of trawling through documentation or Stack Overflow. Syntax error you can’t spot? Paste it in. Need a window function and can’t remember the exact form? Ask. Want to know how a specific operation works in a SQL dialect you haven’t used before? Simple.
I switched to AWS Athena for the first time recently. Previously, that would have meant regularly Googling docs, learning the quirks and figuring out which functions from standard SQL work differently here. Instead, I just put in my broken queries with the error message and got them working straight away. The learning curve effectively disappeared.
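Date arithmetic is a typical example of the kind of dialect difference involved. As an illustration (table and column names invented), a MySQL-style expression that Athena rejects, next to the Presto/Trino form that Athena accepts:

```sql
-- MySQL-style date arithmetic, which Athena will reject:
-- SELECT user_id FROM events
-- WHERE event_date >= DATE_SUB(NOW(), INTERVAL 30 DAY);

-- The Athena (Presto/Trino) equivalent:
SELECT user_id
FROM events
WHERE event_date >= date_add('day', -30, current_date);
```

Paste the broken version plus the error message into an LLM and you get the second version back without ever opening the function reference.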
This tier also covers the softer side: getting AI to tighten up your reporting copy, clarify a summary, or restructure how you present findings. Small stuff, but it makes life a lot easier.
The barrier to entry here is zero. No setup, no tooling changes. Just open a chat window and start asking.
Better: AI Doing the Heavy Lifting
This is where things start to shift. Instead of writing queries yourself and asking AI for help, you describe what you want in plain English and let AI generate the full query.
“Show me daily active users for the last 30 days, broken down by country, excluding users who signed up in the last 7 days.” Copy the generated SQL, paste it into your query tool, run it. If the results look wrong, describe what’s off and get a corrected version. This loop is surprisingly fast.
In practice, the conversation looks something like this:
Me: “I need to see our 7-day retention by signup cohort for the last 3 months. Users are in the events table, signups are the account_created event, and a retained user is anyone who triggers any event 7 days after signup.”
And back comes a ready-to-run retention query: cohort definitions, the join logic, the percentage calculation, all of it. There’s sometimes a bit of back and forth on table structures and events, but describing what I want and getting working SQL back in seconds means I spend my time looking at the retention numbers, not constructing the query.
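For a sense of scale, the SQL that comes back looks roughly like this. This is an illustrative sketch, not the exact output, and it assumes the events table has user_id, event_name, and event_time columns:

```sql
-- 7-day retention by weekly signup cohort (illustrative sketch)
WITH signups AS (
    SELECT user_id,
           date_trunc('week', event_time) AS cohort_week,
           event_time AS signup_time
    FROM events
    WHERE event_name = 'account_created'
      AND event_time >= date_add('month', -3, current_date)
),
retained AS (
    -- anyone who triggers any event 7+ days after signing up
    SELECT DISTINCT s.user_id
    FROM signups s
    JOIN events e
      ON e.user_id = s.user_id
     AND e.event_time >= date_add('day', 7, s.signup_time)
)
SELECT s.cohort_week,
       count(*) AS signups,
       count(r.user_id) AS retained,
       round(100.0 * count(r.user_id) / count(*), 1) AS retention_pct
FROM signups s
LEFT JOIN retained r ON r.user_id = s.user_id
GROUP BY 1
ORDER BY 1;
```

Writing this by hand is maybe twenty minutes of work and two or three silly mistakes; describing it is one sentence.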
The real change at this level is what happens after you have results. Export a CSV, drop it into the conversation, and ask questions. “What patterns do you see?” “Can you generate an interactive chart showing the trend?” “Which cohorts are outliers and why might that be?” Follow-up questions that would previously have required a separate tool, a pivot table, or an analyst’s time are now just another message in the same conversation.
The bottleneck moves. You stop spending mental energy on how to get at the data and start spending it on what to do with it. The issue you run into now is that the AI doesn’t understand your analytics well enough to work efficiently without all the back and forth. And that’s where the next part comes in…
Best: AI That Knows Your Data
Why not have AI run queries, analyse results, and generate reports automatically, inside a context that knows your analytics intimately?
At this level, I run Claude Code sessions directly in my project’s codebase. I have a conversation about the metrics I want to investigate, and Claude can run SQL queries against our analytics database via the AWS CLI, summarise the results, and suggest follow-up questions, all within the same session. There’s no copying and pasting between tools. No context switching. The entire analysis loop happens in one place.
A typical session might start with me typing something like:
“Pull this week’s signup funnel using our standard funnel query. Break it down by platform and compare to last week. Flag anything that’s moved more than 10%.”
Claude finds the query we’ve previously saved in the project, runs it against Athena, gets the results, and comes back with a summary like:
“iOS signup completion is down 14% week-on-week. The drop is concentrated at the email verification step: completions there fell from 72% to 61%. Android and web are flat. Want me to dig into the email verification events to see what changed?”
From there I can say “yes, check if there’s been a change in the verification event volume or error rates” and it runs the follow-up query immediately. The whole thing feels like talking to an analyst who already knows your data, because it can literally read my documentation and code to understand the analytics events in detail.
A typical analysis session: question in, query run, summary back, follow-up ready.
Over time, I’ve built up a library of SQL queries saved in the repo that I run regularly: full_funnel.sql for our end-to-end signup funnel, cohort_summary.sql for retention analysis, audience breakdowns, and more. Claude knows about these and pulls from them for recurring reporting instead of assembling from scratch each time. Reporting can be put together in minutes then dug into and refined in conversation.
For sharing results, Claude generates markdown reports using templates I’ve set up. Graphs get rendered via command-line tools like gnuplot. The output is saved as HTML and uploaded for the team. It’s so fast and flexible for both formulaic and exploratory work.
A weekly report generated from a single conversation: markdown, charts, and all.
What I’d Build on Day One
If I joined a new team tomorrow, here’s what I’d set up to get this going.
First, schema documentation. A single reference file covering every table: what it contains, how events are structured, what the key columns and partitions are. This is what AI reads when it writes queries, and the quality of this document directly determines whether the SQL it generates is correct or garbage. The real advantage of working inside the codebase is that AI can generate and verify this documentation from the actual tracking code. No more manually maintained data dictionaries in some spreadsheet that’s always three months out of date. The code is the source of truth, and now the AI can read it. When something looks wrong in the data, you can ask it to trace an issue back through the implementation and check whether the tracking code matches what you expect.
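As an illustration (the table and columns are invented, matching the hypothetical events table used earlier), an entry in that reference file might look like:

```markdown
## events
One row per client analytics event. Partitioned by `day` (a YYYY-MM-DD string);
always filter on the partition to keep Athena scans cheap.

| column     | type      | notes                                      |
|------------|-----------|--------------------------------------------|
| user_id    | varchar   | stable account id; null for anonymous hits |
| event_name | varchar   | e.g. `account_created`, `email_verified`   |
| event_time | timestamp | client timestamp, UTC                      |
| platform   | varchar   | `ios`, `android`, or `web`                 |
```

The prose notes matter as much as the column list: they are what stops the AI from writing a full-table scan or joining on the wrong id.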
Second, a shell script that handles query execution. Mine takes a SQL file, substitutes parameters, submits it to Athena, and returns the results. Nothing fancy. But it means AI can run queries end-to-end without me copying and pasting between tools. You can create something similar that works with your data provider.
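Mine is specific to our setup, but a minimal sketch of such a runner, assuming SQL templates use {{start_date}}/{{end_date}} placeholders and that ATHENA_WORKGROUP and ATHENA_OUTPUT point at a real workgroup and S3 results location, might look like:

```shell
#!/usr/bin/env bash
# run-query.sh -- minimal sketch of an Athena query runner (not production code).
set -euo pipefail

# Substitute the date placeholders in a SQL template file.
render_query() {
  local sql_file=$1 start=$2 end=$3
  sed -e "s/{{start_date}}/$start/g" -e "s/{{end_date}}/$end/g" "$sql_file"
}

# Submit a query string, poll until it finishes, print the results as JSON.
run_athena() {
  local query=$1 qid state
  qid=$(aws athena start-query-execution \
          --query-string "$query" \
          --work-group "${ATHENA_WORKGROUP:-primary}" \
          --result-configuration "OutputLocation=${ATHENA_OUTPUT}" \
          --query 'QueryExecutionId' --output text)
  while true; do
    state=$(aws athena get-query-execution --query-execution-id "$qid" \
              --query 'QueryExecution.Status.State' --output text)
    case "$state" in
      SUCCEEDED) break ;;
      FAILED|CANCELLED) echo "query $state" >&2; return 1 ;;
      *) sleep 2 ;;
    esac
  done
  aws athena get-query-results --query-execution-id "$qid" --output json
}

# Usage: ./run-query.sh queries/full_funnel.sql 2026-03-20 2026-03-27
if [ "$#" -ge 3 ]; then
  run_athena "$(render_query "$1" "$2" "$3")"
fi
```

The point isn’t the script itself; it’s that once it exists, “run the funnel query for last week” is a single command the AI can execute on its own.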
Third, a library of known-good query templates. Funnel analysis, cohort breakdowns, engagement metrics. CTE-heavy SQL files with placeholders for date ranges. These become the building blocks for both recurring reports and ad-hoc exploration. When AI can pull from a tested template instead of writing from scratch, the error rate drops significantly. Again, work with AI to build these, but invest the time to make sure they’re solid: they are important building blocks and reference material for future requests.
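As an illustration, a trimmed-down funnel template (step names and columns are invented) might look like:

```sql
-- queries/full_funnel.sql (illustrative sketch)
-- {{start_date}} / {{end_date}} are filled in by the query runner script
WITH base AS (
    SELECT user_id, event_name, platform
    FROM events
    WHERE day BETWEEN '{{start_date}}' AND '{{end_date}}'
),
steps AS (
    SELECT platform,
           count(DISTINCT CASE WHEN event_name = 'signup_started'  THEN user_id END) AS started,
           count(DISTINCT CASE WHEN event_name = 'email_verified'  THEN user_id END) AS verified,
           count(DISTINCT CASE WHEN event_name = 'signup_complete' THEN user_id END) AS completed
    FROM base
    GROUP BY platform
)
SELECT platform, started, verified, completed,
       round(100.0 * completed / started, 1) AS completion_pct
FROM steps
ORDER BY platform;
```

A template like this is easy to eyeball for correctness once, and every report built on it inherits that correctness.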
Finally, a repeatable report format. Markdown templates, a CLI charting setup (I use gnuplot), and a convention for where output lives. This is what turns a conversation into something you can share with the team. A good reusable prompt can reference a previous report output and generate something similar with new data.
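My charting step is a thin wrapper around gnuplot. A simplified sketch, with sample data and filenames invented (in practice the CSV is exported from a query run):

```shell
#!/usr/bin/env bash
# Sketch of the charting step: render a date,count CSV to a PNG with gnuplot.
set -euo pipefail

mkdir -p charts
CSV=charts/signups.csv
PNG=charts/signups.png

# Sample data; in practice this comes from a query result.
cat > "$CSV" <<EOF
2026-03-23,120
2026-03-24,134
2026-03-25,128
EOF

# Render the chart (skipped if gnuplot isn't installed).
if command -v gnuplot >/dev/null 2>&1; then
  gnuplot <<GP
set datafile separator ","
set terminal pngcairo size 800,400
set output "$PNG"
set xdata time
set timefmt "%Y-%m-%d"
set format x "%d %b"
set title "Daily signups"
plot "$CSV" using 1:2 with lines title "signups"
GP
fi
```

Because it’s all command-line, the AI can regenerate every chart in a report from fresh data without my involvement.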
None of this is complex. Here’s roughly what mine looks like:
analytics/
├── CLAUDE.md           # brief outline of structure
├── docs/tables.md      # schema reference
├── run-query.sh        # query runner script
├── queries/            # reusable SQL templates
│   ├── full_funnel.sql
│   ├── cohort_summary.sql
│   └── ...
└── reports/
    └── 20260327/       # dated report outputs
        ├── report.md   # working draft in md for easy back and forth
        ├── index.html  # final html generated from md
        └── charts/     # chart image assets
A shell script, some SQL files, a schema doc, and a folder structure. But it’s the difference between AI as a party trick and AI as a genuine part of how you work.
A Word of Caution
You still need to sanity-check what comes back. AI will confidently write you a query that joins on the wrong key, misinterprets a column name, or filters data in a way that subtly changes the answer. I’ve seen a few of these, and the wrong ones can look very plausible.
The more context you give it, the less this happens. Good documentation, a library of known-correct queries, access to the actual codebase: all of these reduce errors significantly. But it never goes to zero.
There’s also a data privacy angle. If you’re uploading CSVs or pasting raw data into an LLM, make sure you’re using an enterprise service with a no-data-retention policy, or anonymise your data before it leaves your machine. This is especially important for anything containing user-level or personally identifiable information.
Some things PMs still need: You need to know enough to spot a wrong answer. You need the intuition to look at a number and think “that doesn’t seem right.” You should understand the gist of a given SQL query. Those skills matter just as much now as they ever did, precisely because the mechanical barrier to getting an answer is now so low.
Wrap Up
I use Claude Code for this, but the same idea - an AI agent working inside your codebase with access to run queries - applies equally to tools like Cursor, Copilot, or open-source agent frameworks. The specific tool matters less than the patterns.
This approach works especially well if you’re on a lean team without dedicated BI tooling. But even with Looker or Tableau, this can be an improvement to your toolkit. The ability to go from question to answer in seconds, without waiting for a dashboard to be built or an analyst to pick up your ticket, is valuable regardless of what other tools you have.
Each tier in the progression removes friction between having a question and getting an answer. Good gets you past syntax hurdles. Better gets AI generating the queries. Best collapses the entire loop into a single conversation.
If you’re a PM who builds, this is what it looks like applied to data. Set up the infrastructure, give AI the context it needs, and you’ll spend more time on the questions that actually matter instead of writing the queries.