SQL-ing business users are the data team's biggest frenemies
Don't put me in jail. I just think it has to be said, so that perhaps a SQL-ing business user somewhere thinks twice before sending us more technical debt.
And no, it’s not because of job replacement fear.
Et tu, Bob?
I don’t know what your actual name is, but I’m just going to call all of you SQL-ing business users / techno-enthusiasts and practitioners Bobs from now on.
You Bobs are not all the same. Some Bobs are more SQL-prolific (or just more technically talented) than others, to the point that we really do love you for your ability to be a bridge for us and to fend off requests that range from laughably amusing to comically tragic. We genuinely count ourselves lucky to have you on the team.
Some of you Bobs are not that SQL-savvy, but you are learning and adapting. You love asking us questions about the hows and whys, and we gladly share the answers with you, in the hope that you too, one day, will become our biggest ally. The fewer barriers we have with the business users, the more we can increase data literacy in the organization and help everyone self-serve with better data products, or so we thought.
Some of you Bobs, however, are a bit more of a gloom-and-doom type: you don’t like telling us what you know. You just go straight to drawing your tech sword and swinging it at us at times, perhaps out of distrust. I get it, someone hurt you SQL-ly in the past, so you don’t want to trust us right away, you want to test us first.
Actually, you Bobs knowing SQL (or tech) is not a problem. The true problem is when you Bobs mistake what you know for what there is in totality.[1] The danger lies in the hidden, unexplored world of nuanced technicalities and lived trial and error that only the rare Bob dares to wander into. And since most of you don’t cross that boundary (ever), you are soothed by the illusion that your SQL-ing / technical knowledge is all there is to “doing data”, and that, if anything, you can just leverage it to make your life easier (change a report’s logic faster, create new tables as you want, etc.) without working with us, or letting us do our jobs properly (which in turn should make your lives easier too, but you are too impatient to find out).
What you are oblivious to, however, is that in the process of (allegedly) making your lives easier, you make our lives harder immediately and tangibly, and you ship that technical debt off to your future selves (without knowing!). Oh wait, no, it’s still mostly borne by us - the data team. Off you hop with your SQL bunnies and rainbows.

Side note
Actually, I was going to write on another topic, and it’s sitting in my drafts at the moment, but I decided to take an abrupt left turn and focus on this instead.
The bigger idea here is a list of things that keep a data-something person awake at night, tossing and turning, or just simply sobbing in a corner. Things like:
Creating a dashboard with sweat and tears only to hear that they changed their mind for the N-th time and decided to finally use Excel instead because … “it’s so hard to change this formula in the dashboard”[2] or “the CEO is just used to looking at Excel”.[3]
… Then those same people want to have some state-of-the-art forecasting built into the reports a few months later, because somebody saw something cool on LinkedIn and wants to jump on the cool train.
AE folks, that feeling when you open up a dbt project 3 months after the “handover” and you see the total annihilation of carefully designed models because “who cares about DAGs” and because “we need to have reports fast” and because “yes we can”.[4]
… Until those same people, a few months later, complain about spiking costs in their data warehouse and then blame the modern data stack (or whatever buzzwords they hear) for being a scam in a visceral public LinkedIn post. What’s even sadder is seeing that post get hundreds of likes.[5]
You set up a CI/CD pipeline with guardrails that prevent self-merges in GitHub - plus some famous last words like “please utilize peer review and let the code pass the tests before merging to production” - only to see the people who take over the project later on kick the table over and choose violence.
… until of course, one day somebody stays up at 2AM on a Saturday because that Friday’s merge just wreaked havoc on the reports and now the CEO is asking where the numbers are.
I mean,…
So yes, this post is a tribute to you folks - the AE’s secret enemy number one, but also, weirdly, our supposed allies. Or maybe I’m just speaking for myself here… maybe this is a me problem… too late, these words are already typed… too late to turn back now, just like the irreversible damage Bobs can create for us.
After all, the deepest betrayal is usually from the people we trust.

Ok, let’s talk specifics
Here are some very particular ways that SQL-ing business users turn out to be … well, our frenemies (or at least just mine. I don’t know about you, but I’m pretty sure one day you will understand; if you have not understood yet, then you are either not there yet, or you are simply gaslighting yourself).
They think all SQL is equal
No, not all SQL is born equal. I don’t know if this makes me sound like an SQL supremacist. I think I’m just more willing to admit that at some point in the past my SQL did make somebody sad - maybe even now too - and I’m willing to apologize for it. The more I learn, the more I realize how much potential I have for evil, and I’m aware that there are many, many SQL scripts out there waiting to be rescued from the company’s deep dark pit of SQL despair, yearning for the light.
Also, factually speaking, there is verifiable evidence that not all SQL is born equal. Sue me.
Things like:
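SELECT DISTINCT is not the same as SELECT blabla, ROW_NUMBER() OVER (PARTITION BY some_key ORDER BY something): DISTINCT only collapses rows that are identical in every selected column, while ROW_NUMBER() lets you keep exactly one row per key.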
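SELECT … LIMIT 10 doesn’t mean you are actually scanning only 10 records in a billion-row table!!! On many warehouses the engine can still read (and bill you for) far more than the 10 rows it returns.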
Querying directly from a raw data table (ok, whichever data folks allowed this Pandora’s box to be opened, it’s actually on you 1000%) is not the same as querying from a curated, cleansed, properly partitioned mart table.
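To make the first point concrete, here is a minimal sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). The table `customer_updates`, its columns, and its rows are all made up for illustration; the point is only that the two queries answer different questions.

```python
import sqlite3

# Hypothetical table and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_updates (customer_id INTEGER, email TEXT, updated_at TEXT);
    INSERT INTO customer_updates VALUES
        (1, 'bob@old.example', '2024-01-01'),
        (1, 'bob@new.example', '2024-03-01'),
        (2, 'dee@example.com', '2024-02-01'),
        (2, 'dee@example.com', '2024-02-01');
""")

# SELECT DISTINCT only collapses rows that are identical in every selected column.
distinct_rows = conn.execute(
    "SELECT DISTINCT customer_id, email, updated_at "
    "FROM customer_updates ORDER BY customer_id, updated_at"
).fetchall()

# ROW_NUMBER() keeps exactly one row per customer_id: the most recent one.
latest_rows = conn.execute("""
    SELECT customer_id, email, updated_at
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY customer_id ORDER BY updated_at DESC
               ) AS rn
        FROM customer_updates
    )
    WHERE rn = 1
    ORDER BY customer_id
""").fetchall()

print(len(distinct_rows))  # 3: customer 1 still appears twice
print(len(latest_rows))    # 2: one row per customer
```

If a report expects one row per customer, the DISTINCT version quietly double-counts customer 1, and nobody notices until the numbers disagree.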
I don’t know if explaining these differences matters any more. I have worked with enough folks to be convinced that it does not matter at all. We can try, of course, but you would get busy and forget about it the next morning, and you would continue with your old habits, because it’s always more convenient.
They think that modifying this particular thing right away in the BI layer with a new SQL script is better than going back to the existing table and fixing it there
Along with “why do we need to have this modeling tool?”, and “can’t you guys just fix it later?”
The “particular thing” is usually a metric definition that people often disagree on, or an extra column, or some manual adjustments that must be made. And business users tend to fix the leaves instead of dealing with the roots, because everybody makes excuses. Nobody likes dealing with the roots.
Sure, we can fix it later. But the point is we wouldn’t be able to tell that we have to fix it until you actually COMMUNICATE with us. It’s not like we have a built-in SQL-dar that tingles every time there is a disturbance in the untouchable BI layer.[6]

Some BI tools do have a codebase that allows us to detect changes and use version control (like Looker, Holistics?, a bit of ThoughtSpot?) but the majority are drag-and-drop tools that lure you Bobs in with rainbows and candies and leave the suffering to us, ok? WE CANNOT TELL IF YOU HIDE SOME LOGIC CHANGES IN THERE!
And about the modeling tool point: I have run into cases where, after an organization introduced dbt + Looker to empower self-serve analytics while having a “single source of truth”, the SQL-ing business folks were hell-bent on going back to Data Studio and Airtable. They created dozens of 1000-line SQL scripts (some of which repeat what we ALREADY have in dbt, and some of which modify it slightly because their departments have different ways of calculating things), which tripled their Snowflake costs within 2 months, only for their data team to come back asking for cost-control initiatives.
When I explained the purpose of the modeling tool, it was entirely lost on them Bobs. While I may be to blame for lacking the AI-appeal-black-magic to pique their interest, it’s not that they couldn’t understand the concepts, they did. But word for word: “it’s faster for us to do it this way”[7] and “we don’t really want to use dbt”.
Why? Because the dbt framework keeps things in check:
you have tests, and CI must pass
you need to have modular models
it forces people with differing views to come to the table and converse
and every Bob there just loved doing things their way without accountability to the data budget, and the data team - after opening a Pandora’s box of “yeah let them Bobs do it to be fast” (also because they were too burned out to deal with things methodically) - cannot close it anymore.[8]
They may have an odd idea about what the data team actually does
Maybe some Bobs out there think our sole job is to craft big beautiful SQL scripts all day, and so we are afraid of losing our jobs to SQL-savvy business users.
I arrived at this conclusion after observing some funny incidents in the past, where data clients said things roughly translatable to “Well, our business team can do SQL, they can handle this from now on, and off you go”.
I mean, sure, it’s a natural development, I don’t blame them for thinking this way, especially if we (data folks) don’t advocate for what we are doing or fail to show impact.
None of these concepts means anything to Bobs: governance, data quality, scaling, data literacy, soft delete vs. hard delete, CDC, access control, PII, etc. They may be using a bunch of familiar reports, blissfully unaware of the inaccuracies[9] and data risks, and soothed by the comfort of handling requests by themselves.
I’m not trying to bash Bobs, I used to be a Bob. The point is we all have a capacity for evils[10] without knowing it. It’s very convenient to think that we can just do things the way we have always done them and cast the data team aside, thinking they are unnecessary and just making up stuff to stay relevant. At some point, when facing this type of Bob, the best thing we can do is just step aside and let the learning happen in real time.
Usually, like the case above, they come back, realizing the error of their ways and asking for forgiveness.[11]
Also, data folks, should we communicate and advocate for our work better? I myself am guilty of not doing this. Sometimes I’m a bit tired and bored of explaining. But explain we must. It’s not even about protecting jobs, I just want to make sure people understand what they are throwing into the trash bin before they do. There is usually irreversible damage that comes with trashing something you do not fully understand.
I’m also pretty sure that with all the recent developments in LLMs, the Bob population has multiplied, gleefully jumping up and down at the idea of autonomously handling more complex needs without them evil data gatekeepers. I’m also pretty sure that after a few months of vibe-coding, let’s give it a year, data folks will be called in to clean up the mess, chuckling and sobbing at the same time.[12]
What should the data team do, then?
Should we gatekeep SQL? Should we gatekeep technologies? Should we ban Bobs? Should we put a tariff on Bobs?[13]
No.
Actually, I don’t know, it’s really easier for me to rant about problems than to solve them. Plus, even if I had a solution, you wouldn’t listen, and you would make up some excuses too, just like evil Bobs.
Evil Bobs out there wreaking havoc on your neat data project? It is YOUR FAULT. Every frenemy Bob out there exists because of you!
But if you are willing to deal with your own fault, there is some hope.
I think, honestly, access control from the start of any data project is very important. Once you open the Pandora’s box of free access to write any SQL script one’s heart desires, you should banish yourself to CEO land and try to convince them to shut the whole thing down.[14] Good luck. This is often not the preferable route; prevention is better than cleaning up the mess afterwards.
If you do have Bobs in the organization, cherish them the right way.
They are actually very helpful with things you need help with, like metrics deep dives and talks of the “is this the latest report you are using?”, “why do you calculate it this way?”, “where can I find this manual mapping that I secretly know you are behind, but I want to hear your confession”, “do you know that you are actually duplicating data?” type. But don’t believe for a moment that opening the SQL floodgates for Bob means you are safe. No, you are engaging in self-harm. STOP RIGHT NOW.
Involve Bob in all things that require business domain knowledge, but don’t involve Bob in decision-making on data security or quality or governance or whatever data buzzwords. It’s your job. Do your job. Leave Bob alone to do his job too. Stop abusing Bob.
If Bob wants to transition more into the ‘upstream’ realm (“Can I query from the raw tables”, “I need to prepare this report for the CEO fast”, “Somebody miscalculated this metric, I need to fix this now”), there must be conditions and guardrails. You must not, under any circumstances, let Bob become a Deedee. No. Be firm.
Nobody should query from raw tables
Yes, you can fix the metric, but we will incorporate this into the model asap. Here is our metrics catalog, which requires consensus among stakeholders; can you initiate the change request in our Jira board? (Even better, because I know how lazy business users can be, turn your Slack channel into an automated Change Request process.)
This report can absolutely use this existing model, we don’t need to create another table for it. If it’s missing a column, you can temporarily add it in the BI layer, and we will refactor it back in our Refactoring cycle.
Sorry Bob, as much as we appreciate you, no, you cannot write things into the data mart. It’s not in line with the data access matrix we agreed on and it will make things hard to maintain (point to wiki).
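The boundaries above can be written down as an explicit, deny-by-default access matrix. The sketch below is only an illustration under my own assumptions: the role names, the layer names, and the `is_allowed` helper are hypothetical, and in practice this would live in your warehouse's grants rather than in a Python dict.

```python
from enum import Enum


class Layer(Enum):
    RAW = "raw"
    INTERMEDIATE = "intermediate"
    MART = "mart"
    BI_LAYER = "bi_layer"


# Hypothetical access matrix: role -> layer -> allowed actions.
ACCESS_MATRIX = {
    # The data team can read and write every layer.
    "data_team": {layer: {"read", "write"} for layer in Layer},
    # Business users read from marts and get some room in the BI layer,
    # which the data team reviews and refactors later.
    "business_user": {
        Layer.MART: {"read"},
        Layer.BI_LAYER: {"read", "write"},
    },
}


def is_allowed(role: str, layer: Layer, action: str) -> bool:
    """Return True only if the matrix explicitly grants the action."""
    return action in ACCESS_MATRIX.get(role, {}).get(layer, set())


print(is_allowed("business_user", Layer.RAW, "read"))    # False
print(is_allowed("business_user", Layer.MART, "read"))   # True
print(is_allowed("business_user", Layer.MART, "write"))  # False
print(is_allowed("data_team", Layer.RAW, "write"))       # True
```

Anything not listed is refused, so a new role or a new layer starts with no access until somebody decides otherwise on purpose.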
To do all of the boundary talk above, you must also hold yourself accountable to an SLA and try to meet business users halfway:
We complete priority-1 change requests within 1 day.
We refactor the codebase weekly, which takes 2 days to complete, and we will always inform business users when their requested change is done.
We will revisit new logic introduced into the BI layer monthly and hold a brief meeting with stakeholders to discuss necessity and rule out obsolete things.
Things are documented into this wiki for everyone to access. The wiki is also a great place to boost data literacy.
When there are bottlenecks or performance wins, share them with the larger org (don’t just keep them within the data team).
Perhaps hold some data talks in the org for those interested. Fewer evil Bobs, more ally Bobs will form.
Good luck to us
I mean, if we do all of this correctly, I guess it will bear fruit. We would be left to do our job in peace without sobbing in a corner once in a while. Bobs out there appreciating our existence. Our words mean things now, and we don’t have to struggle against the current all the time, trying to prove why Bobs cannot query from raw data.
Wouldn’t that be nice.
[1] I don’t think data folks know the totality of the data realm either. Genuinely, none of us knows everything there is to know. Each of us knows some of the things there are to know, and because we are often on the front line of blame, we are generally more cautious in experimenting and have developed some discipline around this over time.
[2] Code word for “I don’t want to learn whatever this is that you showed me”
[3] Is it true? World CEOs, can we have a poll on this? I mean, why the hell do we go all the way with all these BI developments if you folks just like staring at Excel all day? Huh? Is everything just a venture capital scam?
[4] Be honest, data folks, have you ever regretted empowering business users with SQL skills? I don’t want to hear your self-censored opinion. Spare meeeeee.
[5] Any tool is just a tool: you use it right, it gives benefits; you use it wrong because you don’t bother to learn to use it properly? It’s a you problem. A person can use a knife to cut food to eat or to stab somebody. It’s their choice.
[6] Maybe this is a good niche for product development. Also, maybe some of you rosy people would say “you should communicate with business users to catch up with them logic changes”. We do, dear friends, we do, but it doesn’t work all the time.
[7] I would argue it’s not faster, it’s an illusion, and an excuse you make, but ok.
[8] I beg you, data folks: if you open Pandora’s boxes, you have only yourself to blame. Set your mind straight, and safeguard your strategy with proper access controls and some backbone. It’s you who has to answer to the CFO about your out-of-control data spending, not them Bobs.
[9] The number of times when, after a migration project, I had to explain to business users that ‘unfortunately some of the previous numbers you were using in that Excel report weren’t that correct’… I know, I’m surprised too. After all, I’m not business-domain savvy, and I had to double and triple check myself and almost wanted to just leave it be, but I couldn’t. It’s a funny phenomenon: a familiar thing gives comfort and an illusion of accuracy, but it’s not accurate, and when we discover that, it’s another uphill battle to prove it.
[10] Data evils, life evils, many evils
[11] Not really. They come back, ask us to firefight, and bid us goodbye again, only to make the same mistakes in the near future. Data folks can fix pipelines more easily than they can correct a mindset. So, if you are one of them CEOs who complain on LinkedIn about data platform costs, look in the mirror.
[12] Saying “I told you so” is only satisfying when you personally don’t have to clean up the mess.
[13] This gives me some evil ideas, like recruiting non-SQL Bobs, deporting SQL-ing Bobs to NoSQL land, and punishing SQL-ing Bobs with mandatory data trainings.
[14] Ok, I sound a bit undemocratic. What I mean is: DE/AE are the only people who should access raw data; there should be a PII policy in place and implemented; business users, whether Bob or no Bob, must only query from the data mart, not your data lake, not your raw data, not your intermediate models; and they definitely should have some room for autonomy in the semantic layer, but that should be monitored and refactored by the data team where necessary. This RACI had better be communicated and documented somewhere in the organization for everybody’s sake… ideally.