“We are a data-driven company.”
I hear you groaning at your screen already. Let’s talk about what this phrase means these days, what we as data scientists and analysts really want it to mean, and how to bridge that gap. Read this before you respond to your next ad-hoc request.
It’s been sexy in the last couple years to be a self-proclaimed “data-driven” org. And by this, organizations typically mean: we make our decisions using data. Now don’t get me wrong — it’s a great idea to support decision-making processes with data. But there’s something insidious in this statement: it doesn’t recognize the hard work of data scientists/analysts. The focus is on the data, not the analyses, which is to say:
We have so much data, we just need to see it and use it to make decisions. Analysts and data scientists just need to get the data for us.
Data-driven orgs: analysts fetch data.
Analytics-driven orgs: analysts find answers.
Data-driven orgs: hire more and more analysts/DS as “hands”.
Analytics-driven orgs: invest in infrastructure, tooling, and education.
Data-driven: “Can you pull these numbers…?”
Analytics-driven: “Can you help me think about…?”
Data-driven: find data that justifies management decisions.
Analytics-driven: generate insights that tell a compelling story management can act on.
I’d say it’s about time for organizations to make the shift from data-driven to analytics-driven.
Unfortunately, there’s no silver bullet here — successfully becoming “analytics-driven” will hinge on how your organization views its analytics org. Are they thought partners or subordinates? SQL monkeys or scientists?
At the end of the day, stakeholders need to get in the habit of asking analytics business-level questions, not just sending over data requests.
[For a great read on this subject, check out Pedram Navid’s article, Building the Modern Data Stack.]
That said, short of having a long talk with cross-functional leadership, I’ve found that a few procedural changes can help support this shift:
While these won’t directly change corporate culture, they’ll enable you to reduce the load of your ad-hoc requests by making your past work searchable and, dare I say it, self-serviceable.
As a data scientist or analyst, you are the person most familiar with data, so naturally data-related requests will come to you. My default response has always been to drop a query/dashboard on stakeholders ASAP, which I’d promptly forget about. But then I’d have to repeat the work the next time I was asked. Plus, this sort of ad-hoc response just reinforces the perception that your relationship with decision-makers is purely transactional.
Instead, write up a doc. Start with the question:
“What question are you trying to answer?”
Then, do your work masterfully and reproducibly — working with data is a science, after all, and should be treated as one.
This’ll have two benefits:
Documenting your work is a fantastic first step, but your analysis is only as good as the data you use to make it. If you want to have confidence in your analytics work (and if you want others to have confidence in it as well), you should deeply understand and document the tables and transformations used in your final analysis. There are two parts to this:
Now that I’ve convinced you to document things, there’s a glaring question left:
Where do we put these docs?
You’ll need to consolidate all this documentation somewhere. I’ve worked with/at companies that have used Confluence, Notion, or even GitHub to manage this, which can work in a pinch. At Airbnb, we used Knowledge Repo in conjunction with Dataportal to store these as git-tracked publications, but this always felt a bit heavy for ad-hoc work.
Hyperquery works really well for organizing work in this way — Hyperquery combines a Notion-/Confluence-style note-taking environment with executable query blocks, so you can start your SQL work there, share it easily with the rest of your org, and later, use your work as a starting point to build up a full-fledged set of docs.
But at the end of the day, use tools that work for you. The important thing is that your work needs to be visible, accessible, and searchable by the rest of your team and your company.
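If you want a feel for what "searchable" can mean before committing to any tool, here is a minimal Python sketch. Everything in it is hypothetical and for illustration only: the `AnalysisDoc` fields, the `search` helper, the table names, and the sample questions are assumptions, not part of any tool mentioned above. The idea is simply that each write-up leads with the business question and keeps the query and tables next to the findings, so a plain keyword search can surface past work.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisDoc:
    """One write-up: the business question first, then the work behind it."""
    question: str
    query: str
    findings: str
    tables: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Return a plain-text doc that leads with the question."""
        return "\n".join([
            f"Question: {self.question}",
            f"Tables used: {', '.join(self.tables)}",
            "Query:",
            self.query,
            f"Findings: {self.findings}",
        ])


def search(docs: list[AnalysisDoc], term: str) -> list[AnalysisDoc]:
    """Case-insensitive substring match over the rendered text of each doc."""
    needle = term.lower()
    return [doc for doc in docs if needle in doc.render().lower()]


# Hypothetical write-ups; the questions, queries, and tables are made up.
docs = [
    AnalysisDoc(
        question="Why did weekly signups dip in March?",
        query="SELECT week, COUNT(*) FROM signups GROUP BY week",
        findings="Placeholder finding for illustration only.",
        tables=["signups"],
    ),
    AnalysisDoc(
        question="Which plan has the highest churn?",
        query="SELECT plan, AVG(churned) FROM subscriptions GROUP BY plan",
        findings="Placeholder finding for illustration only.",
        tables=["subscriptions"],
    ),
]

matches = search(docs, "churn")
print([doc.question for doc in matches])
```

A real setup would lean on whatever search your docs tool already provides. The part worth copying is the shape of the record: question, tables, query, and findings kept together in one place.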
Analytics organizations don’t exist to write SQL. We are here to problem solve. While it’s going to be a herculean feat to try to force stakeholders to view it this way, let’s at least document our data work so the problem-solving aspects of our work are more visible. Let’s show the world why analytics-driven orgs can be so powerful! 🙌
Tweet @imrobertyi / @hyperquery to say hi.👋
To learn more about hyperquery, visit hyperquery.ai.