Most well-established data teams have a clear remit and a well-defined structure for what they work on and when: from the scope of each role (engineer through to analyst) to which part of the business they work with.
At incident.io, we have a 2-person data team (soon to be 3), with both of us being Product Analysts. Being in a team this small has both pros and cons:
Pro: unlimited scope - we get to work across every area of the business, on everything from building a pipeline through to deep dive analytics
Con: unlimited scope - there are 1,000 things we could do, and only 2 of us to do them all!
Broadly, work in a data team boils down to 5 core areas:
🛠 Infrastructure work - get the right data to the right people at the right time. This is the “plumbing” part of the job and arguably the most crucial
📊 Core dashboards - the absolute basics of analytics, making sure that you have everything set up to monitor the health of each area of the business
📏 Measuring success - ensuring projects have clear success criteria (e.g. did the experiment work, or are customers using our new feature?)
🤿 Deep dives - picking interesting things to go deep into - such as your sign up funnel or how customers use a particular feature - and suggesting concrete actions off the back of them
👨‍🚒 Ad-hoc - handling the unplanned work that will be asked of any data team!
This blog post will go into each of these 5 areas, what good and bad look like, and how we try to strike the right balance here at incident.io.
🛠 Infrastructure work
Getting the right data in front of the right people at the right time
The plumbing part of the data role, typically split into data ingestion (data engineering), processing (analytics engineering), and getting the data into your Business Intelligence (BI) tool of choice (BI / analytics).
Tools like dbt, Fivetran, and Stitch make the first couple of steps of ingestion and processing simple when you don’t yet have dedicated data engineers. We wrote a separate blog post on how we recently updated our data setup.
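As a rough illustration of the processing step (all table and column names here are invented for the example), a dbt staging model is often just a short SQL file that tidies up a raw table loaded by Fivetran or Stitch, so every downstream model and dashboard reads from the same clean version:

```sql
-- models/staging/stg_incidents.sql (illustrative sketch - names are invented)
-- Tidies up the raw table loaded by the ingestion tool so that every
-- downstream model and dashboard reads from one consistent version.
select
    id                      as incident_id,
    organisation_id,
    lower(status)           as status,      -- normalise casing in one place
    created_at::timestamp   as created_at,
    resolved_at::timestamp  as resolved_at
from {{ source('product', 'incidents') }}
where _fivetran_deleted is not true         -- exclude rows soft-deleted at source
```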
The key word for the infrastructure part of a data team’s responsibilities is trust.
What good looks like: your numbers and your data pipeline are both reliable - people shouldn’t notice nor think about it (it should just work)
What too little looks like: wrong numbers, historic data changing, and consistently unreliable data feeds all devalue your data “brand”. People will start going to source systems and hacking together their own metrics if they can’t rely on your data
What too much looks like: excessive tweaking of something that already works - a pattern this great blog post calls “snacking” - with little focus on anything else
How do you balance it?
This is tough for an early stage company - obsess too much over your pipeline and incremental improvements and you’ll lose sight of the impactful analytics work that can really drive the direction of your company.
Some rough rules of thumb to spot when your core data infrastructure needs a bit of attention:
You’re doing “one-off” data extracts from the same data a lot more often than “one-off” suggests! A great principle from Monzo was “slow down to speed up” - taking the time to build out a reusable data model (with proper dbt tests - see the sketch after this list) pays dividends further down the line
Everything is done with sub-optimal tools. The odd Google sheet for some quick number crunching is harmless, but using one to update your company’s weekly metrics is not. Anything that needs to be reliable and is (frequently) updated manually will always end in tears!
You’re frequently explaining to people outside the data team why the data looks wrong
You’re seeing consistent test failures or general pipeline failures - and they’re being ignored
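To make the “proper dbt tests” point from above concrete, here’s a minimal sketch of a dbt singular test: a SQL file under tests/ that should return zero rows, and fails `dbt test` (and therefore your pipeline run) if it doesn’t. The model and column names are invented for the example:

```sql
-- tests/assert_no_incidents_without_organisation.sql (illustrative sketch)
-- dbt treats any rows returned by this query as failures, so a broken feed
-- gets flagged before anyone sees a wrong number in a dashboard.
select incident_id
from {{ ref('stg_incidents') }}
where organisation_id is null
```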
📊 Core dashboards
Shining a light on how your business is doing in each area
The bread and butter of any analytics function, these dashboards are the ones that founders, investors, and team leads should be checking on a regular basis to monitor the health of the areas they care about.
At incident.io, we use Metabase for all of our dashboarding needs - and have a separate folder called (unsurprisingly) “Core dashboards” which just contains a dashboard for each area of the business that needs it.
What good looks like: the clichéd “single source of the truth” is available for any area of the business, it’s checked regularly, used in recurring team meetings, and you get the odd request here and there to make edits
What too little looks like: every team coming up with their own set of metrics and versions of pulling the data - especially if they’re manual and error-prone as a result. Or, low adoption of the dashboards you’ve built
What too much looks like: your data team becoming a core BI team, either because the lack of a good pipeline makes data feeds and edits a pain (see “infrastructure work”), or because your team is too focused on high-level analysis
How do you balance it?
These types of dashboards typically involve more upfront work - and (if you have your infrastructure set up right) little in terms of maintenance. To get the balance right:
Work with your audience, be it founders, team leads, or otherwise. What questions do they need answering week in, week out? Do they have the tools to do it at the moment, or do they have to hack it all together? This should give you a steer on demand
Get the questions answered, then move on - dashboards can be endlessly tweaked, so get them to a state where your users can answer all of their questions (reliably) and move on to whatever is next
Change as you go along - once a dashboard is up and running, further changes don’t need to become entire projects with a whole requirements gathering process - don’t bog yourself down when a half an hour fix will have the same outcome
Core dashboards are a great way of building your brand as a data team and getting people into the habit of using the things you build, so these should always be top of your to-do list as a new team.
📏 Measuring success
Is/was this worth doing?
In our (highly unbiased) opinion, almost everything you build or do as an early stage company should be data driven. The term “data driven” is thrown around a lot without meaningful examples, so to give this some colour:
Figuring out if it’s worth doing in the first place: if we want to spend time improving a product feature - do our customers already use it enough to justify the work? If not, do we think improving it will materially increase adoption? Some of this is subjective, but having solid analysis as a starting point should be the minimum before making a decision
Figuring out how we’ll measure success: every change or new feature will have an aim (e.g. “building a Zendesk integration will bring more customers to our product”). With our data hats on, our job is to turn these aims into things that can be measured (e.g. “% of new customers that use our Zendesk integration should increase by X%”) - the SQL sketch after this list shows what that can look like
Putting the tools in place to measure success: typically building out a dashboard to measure the metrics we set out above, and encouraging regular check-ins. In Metabase, we have a clean folder structure with one folder per project, and a subfolder for “Before launch analytics” and “Post launch monitoring” in each
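As a hedged example of what one of those success metrics can look like in practice (the table and column names below are invented, and the real logic depends on how your warehouse is modelled), the “% of new customers that use our Zendesk integration” aim boils down to a single number that a dashboard card can point at:

```sql
-- Illustrative sketch only: table/column names are invented for the example.
-- "% of new customers that have used the Zendesk integration", as one number
-- a monitoring dashboard can track against the target set before launch.
with new_customers as (
    select organisation_id
    from dim_organisations
    where created_at >= date_trunc('month', current_date)
),

zendesk_adopters as (
    select distinct organisation_id
    from fct_integration_events
    where integration = 'zendesk'
)

select
    round(
        100.0 * count(z.organisation_id)
              / nullif(count(n.organisation_id), 0),
        1
    ) as pct_new_customers_using_zendesk
from new_customers n
left join zendesk_adopters z using (organisation_id)
```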
Taking a data driven approach to launching & monitoring shouldn’t ever impact the speed at which you build things, especially when you’re early stage. At the very least, you should be building out the tools to measure success even if it’s during/post launch.
What good looks like: a consistent framework for measuring success is applied to every major project, and a scheduled set of check-ins after it’s done to see if it’s gone to plan. Everyone knows where to go to see how any new launch is going
What too little looks like: no one knows if the things you’re working on have been worth the effort - save for some customer anecdotes. Ad-hoc queries being run post completion to look at performance instead of something centralised and repeatable
What too much looks like: much like the core dashboard point above, if you are just churning out monitoring dashboards week in week out - take it as a sign that you either need more data people, or you need to shift / reduce your workload
How do you balance it?
If there are too many projects for the data team to monitor comfortably without losing balance, there are a couple of options:
Same number of people, less work - ruthless prioritisation by either slimming down the process, or having a higher bar for which projects should be measured in-depth
More people, same work - get more people involved in measuring success. This is effectively solving the problem by pointing more people at it, which isn’t always the best approach. If the process to measure success feels arduous without providing valuable insight, it needs rethinking
Our engineers are now actively involved in the process of measuring success, right through to building monitoring dashboards. To help this process scale, we’ve written a playbook as a “how-to” guide for measuring success and we’re excited to see how this goes (it may even end up as a blog post 👀)!
🤿 Deep dives
Getting into the weeds of your data
Lots of data work is driven by inbound requests, or dictated by the projects that you’re working on. It’s crucial to take a step back to look at the things you aren’t being asked to look at - and see if there are any potential goldmines of information no one has taken the time to investigate yet.
The critical point here is to focus on the “so what?” of any analysis you do. Not every analysis needs to have action points, but the difference between a good and a great analysis is often the extra 10% put into making the takeaway actions super clear for the reader.
What good looks like: regular analysis with clear action points (if relevant) that’s aligned with the main aims of your company at whatever stage it may be
What too little looks like: a data team purely focused on delivery or analysis relating to project work
What too much looks like: waves of analysis and an information overload for anyone outside the team
How do you balance it?
Keep regular conversations going with your founders and team leads, and keep your eye on what’s most important to them - and their concerns. These are great sources of inspiration for a deep dive analysis
Make insights a habit. Recently, we introduced “insight of the week”, where we each produce a single chart to demo in our weekly all-hands. This keeps us accountable to get something done every week, and makes us take a step back from our day-to-day to pick something new each time.
A couple of weeks ago we looked at what % of people complete our tutorial incident - where you need to rescue a lost cat - and who left the poor cat stuck in the tree 🐈🌳. Whilst it’s a bit of fun, it was a quick analysis that gave a good level of insight into how our customers are using our product
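For a sense of how lightweight these can be, an “insight of the week” like that one is often a single short query (again, the table and column names here are invented for the example):

```sql
-- Illustrative sketch only: table/column names are invented for the example.
-- What share of organisations that started the tutorial incident actually
-- finished it (i.e. rescued the cat), versus leaving it stuck in the tree?
select
    round(
        100.0 * sum(case when tutorial_completed_at is not null then 1 else 0 end)
              / nullif(count(*), 0),
        1
    ) as pct_orgs_that_rescued_the_cat
from dim_organisations
where tutorial_started_at is not null
```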
👨‍🚒 Ad-hoc
The data firefighter
We’ve all had those days when the pipeline seems to be on fire - or when it feels like you’re fielding 100 requests for all things data.
This is one of the more dangerous things to be “good at” handling in an early stage company, as you can end up with individuals who 1️⃣ jump on every single ad-hoc problem and 2️⃣ feel productive doing so regularly.
As Isaac wrote in a previous blog post about the risks of being a “hero engineer”, being a “data hero” can patch over the cracks of an unreliable pipeline setup, or miss opportunities to make repeatable data models/dashboards in favour of hacking solutions together.
A healthy amount of inbound demand for data work is usually a good sign that you have a reputation as a reliable data team, and having to dive into the occasional random failure isn’t a dealbreaker - it’s all about moderation!
What good looks like: people feel comfortable coming to the data team for help on anything, the data team isn’t feeling overwhelmed, and ad-hoc requests are spread fairly evenly across the team
What too little looks like: no inbound for the team, which means people aren’t leaning on your team enough for help
What too much looks like: 50+% of the data team’s time is spent fielding inbound questions and fixing things that have broken. General low energy in the team and feeling burnt out
How do you balance it?
Check in regularly, not just on the things you’ve got done. Each week we do a “temperature check” which is a way for us to have a few minutes to take stock - if you’re feeling burned out or a bit overstretched, it’s a good time to get it off your chest
Track larger inbound requests with tickets, so at the end of each week you can sense-check how much ad-hoc work you’re doing - and whether it feels sustainable. Again, a typical outcome of lots of ad-hoc work is investment in your infrastructure
Keep yourselves accountable - each week we share our plan in slack at the start of the week, and update what we did / didn’t do at the end of the week. So, if we don’t get something planned done - there’s no hiding it! This is intentional to make sure we’re not over committing to work we can’t deliver, and it’s a useful way of checking how hectic things are
Pick someone to get interrupted often and rotate this job each week, rather than having your whole team interrupted infrequently. You’ll only need to do this when your team grows to a reasonable level of ad-hoc work, but it has a couple of good benefits: it allows everyone else to focus, and it gets the person on duty exposure to things outside their normal remit
Bringing it all together
When you have 1,000 things you could work on, it’s tough to feel like you’re always focused on the right things as an early stage data team - and we by no means have everything set up perfectly! But there are learnings that can be applied no matter the size of the company:
“Slow down to speed up” - making common data models repeatable, well documented, and well tested will always pay dividends
Don’t underestimate the impact of a faulty pipeline or incorrect numbers, on a consistent basis, on your brand as a team. Data is “iceberg” work where the bit that’s visible at the top is the primary thing your team will be judged on - no matter how complex everything is below the surface
Step back and think about what things people aren’t looking at - is there a quick analysis you could do to see if it’s worth a deep dive?
Make sure every analysis has clear action points, where applicable. Don’t make the reader do the mental legwork to get to the point you’re trying to make
Plan often, share what you’re going to do, and be honest if you don’t get it done - it’ll give you a measure of your effort on planned vs. unplanned work, keep you accountable as a team, and give good visibility to the work you’re doing
Don’t over-index on being a “data hero” if you’re a small team; otherwise, the expertise for fixing things will be concentrated in one person - and you may be missing out on opportunities to make things more scalable
Keep track of your ad-hoc work via tickets if you’re starting to feel overwhelmed - it’ll give you a good case to hire more people if you have evidence that there’s little room for anything that’ll move the company forward
Most important of all, don’t forget to take a step back! Early stage companies change rapidly, and so should your team’s focus. Don’t forget to enjoy the perks of being somewhere early stage - variety, speed, and the potential for outsized impact are all huge benefits of being in an early stage data team; you just need to get the balance right.