Behold the Power of Small Data

Your most important data is Small Data. Let’s learn how to use it.

Small Data

A lot of data best practices come from big data use cases. As a result, a lot of data-driven talk focuses on fancy stuff: statistical significance, collection methods, data integrity, data lakes, OLAP cubes, BI systems, etc.

However, if you’re reading this, it’s extremely likely that your most important data is Small Data.

Small Data: a volume of data small enough that you can populate and manipulate the data manually.

For most people, Small Data is the data that matters most. It often includes recruiting candidates, employees, major incidents, and deals.

Small Data Is Very Different Than Big Data

Small Data is very different than big data. With Small Data you can be far more self-sufficient and scrappy. However, many people miss that opportunity to be nimble. The biggest mistake people make when they start becoming data-driven is treating Small Data like it’s big data.

Examples of people treating Small Data like big data:

  • Optimizing for cleanliness instead of time-to-value
  • Focusing on formatting and presentation niceties
  • Not using a spreadsheet
  • Relying on anyone but yourself to own and operate the data system
  • Not backfilling data

All of these are inefficiencies, and they are often what prevents you from ever becoming data-driven in a given area.

How to Operationalize Small Data

Operationalizing Small Data should be fast. Here’s how to get started:

  • Make a hypothesis. Good hypotheses include things like: knowing our close rate in recruiting would be useful, knowing the day of the week that incidents happen could give insights into flaws in our release process, knowing the regrettable attrition rate for employees by tenure could help us better analyze our employee development.
  • Create a spreadsheet (Google Sheets or Excel)
  • Start populating the minimal amount of data you need. Yes, you are the data entry specialist today. You might need to backfill a couple hundred of your last recruiting offers. Get some coffee and do it.
  • Sometimes you don’t have all the data you might want (e.g. was that incident on a Thursday?). If you can be more than 95% accurate, guess. If you can’t guess accurately, try to turn the field into something more blunt (e.g. start of week/end of week vs day of week).
  • See if you learn something from the data.
  • Take action to improve things based on what you learned, e.g. “I learned we’re twice as likely to close candidates if they get a demo. Let’s start adding a demo to every interview.”
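
If you'd rather script the "see if you learn something" step than build a pivot table, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the column names (got_demo, accepted), the six made-up rows, the helper names close_rate_by and week_part, and the Monday-to-Wednesday cutoff for "start of week". It stands in for a CSV export of whatever spreadsheet you filled in by hand.

```python
import csv
import io
from collections import defaultdict
from datetime import date

# Hypothetical CSV export of a hand-filled spreadsheet.
# Column names and rows are invented for this example.
OFFERS_CSV = """candidate,got_demo,accepted
A,yes,yes
B,yes,yes
C,yes,no
D,no,no
E,no,yes
F,no,no
"""

def close_rate_by(rows, group_col, outcome_col="accepted"):
    """Return {group value: fraction of rows where outcome_col is 'yes'}."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for row in rows:
        key = row[group_col]
        totals[key] += 1
        if row[outcome_col].strip().lower() == "yes":
            wins[key] += 1
    return {key: wins[key] / totals[key] for key in totals}

def week_part(d):
    """Blunt version of day-of-week: Mon-Wed is 'start', Thu-Sun is 'end'."""
    return "start of week" if d.weekday() < 3 else "end of week"

rows = list(csv.DictReader(io.StringIO(OFFERS_CSV)))
rates = close_rate_by(rows, "got_demo")
for group, rate in sorted(rates.items()):
    print(f"got_demo={group}: {rate:.0%} close rate")

# When you only roughly remember when an incident happened,
# record the blunt bucket instead of the exact day.
print(week_part(date(2024, 1, 4)))  # 2024-01-04 was a Thursday
```

With six invented rows the numbers themselves mean nothing. The point is that the whole analysis is a group-and-count, which a spreadsheet pivot table does just as well.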

Repeat this for all new hypotheses. If a particular metric proves good and durable, create a recurring meeting to check and analyze it. If a particular data set proves extremely rich in insights, start a project to have a fancier system support it in more ways. If any of the metrics derived from the data are useless, remove them. When reviewing data insights, inert metrics are bad metrics.

The most important aspects of this system are twofold:

  • You’re minimizing time-to-value
  • You’re looking for big learnings, not subtle signal

Minimizing time-to-value makes sure you’ve got a feedback cycle that will let you learn fast. Looking for big learnings gives you the ability to be fuzzy with the details and to push through any data quality nuance. This is OK because with Small Data you’re looking for big signal. And in a growth business in general, you should be trying to uncover big signal to take decisive actions. You need to be learning big and acting big at a growth company to continue to grow.

Final Thought

Following the approach above will often surprise you with how quickly you can glean valuable insights. Please stop prematurely optimizing small data sets. Open up a spreadsheet and learn something.