Tech Recipe: Building the dashboard for your business engine
Editor’s note: Recipes are a new format we are testing where we outline how to use software to build new solutions or solve old problems. We need your feedback! Reply to the email or post a comment. Let us know what you think of this format. If there’s any specific recipes we should create or ideas for how to improve this, we’d like to hear it.
Background:
Utilizing data effectively is a challenging task because most data stays within the app that created the data.
This leads to multiple silos of data making where each new app or function creates yet another silo. Modern business leaders who want to understand how a function, team, or even the overall business is performing need to get this data out.
We’ve talked about building your business engine. Much like the internal combustion engine, your business engine needs a dashboard. To date most companies are left manually extracting data from the different pieces of software and using Excel to create that dashboard.
Using Excel is better than nothing at all, but there are significant limitations. The biggest one being scale. The more business software (or data sources) you add to your engine, the harder managing this spreadsheet becomes.
Similarly, different people may want to see specific reports, they may not need to see all of the underlying data, and most importantly they may want to further customize or change them for their team’s use.
Software companies solved the scale and access problem by building extract, load, and transform data pipelines. ETL for short.
“Extract” This portion can be the most time consuming. The data pipeline software must be able to access your business software, find the relevant data, and send it to the data warehouse.
“Transform” Data by itself is meaningless. It’s a collection of numbers or letters. In order to make sense of data, especially data from multiple systems, you have to transform it. This step is when you define what each column of data is and how it relates to the other columns of data, if it relates to the other data at all.
“Load” Now the data is ready to be used. In complex workflows this could mean sending it another storage system, in this example we are loading the data into a visualization tool.
Technology companies typically build custom software that provides them end-to-end visibility into the business which empowers any team to access the specific data they need
In this recipe we will build an ETL pipeline, no software engineering required.
Shopping list:
Needs:
Data Extraction
Data Storage
Visualization tool
The TAM for ETL is over $10B and growing. With any large software market there’s bound to be technical marketing language designed to make things more complicated than they are. Don’t buy shallots when onions will do. Many companies offer all-in-one solutions, while others are purely focused on one function in the pipeline.
As you shop keep in mind that you need a few key functions:
Extract data from software you use
Transform and store the data in a secure location
Load the data into the visualization tool of your choice
Cooking instructions:
Sign into your chosen data pipeline tool and one by one connect to the “sources”
The right tool for you will be compatible with all, or most, of the software you want to pull data out of
A good tool will let you preview the connection so you can be sure the connection is set up right
Modern SaaS companies will give you access to all of your data, but you may not need all of it. Take time to make sure that you’re extracting the data you care about from each source
With all of your sources configured, it’s time to send the data to your storage destination, also known as a warehouse.
Most data pipeline software will be compatible with all data storage vendors, but it’s always good to check. Setting this part up should be a matter of clicks.
If you already use Google Workspace or Microsoft Office 360, we recommend using Google’s BigQuery or Microsoft Azure Synapse Analytics. Keeping the data in the same cloud partner will minimize cost and other issues.
The warehouse will put data from each source into its own “table”
In other words if you have pipelines from 3 pieces of software, you should have 3 tables in your warehouse
Within the warehouse you can create your “data model.” This just means defining the relationship between data from different sources.
This step is crucially important. Spend the time to go through and set this up correctly once and you won’t have issues down the line. The data warehouse software will automatically do the necessary sorting and transforming with all new incoming data.
For example invoices in Quickbooks will have customer information, sales amounts, and item information. The Quickbooks customer information might use the complete business name with the entity type, e.g. “Alphabet Inc,” whereas in your CRM you call them “Alphabet.”
Your data warehouse will allow you to do any transformations and relations, ideally through a user interface instead of in a programming language or in SQL.
The data warehouse will run all of the transformations you created in your data model. This process turns your raw, extracted data into processed data. The warehouse stores this processed data separately from the raw input data.
Within the warehouse tool you can specify which visualization tool you want to make this data available to.
Now switch over to your data visualization tool. You want to connect it back to the warehouse and find the new table of processed data that it created.
Most visualization tools have templates for charts, graphs, tables etc. The interface will let you pick the columns you want to connect to that visualization.
Once you set up the dashboard, the data will automatically refresh as often as the pipeline tool runs.
Happy cooking!