When an organization has data questions, those questions can sometimes be answered by writing reports directly over the systems where the data was entered—whether that’s an EHR (Electronic Health Records) or accounting system. Often, this approach works just fine for financial statements and basic operational reporting. (Indeed, I’ve written before (back in 2011) about how to use the general ledger as a simple datamart/data warehouse. And I’ve run many projects using this method.)
However, this approach breaks down as data becomes more complex, such as when you need data from multiple systems to answer your questions. Unfortunately for my social service agency clients, this is almost always the case.
Most social service agencies need to combine data from a variety of systems (e.g. EHRs, Medicaid billing, local and state agency systems and their accounting system) to get answers to their data questions.
In this kind of situation, we need to build a data mart. A data mart is essentially a central place where all your data is collected and organized before you create your reports, dashboards and visualizations.
(I’ll describe the difference between a data mart and a data warehouse in a future post.)
Using a data mart comes with three main advantages:
Advantage #1: Consistency (i.e. a single source of truth)
When a developer has to create logic from scratch for each report he/she creates, problems almost always follow. These problems usually involve issues with logic and naming.
When you’re working with multiple systems, it’s not always obvious how to combine data to get accurate answers to your questions. Sample problems can include voided records, incomplete records, and records without expected matches in different systems.
Wouldn’t it be nice if all your systems used the same terminology? Alas, that doesn’t often happen. For example, in a New York State Health Homes project, basic client identification was called “CIN,” “MemberID” and “Policy number” in different systems. Needless to say, this can be confusing for report developers.
With a data mart, you only have to solve these problems once. After that, you can create all the dashboards, reports, and visualizations you need and be confident that every report is using same calculations and naming conventions.
Advantage #2: Performance
Pulling data from multiple systems in real time can be quite inefficient. With a data mart, you bring together all the data first (often overnight), before running your reports, dashboards and visuals. As a result, they run faster and more efficiently.
Advantage #3: Labor Leverage
Labor is the biggest expenses in any business intelligence project. Therefore, you want to make sure every person’s skills are used to best advantage.
BI projects require three kinds of skills—and these skills don’t necessarily overlap:
Getting the Data (ETL)
ETL means extract, transform, load. Before we do anything, we need to extract, transform and load the data into one place (which, for us, is usually SQL Server).
The data can come from spreadsheets, CSV files, other on-premise databases, and/or cloud solutions. Wherever it comes from, your developers need to know how to create a consistent, repeatable process that gets everything into one place easily. This requires serious development skills.
Organizing the Data (Data Modeling)
Once the data is in one place, you need to organize it. Different systems will use different names for the same things. And relations between different systems won’t always be obvious.
You need folks who can organize and “smooth out” the data so that it’s ready to go. This also requires serious development skills.
Presenting the Data (Reports, Dashboards, Visualizations)
Once built, data marts tend to change relatively slowly. But how people want to see the data within the data mart will change constantly. You can expect regular requests for new dashboards, visualizations and reports.
Fortunately, presenting the data only involves the front end of the system, and you don’t need a superstar developer to change the front end.
In fact, many super users (who don’t have the knowledge and experience needed to build the data mart), can be trained to create the front end, whether in Power BI or another package.