Big Data’s Relationship with Business Intelligence and Data Warehousing

You’ve probably heard the buzz about big data and business intelligence data warehouses. Both deal with collecting information for analysis, but how are they different? When should you use one or the other? This article explains these two data solutions in a user-friendly way with real-world examples.

It seems like you can’t pick up a technical magazine without reading about how big data is changing the world—and the untold implications of this technology. But what the heck is big data? And didn’t we already solve this thing with business intelligence and data warehousing?

Big data, or BD, is the collection of transaction-level detail for analysis. The data is kept close to the transactional detail so it can be examined for hidden trends that appear only when you analyze the individual transactions. The data can come from different sources but is analyzed in a common pool. Most often, the BD solution receives a feed (or copy) of the transactions, streamed as they occur. The value of the data is often very time-dependent: the sooner the information is available, the more valuable it is.

There are four key terms used when talking about BD:

  • The volume of records in scope is large. Millions of records per day can and do occur.
  • The velocity at which records are created is high, as BD is very granular and the data is collected in close to real time.
  • The veracity of the data, which is a fancy term for its quality, is threatened by the inaccuracies that can occur when processing high volumes of data from multiple sources. Methods are needed to screen the data quickly so that accuracy keeps pace with the volume and velocity.
  • There is a variety of data-generating devices, and as their number increases, it will become even more important to be able to interpret and consume the data from these different sources.

These four factors make using a conventional relational database management system impractical for storing and quickly analyzing BD, so new methods are being developed.
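As a rough illustration of the quick veracity screening mentioned in the list above, here is a minimal Python sketch. The Transaction fields, the sample feed, and the two checks (a missing account ID, and a missing or negative amount) are assumptions made up for this example, not rules from any particular BD product.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class Transaction:
    # Hypothetical record layout, chosen only for illustration.
    source: str                  # which device or system sent the record
    account_id: Optional[str]
    amount: Optional[float]


def screen(records: Iterable[Transaction]) -> Iterator[Transaction]:
    """Yield only the records that pass cheap, per-record quality checks."""
    for rec in records:
        if not rec.account_id:
            continue             # key field is missing
        if rec.amount is None or rec.amount < 0:
            continue             # value is missing or implausible
        yield rec


# A tiny stand-in for a stream arriving from several sources.
feed = [
    Transaction("pos-terminal", "A17", 42.50),
    Transaction("mobile-app", None, 10.00),
    Transaction("web", "B02", -5.00),
    Transaction("web", "C33", 7.25),
]

clean = list(screen(feed))
print([t.account_id for t in clean])
```

Because screen is a generator, each record is checked as it arrives rather than after the whole feed has been stored, which is one simple way to keep screening from slowing down a high-velocity stream.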

So, what is a business intelligence data warehouse?

A BIDW is a data analysis system that collects transactional information and typically provides summaries on selected key fields of the transactions being watched. These summaries can be used to better understand the overall health and trends of those transactions. The BIDW data is a copy of production and is not in real time, so long-running queries can be run without affecting live customer activity. Data may be loaded daily or weekly, depending on the data source. The data is kept at several levels of detail to serve the different customers of the BIDW; summary data and dashboards are the most common outputs, but if needed, you can drill into the individual transactions.
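To make the summary-then-drill-down idea concrete, here is a small Python sketch. The sample rows, the choice of region as the key field, and the function names summarize and drill_down are all hypothetical; a real BIDW would do this with a database and reporting tools rather than in-memory lists.

```python
from collections import defaultdict

# Hypothetical nightly copy of production transactions: (region, product, amount).
transactions = [
    ("east", "widget", 20.0),
    ("east", "gadget", 35.0),
    ("west", "widget", 15.0),
    ("east", "widget", 5.0),
]


def summarize(rows):
    """Total amount per region: the kind of figure a dashboard would show."""
    totals = defaultdict(float)
    for region, _product, amount in rows:
        totals[region] += amount
    return dict(totals)


def drill_down(rows, region):
    """Return the individual transactions behind one summary line."""
    return [row for row in rows if row[0] == region]


summary = summarize(transactions)
east_detail = drill_down(transactions, "east")
print(summary)
print(east_detail)
```

The summary is what most BIDW customers would look at day to day; the drill-down is used only when a summary figure raises a question.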

At this point, you may not see a real difference between BD and BIDW, since both can contain transaction-level detail. However, the two tools are typically used for very different purposes.

The following examples should make this difference clearer.

User Comments

Very nicely said. Liked the way of explanation with examples.



Ravi B.

October 14, 2015 - 6:13am

Nice article. As a system analyst, the volume does matter; data with a content transfer app and more is vital for productivity. But overall I do agree with your points. Good effort.

September 28, 2016 - 8:13am

StickyMinds is a TechWell community.

Through conferences, training, consulting, and online resources, TechWell helps you develop and deliver great software every day.