For the past few decades, Data Warehousing has been blazing trails; today, Big Data is the latest revolution in technology. A question that is often asked is whether Big Data will replace Data Warehousing.
Though Big Data and Data Warehousing have similarities, they are two different technologies, and the differences between them are significant. Before we delve into those differences, it is important to know what Data Warehousing and Big Data are. Broadly, a Big Data solution is a technology characterized by the volume, velocity and variety of the data it handles, whereas Data Warehousing is an architectural concept in data computing.
Now, let us take a closer look at both technologies:
Data Warehousing refers to the process of extracting data from one or more homogeneous or heterogeneous data sources, transforming it, and then loading it into a data repository for analysis. This analysis supports reporting and helps decision makers make better judgements to improve performance.
The data repository that this process produces is the data warehouse.
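The extract, transform and load sequence described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the two source systems, their field layouts, the `sales` table and all function names are hypothetical, and an in-memory SQLite database stands in for a real warehouse.

```python
import sqlite3
from datetime import datetime

# Hypothetical source data: two systems that describe sales differently.
CRM_ROWS = [
    {"customer": "Alice", "amount": "120.50", "date": "2023-01-05"},
    {"customer": "Bob", "amount": "80.00", "date": "2023-01-06"},
]
POS_ROWS = [
    ("alice", 40, "07/01/2023"),  # (customer, amount, DD/MM/YYYY)
]


def extract():
    """Pull raw records from each (hypothetical) source system."""
    return CRM_ROWS, POS_ROWS


def transform(crm_rows, pos_rows):
    """Convert both sources to one schema: (customer, amount, ISO date)."""
    rows = []
    for r in crm_rows:
        rows.append((r["customer"].lower(), float(r["amount"]), r["date"]))
    for customer, amount, date in pos_rows:
        iso = datetime.strptime(date, "%d/%m/%Y").date().isoformat()
        rows.append((customer.lower(), float(amount), iso))
    return rows


def load(conn, rows):
    """Write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(customer TEXT, amount REAL, sale_date TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl():
    """Run extract -> transform -> load against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(*extract()))
    return conn


def total_by_customer(conn):
    """A typical analytical query over the loaded, consistent data."""
    cur = conn.execute(
        "SELECT customer, SUM(amount) FROM sales "
        "GROUP BY customer ORDER BY customer"
    )
    return cur.fetchall()
```

Calling `total_by_customer(run_etl())` returns one total per customer, even though the two sources spelled names and dates differently. The point of the sketch is that the cleaning happens before the data is stored, so every later query sees one consistent structure.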
It is a conceptual architecture aimed at storing structured, subject-oriented, time-variant, non-volatile data for decision making. A Data Warehouse typically stores historical data: a copy of transaction data specifically structured for query and analysis.
A Data Warehouse traditionally brings together data from many transactional and operational systems and presents it as a single, consolidated version of the truth to decision makers at all levels of the organization. A well-designed data warehouse allows us to access, report on and analyze that information from all relevant angles, which results in consistent and accurate information.
Big Data is a technology used to store data from various sources and to manage huge volumes of it, on the scale of exabytes (1 billion GB) and zettabytes (1 trillion GB). It can store all kinds of data, whether structured, semi-structured or unstructured, including video, audio and free text, while using cheaper storage devices. The data is not processed in one place; it is spread across several servers for faster processing and is stored in its native format without any up-front planning or modelling. To actually use the data, for example to produce a report, rules must be applied to it at the time it is read.
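The idea of keeping data in its native format and applying rules only when a report is needed can be illustrated with a small Python sketch. The raw events, their field names and the functions below are hypothetical, and a plain list stands in for distributed storage; real Big Data platforms spread this work across many servers.

```python
import json

# Hypothetical raw events, stored exactly as they arrived (native format).
RAW_EVENTS = [
    '{"type": "click", "user": "u1", "page": "/home"}',
    '{"type": "purchase", "user": "u2", "amount": 30}',
    '{"type": "purchase", "user": "u1", "amount": 12.5, "coupon": "X"}',
    "free-text log line that is not JSON",
]


def read_purchases(raw_lines):
    """Apply the 'purchase' rules at read time; skip records that don't fit."""
    for line in raw_lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # unstructured text stays in storage, just not in this report
        if (
            isinstance(event, dict)
            and event.get("type") == "purchase"
            and "amount" in event
        ):
            yield event["user"], float(event["amount"])


def revenue_by_user(raw_lines):
    """Build a simple report from whatever records match the rules."""
    totals = {}
    for user, amount in read_purchases(raw_lines):
        totals[user] = totals.get(user, 0.0) + amount
    return totals
```

Here nothing is rejected or reshaped when the data is stored. A different report, say one that counts clicks per page, could apply a different set of rules to the same raw records.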
Big Data is characterized by the volume, variety and velocity of the data, the 3Vs named by industry analyst Doug Laney in the early 2000s. In other words, it is defined by the size of the data, the speed at which it arrives and the wide range of data types involved.
Finally, let’s have a quick look at how Data Warehouse and Big Data are different:
Big Data technologies are focused on advanced analytics, and can be viewed as a modernization strategy for data archives. Data Warehouses were mostly built for reporting, OLAP and performance management. Hence, we can rightly state that Big Data is a complementary technology to a Data Warehouse, not a replacement for it. The two co-exist, depending on the business requirements.