With all the buzzwords floating around the internet, it is easy to feel overwhelmed and to wonder whether big data systems will replace the traditional RDBMS. In this article, I will tackle this question and explore both sides of the picture.
About The Author
My name is Muhammad Osama. I am a data analyst and have been associated with the FMCG sector for the last 2 years. Nowadays, I provide worldwide consultancy to different companies as a freelancer. I enjoy teaching and learning about data. If you have any questions, feel free to reach out at Muhammad.Osama@CyberCode.ca
What is an RDBMS?
RDBMS stands for Relational Database Management System. These databases have been around for a long time, with innovations spanning a period of 40 years. The data is stored in tables, which resemble spreadsheets and are therefore easy to understand, and these tables are linked together with relationships.
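To make the idea of tables linked by relationships concrete, here is a minimal sketch using Python's built-in sqlite3 module. The customers and orders tables, their columns, and the sample values are made up for illustration; any relational database would work in a similar way.

```python
import sqlite3

# In-memory database; table and column names are made up for this example.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount REAL NOT NULL
    )
    """
)

conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Alice')")
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(1, 25.0), (1, 40.0)],
)

# The relationship lets us join the two tables on the shared key.
rows = conn.execute(
    """
    SELECT c.name, SUM(o.amount)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    """
).fetchall()
print(rows)  # [('Alice', 65.0)]
```

The customer_id column in orders is what links each order back to a row in customers.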
What is Big Data?
By Big Data, we mean data and systems with the following characteristics:
- Unstructured Data
- Data Variety
- Rapidly Generated Data
- Cheap Cost
Let’s explore the concepts one by one.
Unstructured Data
Unstructured data lacks a fixed structure. In an RDBMS, by contrast, every column of a table has a known data type, and the data in a column is guaranteed to be of that type. This is not the case with a CSV file, where nothing stops you from entering any kind of value in a column.
A CSV file is an example of semi-structured data.
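The difference can be sketched in a few lines of Python. The CSV content and the people table are invented for this example. One assumption to note: SQLite is lenient about column types by default, so the CHECK constraint below stands in for the type enforcement that most RDBMS products apply on their own.

```python
import csv
import io
import sqlite3

# A CSV file happily accepts "thirty" in a column that should hold numbers.
csv_text = "name,age\nAli,30\nSara,thirty\n"
csv_rows = list(csv.DictReader(io.StringIO(csv_text)))
print([row["age"] for row in csv_rows])  # every value arrives as plain text

conn = sqlite3.connect(":memory:")
# Assumption: the CHECK stands in for the type enforcement that most RDBMS
# products apply automatically; SQLite is lenient about types by default.
conn.execute(
    "CREATE TABLE people (name TEXT, age INTEGER CHECK (typeof(age) = 'integer'))"
)

rejected = []
for row in csv_rows:
    try:
        conn.execute("INSERT INTO people VALUES (?, ?)", (row["name"], row["age"]))
    except sqlite3.IntegrityError:
        rejected.append(row["name"])  # the database refused the bad value

accepted = [r[0] for r in conn.execute("SELECT name FROM people")]
print(accepted, rejected)
```

The CSV reader returns both rows without complaint, while the table only keeps the row whose age is a real number.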
Data Variety
Data comes in many forms. Take the example of video recorded by a CCTV camera. It is not practical to analyze the raw binary data directly, so specialized databases are used for this purpose.
These systems extract features from the images, store those features in a database, and then pull the data from the database for analysis.
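The extract, store, and analyze flow can be illustrated with a toy sketch. Everything here is a simplifying assumption: the "frames" are tiny made-up grids of grayscale pixel values, the two features are arbitrary, and SQLite stands in for whatever specialized database a real system would use.

```python
import sqlite3

def extract_features(frame):
    """Reduce a grayscale frame (rows of 0-255 pixel values) to two numbers."""
    pixels = [p for row in frame for p in row]
    mean_brightness = sum(pixels) / len(pixels)
    bright_ratio = sum(1 for p in pixels if p > 200) / len(pixels)
    return mean_brightness, bright_ratio

# Two tiny made-up 2x2 "frames" standing in for CCTV footage.
frames = {
    "frame_001": [[10, 20], [30, 40]],
    "frame_002": [[250, 240], [230, 80]],
}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE frame_features (frame_id TEXT, mean_brightness REAL, bright_ratio REAL)"
)
for frame_id, frame in frames.items():
    conn.execute(
        "INSERT INTO frame_features VALUES (?, ?, ?)",
        (frame_id, *extract_features(frame)),
    )

# Analysis runs on the extracted features, not on the raw pixels.
bright_frames = [
    row[0]
    for row in conn.execute(
        "SELECT frame_id FROM frame_features WHERE mean_brightness > 128 ORDER BY frame_id"
    )
]
print(bright_frames)  # ['frame_002']
```

Real systems extract far richer features, but the shape of the pipeline is the same: the database holds the derived numbers, and queries run against those.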
Rapidly Generated Data
Unlike data from traditional systems such as an ERP, sensor data is generated rapidly, often gigabytes in mere minutes. Traditional databases are not designed to handle data arriving at this volume and speed.
Cheap Cost
Compared to a traditional RDBMS, the cost of storage per GB is much lower in big data systems.
Comparison between RDBMS and Big Data
By now, you will have developed some idea of how complex big data databases are. However, big data databases typically lack the ACID properties of a traditional RDBMS.
So what exactly is ACID, why does it keep the traditional RDBMS relevant for the times to come, and why are big data databases not a replacement for an RDBMS?
ACID stands for the following:
- Atomicity
- Consistency
- Isolation
- Durability
Atomicity
Each transaction (a unit of work in the database) is indivisible, meaning that it is not possible to split a transaction into parts. It will either execute completely or not execute at all.
Suppose that Transaction 1 comprises several statements. If one of them fails, the remaining statements do not execute, and the database rolls all the changes back.
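The all-or-nothing behaviour can be sketched with Python's sqlite3 module. The accounts table, the transfer function, and the amounts are hypothetical and chosen only to force the second statement to fail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money between two accounts as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            # This statement violates the CHECK constraint when funds are short.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, "A", "B", 500)  # the second UPDATE fails
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, balances)  # False {'A': 100, 'B': 50}
```

Although the first UPDATE succeeded on its own, account B still holds 50 afterwards, because the failed transaction was rolled back as a whole.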
Consistency
The data will remain consistent, meaning that all the constraints (rules on the database columns that allow certain values into a table and keep others out) and triggers (actions designed to run automatically when a certain event occurs) will remain intact.
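As a rough illustration of a constraint and a trigger working together, consider the sketch below. The products and price_log tables and the log_price_change trigger are invented for this example, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (
        id INTEGER PRIMARY KEY,
        price REAL NOT NULL CHECK (price > 0)   -- constraint: only positive prices
    );
    CREATE TABLE price_log (product_id INTEGER, old_price REAL, new_price REAL);

    -- trigger: record every price change automatically
    CREATE TRIGGER log_price_change AFTER UPDATE OF price ON products
    BEGIN
        INSERT INTO price_log VALUES (OLD.id, OLD.price, NEW.price);
    END;
    """
)

conn.execute("INSERT INTO products VALUES (1, 9.5)")
conn.execute("UPDATE products SET price = 12.0 WHERE id = 1")  # fires the trigger

try:
    conn.execute("UPDATE products SET price = -1 WHERE id = 1")  # breaks the constraint
    constraint_held = False
except sqlite3.IntegrityError:
    constraint_held = True

log = conn.execute("SELECT * FROM price_log").fetchall()
price = conn.execute("SELECT price FROM products WHERE id = 1").fetchone()[0]
print(constraint_held, log, price)  # True [(1, 9.5, 12.0)] 12.0
```

The valid update is logged by the trigger, while the invalid one is rejected and leaves both tables unchanged.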
Isolation
Transactions can be thought of as individual processes that are put into a pipeline, so one transaction cannot read the changes made by another transaction until that one is complete.
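A small sketch with two connections to the same SQLite file shows the effect. The file name and the events table are hypothetical, and the example assumes SQLite's default journal mode; other databases offer several isolation levels with different guarantees.

```python
import os
import sqlite3
import tempfile

def count_rows(conn):
    """Return the number of rows this connection can currently see."""
    return conn.execute("SELECT COUNT(*) FROM events").fetchall()[0][0]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")  # hypothetical file name
    writer = sqlite3.connect(path)
    reader = sqlite3.connect(path)

    writer.execute("CREATE TABLE events (msg TEXT)")
    writer.commit()

    writer.execute("INSERT INTO events VALUES ('pending')")  # transaction still open
    seen_before_commit = count_rows(reader)

    writer.commit()
    seen_after_commit = count_rows(reader)

    writer.close()
    reader.close()

print(seen_before_commit, seen_after_commit)  # 0 1
```

The reader sees the new row only once the writer's transaction has been committed.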
Durability
These databases are fault tolerant. If the power were to go out, the database has the ability to recover from the point at which the failure occurred, without losing the changes that were already committed.
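The following sketch gives a feel for this guarantee. It is only an approximation: closing the connection without committing stands in for the failure, whereas a real power cut is handled by the database's journal on the next start. The file name and the readings table are made up.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "durable.db")  # hypothetical file name

    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE readings (value INTEGER)")
    conn.execute("INSERT INTO readings VALUES (1)")
    conn.commit()                                    # this row is now on disk
    conn.execute("INSERT INTO readings VALUES (2)")  # never committed
    conn.close()                                     # stands in for the failure

    # "Restart": a fresh connection sees only the committed work.
    conn = sqlite3.connect(path)
    recovered = [row[0] for row in conn.execute("SELECT value FROM readings")]
    conn.close()

print(recovered)  # [1]
```

The committed row survives the restart, and the half-finished work is discarded.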
Which one to choose?
I hope you now have a very good understanding of the differences between the two.
If you are interested in exploring the world of RDBMS with SQL Server, then check out our Data analytics course with T-SQL.