
Why is Hadoop dead? Is Big Data dead?

In order to understand why Hadoop is dead, we must first understand what we are talking about when we say “Hadoop.”

Hadoop is “a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models” (Apache). In other words, Hadoop made it possible to store and process Big Data by spreading the data across multiple computers, all working together.
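To make the “simple programming models” in that definition more concrete, here is a minimal sketch of the map-and-reduce style of computation that Hadoop is best known for, written as a word count in plain Python. It runs on a single machine and does not use any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample text are made up for illustration. On a real cluster, each chunk would sit on a different computer and the framework would handle the grouping step between the two phases.

```python
from collections import defaultdict


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group the emitted values by key (done by the framework on a cluster)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(chunks):
    """Run map, shuffle, and reduce over a list of text chunks."""
    pairs = [pair for chunk in chunks for pair in map_phase(chunk)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Hypothetical sample data: each string stands in for a block stored on a different machine.
chunks = ["big data is big", "data is everywhere"]
counts = word_count(chunks)
print(counts)
```

The point of the model is that the map step can run independently on every chunk, so the work scales out by adding more computers rather than by buying a bigger one.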

This system for handling Big Data worked for a time but has become too costly and complex, especially compared to new ways of storing Big Data (Woodie).

Hadoop’s storage structure required trained staff to operate it, and any group that wanted to store its Big Data had to set up its own Hadoop cluster.

With alternatives to Hadoop such as cloud storage, where a company pays for space in the cloud, companies can save money and physical space because they don’t need to purchase and house all of the computers that make up a Hadoop cluster.

Advancements in storage options for Big Data in recent years have caused the move away from the Hadoop storage system.

Although the Hadoop storage structure may be dead, some of the programs created to analyze the data stored within Hadoop, such as Spark, are being used in non-Hadoop systems in order to “extract more value from data” (Woodie).

With Spark and other products from the wider Hadoop ecosystem still in use, the question of whether Hadoop is dead may be a little more grey, depending on how you want to define Hadoop: as a storage system, or as the ability to compute on Big Data (Murthy).

Although Hadoop may be dead, at least from a storage perspective, Big Data is still alive and growing by the day. Big Data is information, whether structured or unstructured, that is created at high volume, high velocity, and/or high variety. Companies want to use Big Data in all its forms, such as social media posts, to analyze the individuals interacting with their products or services.

As technology continues to advance, we are only gaining more Big Data.

Every time we post on social media, use a health tracker watch, or add another “smart” element to our homes, we are helping to create more data that can be analyzed (Ghosh). This constant creation of new and different types of data is causing many companies to pay close attention to the Big Data sources they have access to. They want to analyze the different sources of information they get from their customers to create a holistic view of who their customers are and how best to reach them (Press).


With Big Data currently being used to train algorithms for artificial intelligence and machine learning tools, it seems that Big Data, and the uses for it, will be around for the foreseeable future.



Works Cited:

Apache Hadoop, hadoop.apache.org/.

Ghosh, Debopam. “Is Big Data Dead?” Medium, 2 May 2020, medium.com/@debopam/is-big-data-dead-dacd6405a2f6.

Press, Gil. “Big Data Is Dead. Long Live Big Data AI.” Forbes, Forbes Magazine, 2 July 2019, www.forbes.com/sites/gilpress/2019/07/01/big-data-is-dead-long-live-big-data-ai/?sh=6f765fb61b05.


Murthy, Arun C. “Hadoop Is Dead. Long Live ‘Hadoop.’” Medium, 10 Sept. 2019, medium.com/@acmurthy/hadoop-is-dead-long-live-hadoop-f22069b264ac.

Woodie, Alex. “Big Data Predictions: What 2020 Will Bring.” Datanami, 6 Jan. 2020, www.datanami.com/2019/12/23/big-data-predictions-what-2020-will-bring/#:~:text=Hadoop%20storage%20(HDFS)%20is%20dead,that's%20available%20in%20the%20cloud.
