what is big data, what is the objective of big data, what is big data analytics, examples of big data
What is big data? Every day we generate data in the form of text, calls, e-mails, photos, videos, music, and searches. According to Gartner, data is the new oil. According to Google, from the dawn of civilization until 2003 humanity generated 5 exabytes of data, but now we produce the same amount in merely two days. The volume of data generated per minute on the internet is enormous.
According to IDC, the big data market will grow seven times faster than
the overall IT market.
This volume of data is more than traditional systems can handle, whether
they are mainframes, traditional data warehouses, or traditional Relational
Database Management Systems. This massive amount of data is called Big
data.
What is Big Data?
Big Data refers to a collection of data that is huge in volume and still
growing exponentially with time. It is generated at a very large scale and
cannot be processed by traditional data processing systems. Many
multinational companies process and analyze big data to uncover insights
and improve their business.
Big data is characterized by the 5 V's:
Volume
This relates to the size of the data. Big data gets its name from its
enormous size. Consider the number of times the react button is clicked on
social media sites, the number of messages generated on instant messaging
apps, and the number of posts on blogging sites. Developers record these
enormous amounts of data and use big data technologies to store and process
them.
Variety
It refers to heterogeneous information sources and to data that may be
structured, semi-structured, or unstructured. It may be traditional data
like last name, first name, birth date, and address found in a table. It may
also take the form of emails, photos, videos, PDFs, audio, offline
documents, records, online sources, etc.
Veracity
It means the degree of reliability of the data. In an ocean of data, some of
it is irrelevant or inaccurate. So there is a need to filter big data to
find what is trustworthy and accurate, and to resolve uncertainties and
inconsistencies, so that the data is useful to an organization.
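The filtering step described above can be sketched in a few lines of Python. This is only an illustration: the record fields, the three sanity checks, and the fixed reference date are assumptions made for the example, not rules taken from any particular big data tool.

```python
from datetime import date

# Hypothetical raw records; field names and values are made up for illustration.
records = [
    {"name": "Ana", "birth_date": "1990-04-12", "email": "ana@example.com"},
    {"name": "", "birth_date": "1985-01-30", "email": "bo@example.com"},      # missing name
    {"name": "Cy", "birth_date": "2190-07-01", "email": "cy@example.com"},    # date in the future
    {"name": "Di", "birth_date": "1978-11-05", "email": "not-an-email"},      # malformed email
    {"name": "Ana", "birth_date": "1990-04-12", "email": "ana@example.com"},  # exact duplicate
]


def is_trustworthy(record, today=date(2024, 1, 1)):
    """Return True if the record passes three simple, assumed sanity checks."""
    if not record["name"].strip():
        return False
    try:
        born = date.fromisoformat(record["birth_date"])
    except ValueError:
        return False
    if born > today:
        return False
    local, _, domain = record["email"].partition("@")
    return bool(local) and "." in domain


def clean(rows):
    """Keep trustworthy records and drop exact duplicates, preserving order."""
    seen = set()
    kept = []
    for record in rows:
        key = tuple(sorted(record.items()))
        if is_trustworthy(record) and key not in seen:
            seen.add(key)
            kept.append(record)
    return kept


cleaned = clean(records)
print(len(cleaned), "of", len(records), "records kept")
```

In practice the checks would come from the organization's own data quality rules; the point is only that veracity is enforced by explicit validation before analysis.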
Value
Big data is massive, but not all of it will be relevant. Thus it is not the amount of data that we store or process that counts. It is the
amount of valuable data that needs to be stored, processed, and analyzed to
find insights.
Velocity
It refers to the speed at which data is generated and processed. It plays a major role in extracting information. If data moves fast and is available for consumption on demand, then it will flow smoothly and continuously through business processes, networks, social media sites, and the organization as a whole.
Big data can be stored and processed using different frameworks, such as
Cassandra, Hadoop, and Spark.
What exactly is the goal of Big Data?
The ultimate goal of big data is profitability. From big data, we can
generate actionable insights, and with those insights a business can make
the right decision at the right time, which leads to profitability. Big
Data is not a technology; it is a paradigm shift, a view associated with
IBM. During this shift, the old technologies are being replaced. This is the
second paradigm shift in the industry: in the late eighties we shifted from
older systems to RDBMS, and now we are shifting to big data.
What is Big Data Analytics?
Big data analytics is the process of collecting, examining, and analyzing
large volumes of diverse data sets to discover useful hidden patterns and
other information, such as customer values and market trends, that can help
organizations make more informed, customer-oriented decisions. In other
words, analytics is the extraction of useful information from the data by
building all possible relations among various data.
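As a very small illustration of discovering a pattern by relating pieces of data, the sketch below links customer segments to the products they buy and reports the most popular product per segment. The purchase log and the function name `top_product_by_segment` are hypothetical, and a real analytics pipeline would run this kind of aggregation on a distributed framework rather than in a single Python process.

```python
from collections import Counter, defaultdict

# Hypothetical purchase log: (customer_segment, product). Values are made up.
purchases = [
    ("student", "laptop"), ("student", "headphones"), ("student", "laptop"),
    ("retiree", "tablet"), ("retiree", "tablet"), ("retiree", "headphones"),
    ("student", "laptop"),
]


def top_product_by_segment(rows):
    """Count products per segment and return the most frequent one for each."""
    counts = defaultdict(Counter)
    for segment, product in rows:
        counts[segment][product] += 1
    return {segment: counter.most_common(1)[0][0] for segment, counter in counts.items()}


trends = top_product_by_segment(purchases)
print(trends)
```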
To help manage big data better, machine learning is used. Machine learning
refers to software applications that can learn to increase their accuracy
at predicting outcomes. Machines are trained by feeding them data sets and
building algorithms that enable them to solve problems and make decisions.
The algorithms improve over time as they learn from experience. One of the
benefits of machine learning for data analytics is finding solutions to
problems such as cost reduction, time saving, and lowering the risk in
decision making.
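To make "improving with experience" concrete, here is a minimal sketch that fits a straight line to toy data using gradient descent, one common training technique chosen here purely for illustration. The data, learning rate, and number of passes are assumptions; the idea to notice is that the error shrinks as the model makes more passes over the data.

```python
# Toy training data following y = 2x + 1 (made-up values for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def mean_squared_error(w, b):
    """Average squared gap between the model's predictions and the targets."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(epochs=500, learning_rate=0.05):
    """Fit y = w*x + b by full-batch gradient descent, recording the loss per pass."""
    w, b = 0.0, 0.0
    losses = []
    n = len(xs)
    for _ in range(epochs):
        losses.append(mean_squared_error(w, b))
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, losses


w, b, losses = train()
print(f"w={w:.3f}, b={b:.3f}, first loss={losses[0]:.3f}, last loss={losses[-1]:.6f}")
```

The learned values of `w` and `b` approach the slope and intercept that generated the data, and the recorded loss falls from its starting value toward zero.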
Big data can also be classified by its degree of organization: structured
data, semi-structured data, and unstructured data.
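The difference between the three classes shows up in how much work is needed to read a value out of the data. The sketch below uses Python's standard library and made-up samples: a CSV row stands in for structured data, a JSON document for semi-structured data, and a free-text message for unstructured data.

```python
import csv
import io
import json
import re

# Structured: fixed columns, as in a relational table (made-up sample row).
structured = "last_name,first_name,birth_date\nDoe,Jane,1990-04-12\n"
rows = list(csv.DictReader(io.StringIO(structured)))

# Semi-structured: self-describing keys, but fields can vary from record to record.
semi_structured = '{"name": "Jane Doe", "tags": ["vip"], "address": {"city": "Oslo"}}'
doc = json.loads(semi_structured)

# Unstructured: free text; any structure has to be extracted, here with a regex.
unstructured = "Hi, this is Jane Doe. Please call me back on 555-0100 tomorrow."
phones = re.findall(r"\b\d{3}-\d{4}\b", unstructured)

print(rows[0]["first_name"], doc["address"]["city"], phones)
```

Structured data can be queried by column name directly, semi-structured data by navigating keys that may or may not be present, and unstructured data only after some extraction step.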
What are some examples of big data?
· A stock exchange, which generates new trade data every day.
· Social media sites that store data like photo and video uploads, messages,
comments, and posts in their databases.
· A jet engine, which generates large amounts of data in a few minutes of
flight time.
· A university that has many students and an ocean of data. Some
universities are now able to use analytics and data visualizations to show
patterns among students that aid their operations, recruitment, and
retention efforts.
· Weather sensors and satellites deployed all around the globe to contribute
to weather forecasting.
What are the examples of machine learning activities on big data?
· Song recommendations on digital music, podcast, and video streaming
services such as Spotify.
· Live Tracking Report - Uber generates and uses a huge amount of data, such
as driver and vehicle information, locations, and every trip from every
vehicle. All this data is analyzed to predict demand, the location of
drivers, and the cost of every trip. Uber optimally matches your ride with
other passengers to minimize detours.
· Fraud Management Report - Financial institutions determine whether a transaction is fraudulent or not. Artificial intelligence is used to learn which types of transactions are considered fraudulent.
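As a rough idea of how a transaction might be flagged, the sketch below marks amounts that lie far from a customer's usual spending, using a z-score. This is a simple statistical stand-in chosen for illustration, not the method any particular financial institution uses; the transaction history, the threshold of 3, and the function name `flag_suspicious` are all assumptions.

```python
from statistics import mean, stdev

# Hypothetical past transaction amounts for one customer (made-up values).
history = [20.0, 25.0, 22.0, 30.0, 18.0, 27.0, 24.0]

# New transactions to score.
incoming = [26.0, 950.0]


def flag_suspicious(past, new, threshold=3.0):
    """Return the new amounts lying more than `threshold` standard deviations from the past mean."""
    average = mean(past)
    spread = stdev(past)
    return [amount for amount in new if abs(amount - average) / spread > threshold]


flagged = flag_suspicious(history, incoming)
print("Flagged for review:", flagged)
```

Production systems typically combine many more signals, such as location, merchant, and timing, and learn the decision rule from labeled examples rather than fixing a threshold by hand.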
Top Big Data Processing Frameworks
Hadoop, Spark, Samza, Apache Flink, Hive, Apache Storm, Presto, Impala, Qubole, etc.