Nowadays, big data is becoming a new technology focus
both in science and in industry. Because of the sheer size of
the data, new technologies and architectures are required
to extract value from it through capture and analysis.
It is difficult to perform effective analysis with
existing traditional techniques, which cannot handle such large
amounts of data growing at such speed. Nevertheless, big
data can bring huge benefits to business organizations,
and it relates to almost all aspects of human activity, from
simply recording events to research, design, production and
the delivery of digital services or products to the final consumer.
Big Data analytics is the process of collecting, organizing and analyzing large sets of data (called Big Data) to discover patterns and other useful information. Big Data analytics can help organizations to better understand the information contained within the data and will also help identify the data that is most important to the business and future business decisions. Analysts working with Big Data typically want the knowledge that comes from analyzing the data.
To analyze such a large volume of data, Big Data analytics is typically performed using specialized software tools and applications for predictive analytics, data mining, text mining, forecasting and data optimization. These processes are separate but highly integrated functions of high-performance analytics. Big Data tools and software enable an organization to process the extremely large volumes of data it has collected, determine which data is relevant, and analyze that data to drive better business decisions in the future.
Gartner, Inc. defines big data in similar terms:
“Big data is high-volume, high-velocity and high-variety
information assets that demand cost-effective, innovative forms of information processing for enhanced insight and
decision making.” (Gartner IT Glossary, n.d.)
The characteristics of big data can be described as follows:
High Volume: refers to the amount or quantity of data.
Because big data is so large, it is stored in multiple
terabytes and petabytes. In a survey conducted by
IBM in mid-2012, just over half of the 1144
respondents considered datasets over one terabyte to be big data.
Time and the type of data are two factors that affect
the volume of big data.
As storage capacities continue to increase, what
is deemed big data today may not meet the
threshold in the future. Moreover, the type of data,
discussed under variety, defines what is meant by ‘big’.
Depending on the type of data, two datasets of the same size may
require different data management technologies, e.g., tabular
versus video data.
High Velocity: refers to the rate at which data is created.
Digital devices such as smartphones and sensors have
led to a growing need for real-time analytics and evidence-based
planning. Traditional data management systems are
not capable of handling huge data feeds instantaneously,
for example when retailers must deal with hundreds of thousands of
streaming data sources that demand real-time analytics. This
is where big data technologies come into play. They enable
firms to create real-time intelligence from high volumes of data.
High Variety: refers to the different types of data such as
structured, semi-structured, and unstructured data.
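As a small illustration of variety, the sketch below reads the same fact (an order amount) from structured, semi-structured and unstructured input. The sample records, field names and the regular expression are all made up for this example; real data would need far more robust handling.

```python
import csv
import io
import json
import re

# Structured: fixed columns, so a value is addressed by its column name.
structured = "order_id,customer,amount\n1001,Alice,25.50\n"
row = next(csv.DictReader(io.StringIO(structured)))
amount_structured = float(row["amount"])

# Semi-structured: self-describing keys, but fields may nest or be absent.
semi_structured = '{"order_id": 1001, "customer": {"name": "Alice"}, "amount": 25.5}'
amount_semi = json.loads(semi_structured)["amount"]

# Unstructured: free text, so the value has to be located by pattern matching.
unstructured = "Alice called about order 1001 and paid 25.50 dollars by card."
match = re.search(r"paid (\d+\.\d+) dollars", unstructured)
amount_unstructured = float(match.group(1))

print(amount_structured, amount_semi, amount_unstructured)
```

The effort needed to reach the same value grows from a column lookup to a hand-written pattern, which is one reason datasets of different types call for different tools.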
The current and emerging focus of big data analytics is to explore traditional techniques such as rule-based systems, pattern mining, decision trees and other data mining techniques so that business rules can be developed efficiently even on large data sets. This can be achieved either by developing algorithms that use distributed data storage and in-memory computation, or by using cluster computing for parallel computation. These processes were previously carried out using grid computing, which has recently been overtaken by cloud computing.
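The idea of splitting a pattern-mining task across workers can be sketched in a few lines. The example below is only an illustration under stated assumptions: the function names (`count_pairs`, `frequent_pairs`) and the toy transactions are hypothetical, and a local thread pool stands in for the nodes of a real cluster.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def count_pairs(partition):
    """Map step: count item pairs that co-occur in the transactions of one partition."""
    counts = Counter()
    for transaction in partition:
        for pair in combinations(sorted(set(transaction)), 2):
            counts[pair] += 1
    return counts


def frequent_pairs(transactions, n_partitions=2, min_support=2):
    """Partition the data, count each partition in parallel, then merge the counts."""
    partitions = [transactions[i::n_partitions] for i in range(n_partitions)]
    # Assumption: a thread pool stands in for cluster workers in this sketch.
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        partial_counts = list(pool.map(count_pairs, partitions))
    total = Counter()
    for partial in partial_counts:  # reduce step: add up the partial counts
        total.update(partial)
    return {pair: n for pair, n in total.items() if n >= min_support}


transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "jam"],
    ["bread", "milk", "jam"],
]
result = frequent_pairs(transactions)
print(result)
```

Because each partition is counted independently and the partial counts are simply added, the same structure carries over to distributed storage, where each node counts only the data it holds.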
Benefits of Big Data:
Businesses are using the power of insights provided by big
data to instantaneously establish who did what, when and
where. The biggest value created by these timely,
meaningful insights from large data sets is often the effective
enterprise decision-making that the insights enable.
Extracting valuable insights from very large amounts of
structured and unstructured data from disparate sources in
different formats requires the proper structure and the proper
tools. To obtain the maximum business impact, this process
also requires a precise combination of people, process and
Challenges of Big Data:
Privacy and Security:
Because big data refers to the huge amounts of digital information that
companies and governments collect, security and privacy issues are
magnified by the velocity, volume and variety of big data, as seen in
large-scale cloud infrastructures, the diversity of data sources
and formats, the streaming nature of data acquisition and high-volume
inter-cloud migration. Moreover, large-scale cloud infrastructures,
with a diversity of software platforms spread across large networks
of computers, quickly increase the attack surface of the entire system.
Data access and sharing of information
Data that is to be used to make accurate and timely decisions
must be open and available. This
leads to better decision-making, business intelligence and
Most companies refuse to share data with other
companies because they want to guarantee privacy and security
for their clients and operations.
Storage and processing issues
Storing such large amounts of data is one of the challenges of
big data because the available storage is not enough. Given the
rigorous demands that big data places on networks, storage and
servers, outsourcing the data to cloud infrastructure may seem an
option. However, with terabytes of data, it
will take a large amount of time to upload to the cloud, and the
data changes so rapidly that it is hard to
upload in real time.
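A rough back-of-the-envelope calculation shows the scale of the upload problem. The sketch below assumes decimal terabytes, a link that sustains its full nominal bandwidth, and no protocol overhead; the figures 10 TB and 100 Mbit/s are chosen purely for illustration.

```python
def upload_time_hours(size_terabytes, bandwidth_mbps):
    """Hours needed to move size_terabytes over a link of bandwidth_mbps megabits per second."""
    size_bits = size_terabytes * 1e12 * 8          # decimal terabytes to bits
    seconds = size_bits / (bandwidth_mbps * 1e6)   # megabits per second to bits per second
    return seconds / 3600


hours = upload_time_hours(10, 100)
print(f"{hours:.1f} hours, or about {hours / 24:.1f} days")
```

Under these assumptions the transfer takes roughly 222 hours, more than nine days, during which rapidly changing data would already be out of date.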
Choosing the type of analysis to perform on this huge
amount of data brings with it some major analytical
challenges. The various types of data such as unstructured,
semi structured or structured data requires a large of advance
Because big data involves processing huge amounts of data, the field needs
to attract organizations and young people with diverse new skill sets.
These skills should extend to research, analytical,
interpretive and creative ones.
Fault-tolerant computing is extremely hard and involves
intricate algorithms. No machine or piece of software
is entirely reliable or fully fault tolerant. Thus, reducing the
probability of failure to an “acceptable” level becomes the
main task. However, the more we strive to reduce this
probability, the higher the cost.
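The trade-off between failure probability and cost can be made concrete with a simple replication model. This is only an illustrative sketch: it assumes that replicas fail independently with the same probability, that the system fails only when every replica fails, and that cost grows with the number of replicas. The function name and the 5% failure rate are hypothetical.

```python
def replicas_needed(p_fail, target):
    """Smallest number of independent replicas whose joint failure probability is at most target."""
    if not 0 < p_fail < 1 or target <= 0:
        raise ValueError("p_fail must be in (0, 1) and target must be positive")
    n = 1
    # With independent failures, all n replicas fail together with probability p_fail ** n.
    while p_fail ** n > target:
        n += 1
    return n


for target in (1e-2, 1e-4, 1e-6):
    n = replicas_needed(0.05, target)
    print(f"target {target:g}: {n} replicas, joint failure probability {0.05 ** n:.2e}")
```

Each tighter target requires more replicas, and therefore more hardware, while the failure probability never reaches zero.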
The scalability issue of big data becomes extremely difficult
with older technologies, because many
more hardware resources, such as cache and processor
memory channels, are shared across cores in a single node.
BIG DATA VALUE CHAIN
The value chain, a concept introduced by Porter (1980), refers to a set of activities performed by a firm to add value at each step of delivering a product or service to its customers. In a similar way, the data value chain refers to a framework of activities that create value from available data. It can be divided into seven phases: data generation, data collection, data transmission, data pre-processing, data storage, data analysis and decision making.
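The seven phases can be pictured as a chain of functions, each consuming the output of the previous one. The toy pipeline below is a sketch only: the sensor readings, the function names and the alert threshold are invented for illustration and are not part of any particular framework.

```python
from statistics import mean


def generate():
    """Data generation: hypothetical sensor readings, None marks a failed reading."""
    return [21.0, None, 23.0, 22.0, None, 30.0]


def collect(source):
    """Data collection: gather the raw readings into a batch."""
    return list(source)


def transmit(batch):
    """Data transmission: stand-in for moving the batch to the processing site."""
    return batch


def preprocess(batch):
    """Data pre-processing: drop the failed readings."""
    return [x for x in batch if x is not None]


def store(batch, storage):
    """Data storage: append the cleaned batch to a (here in-memory) store."""
    storage.extend(batch)
    return storage


def analyze(storage):
    """Data analysis: summarize the stored readings."""
    return mean(storage)


def decide(average, threshold=25.0):
    """Decision making: act on the result of the analysis."""
    return "alert" if average > threshold else "ok"


storage = store(preprocess(transmit(collect(generate()))), [])
average = analyze(storage)
decision = decide(average)
print(average, decision)
```

Value is added at each step: raw readings are of little use until they have been cleaned, stored and summarized into something a decision can be based on.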