Behind the Buzz, the Hidden Value of Big Data and Data Science

Big Data

For decades, people have been collecting data. Today, however, in many sectors (science, healthcare, banking…) the amount of available data is skyrocketing. This quantitative change is so large that it has led to a qualitative one. Let’s try to explain why.

Until recently, people could work only with small amounts of data, mostly because the tools for collecting, storing and analyzing it were limited. Since the 19th century, when faced with large quantities of numbers, people have winnowed the information down to a minimum through sampling, so that it could be examined more easily. With digitalization and the prevalence of high-performance technologies, many of these inherent limitations on data have disappeared. We now have far more data on everything, the statistical and algorithmic tools to analyze it, and the necessary equipment (digital processors and storage). The cost of storage has also dropped sharply, making it cheap to capture and record data on a vast scale.

As a result, we can now work with the full dataset rather than a sample. This gives us much more information to explore, more angles of analysis, and a clearer view of fine-grained detail, without the blurriness that sampling introduces. “Big data helps us get closer to reality than did our dependence on small data and accuracy. Now, we can get everything, see everything from every possible angle”[1]. With a huge amount of data, we can test many hypotheses at many levels of granularity. A large, complete dataset therefore has a special value that smaller ones lack, which is why big data is said to enrich human understanding.

“Data’s true value is like an iceberg floating in the ocean. Only a tiny part of it is visible at first sight, while much of it is hidden beneath the surface. Innovative companies that understand this can extract that hidden value and reap potentially huge benefits.” [1] K.N. Cukier & V. Mayer-Schoenberger

Nowadays, digital technology is everywhere. Smartphones and inexpensive computing make it feasible to datafy most of the essential acts of everyday life. Our actions are being datafied, intentionally or passively, because technology can capture information on nearly everything (people’s location, an engine’s vibration, etc.) and transform it into data. This phenomenon is called “datafication”. We thus have a huge amount of data on many aspects of our lives, along with an abundance of computing power, and all this data says a great deal about our behavior. With social media platforms like Twitter, LinkedIn or Facebook, people’s personal connections, preferences and opinions can be recorded every day, and all of these can be mapped to find links between them. The data can then be used and reused for various kinds of analysis, to feed insights, to allow new predictions, or simply be stored until someone finds a way to use them. The idea of big data is that the volume of information has grown so large that it no longer fits into the memory that computers use for processing, so new tools have been developed to analyze it. With these tools, processing a huge amount of data becomes easier and faster: data collection and analysis that used to take years can now be done in a few hours. That is what makes the difference.
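To make the idea of data that "no longer fits into memory" a little more concrete, here is a minimal sketch in Python. It is not one of the big data tools the text alludes to, only an illustration of the underlying principle: read the records one at a time and keep a small running summary instead of loading everything at once. The sensor feed `read_vibrations` and the function `running_mean` are hypothetical names invented for this example.

```python
from typing import Iterable, Iterator


def read_vibrations(n: int) -> Iterator[float]:
    """Hypothetical sensor feed: yields n engine-vibration readings one at a time.

    The values are made up (they cycle through 0.0 .. 9.0); a real feed
    would read from a file, a queue or a network stream.
    """
    for i in range(n):
        yield float(i % 10)


def running_mean(values: Iterable[float]) -> float:
    """Compute the mean in a single pass over the data.

    Only a counter and a running total are kept in memory, so the input
    can be far larger than the machine's RAM.
    """
    count = 0
    total = 0.0
    for value in values:
        count += 1
        total += value
    if count == 0:
        raise ValueError("cannot compute the mean of an empty stream")
    return total / count


# One million readings are summarized without ever being stored in a list.
mean = running_mean(read_vibrations(1_000_000))
print(mean)
```

The same single-pass pattern works for counts, sums, minima and maxima; statistics that need the whole dataset at once (an exact median, for instance) call for other techniques.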

Data science makes the data speak

The ability to make the right decision is among the most crucial assets a company can have. Data, if correctly analyzed, can give valuable insights into many aspects of a company and thus guide its decisions. Indeed, data are the traces of real-world processes, a quantitative echo of a society’s events. When a customer turns a decision into an action, that action creates data. The aim of data science is to turn the world into data and make the data speak. More precisely, data science is the art of first collecting and cleaning data, and then extracting meaning from them by building models and algorithms, in order to make data-driven decisions. The goals are to understand the data using rigorous statistical methods and to identify which data matter, by interpreting the voice of the customer hidden inside them.
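The two steps described above (clean the data, then extract meaning with a model) can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the records are invented (advertising spend, sales) pairs, missing values are marked with `None`, and the "model" is an ordinary least-squares line. The names `clean` and `fit_line` are hypothetical and chosen for this example.

```python
from typing import List, Optional, Tuple

# Hypothetical raw records: (advertising spend, sales). None marks a missing value.
RawRecord = Tuple[Optional[float], Optional[float]]
raw: List[RawRecord] = [
    (1.0, 2.1), (2.0, None), (3.0, 6.2), (None, 5.0), (4.0, 7.9), (5.0, 10.1),
]


def clean(records: List[RawRecord]) -> List[Tuple[float, float]]:
    """Step 1: keep only the records in which both values are present."""
    return [(x, y) for x, y in records if x is not None and y is not None]


def fit_line(points: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Step 2: fit y = slope * x + intercept by ordinary least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


points = clean(raw)
slope, intercept = fit_line(points)

# Step 3: use the model to inform a decision, e.g. estimate sales at a spend of 6.0.
print(slope * 6.0 + intercept)
```

Real projects replace each step with something far richer, but the shape stays the same: raw traces in, cleaned data, a model, and a number that can inform a decision.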

Data science transforms data into a source of economic value

The example of the 2009 H1N1 flu, developed by K.N. Cukier & V. Mayer-Schoenberger in their book “Big Data: A Revolution That Will Transform How We Live, Work, and Think”, illustrates this point. While the Centers for Disease Control and Prevention (CDC) in the US needed nearly two weeks to tabulate the numbers and draw a picture of the emerging pandemic, engineers at Google had predicted the spread of the flu in the US a few weeks earlier, just by looking at searches made on the Internet. Their predictions were very close to the official figures later published by the CDC. This example shows that big data gives us a better and quicker tool to predict and prevent the spread of some diseases, and it underlines how big data can become a real source of economic value if people know how to take advantage of it.


Aurélia Kain

[1] K.N. Cukier & V. Mayer-Schoenberger, Big Data: A Revolution That Will Transform How We Live, Work, and Think