Monday 3 September 2018

Big Data, what’s all the Fuss?

Big Data is often defined as data so large or complex that traditional methods are no longer acceptable or useful. There are many challenges to face, such as collecting the data, cleaning it, supplementing or enriching it, as well as analysing it and extracting what you need. This is alongside the fact that 90% of the data in the world today has been generated in the last two years! Currently it's estimated that we output 2.5 quintillion bytes of data per day, through electronic devices, phones etc.
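To get a feel for that daily figure, here is a rough back-of-the-envelope sketch in Python. It only converts the 2.5 quintillion bytes quoted above into other units. It assumes decimal (SI) units, where an exabyte is 10^18 bytes and a terabyte is 10^12 bytes, and the variable and function names are my own for illustration.

```python
# Rough unit conversion of the "2.5 quintillion bytes per day" estimate.
# Assumption: decimal (SI) units, 1 EB = 1e18 bytes and 1 TB = 1e12 bytes.

BYTES_PER_DAY = 2.5e18          # 2.5 quintillion bytes, the quoted estimate
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def bytes_per_second(bytes_per_day: float) -> float:
    """Spread a daily byte count evenly over the seconds in a day."""
    return bytes_per_day / SECONDS_PER_DAY


exabytes_per_day = BYTES_PER_DAY / 1e18
terabytes_per_second = bytes_per_second(BYTES_PER_DAY) / 1e12

print(f"{exabytes_per_day:.1f} EB per day")
print(f"{terabytes_per_second:.1f} TB per second")
```

In other words, if the estimate holds, that is roughly 2.5 exabytes a day, or close to 29 terabytes every second.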

When people describe big data they often define it through a number of V’s. The number of V’s is variable (no pun intended) but I usually define big data through the 5 V's: Volume, Velocity, Variety, Veracity and Value.

Volume: Often relates to the bytes of data that are collected and stored every day from many sources. In my day job we often have over 7 billion rows of data in a basic query, but sometimes it can go beyond that.

Velocity: The speed at which data changes or is generated is enormous. Think of how many videos or pictures you've seen on Twitter or Facebook that have gone viral in hours. Big data tools should be able to analyse the data as it's being produced.

Variety: Covers the different data sets that are now available each day, from postcode data and census information to health data, finance data and on and on. The variety provides an opportunity to enrich the data and provide background, but it can also muddy the waters.

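Veracity: Relates to how far the data can be trusted, in other words its accuracy and quality. Traditionally data has been about databases, structured data. Today the world is built upon both structured and unstructured data, such as websites, Twitter etc., which makes that trust harder to judge.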

Value: The one for me that should be top of the list! You may have lots of data and lots of cool algorithms but the main question is, what is the value in the data? Can you help a patient's experience? Can you model a system improvement that will save your company money? Here is the value of the data.

Personally I view big data as a process of knowing more about the area of interest than we did yesterday, be that through adding a V to what we know today.

The final piece of the puzzle is how you use that data within your business: is it going to inform a decision, or form a data product? What you are aiming to use your data for may help you recognise the V that you will be most reliant on going forward.
