It seems like we’re drowning in data, and the cures, including throwing more storage at it and the relatively new concept of ‘big data’, are worse than the problem. It’s a case of “congratulations, the operation was a success, but the patient died.” However, despite the considerable challenges, the opportunities and payoffs from big data can be equally considerable when it is used appropriately.
First, let’s consider some numbers. Data is doubling every two years. In the next decade, enterprises will manage 50x more data and files will grow 75x. Yet enterprise storage system expenditures will grow less than 4% per year for the next few years. And budget constraints are the biggest big-data challenge.
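To get a feel for how far apart those growth rates are, here is a rough back-of-the-envelope sketch in Python. It assumes smooth compounding and, as a simplification, stretches the "less than 4% per year" spending figure across a full decade even though it was only quoted for the next few years. The function and constant names are made up for illustration. Note that doubling every two years works out to 32x over ten years, which is a different figure from the 50x quoted for enterprises; the sketch does not try to reconcile the two.

```python
# Compound growth over a decade: data volume vs. storage spending.
# The rates are the figures quoted in the text; treating them as smooth
# annual compounding over ten years is an assumption for illustration.

YEARS = 10
DOUBLING_PERIOD_YEARS = 2      # "data is doubling every two years"
SPEND_GROWTH_PER_YEAR = 0.04   # "less than 4% per year", used as an upper bound


def growth_from_doubling(years, doubling_period):
    """Growth multiple after `years` if the quantity doubles every `doubling_period` years."""
    return 2 ** (years / doubling_period)


def growth_from_annual_rate(years, rate):
    """Growth multiple after `years` of compounding at `rate` per year."""
    return (1 + rate) ** years


data_growth = growth_from_doubling(YEARS, DOUBLING_PERIOD_YEARS)
spend_growth = growth_from_annual_rate(YEARS, SPEND_GROWTH_PER_YEAR)

# How much more data each storage dollar would have to cover.
gap = data_growth / spend_growth

print(f"Data volume after {YEARS} years: {data_growth:.0f}x")
print(f"Storage spending after {YEARS} years: {spend_growth:.2f}x")
print(f"Data per storage dollar: {gap:.1f}x")
```

Under these assumptions, data grows 32x while spending grows by less than 1.5x, so every storage dollar has to stretch across roughly 20 times more data. That is the budget squeeze in a nutshell.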
So what is big data? As one wag put it, big data is all about acquiring, analyzing and interpreting ridiculously huge data sets. The top data drivers include financial transactions, email, imaging data, Web logs and Internet text and documents. One source says big data starts around 30 terabytes (i.e., the equivalent of digitizing 10-15% of the Library of Congress), while others say it’s much larger, ranging from petabytes (1,000 TB) and exabytes (1,000 PB) to zettabytes (1,000 EB) and yottabytes (1,000 ZB). They weren’t kidding about the ridiculous amounts of data.
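If the unit names blur together, a few lines of Python make the ladder concrete. This is only a sketch using the decimal steps given above, where each unit is 1,000 times the previous one; the helper names `terabytes_in` and `humanize_tb` are invented for this example.

```python
# Decimal storage units, each 1,000x the previous one, matching the
# ladder in the text (1 PB = 1,000 TB, 1 EB = 1,000 PB, and so on).
UNITS = ["TB", "PB", "EB", "ZB", "YB"]


def terabytes_in(unit):
    """Return how many terabytes make up one `unit`."""
    return 1000 ** UNITS.index(unit)


def humanize_tb(terabytes):
    """Express a size given in TB using the largest unit that keeps the value below 1,000."""
    value = float(terabytes)
    for unit in UNITS:
        # Stop at the first unit where the number is readable, or at the top of the ladder.
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000


print(terabytes_in("YB"))     # terabytes in a single yottabyte
print(humanize_tb(30))        # the 30 TB starting point mentioned above
print(humanize_tb(30_000))    # a thousand times that starting point
```

A single yottabyte comes to a trillion terabytes, which puts the 30 TB entry point in perspective.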
These massive amounts of data can’t be handled by ‘normal’ processing capabilities. That typically means buying expensive new platforms, servers and storage, and either training existing staff or hiring new staff who can take advantage of big data. A shortage of talent will be another big big-data problem, according to a McKinsey Global Institute report last year. It says that by 2018 the US could face a shortage of 140,000 to 190,000 people with deep analytical skills, as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions.
If big data is complex, expensive and requires a lot of people and skills that aren’t available, then why even think about it? Simple: big data can create big value. But as with all of big data’s predecessors (databases, data warehousing, data mining, data analytics and business intelligence), you need to know what you’re looking for, why you’re looking for it, what it’s worth to you, and how you will take advantage of it BEFORE you start. Otherwise, big data will just be a big waste of money.