
What Big Data’s past tells you about Big Data’s future

Turn the clock back to 1997. Tony Blair had succeeded John Major after a landslide election victory, the Spice Girls had three Number One hit singles, and the television series Teletubbies first aired. Oh yes — and the world first heard the phrase ‘Big Data’.

The context? The realisation that datasets were starting to outstrip not only core memory and local disk storage, but even remote disk storage. So, as the authors of the IEEE paper in question concluded: “We call this the problem of big data.”

And after that first mention of Big Data as a problem, more and more observers came to the same conclusion. In a 1999 paper on data visualisation, for instance, a group of NASA and MIT scientists also used the term ‘Big Data’, stating that:

“Very powerful computers are a blessing to many fields of inquiry. They are also a curse; fast computations spew out massive amounts of data… As more than one scientist has put it, it is just plain difficult to look at all the numbers. And as Richard W. Hamming, mathematician and pioneer computer scientist, pointed out, the purpose of computing is insight, not numbers.”

Big Data as a problem? Big Data as a curse? Not surprisingly, like the Spice Girls and Teletubbies, such views now seem a little dated.

So what’s changed?

Technology, for one thing. Because instead of being the cause of our Big Data ‘problem’, it has evolved to give us an unparalleled ability to process gigantic data sets, as well as create them.

On the desktop, for instance, we’ve gone from 16-bit computing to 32-bit, and now to 64-bit computing.

PC storage, too, has changed immensely. Back in the mid-1990s, 500MB hard drives were state of the art — and far from cheap. Today, the computer on which these words are being typed has a one terabyte drive — some 2,000 times larger — a commodity component obtainable for about £50. A four terabyte drive costs about £120.

 

Increased computer storage has redefined the way we can handle Big Data

 

Away from the desktop, while the mainframe has continued its steady decline, we’ve seen massed banks of state-of-the-art servers in the Cloud, with ever-faster processors coupled to ever-larger storage capacities.

Moreover, in-memory computing means that increasingly large data sets can be analysed in lightning-fast on-board RAM, rather than through slow disk-based I/O processes.

Or, put another way, what used to be a problem has now become something rather more interesting.

Better analytics

Why? Because, almost twenty years on, it’s clear that Big Data has become much more of a practical proposition in terms of analytics and visualisation.

Not simply because technology lets us work more easily with data sets of Big Data proportions, but because we now have analytics tools that we didn’t have twenty years ago.

‘Machine learning’, for example, is now a very practical proposition. What’s that? At its simplest, it means throwing a dataset at a computer and telling it to look for relationships and correlations.

 

We now have access to better, and cheaper, Big Data analytics tools

 

In other words, rather than coming up with a theory and looking to the data for corroboration, we’re now capable of asking the data to suggest the theories in the first place.

The cost of Big Data analytics has sharply reduced, as well. For data scientists in the fields of science, engineering and economics, for instance, there’s ‘R’, an open-source statistical programming language, found in almost every university in the world — and free.
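To give a flavour of what “asking the data for relationships” can look like in practice, here is a minimal sketch in R. It uses R’s built-in mtcars sample dataset purely for illustration (any tabular dataset would do), first scanning for correlations and then fitting a simple model on two of the strongest candidates the scan surfaces:

    # A quick scan for relationships, using R's built-in mtcars dataset
    data(mtcars)

    # 1. Correlate every column with fuel economy (mpg) and rank the results
    correlations <- sort(cor(mtcars)[, "mpg"], decreasing = TRUE)
    print(round(correlations, 2))

    # 2. Fit a simple linear model on two of the strongest candidates
    model <- lm(mpg ~ wt + hp, data = mtcars)
    summary(model)

Run in any standard R installation, those few lines print a ranked list of correlations and a model summary — no licence fee, no special hardware.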

And while some businesses also use R, many more make use of the inexpensive commercial analytics tools offered in conjunction with Cloud servers on a ‘software as a service’ basis.

Put another way, whatever a business’s Big Data challenge is, the cost of the analytics tools required to solve it almost certainly won’t be a problem.

For business, opportunity is knocking

It’s perhaps worth repeating those words. Whatever a business’s Big Data challenge is, the cost of the analytics tools required to solve it almost certainly won’t be a problem.

Why repeat them? Because twenty years ago, when Big Data was first being perceived as a problem, giant datasets weren’t really a business issue. A science issue, yes. A government issue, perhaps. But business? Outside the rarefied world of giant financial institutions, probably not.

How times change. Here at Matillion, it’s increasingly the case that perfectly ordinary customers now have datasets of Big Data proportions — simply because it’s become both possible and affordable to build such datasets.

Meaning that what was a problem has now become an opportunity. And an opportunity that we at Matillion are increasingly helping our customers to exploit, we’re happy to say.

To learn more about the opportunities that Big Data can bring to your business, download our free bumper Ebook below.