“BIG DATA” is irrelevant. Let’s hear it for “LITTLE DATA”

Big Data

Periodically, about once every couple of years, a new buzzy term emerges to try and excite the marketing industry. Let’s face it, there are plenty of would-be copywriters, industry pundits and software firms all vying to coin the “next” concept to bring to a client’s attention, all keen to “launch” or invent the next big “term”. To be expected, I suppose.

Many of these terms, such as “integrated marketing”, “CRM” and “online-offline”, end up in the marketing lexicon being used in different ways by different groups, with various definitions being attributed and debated. Some of them, e.g. “Permission Marketing”, are coined by more visionary pundits, such as Seth Godin, and do seem to describe a genuine shift in consumer behaviour and in how marketing interacts with consumers.

So great, lovely, super marketing terms. Jargon. Agency speak. Words to impress your clients, your boss, your mom. And of course some of them are important, and worthwhile for debate. They exercise the semantics of describing what we do, and some terms are a little more deserving than others. So perhaps we should have a Gartner hype cycle of marketing terms? After all, Gartner have certainly invented a few ways of describing marketing innovation, so why not the terms themselves?

But in 2012 we had “Big data”. Oh, please, this one has been annoying me. In Gartner terms it is certainly in the “trough of disillusionment” for me – it went straight in and is staying there.

Surely this one emerged from the US, where everything has to be BIG to be worthwhile… (I’m sorry, American readers, I’m sure this is just a super-sized stereotype, or perhaps “Not”, as Borat so aptly mistimed.) And the term wasn’t initiated in the marketing industry, but in the fields of science and research, telecommunications and internet search, e.g. Google. According to Wikipedia, in the private sector Walmart apparently handles more than 1 million customer transactions every hour, which are imported into databases estimated to contain more than 2.5 petabytes of data. Apparently this is the equivalent of 167 times the information contained in all the books in the US Library of Congress. And in March 2012, the White House announced a national “Big Data Initiative” that consisted of six Federal departments and agencies committing more than $200 million to Big Data research projects. Must be important then.

Of course, virtually every vendor of analysis software has had to BIG up their ability to handle BIG datasets, obviously coming up with BIG insights to give you BIG profits. The implication, if you read the business and technology magazines, seemed to be that everyone needs BIG data. This is complete codswallop, to use a rather old-fashioned Irish but polite expression. As Erik Sherman notes:

“The term BIG data has become an amorphous catch phrase that covers everything. Real big data involves millions or even billions of data points. We’re talking complicated tasks like predicting the weather, or Google looking for trends among all the search queries it sees day in, day out. That level of data analysis is probably nowhere near what you need for your business. Most decisions are built on small data: dozens or hundreds or maybe thousands of data points.”

Right on, Erik: big data might be “unstructured” with lots & lots of data points, but it is irrelevant for most businesses or organisations. And yes, let’s not forget that most of us in the data industry have been dealing with big(ish) datasets for years. OK, we’re not running the Large Hadron Collider and collecting petabytes of data, but we have been handling a few terabytes of the stuff, or more often gigabytes if we’re honest. Let’s face it, if it got too big for the kit we had, we sampled… remember sampling techniques?? And marketing data analysis is about data reduction. What we are doing is reducing the data, getting rid of “noise” and extraneous information to get to the genuine insight, often on a subset or segment of customer data. The majority of data collected is irrelevant or useless in terms of describing customer behaviour. It is often “little” bits of data that give us clues, and a “small” number of variables that will explain the behaviour in a model and give BIG results.
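
For anyone who has forgotten what sampling buys you, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the basket values are randomly generated rather than real customer data, and the function name `sample_mean_estimate` and the sample size of 10,000 are made up for this example. It simply compares the average computed on the full dataset with an estimate from a simple random sample.

```python
import random
import statistics


def sample_mean_estimate(values, sample_size, seed=0):
    """Estimate the mean of `values` from a simple random sample.

    Returns the sample mean and its approximate standard error
    (ignoring the finite-population correction, so slightly conservative).
    """
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)  # sampling without replacement
    mean = statistics.fmean(sample)
    stderr = statistics.stdev(sample) / sample_size ** 0.5
    return mean, stderr


# Hypothetical data: 200,000 made-up basket values (not real transactions).
data_rng = random.Random(42)
population = [data_rng.lognormvariate(3.0, 0.5) for _ in range(200_000)]

full_mean = statistics.fmean(population)
est_mean, est_err = sample_mean_estimate(population, sample_size=10_000)

print(f"Mean on all {len(population):,} rows: {full_mean:.2f}")
print(f"Mean on a 10,000-row sample: {est_mean:.2f} (+/- {est_err:.2f})")
```

On data like this, the 5% sample should land very close to the full-data figure, with a standard error that tells you how close. Real customer data is messier, so treat this as an illustration of the principle rather than a recipe.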

So even businesses with lots of data should not worry about the hype that has been created, or about the supposed need to invest in large data mining tools. So let’s hear it for “little” data: finding the nuggets that really do deliver value.