John's World: A Beginner’s Guide to Big Data

02:01AM Jul 26, 2014
John Phipps
We’ve all been there. You’re at a glamorous cocktail party when a ravishing young Hollywood starling slithers over to you and asks, "Can you explain all the buzz about ‘big data’ to me?" After checking the room to see where your wife is, you nod sagely, and, even as your mouth says, "Of course," your brain is warning, "Dude, you’ve got nothing."

Luckily, you don’t really have to know much about big data to have an opinion, just like foreign policy or raising kids. So if you can spare a few minutes, I can help you get up to speed by answering a few of your "frantically asked questions" (FAQs).

Where did the term "big data" come from? Wordologists believe the term originated in the imperial Roman tax office about 27.3 AD. The noted accountant Felonius Maximus, upon entering his office one Friday, bemoaned the scrollwork he had to get through to leave in time to watch the gladiatorial doubleheader. 

"Looka datta pile-a parchment on-a my desk!" Assistants who misheard came to associate the term "datta" with masses of numbers.

Today we use the word "data" to mean anything we want, but mostly stuff that we’ve recorded with a "free" app on our "free" phone. Interestingly, the word can be singular or plural, but if you are going for pretentious (and who isn’t?), saying things like, "The data are randomly organized" will radiate an aura of knowledgeableness.

Is it pronounced "dah-ta" or "day-ta"? Despite that semi-historical Latin origin, the proper articulation of the word was set in 1987 when Star Trek Captain Jean-Luc Picard met his second officer. If Patrick Stewart says, "Mr. Day-ta," and that’s good enough for the 24th century, it clearly settles the matter for our backward era. After all, this guy knows tons of Shakespeare. By heart, no less.

"Big data" is …? The term big data can be applied to just about any large collection of measurements, values or idle thoughts (i.e., political polls). For example, if you were to measure your weight every year, that would be just data. If you measure it every second (and there’s a phone app to do this, I bet) that collection of depressing numbers would be big data, often in more than one sense.

How big is "big data"? A wise person (or maybe it was a fortune cookie) once said, "The truth is a powerful weapon." This is why I use it sparingly. So I hope you can handle this. The answer: big data is bigger than you can imagine.

Remember when computers were first discovered by odd people in garages? Crudely carved from wooden components while the Cowsills crooned in the background, these early machines were able to handle thousands of bites/bits/bytes per second, or kilobytes. This meant information came across the green screens about as fast as you could read, even if you used your finger and moved your lips.

Fast forward to today. The computer on your desk or in your back pocket has moved from kilobytes (words) to megabytes (pictures) and gigabytes (movies) to terabytes (3-D live-streams of corn fungus). 

So what’s next? Petabytes, exabytes, zettabytes and yottabytes. (I’ll wait while you Google.) Toldja so! By the way, a yottabyte is defined as a bazillion, gazillion, frazillion bytes.

What is it used for? Big data is mostly used to sell gadgets to store it in. It is also good for finding spurious correlations, such as the one between mozzarella cheese consumption and civil engineering doctorates. (Not sure, are you?) Such mathematical tricks are helpful for confusing economic debates and providing Facebook content.

What is the future of big data? Take this quick test: Can you remember your mother’s phone number? OK, can you remember her name? (No, it’s not "Mom.") Don’t be embarrassed; none of us can. That’s because we’ve outsourced our memory to big data. We could record every single moment of our lives on video, and the data would fit in a device smaller than a Chipotle burrito. As technology advances, that device will only get smaller.

OK, now explain The Cloud. Dude, I’ve got nothing.