Big Data is a Big Deal
October 15, 2014
Everyone is talking about Big Data, but few people really understand it until they hear the stories. Through analysis of vast stores of customer information, UPS determines that it can improve delivery efficiency and safety and lower cost by eliminating all left-hand turns by its drivers. Researchers at a pattern recognition startup learn that if you want to buy a used car in good shape, buy an orange one. In this podcast we explore the Big Data phenomenon and the technologies that underlie its remarkable success and promise.
The New Network Podcast
You’ve probably heard of it, but I’ll bet you don’t know as much as you’d like about Big Data, the phenomenon that is taking the IT world by storm. The truth is, it’s a big deal. And because it will have far-reaching impacts on companies and organizations of all sizes, including small and medium-size companies, we thought it might be a good idea to explain what it is and, just as important, what it isn’t.
Businesses have relied on data, let’s face it, pretty much since before paper was invented. That data represents a linear record of a firm’s relationship with its customers, its suppliers, and its competitors. But because it’s a linear record, the information it makes available is equally linear, meaning that it’s often difficult to draw nonlinear conclusions from it.
If you saw the movie Jurassic Park, you may recall a scene where Jeff Goldblum, the chaos theorist, refers to the phenomenon where a butterfly flapping its wings somewhere in the Amazon causes it to rain in New York City a few days later. This nonlinear effect is similar to the impact of Big Data analytics: the ability to infer insight in ways that would not otherwise be available—or even conceivable.
So let’s define just what Big Data actually is. It’s the accumulation of vast and growing stores of data that are warehoused in a data center (the cloud) and then analyzed using sophisticated tools for insight into what the data may tell us. Much of the data is uncorrelated, meaning that it comes from seemingly unrelated sources, yet analysis reveals that seemingly unrelated things tend to affect each other. For example, traffic jams have a tendency to propagate both forward and backward, not just backward, and no one knows why.
Because of the magnitude of the data we’re talking about, traditional database analysis tools don’t yield useful results, so very sophisticated software tools are employed that work on these massive databases. Of course these databases might be scattered in various data centers around the world—remember, we’re often talking about uncorrelated data, that is, data that isn’t necessarily from the same source, geography, or industry—which means that high-speed networking must be available if the information is to be reachable and analyzable.
The facts with regard to Big Data speak for themselves. Research finds that a majority of companies with a strategy focused on collecting and analyzing their most valuable data best their competitors financially. In fact, according to consultancy IDG, 70 percent of large enterprises and more than half of small and medium businesses have already deployed or plan to deploy Big Data projects, largely because of the profound strategic impact of the information that these projects yield.
It’s also true that Big Data is something of a self-fueling engine. According to surveys conducted with IT professionals, the average amount of data being managed within their organizations is expected to increase by 76 percent within the next 12 to 18 months, and the more data there is to manage, the more it can be analyzed for organizational impact. In fact, nearly half of IT executives say their company CEO directly supports or sponsors Big Data efforts, which serves to highlight the visibility and strategic importance of Big Data initiatives.
So a Big Data environment includes one or more cloud-based data centers, the databases of information housed in those data centers that will be analyzed, an analytics tool to make sense of it all, and the network that connects it all together. The network element is crucial: Without the secure high-speed connectivity provided by companies like Time Warner Cable Business Class, Big Data would not yield the remarkable insights it is becoming famous for, like this one: If you go to a movie that has a name that ends in a number, chances are virtually 100 percent that you won’t like it very much. Don’t ask why; that’s what Big Data analysis tells us. So save your money—rent the DVD.
Here’s another great example. UPS is one of the best users of Big Data analytics in the world today. For years they have studied the methods they rely on to deliver packages, including driver habits, selected routes, and vehicle usage. All of their vehicles are exhaustively instrumented, meaning that virtually every function in every package car has the ability to generate data that can be analyzed. And this is precisely what the IT people at UPS do. After analyzing massive volumes of data, they concluded that they could increase efficiency, lower their overall operational cost, and improve safety by doing one simple thing: eliminating all left-hand turns. That’s right: unless there is simply no other way to get there, you will never see a UPS vehicle turn left. Ever.
Did it work? Well, since 2004 when the study was undertaken, UPS has eliminated millions of miles of unnecessary travel, saved 10 million gallons of fuel, and reduced their CO2 emissions by 100,000 metric tons. That’s the equivalent of removing 5,300 cars from the road for an entire year. I’d call that a success story.
And the successes keep on coming, and not just from large companies like UPS. There’s a common misconception at play that seems to imply that Big Data is only accessible by big business—and that’s just not true. For example, Square, the manufacturer of credit card readers that plug into a smartphone and allow credit card payments to be cost-effectively collected by even the smallest businesses, offers a business analytics tool through its point of sale system, Square Register. Its embedded analytics allow companies to perform very granular analysis of customer purchase histories, and when combined with other information collected during the business day, the system can yield valuable insights into customer behavior.
I suppose that what I find so fascinating about Big Data and its associated analytics is that no one seems to know exactly how it yields the results that it does. What we do know is that if we create a big enough collection of data points to analyze, and use the tools that are available—most of which are open source, by the way—magic happens.
Take Kaggle, for example, a pattern recognition startup. Kaggle researchers performed an exhaustive analysis of car purchase history. One of the things they learned is that if you want to buy a used car and want to make sure that it will be in good shape, buy an orange one. Weird colors tend to be a means of self-expression, and if the previous owner saw the car as an extension of herself or himself, chances are they took better-than-usual care of it. Of course, you’ll have to drive around in an orange car, but hey...
Here’s another interesting Big Data fact. A small African mobile company learned that it could predict impending attacks in Congo because of dramatic increases in the sale of prepaid calling cards. But it wasn’t because people were making calls about what was about to happen. It was because the cards are denominated in U.S. dollars and the local population wanted to have something stable and valuable that they could use as currency when the chaos began. Is that interesting, or what?
My personal belief is that Big Data is the new crude oil, and analytics is the new refinery. Industries are lining up to monetize the promise of this new technology family; and in the same way we derive dozens of products from crude oil, we’ll see similarly diverse derivatives from Big Data. But remember: No network, no Big Data. It’s as simple as that.
I’m Steve Shepard. On behalf of Time Warner Cable Business Class, thanks for listening!
For more information about Time Warner Cable Business Class products and solutions, connect with a local Account Executive. Call 1-855-872-7156.