August 28th, 2015 by Brian Beeler
Key Takeaways From HP Big Data 2015
HP held its third annual big data conference earlier this month in Boston. A broad mix of organizations large and small participated, gaining insights on the future of big data via keynotes from the likes of Facebook, Michael Stonebraker, Nate Silver and many more. There were also over 40 breakout sessions that were driven by customers who were highlighting various projects and war stories. While HP certainly had news at the event with updates to Vertica and a new startup initiative, the conference was very much about execution of big data initiatives more than anything else. Here are a few key takeaways from the conference that shed some light on the future of big data from those in the trenches.
Data Lakes are Marketing Bullshit
That was more or less the point made by Michael Stonebraker, winner of 2014's Turing Award. His comment elicited a cheer from the crowd, who are clearly weary of the marketing messaging being pushed out by the industry. He did not comment on data tsunamis or other meteorological big data phenomena, but it's safe to assume he finds them to be in the BS category too. The bottom line here is that the industry is struggling to put a tidy bow on the concept of big data analytics. To a certain extent it needs to; most organizations are well behind the curve when it comes to making their data actionable. Further, they struggle with mapping a plan to harvest their data, so the visualization of a data lake makes sense. But just getting data into a lake is a very small part of the problem that organizations face, which leads to the next point.
Hadoop is not Big Data
A lot of organizations start their big data journey with a Hadoop instance. While an admirable start, just using Hadoop does not mean you're doing "big data." Ken Rudin, Director of Analytics at Facebook, broke this down in his big data myths talk, where he highlighted the difference between driving impact and merely producing insights.
While the concept of driving impact seems basic, many organizations get wrapped up in technology or in looking at data that doesn't drive change or lead to better outcomes. Driving impact requires balance, though, and leads to the question I asked different people at the event over a dozen times.
How do You Ask the Right Questions?
Organizations can have the best technology, the brightest data scientists and wonderful sets of data, but that does not guarantee success. Asking the right questions is fundamental to driving business impact. To accomplish this, most of the companies I talked to are embedding their data scientists in the business units. This gives them more experience with the issues faced by the business, and gives the business lead direct access to data-centric questions. While the org charts look a little different depending on who you ask, the consensus is that embedding data scientists is the right structure. This structure helps ensure that better questions get asked, which has the side benefit of faster time to meaningful or actionable data.
Many Challenges Ahead
As put together as the leaders in this space are, there's still a lot to be done. Most of the big data puzzle is still not sorted. Sure, the corner pieces are in and much of the border, but most of the pieces are still upside down in a pile. Much of the pain has to do with speed. Moving data sets around, sometimes from legacy systems or systems that weren't built with an eye toward data portability, is a big problem. Simply getting data into an analytics platform in a timely manner requires a lot of work for most businesses. Transport speed is huge: the faster the data gets into the analytics platform, the sooner it can be queried and business plans adjusted. Sales data was the most common thread on this point. Some I spoke with have near real-time access, and many were seeing data less than a day later. Many, though, spoke of the pain of having to wait many days to get a sales drilldown, which in some cases meant it was too late to take action.
For its part, HP has a comprehensive suite of hardware, software and services to help organizations solve these problems, which historically have been the territory of consultancies. By empowering businesses to proactively seek impact from their own data, as well as external sources in many cases, HP is helping organizations drive incremental gains. That may be one of the big misconceptions about the promise of big data. Sure, there are big wins to be found, and maybe something transformational comes out of the data. More likely, though, proper analytics is like a baseball player hitting for average. A regular string of singles is nice and adds up handsomely over time.