Dear Spreadsheet Jockey, Welcome to Big Data

Dear Spreadsheet Jockey,

Thanks for all the hard work over the last twenty years or so. You've been so helpful crunching numbers, reviewing reams of data, and presenting results. Somehow, you've been able to pull together all different types of data sets and analyze them. I marvel at how fast you create pivot tables and work complex formulas. You've made my job easy - I ask a question, you crunch the numbers and send me a report. Buying you a faster computer every year and upgrading your software has been an easy investment.

Unfortunately, times have changed. Waiting a few days for your reports is slowing down the organization. The Sales team needs to see updated results at least daily, not only from the CRM but also from financial results and product performance. The financial team needs to leverage the sales pipeline to provide frequent updates to our forecasts. Customer service wants to see web analytics and sales activity when a customer calls in. You're just not keeping up with the level of requests and reporting needs.

Bob (the IT guy) has been complaining for years about the number of spreadsheets being created, so I asked him to prove it. He showed me dozens - no wait - more like hundreds of copies of our customer data, some with information I've never seen. He says he thinks he can solve many of our data and reporting challenges, but it will require an investment. I asked him for a plan, and he suggested I read and learn about how Big Data technologies are helping other companies, including our competitors.

So I did some poking around. Big Data may be the next frontier in health care. Big Data is helping online travel agencies sift through hundreds of millions of records to find the best travel offers. It's helping general contractors and manufacturers in the construction industry "target and win business where the opportunities are strongest". Just yesterday, President Obama announced a $200M Big Data investment to help six government agencies.

So I hope you will join this team. You know a lot about our customers and our data and have been a trusted partner for many years. But you have to do things differently. You'll have to learn some new tools, and I'm going to have different and higher expectations for what you deliver. I expect to see self-service dashboards, daily updates to reports, and fewer duplicates in our customer database. I'm certain that after you learn the new tools, you'll come back with other recommendations and innovations.

Are you with me?




Top Five Tools of Big Data Analytics

It's a little unfortunate that Big Data became the rallying name for a new generation of analytic capabilities. Unfortunate, because the volume, velocity, and variety of data, the attributes of Big Data, all help define the complexity of growing databases and not the benefits.

The benefits of Big Data come when the organization, its customers, and its partners can derive intelligence, insight, and value from storing, processing, and analyzing data of greater volume, velocity, and variety. So I would propose that rather than the attributes of the data, we focus on the capabilities that will help organizations derive more value from it. Here are some suggested tools of Big Data Analytics:

  • Analytic Visualizations - Well-designed visualizations are the baseline tools for both experienced data scientists and more novice analysts to make sense of data. Visualizations tell a story and help the analyst share what they've learned so that the data "speaks" for itself. A well-designed visualization is far more powerful than a set of charts laid out in a presentation or PDF. The visualization should help the audience see "answers" while giving them views and access to the underlying detailed data.
  • Data Mining Algorithms - If visualizations make humans smarter about data, data mining algorithms make machines more capable of automating the analysis. Clustering, segmentation, outlier analysis, and other algorithms help data scientists find the needles in the haystack or offer mechanisms to drill down into data intelligently. Data mining algorithms need to be designed to handle both the volume and the velocity of Big Data.
  • Predictive Analytic Capabilities - Whereas data mining helps an analyst understand the volume and velocity of data, predictive analytic capabilities combine algorithms and tools that let data scientists complete forward-looking analyses and statements. I'm calling these "capabilities" because the ability to predict requires visualization (to help the scientists see the results), data mining (because you can't predict without using data mining techniques to understand the historical data), and potentially tools to annotate data. In some business settings, a dedicated predictive analytics tool is needed. In others, predictive capabilities are needed within visualization or data mining tools.
  • Semantic Engines - Technologists understand that the "variety" of unstructured data poses its own challenges and requires a different set of tools to parse, extract, and analyze. Semantic engines have to be designed and positioned to give the non-IT organization new tools to mine and extract intelligence from "documents" - a business-friendly way of talking about unstructured data.
  • Data Quality and Master Data Management - Data Quality and MDM are a mix of governance practices, organizational processes, and technology tools to ensure that there is a defined quality and management process around the underlying data. Organizations looking to derive value from Big Data have to take steps to ensure that the quality level is understood and that management processes are in place to maintain and improve it.
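
To make two of these tools a little more concrete, here is a minimal Python sketch, not a recommendation of any particular product. It assumes a small in-memory data set and uses only the standard library. The function names (`find_outliers`, `group_duplicates`), the record fields (`name`, `email`), and the sample figures are hypothetical, chosen only for illustration. The first function is a simple form of outlier analysis based on the median absolute deviation; the second is a basic data quality check that groups customer records that look like duplicates once names and emails are normalized.

```python
from statistics import median


def find_outliers(values, threshold=3.5):
    """Return values whose modified z-score exceeds the threshold.

    The modified z-score uses the median and the median absolute
    deviation (MAD), so a single extreme value does not hide itself
    by inflating the average. The 3.5 cutoff is a common rule of
    thumb, used here as an assumption.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread around the median, nothing to flag
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


def normalize_customer(record):
    """Build a comparison key: lowercase, collapsed whitespace (hypothetical rule)."""
    name = " ".join(record["name"].lower().split())
    email = record["email"].strip().lower()
    return (name, email)


def group_duplicates(records):
    """Group records that share the same normalized name and email."""
    groups = {}
    for record in records:
        groups.setdefault(normalize_customer(record), []).append(record)
    return [group for group in groups.values() if len(group) > 1]


# Illustrative data only
daily_orders = [102, 98, 105, 97, 101, 99, 480]
customers = [
    {"name": "Acme  Corp", "email": "Sales@Acme.com "},
    {"name": "acme corp", "email": "sales@acme.com"},
    {"name": "Globex", "email": "info@globex.com"},
]

outliers = find_outliers(daily_orders)
duplicates = group_duplicates(customers)
print(outliers)
print(len(duplicates), "duplicate group(s) found")
```

At real Big Data volume and velocity, the same ideas would need distributed or streaming implementations and far more careful matching rules; the sketch only shows the kind of question each tool answers.
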
I don't want to downplay the technical challenges in Big Data. There are plenty, starting with Big Data infrastructure (shown as a foundational element in the diagram accompanying this post), and its requirements permeate into the analytical tools. But if Big Data is really going to be the next big thing in technology, we had better focus our efforts on the benefits and value and not just the challenges.

About Isaac Sacolick

Isaac Sacolick is President of StarCIO, a technology leadership company that guides organizations on building digital transformation core competencies. He is the author of Digital Trailblazer and the Amazon bestseller Driving Digital and speaks about agile planning, devops, data science, product management, and other digital transformation best practices. Sacolick is a recognized top social CIO, a digital transformation influencer, and has over 900 articles published at InfoWorld, CIO.com, his blog Social, Agile, and Transformation, and other sites. You can find him sharing new insights @NYIke on Twitter, his Driving Digital Standup YouTube channel, or during the Coffee with Digital Trailblazers.