The benefits of Big Data come when the organization, its customers, and its partners can derive intelligence, insight, and value from storing, processing, and analyzing data of greater volume, velocity, and variety. So I would propose that rather than the attributes of the data, we focus on the capabilities that will help organizations derive more value from it. Here are some suggested tools of Big Data Analytics:
- Analytic Visualizations - Well-designed visualizations are the baseline tools for both experienced data scientists and more novice analysts to make sense of data. Visualizations tell a story and help analysts share what they've learned so that the data "speaks" for itself. A well-designed visualization is far more powerful than a set of charts laid out in a presentation or PDF. The visualization should help the audience see "answers" while giving them views of, and access to, the underlying detailed data.
- Data Mining Algorithms - If visualizations make humans smarter about data, data mining algorithms make machines more capable of automating the analysis. Clustering, segmentation, outlier analysis, and other algorithms help data scientists find the needles in the haystack or offer mechanisms to drill down into data intelligently. Data mining algorithms need to be designed to handle not only the volume of Big Data but also its velocity.
- Predictive Analytic Capabilities - Whereas data mining helps an analyst understand the volume and velocity of data, predictive analytic capabilities are a combination of algorithms and tools that let data scientists complete forward-looking analyses and statements. I'm calling these "capabilities" because the ability to predict requires visualization (to help the scientists see the results), data mining (because you can't predict without using data mining techniques to understand the historical data), and potentially tools to annotate data. In some business settings, a dedicated predictive analytics tool is needed. In others, predictive capabilities are needed within visualization or data mining tools.
- Semantic Engines - Technologists understand that the "variety" of unstructured data poses its own challenges and requires a different set of tools to parse, extract, and analyze. Semantic engines have to be designed and positioned to bring new tools to the non-IT organization to mine and extract intelligence from "documents" - a business-friendly way of talking about unstructured data.
- Data Quality and Master Data Management - Data quality and MDM are a mix of governance practices, organizational processes, and technology tools to ensure that there is a defined quality and management process around the underlying data. Organizations looking to derive value from Big Data have to take steps to ensure that the quality level is understood and that management processes are in place to maintain and improve it.
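
To make two of these capabilities a little more concrete, here is a minimal Python sketch of outlier analysis (from the data mining bullet) and a basic completeness check (one way to make a data quality level "understood"). Everything in it is an assumption for illustration: the function names `mad_outliers` and `completeness`, the sample values and records, and the choice of a median-based "modified z-score" with the conventional 0.6745 scaling constant and 3.5 cutoff. It is one simple technique among many, not a recommendation of a specific tool, and it ignores the volume and velocity concerns a real Big Data pipeline would have to handle.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds `threshold`.

    The modified z-score uses the median and the median absolute
    deviation (MAD) instead of the mean and standard deviation, so a
    single extreme value does not distort the baseline.
    The 0.6745 constant and 3.5 cutoff are conventional defaults,
    assumed here for illustration.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread around the median: the score is undefined,
        # so report nothing rather than divide by zero.
        return []
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


def completeness(records, field):
    """Return the fraction of records where `field` is present and non-empty."""
    if not records:
        return 0.0
    present = sum(1 for r in records if r.get(field) not in (None, ""))
    return present / len(records)


# Hypothetical sample data, invented for this example.
daily_orders = [10, 12, 11, 13, 12, 11, 95]
customers = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 3, "email": "c@example.com"},
    {"id": 4, "email": "d@example.com"},
]

print(mad_outliers(daily_orders))        # [95]
print(completeness(customers, "email"))  # 0.75
```

The median-based score is used here because a plain mean and standard deviation would be pulled toward the very value we are trying to flag. A completeness ratio like the one above is the kind of simple, reportable number that lets an organization state what quality level it is actually working with.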