What Technologies Work Best for Decentralized Data Scientists?

If data scientists, analysts, quants, or BI specialists are in a centralized department, then that group can staff and train its members to support one or more technologies based on business need. Technologies such as data processing, analytics, statistics, visualization, or data mining are good examples. 

But what happens when these resources are scattered across multiple departments? One department may have an expert data scientist, another may have a small group doing internal reporting, and a third might have outsourced its analytic function. If data scientists in the organization are decentralized with different goals, skills, and operating models, can IT still provide a common set of Big Data and analytic tools and services to the organization and support these different functions?

The answer is yes, but decentralization leads to a different set of technologies and IT services. Since different users will have different goals, capability needs, and skills, IT needs a Swiss army knife of data management and analytic technologies and related services.

Self Service BI - The Analytic Swiss Army Knife?

That Swiss army knife has come with new technologies branded as "self service" BI that aim to enable business users - and not IT - to solve many data processing, analytics, or visualization tasks. The software companies developing these technologies recognize that IT can be a bottleneck to solving data challenges and have developed products that take the coding out of data tasks. With these tools, you can aggregate data sets, perform joins, cleanse data, map data, perform analytical calculations, identify trends, seek outliers, and develop dashboards - all with minimal coding!

Data scientists working in different departments can make great use of these tools. Imagine a data scientist in marketing who blends the marketing database with a social networking feed to develop insights on prospects. Consider someone in sales ops who develops dashboards for sales directors, making it easier to understand and act on the sales pipeline. A financial analyst can develop common reporting dashboards and department-specific reports.

But these tools, deployed without defined practices and governance, will create a new generation of potential data silos, bury analytical calculations, create another form of spreadsheet jockey, or produce too many dashboards. Users will create workarounds to performance issues or duplicate data in order to make today's analysis more convenient. They might expose sensitive data to too many people in the organization or violate privacy or compliance constraints when moving or storing data.

The Role of IT in Self Service BI Programs

So with these great tools come even greater responsibilities. For brave technologists and CIOs embracing a decentralized data strategy, the task does not end with identifying talent, selecting and implementing "self service" technologies, and training. IT must define new data practices and governance, clearly identifying the responsibilities of business users and demonstrating its value by providing a matching set of data services.

Where are workbooks versioned? How are analytical calculations published, validated, and tested? How does one request assistance integrating new data or help solving a query performance issue? How are new tools evaluated and upgrades tested? What types of documentation are required, where are they published, and how often are they updated? How is security enabled? How does the organization measure data quality? What visualization standards will make it easier for enterprise users to leverage data and dashboards in their decision making?

These questions need technology solutions and service definitions. The CIO needs to define a new set of data management practices and lead the organization to be more data driven.
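To make one of those questions concrete, here is a minimal sketch of how a team might start measuring data quality. It is only an illustration, not a standard: the `completeness` and `duplicate_rate` functions, the field names, and the sample `prospects` records are all hypothetical, and a real program would agree on its own metrics and thresholds.

```python
from typing import Any, Iterable


def completeness(rows: list[dict[str, Any]], required: Iterable[str]) -> dict[str, float]:
    """Return, per required field, the share of rows with a non-empty value."""
    total = len(rows)
    result: dict[str, float] = {}
    for field in required:
        # A value counts as missing if it is absent, None, or an empty string.
        filled = sum(1 for row in rows if row.get(field) not in (None, ""))
        result[field] = filled / total if total else 0.0
    return result


def duplicate_rate(rows: list[dict[str, Any]], key: str) -> float:
    """Return the share of rows whose key value was already seen earlier."""
    seen = set()
    dupes = 0
    for row in rows:
        value = row.get(key)
        if value in seen:
            dupes += 1
        else:
            seen.add(value)
    return dupes / len(rows) if rows else 0.0


# Hypothetical sample records, for illustration only.
prospects = [
    {"id": 1, "email": "a@example.com", "region": "East"},
    {"id": 2, "email": "", "region": "West"},
    {"id": 2, "email": "b@example.com", "region": None},
    {"id": 3, "email": "c@example.com", "region": "East"},
]

print(completeness(prospects, ["email", "region"]))
print(duplicate_rate(prospects, "id"))
```

Simple ratios like these are easy to publish alongside a dashboard, so business users and IT can look at the same numbers when deciding whether a data set is ready to use.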

I suspect that as organizations become more data driven, they will need more data science skills, those skills are more likely to be deployed across the organization, and self-service BI programs are therefore more likely to be established.


  1. Isaac, another great post and very timely. We are well on our way to establishing self-service, decentralized BI in our organisation and just yesterday we were discussing the need to improve our data management practices (I like that phrase better than data governance).

    The speed at which organizations have to execute and the sheer volume of data today make it imperative to get IT out of the funnel. Provision, manage, secure - those are the tasks of IT. Consume, analyze and predict - those are the tasks outside of IT.


    1. Thanks Jeff! As you roll out new business tools and capabilities, ask yourself where there is a risk of behaviors that create data messes. Then look to either implement governance, practices, tools, or behaviors to mitigate. Good luck!




About Isaac Sacolick

Isaac Sacolick is President of StarCIO, a technology leadership company that guides organizations on building digital transformation core competencies. He is the author of Digital Trailblazer and the Amazon bestseller Driving Digital and speaks about agile planning, devops, data science, product management, and other digital transformation best practices. Sacolick is a recognized top social CIO, a digital transformation influencer, and has over 900 articles published at InfoWorld, CIO.com, his blog Social, Agile, and Transformation, and other sites. You can find him sharing new insights @NYIke on Twitter, his Driving Digital Standup YouTube channel, or during the Coffee with Digital Trailblazers.