Tuesday, October 28, 2014

The Agile Data Organization - Balancing Responsibilities in Data Science Programs

If you've read this blog or have seen me speak at a conference, then you know I am a strong proponent of self-service BI programs. I've posted on principles of self-service BI programs, attributes of data driven organizations, and how to avoid data landfills, among many other big data topics, all aimed at helping business teams compete successfully and drive decisions with data.

But success isn't driven by technologies, data practices, or the value of the underlying data alone. It is people and organizational structure that truly drive success, and yet this is where I see many organizations make classic missteps. The problem is in balancing responsibilities: deciding who performs which steps in the data management practices and who owns which decisions.

Three classic mistakes

Here are some of the missteps some leaders and organizations make when considering how to manage big data or self-service BI programs:

  • An overreaching business team that tries to cut IT out of all or the majority of data management practices. In other words, data scientists on business teams trying to turn "self service" into "complete control".

  • An overgoverning IT team that tries to provide technologies and identify structured business practices for every step, from data gathering to processing to delivery.

  • An overzealous PMO that tries to identify and label every part of the process and formally assign responsibility and decision making before the practice is in place and business value determined.

Hopefully you can visualize what's happening here. If you elect to be the overzealous PMO, you have a lot of up front work to define structure, process, roles, and responsibilities. If you choose not to predefine a structured practice with roles and responsibilities, then the organization will evolve its practice through experimentation and attempts to provide value. This is generally a good "agile" evolution; however, it can lead to an imbalance depending on who has more organizational power and control. Undisciplined business teams with little IT participation can lead to the first scenario, and an overly controlling, "technology first" approach yields the second scenario.

What's the Solution To Getting A Balanced Business and IT Data Organization?


First and foremost, organizations need to recognize that this is not a unique problem to self service BI or data science programs. Agile product and software delivery teams are almost always cross-functional between Business and IT. The heart of agile is the product owner managing a backlog of features and enhancements, defining minimally viable solutions, working with IT on implementation scenarios, and prioritizing planning and development stories. Strong agile teams also have mechanisms to express and prioritize technical debt, larger business investments, and more significant infrastructure changes. 

The same practice can be applied to Agile Data Organizations, except that instead of prioritizing features, organizations look to prioritize big data questions. What questions provide value to stakeholders and customers that are worth answering? How do we attribute value and estimate feasibility on answering the question? How do we factor in other work such as loading in new data sets, data cleansing efforts, or improvements in data processing?
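To make the idea of a question backlog concrete, here is a minimal sketch of one way to rank data questions alongside supporting work such as data loads and cleansing. The 1-10 value and effort scores, the value-over-effort ratio, and the sample items are all assumptions for illustration, not a prescribed scoring method.

```python
from dataclasses import dataclass


@dataclass
class BacklogItem:
    title: str
    kind: str    # e.g. "question", "data load", "cleansing", "processing"
    value: int   # assumed 1-10 score agreed with stakeholders
    effort: int  # assumed 1-10 feasibility/effort estimate from the team


def priority(item: BacklogItem) -> float:
    """Simple value-over-effort ratio; higher means do it sooner."""
    return item.value / item.effort


def prioritize(items: list[BacklogItem]) -> list[BacklogItem]:
    """Return the backlog ordered from highest to lowest priority."""
    return sorted(items, key=priority, reverse=True)


if __name__ == "__main__":
    # Hypothetical backlog mixing questions with enabling data work.
    backlog = [
        BacklogItem("Which prospects respond to social campaigns?", "question", 9, 6),
        BacklogItem("Load the new prospect list", "data load", 8, 2),
        BacklogItem("Deduplicate customer records", "cleansing", 5, 2),
    ]
    for item in prioritize(backlog):
        print(f"{priority(item):.1f}  [{item.kind}] {item.title}")
```

Putting questions and enabling work in the same list mirrors how agile teams rank features next to technical debt, so the trade-offs stay visible to both business and IT.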

The next step is to get a team working together on discovery efforts. Once a multidisciplinary group understands priorities, there is a stronger likelihood that they will work together and disregard organizational boundaries and responsibilities.

Want to get started? See my related post on agile leadership practices to help data scientists.

But that's the start. There are some fundamental differences between software and data analytics that also contribute to the organizational discord. More to come!



Monday, October 20, 2014

Five Takeaways from Mobile Enterprise Boston

Last week I attended M|Enterprise Boston, a conference that brought together technologists ahead of the curve in mobile application development, IT executives looking for best practices on mobile device management in large enterprises, and leaders looking to help their business gain a competitive edge by developing differentiating mobile capabilities.

My takeaways -

  • Use Hackathons to get developer adoption - Mahesh Bala of Snyder Electric scheduled a hackathon to get developers across the globe to try out his mobile development platform. It's a brilliant idea for two reasons. First, for developers who were already developing mobile apps in his organization, the hackathon provided a venue to learn, tinker, and hopefully buy into a standard development tool and methodology for developing future applications. Second, for more novice developers, it was an opportunity to learn a new skill and gain confidence to develop mobile technologies. Given how hard it is to get a decentralized group of developers to adopt a development standard and how important it is to grow an enterprise's mobile development talent pool, this approach is simply brilliant. In addition to the innovation developed at the hackathon, I am certain Mahesh got valuable feedback from the developers on where to make platform improvements.

  • Use Genius bars to help users - Apple's genius bars are a huge success in getting its users everything from basic support to important training and problem solving. Why not use the same approach in the enterprise? Brian Katz @BMKatz talked about his enterprise's approach to setting up internal genius bars to help its users fix issues and learn how to better use their mobile devices. Smart to be present and make it convenient for users to find technologists rather than wait or hope that they dial into Support.

  • Day in the life of a user - There was a good amount of discussion on the importance of understanding user needs and workflow before designing applications. The best and simplest advice I heard - and apologies for not remembering the source - was to have members of the team walk in the shoes of their target users and experience a day in their life. Why? Because for mobile applications to be useful and used, they have to provide significant benefits at the right time and place, and a good way to understand users' needs is to experience them directly.

  • Agile development of mobile applications was debated politely. Many participants stressed the importance of identifying user personas, needs, and workflows and of designing the user experience, while others were more vocal on agile principles. The two are not mutually exclusive, and all agreed that mobile app development should target a minimal viable product, but one that is good enough so that users don't download, disappoint and drop.

  • Mobile analytics is the key to understanding user behaviors and tuning mobile applications, and is possibly more important than web analytics. As Adrian Bowles @AJBowles put it so well, the intersection of mobile and analytics is being "aware" so that the app is always on and collecting data and "everywhere" the user goes you have the ability to provide value and capture insights. The combination of mobile with analytics, assuming strong privacy considerations, is a strategic differentiating tool.


Monday, October 06, 2014

The Future of Works Starts with Fixing Bad Meetings

How do you manage meetings so that they are productive and complete with documented decisions and prioritized follow up tasks?

This is a big but important topic. A lot of people's time and a corporation's spend is burned in meetings, and most people can recount far more bad meetings that were a waste of time than meetings that had tangible outcomes.

Why is it that executives with years of experience and advanced degrees struggle with the simple task of gathering the right number and mix of people, establishing an agenda, managing the meeting time, and ensuring the proper documentation is disseminated?

I can hear some people saying that agenda-less meetings are also important. Blue sky meetings? Catch up sessions after a long break? Agile standup meetings that have a defined protocol? I agree that there are many meetings that may not require formal definition, but still require some structure. 

Is the Problem People or Technology?


Today, the cost of setting up a meeting is relatively low. Jump on Microsoft Outlook, identify participants, find room on their schedules, book a meeting room, and add a one line Subject. Most of the work is tied to answering who, when and where - very little on the what! Microsoft wanted to make it easy for business users to schedule meetings, so the focus is on logistics and very little substance is required.

So what if we used technology and implemented some rules?

  • Every meeting requires an agenda - either a description or ideally a schedule of topics
  • Scheduling a meeting requires identifying two roles - a Leader, the person in charge of the meeting's agenda and outcomes, and a Scribe, the person required to document decisions and follow up tasks.
  • Meeting costs calculated - Most meetings have too many people invited. I'd prefer implementing some form of governance, like meetings can't have more than eight people without an approval, but this may not be practical. Instead, what if the cost of the meeting was transparent to its leader? Even if the cost is calculated off of a simple flat hourly rate for all employees, publishing the cost will give the leader a sense of accountability.
  • Meetings require check in - participants check in when they arrive at a meeting. Better yet would be to install beacons in conference rooms to automatically record check ins. Dialing in or using a virtual meeting room? These tools could be configured to automatically log the check ins.
  • Meeting outcomes are documented - and collected by the Scribe in a centralized tool that captures decisions and assigns follow up tasks. By definition, the Scribe is automatically scheduled time after the meeting to complete this documentation. On completion, participants are automatically emailed a link to the finalized meeting report that includes its agenda, participants, decisions, and follow-ups.
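As a rough illustration of how a scheduling tool might enforce the first three rules, here is a minimal sketch. The `MeetingRequest` type, the flat hourly rate of 75, and the function names are hypothetical; only the agenda, Leader and Scribe requirements, the eight-person threshold, and the flat-rate cost idea come from the rules above.

```python
from dataclasses import dataclass

# Assumed placeholder values for illustration only.
FLAT_HOURLY_RATE = 75.0
MAX_ATTENDEES_WITHOUT_APPROVAL = 8


@dataclass
class MeetingRequest:
    subject: str
    agenda: str
    leader: str
    scribe: str
    attendees: int
    duration_minutes: int


def meeting_cost(request: MeetingRequest, hourly_rate: float = FLAT_HOURLY_RATE) -> float:
    """Cost = attendees x hours x one flat hourly rate for all employees."""
    return request.attendees * (request.duration_minutes / 60) * hourly_rate


def scheduling_issues(request: MeetingRequest) -> list[str]:
    """List the rules a meeting request breaks; an empty list means it can be booked."""
    issues = []
    if not request.agenda.strip():
        issues.append("agenda is required")
    if not request.leader.strip():
        issues.append("leader is required")
    if not request.scribe.strip():
        issues.append("scribe is required")
    if request.attendees > MAX_ATTENDEES_WITHOUT_APPROVAL:
        issues.append(f"more than {MAX_ATTENDEES_WITHOUT_APPROVAL} attendees needs approval")
    return issues


if __name__ == "__main__":
    request = MeetingRequest("Q4 planning", "", "Dana", "", attendees=10, duration_minutes=90)
    print(f"Estimated cost: {meeting_cost(request):.2f}")
    for issue in scheduling_issues(request):
        print("Cannot schedule:", issue)
```

Showing the estimated cost to the leader at booking time, rather than blocking the meeting outright, keeps the rule closer to transparency than to governance.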

But It's People...

While this technology helps, it's people's behaviors that also need to change. Some of these tools are expensive to implement (beacons) and can easily be circumvented. So this will require collective participation, collaboration, and a little bit of self policing to make a transformation successful.

Monday, September 29, 2014

What Technologies Work Best for Decentralized Data Scientists?


If data scientists, analysts, quants, or BI specialists are in a centralized department, then that group can staff and train its members to support one or more technologies based on business need. Technologies such as data processing, analytics, statistics, visualization, or data mining are good examples. 

But what happens when these resources are scattered across multiple departments? One department may have an expert data scientist, another may have a small group doing internal reporting, and a third group might have outsourced its analytic function. If data scientists in the organization are decentralized with different goals, skills, and operating models, can IT still provide a common set of Big Data and analytic tools and services to the organization and support these different functions?

The answer is yes, but decentralization leads to a different set of technologies and IT services. Since different users will have different goals, capability needs, and skills, IT needs a Swiss army knife of data management and analytic technologies and related services.

Self Service BI - The Analytic Swiss Army Knife?


That Swiss army knife has come with new technologies branded as "self service" BI that aim to enable business users - and not IT - to solve many data processing, analytics, or visualization tasks. The software companies developing these technologies recognize that IT can be a bottleneck to solving data challenges and have developed products that take the coding out of data tasks. With these tools, you can aggregate data sets, perform joins, cleanse data, map data, perform analytical calculations, identify trends, seek outliers, and develop dashboards - all with minimal coding!

Data scientists working in different departments can make great use of these tools. Imagine one in marketing who can blend the marketing database with a social networking feed to develop insights on prospects. Consider someone in sales ops who develops dashboards for sales directors, making it easier to understand and act on the sales pipeline. A financial analyst can develop common reporting dashboards and department-specific reports.

But these tools, deployed without defined practices and governance, will create a new generation of potential data silos, bury analytical calculations, create another form of spreadsheet jockey, or produce too many dashboards. Users will create workarounds to performance issues or duplicate data in order to make today's analysis more convenient. They might expose sensitive data to too many people in the organization or violate privacy or compliance constraints when moving or storing data.

The role of IT in Self Service BI Programs


So with these great tools come even greater responsibilities. For brave technologists and CIOs embracing a decentralized data strategy, the task does not end with identifying talent, selecting and implementing "self service" technologies, and training. IT must define new data practices and governance, clearly identifying the responsibilities of business users and demonstrating the value of IT by providing a matching set of data services.

Where are workbooks versioned? How are analytical calculations published, validated, and tested? How does one request assistance integrating new data or help solving a query performance issue? How are new tools evaluated and upgrades tested? What types of documentation are required, where is it published, and how often is it updated? How is security enabled? How does the organization measure data quality? What visualization standards will make it easier for enterprise users to leverage data and dashboards in their decision making?

These questions need technology solutions and service definitions. The CIO needs to define a new set of data management practices and lead the organization to be more data driven.

I suspect that as organizations become more data driven, more data science skills will be needed, those skills are more likely to be deployed across the organization, and therefore self-service BI programs are more likely to be established.


Monday, September 22, 2014

Five Data Management Practices IT Needs to Better Support Data Driven Organizations


Last week, I posted 5 Agile Leadership Practices Where CIOs Can Help Data Scientists to provide solutions to messy data, data landfills, data silos, and other outdated data practices that lead to data issues. That post covered cultural practices and organizational principles that make agile teams and organizations successful and how to transfer them to a data science or analytics practice.

This post follows up on the technology side and what data management IT practices and services are essential to establishing or scaling a big data analytics program.

My focus is on scaling the organization's ability to train or hire new data scientists, introduce more analytical capabilities, improve data quality, and aggregate new data.

  • Provide collaboration tools and change management practices - Almost nothing frustrates me more than seeing a complex spreadsheet being emailed between colleagues with the intent to collaboratively edit it. The sender gets back multiple versions of the original spreadsheet and has the arduous task of merging them. There are far better ways to work on documents together or to share access to them, including Office 365, Google Drive, Sharepoint, Jive, or Box. This isn't a technology issue today - it's a training issue, and getting business users to phase out behaviors that create duplicate data and cumbersome (email driven) workflows requires ongoing participation of IT in select business processes to help foster change.

  • Proactively monitor database performance - No one is happy when a query is slow, a dashboard takes too long to load, or there is a delay in processing a data feed. Not happy is putting it mildly, more like furious and frustrated. What can IT do? Be proactive! Monitor and track database query performance and dashboard load times to know that performance is degrading before users know. Track data load processes and define operational practices to address processes that are running behind. Best yet, leverage cloud instances and automate adding or shrinking capacity based on user activity and performance measures. 

  • Document databases - Relational diagrams, data flows, defined calculations, data dictionaries, database connection parameters - how many of these do you have documented across critical databases in simple formats that business users - not DBAs - can consume? How many of these are in formats that make it easy to update and maintain? Is the practice defined so that documentation is updated proactively? Organizations that aim to increase the number of data scientists or other analytical capabilities need documentation, and ideally tools for documentation, in order to scale the practice.

  • Provide Data Warehouse and ETL Services - Users can open a help desk ticket to procure new software, get help with remote access, or get support on using an enterprise application. Are database services as well defined? If a user just received a large spreadsheet, can they request support to load it into a database? If a new dashboard is running slow, can they get help tuning the data model, reviewing the query, or getting indexes built? If there is a new prospect list Marketing wants to leverage, is there a defined practice to connect to the source and load the data in? To be competitive, IT departments need to take steps to transform commonly requested data practices into business-as-usual (BAU) services.

  • Define data quality measurement practices - The analysts and data scientists working with data have to make the best of the data's quality. Sometimes that means ignoring issues; other times they will create complex formulas and other operations to cleanse data. Vocal analytical teams will highlight data issues so that there is a better chance that they can be addressed earlier, where the data is collected or processed. What can IT do? IT can help automate queries and publish data quality metrics. How many bad emails have come in from different marketing lists? Which sales people are entering the least prospect metadata? What are the primary sources of duplicate records? The IT team can also review and recommend data quality tools that enable data stewards to develop cleansing rules and handle exceptions that require manual corrections.
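As one hedged example of what an automated data quality metric could look like, the sketch below counts bad emails and duplicate records per source list. The record layout (`source` and `email` keys), the simple email pattern, and the sample data are assumptions for illustration; a real program would likely run equivalent queries against the actual database or use a dedicated data quality tool.

```python
import re
from collections import Counter

# Deliberately simple pattern for illustration; real email validation is more involved.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def bad_emails_by_source(records):
    """Count records per source whose email is missing or malformed."""
    counts = Counter()
    for record in records:
        if not EMAIL_PATTERN.match(record.get("email") or ""):
            counts[record["source"]] += 1
    return dict(counts)


def duplicates_by_source(records):
    """Count records per source whose email repeats one already seen (case-insensitive)."""
    seen = set()
    counts = Counter()
    for record in records:
        key = (record.get("email") or "").strip().lower()
        if not key:
            continue  # blank emails are reported as bad, not as duplicates
        if key in seen:
            counts[record["source"]] += 1
        else:
            seen.add(key)
    return dict(counts)


if __name__ == "__main__":
    # Hypothetical records from two marketing lists.
    records = [
        {"source": "webinar", "email": "ann@example.com"},
        {"source": "webinar", "email": "not-an-email"},
        {"source": "tradeshow", "email": "Ann@Example.com"},
        {"source": "tradeshow", "email": ""},
    ]
    print("Bad emails:", bad_emails_by_source(records))
    print("Duplicates:", duplicates_by_source(records))
```

Publishing counts like these on a schedule gives data stewards a trend line per source, which makes it easier to push fixes back to where the data is collected.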

Again, these data management practices are primary ones to invest in if scaling a data science or analytics practice. I can discuss data platforms, architecture, infrastructure/cloud scaling, data security and other technology areas in future posts.


Tuesday, September 09, 2014

5 Agile Leadership Practices Where CIOs Can Help Data Scientists

I've compiled a number of posts on data landfills and other bad data practices and have made a commitment, at least on Social, Agile, and Transformation to begin providing solutions.

I've always felt that other disciplines would benefit from well established technology practices. Agile practices have enabled software development teams to find sponsors, prioritize work, change the culture, ensure work gets done, and market their accomplishments. I think data scientists face similar challenges and should benefit from many agile practices that have helped transform software development organizations.

I say many, not all, because while software development is often a collaborative practice performed by teams, that isn't always the case in data analytics work. Data scientists may not be in the same organization (team) and are often working individually or in pairs on different analytics. So while many agile practices are relevant to data science work, they have to be adapted to the nature of how this work gets done.

Also, in this post I've started with the leadership practices and might cover management practices in a follow up post. Key agile leadership practices are below:

  • Sponsor work - Data scientists, data geeks, quants, data analysts, BI specialists - they go by different names in different organizations, but business leaders don't always know how to best engage their capabilities or services. The CIO can lead the way by sponsoring analytics projects or drawing attention to a team's or individual's capabilities. The CIO also has access to the organization and can help connect departments that have high value data analytics work and are ready to partner with or hire data scientists. Sponsoring the work begins to establish an "Owner" role, similar to an agile product owner role, that can define a vision, articulate business value, and prioritize work.

  • Address the culture - Becoming a data driven organization is not just about having data scientists; it requires a commitment top down and bottom up to leverage data in decision making. This is often a culture change that requires leaders to educate the organization and find ways to align on simple practices. One of them is to educate the organization to ask questions. Agile is not just a process - it's a culture change that requires teams and organizations to think agile.

  • Establish practices for prioritization - Prioritization is a key practice for agile technology teams that have to align their efforts on a product release or development sprint to features and fixes that provide the highest business value. Data scientists face the same challenge in determining what questions to answer or analytics to prioritize. Leveraging agile practices and tools to help make the data scientists' workload transparent and establishing practices to prioritize work is a good place for CIOs to add value. 

  • Review results and ask questions - Agile development teams will demo their work after the sprint and answer questions from sponsors. Data scientists would benefit from adopting a similar practice by scheduling analytics reviews where they can showcase a visualization, tell a story, and suggest follow up work. CIOs can help by promoting these sessions, attending, participating, and asking good questions.

  • Get out of the way - Agile has its self organizing principles, enabling teams to have some authority around how they organize work to get things done. Data scientists also need a little bit of freedom to be who they are - scientists. Sometimes that means blazing a trail in new areas - new technologies, new data sources for example - to determine if they are useful to get a job done. Sometimes that means creating some workarounds, or creating "data processing debt" (more on this in another post, but this is the data analogy to technical debt) in order to get a job done on time.

While I don't think these are new concepts, I haven't heard too many data scientists and their managers describe their culture or work with these practices. Similarly, while CIOs are more often consumed by Big Data platforms, I rarely hear them talk about aiding data scientists with basic practices. Common ground?

Friday, September 05, 2014

Killing Bad Data Practices - Acknowledging The Problem is Half The Battle

I posted on LinkedIn earlier this week The Big Data Challenges All Organizations Face, summarizing some symptoms of and solutions to siloed databases and ungoverned data practices. If you work on data, manage the databases, or rely on data to make decisions, then you probably understand bad data issues and can relate to some of the symptoms:

  • Do you email spreadsheets between coworkers to edit and review?
  • Are there only a select few people in the organization capable of pulling or interpreting data out of core systems because of different data quality issues?
  • Do presentations provide insights backed by data sources and assumptions?
  • Does it seem like you have hundreds of reports and dozens of dashboards but none of them suit your day to day needs?
  • Do you start a new analysis by cutting a new data set, or are you able to leverage defined tools to connect to predefined data repositories?

My solutions involve better governed self service BI programs, partnering with departments that have critical data needs like the CMO and marketing, and getting alignment on the data platforms needed for growth.

Need some examples? I'm sure you can cite some of your own, but these Excel horror stories collected by the European Spreadsheet Risk Interest Group should frighten any data scientist, data architect, or data driven business executive.

My next posts on this subject will be solutions focused. Remember, acknowledging there is a problem and recognizing its impact is half the battle.




