Data science does not productize itself. Many other software products can easily move from brainstorm to prototype to market. That path is a straight, obvious line. Moving from a research project to a fully trained math model to market is a process as complex as the research itself.
New roles are emerging around data science. One of the most important is the Data Science Product Manager. This role is the link between research and ROI. Companies from Facebook to seed-stage startups are putting heavy emphasis on results (productization) from their data science efforts. That is because, while the potential of data science and machine learning is very promising, the number of projects that make it into production is disappointing.
A Mix of Skills
A product manager (PdM) is typically assigned a product line and tasked with growing the profitability of that line. In this case, the PdM is assigned a technology and tasked with growing the profitability of technical applications across product lines. The traditional role requires product expertise so, as you might have guessed, the data science PdM needs technical expertise.
That is not to say this person is or was a data scientist. It is more important for them to easily identify the kinds of business and technical challenges that can be solved with data science or machine learning. That means their skills need to include both a strong understanding of math modeling and a deep familiarity with prior applications. For example, if a product line has an image recognition component, the data science PdM would need to know that convolutional neural networks (CNNs) have been effective for these types of business problems. They would not need to know about the latest advances with generative adversarial networks (GANs) or how to implement a CNN. They need enough knowledge to take the right kinds of problems to their data science team.
The other side of that coin is the ability to translate the solutions the data science team comes up with back to the stakeholders and executive decision makers. When the lead data scientist makes a recommendation to use GANs for image classification, the data science PdM needs to be able to read, understand, and translate the supporting research that accompanies the recommendation. That last part, translating, is a big piece of the skill set. Being able to translate research into a presentation that non-technical audiences can use to make go/no-go decisions is a lot harder than it sounds.
A Facilitator and Communicator
Think of the data science PdM as an expert translator when it comes to data science knowledge and business needs. A good PdM makes a compelling case to senior leadership for why a project should be done. That includes everything from market assessments to budgeting and even where and how it fits on the product roadmap. All of that takes buy-in.
What makes getting a go decision so difficult for data science projects is the nature of the research cycle. Data science is not a "small r, big D" process like most software development projects. Research and development play equal parts. The research cycle is difficult to fit into a typical project and product management paradigm. Research can fail at any stage. It can require extensions to explore unexpected discoveries or additional lines of research. Its output is a few answers but also a lot of questions to explore in future work. From the researchers’ perspective, research is never done. Businesses, however, run on producing tangible results at the end of a project.
That puts business needs and research needs in conflict with each other at times. The data science PdM needs to be able to manage and focus the research process while also managing the oversight and expectations of senior leadership. That drives the need for an expert facilitator and communicator.
A data science PdM creates appropriate gates during the research process. The gate reviews are where the data scientists sit down with senior leadership for a presentation and discussion of results from the last iteration. These meetings are most productive with a strong PdM translating business needs to data scientists and research to stakeholders.
Requirements planning and creation are other areas where the data science PdM needs to be a strong translator. Data science product requirements are a different breed because of the nature of the models. These models need clean inputs to generate the expected outputs. Customer research needs to be done to assess what level of accuracy is acceptable, as well as which failure cases are expected and which ones will not be tolerated. A clean, predictable data pipeline is another critical success factor. Navigating this minefield ahead of time requires expertise in taking data science products into production.
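To make that kind of requirement concrete, here is a minimal Python sketch of how an accuracy target and the tolerated and untolerated failure cases might be written down and checked at a review. Everything in it is hypothetical and for illustration only: the `ModelRequirements` class, the `review` function, the threshold, and the failure case names are assumptions, not part of any particular team's process.

```python
from dataclasses import dataclass, field


@dataclass
class ModelRequirements:
    """Hypothetical requirements a PdM might gather from customer research."""
    min_accuracy: float
    tolerated_failures: set = field(default_factory=set)
    blocked_failures: set = field(default_factory=set)


def review(requirements, accuracy, observed_failures):
    """Return the reasons a model misses its requirements (empty list = pass)."""
    problems = []
    if accuracy < requirements.min_accuracy:
        problems.append(
            f"accuracy {accuracy:.2f} below required {requirements.min_accuracy:.2f}"
        )
    for failure in sorted(observed_failures):
        if failure in requirements.blocked_failures:
            problems.append(f"untolerated failure case: {failure}")
        elif failure not in requirements.tolerated_failures:
            # Not yet classified by customer research, so flag it for discussion.
            problems.append(f"unreviewed failure case: {failure}")
    return problems


# Illustrative values only.
reqs = ModelRequirements(
    min_accuracy=0.90,
    tolerated_failures={"blurry_image"},
    blocked_failures={"wrong_product_shown"},
)
print(review(reqs, 0.93, {"blurry_image"}))
print(review(reqs, 0.85, {"wrong_product_shown"}))
```

The point of the sketch is that a high accuracy number alone does not pass the review. A model can clear the accuracy bar and still be blocked by a failure case customers said they will not tolerate.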
From Prototype to Production
So many prototypes fail here. As I said in the intro, data science does not productize itself. The prototype is a far cry from production-ready. A model that has been trained to 93% accuracy does not simply get deployed and work like a charm. The PdM needs experience taking data science prototypes and models into production.
The biggest reason that prototypes fail is that they do not work the way users expect them to. Non-technical users see models as a black box. In many cases they are not aware that there is a model operating in the background. From a product perspective, the data science is all-important. From a user perspective…not so much.
The data science PdM needs to be able to build a productization plan that optimizes user trust and utility. Again, the PdM is a translator. They translate outputs into a format that provides value to the end user. Everything from accuracy to visualization and interface design comes into play here. There is no classroom or educational equivalent. Prior experience taking data science products to market is required.
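As a small illustration of translating model output into something a user can act on, the Python sketch below turns a raw label and confidence score into a plain-language message. The function name `present_prediction`, the 0.8 threshold, and the wording of the messages are all assumptions made for this example. In practice those choices would come out of customer research.

```python
def present_prediction(label: str, confidence: float,
                       show_threshold: float = 0.8) -> str:
    """Turn a raw (label, confidence) pair into a message for a non-technical user."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= show_threshold:
        # Confident enough to show the answer without exposing the raw score.
        return f"This looks like: {label}"
    # Below the threshold, admit uncertainty rather than risk eroding user trust.
    return "We are not sure about this one. Please review it manually."


# Illustrative values only.
print(present_prediction("running shoe", 0.93))
print(present_prediction("running shoe", 0.55))
```

The user never sees the model or its score. They see either an answer or an honest request for help, which is one way to trade a little utility for trust.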
The data science PdM is a strategy-heavy, semi-technical role. They need to translate business needs into requirements, research into go/no-go decisions, and prototypes into products customers will stand in line for. That is a tall order for one person, but we have entered a business reality that demands strong skill sets to turn data science potential into revenue. Without a quality data science PdM, projects languish in endless research or flop in the transition from prototype to production.