Framing a Business Problem as a Data Science Problem

Is the business problem a data science problem? Reframing the business problem answers this question and lets you assess the feasibility of your independent project. This is a key step in moving past toy projects and building one that gets you hired.

Vin Vashishta | Originally Published: January 21st, 2021


In the last post, I gave a high-level description of a business problem as part of a business case. In this post, I explain how to frame a business problem with respect to a solution. For Data Science and Machine Learning, this process is an important piece of a real-world project.

Several surveys have found that over half of projects (some put it as high as 80%) fail to return value to the business. Machine Learning projects fail when model metrics are not mapped to business metrics. A high-level concept of the problem is not enough to keep a project on course, so it must be broken down into more detail.

The business case is the first step to line up the project with business needs. The problem statements from the business case now need to be translated or framed as data science problems. Framing is creating a map between business metrics and project metrics.

Restating the Business Problem

In the last post, I used “Starbucks Creates $100M Community Resilience Fund” as an example.

Potential business problems:

  • How do they select funding recipients for maximum impact?
  • How do they track funding recipient utilization?
  • How do they leverage success to improve brand image?
  • How does the business connect revenue to funding utilization?

These are all fuzzy problems. How do I put a range on impact? What metrics do I use to prove a positive impact on brand image? We cannot build a model until we move from fuzzy to concrete. That is the purpose and definition of reframing.

The fourth bullet point is intentionally framed as a mapping question. This is an example of a business problem framed as a data science problem. It has no detailed definition yet, but you can see how it stands out from the others. Senior leadership expects a positive correlation between revenue and funding utilization.

The business problem starts from their perspective. As responsible corporate stewards, they need to ensure the fund benefits the company and has a positive social impact. How do they accomplish that goal?

That question turns it over to the team for a solution. The business problem needs to be reframed as something like, “We need to map funding allocation to revenue and social impact.” That sounds a lot like mapping ad spend to revenue. This could be an attribution problem.

The problem statement will not go as far as prescribing a model or set of models for a solution. It will give the team criteria to evaluate models and performance at a later phase. We need requirements to start architecting and designing a prototype. However, you can see how a change in language helps indicate a course of experimentation.

Expanding on the Business Problem and Possibly Abandoning It

The fourth bullet point is intentionally simple. Your independent projects will be around that level of simplicity. The first bullet point has a lot more undefined space. Problems like that make for more impressive projects. They are also difficult to complete if you have little to no business experience.

Consider project complexity before you start and do an honest self-assessment. Getting a month into the project only to start over is a waste. Project complexity should be high enough to force you to learn and grow. Balance complexity with your timelines and current capabilities.

How do you estimate project complexity before you start experimenting? Businesses must answer the same question. The next step is to expand the business problem by exploring the problem space. The business case quantifies the problem’s impact on the business. Exploring the problem space explains the scope of the problem for the people who will have to solve it.

In the business world, projects can be abandoned at this phase because they are too expensive or too complex. However, businesses will come back to shelved projects if there is a change in technology or business competency. The same may be true for you. Keep your abandoned problem statements.

How to Explore a Problem Space with the Business

I have loosely defined a complex attribution problem with no predefined connections between spend and revenue. Mapping investments in the community to revenue is a relatively new, niche area of investment. Data scientists are asked to solve novel problems like this one, and exploring the problem space is the first step.

Exploring a problem space shows a potential employer that you know what the first steps are when faced with a new problem. Most Data Scientists do not, and it is one of the reasons so many projects feel incomplete. Your project stands out when you include realistic elements like this.

Start by defining the Community Fund. $100M will be spent in communities. Define possible funding types: grants, loans, and investments. Define possible funding recipients: not-for-profit charities, businesses, and initiatives.

In a business, I would be emailing and meeting with people involved in running the fund to get a complete list of funding and recipient types. Independent projects require research as well. Using a real-world business problem ensures there is something to research. There are press releases and public statements to be found that can inform a realistic exploration of the problem space.

Revenue needs to be elaborated on and explored. Define possible revenue streams: return on business investments, loan interest, and brand value. Define possible efficiencies: recruiting, DEI, environmental sustainability, marketing, and tax. The same process of research would apply for discovering and validating each stream and efficiency.
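One way to get a feel for the size of this problem space is to write the candidate lists down and count the combinations. The sketch below is only an illustration: the list contents come from the paragraphs above, while the variable names and the idea of treating every (funding type, recipient type) pair as an input and every stream or efficiency as a target are my assumptions.

```python
from itertools import product

# Candidate lists taken from the text. A real project would refine them
# through research or conversations with the people running the fund.
funding_types = ["grant", "loan", "investment"]
recipient_types = ["not-for-profit charity", "business", "initiative"]
revenue_streams = ["return on business investments", "loan interest", "brand value"]
efficiencies = ["recruiting", "DEI", "environmental sustainability", "marketing", "tax"]

# Inputs: every (funding type, recipient type) pair is a distinct way to spend.
inputs = list(product(funding_types, recipient_types))

# Outputs: every revenue stream or efficiency is a candidate prediction target.
targets = revenue_streams + efficiencies

# Each (input, target) pair is one mapping the project might have to model.
candidate_mappings = list(product(inputs, targets))

print(len(inputs), "input combinations")
print(len(targets), "candidate targets")
print(len(candidate_mappings), "candidate mappings to explore")
```

With these lists there are 9 input combinations, 8 targets, and 72 candidate mappings. That count is a rough proxy for complexity, and it is a useful number to have in hand when deciding whether the project fits your timeline.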

Asking Questions to Understand the Business Problem Space

Talking with the business is important when exploring the problem space because what is technically possible might not be realistic. People running the fund will quickly shoot down return on business investments and loan interest by explaining the negative “optics.” It would make the altruistic fund look more like a capitalist grab, so they will not be going down those roads.

People in accounting will tell me they already have all the support they need to understand the tax implications for each fund dollar spent. Environmental sustainability will start with the Regulatory, Compliance, and/or Legal business unit. I may or may not get a warm welcome there because this falls under policy and lobbying.

What the #$%* does this have to do with your independent project? This is the much-touted business acumen intersecting with data science. Exploring the problem space from the technical and business perspective results in a project with well-defined deliverables the business cares about. This is impressive to hiring managers because they probably do not have anyone aside from themselves who can do that.

By talking with other parts of the business, I have made a complex project smaller and better aligned with the business needs. I got the broad strokes from senior leadership using the business case. I am getting the more granular details by exploring the problem space with relevant business units. This also sets up the relationship for requirements gathering. I will cover that in my next post in the series.
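The pruning described in this section can be recorded in a simple structure so the reasoning survives alongside the project. This is a minimal sketch: the statuses and the dictionary layout are my own convention, the reasons paraphrase the conversations above, and any target the text does not mention is kept by assumption.

```python
KEEP, UNCERTAIN, DROP = "keep", "uncertain", "drop"

# Candidate revenue streams and efficiencies, each with a status and the
# reason behind it. "No objection raised" entries are assumptions.
feedback = {
    "return on business investments": (DROP, "negative optics for an altruistic fund"),
    "loan interest": (DROP, "negative optics for an altruistic fund"),
    "brand value": (KEEP, "no objection raised"),
    "recruiting": (KEEP, "no objection raised"),
    "DEI": (KEEP, "no objection raised"),
    "environmental sustainability": (UNCERTAIN, "sits with regulatory, compliance, or legal"),
    "marketing": (KEEP, "no objection raised"),
    "tax": (DROP, "accounting already has the support it needs"),
}

def with_status(status):
    """Return the targets that carry the given status, in insertion order."""
    return [target for target, (s, _reason) in feedback.items() if s == status]

in_scope = with_status(KEEP)
open_questions = with_status(UNCERTAIN)
dropped = with_status(DROP)

print("in scope:", in_scope)
print("still to confirm:", open_questions)
for target in dropped:
    print(f"dropped {target}: {feedback[target][1]}")
```

Keeping the reason next to each dropped target matters for the same reason you keep abandoned problem statements: if circumstances change, you know exactly what to revisit.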

How to Explore a Data Science Problem Space

The data science problem space is more familiar territory. The business decides to spend fund dollars. Those dollars go through some transformation described by a function and become the revenue and efficiency values we want to predict. The function is what we are going to model. The model will support the decision to fund/not fund with the goal of optimizing spend.
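To make the "dollars in, function, value out" idea concrete, here is a deliberately simple sketch. It assumes the unknown function can be approximated by a straight line fitted with ordinary least squares, which is a stand-in I chose for illustration and not a recommendation for the real problem. The numbers are made up, and `fit_line`, `predict_value`, `should_fund`, and the 0.08 threshold are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("cannot fit a line when all x values are equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Made-up history: fund dollars spent (in $K) and the value later
# observed from that spend (in $K).
past_funding = [50, 80, 120, 150, 200]
observed_value = [4, 6, 11, 13, 19]

intercept, slope = fit_line(past_funding, observed_value)

def predict_value(funding):
    """The modeled function: predicted value for a proposed spend."""
    return intercept + slope * funding

def should_fund(funding, min_return_per_dollar=0.08):
    """Decision support: fund only if predicted value per dollar clears a bar."""
    if funding <= 0:
        raise ValueError("funding must be positive")
    return predict_value(funding) / funding >= min_return_per_dollar

for proposed in (30, 100):
    print(proposed, round(predict_value(proposed), 2), should_fund(proposed))
```

The model only informs the decision. The threshold, and whether to override it, stays with the people running the fund.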

I have presented a very rough framework for mapping inputs, model, and inference to the business process. At a high level, we have a few problems to solve. What data should be gathered from applicants? What rubric/model should be created to guide selection committee decisions about which requests to grant? How will the data and score be presented/visualized? In each pair, the term before the slash is the business artifact and the term after it is the project artifact it becomes.
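Those three questions map onto three small pieces of code: a record of what is gathered from each applicant, a scoring rubric, and a ranked presentation of the results. The sketch below is hypothetical throughout. The fields, weights, and example applications are invented for illustration, and a weighted sum is just one simple form a rubric could take.

```python
from dataclasses import dataclass

@dataclass
class Application:
    """Data gathered from an applicant; each score is on a 0-1 scale."""
    name: str
    community_need: float
    expected_reach: float
    brand_alignment: float

# Hypothetical rubric weights that a selection committee would set.
WEIGHTS = {"community_need": 0.5, "expected_reach": 0.3, "brand_alignment": 0.2}

def rubric_score(app):
    """Weighted sum of the applicant's scores."""
    return sum(weight * getattr(app, field) for field, weight in WEIGHTS.items())

def rank(applications):
    """Order applications from highest to lowest rubric score."""
    return sorted(applications, key=rubric_score, reverse=True)

applications = [
    Application("Youth coding initiative", 0.7, 0.4, 0.5),
    Application("Food bank expansion", 0.9, 0.6, 0.7),
    Application("Small business grant", 0.5, 0.8, 0.9),
]

# Presentation: a plain ranked list the committee can read at a glance.
for app in rank(applications):
    print(f"{app.name}: {rubric_score(app):.2f}")
```

A rubric like this is easy to explain to a committee, which is often worth more at this stage than a more accurate model nobody trusts.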

That is a quick exploration of the decision support problem space. A mapping and a few questions can clarify what the required solution needs to do. Working this as an independent project, do we have access to any of the data from applicants? Probably not, so if I want to move this project forward, I need to come up with a creative solution to sourcing the data.

Data discovery is a challenge for real-world business problems. This project is an opportunity to show a hiring manager you understand and can solve data sourcing problems. The point of moving from toy projects to more complete ones is to showcase capabilities like these.

The business problem restated in data science terms describes the project and its challenges. From here, the independent project can follow a problem, solution, implementation flow. Toy projects are solutions looking for a problem. Realistic independent projects show a logical thought process behind the built solution.


What comes next? Requirements gathering provides the next layer of definition for the solution. The business case and business problem create a connection between project and business. Requirements create the connection between project and users.

I offer Boot Camps on Machine Learning Product Management and Building a Path to Production for Machine Learning Products. Reach out to me to book your spot or arrange corporate training.