How do you show hiring managers you are ready to do Data Science and Machine Learning in the real world? Nearly everyone recommends independent project work. But which projects will make you stand out?
I have built a list of 18 project ideas for you to choose from. Most independent work focuses on Project 14. As you can see from this list, there is a lot more to Applied Data Science, Machine Learning, and Analytics.
These are projects you should be completing as part of your learning path, whether guided or self-taught. Should you do them in this order? That would be ideal, but your learning path is unlikely to follow an actual development lifecycle, so I do not think this is practical.
My advice is to start at Project 1, Project 8/9 (similar project with different end goals), or Project 14. That is the starting point you will build the other projects around. From a business standpoint, this is the wrong workflow to follow. From an educational standpoint, it feels more practical to me.
You will make a lot of changes to your initial project, so "starting point" is the best way to look at it. Your first project is more of a proof of concept. It does not need to be polished or even complete, because you will continuously rebuild it along the way. You may even abandon it for a more complex project.
Project 1: Building a Data Science Business Case
Project 2: Framing a Data Science Business Problem
Project 3: Data Science Requirements Gathering
Project 4: Framing a Data Science Solution (Project Description)
Project 5: Data Science Solution Feasibility Study
Project 6: Connecting Business Metrics to Model Metrics
Project 7: Data Science Technical Requirements (Data Pipeline and ML Platform)
Project 8: Building a Dataset (Requirements and Data Gathering)
Project 9: Curating a Dataset for Model Development
Project 10: Curating a Dataset for Analytics
Project 11: Data Visualization and Presentation
Project 12: Building a Data Pipeline
Project 13: Building a Machine Learning Platform
Project 14: Model Training, Evaluation, Selection, and Validation
Project 15: Model Quality and Testing
Project 16: Model Serving and Deployment
Project 17: Post-Deployment Continuous Model Testing
Project 18: Continuous Model Retraining and/or Model Selection
What Comes Next?
Obviously, the labels themselves are not enough. I will be writing up more detailed posts each week to discuss how each project is done in the real world. There is no one way to do any of them, so I will be giving examples.
Why are projects different? Business size, team size, team capabilities, business needs, business model, and market all play a part in determining how projects are completed. They also determine how much you will be responsible for completing yourself and how deep your knowledge of each piece needs to be.
That leads to capabilities. What capabilities do you need to complete each project? You can see Product Management, Machine Learning Engineering, Model Integration, Model Development, Machine Learning Quality, Data Engineering, and Data Analysis represented here. That is a lot of ground to cover. Outside of a startup or a Chief Data Scientist role, few businesses expect one person to have all these pieces.
These projects are important because they outline a practical curriculum. They will allow you to choose and follow a specialization. Specialization makes clear which topics you need in-depth knowledge of and which you only need an introduction to.
You can complete any single project in about 2 to 3 weeks. I broke them down into chunks of this size so that building a portfolio for any given specialization will take you about 6 months.
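As a rough sanity check on those numbers, here is a minimal sketch of the arithmetic. It assumes you work on one project at a time and that 6 months is about 26 weeks. Both are my assumptions for illustration, and the function names are hypothetical.

```python
def weeks_in(months: int) -> int:
    """Approximate whole weeks in a number of months (assumes 52 weeks per 12 months)."""
    return months * 52 // 12


def projects_in_budget(months: int, weeks_per_project: int) -> int:
    """Whole projects that fit in the time budget when done one after another."""
    return weeks_in(months) // weeks_per_project


# Assumed pace: 2 weeks per project (fast) or 3 weeks per project (slow).
fast = projects_in_budget(6, 2)
slow = projects_in_budget(6, 3)
print(f"6 months covers roughly {slow} to {fast} of the 18 projects")
```

Under these assumptions, a 6 month portfolio covers roughly 8 to 13 of the 18 projects. That fits the idea of going deep on one specialization rather than completing the whole list.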
I offer Boot Camps on Machine Learning Product Management and Building a Path to Production for Machine Learning Products. Reach out to me at firstname.lastname@example.org to book your spot or to arrange corporate training.