How to Ace the Data Science Interview
Here is a preparation guide that will help you spend the least time getting ready for a data science interview.
Vin Vashishta | Originally Published: March 26th, 2017
The gateway to a job in data science is the dreaded interview. Veterans and interns alike must go through the interview process. For data scientists, data engineers, and machine learning engineers, it is grueling. Questions span programming languages, platforms, and algorithms, and dive into your project experience. Both theory and practice are examined, with candidates moving from coding to whiteboard sessions to one-on-one grilling.
Acing the interview requires a candidate to have a process. It starts with preparation, continues with the interview itself, and wraps up with quality follow-up. Without a process, it is easy to spend days preparing without actually being prepared. It is extremely easy to miss the point of questions or wander into interview bear traps. It is also easy to be forgotten as the process moves forward. I have made all these mistakes and have watched candidates make them while hiring. Here is how to avoid them and land the job you want.
Preparation: Interns, Junior, and Intermediate
Interview prep needs to be a quick process. Set the interview for about a week out and plan on spending 2-3 hours a day on prep. That fits into a lunch hour plus the time after the kids are asleep, or into your train ride to work and home again. It will not derail your work, other studies, or life, which means you will actually do it.
Ask the recruiter or hiring manager what projects the team you will be joining is working on. What approaches are they exploring? What is their tech stack? These questions are invaluable in reducing the amount of prep time required. If you cannot get direct answers to these questions, do not take the interview. At this level, you should expect some guidance from senior members. That starts even before the interview.
Day 1, review statistics. Start with the most basic descriptive statistics and probability. Cover discrete and continuous variables and distributions. Look back at inference, ANOVA, linear regression, and multiple regression. I still have my old college intro to stats textbook…good old Devore, Probability and Statistics. Paging through it for about an hour or two is all I need to get my terminology sounding crisp again. It is a given that you will have some basic questions sprinkled in and it is so easy to fumble on basic terminology. It is not so much a matter of knowing the material; if you are in data science, you do. You must demonstrate a level of comfort and competence which benefits from a fresh review.
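A quick way to warm up for this review is to compute a few of these quantities by hand. The following is a minimal sketch, not a prescribed exercise: it uses only the Python standard library, and the function names and the small hours/scores data set are made up for illustration.

```python
import statistics


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation (n - 1)
    }


def simple_linear_regression(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    x_bar = statistics.mean(xs)
    y_bar = statistics.mean(ys)
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data: hours of study versus test score.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

summary = describe(scores)
slope, intercept = simple_linear_regression(hours, scores)
print(summary)
print(f"score is roughly {intercept:.1f} + {slope:.1f} * hours")
```

Being able to say out loud what each line does (why the sample standard deviation divides by n - 1, what the slope means in plain language) is the kind of crisp terminology this day is meant to restore.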
Day 2, data science 101. I page through a classic, Doing Data Science by Schutt and O’Neil. It covers everything from EDA to KNN to Naïve Bayes to SVM to Decision Trees to PCA and SVD. Get very fluent with the differences between supervised, unsupervised, and reinforcement learning, with example algorithms for each. Refresh your memory on what types of problems each one solves. Go through a quick list of pros and cons for each approach. I recommend a data science 101 refresher for the same reason as the stats review. The breadth of knowledge required for a data science interview is massive. Fresh concepts make for competent answers.
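One way to check that a concept is fresh is to write the algorithm from memory. As an example of a supervised method from the list above, here is a minimal sketch of a k-nearest neighbors (KNN) classifier using only the Python standard library. The function name, the toy points, and the labels are assumptions made for illustration; real work would normally use a library implementation.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    # The most common label among the k nearest neighbors wins the vote.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical labeled data: two well-separated clusters.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (2, 2)))
print(knn_predict(points, labels, (7, 8)))
```

The pros and cons fall out of the code: there is no training step, but every prediction scans the whole training set, and the result depends on the choice of k and on how the features are scaled.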
Day 2, machine learning 101. If you are going for a machine learning role versus a data science role, day 2 is a bit different. It is all about the models and architectures best suited to the team’s area of focus. It pays to do a bit of homework on the team. Have they published anything? Check GitHub, conferences where their members have presented, and the company’s machine learning blog, press releases, etc. These will give you an idea as to their approach and direction. Study along these lines. There is so much ground to cover in machine learning that if you do not have a focus, there is no way to be ready for what will be asked. Once you have narrowed your review to a few architectures, dive in. Also search for their problem space. Look at the most current research. Having this in your back pocket can help you stand out as someone who stays current with the latest developments.
Day 3, write some code. Tailor your coding to the job itself. Obviously, use the language(s) required. Think about how you would approach what your potential team is working on. What libraries would you use? What data pipelines would you need to build? What types of data are you working with? How much data cleansing is required and what is the best possible way to do it? Build a sample project based on these criteria. This will probably take more than 2-3 hours, so make it your side project for the week and do not be afraid to ask a colleague to help. If it results in something interesting, consider bringing it to the interview to present. Big companies love seeing this. It is why they are so active in recruiting at hackathons. Nothing gets positive attention like proof that you can do the job.
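As a starting point for the data cleansing part of such a sample project, here is a minimal sketch in plain Python. The field names ("name" and "age"), the cleaning rules, and the sample rows are all hypothetical; the right rules depend entirely on the data the team actually works with.

```python
def clean_records(raw_rows):
    """Normalize text fields, convert types, and drop unusable rows."""
    cleaned = []
    for row in raw_rows:
        # Treat missing or None values as empty strings, then trim whitespace.
        name = (row.get("name") or "").strip()
        raw_age = (row.get("age") or "").strip()
        if not name or not raw_age:
            continue  # drop rows with a missing required field
        try:
            age = int(raw_age)
        except ValueError:
            continue  # drop rows where the age is not a whole number
        cleaned.append({"name": name.title(), "age": age})
    return cleaned


# Hypothetical messy input, as it might arrive from a CSV export.
raw = [
    {"name": " alice ", "age": "34"},
    {"name": "", "age": "20"},
    {"name": "bob", "age": "n/a"},
    {"name": "carol", "age": None},
]

print(clean_records(raw))
```

Silently dropping bad rows is only one possible design choice. Be ready to discuss alternatives in the interview, such as imputing missing values or logging rejected rows for later inspection.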
Day 4, review your project notes. You have been selected for an interview because your skills and experience line up with the job requirements. You know you are going to be asked, “Have you done anything in the past like what we’re doing here?” Make sure that answer is polished and concise. Review notes or code from projects which have overlaps with the role you are going for. Talk through what you did in your head. Think about how you would improve on that approach based on new experience or new tools.
Day 5, fine-tuning. Do not fill your head with anything new on the day before the interview. Do not cram. Do not panic. Do not read about the latest work. Spend the day before the interview visualizing successful interviews. Practice your answers with friends or family if anything feels shaky. Go over your mini project presentation one more time. The visualization process is critical in cementing all the work you have put in. It is also important not to sabotage the work by overwriting everything with new information.
Preparation: Senior & Leadership Roles
The previous section was a generalist approach which works very well for early and mid-tier positions. It is different for specialists. Specialists are expected to have strong capability with a small variety of topics and expertise with one or two. There is no concept review to be done when you’re an expert because you’re likely as knowledgeable as, or more knowledgeable than, anyone interviewing you. Interview prep at this level is a matter of getting ready for questions you do not answer every day.
Part 1, mentorship and leadership. The challenges of leading gifted individuals in an extraordinarily complex field will be explored in these interviews. Leadership comes in two forms: mentorship and management. Prepare for questions on conflict resolution, performance evaluation, coaching, managing career path, continuous education, managing the innovation cycle, and managing the development cycle. Page through your favorite leadership book. Mine is Strengths Based Leadership. Next, spend some time thinking about your leadership experience around the areas I have outlined. How have you handled conflicting ideas? How do you manage an underperforming subordinate? As a mentor, what is your teaching style? As a leader, what is your management style? How do you mentor an employee who is brighter than you? How do you help that person continue to grow when they have exceeded your knowledge?
Part 2, vision. As a senior member of the team or a leader, you are going to be asked about your vision. Having a clear vision for where data science and machine learning are going, as well as how that affects your potential new role, is critical at this level. Your vision will tie into questions about design, architecture, training, hiring, and problem solving. Senior leadership roles require you to tie your vision into strategy planning. How do you architect for a future where there will be more unstructured data coming from a lot more sources? How does your design and architecture change for a 2-year end of life versus a 5-year end of life commitment? What will customers expect from machine learning products in 5 years and how do you build now so the products are able to meet those demands?
If you do not already have at least a partially formed vision, you will not be able to wing it between here and the interview. Building a vision takes a lot of thinking time. However, practicing your answers to the questions I have outlined above will help you sound a lot more polished when it comes to articulating your vision. Having a forward-looking, strategic thought process is an easy way to stand out from the crowd. In a field with as few people as data science and machine learning, there are even fewer who have made the jump from engineer thinking to leadership thinking. It is worth showcasing.
Part 3, execution and results. Spend time thinking back on your projects with an eye on business impacts. How did you deliver against a tight deadline? How did a product you created impact revenue or save the company money? Have hard numbers ready to go. Nothing is as impressive as a leader or senior team member who drives results. Those metrics stick in interviewers’ minds…especially in our field. Have showcase projects in mind and create a short story that goes through the project from inception to results. Be ready for the typical follow-ups. What would you do differently if you had to do it again? What were your biggest challenges and how did you overcome them?
With each of these three parts, be as polished as possible. Concise answers are best but do not be afraid to elaborate when the question calls for it. We work in a complex field. Some answers will span 5-10 minutes and could involve stepping up to the whiteboard or pulling up your body of work. Be creative in presentation and storytelling to display your talents to the team.
The Interview & Follow Up
When I am interviewing, I look for more than just skill. For most roles, there will be two or three candidates who are capable of doing the job. Which one I hire comes down to a few factors.
Love for the field, aka passion, is easy to gauge. As a candidate, let that come out in the interview. Talk passionately about the projects you love. Be excited about a new opportunity to work in a field you enjoy.
Potential comes down to two things. Is the candidate self-directed? Does the candidate have a reasonable plan to achieve their goals? Find a few minutes to talk about your professional motivations. What drives you as a person to do the job better than others? Talk about how the role fits into your career progression and how the company helps you achieve your larger goals. Neither needs to be long-winded, but they both leave a powerful impression.
Avoid distraction and stick to your communication objectives during the interview. That is a huge differentiator between candidates. The person who is consistently on point with their communications presents a more polished, professional demeanor. When you get a question, think about the person asking it before you answer because the target audience’s skill set matters. Sometimes the most impressive answers are the least technical.
During the interview, look for genuine reasons to follow up with people. Can you follow some of your interviewers on professional social media, LinkedIn, Twitter, their blog, etc.? Are some of you going to the same conference? Is the company sponsoring an event like a meetup or hackathon you would be interested in attending? Genuine follow-up is far more impactful than the canned thank-you email or phone call. It is also far less overbearing than forced follow-ups.