CRACKING THE DATA SCIENCE INTERVIEW IS NOT ABOUT MATHS BUT ABOUT AUDIENCE
When people think about cracking data science interviews, they tend to dive into technical preparation. Most advice implies that it is all about technology and technical aptitude. Budding professionals should understand that standing out today does not require going overboard on technical acumen; it requires recognizing that every evaluator is reviewing different attributes. What comes in handy, then, is acknowledging your audience rather than showcasing your maths skills.
Ted Kwartler, VP of Trusted AI at DataRobot and Harvard Adjunct Professor, has observed from his real-life experience that “anticipating audience needs is the most important factor at each interview stage; yet data science candidates often over-index on technical acumen, and neglect the fact that every evaluator is reviewing different attributes. In such cases, data scientists will showcase foundational technical knowledge such as the difference between sensitivity and specificity, or modeling evaluation metrics such as log loss or Gini norm. Worse than that, data science candidates tend to go down rabbit holes such as Bayesian parameter estimation.”
According to him, going this deep into a highly technical niche subject carries two potential risks. First, the interviewer’s eyes may glaze over because they are not the right audience. Second, and much worse, the interviewer may have a deeper technical understanding of that particular subject and trip the candidate up on the answer.
Ted describes four stages of the data science interview process, each with a distinct audience: the initial phone screen, the technical evaluation, the “take-home” assignment, and a behavioral assessment.
Here is his analysis, which may help you in your own quest to become a successful data scientist.
During a phone screen, say with a leading search company, candidates try to impress the recruiter with their technical acumen, dazzle them with their professional experience, and win them over with their charm, with the goal of having the recruiter become an advocate for their candidacy at the next stage. But the recruiter may have different incentives: the candidate is no doubt one of dozens of interviewees to be evaluated in a limited time, and the recruiter has been given specific questions with corresponding answers. While being able to articulate what feature engineering and cross-validation are helps solidify a candidate’s qualifications, satisfying this stage of the process can mean simply being brief and showing interest in the role.
The second stage of the interview process is often a series of technical interviews conducted by people currently in the role, the candidate’s future teammates. This is often the stage where candidates are asked whether they have solved problems similar to the ones the team is facing. This type of audience has two motivations. First, they want to gauge the candidate’s intellectual horsepower in or adjacent to their field so they can be sure the candidate will bring something valuable to the team. Second, like many data scientists, they are eager to demonstrate the value and importance of their own work. Ted has found that, in these instances, it is best to let the interviewers speak, and to contribute in a manner that shows the candidate has something to share, but not that the candidate will replace them as the brightest unicorn on the team.
The third stage is typically a coding exercise. Companies may say it should take candidates, for example, three hours to complete, when in reality it can take ten times as long because candidates do not want to appear lazy about it. Furthermore, practical data science is a team sport and businesses live on Stack Overflow anyway. To excel at this stage, remember that the audience is looking for concise, easily understood code. According to Ted, it is also important to remember that a candidate’s code is being reviewed against other candidates’ code, so design choices should be limited to methods the candidate can justify and is certain the audience will comprehend. Ted suggests that complex ensemble models with exotic feature engineering may not be as impressive as a random forest with typical variable treatments. Keep it simple and concise.
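As a rough illustration of what “a random forest with typical variable treatments” might look like in a take-home submission, here is a minimal sketch in Python using scikit-learn. It is not Ted’s code; the library choice, the synthetic data, the column names (tenure_months, monthly_spend, plan) and the churn-style target are all assumptions made purely for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder


def make_toy_data(n_rows=200, seed=0):
    """Build a small synthetic table; a stand-in for the assignment's real data."""
    rng = np.random.default_rng(seed)
    frame = pd.DataFrame({
        "tenure_months": rng.integers(1, 60, size=n_rows).astype(float),
        "monthly_spend": rng.normal(50.0, 15.0, size=n_rows),
        "plan": rng.choice(["basic", "plus", "pro"], size=n_rows),
    })
    # Blank out ten values so the imputation step has something to do.
    missing_rows = rng.choice(n_rows, size=10, replace=False)
    frame.loc[missing_rows, "monthly_spend"] = np.nan
    # Hypothetical target: short-tenure customers are labelled as churned.
    target = (frame["tenure_months"] < 12).astype(int)
    return frame, target


def build_baseline(numeric_cols, categorical_cols):
    """Random forest preceded by median imputation and one-hot encoding."""
    treatments = ColumnTransformer([
        ("numeric", SimpleImputer(strategy="median"), numeric_cols),
        ("categorical", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ])
    return Pipeline([
        ("treatments", treatments),
        ("forest", RandomForestClassifier(n_estimators=100, random_state=0)),
    ])


features, target = make_toy_data()
model = build_baseline(["tenure_months", "monthly_spend"], ["plan"])
# Five-fold cross-validation gives one AUC score per fold.
scores = cross_val_score(model, features, target, cv=5, scoring="roc_auc")
print(f"Mean cross-validated AUC: {scores.mean():.3f}")
```

The target here is derived directly from one of the features, so the score will look unrealistically good; the point is only the shape of the code. Every step has a name, each design choice (median imputation, one-hot encoding, a default-sized forest) is easy to justify in one sentence, and a reviewer can read the whole thing in a minute.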
In the last stage, the hiring manager is simply looking for a cultural fit with the team. Three facts support the candidate at this point: (1) there is a huge shortage of data scientists that is expected to last more than 10 years, (2) the company has already invested a lot of time in the candidacy, and (3) managers may not be technically fluent themselves, and so must rely on the candidate’s previous interview results.
Ted says, “I once had nine interviews with a company, had made it to the last round with two hiring managers, but ultimately failed because my management philosophy did not coincide with theirs. Rather than coming across as adaptable, my ‘lean’ approach to meetings did not instill confidence that I would be easy to manage or a good coworker. Most of us have seen the most brilliant, yet most difficult, teammate let go while the less-intelligent but easily managed teammate remains. At this stage, you want to position yourself as easy to manage and adaptable above all else.”