A beginner’s guide to building a data science team

17Jul - by aiuniverse - 3 - In Big Data

Source – econsultancy.com

So, you want to build a data science team? Here’s some stuff to think about.

Before long, just like this stock photo, you’ll have a team of weird orange people with big bulbous heads, who can sit around a table looking at an enormous hologram of a simple bar chart.

In this article we will cover:

  • Definitions of data science
  • The purpose of data science
  • How data science teams should integrate into the organisation
  • Recruiting for data science
  • Team roles

Though Econsultancy is marketing focused, there’s plenty in here to appeal more broadly.

First, an attempt at a definition

It seems trite to say that data science’s applications are broad, but they are. And data science teams come in different forms, within different organisational structures and under different names.

There’s a pretty good Venn diagram developed by Drew Conway which gets to the heart of the ambiguous phrase ‘data science’.


Data science Venn diagram by Drew Conway

In Conway’s words, “The difficulty in defining these skills is that the split between substance and methodology is ambiguous, and as such it is unclear how to distinguish among hackers, statisticians, subject matter experts, their overlaps and where data science fits.”

I recommend heading over to Conway’s article to read more of his thoughts. But the basic takeaway for a layman like me is – there’s a hell of a lot to learn and many different skillsets that can be brought to bear on data.

Whilst data science has many grey areas, it’s probably worth including some fairly dry definitions of two common teams – ‘big data analytics’ teams and ‘data product’ teams. The former look for predictive patterns in data without necessarily having a preconceived notion of what they will find, and the latter work to implement automated, data-driven systems.

Data products – Ben Chamberlain, senior data scientist at ASOS, describes a data product as “an automated system that generates derived information about our customers such as predicting their lifetime value. This information is then used to automatically take actions like sending marketing messages or it gets sent to another team who use it for insight.”

If you don’t have any statistical knowledge and you fancy a challenge, you can read one of Chamberlain’s papers about this very ASOS CLV data product.
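To make the idea of a data product a little more concrete, here is a minimal Python sketch of the pattern Chamberlain describes: derive a piece of information about each customer, then automatically choose an action. It is an illustration only, not ASOS's method. The `Customer` fields, the naive lifetime value formula (average order value × orders per year × an assumed three-year horizon), the 500 threshold and the action names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Customer:
    customer_id: str
    order_values: List[float]  # value of each past order (made-up data)
    years_active: float        # time since first order, in years


def estimate_lifetime_value(customer: Customer, horizon_years: float = 3.0) -> float:
    """Naive estimate: average order value x orders per year x assumed horizon."""
    if not customer.order_values or customer.years_active <= 0:
        return 0.0
    average_order_value = sum(customer.order_values) / len(customer.order_values)
    orders_per_year = len(customer.order_values) / customer.years_active
    return average_order_value * orders_per_year * horizon_years


def choose_action(predicted_value: float, threshold: float = 500.0) -> str:
    """Turn the derived value into an automated action (hypothetical labels)."""
    if predicted_value >= threshold:
        return "send_vip_offer"
    return "send_standard_newsletter"


customers = [
    Customer("c1", [120.0, 80.0, 100.0, 100.0], 2.0),
    Customer("c2", [30.0], 1.0),
]

# The "product": derived information in, automated decision out.
actions = {c.customer_id: choose_action(estimate_lifetime_value(c)) for c in customers}
print(actions)
```

A real data product would swap the naive formula for a trained predictive model, but the shape stays the same: data goes in, a derived value comes out, and something acts on it without a person in the loop.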

Big data analytics – IBM gives us a serviceable definition of big data analytics: “..a term applied to data sets whose size or type is beyond the ability of traditional relational databases to capture, manage, and process the data with low-latency.

“..it has one or more of the following characteristics – high volume, high velocity, or high variety. Big data comes from sensors, devices, video/audio, networks, log files, transactional applications, web, and social media – much of it generated in real time and in a very large scale.”

Remember, data science must tackle a problem (duh!)

As I read in a Harvard Business Review article, economist and Harvard professor Theodore Levitt once said that “People don’t want to buy a quarter-inch drill, they want a quarter-inch hole.”

The same applies to data science – the business needs to see a solution. It’s another obvious thing to say, but I’m writing it because new(ish) and complicated disciplines such as cognitive computing can temporarily blind marketers to the fact that normal rules of business apply – what is the problem that needs solving? What data can be brought to bear, and how can the data be used to create most value?

This is something summed up very nicely with another trusty Venn diagram on a Juice Analytics article. (The intersection of the three circles is where successful data products live.)

Venn diagram: solving a data problem (Juice Analytics)

Parry Malm, co-founder of Phrasee (email marketing language generation software), takes a pragmatic tone and warns about employing a data science team before you know exactly what you want to achieve.

“The first step,” he says, “is to really, really, really clearly define what problem you’re trying to solve… only then consider whether or not an analytics team or whatever is the right approach. What you DON’T want to do is to hire 10 ‘data scientists’ or something, and then have a huge working capital hit for an undefined outcome, when the money could potentially be spent better somewhere else.”

How data science should interact with the wider org

Before we move on to all the roles in a data science team and the challenges involved in setting one up, it’s worthwhile considering how the team will interact with the rest of the organisation.

Simply parachuting data scientists into a company ignores the differences in culture and skills between marketing and finance teams on the one hand, and statisticians and programmers on the other.

To get full value out of your data science team you need to consider what peripheral roles and processes are needed.

1) Transparency and a customer service culture

The danger is that data products or big data analytics will either be implemented and deliver no business benefit, or be underutilised and deprioritised by a business that fails to recognise their value.

Writing in Harvard Business Review, various members of McKinsey’s analytics teams say there is a need for data teams to operate in a customer service culture.

Again, this all feels pretty obvious but will be integral to success. Is the business ready to accept suggestions from a data-led team? If not, what education is needed in the first instance or how can stakeholders be more involved in the effort?

2) Data-science communication

Science communication generally is a noble cause. In an article in The Guardian in 2016, Richard Holliman reports that it is an undervalued vocation. He writes that “For too long, research has shown that science communication is seen as a second-class option for academics.”

Holliman continues, adding that though science communication has improved, “There is still work to be done to ensure that excellence rather than acceptability becomes the hallmark of these activities. The introduction of new ways to discuss and publish the outputs from research, and alternative mechanisms for reward and recognition suggest that a shift in this direction is underway.”

I’m going off topic here, but there’s a parallel with how the work of data science teams is translated within businesses and to the end consumer. There needs to be a surrounding network of skilful communicators.

These communicators can include:

  • Data visualisation specialists – To make outcomes more readable and accessible.
  • Data strategists – In a recent interview with Econsultancy, Channel 4’s director of consumer insights Sarah Rose described this role as “the bridging point between the data science team, who work on the models that we put into our products, and the rest of the business.” Their knowledge may include some data science and some industry expertise.
  • Campaign experts – With knowledge of tech and marketing (could be a developer).
  • T-shaped leaders – The leader of the data science team must absolutely be all about data science; it’s essential that they be an expert in the field. But if you can also find one with business skills, then all the better.

Idrees Kahloon, data journalist at The Economist, says that “Often, the best way to present data is the simplest: people readily understand means, medians and sums. Fancier statistical models appeal to wonks, but are harder to explain to a general audience.”
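As a small illustration of Kahloon's point, the three summaries he mentions take only a few lines of Python using the standard library. The order values below are invented for the example.

```python
from statistics import mean, median

# Made-up order values; one unusually large order is included on purpose.
order_values = [20.0, 25.0, 30.0, 35.0, 390.0]

summary = {
    "total": sum(order_values),
    "mean": mean(order_values),
    "median": median(order_values),
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

With these numbers the single large order pulls the mean well above the median, which is the sort of thing a good communicator would point out rather than leave for the audience to discover.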

 
