The Birth Of The Data Science Generation

Source: forbes.com

It’s been more than eight years since Facebook’s first data scientist told a Bloomberg reporter that “the best minds of my generation are thinking about how to make people click ads.”

By the time the data scientist, Jeff Hammerbacher, delivered that infamous criticism in 2011, he was already a few years removed from the data team at Facebook, where he had pored over enormous social data sets from 2006 to 2008. Facebook recruited him at 23, fresh out of Harvard University, as part of an early wave of bright data scientists who brought complex data tools and techniques out of academia and into the private sector.

Recently, TechRepublic reported that Facebook now employs more than 1,200 data workers (defined to include data scientists, data engineers, data architects, database administrators, machine learning experts, big data engineers and artificial intelligence specialists). Microsoft and Amazon each employ more than Facebook, while IBM tops the list at more than 2,500 data workers.

This growth has been part of a tumultuous decade for data. Tremendous democratizing advances, like Amazon Web Services and a shift from IT-owned data toward department-owned data, have been accompanied by considerable setbacks, many with broad societal implications. We have seen large-scale hacks of financial records, growing political polarization thanks to algorithmic echo chambers and coordinated interference in a U.S. presidential election. Data has been at the center of it all.

However, I believe our technological innovations are trending toward societal good — tools and services that make our communities safer, smarter and, yes, more efficient. After Facebook, Hammerbacher quickly channeled his data science prowess into cancer research, aiming to analyze large biological data sets and enable better treatments.

He’s not alone. In 2016, a 17-year-old high school student founded a startup that employs data science to help people identify dangerous breast tumors using their mobile devices. Data science tools are more accessible now than ever before. But even these benevolent endeavors are hampered by some of the same challenges that data brings to enterprises of all sizes.

It used to be that programmers wrote code to define rules. But now data scientists use vast data sets to train computers to recognize patterns. Many of the organizations I work with process thousands of documents each month, and the data scientists tasked with automating and expediting these processes through machine learning and artificial intelligence (AI) are burdened with extremely tedious and time-consuming tasks.

A machine learning-powered document extraction and classification system can take three-and-a-half months to develop. Nearly 75% of that time is spent transforming optical character recognition (OCR) output into a training data set, which requires data cleansing and exploration, as well as feature extraction. Another month is spent training, testing and finalizing the model with different parameters and iterations. In the near future, the model-building process itself will also be improved through machine learning. Data scientists will build one model that adapts itself in minutes to serve a wide variety of business structures and enterprise functions.
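To make those stages concrete, here is a minimal sketch of the cleanse, extract-features, train and classify sequence, using only the Python standard library. Everything in it is an assumption made for illustration: the function names, the word-overlap scoring and the four hand-written sample documents are hypothetical stand-ins, not the system described above, and a production pipeline would rely on a real OCR engine and a proper machine learning library.

```python
import re
from collections import Counter


def cleanse(ocr_text: str) -> str:
    """Lowercase the text and collapse every run of non-letters into one space."""
    return re.sub(r"[^a-z]+", " ", ocr_text.lower()).strip()


def extract_features(ocr_text: str) -> Counter:
    """Turn cleansed text into bag-of-words counts."""
    return Counter(cleanse(ocr_text).split())


def train(labeled_docs):
    """Sum the word counts per label to build one word profile per document class."""
    profiles = {}
    for text, label in labeled_docs:
        profiles.setdefault(label, Counter()).update(extract_features(text))
    return profiles


def classify(profiles, ocr_text):
    """Return the label whose profile shares the most word occurrences with the document."""
    features = extract_features(ocr_text)

    def overlap(label):
        profile = profiles[label]
        return sum(min(count, profile[word]) for word, count in features.items())

    return max(sorted(profiles), key=overlap)


# Hypothetical, hand-written samples standing in for raw OCR output.
training_data = [
    ("INVOICE No. 1042 -- Amount due: $310.00", "invoice"),
    ("Invoice total due upon receipt", "invoice"),
    ("This AGREEMENT is made between the parties", "contract"),
    ("The parties agree to the terms of this contract", "contract"),
]

model = train(training_data)
prediction = classify(model, "Amount due on invoice 2087")
print(prediction)
```

Even in a toy like this, most of the code deals with cleaning text and shaping features rather than with the model itself, which is consistent with the time split described above.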

But as the past decade has illustrated, our quest for greater efficiency may come with certain setbacks. Young people around the world have grown up in a society where privacy is increasingly fragile. Millennials and members of Gen Z have seen far more cyber warfare in their feeds than on-the-ground warfare via traditional news. As a result, they are more protective of their data. This is smart of them, of course. But it will inevitably make life harder for data scientists working in business-to-consumer (B2C) organizations. Indeed, privacy legislation like the General Data Protection Regulation and the California Consumer Privacy Act is already putting a pinch on the enterprise data that’s available to inform new machine learning and AI tools.

In turn, data scientists have come up with another workaround: fake data. We can train machines to analyze enormous sets of real data and create mock data sets that appear real enough to perform intended business functions. This is another step in data science’s bold march forward. But like many steps before it over the past decade, fake data sets also bring potentially harmful side effects. For example, the data can appear so realistic that it could be used to open a fraudulent credit card if it fell into the wrong hands.
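As a rough illustration of the idea, the sketch below learns simple statistics from a few real records and then samples mock records from them. It is a deliberately simplified assumption, not a description of any particular tool: the column names and values are hypothetical, each column is modeled as an independent normal distribution, and real synthetic-data generators use far richer models that also capture the relationships between columns.

```python
import random
import statistics


def fit_column_stats(records):
    """Learn the mean and standard deviation of each numeric column."""
    stats = {}
    for col in records[0]:
        values = [record[col] for record in records]
        stats[col] = (statistics.mean(values), statistics.stdev(values))
    return stats


def sample_mock_records(stats, n, seed=0):
    """Draw n mock records, sampling each column independently (an assumption)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [
        {col: round(rng.gauss(mean, stdev), 2) for col, (mean, stdev) in stats.items()}
        for _ in range(n)
    ]


# Hypothetical "real" records; the columns and values are made up for illustration.
real_records = [
    {"age": 30, "balance": 1000.0},
    {"age": 40, "balance": 2000.0},
    {"age": 50, "balance": 3000.0},
]

stats = fit_column_stats(real_records)
mock_records = sample_mock_records(stats, n=5)
for record in mock_records:
    print(record)
```

The mock rows follow the overall shape of the real data without copying any single record. The more faithfully a generator reproduces the original, the more useful the output is for training, and the more it matters who can get hold of it.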

So, data scientists, and all of us who work to innovate our industries, are charged with ensuring our latest innovations are safe for the businesses, communities and people they serve. This is not a new responsibility, but it is one we must continue to put at the forefront of our business practices. We know this. Young professionals and even young adults demand this.

Hammerbacher, a millennial himself, was correct that many of the top technical minds his age helped create the most targeted advertising ecosystem the world has ever seen. But as time passes, leaders from that same generation — and their successors — are now poised to do much more, helping to create a world that uses data to improve medicine, infrastructure, government and the environment. With data scientist now the best job in America for four years running, it’s safe to say we’ll see an influx of younger data scientists who are eager to apply data science in new, transformative ways.

In fact, it won’t be long before we have our first millennial president. I don’t see any reason why that president couldn’t be a data scientist.
