Kaskada data science automation platform aims to speed machine learning models into production

Source – https://siliconangle.com/

More than a year after announcing plans to automate the feature engineering phase of artificial intelligence projects, Seattle-based startup Kaskada Inc. is bringing its first product to market.

Kaskada says it aims to democratize feature engineering, an often laborious process that requires data scientists to select, clean and validate the data to be fed into machine learning training models prior to moving them into production.

A model intended to predict housing prices, for example, would be feature engineered with predictor data such as the square footage of properties, number of bedrooms and location. The larger and more complete the training data set, the better the results.
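As a rough illustration of the housing example, the sketch below turns one property record into a numeric feature vector. The `Listing` class, the `build_feature_vector` function and the neighborhood names are hypothetical and chosen only for illustration; nothing here reflects Kaskada's product.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    """One property record (hypothetical schema for illustration)."""
    square_feet: float
    bedrooms: int
    location: str  # neighborhood name


def build_feature_vector(listing, known_locations):
    """Turn a listing into a flat list of numbers a model can consume.

    Numeric fields are passed through; the categorical location is
    one-hot encoded against a fixed, ordered list of known locations.
    """
    one_hot = [1.0 if listing.location == loc else 0.0 for loc in known_locations]
    return [listing.square_feet, float(listing.bedrooms)] + one_hot


# Assumed, made-up set of locations; a real data set would supply its own.
LOCATIONS = ["Ballard", "Capitol Hill", "Fremont"]

example = Listing(square_feet=1450.0, bedrooms=3, location="Fremont")
vector = build_feature_vector(example, LOCATIONS)
print(vector)  # [1450.0, 3.0, 0.0, 0.0, 1.0]
```

Selecting, cleaning and validating the raw fields that feed a function like this is the laborious part the article describes; the encoding step itself is usually the easy end of the job.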

The resources required to collect data and move machine learning models into production can be so significant that the capabilities are out of reach of all but the largest companies. Kaskada says its platform features a collaborative interface for team engineering and a proprietary data infrastructure for computing across event-based data and serving features in production.

“We are focused on building the bridge between training and production,” said Davor Bonaci, Kaskada’s chief executive and a former software engineer at Google LLC and Microsoft Corp. “We are launching a self-service platform to help data scientists get work into production by automating infrastructure. You can onboard and don’t have a big adoption curve or need to get everybody in your organization to agree to try it.”

The company’s self-service platform is a self-contained data science studio with pre-built machine learning models and the feature vectors needed to support them provided via an application program interface. “You get up-to-the-moment feature vectors for functions like real-time fraud detection,” Bonaci said. “You don’t have to write data pipelines or process streaming data. We run the data processing needed for the model.”

Event-driven focus

Kaskada’s platform has undergone some changes since it was announced, the most significant of which is a greater focus on event-driven data collection. That’s a type of processing that makes decisions in response to real-time events such as mouse clicks and transactions.

Event-driven processing is especially useful in scenarios like predicting the probability that a customer will buy a product or that a credit card transaction will be fraudulent. Real-time data handling requires an efficient data infrastructure to calculate features at arbitrary points in time and to deliver them to both training and production environments. “We have built a lot of functionality to think in terms of time,” Bonaci said.
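To make "features at arbitrary points in time" concrete, here is a minimal sketch of a point-in-time calculation over transaction events. It is not Kaskada's implementation; the `Event` class, the `features_as_of` function, the feature names and the sample data are all assumptions made for illustration. The idea is that a feature computed "as of" a timestamp uses only events at or before that moment, so the same function can serve historical training rows and live production requests.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One transaction event (hypothetical schema)."""
    timestamp: int  # seconds since some epoch
    amount: float


def features_as_of(events, as_of, window_seconds):
    """Compute features from events in the window (as_of - window, as_of].

    `events` must be sorted by timestamp. Events after `as_of` are never
    read, which keeps future information out of training features.
    """
    times = [e.timestamp for e in events]
    hi = bisect_right(times, as_of)                   # first event after as_of
    lo = bisect_right(times, as_of - window_seconds)  # first event inside window
    window = events[lo:hi]
    return {
        "txn_count": len(window),
        "txn_total": sum(e.amount for e in window),
        "txn_max": max((e.amount for e in window), default=0.0),
    }


# Made-up card activity, sorted by time.
events = [Event(100, 20.0), Event(160, 35.5), Event(400, 900.0), Event(460, 15.0)]

# Training: rebuild the features as they looked at t=200.
print(features_as_of(events, as_of=200, window_seconds=300))
# Production: the same call with the current time, here t=400.
print(features_as_of(events, as_of=400, window_seconds=300))
```

Using one function for both paths is a design choice in this sketch: it avoids the case where training and serving compute the same feature in slightly different ways.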

The company has also focused more of its attention on automating the data science process rather than data engineering. Those two functions are supposed to work in tandem but frequently fail to communicate effectively because data scientists are focused on data and engineers on getting models into production.

“There can be friction getting into production because science and engineering teams have different values,” Bonaci said. “We reduce the friction needed to get work into production.”

Kaskada is a cloud-native service that customers can deploy in their own cloud instances, run as a managed service or install on local infrastructure. The company offers a distinctive pricing model that includes a free tier with limited data capacity, curated public datasets, sample projects and individual commit and version histories. Paid plans support team development, batch data uploads, direct data connection and real-time features. Pricing details weren’t provided.
