Kaskada data science automation platform aims to speed machine learning models into production

Source – https://siliconangle.com/

More than a year after announcing plans to automate the feature engineering phase of artificial intelligence projects, Seattle-based startup Kaskada Inc. is bringing its first product to market.

Kaskada says it aims to democratize feature engineering, an often laborious process that requires data scientists to select, clean and validate the data to be fed into machine learning training models prior to moving them into production.

A model intended to predict housing prices, for example, would be feature engineered with predictor data such as the square footage of properties, number of bedrooms and location. The larger and more complete the training data set, the better the results.
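To make the housing example concrete, here is a minimal sketch of turning one property record into a numeric feature vector. The field names, the neighborhood list and the `to_feature_vector` helper are all hypothetical, chosen for illustration; nothing here reflects Kaskada's product or API.

```python
def to_feature_vector(listing, known_locations):
    """Convert one listing into numbers a model can consume.

    Numeric fields are passed through as floats; the categorical
    location is one-hot encoded against a fixed list of known locations.
    """
    location_flags = [
        1.0 if listing["location"] == loc else 0.0 for loc in known_locations
    ]
    return [
        float(listing["square_feet"]),
        float(listing["bedrooms"]),
        *location_flags,
    ]


# Hypothetical data, for illustration only.
known_locations = ["Ballard", "Capitol Hill", "Fremont"]
listing = {"square_feet": 1450, "bedrooms": 3, "location": "Fremont"}

vector = to_feature_vector(listing, known_locations)
print(vector)
```

In practice the selecting, cleaning and validating steps described above would run before this conversion, which is where much of the manual effort tends to go.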

The resources required to collect data and move machine learning models into production can be so significant that the capabilities are out of reach of all but the largest companies. Kaskada says its platform features a collaborative interface for team engineering and a proprietary data infrastructure for computing across event-based data and serving features in production.

“We are focused on building the bridge between training and production,” said Davor Bonaci, Kaskada’s chief executive and a former software engineer at Google LLC and Microsoft Corp. “We are launching a self-service platform to help data scientists get work into production by automating infrastructure. You can onboard and don’t have a big adoption curve or need to get everybody in your organization to agree to try it.”

The company’s self-service platform is a self-contained data science studio with pre-built machine learning models and the feature vectors needed to support them provided via an application program interface. “You get up-to-the-moment feature vectors for functions like real-time fraud detection,” Bonaci said. “You don’t have to write data pipelines or process streaming data. We run the data processing needed for the model.”

Event-driven focus

Kaskada’s platform has undergone some changes since it was announced, the most significant of which is a greater focus on event-driven data collection. That’s a type of processing that makes decisions in response to real-time events such as mouse clicks and transactions.

Event-driven processing is especially useful in scenarios like predicting the probability that a customer will buy a product or that a credit card transaction will be fraudulent. Real-time data handling requires an efficient data infrastructure to calculate features at arbitrary points in time and to deliver them to both training and production environments. “We have built a lot of functionality to think in terms of time,” Bonaci said.
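One way to picture calculating features "at arbitrary points in time" is a function that looks only at events up to a chosen timestamp, so the same logic can build a historical training row or a current serving row. The sketch below is a generic illustration under assumed names (`Event`, `features_as_of`, integer timestamps, a fixed lookback window); it is not Kaskada's implementation.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: int  # hypothetical unit: seconds since an arbitrary start
    amount: float   # e.g. a card transaction amount


def features_as_of(events, as_of, window):
    """Compute features from events with as_of - window < timestamp <= as_of.

    Events after `as_of` are never read, so a training row built for a
    past moment cannot leak information from that moment's future.
    """
    ordered = sorted(events, key=lambda e: e.timestamp)
    times = [e.timestamp for e in ordered]
    hi = bisect_right(times, as_of)           # first index past as_of
    lo = bisect_right(times, as_of - window)  # first index inside the window
    recent = ordered[lo:hi]
    return {
        "txn_count": len(recent),
        "txn_total": sum(e.amount for e in recent),
        "txn_max": max((e.amount for e in recent), default=0.0),
    }


# Hypothetical event history for one card.
events = [Event(10, 20.0), Event(50, 5.0), Event(90, 300.0), Event(130, 12.5)]

# Same function, two points in time: one historical, one "now".
training_row = features_as_of(events, as_of=90, window=60)
serving_row = features_as_of(events, as_of=130, window=60)
print(training_row)
print(serving_row)
```

Using one definition for both calls is the design point: training and production see features computed the same way, differing only in the timestamp supplied.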

The company has also focused more of its attention on automating the data science process rather than data engineering. Those two functions are supposed to work in tandem but frequently fail to communicate effectively because data scientists are focused on data and engineers on getting models into production.

“There can be friction getting into production because science and engineering teams have different values,” Bonaci said. “We reduce the friction needed to get work into production.”

Kaskada is a cloud-native service that customers can deploy in their own cloud instances, run as a managed service or install on local infrastructure. The company offers a distinctive pricing model that includes a free tier with limited data capacity, curated public datasets, sample projects and individual commit and version histories. Paid plans support team development, batch data uploads, direct data connection and real-time features. Pricing details weren’t provided.
