Microsoft leads $16M round into AI data science startup Pachyderm
Pachyderm Inc., a San Francisco startup that provides a data science platform for easing artificial intelligence development, today disclosed that it has closed a $16 million funding round led by Microsoft Corp.’s M12 venture arm.
M12 was joined by existing backers Y Combinator and Benchmark. Other participants in the round included Decibel Ventures, a venture capital firm with close ties to Cisco Systems Inc. that mainly backs early-stage startups.
Pachyderm provides an open-source data science platform that runs on Kubernetes. The platform focuses on a narrow but important use case: making it easier for software engineers to manage the data that they use in AI development projects.
Engineers teach AI models to perform tasks by training them on sample datasets. These datasets can add up quickly in a large enterprise, especially since engineers’ experiments often lead to the creation of multiple versions of the same dataset.
There are often also multiple versions of the AI model itself. Engineers need to keep all these files on hand because they might need to revert a neural network to an earlier version if they find a bug, or because they may require the ability to double-check the quality of the data used during the training process.
Pachyderm’s platform provides centralized controls for managing the data files in an AI project. The platform stores all of a team’s files in repositories and organizes their different versions. Pachyderm also gives engineers the ability to bring up a previous version of a file when the need arises.
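The general idea of keeping every version of a file and retrieving an older one on demand can be illustrated with a minimal Python sketch. It is a toy in-memory model, not Pachyderm's actual API or storage design; the `VersionedRepo` class, its method names and the sample file name are all hypothetical.

```python
import hashlib


class VersionedRepo:
    """Toy in-memory repository that keeps every committed version of a file.

    Hypothetical illustration only; this is not Pachyderm's API.
    """

    def __init__(self):
        self._blobs = {}    # content hash -> file contents
        self._history = {}  # file path -> list of content hashes, oldest first

    def commit(self, path, data):
        """Store a new version of `path` and return its version number."""
        digest = hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data  # identical contents are stored only once
        self._history.setdefault(path, []).append(digest)
        return len(self._history[path]) - 1

    def read(self, path, version=-1):
        """Return the contents of `path` at `version` (latest by default)."""
        return self._blobs[self._history[path][version]]

    def version_count(self, path):
        """Return how many versions of `path` have been committed."""
        return len(self._history.get(path, []))


repo = VersionedRepo()
first = repo.commit("train.csv", b"label,value\ncat,1\n")
second = repo.commit("train.csv", b"label,value\ncat,1\ndog,2\n")

# Bring back the earlier version of the dataset after a bad experiment.
original = repo.read("train.csv", first)
```

Storing contents under their hash means that recommitting an unchanged file adds a history entry without duplicating the data, which matters when experiments produce many near-identical dataset versions.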
The platform’s other focus is speeding up data preprocessing. Before training data is fed to a neural network, it usually goes through a series of automated transformations that remove erroneous records and convert the information into a form that’s easier to analyze. Because it runs on Kubernetes under the hood, Pachyderm allows engineers to parallelize this process across thousands of application containers and thereby shorten it.
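The pattern of splitting a dataset into shards and cleaning them in parallel can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: a local thread pool stands in for Kubernetes containers, and the `clean_shard` and `preprocess` functions and the record layout are hypothetical, not part of Pachyderm.

```python
from concurrent.futures import ThreadPoolExecutor


def clean_shard(records):
    """Normalize names and drop records whose value is missing or not numeric."""
    cleaned = []
    for name, value in records:
        try:
            cleaned.append((name.strip().lower(), float(value)))
        except (TypeError, ValueError):
            # Erroneous record: skip it instead of passing it on to training.
            continue
    return cleaned


def preprocess(shards, workers=4):
    """Clean every shard independently, using a pool of parallel workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input shards.
        return list(pool.map(clean_shard, shards))


shards = [
    [(" Alice ", "1.5"), ("bob", "n/a")],
    [("Carol", None), ("dave", "2")],
]
result = preprocess(shards)
```

Because each shard is cleaned without reference to any other, the same work can be spread across as many workers as there are shards, which is what makes this kind of preprocessing a good fit for a container cluster.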
Pachyderm says its platform can provide major practical benefits for AI teams. The startup claims publicly traded LogMeIn Inc. managed to reduce data preprocessing times from weeks to hours by adopting its platform. Heavily regulated firms such as banks, in turn, can use Pachyderm’s data version management features to verify that the records an AI business system used to make an important decision didn’t contain any errors.
Pachyderm makes money by selling a paid enterprise version of its platform with extra features, along with a hosted cloud edition that exited beta today. According to the startup, several Fortune 500 companies and “multiple” unnamed government agencies are already using its software.