Top 5 open-source tools for machine learning
Source – jaxenter.com
Machine learning is going through something of a renaissance these days. It seems like there are new moves forward with this technology every day, from advances in image and sound recognition to lip reading and beating us at all the games.
However, this renaissance has largely been funded by Silicon Valley. Companies are scrambling to find enough programmers capable of coding for ML and deep learning. Last year was a good year for the freedom of information, as titans of the industry Google, Microsoft, Facebook, Amazon, and even Baidu open-sourced a number of their ML frameworks.
Freeing code is a great way to attract talent and grow a community, as well as garner goodwill. (After all, devs highly value open-source efforts in their employers.) Google is unquestionably the Goliath in the field of open-source machine learning, with TensorFlow beating all comers by most metrics.
Given the paradigmatic shifts that a true revolution in machine learning could bring, it’s important to maintain tech’s devotion to open source. These kinds of scientific advancements don’t belong to any one company or corporation, but to the whole world. Making ML open and evenly distributed means everyone can join in this revolution.
So, in no particular order:
Some have been a little worried about the machine learning arms race leaving the world’s top universities bereft of AI talent. Having gigantic leaps forward in tech means nothing if it’s all proprietary company information.
So, Elon Musk and his buddies have fronted over $1 billion for OpenAI, a non-profit AI research initiative.
OpenAI’s mission is to build safe artificial general intelligence (AGI), and ensure AGI’s benefits are as widely and evenly distributed as possible. We expect AI technologies to be hugely impactful in the short term, but their impact will be outstripped by that of the first AGIs.
With over 60 full-time researchers, OpenAI publishes fascinating papers on advances in AI capabilities as well as open-source software tools. Head on over there to check out their platforms like Gym, a toolkit for developing and comparing reinforcement learning algorithms, and Universe, a collection of Gym environments that measure an AI’s general intelligence.
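To give a feel for what "developing and comparing reinforcement learning algorithms" looks like in practice, here is a minimal sketch in plain Python. It does not use Gym itself: `CoinGuessEnv`, `run_episode`, and the two policies are hypothetical names made up for illustration, and the environment only imitates the reset/step style of interaction that Gym environments expose. Treat it as a toy under those assumptions, not as OpenAI's code.

```python
import random


class CoinGuessEnv:
    """Hypothetical toy environment (not part of Gym): guess the next coin flip."""

    def __init__(self, episode_length=10, seed=0):
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.steps = 0

    def reset(self):
        # Start a new episode and return the first observation.
        self.steps = 0
        return 0

    def step(self, action):
        # Flip a coin; the reward is 1.0 when the guess (0 or 1) matches it.
        coin = self.rng.randint(0, 1)
        reward = 1.0 if action == coin else 0.0
        self.steps += 1
        done = self.steps >= self.episode_length
        # The observation is the coin that was just flipped.
        return coin, reward, done, {}


def run_episode(env, policy):
    """Run one episode with the given policy and return the total reward."""
    observation = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = policy(observation)
        observation, reward, done, _info = env.step(action)
        total_reward += reward
    return total_reward


def always_heads(observation):
    # Ignore the observation and always guess 1.
    return 1


def repeat_last(observation):
    # Guess whatever the previous flip was.
    return observation


if __name__ == "__main__":
    # Compare two policies on identically seeded environments.
    for name, policy in [("always_heads", always_heads), ("repeat_last", repeat_last)]:
        score = run_episode(CoinGuessEnv(episode_length=100, seed=42), policy)
        print(name, score)
```

Because every policy talks to the environment through the same two calls, swapping in a different algorithm and comparing scores takes one line, which is the appeal of a shared toolkit.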
Open-sourced by Google, TensorFlow is the winner and still champion of open-source ML libraries. Used mostly through its easy-to-use Python API, it also has a few experimental APIs in Java and Go.
Helpfully, TensorFlow’s getting started section has an ML for beginners section as well as a section for experts. TensorFlow is probably one of the more accessible open-source tools on this list, and for good reason. It’s the top open-source ML tool on GitHub and has the most projects (have you tried the nightmarishly funny edges2cats?) as well as the biggest community.
Okay, to be fair, this Torch/Lua-based neural net is 100% on this list because of Janelle Shane’s work. The researcher behind Postcards from the Frontier of Science, Shane has come up with some amazingly fun projects using character-level language models. Whether it’s recipes, planets, or Pokémon, her neural network is just trying its hardest to learn. We shouldn’t laugh.
Torch in general is a great framework to learn, not least because it seems like Facebook is basically supporting this deep learning framework all by itself.
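For readers wondering what a character-level language model actually does, here is a rough sketch in plain Python. To be clear about the assumptions: this is not Shane's neural network and does not use Torch. It swaps in a much simpler count-based model that records which character tends to follow the previous two, then samples new names one character at a time. The function names `train_char_model` and `generate`, and the `^` and `$` start and end markers, are invented for this example.

```python
import random
from collections import Counter, defaultdict


def train_char_model(names, order=2):
    """Count which character follows each `order`-character context."""
    counts = defaultdict(Counter)
    for name in names:
        # "^" pads the start and "$" marks the end of each training name.
        padded = "^" * order + name.lower() + "$"
        for i in range(order, len(padded)):
            context = padded[i - order:i]
            counts[context][padded[i]] += 1
    return counts


def generate(counts, order=2, max_length=20, rng=None):
    """Sample one new string, character by character, from the counts."""
    rng = rng or random.Random()
    context = "^" * order
    out = []
    while len(out) < max_length:
        followers = counts[context]
        chars = list(followers.keys())
        weights = list(followers.values())
        # Pick the next character in proportion to how often it was seen.
        next_char = rng.choices(chars, weights=weights, k=1)[0]
        if next_char == "$":
            break
        out.append(next_char)
        context = (context + next_char)[-order:]
    return "".join(out)


if __name__ == "__main__":
    planets = ["mercury", "venus", "earth", "mars",
               "jupiter", "saturn", "uranus", "neptune"]
    model = train_char_model(planets, order=2)
    rng = random.Random(7)
    for _ in range(5):
        print(generate(model, order=2, rng=rng))
```

A neural character model replaces the lookup table of counts with a learned network that can take much longer contexts into account, but the generation loop (predict the next character, append it, repeat) has the same shape.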
This is a new one for us here at JAXenter. PaddlePaddle is the work of the researchers over at Baidu, the Chinese Google (among other things). Baidu has a fairly advanced AI lab that’s being run by an ex-Stanford professor. PaddlePaddle is pretty much a direct shot at Google’s open-source deep learning dominance.
Paddle stands for PArallel Distributed Deep LEarning, and it’s billed as an easy-to-use, efficient, flexible, and scalable deep learning platform. Its getting started page is pretty well structured for deep learning beginners and walks newcomers through the initial steps with some problem sets.
Microsoft’s Cognitive Toolkit is a deep-learning toolkit for training algorithms to learn like the human brain. As their GitHub page charmingly points out, “CNTK is in active use at Microsoft and constantly evolving. There will be bugs.” Fair enough.
This tool is unquestionably meant to use neural networks to go through large datasets of unstructured data. With faster training times and an easy-to-use architecture, CNTK is highly customizable, allowing you to choose your own parameters, algorithms, and networks. It’s written in Python and C++.