PyTorch announces the availability of PyTorch Hub for improving machine learning research reproducibility

Source: hub.packtpub.com

Yesterday, the team at PyTorch announced the availability of PyTorch Hub, a simple API and workflow that offers the basic building blocks to improve machine learning research reproducibility.

Reproducibility is an essential requirement in many research fields, including those based on machine learning techniques. Yet most machine learning research publications are either not reproducible or too difficult to reproduce.

With the number of research publications growing, including tens of thousands of papers hosted on arXiv and submissions to conferences, research reproducibility has become even more important. Although many publications are accompanied by useful code and trained models, users are still left to figure out most of the steps themselves.

PyTorch Hub consists of a pre-trained model repository designed to facilitate research reproducibility and to enable new research. It provides built-in support for Colab and integration with Papers With Code, and it contains a set of models that includes classification and segmentation, transformers, and generative models, among others. Pre-trained models are published from a GitHub repository by adding a simple hubconf.py file, which lists the models to be supported and the dependencies required to run them.
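
As a rough illustration of the hubconf.py file described above, the following is a minimal sketch rather than a file from any real repository. The entrypoint name `tiny_classifier`, its parameters, and the model it builds are hypothetical; only the `dependencies` list and the idea of one function per published model come from the Hub workflow.

```python
# hubconf.py: a minimal sketch. The entrypoint name and model are hypothetical.

# Packages that must be installed before any entrypoint can be loaded.
dependencies = ["torch"]


def tiny_classifier(pretrained=False, num_classes=10, **kwargs):
    """Hypothetical entrypoint that returns a small linear classifier.

    pretrained: if True, a real repository would load published weights.
    num_classes: size of the output layer.
    """
    # Imported inside the function so this sketch can be read without PyTorch.
    import torch.nn as nn

    model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, num_classes))
    if pretrained:
        # Assumption: a real repo would fetch a checkpoint here, for example
        # with torch.hub.load_state_dict_from_url(<checkpoint URL>).
        raise NotImplementedError("this sketch ships no published weights")
    return model
```

Each public function in the file acts as one entrypoint, and its docstring is a natural place to describe how the model should be instantiated.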

For example, one can check out the torchvision, huggingface-bert and gan-model-zoo repositories. Consider the torchvision hubconf.py: in the torchvision repository, each model file can function and be executed independently. These model files require no package other than PyTorch, and they need no separate entrypoints.

With a hubconf.py in place, users can submit their models by sending a pull request based on the template on the GitHub page.

The official blog post reads, “Our goal is to curate high-quality, easily-reproducible, maximally-beneficial models for research reproducibility. Hence, we may work with you to refine your pull request and in some cases reject some low-quality models to be published. Once we accept your pull request, your model will soon appear on Pytorch hub webpage for all users to explore.”

PyTorch Hub lets users explore available models, load a model, and understand which methods are available for a given model. A few examples follow:

Explore available entrypoints:

Using the torch.hub.list() API, users can list all available entrypoints in a repo. Apart from pretrained models, PyTorch Hub also allows auxiliary entrypoints, such as bertTokenizer for preprocessing in the BERT models, to make the user workflow smoother.
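
A short sketch of how the listing step might look in practice. The helper name `find_entrypoints` and its `keyword` filter are illustrative additions, not part of PyTorch; only `torch.hub.list()` is the real API. Running it for real assumes PyTorch is installed and GitHub is reachable.

```python
def find_entrypoints(repo, keyword=""):
    """Return the entrypoint names in a hub repo that contain `keyword`.

    repo: a GitHub repo in "owner/name" form, e.g. "pytorch/vision".
    keyword: optional case-insensitive substring used to filter the names.
    """
    # Imported here so the helper can be defined without PyTorch present.
    import torch

    # torch.hub.list() returns the names of all entrypoints in hubconf.py.
    names = torch.hub.list(repo)
    return sorted(n for n in names if keyword.lower() in n.lower())
```

For instance, `find_entrypoints("pytorch/vision", "resnet")` would narrow the torchvision entrypoints down to those with "resnet" in their names.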

Load a model:

Using the torch.hub.load() API, users can load a model entrypoint. The companion torch.hub.help() API provides useful information about how to instantiate the model.
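
The loading step can be sketched in a similar way. The wrapper `load_entrypoint` and its `show_help` flag are hypothetical conveniences; the underlying calls are `torch.hub.help()`, which returns the entrypoint's docstring, and `torch.hub.load()`, which forwards extra keyword arguments to the entrypoint. As before, real use assumes PyTorch is installed and the repo can be downloaded.

```python
def load_entrypoint(repo, name, show_help=False, **kwargs):
    """Load one entrypoint from a hub repo, optionally printing its docstring.

    repo: a GitHub repo in "owner/name" form, e.g. "pytorch/vision".
    name: the entrypoint to call, e.g. "resnet18".
    kwargs: passed through to the entrypoint, e.g. pretrained=True.
    """
    # Imported here so the helper can be defined without PyTorch present.
    import torch

    if show_help:
        # torch.hub.help() returns the docstring of the chosen entrypoint.
        print(torch.hub.help(repo, name))

    # torch.hub.load() fetches the repo and calls the entrypoint with kwargs.
    return torch.hub.load(repo, name, **kwargs)
```

A call such as `load_entrypoint("pytorch/vision", "resnet18", show_help=True, pretrained=True)` would print the documentation for the entrypoint and then return the model it constructs.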

Users have largely welcomed the announcement. One user commented on Hacker News, “I love that the tooling for ML experimentation is becoming more mature. Keeping track of hyperparameters, training/validation/test experiment test set manifests, code state, etc is both extremely crucial and extremely necessary.”
