<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>PyTorch Archives - Artificial Intelligence</title>
	<atom:link href="https://www.aiuniverse.xyz/tag/pytorch/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.aiuniverse.xyz/tag/pytorch/</link>
	<description>Exploring the universe of Intelligence</description>
	<lastBuildDate>Wed, 22 Jan 2025 06:12:20 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>
	<item>
		<title>What is PyTorch and Its Use Cases?</title>
		<link>https://www.aiuniverse.xyz/what-is-pytorch-and-its-use-cases/</link>
					<comments>https://www.aiuniverse.xyz/what-is-pytorch-and-its-use-cases/#respond</comments>
		
		<dc:creator><![CDATA[vijay]]></dc:creator>
		<pubDate>Wed, 22 Jan 2025 06:12:16 +0000</pubDate>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[Artificialintelligence]]></category>
		<category><![CDATA[DataScience]]></category>
		<category><![CDATA[DeepLearning]]></category>
		<category><![CDATA[MACHINELEARNING]]></category>
		<category><![CDATA[NeuralNetworks]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[PyTorch]]></category>
		<guid isPermaLink="false">https://www.aiuniverse.xyz/?p=20621</guid>

					<description><![CDATA[<p>PyTorch is an open-source machine learning framework developed by Facebook&#8217;s AI Research lab. It is widely used for tasks involving deep learning, natural language processing, and computer <a class="read-more-link" href="https://www.aiuniverse.xyz/what-is-pytorch-and-its-use-cases/">Read More</a></p>
<p>The post <a href="https://www.aiuniverse.xyz/what-is-pytorch-and-its-use-cases/">What is PyTorch and Its Use Cases?</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-image size-large"><img fetchpriority="high" decoding="async" width="1024" height="351" src="https://www.aiuniverse.xyz/wp-content/uploads/2025/01/image-153-1024x351.png" alt="" class="wp-image-20622" srcset="https://www.aiuniverse.xyz/wp-content/uploads/2025/01/image-153-1024x351.png 1024w, https://www.aiuniverse.xyz/wp-content/uploads/2025/01/image-153-300x103.png 300w, https://www.aiuniverse.xyz/wp-content/uploads/2025/01/image-153-768x263.png 768w, https://www.aiuniverse.xyz/wp-content/uploads/2025/01/image-153.png 1261w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p>PyTorch is an open-source machine learning framework developed by Facebook&#8217;s AI Research lab. It is widely used for tasks involving deep learning, natural language processing, and computer vision. PyTorch provides dynamic computational graphs, enabling developers to modify them on the fly, which is particularly beneficial for research and experimentation. It supports GPU acceleration, making large-scale data processing and model training efficient. PyTorch&#8217;s intuitive syntax, flexibility, and extensive library of tools make it a popular choice among researchers and developers. Its use cases include building neural networks for image and speech recognition, natural language understanding, recommendation systems, generative models, and reinforcement learning applications.</p>



<hr class="wp-block-separator has-alpha-channel-opacity" />



<h3 class="wp-block-heading">What is PyTorch?</h3>



<p>PyTorch is designed for both research and production purposes. It is built on Torch, a scientific computing framework with support for machine learning algorithms, and extends it with dynamic computation graphs and GPU acceleration. It is highly compatible with Python, making it accessible and user-friendly for developers, data scientists, and researchers.</p>



<p>Key Characteristics:</p>



<ul class="wp-block-list">
<li><strong>Dynamic Computation Graphs</strong>: Unlike static computation graphs, PyTorch’s graphs are dynamic, meaning they are built on-the-fly, allowing greater flexibility.</li>



<li><strong>GPU Acceleration</strong>: PyTorch supports CUDA, enabling developers to speed up computations by leveraging GPUs.</li>



<li><strong>Autograd</strong>: Its automatic differentiation engine simplifies gradient computation.</li>
</ul>
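

<p>These characteristics can be seen together in a few lines of code. The sketch below (illustrative, using only core <code>torch</code> APIs) builds the graph on the fly and lets autograd compute the gradients:</p>



<pre class="wp-block-code"><code>import torch

# Tensors created with requires_grad=True are tracked by autograd
x = torch.full((3,), 2.0, requires_grad=True)

# The graph is built as the operations run, so ordinary Python control flow works
y = (x ** 2).sum() if x.sum().item() > 0 else (x * -1).sum()

y.backward()      # autograd walks the dynamic graph to compute dy/dx
print(x.grad)     # each element is 2 * x_i = 4.0</code></pre>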



<hr class="wp-block-separator has-alpha-channel-opacity" />



<h3 class="wp-block-heading">Top 10 Use Cases of PyTorch</h3>



<ol start="1" class="wp-block-list">
<li><strong>Image Classification</strong>: PyTorch is widely used for training Convolutional Neural Networks (CNNs) for image recognition tasks, such as detecting objects or identifying diseases in medical imaging.</li>



<li><strong>Natural Language Processing (NLP)</strong>: PyTorch facilitates training transformer models, like BERT and GPT, for tasks such as text generation, sentiment analysis, and translation.</li>



<li><strong>Generative Adversarial Networks (GANs)</strong>: It supports developing GANs for applications like image synthesis, super-resolution, and artistic style transfer.</li>



<li><strong>Reinforcement Learning</strong>: PyTorch’s flexibility makes it an ideal choice for developing reinforcement learning models, used in robotics, gaming, and autonomous systems.</li>



<li><strong>Speech Recognition</strong>: With libraries like torchaudio, PyTorch is used for speech-to-text models and related audio signal processing tasks.</li>



<li><strong>Time Series Forecasting</strong>: Businesses leverage PyTorch for predictive modeling in areas such as stock price forecasting and energy demand prediction.</li>



<li><strong>Medical Imaging</strong>: PyTorch accelerates research in analyzing medical images for diagnostics, segmentation, and anomaly detection.</li>



<li><strong>Video Analytics</strong>: For applications like real-time surveillance and video content analysis, PyTorch provides the tools for developing robust solutions.</li>



<li><strong>Recommendation Systems</strong>: PyTorch is utilized in developing personalized recommendation engines, crucial for e-commerce and streaming platforms.</li>



<li><strong>Scientific Research</strong>: Researchers use PyTorch for experiments in fields like physics, biology, and climate science, owing to its flexibility and ease of integration with scientific workflows.</li>
</ol>
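

<p>As an illustration of the first use case, a minimal CNN classifier can be defined in a few lines (a hypothetical toy model, assuming 28x28 grayscale inputs and 10 classes):</p>



<pre class="wp-block-code"><code>import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv = nn.Conv2d(1, 8, kernel_size=3, padding=1)  # 1 input channel, 8 filters
        self.pool = nn.MaxPool2d(2)                            # halve height and width
        self.fc = nn.Linear(8 * 14 * 14, num_classes)

    def forward(self, x):
        x = self.pool(torch.relu(self.conv(x)))
        return self.fc(x.flatten(1))

logits = TinyCNN()(torch.randn(1, 1, 28, 28))  # one fake 28x28 grayscale image
print(logits.shape)                            # one row of 10 class scores</code></pre>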



<hr class="wp-block-separator has-alpha-channel-opacity" />



<h3 class="wp-block-heading">Features of PyTorch</h3>



<ol start="1" class="wp-block-list">
<li><strong>Dynamic Computational Graphs</strong>: Enables model changes during runtime.</li>



<li><strong>Ease of Use</strong>: Pythonic framework that integrates seamlessly with other Python libraries.</li>



<li><strong>Autograd</strong>: Automatic differentiation for complex backpropagation.</li>



<li><strong>TorchScript</strong>: Allows models to be deployed in production environments efficiently.</li>



<li><strong>Distributed Training</strong>: Supports scaling across multiple GPUs and machines.</li>



<li><strong>Robust Ecosystem</strong>: Includes libraries like torchvision, torchaudio, and torchtext for specific domains.</li>



<li><strong>Community and Documentation</strong>: Extensive community support with rich documentation and tutorials.</li>



<li><strong>Integration with PyPI and Jupyter</strong>: Simplifies installation and experimentation.</li>
</ol>
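

<p>For instance, the TorchScript feature mentioned above can capture a model for deployment (a minimal sketch using <code>torch.jit.trace</code>):</p>



<pre class="wp-block-code"><code>import torch
import torch.nn as nn

model = nn.Linear(4, 2)
example = torch.randn(1, 4)

traced = torch.jit.trace(model, example)  # record the ops into a static graph
traced.save("linear.pt")                  # the saved file can be loaded from Python or C++

loaded = torch.jit.load("linear.pt")
print(loaded(example))                    # same outputs as the original model</code></pre>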



<hr class="wp-block-separator has-alpha-channel-opacity" />



<figure class="wp-block-image size-large"><img decoding="async" width="1024" height="364" src="https://www.aiuniverse.xyz/wp-content/uploads/2025/01/image-154-1024x364.png" alt="" class="wp-image-20623" srcset="https://www.aiuniverse.xyz/wp-content/uploads/2025/01/image-154-1024x364.png 1024w, https://www.aiuniverse.xyz/wp-content/uploads/2025/01/image-154-300x107.png 300w, https://www.aiuniverse.xyz/wp-content/uploads/2025/01/image-154-768x273.png 768w, https://www.aiuniverse.xyz/wp-content/uploads/2025/01/image-154-1536x546.png 1536w, https://www.aiuniverse.xyz/wp-content/uploads/2025/01/image-154.png 1638w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<h3 class="wp-block-heading">How PyTorch Works and Architecture</h3>



<ol start="1" class="wp-block-list">
<li><strong>Tensor Operations</strong>: Tensors are the core data structures in PyTorch, akin to NumPy arrays but with GPU acceleration.</li>



<li><strong>Dynamic Computation Graph</strong>: The computation graph is created during runtime, allowing on-the-fly modifications.</li>



<li><strong>Autograd</strong>: PyTorch’s automatic differentiation engine tracks operations and computes gradients for optimization.</li>



<li><strong>Modules and Layers</strong>: Models in PyTorch are built using modular components, such as layers in the <code>torch.nn</code> module.</li>



<li><strong>Backpropagation and Optimization</strong>: PyTorch supports backpropagation through <code>autograd</code> and optimization through built-in optimizers like SGD and Adam.</li>
</ol>
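

<p>The first two points can be demonstrated with device-agnostic tensor code (a common PyTorch idiom; the same script runs on CPU or GPU):</p>



<pre class="wp-block-code"><code>import torch

# Use the GPU when CUDA is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

a = torch.randn(3, 3, device=device)
b = torch.randn(3, 3, device=device)
c = a @ b                  # matrix multiply runs on whichever device holds the tensors
print(c.device, c.shape)</code></pre>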



<hr class="wp-block-separator has-alpha-channel-opacity" />



<h3 class="wp-block-heading">How to Install PyTorch</h3>



<p>Installing PyTorch involves a few straightforward steps, depending on your system and preferences. Below is a general guide for installation:</p>



<p>1. <strong>Check System Compatibility</strong>: Ensure your system supports PyTorch, and determine whether you&#8217;ll be using a CPU-only version or a version with GPU acceleration (CUDA).</p>



<p>2. <strong>Visit the Official PyTorch Website</strong>: Go to <a href="https://pytorch.org">https://pytorch.org</a>. The website provides an easy-to-use installation selector to help generate the appropriate command based on your environment.</p>



<p>3. <strong>Choose Installation Options</strong>:</p>



<ul class="wp-block-list">
<li>Select your <strong>PyTorch Build</strong> (Stable or Nightly).</li>



<li>Choose your <strong>Operating System</strong> (Linux, macOS, or Windows).</li>



<li>Specify your <strong>Package Manager</strong> (pip, conda, etc.).</li>



<li>Select your <strong>Language</strong> (Python or C++).</li>



<li>Choose your <strong>Compute Platform</strong> (CPU, CUDA 11.8, CUDA 12, etc.).</li>
</ul>



<p>4. <strong>Run the Installation Command</strong>: Based on your selections, the website will generate a command. Copy and paste this command into your terminal or command prompt. For example:</p>



<ul class="wp-block-list">
<li>Using pip (with CUDA 12.1):</li>
</ul>



<pre class="wp-block-code"><code>pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121</code></pre>



<ul class="wp-block-list">
<li>Using conda (with CUDA 11.8):</li>
</ul>



<pre class="wp-block-code"><code>conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia</code></pre>



<p>5. <strong>Verify Installation</strong>: After installation, verify that PyTorch is installed correctly:</p>



<ul class="wp-block-list">
<li>Open a Python shell or Jupyter Notebook.</li>



<li>Import PyTorch and check its version:</li>
</ul>



<pre class="wp-block-code"><code>import torch
print(torch.__version__)
print(torch.cuda.is_available())  # Check if CUDA is available</code></pre>






<p>Following these steps will set up PyTorch for your development needs.</p>



<hr class="wp-block-separator has-alpha-channel-opacity" />



<h3 class="wp-block-heading">Basic Tutorials of PyTorch: Getting Started</h3>



<h4 class="wp-block-heading">Step 1: Importing PyTorch</h4>



<pre class="wp-block-code"><code>import torch</code></pre>



<h4 class="wp-block-heading">Step 2: Working with Tensors</h4>



<pre class="wp-block-code"><code># Creating a tensor
x = torch.tensor(&#091;&#091;1, 2], &#091;3, 4]])
print(x)

# Tensor operations
y = x + 2
print(y)</code></pre>
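

<p>A few more operations worth trying at this point (still only core tensor APIs):</p>



<pre class="wp-block-code"><code>import torch

m = torch.arange(6.0).reshape(2, 3)  # 2x3 matrix with values 0.0 through 5.0
print(m.t())                         # transpose: 3x2
print(m @ m.t())                     # matrix multiplication: 2x2
print(m.mean(), m.sum())             # reductions over all elements</code></pre>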



<h4 class="wp-block-heading">Step 3: Building a Simple Neural Network</h4>



<pre class="wp-block-code"><code>import torch.nn as nn

# Define the model
class SimpleModel(nn.Module):
    def __init__(self):
        super(SimpleModel, self).__init__()
        self.linear = nn.Linear(10, 1)

    def forward(self, x):
        return self.linear(x)

model = SimpleModel()</code></pre>



<h4 class="wp-block-heading">Step 4: Training the Model</h4>



<pre class="wp-block-code"><code>import torch.optim as optim

# Dummy data
inputs = torch.randn(100, 10)
labels = torch.randn(100, 1)

# Loss function and optimizer
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

# Training loop
for epoch in range(100):
    optimizer.zero_grad()
    outputs = model(inputs)
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()
    print(f'Epoch {epoch+1}, Loss: {loss.item()}')</code></pre>
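

<p>After training, predictions should be made with gradient tracking disabled. A fresh <code>nn.Linear</code> stands in for the trained model so the snippet is self-contained:</p>



<pre class="wp-block-code"><code>import torch
import torch.nn as nn

model = nn.Linear(10, 1)   # stands in for the SimpleModel trained above
model.eval()               # switch layers such as dropout to inference mode

with torch.no_grad():      # skip graph building to save memory and time
    new_inputs = torch.randn(5, 10)
    predictions = model(new_inputs)

print(predictions.shape)</code></pre>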



<hr class="wp-block-separator has-alpha-channel-opacity" />



<p>The post <a href="https://www.aiuniverse.xyz/what-is-pytorch-and-its-use-cases/">What is PyTorch and Its Use Cases?</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.aiuniverse.xyz/what-is-pytorch-and-its-use-cases/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Microsoft Offers Deep Learning Support with PyTorch Enterprise on Microsoft Azure</title>
		<link>https://www.aiuniverse.xyz/microsoft-offers-deep-learning-support-with-pytorch-enterprise-on-microsoft-azure/</link>
					<comments>https://www.aiuniverse.xyz/microsoft-offers-deep-learning-support-with-pytorch-enterprise-on-microsoft-azure/#respond</comments>
		
		<dc:creator><![CDATA[aiuniverse]]></dc:creator>
		<pubDate>Fri, 04 Jun 2021 11:11:42 +0000</pubDate>
				<category><![CDATA[Deep Learning]]></category>
		<category><![CDATA[deep learning]]></category>
		<category><![CDATA[ENTERPRISE]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Microsoft Azure]]></category>
		<category><![CDATA[PyTorch]]></category>
		<guid isPermaLink="false">https://www.aiuniverse.xyz/?p=14004</guid>

					<description><![CDATA[<p>Source &#8211; https://visualstudiomagazine.com/ Microsoft claims its new PyTorch Enterprise on Microsoft Azure is the first offering from a cloud platform to provide enterprise support for PyTorch, the <a class="read-more-link" href="https://www.aiuniverse.xyz/microsoft-offers-deep-learning-support-with-pytorch-enterprise-on-microsoft-azure/">Read More</a></p>
<p>The post <a href="https://www.aiuniverse.xyz/microsoft-offers-deep-learning-support-with-pytorch-enterprise-on-microsoft-azure/">Microsoft Offers Deep Learning Support with PyTorch Enterprise on Microsoft Azure</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Source &#8211; https://visualstudiomagazine.com/</p>



<p>Microsoft claims its new PyTorch Enterprise on Microsoft Azure is the first offering from a cloud platform to provide enterprise support for PyTorch, the popular open source deep learning framework.</p>



<p>Along with that enterprise support, it comes with prioritized troubleshooting and also integrates with other Azure solutions, such as Azure Machine Learning.</p>



<p>PyTorch is an open source machine learning/deep learning framework based on the Torch library, used for applications such as computer vision and natural language processing, driven by Facebook&#8217;s AI Research lab, according to Wikipedia.</p>



<p>Regular <em>Visual Studio Magazine</em> readers will know that it&#8217;s also a favorite tool of our own data science guru, Dr. James McCaffrey of Microsoft Research, who authors regular hands-on PyTorch-based tutorials for our Data Science Lab.</p>



<p>The new offering comes after Microsoft teamed up with Facebook to become a founding member of the PyTorch Enterprise Support Program, which helps service providers develop and offer tailored enterprise-grade support to their customers.</p>



<p>The enterprise support initiative was reportedly sparked by feedback from customers, who found it easy to get started with PyTorch but not so easy to implement complicated real-world enterprise production initiatives. Thus Microsoft will provide commercial support for the public PyTorch codebase. &#8220;Each release will be supported for as long as it is current,&#8221; Microsoft said in a recent blog post. &#8220;In addition, one PyTorch release will be selected for LTS every year. Such releases will be supported for two years, enabling a stable production experience without frequent major upgrade investment.&#8221;</p>



<p>Supported configurations include:</p>



<ul class="wp-block-list"><li>PyTorch: version 1.8.1 and up.</li><li>Libraries: torch, torchaudio, torchvision, torchtext, onnxruntime, and torch-tb-profiler.</li><li>Python: version 3.6 and up.</li><li>NVIDIA CUDA: versions 10.2 and 11.1.</li><li>Operating systems: Windows 10, Debian 9, Debian 10, Ubuntu 16.04.7 LTS, and Ubuntu 18.04.5 LTS (x86_64 architecture only).</li></ul>



<p>It does not, however, support C++ or Java interfaces, or PyTorch libraries and features marked &#8220;experimental and subject to change&#8221; including TorchServe and Pipeline Parallelism.</p>



<p>To be eligible at no additional cost, enterprise customers must join Microsoft&#8217;s Premier or Unified support programs. For such organizations, as part of the aforementioned prioritized troubleshooting, &#8220;The dedicated PyTorch team in Azure will prioritize, develop, and deliver hotfixes to customers as needed. These hotfixes will get tested and will be included in future PyTorch releases. In addition, Microsoft will extensively test PyTorch releases for performance regressions with continuous integration and realistic, demanding workloads from internal Microsoft applications.&#8221;</p>



<p>Microsoft is heavily invested in the PyTorch ecosystem, noting that the company&#8217;s data scientists like Dr. McCaffrey use PyTorch as the primary framework to develop models to enhance Office 365, Bing, Xbox and other offerings. That investment includes projects such as PyTorch Profiler, ONNX Runtime on PyTorch, PyTorch on Windows, DeepSpeed and more.</p>



<p>For now, PyTorch Enterprise is available on Azure Machine Learning and Data Science Virtual Machines (DSVM), coming soon to Azure Synapse Analytics.</p>
<p>The post <a href="https://www.aiuniverse.xyz/microsoft-offers-deep-learning-support-with-pytorch-enterprise-on-microsoft-azure/">Microsoft Offers Deep Learning Support with PyTorch Enterprise on Microsoft Azure</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.aiuniverse.xyz/microsoft-offers-deep-learning-support-with-pytorch-enterprise-on-microsoft-azure/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Facebook Open-Sources Machine-Learning Privacy Library Opacus</title>
		<link>https://www.aiuniverse.xyz/facebook-open-sources-machine-learning-privacy-library-opacus/</link>
					<comments>https://www.aiuniverse.xyz/facebook-open-sources-machine-learning-privacy-library-opacus/#respond</comments>
		
		<dc:creator><![CDATA[aiuniverse]]></dc:creator>
		<pubDate>Wed, 14 Oct 2020 05:13:29 +0000</pubDate>
				<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[AI research]]></category>
		<category><![CDATA[Facebook]]></category>
		<category><![CDATA[machine-learning]]></category>
		<category><![CDATA[PyTorch]]></category>
		<guid isPermaLink="false">http://www.aiuniverse.xyz/?p=12186</guid>

					<description><![CDATA[<p>Source: infoq.com Facebook AI Research (FAIR) has announced the release of Opacus, a high-speed library for applying differential privacy techniques when training deep-learning models using the PyTorch <a class="read-more-link" href="https://www.aiuniverse.xyz/facebook-open-sources-machine-learning-privacy-library-opacus/">Read More</a></p>
<p>The post <a href="https://www.aiuniverse.xyz/facebook-open-sources-machine-learning-privacy-library-opacus/">Facebook Open-Sources Machine-Learning Privacy Library Opacus</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Source: infoq.com</p>



<p>Facebook AI Research (FAIR) has announced the release of Opacus, a high-speed library for applying differential privacy techniques when training deep-learning models using the PyTorch framework. Opacus can achieve an order-of-magnitude speedup compared to other privacy libraries.</p>



<p>The library was described on the FAIR blog. Opacus provides an API and implementation of a PrivacyEngine, which attaches directly to the PyTorch optimizer during training. By using hooks in the PyTorch Autograd component, Opacus can efficiently calculate per-sample gradients, a key operation for differential privacy. Training produces a standard PyTorch model which can be deployed without changing existing model-serving code. According to FAIR,</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>[W]e hope to provide an easier path for researchers and engineers to adopt differential privacy in ML, as well as to accelerate DP research in the field.</p></blockquote>



<p>Differential privacy (DP) is a mathematical definition of data privacy. The core concept of DP is to add noise to a query operation on a dataset so that removing a single data element from the dataset has a very low probability of altering the results of that query. This probability is called the privacy budget. Each successive query expends part of the total privacy budget of the dataset; once that has happened, further queries cannot be performed while still guaranteeing privacy.</p>



<p>When this concept is applied to machine learning, it is typically applied during the training step, effectively guaranteeing that the model does not learn &#8220;too much&#8221; about specific input samples. Because most deep-learning frameworks use a training process called stochastic gradient descent (SGD), the privacy-preserving version is called DP-SGD. During the back-propagation step, normal SGD computes a single gradient tensor for an entire input &#8220;minibatch&#8221;, which is then used to update model parameters. However, DP-SGD requires computing the gradient for the individual samples in the minibatch. The implementation of this step is the key to the speed gains for Opacus.</p>



<p>For computing the individual gradients, Opacus uses an efficient algorithm developed by Ian Goodfellow, inventor of the generative adversarial network (GAN) model. Applying this technique, Opacus computes the gradient for each input sample. Each gradient is clipped to a maximum magnitude, ensuring privacy for outliers in the data. The gradients are aggregated to a single tensor, and noise is added to the result before model parameters are updated. Because each training step constitutes a &#8220;query&#8221; of the input data, and thus an expenditure of privacy budget, Opacus tracks this, providing real-time monitoring and the option to stop training when the budget is expended.</p>
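

<p>The clip-aggregate-noise step described here can be sketched in plain PyTorch (a simplified illustration of DP-SGD aggregation, not Opacus&#8217;s actual implementation; the function name and parameters are hypothetical):</p>



<pre class="wp-block-code"><code>import torch

def dp_aggregate(grads, max_norm=1.0, noise_multiplier=1.0):
    # grads holds one gradient per sample, shape (batch, dims)
    norms = grads.norm(dim=1, keepdim=True)
    scale = (max_norm / (norms + 1e-6)).clamp(max=1.0)   # clip outliers to max_norm
    total = (grads * scale).sum(dim=0)                   # aggregate to a single tensor
    noise = noise_multiplier * max_norm * torch.randn_like(total)
    return total + noise                                 # noisy update for the parameters

per_sample = torch.randn(4, 3) * 100   # exaggerated per-sample gradients
update = dp_aggregate(per_sample)
print(update.shape)</code></pre>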



<p>In developing Opacus, FAIR and the PyTorch team collaborated with OpenMined, an open-source community dedicated to developing privacy techniques for ML and AI. OpenMined had previously contributed to Facebook&#8217;s CrypTen, a framework for ML privacy research, and developed its own projects, including a DP library called PySyft and a federated-learning platform called PyGrid. According to FAIR&#8217;s blog post, Opacus will now become one of the core dependencies of OpenMined&#8217;s libraries. PyTorch&#8217;s major competitor, Google&#8217;s deep-learning framework TensorFlow, released a DP library in early 2019. However, the library is not compatible with the newer 2.x versions of TensorFlow.</p>
<p>The post <a href="https://www.aiuniverse.xyz/facebook-open-sources-machine-learning-privacy-library-opacus/">Facebook Open-Sources Machine-Learning Privacy Library Opacus</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.aiuniverse.xyz/facebook-open-sources-machine-learning-privacy-library-opacus/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>NVIDIA NeMo: An Open-Source Toolkit For Developing State-Of-The-Art Conversational AI Models In Three Lines Of Code</title>
		<link>https://www.aiuniverse.xyz/nvidia-nemo-an-open-source-toolkit-for-developing-state-of-the-art-conversational-ai-models-in-three-lines-of-code/</link>
					<comments>https://www.aiuniverse.xyz/nvidia-nemo-an-open-source-toolkit-for-developing-state-of-the-art-conversational-ai-models-in-three-lines-of-code/#respond</comments>
		
		<dc:creator><![CDATA[aiuniverse]]></dc:creator>
		<pubDate>Sat, 10 Oct 2020 06:09:52 +0000</pubDate>
				<category><![CDATA[PyTorch]]></category>
		<category><![CDATA[AI models]]></category>
		<category><![CDATA[Developing]]></category>
		<category><![CDATA[Neural modules]]></category>
		<category><![CDATA[Nvidia]]></category>
		<guid isPermaLink="false">http://www.aiuniverse.xyz/?p=12093</guid>

					<description><![CDATA[<p>Source: marktechpost.com NVIDIA’s open-source toolkit, NVIDIA NeMo( Neural Models), is a revolutionary step towards the advancement of Conversational AI. Based on PyTorch, it allows one to build <a class="read-more-link" href="https://www.aiuniverse.xyz/nvidia-nemo-an-open-source-toolkit-for-developing-state-of-the-art-conversational-ai-models-in-three-lines-of-code/">Read More</a></p>
<p>The post <a href="https://www.aiuniverse.xyz/nvidia-nemo-an-open-source-toolkit-for-developing-state-of-the-art-conversational-ai-models-in-three-lines-of-code/">NVIDIA NeMo: An Open-Source Toolkit For Developing State-Of-The-Art Conversational AI Models In Three Lines Of Code</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Source: marktechpost.com</p>



<p>NVIDIA’s open-source toolkit, NVIDIA NeMo (Neural Modules), is a revolutionary step in the advancement of conversational AI. Based on PyTorch, it allows one to quickly build, train, and fine-tune conversational AI models.</p>



<p>As the world becomes more digital, conversational AI enables communication between humans and computers. It is the set of technologies behind applications such as automated messaging, speech recognition, voice chatbots, and text-to-speech. It broadly comprises three areas of AI research: automatic speech recognition (ASR), natural language processing (NLP), and speech synthesis (or text-to-speech, TTS).</p>



<p>Conversational AI has shaped the path of human-computer interaction, making it more accessible and exciting. The latest advancements in Conversational AI like NVIDIA NeMo help bridge the gap between machines and humans.</p>



<p>NVIDIA NeMo consists of two parts: NeMo Core and NeMo Collections. NeMo Core provides functionality common to all models, whereas the NeMo Collections provide domain-specific models and building blocks. In NeMo&#8217;s Speech collection (nemo_asr), you&#8217;ll find models and various building blocks for speech recognition, command recognition, speaker identification, speaker verification, and voice activity detection. NeMo&#8217;s NLP collection (nemo_nlp) contains models for tasks such as question answering, punctuation, named entity recognition, and many others. Finally, in NeMo&#8217;s Speech Synthesis collection (nemo_tts), you&#8217;ll find several spectrogram generators and vocoders, which will let you generate synthetic speech.</p>



<p>There are three main concepts in NeMo: model, neural module, and neural type.</p>



<ul class="wp-block-list">
<li><strong>Models</strong> contain everything needed for training and fine-tuning: the neural network implementation, tokenization, data augmentation, the optimization algorithm, and infrastructure details such as the number of GPU nodes.</li>
<li><strong>Neural modules</strong> are conceptual building blocks, often arranged as encoder-decoder architectures, each responsible for a different task. A neural module represents a logical part of a neural network and forms the basis for describing the model and its training process. Collections contain many neural modules that can be reused wherever required.</li>
<li>Inputs and outputs of neural modules are typed with <strong>Neural Types</strong>. A Neural Type is a pair containing information about a tensor&#8217;s axis layout and the semantics of its elements. Every neural module has input_types and output_types properties that describe what kinds of inputs the module accepts and what types of outputs it returns.</li>
</ul>



<p>Even though NeMo is based on PyTorch, it can also be effectively used with other projects like PyTorch Lightning and Hydra. Integration with Lightning makes it easier to train models with mixed precision using Tensor Cores and can scale training to multiple GPUs and compute nodes. It also has some features like logging, checkpointing, overfit checking, etc. Hydra also allows the parametrization of scripts to keep it well organized. It makes it easier to streamline everyday tasks for users.</p>
<p>The post <a href="https://www.aiuniverse.xyz/nvidia-nemo-an-open-source-toolkit-for-developing-state-of-the-art-conversational-ai-models-in-three-lines-of-code/">NVIDIA NeMo: An Open-Source Toolkit For Developing State-Of-The-Art Conversational AI Models In Three Lines Of Code</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.aiuniverse.xyz/nvidia-nemo-an-open-source-toolkit-for-developing-state-of-the-art-conversational-ai-models-in-three-lines-of-code/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Deep Learning Restores Time-Ravaged Photos</title>
		<link>https://www.aiuniverse.xyz/deep-learning-restores-time-ravaged-photos/</link>
					<comments>https://www.aiuniverse.xyz/deep-learning-restores-time-ravaged-photos/#respond</comments>
		
		<dc:creator><![CDATA[aiuniverse]]></dc:creator>
		<pubDate>Mon, 05 Oct 2020 09:29:52 +0000</pubDate>
				<category><![CDATA[PyTorch]]></category>
		<category><![CDATA[deep learning]]></category>
		<category><![CDATA[GitHub]]></category>
		<category><![CDATA[Photos]]></category>
		<category><![CDATA[researchers]]></category>
		<category><![CDATA[Restores]]></category>
		<category><![CDATA[Technology]]></category>
		<guid isPermaLink="false">http://www.aiuniverse.xyz/?p=11933</guid>

					<description><![CDATA[<p>Source: i-programmer.info Researchers have devised a novel deep learning approach to repairing the damage suffered by old photographic prints. The project is open source and a PyTorch <a class="read-more-link" href="https://www.aiuniverse.xyz/deep-learning-restores-time-ravaged-photos/">Read More</a></p>
<p>The post <a href="https://www.aiuniverse.xyz/deep-learning-restores-time-ravaged-photos/">Deep Learning Restores Time-Ravaged Photos</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Source: i-programmer.info</p>



<p>Researchers have devised a novel deep learning approach to repairing the damage suffered by old photographic prints. The project is open source and a PyTorch implementation is downloadable from GitHub. There&#8217;s also a Colab where you can try it out.</p>



<p>We&#8217;ve encountered neural networks that can colorize old black-and-white shots, improve photographs of landscapes, and even paint portraits in the style of an old master. Here the goal is more modest: to apply a deep learning approach to restoring old photos that have suffered severe degradation.</p>



<p>The researchers, from Microsoft Research Asia in Beijing, the University of Science and Technology of China, and now the City University of Hong Kong, start from the premise that:</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>Photos are taken to freeze the happy moments that otherwise gone. Even though time goes by, one can still evoke memories of the past by viewing them. Nonetheless, old photo prints deteriorate when kept in poor environmental condition, which causes the valuable photo content permanently damaged.</p></blockquote>



<p>As manual retouching of prints is laborious and time-consuming, they set out to design automatic algorithms that can instantly repair old photos for those who wish to bring them back to life.</p>



<p>The researchers presented their work as an oral presentation at CVPR 2020, held virtually in June, and their paper, &#8220;Bringing Old Photos Back to Life&#8221;, is already available as part of the conference proceedings.</p>
<p>The post <a href="https://www.aiuniverse.xyz/deep-learning-restores-time-ravaged-photos/">Deep Learning Restores Time-Ravaged Photos</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.aiuniverse.xyz/deep-learning-restores-time-ravaged-photos/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Six deep learning applications ready for the enterprise mainstream</title>
		<link>https://www.aiuniverse.xyz/six-deep-learning-applications-ready-for-the-enterprise-mainstream/</link>
					<comments>https://www.aiuniverse.xyz/six-deep-learning-applications-ready-for-the-enterprise-mainstream/#respond</comments>
		
		<dc:creator><![CDATA[aiuniverse]]></dc:creator>
		<pubDate>Mon, 13 Jul 2020 05:35:02 +0000</pubDate>
				<category><![CDATA[Deep Learning]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[applications]]></category>
		<category><![CDATA[deep learning]]></category>
		<category><![CDATA[PyTorch]]></category>
		<category><![CDATA[Voice recognition]]></category>
		<guid isPermaLink="false">http://www.aiuniverse.xyz/?p=10130</guid>

					<description><![CDATA[<p>Source: bestgamingpro.com Deep learning opens a whole new level of capability within the artificial intelligence realm, but its use has so far been limited to data scientists. Now, at last, <a class="read-more-link" href="https://www.aiuniverse.xyz/six-deep-learning-applications-ready-for-the-enterprise-mainstream/">Read More</a></p>
<p>The post <a href="https://www.aiuniverse.xyz/six-deep-learning-applications-ready-for-the-enterprise-mainstream/">Six deep learning applications ready for the enterprise mainstream</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Source: bestgamingpro.com</p>



<p>Deep learning opens a whole new level of capability within the artificial intelligence realm, but its use has so far been limited to data scientists. Now, at last, it may be ripe for “democratization,” meaning it is poised to become an accessible set of technologies available to all who need it, with numerous enterprise applications.</p>



<p>Deep learning, which attempts to imitate the way the human brain analyzes patterns, is beginning to see widespread adoption in enterprise AI initiatives. A majority of companies with AI implementations, 53%, plan to incorporate deep learning into their workplaces within the next 24 months, according to a recent survey of 154 IT and business professionals conducted and published by ITPro Today, InformationWeek, and Interop.</p>



<p>Deep learning is now driving rapid innovation in AI and fueling major disruptions across all markets, asserts a new report published by Databricks. “Deep learning models can be trained to perform complicated tasks such as image or speech recognition and determine meaning from these inputs,” the paper’s authors state. “A key advantage is that these models scale well with data, and their performance will improve as the size of your data increases.”</p>



<p>The Databricks report defines deep learning as “a specialized and advanced form of machine learning that performs what is considered end-to-end learning. A deep learning algorithm is given massive volumes of data, typically unstructured and disparate, and a task to perform, such as classification. The resulting model is then capable of solving complex tasks such as recognizing objects within an image and translating speech in real time.”</p>



<p>The following are applications enabled by deep learning:</p>



<ul class="wp-block-list"><li><strong>Image classification:</strong>&nbsp;“The process of identifying and detecting an object or a feature in a digital image or video,” the report states. In retail, deep learning models “rapidly scan and analyze in-store imagery to intuitively determine inventory movement.”&nbsp;</li><li><strong>Voice recognition:</strong>&nbsp;“The ability to receive and interpret dictation or to understand and carry out spoken commands. Models are able to convert captured voice commands to text and then use natural language processing to understand what is being said and in what context.” In transportation, deep learning “uses voice commands to let drivers make phone calls and adjust interior controls, all without taking their hands off the steering wheel.”&nbsp;</li><li><strong>Anomaly detection:</strong>&nbsp;“Deep learning techniques strive to recognize abnormal patterns that do not match the behaviors expected for a particular system, out of millions of different transactions. These applications can lead to the discovery of an attack on financial networks, fraud detection in insurance filings or credit card purchases, even isolating sensor data in industrial facilities that signifies a safety issue.”&nbsp;</li><li><strong>Recommendation engines:</strong>&nbsp;“Analyze user actions in order to provide recommendations based on user behavior.”&nbsp;</li><li><strong>Sentiment analysis:</strong>&nbsp;“Leverages deep-learning-heavy techniques such as natural language processing, text analysis, and computational linguistics to gain clear insight into customer opinion, understand consumer sentiment, and measure the impact of marketing strategies.”</li><li><strong>Video analysis:&nbsp;</strong>“Process and evaluate vast streams of video footage for a range of tasks including threat detection, which can be used in airport security, banks, and sporting events.”</li></ul>



<p>Popular deep learning frameworks to get started with this technology include TensorFlow, Caffe, MXNet, Keras, and PyTorch.</p>
<p>The post <a href="https://www.aiuniverse.xyz/six-deep-learning-applications-ready-for-the-enterprise-mainstream/">Six deep learning applications ready for the enterprise mainstream</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.aiuniverse.xyz/six-deep-learning-applications-ready-for-the-enterprise-mainstream/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>THIS LATEST MODEL SERVING LIBRARY HELPS DEPLOY PYTORCH MODELS AT SCALE</title>
		<link>https://www.aiuniverse.xyz/this-latest-model-serving-library-helps-deploy-pytorch-models-at-scale/</link>
					<comments>https://www.aiuniverse.xyz/this-latest-model-serving-library-helps-deploy-pytorch-models-at-scale/#respond</comments>
		
		<dc:creator><![CDATA[aiuniverse]]></dc:creator>
		<pubDate>Mon, 04 May 2020 06:57:20 +0000</pubDate>
				<category><![CDATA[PyTorch]]></category>
		<category><![CDATA[automatically]]></category>
		<category><![CDATA[deep learning]]></category>
		<category><![CDATA[developed]]></category>
		<guid isPermaLink="false">http://www.aiuniverse.xyz/?p=8542</guid>

					<description><![CDATA[<p>Source: analyticsindiamag.com PyTorch has become popular within organisations to develop superior deep learning products. But building, scaling, securing, and managing models in production due to lack of <a class="read-more-link" href="https://www.aiuniverse.xyz/this-latest-model-serving-library-helps-deploy-pytorch-models-at-scale/">Read More</a></p>
<p>The post <a href="https://www.aiuniverse.xyz/this-latest-model-serving-library-helps-deploy-pytorch-models-at-scale/">THIS LATEST MODEL SERVING LIBRARY HELPS DEPLOY PYTORCH MODELS AT SCALE</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Source: analyticsindiamag.com</p>



<p>PyTorch has become popular within organisations for developing advanced deep learning products, but the lack of a PyTorch model server made building, scaling, securing, and managing models in production difficult, keeping companies from going all in. A robust model server loads one or more models and automatically generates a prediction API backed by a scalable web server, while also offering production-critical features like logging, monitoring, and security.</p>



<p>Until now, TensorFlow Serving and Multi-Model Server catered to the needs of developers in production, but the lack of a model server that could effectively manage PyTorch workflows was a hindrance for users. Consequently, to simplify model deployment, Facebook and Amazon collaborated on TorchServe, a PyTorch model serving library that helps deploy trained PyTorch models at scale without having to write custom code.</p>



<h4 class="wp-block-heading">TorchServe &amp; TorchElastic</h4>



<p>Motivated by a request from Alex Wong on GitHub, Facebook and AWS released the much-needed service for PyTorch enthusiasts. TorchServe will be available as part of the PyTorch open-source project. Users can not only bring their models to production more quickly behind a low-latency prediction API, but also rely on default handlers for the most common applications, such as object detection and text classification.</p>



<p>TorchServe also includes multi-model serving, model versioning for A/B testing, monitoring metrics, and RESTful endpoints for application integration. Developers can leverage the model server in various machine learning environments, including Amazon SageMaker, container services, and EC2 (Amazon Elastic Compute Cloud).</p>
<p>The post <a href="https://www.aiuniverse.xyz/this-latest-model-serving-library-helps-deploy-pytorch-models-at-scale/">THIS LATEST MODEL SERVING LIBRARY HELPS DEPLOY PYTORCH MODELS AT SCALE</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.aiuniverse.xyz/this-latest-model-serving-library-helps-deploy-pytorch-models-at-scale/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>AWS Announces Support for PyTorch with Amazon Elastic Inference</title>
		<link>https://www.aiuniverse.xyz/aws-announces-support-for-pytorch-with-amazon-elastic-inference/</link>
					<comments>https://www.aiuniverse.xyz/aws-announces-support-for-pytorch-with-amazon-elastic-inference/#respond</comments>
		
		<dc:creator><![CDATA[aiuniverse]]></dc:creator>
		<pubDate>Tue, 24 Mar 2020 06:31:01 +0000</pubDate>
				<category><![CDATA[PyTorch]]></category>
		<category><![CDATA[Amazon]]></category>
		<category><![CDATA[AWS]]></category>
		<category><![CDATA[Elastic Inference]]></category>
		<guid isPermaLink="false">http://www.aiuniverse.xyz/?p=7663</guid>

					<description><![CDATA[<p>Source: datanami.com AWS has announced that the Amazon Elastic Inference is now compatible with PyTorch models. PyTorch, which AWS describes as a “popular deep learning framework that <a class="read-more-link" href="https://www.aiuniverse.xyz/aws-announces-support-for-pytorch-with-amazon-elastic-inference/">Read More</a></p>
<p>The post <a href="https://www.aiuniverse.xyz/aws-announces-support-for-pytorch-with-amazon-elastic-inference/">AWS Announces Support for PyTorch with Amazon Elastic Inference</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Source: datanami.com</p>



<p>AWS has announced that Amazon Elastic Inference is now compatible with PyTorch models. PyTorch, which AWS describes as a “popular deep learning framework that uses dynamic computational graphs,” is a piece of free, open-source software developed largely by Facebook’s AI Research Lab (FAIR) that allows developers to more easily apply Python code for deep learning. With Amazon’s announcement, PyTorch can now work with Amazon’s SageMaker and EC2 cloud services. PyTorch is the third major deep learning framework to be supported by Amazon Elastic Inference, following in the footsteps of TensorFlow and Apache MXNet.</p>
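<p>The “dynamic computational graphs” mentioned above mean that PyTorch builds the graph as the forward pass executes, so ordinary Python control flow can depend on the data itself. A minimal sketch (the model and sizes are purely illustrative):</p>

```python
import torch
import torch.nn as nn

class DynamicNet(nn.Module):
    """A toy network whose depth depends on the input data."""

    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(4, 4)

    def forward(self, x):
        # Data-dependent loop: the graph is rebuilt on every call,
        # which is what makes PyTorch's execution model "dynamic".
        steps = int(x.abs().sum().item()) % 3 + 1
        for _ in range(steps):
            x = torch.relu(self.layer(x))
        return x

net = DynamicNet()
y = net(torch.randn(2, 4))
```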



<p>Inference – making actual predictions with a trained model – is a compute-intensive process, accounting for up to 90% of PyTorch models’ total compute costs, according to AWS. Instance selection is, therefore, important for optimization. “Optimizing for one of these resources on a standalone GPU instance usually leads to under-utilization of other resources,” wrote David Fan (a software engineer with AWS AI) and Srinivas Hanabe (a principal product manager with AWS AI for Elastic Inference) in the AWS announcement blog. “Therefore, you might pay for unused resources.”</p>



<p>The duo argue that Amazon Elastic Inference solves this problem for PyTorch by allowing users to select the most appropriate CPU instance in AWS and separately select the appropriate amount of GPU-based inference acceleration.</p>



<p>In order to use PyTorch with Elastic Inference, developers must convert their models to TorchScript. “PyTorch’s use of dynamic computational graphs greatly simplifies the model development process,” Fan and Hanabe wrote. “However, this paradigm presents unique challenges for production model deployment. In a production context, it is beneficial to have a static graph representation of the model.”&nbsp;</p>
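<p>The conversion step described above can be as simple as tracing the model with a representative input; the sketch below uses a small stand-in model, and the file name is illustrative:</p>

```python
import torch
import torch.nn as nn

# A small eager-mode model standing in for a real PyTorch model.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
model.eval()

# Trace with a representative input to record a static TorchScript
# graph that can be serialized and executed without Python.
example = torch.randn(1, 4)
traced = torch.jit.trace(model, example)

# The traced module can be saved as a single deployable artifact.
traced.save("model.pt")
```

Tracing records one execution path, so a model whose forward pass contains data-dependent control flow would need `torch.jit.script` instead.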



<p>To that end, they said, TorchScript bridges the gap by allowing users to compile and export their models into a graph-based form. In the blog, the authors provide step-by-step guides for using PyTorch with Amazon Elastic Inference, including conversion to TorchScript, instance selection, and more. They also discuss cost and latency among cloud deep learning platforms, highlighting how Elastic Inference’s hybrid approach offers “the best of both worlds” by combining the advantages of CPUs and GPUs without the drawbacks of standalone instances. To that end, they presented a bar chart comparing cost-per-inference and latency across Elastic Inference models (gray), models run on standalone GPU instances (green), and models run on standalone CPU instances (blue).</p>



<p> “Amazon Elastic Inference is a low-cost and flexible solution for PyTorch inference workloads on Amazon SageMaker,” they concluded. “You can get GPU-like inference acceleration and remain more cost-effective than both standalone Amazon SageMaker GPU and CPU instances, by attaching Elastic Inference accelerators to an Amazon SageMaker instance.” </p>
<p>The post <a href="https://www.aiuniverse.xyz/aws-announces-support-for-pytorch-with-amazon-elastic-inference/">AWS Announces Support for PyTorch with Amazon Elastic Inference</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.aiuniverse.xyz/aws-announces-support-for-pytorch-with-amazon-elastic-inference/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>PyTorch 1.4 Release Introduces Java Bindings, Distributed Training</title>
		<link>https://www.aiuniverse.xyz/pytorch-1-4-release-introduces-java-bindings-distributed-training/</link>
					<comments>https://www.aiuniverse.xyz/pytorch-1-4-release-introduces-java-bindings-distributed-training/#respond</comments>
		
		<dc:creator><![CDATA[aiuniverse]]></dc:creator>
		<pubDate>Wed, 26 Feb 2020 05:27:41 +0000</pubDate>
				<category><![CDATA[PyTorch]]></category>
		<category><![CDATA[applications]]></category>
		<category><![CDATA[deep learning]]></category>
		<category><![CDATA[Facebook]]></category>
		<guid isPermaLink="false">http://www.aiuniverse.xyz/?p=7035</guid>

					<description><![CDATA[<p>Source: infoq.com PyTorch, Facebook&#8217;s open-source deep-learning framework, announced the release of version 1.4. This release, which will be the last version to support Python 2, includes improvements to distributed training <a class="read-more-link" href="https://www.aiuniverse.xyz/pytorch-1-4-release-introduces-java-bindings-distributed-training/">Read More</a></p>
<p>The post <a href="https://www.aiuniverse.xyz/pytorch-1-4-release-introduces-java-bindings-distributed-training/">PyTorch 1.4 Release Introduces Java Bindings, Distributed Training</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Source: infoq.com</p>



<p>PyTorch, Facebook&#8217;s open-source deep-learning framework, announced the release of version 1.4. This release, which will be the last version to support Python 2, includes improvements to distributed training and mobile inference and introduces support for Java.</p>



<p>This release follows the recent announcements and presentations at the 2019 Conference on Neural Information Processing Systems (NeurIPS) in December. For training large models, the release includes a distributed framework to support model-parallel training across multiple GPUs. Improvements to PyTorch Mobile allow developers to customize their build scripts, which can greatly reduce the storage required by models. Building on the Android interface for PyTorch Mobile, the release includes experimental Java bindings for using TorchScript models to perform inference. PyTorch also supports Python and C++; this release will be the last that supports Python 2 and C++ 11. According to the release notes:</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>The release contains over 1,500 commits and a significant amount of effort in areas spanning existing areas like JIT, ONNX, Distributed, Performance and Eager Frontend Improvements and improvements to experimental areas like mobile and quantization.</p></blockquote>



<p>Recent trends in deep-learning research, particularly in natural-language processing (NLP), have produced larger and more complex models such as RoBERTa, with hundreds of millions of parameters. These models are too large to fit within the memory of a single GPU, but a technique called <em>model-parallel</em> training allows different subsets of the parameters of the model to be handled by different GPUs. Previous versions of PyTorch have supported <em>single-machine</em> model parallel, which requires that all the GPUs used for training be hosted in the same machine. By contrast, PyTorch 1.4 introduces a distributed remote procedure call (RPC) system which supports model-parallel training across many machines.</p>
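<p>To make the single-machine case concrete, here is a minimal model-parallel sketch in which two stages of a network live on (potentially) different devices and the activation is handed off between them. The devices are CPU here so the sketch runs anywhere; a real setup would use <code>cuda:0</code> and <code>cuda:1</code>:</p>

```python
import torch
import torch.nn as nn

# Stand-ins for two separate accelerators; in practice these would be
# torch.device("cuda:0") and torch.device("cuda:1").
dev0, dev1 = torch.device("cpu"), torch.device("cpu")

class TwoStageModel(nn.Module):
    """Each stage is placed on its own device; forward() moves the
    activation between devices, which is the essence of model parallelism."""

    def __init__(self):
        super().__init__()
        self.stage1 = nn.Linear(16, 32).to(dev0)
        self.stage2 = nn.Linear(32, 4).to(dev1)

    def forward(self, x):
        x = torch.relu(self.stage1(x.to(dev0)))
        return self.stage2(x.to(dev1))  # hand-off between devices

model = TwoStageModel()
out = model(torch.randn(8, 16))
```

The distributed RPC framework generalizes this hand-off across machine boundaries, so the stages no longer need to share a host.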



<p>After a model is trained, it must be deployed and used for inference or prediction. Because many applications are deployed on mobile devices with limited compute, memory, and storage resources, the large models often cannot be deployed as-is. PyTorch 1.3 introduced PyTorch Mobile and TorchScript, which aimed to shorten end-to-end development cycle time by supporting the same APIs across different platforms, eliminating the need to export models to a mobile framework such as Caffe2. The 1.4 release allows developers to customize their build packages to only include the PyTorch operators needed by their models. The PyTorch team reports that customized packages can be &#8220;40% to 50% smaller than the prebuilt PyTorch mobile library.&#8221; With the new Java bindings, developers can invoke TorchScript models directly from Java code; previous versions only supported Python and C++. The Java bindings are only available on Linux.</p>



<p>Although rival deep-learning framework TensorFlow ranks as the leading choice for commercial applications, PyTorch has the lead in the research community. At the 2019 NeurIPS conference in December, PyTorch was used in 70% of the presented papers that cited a framework. Recently, both Preferred Networks, Inc (PFN) and research consortium OpenAI announced moves to PyTorch. OpenAI claimed that &#8220;switching to PyTorch decreased our iteration time on research ideas in generative modeling from weeks to days.&#8221; In a discussion thread about the announcement, a user on Hacker News noted:</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>At work, we switched over from TensorFlow to PyTorch when 1.0 was released, both for R&amp;D and production&#8230; and our productivity and happiness with PyTorch noticeably, significantly improved.</p></blockquote>



<p>The PyTorch source code and release notes for version 1.4 are available on GitHub.</p>
<p>The post <a href="https://www.aiuniverse.xyz/pytorch-1-4-release-introduces-java-bindings-distributed-training/">PyTorch 1.4 Release Introduces Java Bindings, Distributed Training</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.aiuniverse.xyz/pytorch-1-4-release-introduces-java-bindings-distributed-training/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Microsoft speeds up PyTorch with DeepSpeed</title>
		<link>https://www.aiuniverse.xyz/microsoft-speeds-up-pytorch-with-deepspeed-2/</link>
					<comments>https://www.aiuniverse.xyz/microsoft-speeds-up-pytorch-with-deepspeed-2/#respond</comments>
		
		<dc:creator><![CDATA[aiuniverse]]></dc:creator>
		<pubDate>Wed, 12 Feb 2020 06:57:47 +0000</pubDate>
				<category><![CDATA[Microsoft Azure Machine Learning]]></category>
		<category><![CDATA[deep learning]]></category>
		<category><![CDATA[DeepSpeed]]></category>
		<category><![CDATA[Machine learning]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Microsoft Azure]]></category>
		<category><![CDATA[PyTorch]]></category>
		<guid isPermaLink="false">http://www.aiuniverse.xyz/?p=6708</guid>

					<description><![CDATA[<p>Source: sg.channelasia.tech Microsoft has released DeepSpeed, a new deep learning optimisation library for PyTorch, that is designed to reduce memory use and train models with better parallelism on <a class="read-more-link" href="https://www.aiuniverse.xyz/microsoft-speeds-up-pytorch-with-deepspeed-2/">Read More</a></p>
<p>The post <a href="https://www.aiuniverse.xyz/microsoft-speeds-up-pytorch-with-deepspeed-2/">Microsoft speeds up PyTorch with DeepSpeed</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Source: sg.channelasia.tech</p>



<p>Microsoft has released DeepSpeed, a new deep learning optimisation library for PyTorch designed to reduce memory use and to train models with better parallelism on existing hardware.</p>



<p>According to a Microsoft Research blog post announcing the new framework, DeepSpeed improves PyTorch model training through a memory optimisation technology that increases the number of possible parameters a model can be trained with, makes better use of the memory local to the GPU, and requires only minimal changes to an existing PyTorch application to be useful.</p>



<p>It’s the minimal impact on existing PyTorch code that may matter most. As machine learning libraries grow entrenched and more applications become dependent on them, there is less room for new frameworks and more incentive to make existing frameworks more performant and scalable.</p>



<p>PyTorch is already fast when it comes to both computational and development speed, but there’s always room for improvement.&nbsp;Applications written for PyTorch can make use of DeepSpeed with only minimal changes to the code; there’s no need to start from scratch with another framework.</p>



<p>One way DeepSpeed enhances PyTorch is by improving its native parallelism.</p>



<p>In one example, provided by Microsoft in the DeepSpeed documentation, attempting to train a model using&nbsp;PyTorch’s Distributed Data Parallel system across Nvidia V100 GPUs with 32GB of device memory “[ran] out of memory with 1.5 billion parameter models,” while DeepSpeed was able to reach 6 billion parameters on the same hardware.</p>



<p>Another touted DeepSpeed improvement is more efficient use of GPU memory for training. By partitioning the model training across GPUs, DeepSpeed allows the needed data to be kept close at hand, reduces the memory requirements of each GPU, and reduces the communication overhead between GPUs.</p>
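<p>DeepSpeed is driven by a JSON-style configuration passed at initialization. The fragment below is an illustrative sketch only: the field names follow DeepSpeed's public documentation (mixed precision via <code>fp16</code>, partitioning of optimizer state via <code>zero_optimization</code>), but the values are placeholders:</p>

```python
import json

# Illustrative DeepSpeed-style configuration (field names follow the
# public DeepSpeed documentation; values are placeholders).
ds_config = {
    "train_batch_size": 32,
    "fp16": {"enabled": True},          # train in mixed precision
    "zero_optimization": {"stage": 1},  # partition optimizer state across GPUs
}

# In a real script, this dict (or a JSON file with the same content)
# would be handed to deepspeed.initialize(...); here it is shown only as data.
print(json.dumps(ds_config, indent=2))
```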



<p>A third benefit is allowing for more parameters during model training to improve prediction accuracy. Hyperparameter optimisation, which refers to tuning the parameters or variables of the training process itself, can improve the accuracy of a model but typically at the cost of manual effort and expertise.</p>



<p>To eliminate the need for expertise and human effort, many machine learning frameworks now support some kind of automated hyperparameter optimisation.</p>



<p>With DeepSpeed, Microsoft claims that “deep learning models with 100 billion parameters” can be trained on “the current generation of GPU clusters at three to five times the throughput of the current best system.”</p>



<p>DeepSpeed is available as free open source under the MIT License. Tutorials in the official repo work with Microsoft Azure, but Azure is not required to use DeepSpeed.</p>
<p>The post <a href="https://www.aiuniverse.xyz/microsoft-speeds-up-pytorch-with-deepspeed-2/">Microsoft speeds up PyTorch with DeepSpeed</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.aiuniverse.xyz/microsoft-speeds-up-pytorch-with-deepspeed-2/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
