THIS LATEST MODEL SERVING LIBRARY HELPS DEPLOY PYTORCH MODELS AT SCALE
PyTorch has become popular within organisations for developing deep learning products. But without a dedicated PyTorch model server, building, scaling, securing, and managing models in production was difficult, and that kept companies from going all in. A robust model server loads one or more models and automatically generates a prediction API backed by a scalable web server. It also offers production-critical features such as logging, monitoring, and security.
Until now, TensorFlow Serving and Multi-Model Server catered to the needs of developers in production, but the lack of a model server that could effectively manage PyTorch workflows was a hindrance for users. To simplify model deployment, Facebook and Amazon collaborated on TorchServe, a PyTorch model serving library that helps deploy trained PyTorch models at scale without having to write custom code.
TorchServe & TorchElastic
Motivated by a request from Alex Wong on GitHub, Facebook and AWS released the much-needed service for PyTorch users. TorchServe will be available as part of the PyTorch open-source project. Users can bring their models to production faster behind a low-latency prediction API, and can also rely on default handlers for the most common applications, such as object detection and text classification.
TorchServe also includes multi-model serving, model versioning for A/B testing, monitoring metrics, and RESTful endpoints for application integration. Developers can use the model server in various machine learning environments, including Amazon SageMaker, container services, and Amazon Elastic Compute Cloud (EC2).
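To make the RESTful endpoints and model versioning concrete, here is a minimal Python sketch of how an application might call a served model. It assumes a TorchServe instance running locally with its inference API on the default port 8080, and a registered model named `densenet161`. The host, the model name, and the helper functions `prediction_url` and `predict` are illustrative assumptions, not part of the announcement.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumption: TorchServe's inference API is reachable on its default port, 8080.
INFERENCE_BASE = "http://localhost:8080"


def prediction_url(model_name, version=None, base=INFERENCE_BASE):
    """Build the inference URL for a model, optionally pinned to one version."""
    path = "/predictions/" + quote(model_name, safe="")
    if version is not None:
        # Pinning a version lets two versions be compared side by side (A/B testing).
        path += "/" + quote(str(version), safe="")
    return base.rstrip("/") + path


def predict(model_name, payload, version=None, timeout=10.0):
    """POST raw bytes (for example, an image file) and return the response body."""
    request = Request(prediction_url(model_name, version), data=payload, method="POST")
    with urlopen(request, timeout=timeout) as response:
        return response.read()
```

With a server running, `predict("densenet161", open("kitten.jpg", "rb").read())` would send the image to the default version of the model, while passing `version="2.0"` would target that specific version. The file name and version string here are hypothetical.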