Image: PyTorch logo and ControlNet, by Sanyam Bhutani
PyTorch announced its next-generation release, PyTorch 2.0, at the PyTorch Conference on December 2, 2022.
The release keeps the same user experience as its predecessor while making fundamental changes under the hood, moving core operations to the compiler level for faster performance and better support for dynamic shapes and distributed execution.
PyTorch 2.0 also introduces prototype features and technologies across TensorParallel, DTensor, 2D parallelism, TorchDynamo, AOTAutograd, PrimTorch, and TorchInductor.
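The whole compiler stack is exposed through a single entry point, torch.compile. As a minimal sketch of how it is used (the toy model here is illustrative):

import torch

# Any nn.Module (or plain Python function) can be compiled.
model = torch.nn.Sequential(
    torch.nn.Linear(128, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 10),
)

# torch.compile routes the forward pass through TorchDynamo for graph
# capture and, by default, TorchInductor for code generation.
compiled_model = torch.compile(model)

x = torch.randn(32, 128)
out = compiled_model(x)  # the first call compiles; later calls reuse the compiled graph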
In addition to the above features, PyTorch 2.0 includes several other updates and improvements across inference, performance, and training optimization features on GPUs and CPUs.
There are several strategies for improving the efficiency of an already designed and trained DNN model on a target hardware platform, such as quantization and sparsity, as well as Transformer-specific optimization methods.
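As one concrete illustration of the first of these strategies, PyTorch ships a dynamic quantization API; the sketch below (with an illustrative toy model) shows the idea:

import torch

model = torch.nn.Sequential(
    torch.nn.Linear(128, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 10),
)

# Dynamic quantization stores Linear weights in int8 and quantizes
# activations on the fly at inference time, shrinking the model and
# often speeding up CPU inference.
quantized_model = torch.ao.quantization.quantize_dynamic(
    model,
    {torch.nn.Linear},  # which module types to quantize
    dtype=torch.qint8,
)

out = quantized_model(torch.randn(1, 128))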
With this release, the UX and the nebullvm installation process have been improved, and Speedster now supports the TensorFlow backend for Hugging Face Transformers.
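As a rough sketch of what that looks like in practice (the model name and the input format for Hugging Face models are illustrative; check the Speedster docs for the exact expected inputs in your version):

from speedster import optimize_model
from transformers import AutoTokenizer, TFAutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = TFAutoModel.from_pretrained("bert-base-uncased")  # TensorFlow backend

# Speedster searches over compilers and quantizers using sample inputs.
input_data = [tokenizer("This is a sample sentence.", return_tensors="tf")
              for _ in range(10)]

optimized_model = optimize_model(
    model,
    input_data=input_data,
    optimization_time="constrained",  # restrict the search to fast-to-apply optimizations
)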
The goal of this post is to explore Meta Research's new Token Merging (ToMe) optimization strategy, run some practical experiments with it, and benchmark it against other state-of-the-art inference optimization techniques.
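As a taste of what the experiments below involve, here is a minimal sketch based on the usage shown in the facebookresearch/ToMe repository (the timm model name and the value of r are illustrative):

import timm
import tome

# Load a standard ViT from timm and patch it in place with ToMe.
model = timm.create_model("vit_base_patch16_224", pretrained=True)
tome.patch.timm(model)

# r controls how many tokens are merged in each transformer block;
# larger r means faster inference at some cost in accuracy.
model.r = 16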