ChatLLaMA: ChatGPT-like training process for LLaMA architectures

Image generated with OpenAI DALL·E 2

Meta AI has recently released LLaMA, a collection of foundational large language models ranging from 7 to 65 billion parameters.

LLaMA is creating a lot of excitement because it is smaller than GPT-3 yet performs better. For example, LLaMA's 13B model outperforms GPT-3 (175B parameters) on most benchmarks despite being more than 10 times smaller. This new collection of foundation models opens the door to faster inference and ChatGPT-like real-time assistants that are cost-effective and can run on a single GPU.
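To make the size comparison concrete, here is a rough back-of-the-envelope sketch of the memory needed just to hold each model's weights. It assumes 16-bit weights (2 bytes per parameter) and ignores activations and any other runtime overhead, so real requirements are higher. The helper name `weight_memory_gb` is made up for this illustration.

```python
# Rough weights-only memory estimate.
# Assumption: 2 bytes per parameter (fp16/bf16); activations and other
# runtime overhead are NOT included, so real usage is higher.
BYTES_PER_PARAM_FP16 = 2


def weight_memory_gb(params_billions: float,
                     bytes_per_param: int = BYTES_PER_PARAM_FP16) -> float:
    """Approximate memory in GB (1 GB = 1e9 bytes) needed to store the weights."""
    return params_billions * 1e9 * bytes_per_param / 1e9


if __name__ == "__main__":
    models = [
        ("LLaMA-7B", 7),
        ("LLaMA-13B", 13),
        ("LLaMA-33B", 33),
        ("LLaMA-65B", 65),
        ("GPT-3", 175),
    ]
    for name, size in models:
        print(f"{name}: ~{weight_memory_gb(size):.0f} GB of weights in fp16")
```

Under these assumptions the 13B model needs roughly 26 GB for its weights versus roughly 350 GB for a 175B model, which is why the smaller LLaMA models are plausible candidates for single-GPU inference.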

However, LLaMA was not fine-tuned for instruction-following tasks with a Reinforcement Learning from Human Feedback (RLHF) training process.

The good news is that today Nebuly has introduced ChatLLaMA, the first open-source implementation of RLHF-based training for LLaMA:

  • A complete open source implementation of the training process to build a ChatGPT-style service based on pre-trained LLaMA models.
  • Compared with the original ChatGPT, training and single-GPU inference are much faster and cheaper, thanks to the smaller size of LLaMA architectures.
  • ChatLLaMA has built-in support for DeepSpeed ZeRO to speed up the fine-tuning process.
  • The library also supports all LLaMA model architectures (7B, 13B, 33B, 65B), so that you can fine-tune the model according to your preferences for training time and inference performance.
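For context on the ZeRO bullet above: ZeRO partitions optimizer states, gradients, and (at stage 3) parameters across data-parallel GPUs. The sketch below applies the per-parameter accounting described in the ZeRO paper for mixed-precision Adam (2 bytes of fp16 weights, 2 bytes of fp16 gradients, 12 bytes of fp32 optimizer state) to estimate model-state memory per GPU. It is not taken from the ChatLLaMA code. The function name `zero_model_state_gb` and the 8-GPU setup are assumptions for illustration, and activations and temporary buffers are ignored.

```python
# Per-GPU model-state memory under ZeRO, following the accounting in the
# ZeRO paper for mixed-precision Adam. Illustrative only: activations,
# buffers, and fragmentation are ignored.
def zero_model_state_gb(params_billions: float, num_gpus: int, stage: int) -> float:
    """Approximate model-state memory per GPU in GB (1 GB = 1e9 bytes).

    stage 0: plain data parallelism (nothing partitioned)
    stage 1: optimizer states partitioned
    stage 2: optimizer states + gradients partitioned
    stage 3: optimizer states + gradients + parameters partitioned
    """
    if stage not in (0, 1, 2, 3):
        raise ValueError("stage must be 0, 1, 2, or 3")
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")

    psi = params_billions * 1e9
    params = 2 * psi   # fp16 weights
    grads = 2 * psi    # fp16 gradients
    optim = 12 * psi   # fp32 weights copy + Adam momentum + Adam variance

    if stage >= 1:
        optim /= num_gpus
    if stage >= 2:
        grads /= num_gpus
    if stage >= 3:
        params /= num_gpus

    return (params + grads + optim) / 1e9


if __name__ == "__main__":
    # Hypothetical setup: each supported LLaMA size on 8 GPUs.
    for size in (7, 13, 33, 65):
        per_stage = ", ".join(
            f"stage {s}: {zero_model_state_gb(size, 8, s):.1f} GB" for s in range(4)
        )
        print(f"LLaMA-{size}B -> {per_stage}")
```

Estimates like this can help when choosing among the 7B, 13B, 33B, and 65B models for a given hardware budget, since higher ZeRO stages trade extra communication for lower per-GPU memory.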

⚠️ Please note that this code provides the algorithmic implementation of the RLHF training process for LLaMA and does not include the model weights. To access the model weights, you need to apply through Meta's form.

Call for open-source contributions

Nebuly has open-sourced the complete code to replicate the ChatLLaMA implementation, opening up the possibility for every user to fine-tune their own personalized ChatLLaMA assistants. The library can be further extended with the following additions:

  • Checkpoints with fine-tuned weights.
  • Optimization techniques for faster inference.
  • Support for packaging the model into an efficient deployment framework.

All developers are invited to join Nebuly's efforts toward more efficient and open ChatGPT-like assistants.

You can participate in the following ways:

GitHub Repo

https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/chatllama
