Image generated with OpenAI DALL·E 2
Meta AI has recently released LLaMA, a collection of foundation large language models ranging from 7 billion to 65 billion parameters.
LLaMA is creating a lot of excitement because it is smaller than GPT-3 but offers better performance. For example, the 13B LLaMA model outperforms GPT-3 despite being 10 times smaller. This new collection of foundation models opens the door to faster inference and ChatGPT-like real-time assistants, while being cost-effective and able to run on a single GPU.
However, LLaMA was not fine-tuned for instruction-following tasks with a Reinforcement Learning from Human Feedback (RLHF) training process.
The good news is that today Nebuly has introduced ChatLLaMA, the first open-source implementation of LLaMA based on RLHF:
⚠️ Please note that this code represents the algorithmic implementation of the RLHF training process for LLaMA and does not contain the model weights. To access the model weights, you need to apply through Meta's request form.
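As a rough picture of what such an RLHF training process involves, here is a toy, library-agnostic sketch of the three standard stages: supervised fine-tuning on demonstrations, reward modeling from human preferences, and policy optimization against the learned reward. All names below (`ToyAssistant`, `train_reward_model`, `rlhf_step`) are hypothetical placeholders for illustration, not ChatLLaMA's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class ToyAssistant:
    # Stand-in for a language model policy: maps prompts to responses.
    policy: dict = field(default_factory=dict)

    def supervised_finetune(self, demos: dict) -> None:
        # Stage 1: imitate human demonstrations (prompt -> response pairs).
        self.policy.update(demos)

    def respond(self, prompt: str) -> str:
        return self.policy.get(prompt, "")

def train_reward_model(comparisons):
    # Stage 2: learn a scalar score from human preference pairs
    # (preferred response, rejected response). Here: a simple vote count.
    scores = {}
    for preferred, rejected in comparisons:
        scores[preferred] = scores.get(preferred, 0) + 1
        scores[rejected] = scores.get(rejected, 0) - 1
    return lambda response: scores.get(response, 0)

def rlhf_step(assistant, prompt, candidates, reward) -> None:
    # Stage 3: push the policy toward the highest-reward response.
    best = max(candidates, key=reward)
    assistant.policy[prompt] = best

assistant = ToyAssistant()
assistant.supervised_finetune({"hi": "hello!"})
reward = train_reward_model([("Paris.", "I don't know.")])
rlhf_step(assistant, "capital of France?", ["Paris.", "I don't know."], reward)
print(assistant.respond("capital of France?"))  # -> Paris.
```

In a real implementation each stage trains a large neural network (the policy, the reward model, and a PPO-style optimization loop), but the division of labor between the three stages is the same.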
Nebuly has open-sourced the complete code to replicate the ChatLLaMA implementation, opening up the possibility for every user to fine-tune their own personalized ChatLLaMA assistants. The library is also designed to be extended with further additions.
All developers are invited to join Nebuly's efforts toward more efficient and open ChatGPT-like assistants, and contributions of all kinds are welcome.
Automatically apply SOTA optimization techniques to achieve the maximum inference speed-up on your hardware.
RLHF is a method that uses human feedback to optimize a language model by aligning it with complex human values, and it has been successfully applied in ChatGPT to improve its performance.
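The core of the reward-modeling step in RLHF is a pairwise preference loss: given a human judgment that one response is better than another, the reward model is trained to score the preferred response higher. A minimal sketch of that Bradley-Terry-style loss, assuming the model already produces scalar scores for each response:

```python
import math

def reward_model_pairwise_loss(score_chosen: float, score_rejected: float) -> float:
    """Pairwise preference loss used to train an RLHF reward model:
    minimize -log(sigmoid(r_chosen - r_rejected)), which pushes the model
    to assign a higher reward to the human-preferred response."""
    diff = score_chosen - score_rejected
    # Numerically stable form: -log(sigmoid(diff)) == log(1 + exp(-diff)).
    return math.log1p(math.exp(-diff))

# The loss shrinks when the reward model ranks the preferred answer higher,
# and grows when the preference is violated.
print(reward_model_pairwise_loss(2.0, 0.5))  # small loss: preference respected
print(reward_model_pairwise_loss(0.5, 2.0))  # large loss: preference violated
```

During the subsequent reinforcement-learning phase, the policy is then optimized (typically with PPO) to maximize the scores this reward model assigns to its generations.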