🎥 Trelis Livestream
Join the live Q&A Thursday at 5 pm Irish time on YouTube and on X. Drop any questions in the YouTube live chat in advance.
🔍 This week’s video - ReFT
It's possible to fine-tune language models by updating only a tiny number of parameters. This can be done with LoRA, or by fine-tuning activations via ReFT (Representation Fine-Tuning).
➡️ Two examples covered:
1. Guardrailing a model so it refuses to give financial advice
2. Causing a model to always respond in French, regardless of the question asked
➡️ Benefits of few parameter fine-tuning:
* Allows you to fine-tune quickly
* Enables you to create different behaviors that can be snapped together at inference time
* May result in more robust performance than doing a broader fine-tune on a larger number of parameters
➡️ How it works:
* Target a specific layer (usually a middle layer) within the model
* Use a very low rank (e.g. rank 4) for the adapters
* Apply the intervention at the last input position of the prompt
* Train to overfit on a small number of examples
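The steps above can be sketched as a low-rank edit to one layer's hidden state. This is a minimal numpy illustration of a LoReFT-style intervention, h + Rᵀ(Wh + b − Rh), applied at a single position; the variable names, sizes, and initialisation here are illustrative assumptions, not code from the video:

```python
import numpy as np

def loreft_intervention(h, R, W, b):
    """LoReFT-style edit of a hidden state: h + R^T (W h + b - R h).

    h: hidden state at the intervened position, shape (d,)
    R: low-rank projection with orthonormal rows, shape (r, d)
    W, b: learned linear map, shapes (r, d) and (r,)
    Only R, W, b are trained; the base model stays frozen.
    """
    return h + R.T @ (W @ h + b - R @ h)

rng = np.random.default_rng(0)
d, r = 64, 4  # hidden size and rank (rank 4, as in the recipe above)
h = rng.standard_normal(d)

# Orthonormal rows for R via a QR decomposition (illustrative init)
R = np.linalg.qr(rng.standard_normal((d, r)))[0].T
W = rng.standard_normal((r, d)) * 0.01
b = np.zeros(r)

h_edited = loreft_intervention(h, R, W, b)
print(h_edited.shape)  # (64,)
```

Because the edit is Rᵀ(…), the change to the hidden state lives entirely in the rank-4 subspace spanned by R's rows, which is why so few parameters suffice.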
➡️ Comparing LoRA and representation fine-tuning (ReFT):
* Both methods can achieve good performance with a small number of trainable parameters
* ReFT allows targeting activations rather than weights, but the two approaches may end up being somewhat equivalent
* LoRA has more mature library support (PEFT), while ReFT is a newer technique
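A quick back-of-the-envelope count shows why the two approaches can end up in similar territory: for one target layer, both train on the order of 2·d·r parameters. The hidden size and rank below are illustrative assumptions:

```python
# Trainable-parameter counts for one target layer,
# assuming hidden size d and rank r (illustrative numbers).
d, r = 4096, 4

# LoRA: W + B @ A, with A of shape (r, d) and B of shape (d, r)
lora_params = d * r + r * d

# LoReFT: R of shape (r, d), W of shape (r, d), b of shape (r,)
reft_params = r * d + r * d + r

print(lora_params, reft_params)  # → 32768 32772
```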
And that’s it for this week, cheers, Ronan
Ronan McGovern, Trelis Research