Ten Fine-tuning Tips
From my latest YouTube video:
1️⃣ Start with a small model like Llama 3 8B. This allows for faster experimentation.
2️⃣ Use LoRA or QLoRA instead of full fine-tuning at first. It's faster and can work better for small datasets.
3️⃣ Create 10 manual test questions to evaluate performance and choose the best base model.
4️⃣ Manually curate your training dataset, at least when getting started. It's more work but gives you a better understanding of what's needed.
5️⃣ Start training on a small number of rows, even just 100. Scale up gradually.
6️⃣ Always use a validation set, even if you have to split it off from your training data.
7️⃣ Try to start training on just one GPU for faster iteration. Scale up to multi-GPU later.
8️⃣ Use Weights & Biases to track training losses and rewards.
9️⃣ Consider unsupervised fine-tuning if you have a very large dataset (10k+ rows).
🔟 Try preference fine-tuning with ORPO (Odds Ratio Preference Optimization) to optimize for preferred outputs.
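As a concrete starting point for tip 2️⃣, here is a minimal QLoRA configuration sketch using the Hugging Face transformers and peft libraries. The rank, alpha, and target modules shown are common starting values, not prescriptions, and the model name is just one example of a small base model:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit quantisation of the frozen base model (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",  # tip 1: start with a small model
    quantization_config=bnb_config,
    device_map="auto",
)

# Low-rank adapters trained on top of the frozen base (the "LoRA" part)
lora_config = LoraConfig(
    r=16,            # adapter rank; a common starting value
    lora_alpha=32,   # scaling factor, often set to 2x the rank
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights train
```

Because only the small adapter matrices are trained, this typically fits on a single GPU (tip 7) and is much faster to iterate on than full fine-tuning.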
These tips apply to language models but also work for fine-tuning video, speech, and multimodal models.
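Tips 5️⃣ and 6️⃣ above can be sketched in plain Python: deterministically shuffle the data, take a small starter subset, and hold out a validation slice. The function name and split sizes here are illustrative:

```python
import random

def make_splits(rows, n_start=100, val_frac=0.1, seed=42):
    """Take a small starter subset (tip 5) and split off
    a validation set from it (tip 6), reproducibly."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # seeded shuffle for reproducibility
    subset = rows[:n_start]            # start small, scale up gradually
    n_val = max(1, int(len(subset) * val_frac))
    val_set = subset[:n_val]           # held-out validation rows
    train_set = subset[n_val:]         # remaining rows for training
    return train_set, val_set

train, val = make_splits(range(1000))
print(len(train), len(val))  # 90 10
```

The same seed always yields the same split, so validation loss stays comparable as you scale the training set up.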
Announcing Trelis Internships
Trelis internships are for talented developers seeking research and commercialisation experience. They serve as a potential pathway to commercialising one’s own projects OR collaborating with Trelis on future projects.
Internships are project-based, not time-based.
There is a single payment of $500 for on-time completion of an internship project.
Projects will form the basis for Trelis YouTube videos and products. Contributors will be named/recognised on any videos/products.
That’s it for this week, cheers, Ronan
➡️ Trelis Resources + Support: Trelis.com/About