Boosting LLM Performance with Test-Time Compute
Part 1 in a series covering some of the techniques potentially underlying OpenAI's latest o1 model.
In this video, I cover:
Why generating more words per answer leads to finding better answers.
How to use temperature and other sampling techniques (top p, top k, min p, beam search); a short code sketch follows this list.
How to use chain of thought.
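
As a rough illustration (not the exact code from the video), here's how those sampling settings map onto the Hugging Face transformers generate API. The model name and prompt are placeholders; swap in whatever model you're testing.

```python
# Sketch of the sampling knobs mentioned above, via transformers' generate API.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-0.5B-Instruct"  # placeholder: any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("What is 17 * 24?", return_tensors="pt")

# Stochastic sampling: temperature rescales the logits; top_k, top_p and min_p
# truncate the tail of the distribution before a token is sampled.
sampled = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.9,
    min_p=0.05,  # min_p needs a recent transformers release
    max_new_tokens=100,
)

# Beam search: deterministic, keeps the num_beams highest-scoring partial sequences.
beamed = model.generate(
    **inputs,
    do_sample=False,
    num_beams=4,
    max_new_tokens=100,
)

print(tokenizer.decode(sampled[0], skip_special_tokens=True))
print(tokenizer.decode(beamed[0], skip_special_tokens=True))
```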
I also demo how chain of thought leads to:
a) large improvements on maths tests (grade school math, GSM8K); a rough prompt sketch follows below.
b) meaningful improvements on general knowledge (HotpotQA).
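
And here's a minimal sketch of the zero-shot chain-of-thought pattern used in the demos. The model name, question, and exact prompt wording are illustrative, not the ones from the video; the only difference between the two prompts is the instruction to reason before answering.

```python
# Direct answer vs. zero-shot chain of thought on a GSM8K-style question.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-0.5B-Instruct"  # placeholder: any instruct-tuned causal LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

question = "A shop sells pencils at 3 for 2 euro. How much do 12 pencils cost?"

# For a chat model you'd normally apply the chat template; omitted here for brevity.
prompts = {
    "direct": question + "\nGive only the final answer.",
    "chain of thought": question + "\nLet's think step by step, then give the final answer.",
}

for name, prompt in prompts.items():
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(
        **inputs,
        do_sample=True,
        temperature=0.7,
        max_new_tokens=256,
    )
    # Strip the prompt tokens so only the generated continuation is printed.
    completion = tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    print(f"--- {name} ---\n{completion}\n")
```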
I then show how you can use these techniques to get better responses from LLMs.
Cheers, Ronan
More Resources at Trelis.com/About