Boosting LLM Performance with Test-Time Compute
Part 1 in a series covering some of the techniques potentially underlying OpenAI's latest o1 model.
In this video, I cover:
Why generating more words per answer leads to finding better answers.
How to use temperature and other sampling techniques (top p, top k, min p, beam search); a short code sketch follows this list.
How to use chain of thought.
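
As a rough illustration (not the exact code from the video), here's how those sampling settings map onto the Hugging Face transformers generate API. The model name and prompt are placeholders; swap in whatever model you're testing.

```python
# Sketch of the sampling knobs mentioned above, via transformers' generate API.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-0.5B-Instruct"  # placeholder: any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("What is 17 * 24?", return_tensors="pt")

# Stochastic sampling: temperature rescales the logits; top_k, top_p and min_p
# truncate the tail of the distribution before a token is sampled.
sampled = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.9,
    min_p=0.05,  # min_p needs a recent transformers release
    max_new_tokens=100,
)

# Beam search: deterministic, keeps the num_beams highest-scoring partial sequences.
beamed = model.generate(
    **inputs,
    do_sample=False,
    num_beams=4,
    max_new_tokens=100,
)

print(tokenizer.decode(sampled[0], skip_special_tokens=True))
print(tokenizer.decode(beamed[0], skip_special_tokens=True))
```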
I also demo how chain of thought leads to:
a) large improvements on maths tests (grade school math, GSM8K); a rough prompt sketch follows below.
b) meaningful improvements on general knowledge (HotpotQA).
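
And here's a minimal sketch of the zero-shot chain-of-thought pattern used in the demos. The model name, question, and exact prompt wording are illustrative, not the ones from the video; the only difference between the two prompts is the instruction to reason before answering.

```python
# Direct answer vs. zero-shot chain of thought on a GSM8K-style question.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-0.5B-Instruct"  # placeholder: any instruct-tuned causal LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

question = "A shop sells pencils at 3 for 2 euro. How much do 12 pencils cost?"

# For a chat model you'd normally apply the chat template; omitted here for brevity.
prompts = {
    "direct": question + "\nGive only the final answer.",
    "chain of thought": question + "\nLet's think step by step, then give the final answer.",
}

for name, prompt in prompts.items():
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(
        **inputs,
        do_sample=True,
        temperature=0.7,
        max_new_tokens=256,
    )
    # Strip the prompt tokens so only the generated continuation is printed.
    completion = tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    print(f"--- {name} ---\n{completion}\n")
```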
I then show how you can use these techniques to get better responses from LLMs.
Cheers, Ronan
More Resources at Trelis.com/About