Covid-Plannedemic_@alien.topB to

LocalLLaMAEnglish · 3 years ago

Training on the rephrased test set is all you need: 13B models can reach GPT-4 performance in benchmarks with no contamination detectable by traditional methods

Catch me if you can! How to beat GPT-4 with a 13B model | LMSYS Org

Announcing Llama-rephraser: 13B models reaching GPT-4 performance in major benchmarks (MMLU/GSM-8K/HumanEval)! To ensure result validity, we followed Open...

LosingID_583@alien.topB
3 years ago
Benchmark test questions can’t be made public. It’s too easy to cheat.