Training on the rephrased test set is all you need: 13B models can reach GPT-4 performance in benchmarks with no contamination detectable by traditional methods (lmsys.org)
Covid-Plannedemic_@alien.top to LocalLLaMA · 2 years ago
ambient_temp_xeno@alien.top · 2 years ago
To be fair, it’s pretty clear that OpenAI updates its models with every kind of test people throw at them as well.