Training on the rephrased test set is all you need: 13B models can reach GPT-4 performance in benchmarks with no contamination detectable by traditional methods (lmsys.org)
Covid-Plannedemic_@alien.top to LocalLLaMA · English · 1 year ago
ambient_temp_xeno@alien.top · English · 1 year ago
To be fair, it’s pretty clear that OpenAI update their models with every kind of test people throw at them as well.