Is the Open LLM Leaderboard a reliable source? yi:34B is at the top, but I get better results with the neural-chat:7B model
grigio@alien.top to LocalLLaMA · English · 1 year ago · 23 comments
ThisGonBHard@alien.top · English · 1 year ago
While the benchmarks tend to be gamed, especially by small models, I honestly think something is wrong with how you run it.
Yi-34B trades blows with Llama 2 70B in my personal tests, where I make it do novel tasks I invented myself, not the gamed benchmarks.
ALL 7B models are like a 7-year-old up against a renowned professor when they are compared to 34B and 70B models.