Given that we now have models like Qwen-14B that seem quite good for their size, I got curious about Chinese LLM research and went looking for relevant sites. Here's what I found:

Leaderboards:

https://opencompass.org.cn/leaderboard-llm

https://cevalbenchmark.com/static/leaderboard.html

This one looks like a Chinese Hugging Face equivalent?

https://modelscope.cn/home

I was randomly checking one of the top models on C-Eval that didn’t seem to have any discussion here or any English information about it, https://huggingface.co/Duxiaoman-DI/XuanYuan-70B , which, according to Google Translate, is a Llama-2-70B model trained on Chinese financial data with an 8192-token context. I converted it to GGUF Q6_K and tested it, and yeah, it at least doesn’t obviously suck. Feeding it a ~7000-token prompt without any RoPE tricks still produces coherent text, and the model speaks English just fine.
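In case anyone wants to reproduce the long-context check, here's roughly what I ran with llama-cpp-python after converting the HF weights to GGUF with llama.cpp's convert script and quantizing to Q6_K. The file names are just placeholders for wherever you put the quant and your test document, so treat it as a sketch rather than exact commands:

```python
# Rough long-context sanity check with llama-cpp-python.
# Paths/filenames below are placeholders, not the actual files I used.
from llama_cpp import Llama

llm = Llama(
    model_path="xuanyuan-70b.Q6_K.gguf",  # your Q6_K quant
    n_ctx=8192,        # model's native context length, no rope scaling
    n_gpu_layers=-1,   # offload everything if you have the VRAM
)

# Any long document that comes out to roughly 7000 tokens.
with open("long_test_document.txt", encoding="utf-8") as f:
    prompt = f.read()

# Confirm the prompt is actually near 7000 tokens before generating.
print("prompt tokens:", len(llm.tokenize(prompt.encode("utf-8"))))

out = llm(prompt, max_tokens=256, temperature=0.7)
print(out["choices"][0]["text"])
```

If the continuation stays on-topic and grammatical at that prompt length, the 8192 context claim at least passes a smoke test.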

My questions for this subreddit are:

Do you follow developments from China specifically? What sites, people, or places do you follow, and can you share them? Any other insights?

  • CheatCodesOfLife@alien.topB · 1 year ago

    Yeah, I’ve been excited for the 34B AquilaChat. I saw TheBloke released GPTQ and GGUF versions of it earlier this week, but it doesn’t work when I try to use it. Same as Falcon-180B: an error about a ’ character when I try to load it in text-generation-webui :(