Based on the 200K Context Yi 34B.
If it is based on Yi, should it not have the Yi license instead of MIT?
Yes.
But it's ML land! Everyone violates licenses anyway :P
Can’t wait to see the benchmarks on these things.
Dang, after that 34b drought it’s like suddenly stumbling onto the great lakes right now.
I believe these are TheBloke’s GGUF quants if anyone’s interested: https://huggingface.co/TheBloke/Nous-Capybara-34B-GGUF
Also note this important issue that affects this and all other Yi-based models:
So we can just skip BOS token on all these models?
I ran gguf-py/scripts/gguf-set-metadata.py some-yi-model.gguf tokenizer.ggml.bos_token_id 144
and it changed the outputs a lot from yesterday.
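For anyone curious what that script is doing under the hood: it locates the tokenizer.ggml.bos_token_id entry in the GGUF metadata and overwrites the stored value in place. Here's a toy sketch of the idea, not the real GGUF format or the actual gguf-py code, just patching a little-endian u32 that sits right after a key in a fake metadata blob:

```python
import os
import struct
import tempfile

def patch_u32(path, key, new_value):
    """Find `key` in the file and overwrite the u32 stored right after it.
    Toy stand-in for gguf-set-metadata.py: real GGUF has a typed
    key-value header that this deliberately does not parse."""
    with open(path, "r+b") as f:
        data = f.read()
        pos = data.find(key.encode())
        if pos < 0:
            raise KeyError(key)
        f.seek(pos + len(key))
        f.write(struct.pack("<I", new_value))

# Build a fake file: the key followed by the old value 1.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"tokenizer.ggml.bos_token_id" + struct.pack("<I", 1))

patch_u32(path, "tokenizer.ggml.bos_token_id", 144)

# Read the value back to confirm the patch took.
with open(path, "rb") as f:
    f.seek(len("tokenizer.ggml.bos_token_id"))
    print(struct.unpack("<I", f.read(4))[0])  # 144
os.remove(path)
```

The real script does the equivalent through gguf-py's reader, which knows the actual field offsets and types, so prefer it over anything hand-rolled.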
200K context!!
Precisely 47K fits in 24GB at 4bpw.
I have not tried 3.5bpw, but I think it could fit much more.
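A rough back-of-envelope check on why ~47K fits in 24GB, assuming Yi-34B's published config (60 layers, 8 KV heads via grouped-query attention, head dim 128, ~34.4B params) and an 8-bit (1 byte per element) KV cache; these are approximations, not measurements:

```python
# Rough VRAM estimate for Yi-34B at 4 bits per weight with an
# 8-bit KV cache. All figures are approximate.
GB = 1024**3

params = 34.4e9   # approximate parameter count
layers = 60       # Yi-34B hidden layers
kv_heads = 8      # grouped-query attention KV heads
head_dim = 128

weights = params * 4 / 8                          # bytes at 4 bpw
kv_per_token = 2 * layers * kv_heads * head_dim   # K and V, 1 byte each
cache_47k = 47_000 * kv_per_token

print(f"weights  ~{weights / GB:.1f} GiB")
print(f"KV cache ~{cache_47k / GB:.1f} GiB")
print(f"total    ~{(weights + cache_47k) / GB:.1f} GiB")  # under 24 GB
```

The remaining couple of GiB go to activations and allocator overhead, which is consistent with 47K being roughly the ceiling at 4bpw.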