I’m blown away. See for yourself.
https://migel.substack.com/p/a-conversation-with-tess
Tess, welcome to the world!
Model is Open Source with 200K context length.
Available at: https://huggingface.co/migtissera/Tess-M-v1.0
What makes this any different than the “base” Yi-34B-200k model?
Where can we see a description of what the model was fine-tuned on (datasets used, LoRAs used, etc.) and/or your methods for doing so? I’m not finding any of this information in the model card or the Substack link.
I’m not sure why he’s being so vague about this model. He said it’s fine-tuned to be better at instruction following, I think.