I’m blown away. See for yourself.
https://migel.substack.com/p/a-conversation-with-tess
Tess, welcome to the world!
Model is Open Source with 200K context length.
Available at: https://huggingface.co/migtissera/Tess-M-v1.0
What makes this any different than the “base” Yi-34B-200k model?
Where can we see a description of what the model was fine-tuned on (datasets used, LoRAs used, etc.) and/or your methods for doing so? I’m not finding any of this information in the model card or the Substack link.
I’m not sure why he’s being so vague about this model. He said it’s fine-tuned to be better at instruction following, I think.