Cake day: October 29th, 2023


  • It depends on the use case. Each model may have its own strengths. I picked XWin and Airoboros as baseline 70B models for second-language conversational testing, and XWin outperformed (in human-evaluated testing with a native speaker) a 70B model that had been pre-trained on an additional 100B tokens of that second language. Shocking, to say the least.