Are there any super tiny LLMs that we can ship within a mobile application?

Prashant_4200@alien.top · 2 years ago

Are there any super tiny LLMs that we can ship within a mobile application?

woadwarrior@alien.top · 2 years ago

I have an app on the App Store that does that. It ships with a 4-bit quantised 3B parameter LLM baked in (the app is a 1.67GB download), and users on newer phones (iPhone 14 Pro and Pro Max, iPhone 15 Pro and Pro Max) can optionally download a 3-bit quantised 7B parameter LLM.
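For anyone sizing their own app, a rough back-of-the-envelope check of those numbers: quantised weights take roughly parameters × bits per weight ÷ 8 bytes. The sketch below is only that arithmetic, written in Python; the function name is made up for illustration, and real quantised files come out somewhat larger because of per-block scales and tensors kept at higher precision.

```python
def quantised_weight_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of quantised weights in decimal gigabytes.

    Hypothetical helper for illustration: it ignores quantisation
    metadata (scales, zero points) and any layers kept at higher precision.
    """
    bytes_total = n_params * bits_per_weight / 8  # 8 bits per byte
    return bytes_total / 1e9


# The two configurations mentioned in the post above.
for label, n_params, bits in [
    ("3B params at 4-bit", 3e9, 4),
    ("7B params at 3-bit", 7e9, 3),
]:
    size = quantised_weight_size_gb(n_params, bits)
    print(f"{label}: roughly {size:.2f} GB of weights")
```

That gives about 1.5 GB for the 3B model, which lines up with a 1.67GB download once the app itself and quantisation overhead are included, and about 2.6 GB for the optional 7B model.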

amusiccale@alien.top · 2 years ago

Hey, I actually just tried this on my iPhone SE 2nd gen to see if it would run the 3B, even slowly, but it says it’s not compatible. Any suggestions?

kotschi1997@alien.top · 2 years ago

Check out the TinyLlama project! 1.1B parameters, pretty solid performance for its size, and the currently available checkpoints are only about halfway through the complete pre-training process.

https://github.com/jzhang38/TinyLlama

SlowSmarts@alien.top · 2 years ago

TinyLlama 1.1B may have potential: see the TinyLlama 1.1B project.

TheBloke has already made a GGUF of the v0.3 chat model.

Looking on HuggingFace, there may be more variants that have been fine-tuned for instruct, etc.

erelim@alien.top · 2 years ago

A 1-2GB app that eats RAM and battery like a mobile game will be very unfriendly for the user, and you still won’t get good or quick results. Check Replicate, RunPod or vast.ai for cheap GPUs and run the model server-side instead.

Flying_Madlad@alien.top · 2 years ago

Lol, sounds rough. 3B is better than no B. And that should mean I can have several models up at once.

techmavengeospatial@alien.top · 2 years ago

https://llm.mlc.ai/