Following the release of Dimensity 9300 and Snapdragon 8 Gen 3 (S8G3) phones, I expect LLMs running on mobile phones to grow in popularity, since quantized 3B or 7B models can already run on high-end phones released in the last five years. But even though it is possible, there are a few concerns, including power consumption and storage size. I've seen posts about successfully running LLMs on mobile devices, but I seldom see people discussing future trends. What are your thoughts?
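For a rough sense of the storage-size concern, weight footprint is roughly parameter count times bits per weight. A minimal back-of-envelope sketch; the 10% overhead figure for embeddings, KV cache, and runtime buffers is my own assumption, not a measured number:

```python
# Back-of-envelope storage/RAM footprint: params * bits_per_weight / 8 bytes.
# The 10% overhead is an assumed allowance for embeddings, KV cache, and
# runtime buffers, not a measurement.
def model_size_gb(params_billions: float, bits: int, overhead: float = 0.10) -> float:
    weight_bytes = params_billions * 1e9 * bits / 8
    return weight_bytes * (1 + overhead) / 1e9

for params in (3, 7):
    for bits in (2, 4, 8):
        print(f"{params}B @ {bits}-bit: ~{model_size_gb(params, bits):.1f} GB")
```

So a 4-bit 7B model lands around 3.5-4 GB, which is why it fits on recent flagships but squeezes out storage and RAM on cheaper hardware.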
Just my thoughts on this:
- It would be great.
- It would be rather limited but possible (thanks to https://llm.mlc.ai/ and increasing memory).
- A lot of CHEAP Chinese devices will claim they can actually do it. And they will, at 2-bit quantization and <1 t/s, with 7B models or even smaller. They will be unusable (see the throughput sketch after this list).
- Google says it's not necessary because you can use their Firebase services for AI, and you can use NNAPI anyway. You will also have to censor your LLM-using apps to adhere to Play Store rules.
- Apple says it's not necessary; later they will advertise it as a very good thing and provide optimized libraries and some pretrained models, but you'll need to buy the latest iPhone (last year's won't work, because Apple). You will also have to censor your apps AND mark them as 18+.
Areas of usage?
- Language translation (including voice-to-voice). Basically a much-improved Google Translate.
- AI assistant (basically a MUCH-improved Siri, used not only as a command interface).