I see there is progress being made on smaller LLMs with fewer parameters, but as I understand it, they are just trying to optimize how much information can fit in a given parameter count. Is there work being done on LLMs that are trained on less information? For example, say I want to chat with a PDF: I don’t need my LLM to speak French, write Python, or know that Benjamin Franklin wrote a paper on flatulence (all things RWKV v5 World 1.5B knows).
IBM with Granite?