This is basically one of the main reasons to use LoRAs. Someone posted this in the machine learning Reddit about 2 months ago, but the idea is still a solid one. Train a intermediate model to determine which expert or LoRa to use, then use that LoRA for that task. It’s better than mix of experts because you get much better control over which expert or LoRa will receive the request.
A mixture of experts is a type of technique that is what gpt4 is rumored to be using.
From Wikipedia :
Mixture of experts (MoE) is a machine learning technique where multiple expert networks (learners) are used to divide a problem space into homogeneous regions. It differs from ensemble techniques in that typically only one or a few expert models will be run, rather than combining results from all models.
In basic terms there are networks that get really good at one type of thing, and compete to provide input when a question comes in during inference. There may be a network really good at science, or math, or literature, which has a better understanding of a field or subject than the other “experts”. So it provides the response instead of the others.