FormerIYI@alien.top to LocalLLaMA · Why didn't gpt4 work at first and how did they "fix it"? · 1 year ago
Maybe papers from Pangu-Sigma or other large-scale MoE models can be helpful: https://arxiv.org/abs/2303.10845