DataLearnerAI@alien.top to LocalLLaMA • Why is no one releasing 70b models? · 1 year ago
Alibaba open-sourced a 72B model called Qwen-72B: Qwen/Qwen-72B · Hugging Face
It supports Chinese and English. The performance on MMLU is remarkable.
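For reference, a minimal sketch of loading that checkpoint with the standard Hugging Face transformers API. This is not taken from the Qwen docs; the repo id comes from the link above, and a 72B model in fp16 needs several high-memory GPUs or quantization, which is not shown here.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen-72B"  # repo id from the link above

# The original Qwen checkpoints ship custom modeling code, hence trust_remote_code.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",        # spread the 72B weights across available GPUs
    trust_remote_code=True,
).eval()

inputs = tokenizer("Introduce yourself in English and Chinese.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```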
DataLearnerAI@alien.top to LocalLLaMA • English · 1 year ago
Are there any other tools like vLLM or TensorRT that can be used to speed up LLM inference?
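For context, a minimal sketch of the kind of tool the post is asking about: vLLM's offline API, which speeds up inference mainly through PagedAttention and continuous batching. The model id is an arbitrary small placeholder, not a recommendation.

```python
from vllm import LLM, SamplingParams

# Placeholder small model so the example runs on a single modest GPU.
llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.8, max_tokens=64)

# generate() batches the prompts internally; each result carries the sampled text.
outputs = llm.generate(["What tools can speed up LLM inference?"], params)
print(outputs[0].outputs[0].text)
```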
DataLearnerAI@alien.top to LocalLLaMA • Yi-34B vs Yi-34B-200K on sequences <32K and <4K · 1 year ago
In most scenarios, models with extended context are optimized for long sequences. If the sequence is not very long, it is often recommended to use the regular model.
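As an illustration of that recommendation, a hypothetical routing helper that tokenizes the prompt and only falls back to the 200K variant for genuinely long inputs. The repo ids and the 4K cutoff are assumptions taken from the thread title, not measured guidance.

```python
from transformers import AutoTokenizer

# Hypothetical helper: route short prompts to the regular model and reserve
# the long-context variant for inputs that exceed the base context window.
BASE_MODEL = "01-ai/Yi-34B"           # assumed repo id for the regular model
LONG_CTX_MODEL = "01-ai/Yi-34B-200K"  # assumed repo id for the 200K variant

def pick_model(prompt: str, cutoff_tokens: int = 4096) -> str:
    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    n_tokens = len(tokenizer.encode(prompt))
    return BASE_MODEL if n_tokens <= cutoff_tokens else LONG_CTX_MODEL

print(pick_model("A short question fits comfortably in the regular model."))
```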