davidmezzetti@alien.top to LocalLLaMA · 2 years ago
RAG in a couple lines of code with txtai-wikipedia embeddings database + Mistral
DaniyarQQQ@alien.top · 2 years ago
Looks like it can work with AWQ models. Can it work with GPTQ (Exllama2) and GGUF models?

davidmezzetti@alien.top (OP) · 2 years ago
It works with GPTQ models as well; you just need to install AutoGPTQ. For GGUF models, you would need to replace the LLM pipeline with llama.cpp. See this page for more: https://huggingface.co/docs/transformers/main_classes/quantization
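The GPTQ route can be sketched as below. This is a minimal sketch, not the author's exact code: the GPTQ model id and the prompt template are assumptions, while the `neuml/txtai-wikipedia` index and the `Embeddings`/`LLM` classes come from the txtai project referenced in the post.

```python
# Sketch of the RAG flow from the post, adapted for a GPTQ model.
# Model id and prompt template are illustrative assumptions;
# assumes: pip install txtai auto-gptq

def build_prompt(question, context):
    """Assemble a simple RAG prompt from retrieved context."""
    return (
        "Answer the following question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

def rag_answer(question):
    """Retrieve context from the txtai-wikipedia index, then generate."""
    from txtai import Embeddings, LLM  # deferred import: heavy, optional deps

    # Load the prebuilt Wikipedia embeddings index from the Hugging Face Hub
    embeddings = Embeddings()
    embeddings.load(provider="huggingface-hub", container="neuml/txtai-wikipedia")

    # With AutoGPTQ installed, a GPTQ checkpoint loads through the same
    # transformers-backed LLM pipeline (this model id is an assumption)
    llm = LLM("TheBloke/Mistral-7B-OpenOrca-GPTQ")

    # Build context from the top search hits and generate an answer
    context = "\n".join(x["text"] for x in embeddings.search(question, 3))
    return llm(build_prompt(question, context))
```

The heavy imports are deferred inside the function so the prompt-building helper can be used without txtai installed.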
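For the GGUF case, the generation step can be swapped for llama.cpp via the llama-cpp-python bindings. A minimal sketch, where the model path and generation parameters are placeholders:

```python
# Sketch: replacing the transformers LLM pipeline with llama.cpp for GGUF.
# The model path is a placeholder; assumes: pip install llama-cpp-python

def extract_text(result):
    """Pull the generated text out of a llama.cpp completion result."""
    return result["choices"][0]["text"]

def generate_gguf(prompt, model_path="mistral-7b-instruct.Q4_K_M.gguf"):
    """Run a local GGUF model with llama.cpp instead of transformers."""
    from llama_cpp import Llama  # deferred import: optional dependency

    llm = Llama(model_path=model_path, n_ctx=2048, verbose=False)
    result = llm(prompt, max_tokens=256)
    return extract_text(result)
```

The retrieval side stays the same; only the `LLM` pipeline call is replaced by `generate_gguf`.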