davidmezzetti@alien.topB to LocalLLaMA · English · 2 years ago

RAG in a couple lines of code with txtai-wikipedia embeddings database + Mistral

  • QuantumDrone@alien.topB · 2 years ago

    Instructions unclear; my chat is now full of spiders.

  • davidmezzetti@alien.topOPB · 2 years ago

    This code uses txtai, the txtai-wikipedia embeddings database and Mistral-7B-OpenOrca-AWQ to build a RAG pipeline in a couple lines of code.
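Sketched out, the pipeline described here comes down to three steps: load the prebuilt Wikipedia index, retrieve context for the question, and prompt the LLM. A minimal sketch follows — the heavy model and index loading is left commented so it runs anywhere; the `neuml/txtai-wikipedia` and `TheBloke/Mistral-7B-OpenOrca-AWQ` names are from the post, while the prompt wording is an assumption of mine:

```python
def build_prompt(question, context):
    """Assemble a simple RAG prompt: retrieved passages plus the question."""
    return (
        "Answer the following question using only the context below.\n\n"
        f"Question: {question}\n\n"
        f"Context:\n{context}\n"
    )

# Retrieval + generation (commented; needs `pip install txtai autoawq` and a
# GPU — argument names follow txtai's Hugging Face Hub integration):
# from txtai import Embeddings, LLM
# embeddings = Embeddings()
# embeddings.load(provider="huggingface-hub", container="neuml/txtai-wikipedia")
# llm = LLM("TheBloke/Mistral-7B-OpenOrca-AWQ")
#
# question = "Tell me about spiders"
# context = "\n".join(x["text"] for x in embeddings.search(question, 3))
# print(llm(build_prompt(question, context)))
```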

  • SomeOddCodeGuy@alien.topB · 2 years ago

    I was already super interested in txtai, but you are the best for the Wikipedia embeddings link too. I'm definitely playing with this soon.

  • herozorro@alien.topB · 2 years ago

    How can this be used for code generation with a GitHub repo and its documentation?

    • davidmezzetti@alien.topOPB · 2 years ago

      Well, for RAG, the GitHub repo and its documentation would need to be added to the embeddings index. Then you would probably want a code-focused Mistral finetune.

      I've been meaning to write an example notebook that does this for the txtai GitHub repo and documentation. I'll share that back when it's available.

  • toothpastespiders@alien.topB · 2 years ago

    The choice of question in there is particularly insightful. All AI-related tasks should focus on spiders.

  • Ok-Recognition-3177@alien.topB · 2 years ago

    This looks incredibly useful

  • DaniyarQQQ@alien.topB · 2 years ago

    Looks like it can work with AWQ models. Can it work with GPTQ (Exllama2) and GGUF models?

    • davidmezzetti@alien.topOPB · 2 years ago

      It works with GPTQ models as well; you just need to install AutoGPTQ.

      You would need to replace the LLM pipeline with llama.cpp for it to work with GGUF models.

      See this page for more: https://huggingface.co/docs/transformers/main_classes/quantization
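As a rough rule of thumb for what the reply describes, the format of the model reference determines the loader. The helper below is illustrative (it is not a txtai API), and the commented model names follow the common TheBloke naming convention:

```python
def llm_backend(model):
    """Guess which loader a model reference needs: .gguf files go through
    llama.cpp (llama-cpp-python); AWQ/GPTQ checkpoints load through
    transformers, with autoawq or auto-gptq installed respectively."""
    return "llama.cpp" if model.lower().endswith(".gguf") else "transformers"

# e.g. (commented; requires txtai plus the matching backend package):
# from txtai import LLM
# llm = LLM("TheBloke/Mistral-7B-OpenOrca-GPTQ")  # transformers + auto-gptq
# llm = LLM("mistral-7b-openorca.Q4_K_M.gguf")    # llama.cpp
```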

  • BriannaBromell@alien.topB · 2 years ago

    Can this query my docs too?

    • davidmezzetti@alien.topOPB · 2 years ago

      Yes, if you build an embeddings database with your documents. There are a ton of examples available: https://github.com/neuml/txtai

  • Tiny_Arugula_5648@alien.topB · 2 years ago

    txtai is fantastic!!

  • Kinuls9@alien.topB · 2 years ago

    Hi David,

    I’m very impressed by your work, not only the library itself but also the documentation, which is crystal clear and very well illustrated.

    I’m just curious, how do you monetize your work?

    • davidmezzetti@alien.topOPB · 2 years ago

      Thank you, appreciate it.

      I have a company (NeuML) through which I provide paid consulting services.

  • e-nigmaNL@alien.topB · 2 years ago

    I'm trying to wrap my head around this :)

    But will this (conceptually) also work for Atlassian (Jira and Confluence) instead of Wikipedia, so that you can use semantic search across Jira and Confluence?
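Conceptually yes — anything you can turn into (id, text) records can go into the index. A hypothetical sketch for Confluence (the field access follows the Confluence REST API, where `body.storage.value` holds the page HTML; the tag stripping is deliberately naive):

```python
import re

def page_to_document(page):
    """Map a Confluence REST API page (a JSON dict) to an (id, text) pair
    suitable for an embeddings index, stripping the storage-format HTML."""
    html = page["body"]["storage"]["value"]
    text = re.sub(r"<[^>]+>", " ", html)       # drop tags
    text = re.sub(r"\s+", " ", text).strip()   # collapse whitespace
    return page["id"], f"{page['title']}. {text}"

# Indexing and searching then work like any other corpus (commented; needs txtai):
# from txtai import Embeddings
# embeddings = Embeddings(content=True)
# embeddings.index(page_to_document(p) for p in pages)
# embeddings.search("how do we deploy service X?", 5)
```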

LocalLLaMA
Community to discuss Llama, the family of large language models created by Meta AI.
