A bit related. I think all the tools mentioned here are for using an existing UI.
But what if you want to easily roll your own, preferably in Python? I know of some options:
Gradio https://www.gradio.app/guides/creating-a-custom-chatbot-with-blocks
Panel https://www.anaconda.com/blog/how-to-build-your-own-panel-ai-chatbots
Reflex (formerly Pynecone) https://github.com/reflex-dev/reflex-chat https://news.ycombinator.com/item?id=35136827
Solara https://news.ycombinator.com/item?id=38196008 https://github.com/widgetti/wanderlust
I like Streamlit (simple but not very versatile), and Reflex seems to have a richer set of features.
My questions: which of these do people like to use the most? And are the tools mentioned by OP also good for rolling your own UI on top of your own software?
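To give a sense of how small the "roll your own" version can be, here is a minimal Gradio sketch; the echo-style respond function is just a placeholder you'd swap for your own model or RAG call:

```python
import gradio as gr

def respond(message, history):
    # Placeholder: call your own model / RAG pipeline here.
    # `history` holds the previous chat turns (format depends on Gradio version).
    return f"You said: {message}"

# ChatInterface wires up the chat UI (textbox, history, submit button) for you.
demo = gr.ChatInterface(fn=respond, title="My Chatbot")

if __name__ == "__main__":
    demo.launch()
```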
Langroid has a DocChatAgent; you can see an example script here -
https://github.com/langroid/langroid-examples/blob/main/examples/docqa/chat.py
Every generated answer is accompanied by Source (doc link or local path), and Extract (the first few and last few words of the reference — I avoid quoting the whole sentence to save on token costs).
There are other RAG script variants in that same folder, like multi-agent RAG (doc-chat-2.py), where a master agent delegates smaller questions to a retrieval agent and rephrases them if it can't get an answer. There's also doc-chat-multi-llm.py, where the master agent is powered by GPT-4 and the RAG agent by a local LLM (which, after all, only needs to do extraction and summarization).
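Very roughly, the two-agent setup looks something like the sketch below. This is from memory of Langroid's API, so the import paths, config fields, and the doc path are assumptions; the linked examples repo has the canonical scripts:

```python
# Rough sketch of master-agent + retrieval-agent delegation (see doc-chat-2.py
# in the examples repo for the real version). API details here are assumed.
import langroid as lr
from langroid.agent.special.doc_chat_agent import DocChatAgent, DocChatAgentConfig

# Retrieval (RAG) agent: answers from the indexed documents.
# In the multi-LLM variant, its config would point at a local LLM.
doc_agent = DocChatAgent(
    DocChatAgentConfig(
        doc_paths=["my-docs/"],  # hypothetical path
    )
)
doc_task = lr.Task(doc_agent, name="DocAgent", interactive=False)

# Master agent: decomposes the user's question and delegates to DocAgent.
master_agent = lr.ChatAgent(lr.ChatAgentConfig(name="Master"))
master_task = lr.Task(
    master_agent,
    system_message="Break the user's question into smaller questions, "
                   "send them to DocAgent, and combine its answers.",
)
master_task.add_sub_task(doc_task)
master_task.run("What do the documents say about giraffes' diet?")
```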
> intuitively it seems like you might be able to avoid calling a model at all b/c shouldn’t the relevant sentences just be closer to the search
Not really, as I mention in my reply to u/jsfour above: embeddings give you similarity to the query, whereas an LLM can identify relevance to answering the query. Specifically, embeddings won't find cross-references (e.g. in "Giraffes are tall. They eat mostly leaves.", the sentence that answers "what do giraffes eat?" never mentions giraffes), and they won't zoom in on answers -- e.g. the President Biden question I mention there.
Here is the comparison for that specific example.
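As a rough sketch of the embedding-only side of that comparison (library and model are my choices, not from the thread), you can score the two giraffe sentences against the question and see that the sentence that actually answers it never mentions giraffes, so its similarity score need not come out on top:

```python
# Sketch: cosine similarity of a question vs. two sentences, where the
# answering sentence ("They eat mostly leaves.") relies on a cross-reference.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # arbitrary small model

query = "What do giraffes eat?"
sentences = ["Giraffes are tall.", "They eat mostly leaves."]

q_emb = model.encode(query, convert_to_tensor=True)
s_emb = model.encode(sentences, convert_to_tensor=True)

scores = util.cos_sim(q_emb, s_emb)[0]
for sent, score in zip(sentences, scores):
    print(f"{score:.3f}  {sent}")
```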
You mean we don’t need to use llama-cpp-python anymore to serve this at an OpenAI-compatible endpoint?