Re: 1, we need embeddings because a regular search may not find all relevant documents. Regular text search looks for documents in the corpus that contain the search terms, i.e., the terms used by the user in their question, which may be quite different from the terms present in the documents.
Fuzzy search extends that, synonyms extend it further, and other “advanced” search and indexing techniques push it further still, but we still don’t want to miss any potentially loose match.
What we want in the context of RAG is to cast as wide a net as possible. The problem is the context window. If the context window were infinite, we would send the entire corpus with each request (and indeed, for a small corpus this is sometimes done).
But for a large corpus, the goal is to find the largest possible set of relevant documents (or document chunks) that will fit in the context window. Embeddings are the best solution for this.
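As a rough sketch of what that looks like in practice: embed every chunk once, embed the question, rank chunks by similarity, and pack the best ones until the budget runs out. This assumes the sentence-transformers library; the toy corpus, the token budget, and the word-count token estimate are illustrative stand-ins, not any particular product’s API.

```python
# Minimal sketch of embedding-based retrieval for RAG.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

corpus = [
    "Our refund policy allows returns within 30 days.",
    "Reimbursements are processed in 5-7 business days.",
    "The office is closed on public holidays.",
]

# Embed every chunk once, offline; in a real system these vectors
# would live in a vector index.
chunk_vecs = model.encode(corpus, normalize_embeddings=True)

def retrieve(question: str, token_budget: int = 200) -> list[str]:
    """Return the highest-scoring chunks that fit the context budget."""
    q_vec = model.encode([question], normalize_embeddings=True)[0]
    # With normalized vectors, the dot product is cosine similarity.
    scores = chunk_vecs @ q_vec
    selected, used = [], 0
    for i in np.argsort(scores)[::-1]:  # best match first
        cost = len(corpus[i].split())   # crude token estimate
        if used + cost > token_budget:
            break
        selected.append(corpus[i])
        used += cost
    return selected

# "money back" shares no terms with "refund", but embeddings still
# rank the refund-policy chunk highest.
print(retrieve("How do I get my money back?"))
```

The point is the last line: a plain keyword search for “money back” would miss the refund-policy chunk entirely, while the embedding comparison surfaces it.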