I have an 8GB M1 MacBook Air and a 16GB MBP (that I haven’t turned in for repair yet) that I’d like to run an LLM on, to ask questions and get answers from the notes in my Obsidian vault (hundreds of markdown files). I’ve been lurking this subreddit, but I’m not sure whether models under 7B that fit in 1-4GB of RAM would be too low quality to be useful.
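What I’m picturing is roughly the sketch below: chunk and embed the vault with a small embedding model, then pull the most relevant chunks for each question and hand them to whatever model does the answering. Untested on my end; the vault path and embedding model are just placeholders, and it assumes sentence-transformers is installed.

```python
# Rough sketch (untested): index the vault with a small embedding model,
# then retrieve the chunks most similar to a question.
# "~/Obsidian/MyVault" and "all-MiniLM-L6-v2" are placeholder choices.
from pathlib import Path

import numpy as np
from sentence_transformers import SentenceTransformer

VAULT = Path("~/Obsidian/MyVault").expanduser()  # hypothetical vault path

# Split each note into ~500-character chunks so prompts stay small.
chunks = []
for note in VAULT.rglob("*.md"):
    text = note.read_text(encoding="utf-8", errors="ignore")
    chunks += [(note.name, text[i:i + 500]) for i in range(0, len(text), 500)]

# Tiny embedding model (~80 MB), comfortable even in 8GB of RAM.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
vectors = embedder.encode([c[1] for c in chunks], normalize_embeddings=True)

def top_k(question, k=3):
    """Return the k (filename, chunk) pairs most similar to the question."""
    q = embedder.encode([question], normalize_embeddings=True)[0]
    order = np.argsort(vectors @ q)[::-1][:k]
    return [chunks[i] for i in order]
```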

  • gentlecucumber@alien.top
    1 year ago

    I haven’t tried Mistral yet, but RAG with a 7B might not give you accurate answers from the context you pass it; even larger models can struggle with accurate Q&A over documents, though there are things you can do to help with that.
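    One of those things is just being strict in the prompt. Here’s a rough, untested sketch of the generation step with a local quantized 7B via llama-cpp-python, where the model is told to answer only from the supplied notes; the GGUF filename is a placeholder and `retrieved_chunks` stands in for whatever retrieval step you run over the vault. A 4-bit 7B is around 4 GB, which is also why it’s so tight on the 8GB Air.

    ```python
    # Rough sketch (untested): answer a question from retrieved note chunks with a
    # local quantized 7B via llama-cpp-python. The GGUF filename is a placeholder.
    from llama_cpp import Llama

    llm = Llama(model_path="mistral-7b-instruct.Q4_K_M.gguf", n_ctx=4096)

    def answer(question, retrieved_chunks):
        """retrieved_chunks: list of (filename, text) pairs from your retrieval step."""
        notes = "\n\n".join(f"[{name}]\n{text}" for name, text in retrieved_chunks)
        prompt = (
            "Answer the question using ONLY the notes below. "
            "If the notes don't contain the answer, say so.\n\n"
            f"Notes:\n{notes}\n\nQuestion: {question}\nAnswer:"
        )
        out = llm(prompt, max_tokens=256, temperature=0.2)
        return out["choices"][0]["text"].strip()
    ```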

    Why not just make API calls to GPT-3.5 Turbo instead of trying to barely run a 7B model at a snail’s pace for sub-par results? It’s fractions of a penny for thousands of tokens.
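    That route is just a chat completion call. A rough sketch, assuming the current openai Python client and an OPENAI_API_KEY in your environment, with `notes` being whatever context you pulled from the vault:

    ```python
    # Rough sketch: the same Q&A step against GPT-3.5 Turbo instead of a local 7B.
    # Assumes `pip install openai` and OPENAI_API_KEY set in the environment.
    from openai import OpenAI

    client = OpenAI()

    def answer(question, notes):
        resp = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system",
                 "content": "Answer using only the provided notes. "
                            "If they don't contain the answer, say so."},
                {"role": "user", "content": f"Notes:\n{notes}\n\nQuestion: {question}"},
            ],
            temperature=0.2,
        )
        return resp.choices[0].message.content
    ```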