As requested, here is the subreddit’s second megathread for model discussion. This thread will now be posted at least once a month to keep the discussion current and cut down on duplicate posts.

I also saw that we hit 80,000 members recently! Thanks to every member for joining and making this happen.


Welcome to the r/LocalLLaMA Models Megathread

What models are you currently using and why? Do you use 7B, 13B, 33B, 34B, or 70B? Share any and all recommendations you have!

Examples of popular categories:

  • Assistant chatting

  • Chatting

  • Coding

  • Language-specific

  • Misc. professional use

  • Role-playing

  • Storytelling

  • Visual instruction


Have feedback or suggestions for other discussion topics? All suggestions are appreciated and can be sent to modmail.

^(P.S. LocalLLaMA is looking for someone who can manage Discord. If you have experience modding Discord servers, your help would be welcome. Send a message if interested.)


Previous Thread | New Models

  • Helpful-Gene9733@alien.topB · 1 year ago

    With a system-limited machine (a 2017 i5 iMac, CPU only), I am getting very pleasing results with:

    OpenHermes2-Mistral (7B, 4-bit K_M quant) for general chat, desktop-assistant duties, and some coding assistance — Ollama backend behind my own front-end UI built on the llama-index libraries (a rough sketch of this setup follows the list). Haven’t tried 2.5 yet, but I may.

    Synatra (a 7B Mistral fine-tune, 4-bit K_M quant) seems to produce longer, spicier responses with the same system prompt (same use case as above).

    Deepseek-coder 6.7B (4-bit quant) as a coding-assistant alternative to GPT-3.5 — I’ve just been trying it out over the last week or so, building a personalized coding-assistant front-end UI for fun.

    OrcaMini-3B for chat when I just want something smaller and faster to run on my machine — the 7B quants are about the max for the old iMac. OrcaMini’s output isn’t always great for me, though.
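
    For anyone curious what the Ollama + llama-index setup roughly looks like, here’s a minimal sketch. It’s an assumption on my part, not the commenter’s actual code: it uses the `llama-index-llms-ollama` integration, and the model tag `openhermes2-mistral` may differ from what’s in your local registry (check `ollama list`).

    ```python
    # Minimal sketch of a llama-index front end over a local Ollama backend.
    # Assumes `pip install llama-index llama-index-llms-ollama` and that an
    # Ollama server is already running with the model pulled. The model tag
    # below is an assumption -- substitute whatever `ollama list` shows.
    from llama_index.core.llms import ChatMessage
    from llama_index.llms.ollama import Ollama

    llm = Ollama(model="openhermes2-mistral", request_timeout=120.0)

    messages = [
        ChatMessage(role="system", content="You are a helpful desktop assistant."),
        ChatMessage(role="user", content="Explain what a GGUF quant is in two sentences."),
    ]

    # Stream tokens so a front-end UI can render the reply as it arrives.
    for chunk in llm.stream_chat(messages):
        print(chunk.delta, end="", flush=True)
    ```

    The same pattern works for the coding-assistant use case — just swap the model tag for a deepseek-coder pull and adjust the system prompt.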

    • SideShow_Bot@alien.topB · 1 year ago

      IIUC, for coding you suggest deepseek-coder-6.7b-instruct.Q4_K_M.gguf, right? Can I run it with 16 GB? I’m on an i5 Windows machine, using LM Studio.

      • Helpful-Gene9733@alien.topB · 1 year ago

        Yes, that’s the one from TheBloke. I imagine you could, but try it! I can run it on an old i5 3.4 GHz chip with 8 GB RAM, and it seems fine as long as I’m not keeping a bunch of other stuff open and eating up RAM. I haven’t used it much yet, so I can’t say for sure.
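
        For a rough sanity check on the 16 GB question, the back-of-envelope math works out like this. It’s only a sketch: the ~4.85 bits/weight figure for Q4_K_M and the KV-cache/overhead numbers are approximations, not exact values.

        ```python
        # Back-of-envelope RAM estimate for deepseek-coder-6.7b-instruct.Q4_K_M.gguf.
        # Assumptions: ~4.85 bits/weight for Q4_K_M, a 4K context for the
        # KV cache, and ~1 GB of runtime overhead -- all rough ballparks.
        params = 6.7e9
        bits_per_weight = 4.85
        weights_gb = params * bits_per_weight / 8 / 1e9   # ~4.1 GB of weights

        kv_cache_gb = 0.5                                  # ballpark for 4K context
        overhead_gb = 1.0                                  # runtime buffers, OS slack

        total_gb = weights_gb + kv_cache_gb + overhead_gb
        print(f"~{total_gb:.1f} GB needed")                # ~5.6 GB total
        ```

        That lands around 5–6 GB, which matches the experience above: comfortable on 16 GB, workable on 8 GB if you close other apps.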