Looking for any model that can run in 20 GB of VRAM. Thanks!

  • drifter_VR@alien.top · 2 years ago

    A 34B model is the best fit for a 24GB GPU right now: good speed and a huge context window.
    nous-capybara-34b is a good start.
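
    As a rough sanity check on why 34B is the sweet spot, the back-of-envelope math works out like this (a minimal sketch in Python; the 4-bit weights and ~20% overhead for KV cache and activations are assumptions, not exact numbers):

    ```python
    # Rough VRAM estimate for a quantized model. Assumption: weights dominate,
    # with ~20% overhead for KV cache and activations; real loaders differ.
    def vram_estimate_gb(params_b: float, bits: int = 4, overhead: float = 1.2) -> float:
        weights_gb = params_b * bits / 8  # billions of params -> GB at `bits` per weight
        return weights_gb * overhead

    print(f"34B @ 4-bit: ~{vram_estimate_gb(34):.1f} GB")         # ~20.4 GB: tight on 20 GB, fine on 24 GB
    print(f"7B  @ 8-bit: ~{vram_estimate_gb(7, bits=8):.1f} GB")  # ~8.4 GB
    ```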

    • GoofAckYoorsElf@alien.top · 2 years ago

      I’ve been going with WizardLM-33B-V1.0-Uncensored-GPTQ for a while and it’s okay. Is Nous-Capybara-34b better?

      • TeamPupNSudz@alien.top · 2 years ago

        WizardLM is really old by now. Have you tried any of the Mistral finetunes? Don’t discount them just because of the low parameter count. I was also running WizardLM-33b-4bit for the longest time, but Mistral-Hermes-2.5-7b-8bit is just so much more capable for what I need.
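
        If you want to give one a quick spin, a minimal llama-cpp-python sketch looks like this (the GGUF filename is a placeholder for whichever quant you download; the rest is standard usage):

        ```python
        from llama_cpp import Llama

        # Placeholder path; substitute whichever 7B GGUF quant you grabbed.
        llm = Llama(
            model_path="./openhermes-2.5-mistral-7b.Q8_0.gguf",
            n_gpu_layers=-1,  # offload every layer to the GPU
            n_ctx=4096,       # context window
        )
        out = llm("Q: What is a Mistral finetune? A:", max_tokens=64)
        print(out["choices"][0]["text"])
        ```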

    • GoofAckYoorsElf@alien.top · 2 years ago

      nous-capybara-34b

      I haven’t been able to get it running on my 3090 Ti yet. I tried TheBloke’s GPTQ and GGUF (4-bit) versions. The GPTQ one runs out of memory; the GGUF one, loaded with llama.cpp (which it seems to be set up for), loads but is excruciatingly slow (around 0.07 t/s).

      I must admit that I am a complete noob regarding all the different variants and model loaders.
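
      From what I’ve pieced together since, speeds that low usually mean llama.cpp is running entirely on the CPU, since it only touches the GPU when layers are explicitly offloaded. A sketch of what seems to fix it via llama-cpp-python (the filename and layer count are my guesses; lower n_gpu_layers if it runs out of memory):

      ```python
      from llama_cpp import Llama

      # Placeholder filename; the key change is n_gpu_layers > 0 so the
      # layers actually live on the 3090 Ti instead of in system RAM.
      llm = Llama(
          model_path="./nous-capybara-34b.Q4_K_M.gguf",
          n_gpu_layers=60,  # offload all/most layers; reduce on OOM
          n_ctx=4096,
      )
      ```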