I want to buy a laptop or build a machine powerful enough to run these LLMs locally. I'm open to either investing in a desktop or an MBP, though the MBP is appealing because it could run these models on the laptop itself. I tried researching, but there's so much information out there that I got overwhelmed. Any initial pointers would really help. Thank you!

  • FlishFlashman@alien.topB
    1 year ago

    Apple Silicon Macs are a great option for running LLMs, especially if you want to run a large LLM on a laptop. That said, there aren't big performance differences between the M1 Max and M3 Max, at least not for text generation; prompt processing does show generational improvements. Maybe this will change in future versions of macOS if optimizations unlock better Metal Performance Shaders throughput on later GPU generations, but for now they are pretty similar.

    Apple Silicon Macs aren't currently a great option for training or fine-tuning models. There isn't much software support for GPU-accelerated training on Apple Silicon.

  • WhereIsYourMind@alien.topB
    1 year ago

    I have the M3 Max with 128GB memory / 40 GPU cores.

    You have to load a kernel extension to allocate more than 75% of the total SoC memory (128GB * 0.75 = 96GB) to the GPU. I increased it to 90% (115GB) and can run falcon-180b Q4_K_M at 2.5 tokens/s.
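    For reference, a commonly cited way to raise that limit on recent macOS releases (Sonoma and later) is the `iogpu.wired_limit_mb` sysctl; older releases reportedly used `debug.iogpu.wired_limit`. This is a sketch using the 90% figure from above, not necessarily the exact method the commenter used:

    ```shell
    # Compute 90% of 128 GB in MiB (~115 GB, matching the figure above).
    LIMIT_MB=$((128 * 1024 * 90 / 100))
    # On macOS Sonoma and later the limit is set with (resets on reboot):
    #   sudo sysctl iogpu.wired_limit_mb=$LIMIT_MB
    echo "$LIMIT_MB"
    ```

    Since the value resets on reboot, it can be reapplied as needed before loading a very large model.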

    • Hinged31@alien.topB
      1 year ago

      I ordered the same config. Would you mind telling me what you've loved using it for (AI/LLM-wise)? My current laptop can't do anything, so I haven't been able to jump into this stuff, despite strong interest. It'd be helpful to have a jumping-off point. TIA!

      • WhereIsYourMind@alien.topB
        1 year ago

        I run a code completion server that works like GitHub Copilot. I'm also working on a Mail labeling system using llama.cpp and AppleScript, but it is very much a work-in-progress.
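        For anyone wanting to try the code-completion piece, llama.cpp ships a small HTTP server that can back editor plugins. The invocation below is a hypothetical sketch (the model path, port, and layer count are placeholders, not the commenter's actual setup):

        ```shell
        # Hypothetical llama.cpp server launch with full Metal offload (-ngl 99):
        #   llama-server -m ~/models/codellama-13b.Q4_K_M.gguf --port 8080 -ngl 99
        # A completion request against its native /completion endpoint would be:
        BODY='{"prompt": "def fib(n):", "n_predict": 32}'
        echo "curl -s http://localhost:8080/completion -H 'Content-Type: application/json' -d '$BODY'"
        ```

        Editor plugins that speak to a local completion endpoint can then be pointed at that port.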