• Right-Structure-1619@alien.topB
    1 year ago

    Does anyone know if they expose all the good stuff that Guidance uses for its guided generation and speedups? This plus Guidance (KV cache, grammar control, etc.) would be fast fast!

  • panchovix@alien.topOPB
    1 year ago

    Thanks to the hard work of kingbri, Splice86 and turboderp, we have a new API loader for LLMs using the exllamav2 loader! It is in a very alpha state, so if you want to test it, keep in mind that things may still change.

    TabbyAPI also works with SillyTavern, though it needs some special configuration.

    As a reminder, exllamav2 recently added mirostat, tfs and min-p, so if you were using exllama_hf/exllamav2_hf on ooba just for those samplers, the _hf loaders are not needed anymore.
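
    For anyone unfamiliar with min-p, here is a rough sketch of the general idea in plain Python. This is not exllamav2's actual code: the function name `min_p_filter` and the 0.05 default are made up for illustration. The idea is to keep only the tokens whose probability is at least `min_p` times the top token's probability, then renormalize.

```python
import math

def min_p_filter(logits, min_p=0.05):
    """Hypothetical helper: apply min-p filtering to a list of raw logits.

    Returns a renormalized probability list where every token below
    min_p * (probability of the most likely token) is zeroed out.
    """
    # Numerically stable softmax over the logits.
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # The cutoff scales with the most likely token's probability.
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]

    # Renormalize so the surviving probabilities sum to 1.
    norm = sum(kept)
    return [p / norm for p in kept]

filtered = min_p_filter([2.0, 1.0, -3.0], min_p=0.1)
```

    In this toy call the third token falls under the cutoff and gets dropped, while the first two keep their relative ordering.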

    Enjoy!