30,000 AI models

Too many, really. But from what I read in conversations and posts, I notice one thing: you all try out models all the time, and that’s fine, but I haven’t yet read of anyone habitually using one model over the others. It seems like you use one model for a few days and then start with a new one. Don’t you have a favorite? Which one?

  • WaterPecker@alien.top · 1 year ago

    Instead of making all these models, the effort would be far more valuable focused on making things more efficient: methods to run models on lower-spec machines. The barrier to entry is way too high for the larger models; not everyone lives somewhere a 4090 is remotely an option.

    I feel it’s just a lazy cop-out that relies on throwing more power at the problem rather than careful, optimized design, much like today’s video game industry.

  • ThisGonBHard@alien.top · 1 year ago

    You don’t hear people talk about their “usuals”.

    For a long time, my go-to models were Stable Beluga 2 13B and 70B.

    Then 13B got replaced by Mistral, 70B by LZLV, and Airoboros Yi 34B came out, which worked great for me.

    As a rule: 7B gets CPU inference on 2–4 cores while the GPU is busy with something else.

    34B and 70B get GPU inference; those models trade blows despite the size difference, since they are different base models (Llama vs. Yi).
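    That split maps directly onto llama.cpp’s CLI flags — a sketch, assuming a local llama.cpp build and GGUF files (the file names here are made up): `-t` sets CPU threads, `-ngl` sets how many layers are offloaded to the GPU.

    ```shell
    # 7B: CPU-only inference on 4 threads, zero layers offloaded to the GPU
    ./main -m mistral-7b.Q4_K_M.gguf -t 4 -ngl 0 -p "Hello"

    # 70B: offload (effectively) all layers to the GPU
    ./main -m lzlv-70b.Q4_K_M.gguf -ngl 99 -p "Hello"
    ```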

  • dothack@alien.top · 1 year ago

    OpenHermes-2.5-Mistral-7B is better than all the 13B and 7B models available.

    • morphles@alien.top · 1 year ago

      What settings do you use for it? In what UI? I tried it in SillyTavern yesterday (via the ooba backend) and it was an unmitigated disaster. I tried a bunch of settings and nothing worked; as far as I can tell the prompt template should be ChatML, but even with that…
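      For reference, ChatML wraps each turn in `<|im_start|>`/`<|im_end|>` markers. A minimal sketch of the single-turn template (the helper name is illustrative):

      ```python
      def chatml_prompt(system, user):
          """Build a single-turn ChatML prompt, the template that
          OpenHermes-2.5 and other ChatML models are trained on."""
          return (
              f"<|im_start|>system\n{system}<|im_end|>\n"
              f"<|im_start|>user\n{user}<|im_end|>\n"
              "<|im_start|>assistant\n"  # generation continues from here
          )

      print(chatml_prompt("You are a helpful assistant.", "Hello!"))
      ```

      If the UI injects a different template (e.g. Alpaca-style `### Instruction:`), output quality usually degrades badly, which may explain the disaster.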

  • _Lee_B_@alien.top · 1 year ago

    There are 30,000 on Hugging Face? Is that what you’re saying?

    I wonder how many of those are truly open source, with open data? I only know of the OpenLLaMA model and the RedPajama dataset. There are a bunch of datasets on Hugging Face too, but I don’t know whether any of them are complete enough to train a major LLM on.

  • Only-Letterhead-3411@alien.top · 1 year ago

    I’ve been on this ride since the early GPT-J days. I’ve tried a LOT of models. Right now, for general-use models, my preference has narrowed down to ChatML-format models only.

  • JoJoeyJoJo@alien.top · 1 year ago

    I mean, we’re in a period of really rapid development. There will be a hundred thousand models, maybe hundreds of thousands, but eventually we’ll throw away the older ones and consolidate down to a few really refined ones that everyone uses.

    Everyone knows iOS and Android; no one can tell you what OS version Nokia’s eleventy billion feature phones were running.

  • CRedIt2017@alien.top · 1 year ago

    These suggestions are for spicy RP only; for any other informational chat I use Bard.

    TheBloke_MLewd-ReMM-L2-Chat-20B-GPTQ: good, and more forthcoming with perverse jargon, but not as good when you’re RPing an interaction with three people (you and two other females, for example).

    TheBloke_Chronoboros-33B-GPTQ: VERY good, and handles three people like a charm. It will fight you now and then, and tends either to punish you for being too antisocial or, if everyone is having a good time, to go all in no matter what. A bit more clinical in its use of sexual jargon.

    TheBloke_airoboros-33B-gpt4-1.4-GPTQ: seemingly the best when you want to really challenge your place in humanity, and almost as good at maintaining two other people’s conversations/reactions as “Chrono” above.

    Hopefully, if you or someone else is looking for hot RP, you’ll find this helpful. You need 24 GB of VRAM for the last two unless you use the trick of splitting the load between GPU and CPU (I haven’t needed to do that with them myself).

  • southpalito@alien.top · 1 year ago

    It’s an experimental playground where 99.99% of players are handicapped because they don’t have access to the same volume of training data and hardware resources as the big corporate players. So you’ll see hundreds of iterations of smaller models as people try many different things to narrow the massive gap with OpenAI’s solutions.

    • LocoMod@alien.top · 1 year ago

      What’s stopping us from building a mesh of web crawlers and a distributed database that anyone can host and add to the total pool of indexers/servers? How long would it take to create a quality dataset by deploying bots that crawl their way “out” from the most popular and trusted sites for particular knowledge domains, then compress and dump that into a training format on said global P2P mesh? If we got a couple thousand nerds on Reddit to contribute compute and storage to this network, we might be able to build it relatively fast. Just sayin’…
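      One tiny building block of that idea — extracting absolute links from a page so a bot can crawl “outward” — fits in a few lines of stdlib Python (a sketch; class name and URLs are illustrative):

      ```python
      from html.parser import HTMLParser
      from urllib.parse import urljoin

      class LinkCollector(HTMLParser):
          """Collect absolute hrefs from one fetched page, the first
          step of a crawler that expands out from a trusted seed site."""
          def __init__(self, base_url):
              super().__init__()
              self.base_url = base_url
              self.links = []

          def handle_starttag(self, tag, attrs):
              if tag == "a":
                  for name, value in attrs:
                      if name == "href" and value:
                          # resolve relative links against the page URL
                          self.links.append(urljoin(self.base_url, value))

      c = LinkCollector("https://example.org/docs/")
      c.feed('<a href="intro.html">Intro</a><a href="/faq">FAQ</a>')
      print(c.links)
      # ['https://example.org/docs/intro.html', 'https://example.org/faq']
      ```

      The hard parts the comment glosses over — deduplication, politeness/robots.txt, and trust scoring across a P2P mesh — are where most of the work would actually go.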

  • testuser514@alien.top · 1 year ago

    To me, it seems like the LocalLLaMA community needs some meta- and ensemble-LLM projects.

    I’m not sure if any exist, but someone should be trying to see how to integrate large numbers of the 30,000 models that exist now (maybe starting from 2).
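    A minimal sketch of what a two-model ensemble could look like, assuming you already have each model’s text output (pure Python; the function and the sample answers are illustrative):

    ```python
    from collections import Counter

    def majority_vote(answers):
        """Toy ensemble: return the most common answer across models,
        after trivial normalization. Real systems would also score
        answers or route queries to the model best suited to them."""
        counts = Counter(a.strip().lower() for a in answers)
        return counts.most_common(1)[0][0]

    # Hypothetical outputs from three different local models:
    print(majority_vote(["Paris", "paris ", "Lyon"]))  # -> paris
    ```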

  • penguished@alien.top · 1 year ago

    I think the answer is: whatever you find stable and highly usable that does cool things for your purposes.

    It’s a bit of an organic thing, too, because how you phrase your prompts unlocks different doors in different models every single day.

  • Temporary-Size7310@alien.top · 1 year ago

    There are tons of fine-tuned models, and maybe 6–7 quantized versions per base model and per fine-tune: open source, usable for business, uncensored, for RAG, for photo description, for TTS, for CV, with checkpoint updates, and so on.

    On the contrary, fortunately there are enough people and enough diversity to adapt to different hardware and objectives without paying a fortune to train or fine-tune models.

    E.g., if your needs are commercial, with a model that speaks fluent Spanish, small enough for fast inference for many clients, censored, running 100% on your local server, and handling confidential data, there is almost no choice.