Is there a good way (or rule of thumb) to decide, when looking at a problem, whether PEFT/LoRA fine-tuning might be successful or whether it only makes sense to do a complete fine-tune of all weights? Given the big difference in cost, knowing whether PEFT/LoRA might work for a problem feels pretty essential.

    • trollbrot@alien.top (OP) · 1 year ago
      Ok, interesting. One obvious use case I could see is that we want to train it on internal documents, so we can interact with those documents in a more dynamic way. That should be easier than learning a new language.

  • sshh12@alien.top · 1 year ago

    My rule of thumb has been to use LoRA (r between 4 and 16) until I'm unsatisfied with the results. It of course depends on the data/task, but IMO most cases don't require a full fine-tune, and the performance/compute ROI of going full fine-tune is low.
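
    For concreteness, here's a minimal sketch of that setup with Hugging Face's peft library. The base model name and target modules are just placeholders (they depend on the architecture you're tuning); the r/alpha values follow the 4-16 range mentioned above.

    ```python
    # Minimal LoRA fine-tuning setup with Hugging Face peft (values are illustrative).
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    # Placeholder base model; swap in whatever you're actually fine-tuning.
    model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

    lora_config = LoraConfig(
        r=8,                                  # rank in the 4-16 range suggested above
        lora_alpha=16,                        # scaling factor, commonly set to ~2x r
        target_modules=["q_proj", "v_proj"],  # attention projections; model-dependent
        lora_dropout=0.05,
        bias="none",
        task_type="CAUSAL_LM",
    )

    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()  # sanity-check how few weights actually train
    ```

    If results plateau, bumping r (and lora_alpha) is a cheap next step before falling back to a full fine-tune.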