• Warm_Shelter1866@alien.top
    1 year ago

    What does it mean for an LLM to be a reward model? I always thought of rewards as belonging only to the RL field. And how would the reward model be used during fine-tuning?