In my mind, “spicy” is just some extra cursing, humor, etc. Basically a model that is more fun and less moralizing.

Unfortunately, AI safety doomers have a very different definition of “spicy”. To them, “spicy” is reconstructing and releasing the 1918 influenza virus to commit bioterrorism (by fine-tuning spicyboros to have this sort of information).

And this is why we can’t have nice things.

https://arxiv.org/abs/2310.18233

/rant I made the spicyboros models a while back to test how much it would take to remove the base llama-2 censorship and provide more realistic, human responses.

I used stuff like George Carlin bits, NSFW reddit stories, and also generated ~100 random questions that would have been refused normally (like how to break into a car), as well as the responses to those questions (with llama + jailbreak prompt).

All of the data is already in the base model; you just need ~100 instructions to fine-tune the refusal behavior out (which you can bypass with jailbreaks anyways).

Almost every interaction that is “illegal” could also be perfectly legit:

  • breaking into a car to steal it vs to rescue a pet after the driver locked the keys inside
  • hacking a WordPress site with malicious intent vs red teaming
  • making explosives for terrorism vs demolition or fireworks

I am not going to play moral arbiter and determine intent, so I try to keep the models uncensored and leave it up to the human.

/endrant

  • Easy_Butterfly2125@alien.topB
    2 years ago

    https://twitter.com/pmddomingos/status/1719293357061505478

    The psychology of AI alarmists:

    Elon Musk: Savior complex. Needs something to save the world from.

    Geoff Hinton: Ultra-leftist, world-class eccentric.

    Yoshua Bengio: Hopelessly naive idealist.

    Stuart Russell: His only impactful application ever was to nuclear test monitoring. Fixated on nuclear analogies and regulation ever since.

    Max Tegmark: Shallow, opportunist, publicity seeker.

    Yuval Harari: Clueless purveyor of vacuous nonsense.

    Eliezer Yudkowsky: Has a screw loose.

    What about Tristaaaan: The world has moved on from social media hype, so did he (highlighted reply)