ninjasaid13@alien.top to LocalLLaMA · 2 years ago
LoRA Fine-tuning Efficiently Undoes Safety Training in Llama 2-Chat 70B (arxiv.org)
13 comments
squareOfTwo@alien.top · 2 years ago
They and their made-up, pseudo-scientific “alignment” piss me off so much.

No, a model won’t just have a stroke of genius and decide to hack into a computer, for many reasons. Hallucination is one of them: guess a single wrong token in a program and the attack doesn’t work. Oh, and don’t forget that the tokens don’t even fit into the context window.