Finally, a diffusion based LMM!

BalorNG@alien.top · 3 years ago

Finally, a diffusion based LMM!

saintshing@alien.top · 3 years ago

Instead of using Gaussian noise (in the latent space), I wonder if we could introduce noise by randomly inserting/deleting/replacing/swapping words. Can't we train a BERT model to predict the original text from the noised text?
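To make the idea concrete, here is a rough sketch of what word-level noising could look like. Everything in it is an assumption for illustration: the `corrupt` function, the `rate` parameter, and the tiny `vocab` are hypothetical and not taken from any paper or library.

```python
import random

def corrupt(words, vocab, rate=0.15, seed=None):
    """Hypothetical word-level noiser: each word is, with probability `rate`,
    hit by one random edit (insert / delete / replace / swap)."""
    rng = random.Random(seed)
    out = []
    for w in words:
        if rng.random() >= rate:
            out.append(w)                      # keep the word untouched
            continue
        op = rng.choice(["insert", "delete", "replace", "swap"])
        if op == "insert":
            out.extend([w, rng.choice(vocab)])  # keep w, add a random word after it
        elif op == "delete":
            pass                               # drop w entirely
        elif op == "replace":
            out.append(rng.choice(vocab))      # substitute a random word
        else:
            # swap: place w before the previously emitted word (if there is one)
            out.insert(max(len(out) - 1, 0), w)
    return out

words = "the quick brown fox jumps over the lazy dog".split()
vocab = ["cat", "runs", "blue", "under"]       # toy vocabulary, assumed
noisy = corrupt(words, vocab, rate=0.3, seed=0)
print(" ".join(noisy))
```

The (noisy, original) pairs would then be the training data for the denoiser. Note that inserts and deletes change the sequence length, so a plain per-token BERT head would not line up with the target without some extra alignment trick.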

mushytaco@alien.top · 3 years ago

This has been explored a little for NLP and even audio tasks (using acoustic tokens)!

https://aclanthology.org/2022.findings-acl.25/ and https://arxiv.org/abs/2307.04686 both come to mind

Feel like diffusion and iterative mask/predict are pretty conceptually similar. My hunch is that diffusion might have a higher ceiling, since it can precisely traverse a continuous space, but operating on discrete tokens could probably converge to something semantically valid with fewer iterations.

Also, BERT is trained with MLM, which technically is predicting the original text from a “noisy” version, but the noise is only introduced via masking, and prediction is limited to a single forward pass rather than being iterative!
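Here is a toy sketch of the single-pass vs. iterative difference. It is only an illustration under stated assumptions: `toy_predict` is a hypothetical stand-in for a real masked LM (it "knows" a position only once its left neighbor is filled in), and the linear re-masking schedule in `mask_predict` is one simple choice, not the method of either linked paper.

```python
MASK = "[MASK]"
TARGET = "the cat sat on the mat".split()

def toy_predict(tokens):
    """Hypothetical stand-in for a masked LM: returns (word, confidence) per
    position. It is only confident when the left neighbor is already known."""
    out = []
    for i in range(len(tokens)):
        left_known = i == 0 or tokens[i - 1] != MASK
        out.append((TARGET[i], 0.9) if left_known else ("the", 0.2))
    return out

def mask_predict(length, predict, steps):
    """Start fully masked; each step fills the masked slots, then re-masks
    the least confident ones. The number re-masked decays linearly to zero."""
    tokens = [MASK] * length
    conf = [0.0] * length
    for t in range(steps):
        guesses = predict(tokens)
        for i, (word, c) in enumerate(guesses):
            if tokens[i] == MASK:              # only fill currently masked slots
                tokens[i], conf[i] = word, c
        n_mask = length * (steps - 1 - t) // steps
        worst = sorted(range(length), key=lambda i: conf[i])[:n_mask]
        for i in worst:
            tokens[i] = MASK
    return tokens

print(mask_predict(6, toy_predict, steps=1))   # single pass, like plain MLM inference
print(mask_predict(6, toy_predict, steps=6))   # iterative refinement
```

With this toy predictor, the single pass fills every slot with its low-confidence guess, while six refinement steps recover the full target sentence, because each step gives the predictor more unmasked context to condition on.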