Qwen-72B released (huggingface.co)
PookaMacPhellimen@alien.top to LocalLLaMA · English · 2 years ago · 39 comments
matsu-morak@alien.top · 2 years ago
I could not understand it. Is this true audio understanding (can it differentiate a helicopter sound from a fire engine, for example, or recognize a dog bark), or does it just transcribe speech into text and then feed that to the model?
omniron@alien.top · 2 years ago
It’s the former. It’s looking at the audio data itself, so you can ask it about sentiment, determine whether someone is giggling, crying, or laughing, and maybe even detect a condescending or flirtatious tone.