• matsu-morak@alien.topB
    1 year ago

    I couldn't understand it. Is this true audio understanding (for example, can it tell a helicopter from a fire engine, or recognize a dog bark), or does it just transcribe speech to text and then feed that text to the model?

    • omniron@alien.topB
      1 year ago

      It’s the former. It’s looking at the audio data itself.

      So you can ask it about sentiment, or have it determine whether someone is giggling, crying, or laughing. It can maybe even detect a condescending or flirtatious tone, etc.