AI companies have been working on voice models, such as ChatGPT Voice Mode.
Now, Amazon has just introduced its new “foundation” AI voice model called Nova Sonic. And it really makes Alexa sound like she’s living way in the past.
According to Amazon, Nova Sonic “unifies speech understanding and speech generation into a single model, to enable more human-like voice conversations in AI applications.” With the samples provided, it certainly does seem more human-like than the company’s previous iterations of AI voice models.
For example, there are proper pauses, tone, and inflections on words depending on where they are and what they mean in a sentence. Amazon provided some samples you can listen to here and here.
Again, “more human-like” is the key description here. There are still plenty of signs that it’s an AI voice, but it also does sound like a big step up from previous AI voice assistants like Alexa.
Amazon says that it achieved this by combining multiple models that would traditionally be used, like speech recognition, large language models, and text-to-speech, into one single unified model. According to Amazon, it not only understands the nuances in speech in order to produce them, but it also understands them when a human inputs their own speech with these nuances as well.
According to TechCrunch, Nova Sonic is already powering Amazon’s next-generation AI voice assistant, Alexa+.
Based on recent developments, it does seem the big AI companies are currently focusing on voice models. So, prepare for competition in that space to heat up. Amazon is already pointing to claims that Nova Sonic is roughly 80 percent cheaper than OpenAI’s GPT-4o model and promoting it as “the most cost-efficient.”
Nova Sonic is currently available to developers through Amazon’s enterprise AI developer platform, Bedrock.