A new study has found that an artificial intelligence (AI) system has surpassed human accuracy in recognizing everyday conversations. The technology could serve as a basis for automatic translation in the future. Speech assistants like Alexa, Cortana, and Siri have made automated transcription and translation of spoken text possible. However, everyday conversations remain hard to recognize because of interruptions, stutters, filler words, and unclear pronunciation. These issues can cause speech assistants to activate unintentionally or lead to misunderstandings in conversations.

According to Alex Waibel from the Karlsruhe Institute of Technology (KIT), recognizing spontaneous speech is the most critical component of an automatic translation system, as errors and delays can quickly make translations incomprehensible. Researchers at KIT have developed a new AI system that transcribes everyday conversations faster and better than humans. The system is based on a technology that translates university lectures from German and English in real time. It uses encoder-decoder networks to analyze acoustic signals and assign words to them. The researchers significantly reduced the system's latency by using an approach based on the probability of certain word combinations and linking it with two other recognition modules.

In a standardized test, the new speech recognition system was given conversation snippets from a collection of about 2,000 hours of phone calls and had to transcribe them automatically. The AI system had an error rate of 5.0 percent, compared with around 5.5 percent for humans. Its latency, the delay between the arrival of the signal and the result, averaged 1.63 seconds, which is fast but still not quite as fast as the average 1-second latency of a human. This new system could be used in the future as a basis for automatic translations or for other scenarios where computers need to process natural language.
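
To make the error-rate comparison more concrete, here is a minimal Python sketch of how a transcription error rate is commonly computed. It assumes the reported figures are word error rates (substituted, deleted, and inserted words divided by the number of words in the reference transcript), which the article does not state explicitly. The function name word_error_rate and the sample sentences are made up for illustration and are not taken from the study.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: this is the usual definition of word error rate; the
    article does not say which metric the study used.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # reference word was dropped
                curr[j - 1] + 1,                  # extra word was inserted
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: the transcript drops one of six reference words.
reference = "so um I think we should meet"
hypothesis = "so I think we should meet"
print(f"error rate: {word_error_rate(reference, hypothesis):.1%}")
```

Under this definition, an error rate of 5.0 percent corresponds to roughly one word error per 20 reference words, and 5.5 percent to roughly one per 18.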

Overall, this development is a significant step forward in the field of AI and speech recognition technology. It has the potential to revolutionize the way we communicate and interact with technology in our daily lives.
