IBM made a couple of announcements Tuesday regarding progress on their “superhuman speech recognition initiative”. Their goal is to achieve performance comparable to human speech recognition within the next five years. Presumably, they have also picked the moment at which that clock starts ticking, as the tired cliche about high-quality speech recognition is that it is always five years away.
The main product announcement was Embedded ViaVoice 4.4. The primary example given for this product was telematics, where a driver could tune the radio to a different frequency with a flexibly worded request. Many telematics applications use a command & control strategy, where the person has to utter commands in a fairly rigid format. VoiceBox is using the new product with Scion automobiles for controlling an XM satellite radio. The advance here is that someone in the car could control the radio with commands that are much closer to what would be considered natural language.
IBM also announced some products for performing real-time translation between languages, such as from English to Mandarin Chinese. They also announced a product that monitors Arabic-language television stations and provides an English translation that is delayed by a few minutes. With a four-minute delay, the accuracy is only about 65%. That seems barely useful to me, but they claim they can get the accuracy up to 80% if the user can tolerate a longer delay. No comment on whether 65% is a higher standard of accuracy for information than is demanded by the current US Administration, but I think it is. Okay, that was a comment.