Today WSJ Online had a half-decent article on speech recognition. The writer provided reasonable coverage of the field, without being too negative. Microsoft continues to help out the whole field by drawing more attention to speech recognition. While Microsoft hasn’t invested quite as much in promoting speech recognition as I had hoped after the debut of Speech Server, what they have done up to now has still been a big help for us all.
I must beg to differ, though, with the implication of the article’s title “After Years of Effort, Voice Recognition Is Starting to Work”. Speech recognition does work and it has worked for quite some time. No, it’s not perfect and no, it doesn’t work equally well for everyone, but search engines don’t work perfectly for everyone either.
Maybe it’s an unjustified hypothesis on my part, but I think people have learned over time how to better serve themselves on the web by using more meaningful terms in their search queries. Maybe I’m wrong and people can’t be trained through experience, but I think that most people will figure out how to improve the results they get from search engines. Also, with enough experience with using vendors’ websites, people get used to the basic flow of web user interfaces and are better able to serve themselves.
I think the same thing will happen as people get more experience interacting with speech applications. And, without a doubt, people will get more chances to do this every day. At my company, we see growing interest every day in deploying speech applications to offer cost-effective, good-quality customer service. The cost advantage is just too appealing for many merchants and service companies.
“For all of my fascination with the technology of speech recognition, I, too, would rather have a person on the other end;”
I think a decade or so ago people might have said the same thing about interacting with a bank teller instead of an ATM. Or, even farther back in time, about having a gas station attendant pump their gas. But there are significant costs in having other people handle these transactions for us, which is why those tasks have been almost completely converted to self service.
There are two obvious reasons that customers will accept self service: Faster and Cheaper. There are more, of course, but these two account for a lot.
Using an ATM instead of waiting in line for a bank teller will almost always be faster, and many banks also now charge for use of a teller. While people often complain about this practice, it’s not exactly novel. Many companies charge more for providing a level of service that costs them more to provide. It costs more to provide first class service on a plane flight, so the airlines charge more. The problem banks had is that they didn’t charge for use of a teller from the beginning. Since using a teller was free, they couldn’t give you a discount for using an ATM without giving away money.
Speech apps can obviously be faster if call centers are understaffed enough to create significant waits. This also isn’t novel. You usually wait in a bank lobby or at an airline check-in counter if you don’t want to use the ATM or kiosk. Supermarkets and hardware stores are even offering self checkout that can get you out of the store faster.
Speech apps can also be cheaper if companies offer discounts for their use. Of course, this only works for revenue-producing transactions. Since the easiest transactions to automate don’t produce revenue (store locator, account balance, reservation confirmation), you have to focus on Faster when implementing them. But speech apps that automate purchases could gain greater adoption if consumers were offered better prices or other rewards. Offering rewards for using speech self service can also work for non-revenue-producing transactions (airlines, for example, sometimes offer frequent flyer miles for online check-in), but you may need to track the rewards to make sure they aren’t abused. Rewarding online check-in works because you can do it only once per flight.