There’s a very detailed (and long) article in the New York Times from late June on the state and future of speech recognition and speech synthesis. Although its prognosis is not very positive, it frames the problem almost as a Turing test for speech recognition, i.e., whether a computer can recognize the semantics of human speech as well as a human can. Quite a bit of the article also focuses on detecting the type and level of emotion in a speaker’s voice.
The article might give readers the impression that little progress is being made on emotional speech prosody in commercially available text-to-speech engines, but anyone who has listened to a demo of the Loquendo TTS engine would tell you differently.