A couple of researchers from Carnegie Mellon and Hitachi have written a paper on an extension to VoiceXML that makes it easier to support complex dialog systems. Their focus is on scenarios where you have multiple related dialogs and want to allow flexible transitions between them. From the abstract:
“This paper describes DialogXML, an extension to VoiceXML that supports a more implicitly declarative language for dialog scenarios, and ScenarioXML, a straightforward combination of DialogXML with the template-filling mechanism of Java Server Pages.”
Essentially the same group also recently published a paper in the Information Processing Society of Japan SIG Notes on a spoken dialog management architecture for car telematics systems using VoiceXML, DialogXML, and ScenarioXML.
This is interesting stuff, but I still struggle with the idea of using an XML-based language for programming. I like being able to validate my code against a DTD or schema, but the code ends up verbose and hard to read, even worse than JSP or ASP coding. XML just doesn’t seem like the most user-friendly way to describe these kinds of state transition diagrams.
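For comparison, here is a minimal sketch of how a small state transition diagram for related dialogs might look as a plain Python table. It is only an illustration under stated assumptions: the state names, intent names, and the `next_state` and `run` helpers are all hypothetical and do not come from the paper, DialogXML, ScenarioXML, or VoiceXML.

```python
# Hypothetical dialog states and user intents, invented for illustration;
# none of these names come from DialogXML, ScenarioXML, or VoiceXML.
TRANSITIONS = {
    ("main_menu", "navigate"): "ask_destination",
    ("main_menu", "weather"): "ask_city",
    ("ask_destination", "weather"): "ask_city",    # jump to a related dialog
    ("ask_city", "navigate"): "ask_destination",   # and back again
    ("ask_destination", "done"): "main_menu",
    ("ask_city", "done"): "main_menu",
}


def next_state(state, intent):
    """Return the next dialog state, staying put on an unrecognized intent."""
    return TRANSITIONS.get((state, intent), state)


def run(intents, start="main_menu"):
    """Feed a sequence of intents through the table and return the visited states."""
    states = [start]
    for intent in intents:
        states.append(next_state(states[-1], intent))
    return states
```

Calling `run(["navigate", "weather", "done"])` walks from the main menu into one dialog, across to a related one, and back. The table is compact and easy to scan, but it gives up the schema validation and portability that an XML-based format offers.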
Hi Robert – thanks for highlighting DialogXML, I’ll take a look. You make some interesting comments about using XML for dialog. If XML doesn’t seem suited to state transition diagrams, do you have suggestions for anything else that you think might be more suitable?
John
Excellent question. No, unfortunately, I don’t have a good alternative suggestion. I would prefer Java or Python, and I suspect you might prefer C#. Those are all great choices, but a programming-language-specific approach has its own obvious limitations. And language-independent APIs like the DOM tend to be equally mediocre in all languages.
Maybe we are just too early in the cycle of graphical development tools for designing speech applications. The speech app building tools I’ve used so far have seemed great for novices, but become very limiting as you become more expert.
Of course, if the development tool adds a proprietary abstraction layer on top of the XML- and standards-based approach, you lose the primary advantage of portability.