According to Dave Raggett (W3C Voice Browser Activity Lead), "VoiceXML 2.0 has the power to change the way phone-based information and customer services are developed: no longer will we have to press one for this or two for that; instead, we will be able to make selections and provide information by speech. VoiceXML 2.0 also creates opportunities for people with visual impairments or those needing Web access while keeping their hands and eyes free for other things, such as getting directions while driving." W3C 'Proposed Recommendation' status signifies that the Working Group has successfully completed both public and W3C Working Group review, and has provided evidence of successful interoperable implementations; at least eight known implementations in both prototype and fully released products are available for VoiceXML Version 2.0.
Separately, the VoiceXML Forum has announced support for the W3C VoiceXML 2.0 PR version, and has released XHTML+Voice Profile Version 1.2. The XHTML+Voice profile "brings spoken interaction to standard web content by integrating the mature XHTML and XML-Events technologies with XML vocabularies developed as part of the W3C Speech Interface Framework. This profile includes voice modules that support speech synthesis, speech dialogs, command and control, and speech grammars."
XHTML+Voice Profile 1.2 (http://www.voicexml.org/specs/multimodal/x+v/12/)
VoiceXML Forum. 3-February-2004.
Abstract: "The XHTML+Voice profile brings spoken interaction to standard web content by integrating the mature XHTML and XML-Events technologies with XML vocabularies developed as part of the W3C Speech Interface Framework. The profile includes voice modules that support speech synthesis, speech dialogs, command and control, and speech grammars. Voice handlers can be attached to XHTML elements and respond to specific DOM events, thereby reusing the event model familiar to web developers. Voice interaction features are integrated with XHTML and CSS and can consequently be used directly within XHTML content."
"VoiceXML's main goal is to bring the full power of web development and content delivery to voice response applications, and to free the authors of such applications from low-level programming and resource management. It enables integration of voice services with data services using the familiar client-server paradigm. A voice service is viewed as a sequence of interaction dialogs between a user and an implementation platform. The dialogs are provided by document servers, which may be external to the implementation platform. Document servers maintain overall service logic, perform database and legacy system operations, and produce dialogs. A VoiceXML document specifies each interaction dialog to be conducted by a VoiceXML interpreter. User input affects dialog interpretation and is collected into requests submitted to a document server. The document server replies with another VoiceXML document to continue the user's session with other dialogs...
[According to the architectural model,] a document server (e.g. a web server) processes requests from a client application, the VoiceXML Interpreter, through the VoiceXML interpreter context. The server produces VoiceXML documents in reply, which are processed by the VoiceXML Interpreter. The VoiceXML interpreter context may monitor user inputs in parallel with the VoiceXML interpreter. For example, one VoiceXML interpreter context may always listen for a special escape phrase that takes the user to a high-level personal assistant, and another may listen for escape phrases that alter user preferences like volume or text-to-speech characteristics. The implementation platform is controlled by the VoiceXML interpreter context and by the VoiceXML interpreter. For instance, in an interactive voice response application, the VoiceXML interpreter context may be responsible for detecting an incoming call, acquiring the initial VoiceXML document, and answering the call, while the VoiceXML interpreter conducts the dialog after answer. The implementation platform generates events in response to user actions (e.g. spoken or character input received, disconnect) and system events (e.g. timer expiration). Some of these events are acted upon by the VoiceXML interpreter itself, as specified by the VoiceXML document, while others are acted upon by the VoiceXML interpreter context..." [adapted]
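The request/reply cycle between the interpreter and the document server can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the specification: the function names `build_dialog` and `document_server`, the `account` field, and the `/next` URL are all hypothetical. The VoiceXML elements used (`vxml`, `form`, `field`, `prompt`, `block`, `submit`) are from VoiceXML 2.0, and the sketch assumes the platform supports the built-in `digits` field type.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"
# Serialize VoiceXML elements in the default namespace (no prefix).
ET.register_namespace("", VXML_NS)


def q(tag):
    """Return the namespace-qualified name of a VoiceXML element."""
    return "{%s}%s" % (VXML_NS, tag)


def build_dialog(prompt_text, field_name=None, submit_url=None):
    """Build a one-form VoiceXML 2.0 document and return it as a string."""
    root = ET.Element(q("vxml"), {"version": "2.0"})
    form = ET.SubElement(root, q("form"))
    if field_name is None:
        # Output-only dialog: a block whose text is spoken to the caller.
        ET.SubElement(form, q("block")).text = prompt_text
    else:
        # Input dialog: collect one value, then submit it to the server.
        field = ET.SubElement(
            form, q("field"), {"name": field_name, "type": "digits"}
        )
        ET.SubElement(field, q("prompt")).text = prompt_text
        block = ET.SubElement(form, q("block"))
        ET.SubElement(
            block, q("submit"), {"next": submit_url, "namelist": field_name}
        )
    return ET.tostring(root, encoding="unicode")


def document_server(request_params):
    """Hypothetical service logic: pick the next dialog from the request."""
    account = request_params.get("account")
    if account is None:
        # First request of the session: ask for the account number.
        return build_dialog(
            "Please say or key in your account number.",
            field_name="account",
            submit_url="/next",
        )
    # Follow-up request carrying the collected input: reply with a new dialog.
    return build_dialog("Thank you. Looking up account %s." % account)


first_document = document_server({})
second_document = document_server({"account": "1234"})
```

Here `first_document` is what the interpreter would fetch when the call starts, and `second_document` is the reply to the request produced by `submit` once the caller has supplied a value. A real deployment would put this logic behind a web server and add grammars, error handling, and call control, none of which are shown.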
"The XML Events module provides XML host languages the ability to uniformly integrate event listeners and associated event handlers with Document Object Model (DOM) Level 2 event interfaces. The result is an event syntax for XHTML-based languages that enables an interoperable way of associating behaviors with document-level markup.
VoiceXML 2.0 has been designed for creating audio dialogs that feature synthesized speech, digitized audio, recognition of spoken and DTMF key input, recording of spoken input, telephony, and mixed-initiative conversations. In this document, VoiceXML 2.0 is modularized to prepare it for integration into the XHTML family of languages using the XHTML modularization framework. The modules that combine to support speech dialogs for updating XHTML forms and form elements are selected to be added to XHTML. The modules are described as well as the integration issues. The modularization of VoiceXML 2.0 also specifies DOM event types specific to voice interaction for use with the XML Events module. Speech dialogs authored in VoiceXML 2.0 can then be treated as event handlers to add voice-interaction specific behaviors to XHTML documents. The language integration supports all of the modules defined in XHTML Modularization, and adds speech interaction functionality to XHTML elements to enable multimodal applications. The document type defined by the XHTML+Voice profile is XHTML Host language document type conformant..." [from the 'Introduction']
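To make the idea of "speech dialogs as event handlers" concrete, the following Python sketch holds a small XHTML+Voice style page as a string and resolves which voice form each XML Events attribute points to. It is a simplified illustration, not an example taken from the profile: the page content, the `ask_city` id, the `lookup` action, and the helper `voice_handlers` are hypothetical, and the page omits the DOCTYPE, grammars, and field synchronization a real X+V document would need. The namespaces and the `ev:event` / `ev:handler` attributes are those of XHTML, VoiceXML 2.0, and XML Events.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"
EV_NS = "http://www.w3.org/2001/xml-events"

# A minimal page: the voice form lives in <head>, and the text input
# names it as the handler for its "focus" event.
XV_PAGE = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:vxml="http://www.w3.org/2001/vxml"
      xmlns:ev="http://www.w3.org/2001/xml-events">
  <head>
    <title>City lookup</title>
    <vxml:form id="ask_city">
      <vxml:block>Which city are you travelling to?</vxml:block>
    </vxml:form>
  </head>
  <body>
    <form action="lookup">
      <input type="text" name="city"
             ev:event="focus" ev:handler="#ask_city"/>
    </form>
  </body>
</html>"""


def voice_handlers(page_source):
    """Map (element name, event type) to the vxml:form element it triggers."""
    root = ET.fromstring(page_source)
    # Index the voice forms by id so handler references can be resolved.
    forms = {f.get("id"): f for f in root.iter("{%s}form" % VXML_NS)}
    bindings = {}
    for element in root.iter():
        event = element.get("{%s}event" % EV_NS)
        handler = element.get("{%s}handler" % EV_NS)
        if event and handler:
            # The handler is a fragment reference such as "#ask_city".
            bindings[(element.get("name"), event)] = forms[handler.lstrip("#")]
    return bindings


bindings = voice_handlers(XV_PAGE)
```

After this runs, `bindings` has a single entry keyed by `("city", "focus")` whose value is the `ask_city` voice form, mirroring how a multimodal browser would start that dialog when the input field receives focus.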
Copyright © 2005 voice-commands.com. All Rights Reserved.