Analysing interactive voice services
Introduction
IVR (Interactive Voice Response) services have been developed during the past decade to provide a more satisfactory alternative to touch-tone systems. Touch-tone enquiry systems ('press 2 for sales') are often disliked by users because their interfaces are inflexible and crude. IVR allows users to do what they expect in a telephone call, namely to speak and to listen. IVR is convenient for users on the move, who may have little more than a mobile telephone. Although WAP (Wireless Application Protocol) is intended to provide Web browsing for mobile users, it has seen only limited use. Some categories of users (e.g., the partially sighted or those without Internet access) are also disadvantaged if information is provided only via the Web.
Although IVR is not new, it was initially supported by a variety of proprietary solutions. VoiceXML (Voice eXtensible Markup Language [30]) has been an important development in the standardisation of IVR. There are competing standards for IVR, but VoiceXML seems to have attracted the most support. The basic idea of VoiceXML is that users 'fill in' fields of forms by speaking in response to prompts. VoiceXML platforms usually include sophisticated support for TTS (Text To Speech, i.e., synthesised speech output) and STT (Speech To Text, i.e., speech recognition). The completed information is then typically submitted to a program or database for further processing. VoiceXML lends itself to a wide variety of applications such as news and sports information, telephone banking, sales enquiries and orders, and travel bookings. For an application such as banking, VoiceXML could provide a voice-based front-end to an existing bank system. There could also be other front-ends to the same system, e.g., for Web browsing or WAP access.
As an application of XML, VoiceXML is textual in form. However, most commercial packages (e.g., Covigo Studio, Nuance V-Builder, Voxeo Designer) provide a graphical representation. VoiceXML has a nested, hierarchical structure that most packages reflect in graphical form. Some representations emphasise the relationship among VoiceXML elements, e.g., the flow of control among the fields of a form. Commercial packages are (not surprisingly) very close to VoiceXML since their aim is direct support of scripting with VoiceXML. As a programming language, VoiceXML focuses on how an IVR service is realised and not what it does. It can therefore be difficult to get a clear overview from VoiceXML of an IVR service.
It is easy, and even common, to write VoiceXML scripts that have implicit loops and complicated logic. To some extent, VoiceXML encourages this because its form interpretation algorithm requires multiple passes through a form. The consequences of certain VoiceXML constructs may not be immediately obvious, e.g., they may cause an indefinite loop.
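The multiple-pass behaviour can be illustrated with a much-simplified sketch in Python. This is not the VoiceXML form interpretation algorithm itself: it omits events, guard conditions and grammars, and the names `interpret_form`, `recognise` and `max_passes` are assumptions made for illustration. Each pass selects the first unfilled field and tries to collect a value for it. Silence or unrecognised speech leaves the field unfilled, so the same field is selected again on the next pass, which is the implicit loop.

```python
from typing import Callable, Dict, List, Optional


def interpret_form(
    fields: List[str],
    recognise: Callable[[str], Optional[str]],
    max_passes: int = 10,
) -> Dict[str, str]:
    """Fill every field of a form, making one pass per prompt.

    `recognise(field)` stands in for prompting the caller and running speech
    recognition; it returns the recognised value, or None for silence or
    unrecognised speech. `max_passes` is an illustrative safety bound and is
    not part of VoiceXML.
    """
    values: Dict[str, str] = {}
    for _ in range(max_passes):
        # Select phase: pick the first field that has no value yet.
        pending = [name for name in fields if name not in values]
        if not pending:
            return values  # every field is filled, so the form is complete
        field = pending[0]
        # Collect phase: prompt for the field and listen for a reply.
        reply = recognise(field)
        if reply is not None:
            values[field] = reply
        # On None the field stays unfilled, so the next pass selects it again.
    raise RuntimeError("form not completed: possible indefinite loop")


# Scripted replies: silence first, then a request, then an account number.
replies = iter([None, "balance", "435671"])
result = interpret_form(["request", "account"], lambda field: next(replies))
print(result)  # prints {'request': 'balance', 'account': '435671'}
```

Without a bound such as `max_passes`, a caller who never gives a recognisable reply would keep the loop running, which is the kind of consequence that is hard to see from the script alone.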
VoiceXML adopts a pragmatic and programmatic approach to development. There is no way to formally check or analyse a VoiceXML script. Instead, VoiceXML must be debugged using traditional software engineering methods.
VoiceXML applications are essentially single scripts, though these can be made up from a number of individual documents (i.e., files). VoiceXML supports unconditional transfers (goto) and subroutine-like calls (subdialog) to other documents. However there is no equivalent of a feature. In fact, VoiceXML does not even use the term service.
In telephony, services are often composed from self-contained features. A feature is an additional function that is triggered automatically (e.g., call diversion or call blocking). From the developer's point of view, a feature is triggered by certain conditions and is not explicitly called at some point in the call processing code. Features can therefore easily add supplementary capabilities to basic call processing. The value of features has been amply demonstrated in the IN (Intelligent Network).
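The difference between triggering a feature and calling it explicitly can be sketched in Python as follows. This is only an illustration under assumed names (`Feature`, `process_call` and the call fields are hypothetical), not a description of how the IN or Cress realises features. The point is that the basic call processing code never names an individual feature; each feature carries its own trigger condition.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

CallState = Dict[str, Any]


@dataclass
class Feature:
    name: str
    trigger: Callable[[CallState], bool]  # condition under which the feature applies
    action: Callable[[CallState], None]   # what the feature does to the call


def process_call(call: CallState, features: List[Feature]) -> List[str]:
    """Basic call processing that never names an individual feature.

    Each feature is applied automatically when its trigger holds.
    Returns the names of the features that were triggered, in order.
    """
    triggered = []
    for feature in features:
        if feature.trigger(call):
            feature.action(call)
            triggered.append(feature.name)
    return triggered


# Two hypothetical features, loosely modelled on call blocking and call diversion.
call_blocking = Feature(
    "call blocking",
    trigger=lambda call: call["caller"] in call["blocked"],
    action=lambda call: call.update(status="rejected"),
)
call_diversion = Feature(
    "call diversion",
    trigger=lambda call: call["status"] == "ringing" and call["callee_busy"],
    action=lambda call: call.update(callee=call["divert_to"]),
)

call = {
    "caller": "1234", "callee": "5678", "callee_busy": True,
    "divert_to": "9999", "blocked": {"0000"}, "status": "ringing",
}
print(process_call(call, [call_blocking, call_diversion]))  # prints ['call diversion']
print(call["callee"])  # prints 9999
```

Adding a further feature means appending to the list, with no change to `process_call`.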
Cress (Chisel Representation Employing Systematic Specification) is a front-end for defining and formalising services. Cress was initially based on the industrial notation Chisel developed by BellCore [1]. However, Cress has been considerably extended since its beginnings. In particular, it supports the notion of plug-in domains: the vocabulary and concepts required for each application area are defined separately. Cress has been demonstrated on services from the IN (Intelligent Network [24]), Internet telephony [25], [27], and IVR (Interactive Voice Response [27], [28]).
Cress aims to combine the advantages of an accessible graphical notation, analysis via translation to formal languages, and realisation via translation to implementation languages. That is, the same service diagrams can be used for multiple purposes. Cress is neutral with respect to the target language. For formal analysis, Cress diagrams are automatically translated to Lotos (Language Of Temporal Ordering Specification [11]) or to SDL (Specification and Description Language [13]); see [28] and [26], respectively. For implementation, Cress diagrams are automatically translated to Perl (for SIP services) or to VoiceXML (for IVR services); see [25] and [27], [28], respectively.
For IVR services, Cress is intended to complement existing VoiceXML platforms. In particular, Cress offers the following:
- Cress is a platform-independent graphical notation for a substantial (but not complete) proportion of IVR applications. A Cress service is represented at a more abstract level than VoiceXML, making it easier to gain an overview of the service. VoiceXML is merely a target language for Cress, so it should be possible to translate Cress diagrams into other IVR languages.
- Cress supports features and services. These are not directly recognised in IVR, so their addition provides useful extra capabilities. Without features, IVR applications have to explicitly call supplementary capabilities.
- It can be difficult to check whether a realistic IVR application will behave correctly in all circumstances (e.g., will not stop prematurely or loop indefinitely). Through translation to a formal language, Cress supports rigorous analysis of IVR services. Cress is also accompanied by a scenario-based testing language that is used to validate IVR applications. The same approach also contributes to detecting feature interactions.
- VoiceXML is not formally defined. Some concepts are only vaguely described (e.g., event handling) and some are loosely defined (e.g., the semantics of expressions and variables). Through translation to a formal language, Cress contributes to a more precise understanding of VoiceXML.
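The two problems mentioned above, stopping prematurely and looping indefinitely, can be made concrete with a small Python sketch. It is not the Lotos- or SDL-based analysis that Cress relies on; it is a plain graph search over a hand-written dialogue graph, and the names `check_dialogue`, `dead_ends` and `looping` as well as the example graph are assumptions made for illustration.

```python
from typing import Dict, List, Set

Graph = Dict[str, List[str]]


def reachable(graph: Graph, start: str) -> Set[str]:
    """Return every node reachable from `start`, including `start` itself."""
    seen: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return seen


def check_dialogue(graph: Graph, start: str, finals: Set[str]) -> Dict[str, Set[str]]:
    """Report where a dialogue can stop prematurely or loop indefinitely."""
    nodes = reachable(graph, start)
    # Premature stop: a reachable node with no way forward that is not a final node.
    dead_ends = {n for n in nodes if not graph.get(n) and n not in finals}
    # Indefinite loop: a reachable node from which neither a final node nor a
    # dead end can be reached, so every path from it continues forever.
    exits = finals | dead_ends
    looping = {n for n in nodes if not (reachable(graph, n) & exits)}
    return {"dead_ends": dead_ends, "looping": looping}


# Hypothetical dialogue graph with a deliberately faulty PIN reprompt.
dialogue = {
    "welcome": ["ask_account"],
    "ask_account": ["ask_pin", "ask_account"],  # reprompt on an unknown account
    "ask_pin": ["say_balance", "retry_pin"],
    "retry_pin": ["retry_pin"],                 # faulty: reprompts forever
    "say_balance": [],
}
print(check_dialogue(dialogue, "welcome", {"say_balance"}))
# prints {'dead_ends': set(), 'looping': {'retry_pin'}}
```

A search of this kind only covers the control structure. Behaviour that depends on variable values or events needs the kind of formal treatment described in the text.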
Graphical notations for services are, of course, fairly common. Although it has a graphical form, SDL (Specification and Description Language [13]) is a general-purpose language that was not designed particularly to represent communications services. MSCs (Message Sequence Charts [12]) are higher-level and more straightforward in their representation of services. UCMs (Use Case Maps [2]) have been used to describe communications services graphically. However none of these approaches has support for specific domains, and they cannot be translated into a range of languages. Perhaps surprisingly, there does not appear to have been other work on graphical or formal specification of IVR services.
As noted earlier, there are a number of commercial tools for VoiceXML. These offer rather more complete support for IVR than Cress. However they are focused on VoiceXML only, and do not offer any kind of formal analysis. Their (graphical) representations of services are very close to VoiceXML, so they are useful only to specialists. Fig. 1 is an example of what VoiceXML looks like in a commercial tool; this corresponds to the Donation service described by Cress in Fig. 2.
Commercial VoiceXML tools do not support rigorous analysis of IVR services. The translation of Cress into Lotos or SDL gives formal meaning to IVR service descriptions. The translation provides access to any analytic technique based on these languages. Among these, the author's own approach [23] is one of several that might be used.
Feature interaction in telephony is a much studied issue (e.g., [7]). The basic problem is that independently designed features can interfere with each other. It has been shown that feature interactions occur in a variety of other domains such as building control [15], email [5], [9], Internet telephony [14], [25], lift control [17], mobile communication [32], multimedia [4], [22], policies [19], and the web [31]. The work reported here shows how feature interaction can arise with IVR.
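One simple form of interference is order dependence: two features react to the same condition, and the outcome depends on which is applied first. The Python sketch below illustrates this with two hypothetical features. It is a generic illustration and not the detection technique used with Cress; all names are assumptions.

```python
from itertools import permutations
from typing import Callable, Dict, Sequence

State = Dict[str, str]
Feature = Callable[[State], State]


def divert_when_busy(state: State) -> State:
    """Hypothetical feature: redirect a call for a busy line to another number."""
    if state["line"] == "busy":
        return {**state, "callee": state["divert_to"], "line": "free"}
    return state


def voicemail_when_busy(state: State) -> State:
    """Hypothetical feature: send a call for a busy line to voicemail."""
    if state["line"] == "busy":
        return {**state, "callee": "voicemail", "line": "free"}
    return state


def order_dependent(features: Sequence[Feature], initial: State) -> bool:
    """Return True if different application orders give different results."""
    outcomes = []
    for order in permutations(features):
        state = dict(initial)
        for feature in order:
            state = feature(state)
        outcomes.append(state)
    return any(outcome != outcomes[0] for outcome in outcomes[1:])


call = {"callee": "5678", "line": "busy", "divert_to": "9999"}
print(order_dependent([divert_when_busy, voicemail_when_busy], call))  # prints True
```

Each feature behaves as intended on its own. Only the combination is ambiguous, which is why such problems tend to surface when independently designed features are deployed together.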
The new contributions made by this paper are the application of Cress to IVR services and features, the rigorous analysis of IVR applications, and the analysis of feature interactions in IVR. Section 2 introduces IVR and its realisation using VoiceXML. Section 3 gives an overview of the Cress notation as used to describe IVR services. Section 4 describes how IVR services are analysed, including the use of observer processes and a specialised test notation. Section 5 discusses the nature of feature interaction in IVR, and shows how Cress can be used to discover feature interactions.
Section snippets
Interactive voice response systems
As an example of IVR, the following hypothetical dialogue might occur with a telephone banking system:
- System: You have called the Automated Phone Bank. What would you like to do?
- User: (silence)
- System: You can ask for your balance, request a statement, or close your account
- User: My balance please
- System: What is your account number?
- User: Four eight five six seven one
- System: There is no account with this number, please try again
- User: Four three five six seven one
- System: What is the PIN for this account?
- User: Five three eight one
- System: Your balance is seven hundred and fifty
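The dialogue above can be replayed as a scripted scenario, as in the following Python sketch. The prompts are taken from the dialogue; everything else is an assumption made for illustration, including the function name `phone_bank`, the account table, and the simplification that speech recognition has already reduced each user utterance to a keyword or digit string (with `None` standing for silence).

```python
from typing import Iterator, List, Optional

# Hypothetical account data chosen to match the dialogue:
# account number -> (PIN, balance as spoken).
ACCOUNTS = {"435671": ("5381", "seven hundred and fifty")}


def phone_bank(replies: Iterator[Optional[str]]) -> List[str]:
    """Run the balance enquiry against scripted replies and return what the system says."""
    said = ["You have called the Automated Phone Bank. What would you like to do?"]
    # Simplification: any reply other than a balance request gets the help prompt.
    while next(replies) != "balance":
        said.append("You can ask for your balance, request a statement, or close your account")
    said.append("What is your account number?")
    account = next(replies)
    while account not in ACCOUNTS:
        said.append("There is no account with this number, please try again")
        account = next(replies)
    said.append("What is the PIN for this account?")
    pin, balance = ACCOUNTS[account]
    # The dialogue does not show a wrong PIN, so that case simply ends here.
    if next(replies) == pin:
        said.append(f"Your balance is {balance}")
    return said


# The user's side of the dialogue: silence, request, wrong account, right account, PIN.
transcript = phone_bank(iter([None, "balance", "485671", "435671", "5381"]))
print(transcript[-1])  # prints Your balance is seven hundred and fifty
```

If the scripted replies run out before the dialogue finishes, `next` raises `StopIteration`, which in a test corresponds to a scenario that ends too early.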
The Cress notation
Cress is a graphical notation for describing the possible behaviour of a service. State is intentionally implicit in Cress because this allows more abstract descriptions to be given. Arcs between states may be guarded by event conditions or by value conditions. Cress has explicit support for defining and composing features. Cress also has plug-in vocabularies that adapt it for different application domains. These allow Cress diagrams to be thoroughly checked for syntactic and static semantic
Analysis in general
An IVR application can be executed like any script. Some commercial packages allow VoiceXML to be run in an offline IDE, while others require the script to be run by an online environment. In either case, debugging follows typical programming practice. This is, of course, time-consuming and risks undetected errors. Since Cress diagrams can be translated into Lotos and SDL, this offers new possibilities for automated analysis. For illustration, this paper concentrates on what can be done with L
Categories of IVR feature interaction
It has been seen how the integrity of an IVR application can be checked through the use of observer processes and tests. The term 'feature' is used loosely in the following to mean any addition to the base application, as well as to mean a Cress feature diagram. The addition of further features to an IVR application can lead to interactions in much the same way as for telephony. However, the nature of interactions is rather different for IVR. The following categories of feature interactions can
Conclusion
The nature of IVR services and their representation in VoiceXML have been explained. Cress has been introduced as a general graphical notation for services, with particular emphasis on IVR. Cress is formalised through translation to languages like Lotos (the focus of this paper) and SDL. However Cress can also be translated for implementation into languages like VoiceXML (the focus of this paper) and Perl.
Cress offers the following benefits for IVR development:
- platform and language independence;
Acknowledgements
Nuance Corporation kindly provided an academic licence for use of Nuance V-Builder™ in this work.
References (32)
- A temporal logic of concurrent programs, Theoretical Computer Science (1981)
- A.V. Aho, S. Gallagher, N.D. Griffeth, C.R. Schell, D.F. Swayne, SCF3/Sculptor with Chisel: requirements engineering...
- D. Amyot, L. Charfi, N. Gorse, T. Gray, L.M.S. Logrippo, J. Sincennes, B. Stepien, T. Stepien, T. Ware, Feature...
- et al., The temporal logic of branching time, Acta Informatica (1983)
- L. Blair, J. Pang, Feature interactions – Life beyond traditional telephony, in: M.H. Calder, E.H. Magill (Eds.),...
- M. Calder, A. Miller, Generalising feature interactions in email, in: D. Amyot, L. Logrippo (Eds.), Proceedings of the...
- M. Calder, C.E. Shankland, A symbolic semantics and bisimulation for full Lotos, in: M. Kim, B. Chin, S. Kang, D. Lee...
- et al., A feature-interaction benchmark for IN and beyond, IEEE Communications Magazine (1993)
- J.-C. Fernández, H. Garavel, A. Kerbrat, R. Mateescu, L. Mounier, M. Sighireanu, CADP (Cæsar Aldébaran Development...
- R.J. Hall, Feature interactions in electronic mail, in: M.H. Calder, E.H. Magill (Eds.), Proceedings of the 6th Feature...
Kenneth J. Turner graduated in Electrical Engineering from the University of Glasgow in 1970. He was awarded a Ph.D. from the University of Edinburgh in 1974 for his research on Pattern Recognition. Until 1986 he was employed by International Computers Ltd. as a data communications consultant. During this period he specialised in systems architecture, data communications and formal methods. This led to his appointment as Professor of Computing Science at the University of Stirling in 1987. His research interests include voice services and formalising systems architecture, mainly using the standardised Formal Description Techniques LOTOS and SDL.