Automatic Speech Recognition (ASR)

Automatic Speech Recognition (ASR) allows contacts to respond to IVR prompts by speaking. An IVR is an automated phone menu that allows callers to interact through voice commands, key inputs, or both, to obtain information, route an inbound voice call, or both. You can use ASR in your IVR scripts either instead of or in addition to a DTMF-based menu system. DTMF refers to the signaling tones that are generated when a user presses or taps a key on their telephone keypad. ASR can simplify and speed up contacts' experience with your IVR. An ASR-enabled IVR should recognize both words and phrases. It can match them with values you have predefined and route or answer calls accordingly.

In CXone, ASR is an optional feature. It uses version 11 of the Nuance ASR engine, which enhances the accuracy of your system's voice recognition. The Nuance ASR engine can also record in stereo.

An existing understanding of automatic speech recognition and the Nuance ASR engine is crucial for creating an effective ASR-enabled IVR system. Complete documentation for using this engine is available from Nuance.

ASR Terminology

Before working with ASR in CXone, you should be familiar with the following ASR-related terminology:

  • Utterance: Words or phrases spoken by a contact in response to IVR prompts.
  • Grammar file: The file that provides rules for the ASR engine. It covers the words or phrases contacts can be expected to say in response to a prompt, then assigns content to variables based on those responses. This makes the recognition process more efficient, and gives higher rates of accuracy. You can learn more about grammar files on the ASR Management page.
  • Phrase list: A list of phrases that callers can be expected to say in response to a prompt. Each phrase in the list should be on a separate line. Phrase lists are typically entered using the PhraseList property of an ASR Studio action.
  • Confidence percentage: Also known as recognition percentage. When the ASR engine recognizes a phrase spoken by a caller, it returns a percentage that indicates how confident it is in its matching of the utterance to an item in the phrase list or grammar file. The confidence percentage can be used to route calls to different branches in your ASR-enabled IVR script. You can learn more about confidence percentages on the ASR Management page.
  • Tuning: The process of capturing data about your ASR system. The ASR tuning report provides information you can use to enhance and improve your grammar files. This helps improve the accuracy of the system. You can learn more about tuning on the ASR Management page.
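
The way a confidence percentage can drive branching is sketched below in Python. This is an illustration only, not Studio script syntax: the function name, the branch names (accept, confirm, reprompt), and the 80 and 50 thresholds are all assumptions chosen for the example, not CXone or Nuance defaults.

```python
def route_by_confidence(confidence: int, high: int = 80, low: int = 50) -> str:
    """Pick a branch name from an ASR confidence percentage (0-100).

    The thresholds and branch names are hypothetical; choose values that
    suit your own grammar files and tuning results.
    """
    if not 0 <= confidence <= 100:
        raise ValueError("confidence must be between 0 and 100")
    if confidence >= high:
        return "accept"    # trust the match and continue
    if confidence >= low:
        return "confirm"   # read the match back and ask the caller to confirm
    return "reprompt"      # too uncertain: ask the caller to repeat


print(route_by_confidence(92))
print(route_by_confidence(65))
print(route_by_confidence(20))
```

A three-way split like this keeps high-confidence callers moving while giving borderline matches a confirmation step instead of a full reprompt.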

ASR Actions

Studio has several ASR actions for use in IVR scripts. There are two general actions and others that are designed for specific types of prompts. Additionally, there are two ASR actions that allow you to build custom grammar files.

Except for the actions that build grammar files, all ASR actions allow you to:

The ASR actions in Studio are:

  • Asr: A general ASR action that accepts any type of utterance and interprets it based on a custom phrase list or grammar file you provide. This action offers a great deal of flexibility but is also more complicated to set up.
  • Asralphanum: Accepts an utterance that's a combination of letters, numbers, or both, such as a password or email address. This action comes with a built-in grammar file.
  • Asrcurrency: Accepts an utterance of a monetary value, such as a payment amount. This action comes with a built-in grammar file for one or more currencies, based on the Nuance language pack for your .
  • Asrdate: Accepts a variety of utterances related to dates, based on its built-in grammar file. This includes full dates, days of the week, and relative date references such as yesterday.
  • Asrdigits: Accepts an utterance of a string of digits, such as a phone number or ID number. This action comes with a built-in grammar file.
  • Asrmenu: A general action that accepts utterances you define to create a speech-enabled menu. This action can use a custom phrase list or grammar file. You can use the branch variables you create for the menu itself as a basis for interpreting the caller's utterance.
  • Asrnumber: Accepts an utterance of a numeric value. For example, an utterance of "five six" would be interpreted by this action as "fifty-six." Use the Asrdigits action if you want an utterance interpreted as a string of separate digits, such as "five" and "six." This action comes with a built-in grammar file.
  • Asrtime: Accepts a variety of utterances related to time, based on its built-in grammar file. This includes durations (such as "twelve hours") in addition to specific times (such as "three p m").
  • Asryesno: Accepts positive or negative utterances based on its built-in grammar file. For example, there are multiple variations on how a contact might say "yes" (yes, yeah, yep, yup, okay, and so on). This action recognizes such variations.
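
The difference between a digit string (as with Asrdigits) and a numeric value (as with Asrnumber) affects what your script receives. The Python sketch below illustrates the general distinction only. It does not reproduce the built-in Nuance grammars, and the word table and function names are assumptions made for the example.

```python
# Hypothetical word table; real grammar files cover far more variations.
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7", "eight": "8",
    "nine": "9",
}


def as_digit_string(utterance: str) -> str:
    """Keep every spoken digit, in order, as text (suits IDs and phone numbers)."""
    return "".join(DIGIT_WORDS[word] for word in utterance.lower().split())


def as_number(utterance: str) -> int:
    """Collapse the spoken digits into one numeric value (suits quantities)."""
    return int(as_digit_string(utterance))


print(as_digit_string("zero five six"))  # digit string keeps the leading zero
print(as_number("zero five six"))        # numeric value does not
```

Identifiers such as account numbers are usually safer as digit strings, because a leading zero is part of the value.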

Several of the ASR actions offer a choice between spoken and DTMF input. In some cases, DTMF may provide a better caller experience. For example, entering a social security number is just as easy as speaking it, and may be easier for the system to interpret.

Actions to Build Custom Grammar Files

There are two actions in Studio that you can use to build a custom grammar file from an existing database. This is helpful if your IVR needs to ask contacts for the name of an employee in your organization or the part number for a product in your organization's catalog. You can use the databases you already have to build grammar files for use in your IVR.

The Studio actions you can use to build custom grammar files are: 

  • Asrcompile: Used to compile custom grammar files into the GRAM format used by the Nuance ASR engine. This action is used in scripts that are run once, or at most, on an occasional basis. The script can be used to process existing GRXML files or in combination with Asrsql to create a new custom grammar file.
  • Asrsql: Works with DB Connector to pull a file of values from an existing database. This file can then be formatted and compiled into a grammar file for your ASR-enabled IVR. DB Connector is a service that acts as a gateway between CXone and a database.

When updating a grammar file, rename the file before using it in production scripts. This helps avoid conflicts during the update process. It also leaves the original file as a backup in case you need to revert for any reason. You can use variable substitution when specifying the grammar file name in ASR actions in your scripts.
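
One way to follow the renaming advice is to give each update a dated file name and keep the current name in a single variable. The Python sketch below assumes a base-name-plus-date pattern and a grxml extension. Both are illustrative choices, not a CXone requirement.

```python
from datetime import date


def versioned_grammar_name(base: str, on: date, ext: str = "grxml") -> str:
    """Build a dated grammar file name, for example employees_20240305.grxml.

    The naming pattern is an assumption for this sketch; any scheme that
    leaves the previous file untouched serves the same purpose.
    """
    return f"{base}_{on:%Y%m%d}.{ext}"


# Hypothetical variable that scripts would reference instead of a fixed name.
current_grammar = versioned_grammar_name("employees", date(2024, 3, 5))
print(current_grammar)
```

Because the earlier file keeps its own name, reverting means pointing the variable back at the previous value.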

Localization and ASR

Languages available for speech recognition vary depending on where your is housed. You can set the language with the Voiceparams Studio action in your script. Ask your CXone Account Representative for more information.

If your organization plans to use ASR to support more than one language, keep the following in mind:

  • Throughout parsing, "english" is hard-coded.

  • In parsing money, only "$" is supported.

  • In parsing money, '.' is always used to check for fractional values. ',' is not supported.

  • In pronouncing money, "dollars" and "cents" are hard-coded.

  • In pronouncing numbers, "negative" is hard-coded.

  • In pronouncing numbers, "point" is hard-coded.

  • ReadString is not localized (it reads English words).
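
To make the money constraints above concrete, the Python sketch below accepts and speaks only the forms the list describes: a "$" symbol, "." as the fractional separator, and the hard-coded words "dollars" and "cents". It is an illustration written for this page, not the CXone implementation, and the function names are assumptions.

```python
import re
from decimal import Decimal

# "$" followed by digits, with an optional "." and fractional digits.
_MONEY = re.compile(r"\$(\d+(?:\.\d+)?)")


def parse_money(text: str) -> Decimal:
    """Parse values such as "$12.50"; reject other symbols and "," separators."""
    match = _MONEY.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unsupported money format: {text!r}")
    return Decimal(match.group(1))


def say_money(amount: Decimal) -> str:
    """Render an amount with English wording only; fractions of a cent are dropped."""
    dollars = int(amount)
    cents = int((amount - dollars) * 100)
    return f"{dollars} dollars and {cents} cents"


print(say_money(parse_money("$12.50")))
```

Inputs such as "€12,50" or "$12,50" raise ValueError here, which mirrors the kind of value a multilingual script would need to normalize before parsing.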

Key Facts about ASR in CXone

As you develop ASR-enabled IVR scripts, keep the following in mind:

  • Grammar files should be used for most ASR Studio actions.
  • You can fine-tune the ASR settings for each script (or even before/after individual ASR actions) by setting a nuanceTuningParamsJson variable with a Snippet action.
  • Scripts should include routing in case there is a failure in the ASR functionality. For example, in the event of a failure, before ending the interaction, you could have the script revert to DTMF-only mode. You could also have the script play a message informing the contact of the problem.
  • The NICE CXone Professional Services group can assist you in developing ASR-enabled IVR scripts and their components. They can help with things such as creating custom grammar files built from your existing database. Contact your CXone Account Representative for more information about this service.
  • To ensure high availability of grammar files and reduce the potential for ASR timeout issues:

    • Large grammar files are proactively cached.
    • Grammar files are replicated to all ASR servers when they are compiled.
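
The failure-routing advice in the list above (revert to DTMF-only mode and tell the contact what happened) can be sketched in Python as follows. Studio scripts express this with action branches rather than code, so the exception class, the callables, and the message text are all hypothetical stand-ins.

```python
from typing import Callable, Tuple


class AsrUnavailable(Exception):
    """Hypothetical stand-in for an ASR action taking its error branch."""


def collect_input(
    ask_by_voice: Callable[[], str],
    ask_by_keypad: Callable[[], str],
    play_message: Callable[[str], None],
) -> Tuple[str, str]:
    """Try spoken input first; on ASR failure, inform the contact and use DTMF."""
    try:
        return "asr", ask_by_voice()
    except AsrUnavailable:
        play_message("Voice input is unavailable. Please use your keypad.")
        return "dtmf", ask_by_keypad()


# Usage example with a simulated ASR outage.
def failing_voice() -> str:
    raise AsrUnavailable()


played = []
mode, value = collect_input(failing_voice, lambda: "1234", played.append)
print(mode, value)
```

Returning the input mode along with the value lets later steps stay in DTMF-only mode for the rest of the interaction.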