Cloud TTS

CXone Cloud TTS allows you to manage all of your Cloud TTS profiles in one place. Cloud TTS converts text into spoken output delivered by synthesized voices. A Cloud TTS profile defines a voice and language combination. This service can be used with IVRs (automated phone menus that allow callers to interact through voice commands, key inputs, or both, to obtain information, route an inbound voice call, or both) in CXone. For example, you can add multiple language options to your IVR.

Cloud TTS is a separate text-to-speech offering from the TTS service provided with Studio actions such as Play.

Classics, Inc. recently expanded its bookselling operation into new regions. Anne Shirley, the CXone administrator, starts setting up IVR menus in scripts for the new regions. She discovers some gaps in the default text-to-speech languages that CXone offers. Anne learns that with Cloud TTS, she can choose a TTS provider that offers the languages she requires. She likes that the TTS providers offer a wide range of voices to choose from.

SSML Support

Cloud TTS supports the use of Speech Synthesis Markup Language (SSML). SSML is an XML-based markup language that allows you to specify many aspects of how text is synthesized into speech. You can use it to fine-tune pronunciation, rate of speech, voice pitch, volume, and more.
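Because SSML is XML-based, characters such as & and < in dynamic text must be escaped before the text is placed inside SSML markup. The following Python sketch illustrates one way to do that using only the standard library. The function name build_ssml and the default language are assumptions made for this illustration; they are not part of CXone or of any TTS provider's API.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text: str, lang: str = "en-US") -> str:
    """Wrap plain text in a <speak> element (hypothetical helper).

    escape() replaces &, <, and > with XML entities so the result
    stays well-formed; quoteattr() quotes the attribute value safely.
    """
    return f"<speak xml:lang={quoteattr(lang)}>{escape(text)}</speak>"


if __name__ == "__main__":
    print(build_ssml("Terms & conditions apply."))
```

The result is a single string that could be supplied wherever your script expects SSML text. How that string reaches the TTS action depends on your script design and is not shown here.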

To use SSML, text input must be:

  • Valid XML
  • Valid SSML
  • Contained within a set of <speak> </speak> tags
  • Marked up with tags that each have only one attribute (this includes the <speak> tag)

For example: 

<speak xml:lang="en-US">

Here are <say-as interpret-as="characters">SSML</say-as> samples.

I can pause <break time="3s"/>.

I can say cardinal numbers. This number is <say-as interpret-as="cardinal">1135</say-as>.

Or I can say ordinal numbers. You are <say-as interpret-as="ordinal">1135</say-as> in line.

I can even say numbers as digits. The digits are <say-as interpret-as="characters">1135</say-as>.

I can also substitute phrases, like the <sub alias="World Wide Web Consortium">W3C</sub>.

</speak>
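Some of the input requirements listed above can be checked before the text is used in a script. The following Python sketch tests three of them: that the input is well-formed XML, that the root element is <speak>, and that no tag has more than one attribute. It does not verify that the markup is valid SSML for a given provider. The function name check_ssml_input is an assumption made for this illustration and is not a CXone feature.

```python
import xml.etree.ElementTree as ET
from typing import List


def check_ssml_input(text: str) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError as exc:
        # Not well-formed XML, so no further checks are possible.
        return [f"not well-formed XML: {exc}"]

    problems = []

    # Drop any "{namespace}" prefix that ElementTree adds to tag names.
    root_name = root.tag.rsplit("}", 1)[-1]
    if root_name != "speak":
        problems.append(f"root element is <{root_name}>, expected <speak>")

    # root.iter() visits the root element and every descendant.
    for elem in root.iter():
        if len(elem.attrib) > 1:
            name = elem.tag.rsplit("}", 1)[-1]
            problems.append(
                f"<{name}> has {len(elem.attrib)} attributes; only one is allowed"
            )
    return problems


if __name__ == "__main__":
    sample = '<speak xml:lang="en-US">I can pause <break time="3s"/>.</speak>'
    print(check_ssml_input(sample))
```

An empty list means only that these three checks passed. Provider-specific SSML rules still apply.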

In your scripts, use the markup language that your TTS provider supports. Other TTS markup may not work. Refer to the Google TTS documentation for information about any SSML variations or requirements specific to Google.

TTS Providers

Content in this section is for a product or feature in controlled release (CR). If you are not part of the CR group and would like more information, contact your CXone Account Representative.

CXone Cloud TTS uses third-party TTS providers. You can choose which of the supported providers you want to use. You can also choose the language and voice that Cloud TTS uses. Supported languages vary by TTS provider.

Currently, CXone supports the following providers:

  • AWS Polly TTS (Controlled Release)
  • Google TTS
  • Google Custom Voice TTS

Supported Languages and Voices

Each TTS provider offers a different set of languages. For each language, the provider offers one or more voices that you can choose from. The selection of languages and voices can change at any time. To see the most up-to-date list of supported languages, check the documentation for each TTS provider.

If you need TTS in more than one language, you can add multiple TTSVOICE actions to your Studio scripts and configure each one to use a different voice. Each action can use a different TTS provider, if needed.