Transcription and TTS

The AI-based applications used in CXone Mpower work with text from interactions with contacts (the people interacting with an agent, IVR, or bot in your contact center). The audio from interactions on voice channels must be converted to text so the AI applications can work with it. This is done using transcription services, also known as speech-to-text (STT). A transcription is the written form of all or part of a voice or digital interaction. After analyzing the text, the AI applications can provide the responses they're designed to give.

The AI applications' responses are provided in text format. However, virtual agents need to convert this text to audio that can be played for the contact. This allows the virtual agents to "speak" with contacts. This conversion is done using text-to-speech (TTS) services, which use a computer-generated voice to speak content that was entered as text.

Working with transcription and TTS in CXone Mpower requires custom Studio scripting. The script manages the capture of interaction audio and sends it to the transcription service and the destination application. The script also manages the application's responses, including sending them to the TTS service, if needed. The required scripting varies by use case and is described in the online help for setting up each virtual agent or agent assist integration.
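The flow that the script manages can be pictured with a small sketch. This is not Studio script code and does not use any CXone Mpower API. It is a minimal Python illustration, and every name in it (VoiceTurnPipeline, handle_turn, and the fake_stt, fake_bot, and fake_tts stubs) is a hypothetical stand-in chosen for this example. It only shows the order of the steps: contact audio goes to a transcription service, the resulting text goes to the AI application, and the text response is optionally converted to audio.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# All names below are hypothetical stand-ins, not CXone Mpower or Studio APIs.


@dataclass
class VoiceTurnPipeline:
    transcribe: Callable[[bytes], str]     # STT step: audio -> text
    ask_application: Callable[[str], str]  # AI application: text -> text response
    synthesize: Callable[[str], bytes]     # TTS step: text -> audio

    def handle_turn(
        self, contact_audio: bytes, needs_audio: bool = True
    ) -> Tuple[str, str, Optional[bytes]]:
        """Run one contact utterance through STT, the application, and optionally TTS."""
        transcript = self.transcribe(contact_audio)
        reply_text = self.ask_application(transcript)
        # Only voice-facing use cases (such as virtual agents) need audio back.
        reply_audio = self.synthesize(reply_text) if needs_audio else None
        return transcript, reply_text, reply_audio


# Stub services so the sketch runs without any external dependency.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")  # pretend the audio bytes are already words


def fake_bot(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")  # pretend the encoded text is synthesized audio


pipeline = VoiceTurnPipeline(fake_stt, fake_bot, fake_tts)
transcript, reply_text, reply_audio = pipeline.handle_turn(b"check my balance")
```

In this sketch the needs_audio flag mirrors the "if needed" condition: a use case that only consumes text, such as one that shows suggestions to a live agent, could skip the TTS step entirely.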

TTS

Text-to-speech converts written words to audio in the form of computer-generated voices. AI helps the computer-generated output sound more human by reproducing natural-sounding intonation, stress, pacing, and pronunciation. In CXone Mpower, TTS is used in IVR (interactive voice response) menus and virtual agent integrations. An IVR is an automated phone menu that contacts use via voice or key inputs to obtain information, route an inbound voice call, or both. A virtual agent is a software application that handles customer interactions in place of a live human agent.

For TTS, you can use third-party TTS services or the native TTS service.