eSpeak Text to Speech

Convert text to speech for output on default audio device


Convert text to speech using the eSpeak speech synthesizer. The synthesized speech is output by the default audio device.

The input port accepts a uint8 array containing the ASCII text to be converted: uint8('character vector').

For example, you can send uint8('Hello world') or uint8('1 2 3') to the input. Sending [1 2 3] to the input does not work.
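The difference between these inputs can be seen by inspecting the values each expression produces. The following is a short sketch in MATLAB; the variable names are illustrative only. The likely reason [1 2 3] fails is that the numeric values 1, 2, and 3 are ASCII control codes, not the codes for the digit characters.

```matlab
% Text converted to ASCII codes, as the input port expects
msg  = uint8('Hello world');   % 72 101 108 108 111 32 119 111 114 108 100
nums = uint8('1 2 3');         % 49 32 50 32 51 (digit characters and spaces)

% Numeric values passed directly are NOT the same as digit characters
raw  = uint8([1 2 3]);         % 1 2 3 (ASCII control codes, not '1' '2' '3')

% To speak a numeric value, convert it to text first
fromNumber = uint8(num2str(42));   % 52 50
```

In a model, a Constant block with the value uint8('Hello world') is one simple way to feed such an array to the input port.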

If you run a simulation of a model that contains this block without the target hardware, this block does nothing. See Block Produces Zeros or Does Nothing in Simulation.

Use this block inside a triggered subsystem that lets the text-to-speech conversion finish before it runs the block again.

For an example of how to use this block, open the *motion_sensor_texttospeech.slx model in the “Motion Sensor” example for this support package. After the PIR sensor detects motion, this model holds the GPIO pin high for five seconds, giving the text-to-speech conversion enough time to finish before the block can run again. Without the five-second interval, multiple instances of the text-to-speech conversion can run at the same time, producing garbled output that contains pauses.

When run, the block launches a background process that performs the conversion. The block is free to run again almost immediately. Unless prevented from doing so, the block can launch multiple background processes that run concurrently. Although the Linux sound driver mixes the sound from those processes, the resulting output is unsatisfactory. Running too many processes concurrently produces errors.
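One way to keep the block from launching overlapping processes is a small gate that passes a trigger and then suppresses further triggers for a fixed number of samples. The following is a minimal sketch, assuming the code runs in a MATLAB Function block whose output drives the trigger port of the subsystem that contains this block. The function name speechGate and the inputs motion and holdSamples are hypothetical and are not part of the support package. This is an alternative to the example model's approach of holding the GPIO pin high, not a description of it.

```matlab
function fire = speechGate(motion, holdSamples)
%#codegen
% speechGate  Hypothetical helper: pass one trigger, then block
% retriggering for the next holdSamples calls.
%   motion      - nonzero when the sensor reports motion
%   holdSamples - number of subsequent calls to suppress after firing

persistent remaining          % calls left before the gate reopens
if isempty(remaining)
    remaining = 0;
end

fire = false;
if remaining > 0
    % Still inside the hold interval: count down and stay closed
    remaining = remaining - 1;
elseif motion
    % Gate is open and motion is present: fire once and start the hold
    fire = true;
    remaining = holdSamples;
end
end
```

Choose holdSamples from the sample time of the model and the longest expected utterance. For example, with a sample time of 0.1 seconds, holdSamples = 50 gives a hold of about five seconds.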