Speech Synthesis


What Does Speech Synthesis Mean?

Speech synthesis is the artificial simulation of human speech by a computer or other device. The counterpart of voice recognition, speech synthesis is mostly used to convert text into audio in applications such as voice-enabled services and mobile applications. It is also used in assistive technology to help vision-impaired individuals read text content.


Techopedia Explains Speech Synthesis

Homer Dudley’s VODER, which was based on the vocoder from Bell Laboratories, is considered the first fully functional voice synthesizer. The computer used in speech synthesis is known as a speech synthesizer or speech computer. The quality of a speech synthesizer is often judged by how closely it resembles the human voice. Most computer operating systems have incorporated speech synthesizers since the early 1990s. Synthesized speech is usually generated by concatenating pieces of recorded speech that are stored in a database.
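The concatenation idea can be illustrated with a minimal sketch. Everything here is assumed for illustration: the unit names, the sample values, and the `concatenate_units` helper are hypothetical, and a real synthesizer would store actual recordings and smooth the joins between them.

```python
# Toy unit database: each "recorded" unit is just a short list of samples.
# Real systems store audio for units such as diphones, syllables, or words;
# these names and values are invented for the example.
UNIT_DB = {
    "HH": [0.0, 0.1, 0.2],
    "AY": [0.3, 0.4, 0.5, 0.4],
}


def concatenate_units(units, db=UNIT_DB):
    """Join the stored samples for each requested unit, in order."""
    samples = []
    for unit in units:
        if unit not in db:
            raise KeyError(f"no recording for unit {unit!r}")
        samples.extend(db[unit])
    return samples


waveform = concatenate_units(["HH", "AY"])
print(len(waveform))  # number of samples in the joined waveform
```

The sketch simply appends one unit after another, so any audible seams between recordings would remain; it shows only the database lookup and the joining step.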

The initial stage in speech synthesis is pre-processing, which resolves ambiguity about how a specific word should be read, including the handling of homographs. In the next stage, the computer uses phonemes to convert the text into a sequence of sounds. The last stage uses human recordings or basic sound generation techniques to mimic the human voice mechanism and read out the entire text.

One popular branch of speech synthesis is audio-visual, or multimodal, speech synthesis, which uses an animated face tightly synchronized with the synthesized speech. Multimodal speech synthesis also adds features such as non-verbal cues to help communicate the user’s words with more accuracy. Many speech synthesis systems allow users to choose the type of voice, such as a male or female voice.
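The three stages described above (pre-processing, phoneme conversion, and sound generation) can be sketched roughly as follows. This is an illustrative toy, not how any particular synthesizer works: the tiny lexicon, the ARPABET-style phoneme symbols, the "past-time cue word" rule for the homograph "read", and all function names are assumptions made for the example.

```python
import re

# Hypothetical lookup tables, kept tiny for illustration.
ABBREVIATIONS = {"dr.": "doctor"}
NUMBERS = {"1": "one", "2": "two", "3": "three"}
PAST_CUES = {"yesterday", "ago"}

LEXICON = {
    "i": ["AY"],
    "two": ["T", "UW"],
    "books": ["B", "UH", "K", "S"],
    "yesterday": ["Y", "EH", "S", "T", "ER", "D", "EY"],
}

# A homograph: same spelling, different pronunciation depending on tense.
HOMOGRAPHS = {
    "read": {"present": ["R", "IY", "D"], "past": ["R", "EH", "D"]},
}


def preprocess(text):
    """Stage 1: lowercase, expand abbreviations and digits, strip punctuation."""
    tokens = []
    for raw in text.lower().split():
        raw = ABBREVIATIONS.get(raw, raw)
        word = re.sub(r"[^a-z0-9]", "", raw)
        if word:
            tokens.append(NUMBERS.get(word, word))
    return tokens


def to_phonemes(tokens):
    """Stage 2: map words to phonemes, choosing a homograph reading by context."""
    tense = "past" if PAST_CUES & set(tokens) else "present"
    phonemes = []
    for word in tokens:
        if word in HOMOGRAPHS:
            phonemes.extend(HOMOGRAPHS[word][tense])
        elif word in LEXICON:
            phonemes.extend(LEXICON[word])
        else:
            phonemes.append("<unk>")  # a real system would use letter-to-sound rules
    return phonemes


def synthesize(phonemes):
    """Stage 3 stand-in: a real system would produce audio here, not a string."""
    return " ".join(phonemes)


print(synthesize(to_phonemes(preprocess("I read 2 books yesterday."))))
```

In this sketch, the word "yesterday" is what selects the past-tense reading of "read"; without such a cue the present-tense reading is used. Real systems rely on much richer context, such as part-of-speech tagging, to make that choice.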

Most speech synthesis systems can read text aloud in an intelligible manner, though the voice can at times sound dull. Speech synthesis has yet to develop the ability to fully imitate the wide spectrum of human intonations and cadences.



Margaret Rouse
Editor

Margaret is an award-winning technical writer, teacher, and lecturer. She is known for her ability to explain complex technical concepts in simple terms to business audiences. For twenty years, her definitions of IT terms have been published by Que in an encyclopedia of technology terms and cited in articles in the New York Times, Time magazine, USA Today, ZDNet, and the magazines PC and Discovery. Margaret joined the Techopedia team in 2011. Margaret enjoys helping business and IT professionals find a common language. In her work, as she puts it, she builds bridges between these two domains, in this…