Text to Speech with Javascript

Updated on July 23, 2019

PreviousHandle Timezones NextBuilding Blog for Your Portfolio

Last updated 5 years ago

Was this helpful?

Text to Speech with Javascript

Updated on July 23, 2019

Concepts and Methods Involved

SpeechSynthesis API — SpeechSynthesis API is the API that performs text to speech service in the browser. This is exposed through window.speechSynthesis

SpeechSynthesis API

SpeechSynthesis API is a part of the Web Speech API, that is responsible for speech service. The global window.speechSynthesis object implements the SpeechSynthesis API.

var synth = window.speechSynthesis;

The important methods defined in it are :

getVoices() : This method will give a list of available voices that can be played. They will come in different languages, and you can set a language of your preference to speak. Each voice has few properties, some of them are name, lang etc.
Important : The list of voices may be loaded asynchronously in the browser — some browsers (like Chrome) make a server request to get the voice list, while some (like Firefox) have the list loaded into them. So in order for the getVoices() method to work, you may need to wait for the list of languages to be loaded (otherwise list of voices may be returned empty). This can be done by listening to the voiceschanged event fired by window.speechSynthesis.
A simple check that can be performed is to get the voice list initially. If empty, then listen to the voiceschanged eent.

var available_voices;

// list of languages is probably not loaded, wait for it
if(window.speechSynthesis.getVoices().length == 0) {
	window.speechSynthesis.addEventListener('voiceschanged', function() {
		available_voices = window.speechSynthesis.getVoices();
	});
}
else {
	available_voices = window.speechSynthesis.getVoices();
}

speak() : This method will add a speech (or an utterance) to a queue called utterance queue. This speech will be spoken after all speeches in the queue before it have been spoken.

SpeechSynthesisUtterance API

Whenever you want a speech to be spoken, you will need to create a SpeechSynthesisUtterance object.

var utter = new SpeechSynthesisUtterance();

This object contains properties that affect various factors defining a speech :

lang : Language of the speech
pitch : Pitch of the speech
rate : Speed at which speech will be spoken
text : Text of the speech
voice : Voice of speech. This will be one of the voices returned by window.speechSynthesis.getVoices() method
volume : Volume of the speech

In addition there are several events that are fired along the way of a speech, some of them are :

onstart : Fired when speech has begun to be spoken
onend : Fired when speech has finished
onboundary : Fired when speech reaches a word or sentence boundary

Sample Javascript Code

// list of languages is probably not loaded, wait for it
if(window.speechSynthesis.getVoices().length == 0) {
	window.speechSynthesis.addEventListener('voiceschanged', function() {
		textToSpeech();
	});
}
else {
	// languages list available, no need to wait
	textToSpeech()
}

function textToSpeech() {
	// get all voices that browser offers
	var available_voices = window.speechSynthesis.getVoices();

	// this will hold an english voice
	var english_voice = '';

	// find voice by language locale "en-US"
	// if not then select the first voice
	for(var i=0; i<available_voices.length; i++) {
		if(available_voices[i].lang === 'en-US') {
			english_voice = available_voices[i];
			break;
		}
	}
	if(english_voice === '')
		english_voice = available_voices[0];

	// new SpeechSynthesisUtterance object
	var utter = new SpeechSynthesisUtterance();
	utter.rate = 1;
	utter.pitch = 0.5;
	utter.text = 'Hello World';
	utter.voice = english_voice;

	// event after text has been spoken
	utter.onend = function() {
		alert('Speech has finished');
	}

	// speak
	window.speechSynthesis.speak(utter);
}

Browser Compatability

SpeechSynthesis API is availabe in all current versions of Firefox, Chrome, Edge & Safari.

Don't Autoplay a Speech

Some sites start a speech upon the page being loaded. To prevent such autoplay behaviour, it is now required to have some user interaction before speech synthesis API will work. Otherwise it will throw an error.

Example II : SpeechSynthesisUtterance

Constructor

Properties

Events

Examples

var synth = window.speechSynthesis;

var inputForm = document.querySelector('form');
var inputTxt = document.querySelector('input');
var voiceSelect = document.querySelector('select');

var voices = synth.getVoices();

for(i = 0; i < voices.length ; i++) {
  var option = document.createElement('option');
  option.textContent = voices[i].name + ' (' + voices[i].lang + ')';
  option.setAttribute('data-lang', voices[i].lang);
  option.setAttribute('data-name', voices[i].name);
  voiceSelect.appendChild(option);
}

inputForm.onsubmit = function(event) {
  event.preventDefault();

  var utterThis = new SpeechSynthesisUtterance(inputTxt.value);
  var selectedOption = voiceSelect.selectedOptions[0].getAttribute('data-name');
  for(i = 0; i < voices.length ; i++) {
    if(voices[i].name === selectedOption) {
      utterThis.voice = voices[i];
    }
  }
  synth.speak(utterThis);
  inputTxt.blur();
}

PreviousHandle Timezones NextBuilding Blog for Your Portfolio

Last updated 5 years ago

Was this helpful?