How to Perform Speech Synthesis in Python

Last Updated on July 24, 2023 by Editorial Team

Author(s): Tommaso De Ponti

Originally published on Towards AI.

Introduction to Text-To-Speech (TTS) in Python to perform useful tasks

Photo by Michal Czyz on Unsplash

Text-to-speech (TTS) is the use of software to produce audio output in the form of a spoken voice. The component that programs use to turn the text on a page into spoken audio is usually called a text-to-speech engine. TTS engines are needed, for example, to read machine translation results aloud.

TTS software is widely used by major companies such as Google, Apple, Microsoft, and Amazon. Google developed the Google Assistant, Apple developed Siri, Microsoft developed Cortana, and Amazon developed Alexa. All of these assistants use text-to-speech alongside many ML techniques and algorithms.

When you ask Siri something, it works out an answer using machine learning and then uses TTS to speak that answer to you.

Today we will not build a voice assistant. Instead, we will first introduce TTS in Python and then create a program that can read a text file aloud.

Finally, we’ll embed our Text-To-Speech basic functionalities in a GUI made with Kivy.

Pyttsx3

To perform speech synthesis, we will use the pyttsx3 Python package. To install it via pip, we open our terminal and type:

pip install pyttsx3

With this package installed, we can start to perform text-to-speech.

First Speech Synthesis Program

In this section, we will create a very simple TTS script that performs text-to-speech on a given input:

Here we imported pyttsx3 (Line 1) and created the engine object using the pyttsx3 module (Line 2). At Lines 5/6, we perform speech synthesis on the string.
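The original listing is not reproduced in this text, so what follows is only a minimal sketch of what such a script could look like. It relies on pyttsx3's init(), say(), and runAndWait() calls. The speak_once name and the optional engine parameter are assumptions added here so the function can be tried without audio hardware, and the line numbers quoted above will not match this sketch.

```python
def speak_once(text, engine=None):
    """Speak `text` aloud once and block until playback has finished."""
    if engine is None:
        # Imported lazily so that this function can also be called with a
        # stand-in engine on a machine where pyttsx3 is not installed.
        import pyttsx3

        engine = pyttsx3.init()  # create the engine with the default driver
    engine.say(text)        # queue the string to be spoken
    engine.runAndWait()     # process the queue and wait for speech to end
```

Calling speak_once("Hello, world!") should then read the string through the default system voice.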

Pretty simple! Now, before building another TTS program, let’s make some improvements to this code.

Here we defined a function so that we can use TTS whenever we want; in this case, we call it in a loop. For each text we give the program, it performs speech synthesis on it. Cool, but still not very useful.
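As a rough illustration of the function-plus-loop idea, the sketch below wraps the two pyttsx3 calls in say() and then applies it to each text in turn. The say_each helper, its return value, and the optional engine parameter are assumptions made for this example, not part of the original listing.

```python
def say(text, engine):
    """Speak one piece of text with an already created engine."""
    engine.say(text)
    engine.runAndWait()


def say_each(texts, engine=None):
    """Speak every non-blank string in `texts`; return how many were spoken."""
    if engine is None:
        import pyttsx3  # lazy import: only needed when no engine is supplied

        engine = pyttsx3.init()
    spoken = 0
    for text in texts:
        if text.strip():          # skip empty or whitespace-only input
            say(text, engine)
            spoken += 1
    return spoken
```

To read from the keyboard until the user types quit, one option is say_each(iter(input, "quit")), which keeps calling input() and stops at that sentinel value.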

Let’s see how we can apply Speech Synthesis to do some useful tasks.

Useful App

Have you ever had to read a document by the next day? A long and boring document? Well, I have.

We are programmers. Is there a problem? Solve it!

The first thing that came to mind after I learned to use TTS was that it could have solved exactly this problem. For me, listening to something is faster and more relaxing than reading it, especially if it is boring or too long.

So, for a given text file, I chose to use TTS to listen to it instead of reading it:

In this simple script, we read the content of the example.txt file. Then we pass that content to the say function we talked about previously, which applies text-to-speech to it.
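A script along these lines might look like the sketch below. The read_aloud name, the UTF-8 default, and the optional engine parameter are assumptions for illustration; the original script is described as reading example.txt and handing its content to say.

```python
from pathlib import Path


def read_aloud(path, engine=None, encoding="utf-8"):
    """Read the whole text file at `path` and speak its content."""
    text = Path(path).read_text(encoding=encoding)
    if engine is None:
        import pyttsx3  # lazy import: only needed when no engine is supplied

        engine = pyttsx3.init()
    engine.say(text)       # queue the full file content
    engine.runAndWait()    # block until the file has been read out
    return text            # returned so callers can inspect what was spoken
```

With a file named example.txt in the working directory, read_aloud("example.txt") would speak it from start to finish.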

Important

When you use this script, make sure to replace example.txt (at Line 3) with the file you want read aloud.

Embed TTS basic functionalities in a GUI

This step is very important. Knowing how to embed your program’s functionality in a GUI, even a simple one, can really make a difference. As I said in the introduction, we will use Kivy. It is an open-source Python library that allows us to quickly develop GUI apps with innovative graphics. More about Kivy can be found on the project's website.

First, let’s create a hello-world app with it:

As you can see, the code is really simple:

Lines 1/2: Imported kivy.app and the Kivy Button

Lines 4/6: Created the TestApp class, which returns a GUI with a hello-world button

Line 8: Ran the app
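The listing itself is not shown here, so the following is a sketch of a hello-world app matching that description, using Kivy's App class and Button widget. Wrapping the class in a make_hello_app function with imports inside it is an assumption made for this example, so the file can be loaded even where Kivy is not installed; the line numbers above refer to the original listing, not to this sketch.

```python
def make_hello_app():
    """Build a Kivy app whose window is a single 'Hello World' button."""
    # Kivy is imported here, not at module level, so this file can be
    # imported on machines where Kivy is not installed.
    from kivy.app import App
    from kivy.uix.button import Button

    class TestApp(App):
        def build(self):
            # The widget returned by build() becomes the root of the window.
            return Button(text="Hello World")

    return TestApp()
```

Running make_hello_app().run() should open the window and start Kivy's event loop.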

This simple script gives us this as output:

Embedding

Now we can embed our TTS functionality in a GUI: we want a GUI app that performs speech synthesis on a given text.

In this simple 30-line script we:

1. Imported the needed modules → Lines 1/5
2. Pasted the say() function we created in the previous steps → Lines 8/13
3. Built our GUI:

   a. Created a layout using Kivy’s BoxLayout → Line 18
   b. Generated a TextInput object, where we will enter our text → Line 19
   c. Generated a Button that, when pressed, calls the function defined at Line 25 → Line 20
   d. Added the objects we generated to the layout → Lines 21/22
   e. Returned the layout with the TextInput and the Button objects → Line 23
   f. Defined the self.perform() function that, when the button is pressed, grabs the text of our TextInput object and performs TTS on it using the say() function defined at Line 8 → Lines 25/27

4. Executed the GUI app
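The steps above can be sketched roughly as follows. This is not the original 30-line listing, so its line numbers will differ. The make_tts_app and speak_if_present names, the hint text, and the button sizing are assumptions added for illustration; the Kivy widgets (BoxLayout, TextInput, Button) and the pyttsx3 calls are the ones named in the text.

```python
def say(text):
    """Speak `text` with a freshly created pyttsx3 engine."""
    import pyttsx3  # lazy import so the module loads without pyttsx3

    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()


def speak_if_present(text, speak):
    """Call speak() on the trimmed text unless it is blank; report if spoken."""
    cleaned = text.strip()
    if not cleaned:
        return False
    speak(cleaned)
    return True


def make_tts_app(speak=say):
    """Build a Kivy app with a text box and a button that speaks its content."""
    from kivy.app import App
    from kivy.uix.boxlayout import BoxLayout
    from kivy.uix.button import Button
    from kivy.uix.textinput import TextInput

    class TTSApp(App):
        def build(self):
            layout = BoxLayout(orientation="vertical")
            self.text_input = TextInput(hint_text="Enter text to speak")
            button = Button(text="Perform Speech Synthesis", size_hint_y=0.3)
            button.bind(on_press=self.perform)  # call perform() on each press
            layout.add_widget(self.text_input)
            layout.add_widget(button)
            return layout

        def perform(self, _button):
            # Kivy passes the pressed button as the only argument.
            speak_if_present(self.text_input.text, speak)

    return TTSApp()
```

The app would be started with make_tts_app().run(). Note that runAndWait() blocks, so in this sketch the window will not respond while the text is being spoken.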

After executing this code, you should see this as output:

When you enter your text there and click the Perform Speech Synthesis button, the app performs TTS on the given text.

Conclusion

Today we have seen how speech synthesis works in Python, and we implemented text-to-speech in a useful app that reads documents aloud. TTS applications have been growing significantly in recent years, and learning how to build this type of app is definitely a good way to improve your programming skills. Knowing how to implement speech synthesis is also handy in everyday code; for example, you can use TTS while testing your code to receive a spoken notification of what is happening during execution.
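One way to sketch that notification idea is a decorator that announces when a function finishes or fails. The announce name and the message wording are hypothetical, and the speak argument is assumed to be any callable that takes a string, such as a small wrapper around pyttsx3.

```python
import functools


def announce(speak):
    """Return a decorator that reports a function's outcome through `speak`."""

    def decorator(func):
        @functools.wraps(func)  # keep the wrapped function's name and docstring
        def wrapper(*args, **kwargs):
            try:
                result = func(*args, **kwargs)
            except Exception:
                speak(f"{func.__name__} failed")
                raise  # re-raise so the caller still sees the error
            speak(f"{func.__name__} finished")
            return result

        return wrapper

    return decorator
```

Passing print as speak is a quick way to try the decorator silently before wiring in a real TTS function.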

