How to build a Text-to-Speech serverless web application with Amazon Polly - Whizlabs Journey


Introduction to Amazon Polly

  1. Amazon Polly is a service that turns text into speech, allowing you to create applications that speak and build speech-enabled products.

  2. Amazon Polly is an Amazon AI service that uses advanced deep learning technologies to synthesize speech that sounds like a human voice. It currently includes 47 lifelike voices in 24 languages, so you can select the ideal voice and build speech-enabled applications that work in many different countries.

  3. You can also create a custom voice for your organization. This is a custom engagement in which you work with the Amazon Polly team to build a Neural Text-to-Speech (NTTS) voice for the exclusive use of your organization.

Amazon Polly offers the following benefits:

  1. Natural sounding voices

  2. Store & redistribute speech

  3. Real-time streaming

  4. Customize & control speech output

Introduction to Amazon API Gateway

  1. Amazon API Gateway is a fully managed service that makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale.

  2. APIs act as the front door for applications to access data, business logic, or functionality from your backend services.

  3. API Gateway handles all the tasks involved in accepting and processing up to hundreds of thousands of concurrent API calls, including traffic management, CORS support, authorization and access control, throttling, monitoring, and API version management.

  4. Using API Gateway, you can create RESTful APIs and WebSocket APIs that enable real-time two-way communication applications. API Gateway supports containerized and serverless workloads, as well as web applications.

Introduction to AWS Lambda

  1. AWS Lambda lets you run code without provisioning or managing servers. You pay only for the compute time you consume.

  2. With Lambda, you can run code for virtually any type of application or backend service, all with zero administration. Just upload your code and Lambda takes care of everything required to run and scale your code with high availability. You can set up your code to automatically trigger from other AWS services or call it directly from any web or mobile app.

Architectural Diagram

Solution Steps

  1. Create an S3 Bucket.

  2. Create a DynamoDB table.

  3. Create an SNS Topic.

  4. Create the insert Lambda function.

  5. Create the process Lambda function.

  6. Create the list Lambda function.

  7. Create an API Gateway.

  8. Host a Serverless Application.

  9. Test the application.

  10. Validation of the lab.

  11. Deleting AWS Resources.

Task 1: Create an S3 Bucket

Navigate to S3 and create a bucket with the following configuration:

  • Bucket Name: Enter store-audio-<RANDOMNUMBER>

    • Replace RANDOMNUMBER with a Random Number, for Example store-audio-123

(Note: The Bucket Name must be Unique across all existing bucket names in Amazon S3.)

  • Region: Select US East (N. Virginia)

  • Object ownership: Select the ACLs disabled (recommended) option.

  • Bucket settings for Block Public Access: Uncheck Block all public access and select the acknowledgment check box.

Leave other settings in their default and create the bucket.

Make the bucket public with a Bucket Policy:

Click the Permissions tab to configure your bucket.

Scroll down to Bucket policy and click on the Edit button on the right side. The policy editor will open.

In the policy below, replace the Resource value with your bucket ARN followed by /*, then paste the policy into the editor and click on Save changes.

{
    "Id": "Policy1",
    "Version": "2012-10-17",
    "Statement": [
      {
        "Sid": "Stmt1",
        "Action": [
          "s3:GetObject"
        ],
        "Effect": "Allow",
        "Resource": "replace-this-string-from-your-bucket-arn/*",
        "Principal": "*"
      }
    ]
  }
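
Rather than editing the Resource string by hand, the same policy can be generated from the bucket name. The following is a minimal sketch, not part of the original lab: the helper name build_public_read_policy is hypothetical, and it assumes the standard S3 bucket ARN form arn:aws:s3:::<bucket-name>.

```python
import json


def build_public_read_policy(bucket_name):
    """Return the public-read bucket policy from this task as a dict."""
    return {
        "Id": "Policy1",
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Stmt1",
                "Action": ["s3:GetObject"],
                "Effect": "Allow",
                # S3 bucket ARNs have the form arn:aws:s3:::<bucket-name>;
                # the trailing /* applies the statement to every object.
                "Resource": "arn:aws:s3:::" + bucket_name + "/*",
                "Principal": "*",
            }
        ],
    }


# store-audio-123 is the example bucket name used in this task.
print(json.dumps(build_public_read_policy("store-audio-123"), indent=2))
```

The printed JSON can be pasted into the Bucket policy editor as it is.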

Task 2: Create a DynamoDB Table

  1. Navigate to the DynamoDB page.

  2. Make sure that you are in the US East (N. Virginia) Region.

  3. From the left side menu select Tables and then click on Create table.

    • Table Name : Enter store_message_details

    • Primary key : Enter id and select String.

    • Leave other options as default.

    • Click on Create Table button.

  4. Your table will be created within 1-3 minutes.

  5. Copy the Table name store_message_details and place it in your text editor.
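
For readers who prefer scripting over the console, the table described above might be expressed as arguments to the boto3 DynamoDB client's create_table call. This is a sketch under assumptions: the helper names are hypothetical, and the on-demand billing mode is an assumption, since the lab simply keeps the console defaults.

```python
def build_table_definition(table_name="store_message_details"):
    """Keyword arguments for the boto3 DynamoDB client's create_table call."""
    return {
        "TableName": table_name,
        # 'id' is the primary (partition) key and holds a string.
        "KeySchema": [{"AttributeName": "id", "KeyType": "HASH"}],
        "AttributeDefinitions": [{"AttributeName": "id", "AttributeType": "S"}],
        # Assumption: on-demand billing; the console default may differ.
        "BillingMode": "PAY_PER_REQUEST",
    }


def create_table(dynamodb_client, table_name="store_message_details"):
    """Create the table with any client that exposes create_table(**kwargs)."""
    return dynamodb_client.create_table(**build_table_definition(table_name))
```

With real credentials, the call would look like create_table(boto3.client("dynamodb", region_name="us-east-1")).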

Task 3: Create an SNS Topic

  1. Navigate to Simple Notification Service (SNS) by clicking on the Services menu. Simple Notification Service is available under the Application Integration section.

  2. Click on Topics in the left panel. Click on the Create topic button.

  3. Select the Type as Standard.

  4. Under Details:

    • Name : Enter Invoke_audio_lambda

  5. Leave other options as default, scroll down and click on Create topic.

  6. Copy the SNS Topic ARN and place it in your text editor.

Task 4: Create Insert Lambda Function

  1. Navigate to Lambda.

  2. Click on the Create function button.

    • Choose Author from scratch.

    • Function name : Enter Add_New_posts

    • Runtime : Select Python 3.9

    • Permissions : Click on Change default execution role and choose Use an existing role.

    • Existing role : Select Lambda_execution_role from the dropdown list. If you have not created this role yet, create it first; the functions in this tutorial call DynamoDB, SNS, Polly and S3, so the role needs access to those services.

    • Click on the Create function button.

  3. Copy the following code:

     import boto3
     import os
     import uuid

     def lambda_handler(event, context):
         return_val = "Failed to process."
         #Table name and topic ARN are read from the environment variables
         DynamoDB_table_name = os.environ['DB_TABLE_NAME']
         sns_topic_arn = os.environ['SNS_TOPIC']

         selected_voice = event["voice"]
         input_text = event["text"]
         #uuid4() returns a random UUID that serves as the item's primary key
         unique_id = str(uuid.uuid4())

         print("Created Unique ID : " + unique_id)
         print("Input Text : " + input_text)
         print("Selected Voice : " + selected_voice)

         #Connect to DynamoDB service in N.Virginia Region
         try:
             dynamodb = boto3.resource("dynamodb", region_name="us-east-1")
             table = dynamodb.Table(DynamoDB_table_name)
             #Put an Item to Table
             table.put_item(
                 Item={
                     'id' : unique_id,
                     'input text' : input_text,
                     'selected voice' : selected_voice,
                     'status' : 'PROCESSING'
                 }
             )
             print("Successfully Inserted an item with id : " + unique_id)
         except Exception as e:
             print("Insert Item to Table failed because ", e)
             #Do not notify the process function about an item that was not stored
             return return_val

         #Client connection to SNS service in N.Virginia Region
         try:
             snsClient = boto3.client("sns", region_name = "us-east-1")
             #Publish the item ID as the message to the SNS Topic
             snsClient.publish(
                 TopicArn = sns_topic_arn,
                 Message = unique_id
             )
             print("Successfully Published a Message.")
             return_val = "Text is PROCESSING."
         except Exception as e:
             print("Publish message to SNS topic failed because ", e)
         return return_val
    

The Code above does the following:

  • Get DynamoDB table name and SNS topic ARN from Environment variable.

  • Get the text and selected voice from the lambda event.

  • Connect to DynamoDB table.

  • Generate a unique ID (a random UUID).

  • Insert the unique ID, text, voice and status into the DynamoDB table.

  • Connect to SNS Service.

  • Publish the unique ID as a message to SNS.

  1. Deploy the code by clicking on Deploy in the top right corner.

  2. Now select the Configuration tab.

  3. In the left panel select Environment variables.

    • Now click on Edit.

    • Click on Add environment variable.

      • Key : Enter DB_TABLE_NAME

      • Value : Enter store_message_details

    • Click on Add environment variable.

      • Key : Enter SNS_TOPIC

      • Value : Paste the SNS Topic ARN (you can retrieve it from the text editor where you saved it; it should be of the format arn:aws:sns:us-east-1:<ACCOUNT_ID>:Invoke_audio_lambda)

    • Click on the Save button.

(Note : Don't change the Key Name, use the same name.)

  4. In the left panel select General configuration.

    • Click on the Edit button.

    • Timeout : Set 3 min and 0 sec

    • Click on the Save button.

Task 5: Create Process Lambda Function

  1. Navigate to the Lambda Functions page.

  2. Click on the Create function button.

    • Choose Author from scratch.

    • Function name : Enter Convert_Text_to_Audio

    • Runtime : Select Python 3.9

    • Permissions : Click on Change default execution role and choose Use an existing role.

    • Existing role : Select Lambda_execution_role from the dropdown list.

    • Click on the Create function button.

  3. Now, click on the + Add trigger button.

    • Trigger configuration : Select SNS from the dropdown.

    • SNS topic : Select Invoke_audio_lambda from the list.

    • Click on the Add button.

  4. Copy the following code:

import boto3
import os
from contextlib import closing
from boto3.dynamodb.conditions import Key, Attr

def lambda_handler(event, context):
    DynamoDB_Table_name = os.environ['DB_TABLE_NAME']
    S3_Bucket_name = os.environ['BUCKET_NAME']
    unique_id = event["Records"][0]["Sns"]["Message"]

    print("Started Text to Speech operation for ID : " + unique_id)

    #Connect to DynamoDB service in N.Virginia Region
    try:
        dynamodb = boto3.resource("dynamodb", region_name = "us-east-1")
        table = dynamodb.Table(DynamoDB_Table_name)
        #Fetch data based on ID
        GetItem = table.query(
            KeyConditionExpression=Key('id').eq(unique_id)
        )
    except Exception as e:
        print("Get item from table failed because ", e)
        #Without the item there is nothing to convert, so stop here
        return

    if not GetItem["Items"]:
        print("No item found for ID : " + unique_id)
        return

    Input_text = GetItem["Items"][0]["input text"]
    selected_voice = GetItem["Items"][0]["selected voice"]

    text_backup = Input_text

    # A single call to the Polly synthesize_speech API accepts a limited
    # amount of text (3000 billed characters), so we divide the
    # post into blocks of approximately 1000 characters.
    try:
        textBlocks = []
        while (len(text_backup) > 1100):
            begin = 0
            #Prefer to cut after a sentence end, then after a space
            end = text_backup.find(".", 1000)

            if (end == -1):
                end = text_backup.find(" ", 1000)

            if (end == -1):
                #No separator found, cut at a fixed length
                end = 1000
            else:
                #Keep the separator with the current block
                end = end + 1

            textBlock = text_backup[begin:end]
            text_backup = text_backup[end:]
            textBlocks.append(textBlock)
        #Skip an empty remainder, Polly cannot synthesize empty text
        if text_backup:
            textBlocks.append(text_backup)
    except Exception as e:
        print("Split text to blocks failed because ", e)

    #Client Connection to Polly in N.Virginia Region
    try:
        pollyClient = boto3.client("polly", region_name = "us-east-1")
        for textBlock in textBlocks:
            #Convert Text to Speech
            try:
                synthesize_response = pollyClient.synthesize_speech(
                    OutputFormat='mp3',
                    Text = textBlock,
                    VoiceId = selected_voice
                )
                #Append multiple audio streams into a single file.
                #Store this file in the Lambda temp directory (/tmp).
                if "AudioStream" in synthesize_response:
                    with closing(synthesize_response["AudioStream"]) as stream:
                        output = os.path.join("/tmp/", unique_id)
                        with open(output, "ab") as file:
                            file.write(stream.read())
            except Exception as e:
                print("Speech Synthesize failed because ", e)
        print("Polly Synthesize Completed.")
    except Exception as e:
        print("Client Connection to Polly failed because ", e)

    #Client Connection to S3 in N.Virginia Region
    try:
        s3Client = boto3.client("s3", region_name = "us-east-1")
        #Upload the file from Lambda temp folder to S3 bucket
        s3Client.upload_file('/tmp/' + unique_id,
            S3_Bucket_name,
            unique_id + ".mp3")
        print("S3 Upload completed.")
    except Exception as e:
        print("Upload File to S3 Bucket failed because ", e)
        #Without an uploaded file there is no URL to store, so stop here
        return

    #Give Read only access to S3 Object that is uploaded.
    #On a bucket created with ACLs disabled this call is rejected; the
    #bucket policy from Task 1 already allows public reads in that case.
    try:
        s3Client.put_object_acl(ACL='public-read',
            Bucket=S3_Bucket_name,
            Key= unique_id + ".mp3")
        print("Gave Public access to S3 Object.")
    except Exception as e:
        print("Set Read only access to Object failed because ", e)

    #Get Bucket location and build the Object URL
    try:
        location = s3Client.get_bucket_location(Bucket=S3_Bucket_name)
        region = location['LocationConstraint']

        #LocationConstraint is None for buckets in us-east-1
        if region is None:
            url_beginning = "https://s3.amazonaws.com/"
        else:
            url_beginning = "https://s3." + str(region) + ".amazonaws.com/"

        url = url_beginning \
                + str(S3_Bucket_name) \
                + "/" \
                + str(unique_id) \
                + ".mp3"
    except Exception as e:
        print("Get bucket location failed because ", e)
        return

    #Update Item based on Unique Id
    #Add the S3 Object URL and set the status to COMPLETED.
    try:
        response = table.update_item(
            Key={'id':unique_id},
            UpdateExpression=
                "SET #statusAtt = :statusValue, #urlAtt = :urlValue",
            ExpressionAttributeValues=
                {':statusValue': 'COMPLETED', ':urlValue': url},
            ExpressionAttributeNames=
                {'#statusAtt': 'status', '#urlAtt': 'url'},
        )
    except Exception as e:
        print("DynamoDB item update failed because ", e)

    print("Text to Speech operation Completed.")

The Code above does the following:

  • Get DynamoDB table name and S3 bucket name from Environment variable.

  • Get the unique ID sent by the SNS trigger.

  • Connect to the DynamoDB table and get the text and selected voice from the table using the unique ID.

  • Split the text into blocks of roughly 1,000 characters and convert each block to audio using the Polly synthesize_speech API.

  • Store the audio file in the Lambda temp directory

  • Upload the audio file to S3 bucket and give public access to the object.

  • Generate Object URL and store it in DynamoDB table.
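
The text-splitting step is the easiest part of this function to get wrong, so it can help to try it in isolation. The sketch below is a standalone variant of that step rather than the lab's exact code: the function name split_text and its parameters are hypothetical, it keeps the period or space with the preceding block, and it falls back to a hard cut when no separator is found. A block can still exceed the target when the next period is far away.

```python
def split_text(text, target=1000, slack=100):
    """Split text into blocks of roughly `target` characters.

    Cuts after the first period found at or beyond `target`, falls back to
    the first space, and finally to a hard cut at `target`.
    """
    blocks = []
    while len(text) > target + slack:
        end = text.find(".", target)
        if end == -1:
            end = text.find(" ", target)
        # Keep the separator with the current block, or cut hard.
        end = target if end == -1 else end + 1
        blocks.append(text[:end])
        text = text[end:]
    if text:
        blocks.append(text)
    return blocks


sample = "word " * 500
print([len(block) for block in split_text(sample)])
```

Joining the returned blocks always reproduces the input, so no text is lost between Polly calls.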

  1. Deploy the code by clicking on Deploy in the top right corner.

  2. Now select the Configuration tab.

    • In the left panel select Environment variables.

    • Now click on Edit.

    • Click on Add environment variable.

      • Key : Enter BUCKET_NAME

      • Value : Paste the name of the S3 bucket created in Task 1 (it must be of the format store-audio-123)

    • Click on Add environment variable.

      • Key : Enter DB_TABLE_NAME

      • Value : Enter store_message_details

    • Click on the Save button.

(Note : Don't change the Key Name, use the same name.)

  3. In the left panel select General configuration.

  4. Click on Edit.

    • Timeout : Set 3 min and 0 sec

    • Click on the Save button.

Task 6: Create List Lambda Function

  1. Navigate to the Lambda Functions page from the left side menu.

  2. Click on the Create Function button.

    • Choose Author from scratch.

    • Function name : Enter Read_Table_items

    • Runtime : Select Python 3.9

    • Permissions : Click on Change default execution role and choose Use an existing role.

    • Existing role : Select Lambda_execution_role from the dropdown list.

    • Click on the Create function button.

  1. Copy the code below:

import boto3
import os

def lambda_handler(event, context):
    table_name = os.environ['DB_TABLE_NAME']

    #Connect to DynamoDB service in N.Virginia Region
    try:
        dynamodb = boto3.resource("dynamodb", region_name = "us-east-1")
        table = dynamodb.Table(table_name)
        #Read all data
        items = table.scan()
        print("Scan completed.")
    except Exception as e:
        print("Scan of DynamoDB table failed because ", e)
        #Return an empty list so the caller still receives valid JSON
        return []
    #Return all the items in the table
    return items["Items"]

  1. Deploy the code by clicking on Deploy in the top right corner.

  2. Now Select Configuration tab

    • In the left panel select Environment Variable.

    • Now click on edit

    • Click on Add Environment variable.

      • Key : Enter DB_TABLE_NAME

      • Value : paste store_message_details

    • Click on the Save button.

  3. In the left panel select General configuration

  4. Click on edit

  • Timeout : Set 3 min and 0 sec

  • Click on the Save button.
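
A single DynamoDB Scan call returns at most 1 MB of data, which is plenty for this demo but would truncate a larger table. As a hedged sketch of how the list function could follow pagination, the helper below keeps scanning while the response carries a LastEvaluatedKey. The names scan_all_items and FakeTable are hypothetical; FakeTable only stands in for a boto3 Table so that the example runs without AWS access.

```python
def scan_all_items(table):
    """Collect every item from a DynamoDB table, following pagination."""
    items = []
    kwargs = {}
    while True:
        page = table.scan(**kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return items
        # Continue the scan from where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key


class FakeTable:
    """Stand-in for a boto3 Table that returns two pages (illustration only)."""

    def scan(self, **kwargs):
        if "ExclusiveStartKey" not in kwargs:
            return {"Items": [{"id": "a"}], "LastEvaluatedKey": {"id": "a"}}
        return {"Items": [{"id": "b"}]}


print(scan_all_items(FakeTable()))
```

In the Lambda function, the same helper would be called with the real table object in place of FakeTable().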

Task 7: Create an API Gateway

  1. Navigate to the Services menu at the top, then click on API Gateway in the Networking & Content Delivery section.

  2. Click on Build under REST API.

  3. Select Protocol as REST.

  4. Create new API : Choose New API.

  5. Under Settings:

    • API name : Enter WebAPI

    • Leave other options as default.

    • Click on the Create API button.

  6. Once the API is created, select the API and click on the Actions button.

  7. Select Create Resource in Actions.

    • Resource Name : Enter data

    • Leave everything else as default.

  8. Now click on the Create Resource button.

  9. Once the resource is created, click on Actions and select Create Method.

  10. From the drop-down select POST and confirm the selection.

  11. Setup:

    • Method Type : POST

    • Integration Type : Choose Lambda Function

    • Lambda Region : Select us-east-1

    • Lambda function : Enter Add_New_posts

    • Leave everything else as default.

  12. Select the resource /data, click on Actions and select Create Method.

  13. Setup:

    • Method Type : GET

    • Integration Type : Choose Lambda Function

    • Lambda Region : Select us-east-1

    • Lambda function : Enter Read_Table_items

    • Leave everything else as default.

  14. Select /data, then under Resource Details click on Enable CORS.

    CORS (cross-origin resource sharing) allows the API to be invoked from a website with a different hostname, such as the S3-hosted page created in the next task.

  15. Click on Deploy API.

  16. Select the Deployment Stage in the drop-down as [New Stage].

  17. Enter Stage Name : Dev and Stage description : Serverless API

  18. Click on the Deploy button.

  19. Under Stages, click on Dev, select POST, copy the API Invoke URL and paste it into your text editor.
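
Before wiring up the website, the deployed API can be exercised from a script. The following sketch only builds the POST request and does not send it. The JSON body shape is inferred from the voice and text keys that the Add_New_posts function reads, the helper name build_post_request is hypothetical, and the URL is a placeholder for your own Invoke URL.

```python
import json
import urllib.request


def build_post_request(invoke_url, voice, text):
    """Build (without sending) a POST request for the /data resource."""
    body = json.dumps({"voice": voice, "text": text}).encode("utf-8")
    return urllib.request.Request(
        invoke_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL: substitute the Invoke URL copied from the Dev stage.
request = build_post_request(
    "https://example.execute-api.us-east-1.amazonaws.com/Dev/data",
    "Joanna",
    "This is a demo",
)
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```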

Task 8: Host a Serverless Application

  1. First, click on this link to download the ZIP folder which contains the website.

  2. The ZIP contains 2 HTML files, 1 JS file and 1 CSS file.

  3. Extract the ZIP folder to your local machine.

  4. Open the scripts.js file in your text editor. On line number 1, replace API_ENDPOINT with the API Invoke URL that you copied in the previous task and save the file.

  5. Navigate to the Services menu at the top and click on S3 in the Storage section.

  6. In the left menu, choose Buckets, click on Create bucket and fill in the bucket details.

    • Bucket Name : Enter store-serverless-<RANDOMNUMBER>

      • Replace RANDOMNUMBER with a random number, for example store-serverless-123

(Note: The Bucket Name must be unique across all existing bucket names in Amazon S3.)

    • Region : Select US East (N. Virginia)

    • Object ownership : Select the ACLs disabled (recommended) option.

    • Bucket settings for Block Public Access : Uncheck Block all public access and select the acknowledgment check box.

    • Leave other settings as default.

    • Click on the Create bucket button.

  7. Click on the bucket name of the format store-serverless-xxxxx.

  8. Under Objects, click on Upload and then on the Add files button.

  9. Upload all 4 files that you extracted from the ZIP.

  10. Now scroll down and click on the Upload button.

  11. Once the upload is successful, click on the Close button in the top right corner.

  12. To make the bucket public, click on Properties and copy the ARN of your S3 bucket.

  13. Make the bucket public with a Bucket Policy:

    • Click the Permissions tab to configure your bucket.

    • Scroll down to Bucket policy and click on the Edit button on the right side.

    • The policy editor will open.

  14. In the policy below, update your bucket ARN in the Resource key-value and copy the policy code.

{
    "Id": "Policy1",
    "Version": "2012-10-17",
    "Statement": [
      {
        "Sid": "Stmt1",
        "Action": [
          "s3:GetObject"
        ],
        "Effect": "Allow",
        "Resource": "replace-this-string-from-your-bucket-arn/*",
        "Principal": "*"
      }
    ]
  }

  1. Paste the bucket policy into the Bucket policy editor.

    • Note: Remove all blank lines.

  2. Click on the Save changes button.

  3. Now select Properties, scroll down to the bottom and click on Edit under Static website hosting.

    • Static website hosting : Select Enable

    • Hosting type : Select Host a static website

    • Index document : Enter index.html

    • Error document : Enter error.html

    • Scroll down and click on the Save changes button.

  4. Now scroll down to the bottom and copy the Bucket website endpoint. Paste it in your browser and hit [Enter].
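
The static website settings entered above correspond to the WebsiteConfiguration argument of the boto3 S3 client's put_bucket_website call. The sketch below shows that mapping as an alternative to the console; the helper names are hypothetical, and the function that calls AWS is defined but not invoked here.

```python
def build_website_configuration(index="index.html", error="error.html"):
    """WebsiteConfiguration argument for the S3 client's put_bucket_website."""
    return {
        "IndexDocument": {"Suffix": index},
        "ErrorDocument": {"Key": error},
    }


def enable_static_hosting(s3_client, bucket_name):
    """Enable static website hosting on the bucket with the settings above."""
    s3_client.put_bucket_website(
        Bucket=bucket_name,
        WebsiteConfiguration=build_website_configuration(),
    )


print(build_website_configuration())
```

With real credentials, enable_static_hosting(boto3.client("s3"), "store-serverless-123") would apply the same settings as the console steps.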

Task 9: Test the application

  1. Select Voice : Select Joanna - English

  2. In the text box : Enter This is a demo

  3. Click on the Process button.

  4. If you click on the Refresh table button immediately, the table shows the entry with the status PROCESSING. This means the task is not completed yet. Wait for a few seconds and then click on the Refresh table button again.

  5. The status will now be COMPLETED. Click on the play button and you will hear the audio “This is a demo.”

  6. Now select a different voice, for example Carla - Italian

  7. Text box : Enter Welcome to Amazon Web Services machine learning demo

  8. Click on the Process button.

  9. Click on the Refresh table button and wait until the status becomes COMPLETED.

  10. You can click on the play button and hear the text in an Italian voice.
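
Clicking Refresh table until the status changes is a manual form of polling. A script could do the same thing, as in the sketch below. The function name wait_until_completed is hypothetical, and fetch_items is an assumed callable that returns the list of table items, for example by sending GET to the /data resource.

```python
import time


def wait_until_completed(fetch_items, item_id, attempts=10, delay=2.0):
    """Poll until the item with `item_id` reports status COMPLETED.

    Returns the completed item, or None if it never completes within
    the given number of attempts.
    """
    for attempt in range(attempts):
        for item in fetch_items():
            if item.get("id") == item_id and item.get("status") == "COMPLETED":
                return item
        if attempt < attempts - 1:
            time.sleep(delay)
    return None


# Illustration with canned responses instead of a real API call.
responses = iter([
    [{"id": "demo", "status": "PROCESSING"}],
    [{"id": "demo", "status": "COMPLETED", "url": "placeholder-url"}],
])
print(wait_until_completed(lambda: next(responses), "demo", delay=0))
```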

Do you know?

Building a Text-to-Speech (TTS) serverless web application using Amazon Polly involves using Amazon Web Services (AWS) services to convert text into lifelike speech. Amazon Polly is a cloud service that provides TTS capabilities, allowing you to integrate natural-sounding speech synthesis into your applications. The serverless approach means you can accomplish this without provisioning or managing traditional servers.

Task 11: Clean Up the AWS Resources

Deleting the DynamoDB Table

  1. Navigate to the DynamoDB page by clicking on the Services menu at the top. DynamoDB is available under the Database section.

  2. From the left side menu select Tables.

  3. Select the DynamoDB table that you created, store_message_details, and click on the Delete button.

  4. Now in the pop-up message:

    • Select the checkbox Delete all CloudWatch alarms for this table.

    • Don't select the checkbox Create a backup for this table before deleting it.

    • In the text box : Enter delete

  5. Now click on the Delete table button.

  6. If the table still appears in the list, refresh the page.

Deleting the SNS Topic

  1. Navigate to SNS by clicking on the Services menu. Simple Notification Service is available under the Application Integration section.

  2. Click on Topics in the left panel.

  3. Select the SNS Topic Invoke_audio_lambda and click on the Delete button.

  4. Now in the confirmation text box : Enter delete me and click on the Delete button.

Deleting the Lambda Functions

  1. Navigate to Lambda by clicking on the Services menu. Lambda is available under the Compute section.

  2. Select the Lambda function Convert_Text_to_Audio, click on Actions and then on Delete.

  3. Click on the Delete button to confirm deletion.

  4. Follow the same steps to delete the other two Lambda functions, Add_New_posts and Read_Table_items.

Deleting API Gateway

  1. Navigate to the Services menu at the top, then click on API Gateway in the Networking & Content Delivery section.

  2. Select the API that you created, WebAPI, click on Actions and select Delete.

  3. Now click on the Delete button in the confirmation window.
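
The console clean-up above could also be scripted with boto3. The sketch below covers only the resources deleted in this task (the table, the topic, the three functions and the REST API). The function name delete_tutorial_resources is hypothetical, and the topic ARN and REST API ID are placeholders. It is run here against mock clients so that it executes without AWS credentials.

```python
from unittest.mock import MagicMock

FUNCTION_NAMES = ("Add_New_posts", "Convert_Text_to_Audio", "Read_Table_items")


def delete_tutorial_resources(dynamodb, sns, lambda_client, apigateway,
                              topic_arn, rest_api_id):
    """Delete the resources from this task using pre-built boto3 clients."""
    dynamodb.delete_table(TableName="store_message_details")
    sns.delete_topic(TopicArn=topic_arn)
    for name in FUNCTION_NAMES:
        lambda_client.delete_function(FunctionName=name)
    apigateway.delete_rest_api(restApiId=rest_api_id)


# Dry run against mock clients. With real clients, pass
# boto3.client("dynamodb"), boto3.client("sns"), boto3.client("lambda")
# and boto3.client("apigateway") instead.
mock_clients = [MagicMock() for _ in range(4)]
delete_tutorial_resources(
    *mock_clients,
    topic_arn="arn:aws:sns:us-east-1:123456789012:Invoke_audio_lambda",  # placeholder
    rest_api_id="abc123",  # placeholder REST API ID
)
print(mock_clients[2].delete_function.call_count)
```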

Completion and Conclusion

Congratulations on completing the demo:

  1. You have successfully created the S3 Bucket.

  2. You have successfully created a DynamoDB Table.

  3. You have successfully created the SNS Topic.

  4. You have successfully created the insert Lambda function.

  5. You have successfully created the process Lambda function.

  6. You have successfully created the list Lambda function.

  7. You have successfully created an API Gateway.

  8. You have successfully hosted a Serverless Application.

  9. You have successfully tested the application.

  10. You have successfully deleted all the AWS Resources.