A Simple Example of Amazon Transcribe with .NET
Download full source code.
This is the first of a few posts on Amazon Transcribe, a service that converts audio to text.
Here you will see how to send a file for transcription. This process can take a while, so you will poll for completion, then download the transcription.
The downloaded transcript is in JSON format. At the top of the file is a transcript that does NOT differentiate between speakers. It is one long paragraph containing everything everyone said, with no speaker attribution.
In a subsequent post, I will show how to parse the JSON to identify who said what, and when.
But for now, the basics.
Transcription has a few steps.
- Upload the file to S3
- Start the transcription job
- Poll the job until it is complete
- Download the transcription
0. Some Setup
Create a new console application and add the AWSSDK.TranscribeService and AWSSDK.S3 NuGet packages.
Add a few using statements, and create the Transcribe client and S3 TransferUtility.
using System.Net;
using Amazon.S3;
using Amazon.S3.Transfer;
using Amazon.TranscribeService;
using Amazon.TranscribeService.Model;

string bucketName = "your-bucket-name";
string fileToTranscribe = "your-audio-file.mp3";

IAmazonTranscribeService amazonTranscribeService = new AmazonTranscribeServiceClient();
TransferUtility transferUtility = new TransferUtility(new AmazonS3Client());
1. Upload the file to S3
Before you can upload a file to S3, you need to create a bucket.
See this blog post for more information on creating an S3 bucket.
I will use the S3 TransferUtility to upload the file. This is a simple wrapper around the S3 client that makes uploading files to S3 very easy.
Uploading the file to S3 is a single line of code, and its S3 URI is predictable.
await transferUtility.UploadAsync(fileToTranscribe, bucketName);
string s3Uri = $"s3://{bucketName}/{fileToTranscribe}";
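The URI is predictable because the two-argument UploadAsync overload derives the object key from the file's name. The interpolated string above therefore only matches when fileToTranscribe is a bare file name. Below is a minimal sketch of a helper that also copes with a path containing directories. BuildS3Uri is a hypothetical name used for illustration; it is not part of the AWS SDK.

```csharp
using System;
using System.IO;

// Hypothetical helper, not an AWS SDK method. It assumes the file was uploaded
// with TransferUtility.UploadAsync(filePath, bucketName), so the object key is
// the file name without any directory part.
static string BuildS3Uri(string bucketName, string filePath)
{
    string key = Path.GetFileName(filePath);
    return $"s3://{bucketName}/{key}";
}

// The directory part is dropped; only the file name ends up in the URI.
Console.WriteLine(BuildS3Uri("your-bucket-name", "recordings/your-audio-file.mp3"));
// prints: s3://your-bucket-name/your-audio-file.mp3
```

If you upload with an explicit key instead, build the URI from that key rather than from the file name.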
After the file is uploaded, you call the StartTranscriptionJob method.
string jobName = Guid.NewGuid().ToString();
var startTranscriptionJobResponse = await StartTranscriptionJob(s3Uri, jobName);
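A GUID is a convenient job name because it is unique and contains only hex digits and hyphens. If you would rather use a readable name, it may help to check it before calling the service. The sketch below assumes the constraint documented for TranscriptionJobName at the time of writing (1 to 200 characters drawn from letters, digits, period, underscore, and hyphen); treat that pattern as an assumption and confirm it against the current API reference. IsValidJobName is a hypothetical helper, not an SDK method.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper. The pattern is an assumption based on the documented
// constraint for TranscriptionJobName: 1-200 characters from [0-9a-zA-Z._-].
static bool IsValidJobName(string jobName)
{
    return Regex.IsMatch(jobName, @"\A[0-9a-zA-Z._-]{1,200}\z");
}

Console.WriteLine(IsValidJobName(Guid.NewGuid().ToString())); // True
Console.WriteLine(IsValidJobName("team meeting 01"));         // False, spaces are not allowed
```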
2. Start the transcription job
To start the transcription job, you use the AmazonTranscribeService instance, passing a StartTranscriptionJobRequest that details the language of the audio, where the audio file is located, and where the transcription should be stored.
async Task<HttpStatusCode> StartTranscriptionJob(string s3Uri, string transcriptionJobName)
{
    var startTranscriptionJobRequest = new StartTranscriptionJobRequest()
    {
        TranscriptionJobName = transcriptionJobName,
        LanguageCode = LanguageCode.EnUS,
        // Location of the audio file uploaded in step 1.
        Media = new Media()
        {
            MediaFileUri = s3Uri
        },
        // The transcript is written to this bucket as {job name}.json.
        OutputBucketName = bucketName
    };
    var startTranscriptionJobResponse = await amazonTranscribeService.StartTranscriptionJobAsync(startTranscriptionJobRequest);
    return startTranscriptionJobResponse.HttpStatusCode;
}
3. Poll the job until it is complete
The transcription job can take a while. In this example, you will poll for completion with a simple while loop. In a subsequent post, I’ll show how to get notified upon completion.
The GetTranscriptionJob method returns the status of the job. When the job is complete, the status will be COMPLETED. If the job fails, the status will be FAILED.
This is the polling loop.
async Task<bool> PollTranscriptionJob(string jobName)
{
    while (true)
    {
        var getTranscriptionJobResponse = await amazonTranscribeService.GetTranscriptionJobAsync(new GetTranscriptionJobRequest()
        {
            TranscriptionJobName = jobName
        });

        var transcriptionJobStatus = getTranscriptionJobResponse.TranscriptionJob.TranscriptionJobStatus;
        if (transcriptionJobStatus == TranscriptionJobStatus.COMPLETED)
        {
            Console.WriteLine("Transcription job completed");
            Console.WriteLine($"Output is available at {getTranscriptionJobResponse.TranscriptionJob.Transcript.TranscriptFileUri}");
            return true;
        }
        else if (transcriptionJobStatus == TranscriptionJobStatus.FAILED)
        {
            Console.WriteLine("Transcription job failed");
            return false;
        }
        else
        {
            Console.WriteLine("Transcription job not completed yet");
        }
        await Task.Delay(5000);
    }
}
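The loop above runs until the job reports COMPLETED or FAILED. If you want an upper bound on how long to wait, one option is a generic polling helper with a timeout. This is a sketch, not part of the original sample: PollUntilDoneAsync and its parameters are hypothetical names, and the status check is passed in as a delegate so the helper can be tried without calling AWS.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper. checkAsync returns true (succeeded), false (failed),
// or null (still running). The helper returns null if the timeout elapses
// before a final outcome is reported.
static async Task<bool?> PollUntilDoneAsync(Func<Task<bool?>> checkAsync, TimeSpan interval, TimeSpan timeout)
{
    DateTime deadline = DateTime.UtcNow + timeout;
    while (DateTime.UtcNow < deadline)
    {
        bool? outcome = await checkAsync();
        if (outcome.HasValue)
        {
            return outcome;
        }
        await Task.Delay(interval);
    }
    return null; // timed out
}

// Stand-in for a real status check: reports success on the third call.
int calls = 0;
bool? result = await PollUntilDoneAsync(
    () => Task.FromResult(++calls >= 3 ? (bool?)true : null),
    TimeSpan.FromMilliseconds(10),
    TimeSpan.FromSeconds(5));

Console.WriteLine($"Finished after {calls} checks, result: {result}");
// prints: Finished after 3 checks, result: True
```

In the real program, the delegate would call GetTranscriptionJobAsync and map COMPLETED to true, FAILED to false, and any other status to null.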
4. Download the transcription
When polling tells you the job is complete, download the file.
if (await PollTranscriptionJob(jobName))
{
    await transferUtility.DownloadAsync($"{jobName}.json", bucketName, $"{jobName}.json");
}
else
{
    Console.WriteLine("Transcription job failed");
}
You can open the downloaded JSON file in a text editor to see the transcription.
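If you only need the single-paragraph transcript mentioned at the start of this post, you can read it out of the JSON with System.Text.Json. The sketch below uses a made-up, trimmed-down sample in place of a real downloaded file. It assumes the transcript sits at results.transcripts[0].transcript, which is the layout of the output files I have seen. ExtractTranscript is a hypothetical helper name.

```csharp
using System;
using System.Text.Json;

// Made-up sample shaped like a Transcribe output file; a real file contains
// many more fields, including per-word items with timestamps.
string json = "{\"jobName\":\"example-job\",\"results\":{\"transcripts\":[{\"transcript\":\"Hello and welcome.\"}],\"items\":[]},\"status\":\"COMPLETED\"}";

// Hypothetical helper: returns the first transcript in the document.
static string ExtractTranscript(string transcribeJson)
{
    using JsonDocument document = JsonDocument.Parse(transcribeJson);
    return document.RootElement
        .GetProperty("results")
        .GetProperty("transcripts")[0]
        .GetProperty("transcript")
        .GetString() ?? string.Empty;
}

Console.WriteLine(ExtractTranscript(json));
// prints: Hello and welcome.
```

For the downloaded file, pass File.ReadAllText($"{jobName}.json") instead of the sample string.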
Conclusion
It’s not very difficult to send a file for transcription and get the results back. In the next post, I’ll show how to parse the JSON when there is more than one speaker and produce a document that shows who said what, with timestamps.