diff --git a/6-consumer/lessons/1-speech-recognition/README.md b/6-consumer/lessons/1-speech-recognition/README.md index 6f6bc87..4bfa8ce 100644 --- a/6-consumer/lessons/1-speech-recognition/README.md +++ b/6-consumer/lessons/1-speech-recognition/README.md @@ -147,7 +147,9 @@ To avoid the complexity of training and using a wake word model, the smart timer ## Convert speech to text -Just like with image classification in the last project, there are pre-built AI services that can take speech as an audio file and convert it to text. Once such service is the Speech Service, part of the Cognitive Services, pre-built AI services you can use in your apps. +![Speech services logo](../../../images/azure-speech-logo.png) + +Just like with image classification in an earlier project, there are pre-built AI services that can take speech as an audio file and convert it to text. One such service is the Speech Service, part of Cognitive Services, a set of pre-built AI services you can use in your apps. ### Task - configure a speech AI resource diff --git a/6-consumer/lessons/1-speech-recognition/code-iot-hub/pi/smart-timer/app.py b/6-consumer/lessons/1-speech-recognition/code-iot-hub/pi/smart-timer/app.py index 809f35c..81c16e6 100644 --- a/6-consumer/lessons/1-speech-recognition/code-iot-hub/pi/smart-timer/app.py +++ b/6-consumer/lessons/1-speech-recognition/code-iot-hub/pi/smart-timer/app.py @@ -41,7 +41,7 @@ def capture_audio(): return wav_buffer -api_key = '' +speech_api_key = '' location = '' language = '' connection_string = '' @@ -54,7 +54,7 @@ print('Connected') def get_access_token(): headers = { - 'Ocp-Apim-Subscription-Key': api_key + 'Ocp-Apim-Subscription-Key': speech_api_key } token_endpoint = f'https://{location}.api.cognitive.microsoft.com/sts/v1.0/issuetoken' diff --git a/6-consumer/lessons/1-speech-recognition/code-iot-hub/virtual-iot-device/smart-timer/app.py b/6-consumer/lessons/1-speech-recognition/code-iot-hub/virtual-iot-device/smart-timer/app.py index ae111d1..2b2f2c2 100644 --- a/6-consumer/lessons/1-speech-recognition/code-iot-hub/virtual-iot-device/smart-timer/app.py +++ b/6-consumer/lessons/1-speech-recognition/code-iot-hub/virtual-iot-device/smart-timer/app.py @@ -3,7 +3,7 @@ import time from azure.cognitiveservices.speech import SpeechConfig, SpeechRecognizer from azure.iot.device import IoTHubDeviceClient, Message -api_key = '' +speech_api_key = '' location = '' language = '' connection_string = '' @@ -14,7 +14,7 @@ print('Connecting') device_client.connect() print('Connected') -recognizer_config = SpeechConfig(subscription=api_key, +recognizer_config = SpeechConfig(subscription=speech_api_key, region=location, speech_recognition_language=language) diff --git a/6-consumer/lessons/1-speech-recognition/code-speech-to-text/pi/smart-timer/app.py b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/pi/smart-timer/app.py index 64eb299..3a56b2f 100644 --- a/6-consumer/lessons/1-speech-recognition/code-speech-to-text/pi/smart-timer/app.py +++ b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/pi/smart-timer/app.py @@ -39,13 +39,13 @@ def capture_audio(): return wav_buffer -api_key = '' +speech_api_key = '' location = '' language = '' def get_access_token(): headers = { - 'Ocp-Apim-Subscription-Key': api_key + 'Ocp-Apim-Subscription-Key': speech_api_key } token_endpoint = f'https://{location}.api.cognitive.microsoft.com/sts/v1.0/issuetoken' diff --git a/6-consumer/lessons/1-speech-recognition/code-speech-to-text/virtual-iot-device/smart-timer/app.py
b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/virtual-iot-device/smart-timer/app.py index 355b9c2..6d282ad 100644 --- a/6-consumer/lessons/1-speech-recognition/code-speech-to-text/virtual-iot-device/smart-timer/app.py +++ b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/virtual-iot-device/smart-timer/app.py @@ -1,11 +1,11 @@ import time from azure.cognitiveservices.speech import SpeechConfig, SpeechRecognizer -api_key = '' +speech_api_key = '' location = '' language = '' -recognizer_config = SpeechConfig(subscription=api_key, +recognizer_config = SpeechConfig(subscription=speech_api_key, region=location, speech_recognition_language=language) diff --git a/6-consumer/lessons/1-speech-recognition/pi-speech-to-text.md b/6-consumer/lessons/1-speech-recognition/pi-speech-to-text.md index f5be1fc..5e9eac9 100644 --- a/6-consumer/lessons/1-speech-recognition/pi-speech-to-text.md +++ b/6-consumer/lessons/1-speech-recognition/pi-speech-to-text.md @@ -22,12 +22,12 @@ The audio can be sent to the speech service using the REST API. To use the speec 1. Add the following code above the `while True` loop to declare some settings for the speech service: ```python - api_key = '' + speech_api_key = '' location = '' language = '' ``` - Replace `<key>` with the API key for your speech service. Replace `<location>` with the location you used when you created the speech service resource. + Replace `<key>` with the API key for your speech service resource. Replace `<location>` with the location you used when you created the speech service resource. Replace `<language>` with the locale name for the language you will be speaking in, for example `en-GB` for English, or `zh-HK` for Cantonese. You can find a list of the supported languages and their locale names in the [Language and voice support documentation on Microsoft docs](https://docs.microsoft.com/azure/cognitive-services/speech-service/language-support?WT.mc_id=academic-17441-jabenn#speech-to-text). @@ -36,7 +36,7 @@ The audio can be sent to the speech service using the REST API. To use the speec ```python def get_access_token(): headers = { - 'Ocp-Apim-Subscription-Key': api_key + 'Ocp-Apim-Subscription-Key': speech_api_key } token_endpoint = f'https://{location}.api.cognitive.microsoft.com/sts/v1.0/issuetoken' diff --git a/6-consumer/lessons/1-speech-recognition/virtual-device-speech-to-text.md b/6-consumer/lessons/1-speech-recognition/virtual-device-speech-to-text.md index 02e29b8..8ec5b8a 100644 --- a/6-consumer/lessons/1-speech-recognition/virtual-device-speech-to-text.md +++ b/6-consumer/lessons/1-speech-recognition/virtual-device-speech-to-text.md @@ -41,11 +41,11 @@ On Windows, Linux, and macOS, the speech services Python SDK can be used to list 1.
Add the following code to declare some configuration: ```python - api_key = '' + speech_api_key = '' location = '' language = '' - recognizer_config = SpeechConfig(subscription=api_key, + recognizer_config = SpeechConfig(subscription=speech_api_key, region=location, speech_recognition_language=language) ``` diff --git a/6-consumer/lessons/3-spoken-feedback/code-spoken-response/pi/smart-timer/app.py b/6-consumer/lessons/3-spoken-feedback/code-spoken-response/pi/smart-timer/app.py index 40bce46..1b3daae 100644 --- a/6-consumer/lessons/3-spoken-feedback/code-spoken-response/pi/smart-timer/app.py +++ b/6-consumer/lessons/3-spoken-feedback/code-spoken-response/pi/smart-timer/app.py @@ -42,7 +42,7 @@ def capture_audio(): return wav_buffer -api_key = '' +speech_api_key = '' location = '' language = '' connection_string = '' @@ -55,7 +55,7 @@ print('Connected') def get_access_token(): headers = { - 'Ocp-Apim-Subscription-Key': api_key + 'Ocp-Apim-Subscription-Key': speech_api_key } token_endpoint = f'https://{location}.api.cognitive.microsoft.com/sts/v1.0/issuetoken' @@ -97,7 +97,7 @@ def get_voice(): return first_voice['ShortName'] voice = get_voice() -print(f"Using voice {voice}") +print(f'Using voice {voice}') playback_format = 'riff-48khz-16bit-mono-pcm' @@ -143,10 +143,10 @@ def say(text): def announce_timer(minutes, seconds): announcement = 'Times up on your ' if minutes > 0: - announcement += f'{minutes} minute' + announcement += f'{minutes} minute ' if seconds > 0: - announcement += f'{seconds} second' - announcement += ' timer.' + announcement += f'{seconds} second ' + announcement += 'timer.' say(announcement) def create_timer(total_seconds): @@ -154,10 +154,10 @@ def create_timer(total_seconds): threading.Timer(total_seconds, announce_timer, args=[minutes, seconds]).start() announcement = '' if minutes > 0: - announcement += f'{minutes} minute' + announcement += f'{minutes} minute ' if seconds > 0: - announcement += f'{seconds} second' - announcement += ' timer started.' + announcement += f'{seconds} second ' + announcement += 'timer started.' 
say(announcement) def handle_method_request(request): diff --git a/6-consumer/lessons/3-spoken-feedback/code-spoken-response/virtual-iot-device/smart-timer/app.py b/6-consumer/lessons/3-spoken-feedback/code-spoken-response/virtual-iot-device/smart-timer/app.py index cd1a8fe..510b7fb 100644 --- a/6-consumer/lessons/3-spoken-feedback/code-spoken-response/virtual-iot-device/smart-timer/app.py +++ b/6-consumer/lessons/3-spoken-feedback/code-spoken-response/virtual-iot-device/smart-timer/app.py @@ -4,7 +4,7 @@ import time from azure.cognitiveservices.speech import SpeechConfig, SpeechRecognizer, SpeechSynthesizer from azure.iot.device import IoTHubDeviceClient, Message, MethodResponse -api_key = '' +speech_api_key = '' location = '' language = '' connection_string = '' @@ -15,7 +15,7 @@ print('Connecting') device_client.connect() print('Connected') -recognizer_config = SpeechConfig(subscription=api_key, +recognizer_config = SpeechConfig(subscription=speech_api_key, region=location, speech_recognition_language=language) @@ -30,7 +30,7 @@ recognizer.recognized.connect(recognized) recognizer.start_continuous_recognition() -speech_config = SpeechConfig(subscription=api_key, +speech_config = SpeechConfig(subscription=speech_api_key, region=location) speech_config.speech_synthesis_language = language speech_synthesizer = SpeechSynthesizer(speech_config=speech_config) @@ -53,10 +53,10 @@ def say(text): def announce_timer(minutes, seconds): announcement = 'Times up on your ' if minutes > 0: - announcement += f'{minutes} minute' + announcement += f'{minutes} minute ' if seconds > 0: - announcement += f'{seconds} second' - announcement += ' timer.' + announcement += f'{seconds} second ' + announcement += 'timer.' say(announcement) def create_timer(total_seconds): @@ -64,10 +64,10 @@ def create_timer(total_seconds): threading.Timer(total_seconds, announce_timer, args=[minutes, seconds]).start() announcement = '' if minutes > 0: - announcement += f'{minutes} minute' + announcement += f'{minutes} minute ' if seconds > 0: - announcement += f'{seconds} second' - announcement += ' timer started.' + announcement += f'{seconds} second ' + announcement += 'timer started.' say(announcement) def handle_method_request(request): diff --git a/6-consumer/lessons/3-spoken-feedback/code-timer/pi/smart-timer/app.py b/6-consumer/lessons/3-spoken-feedback/code-timer/pi/smart-timer/app.py index 1a8a622..afef5b7 100644 --- a/6-consumer/lessons/3-spoken-feedback/code-timer/pi/smart-timer/app.py +++ b/6-consumer/lessons/3-spoken-feedback/code-timer/pi/smart-timer/app.py @@ -42,7 +42,7 @@ def capture_audio(): return wav_buffer -api_key = '' +speech_api_key = '' location = '' language = '' connection_string = '' @@ -55,7 +55,7 @@ print('Connected') def get_access_token(): headers = { - 'Ocp-Apim-Subscription-Key': api_key + 'Ocp-Apim-Subscription-Key': speech_api_key } token_endpoint = f'https://{location}.api.cognitive.microsoft.com/sts/v1.0/issuetoken' @@ -89,10 +89,10 @@ def say(text): def announce_timer(minutes, seconds): announcement = 'Times up on your ' if minutes > 0: - announcement += f'{minutes} minute' + announcement += f'{minutes} minute ' if seconds > 0: - announcement += f'{seconds} second' - announcement += ' timer.' + announcement += f'{seconds} second ' + announcement += 'timer.' 
say(announcement) def create_timer(total_seconds): @@ -100,10 +100,10 @@ def create_timer(total_seconds): threading.Timer(total_seconds, announce_timer, args=[minutes, seconds]).start() announcement = '' if minutes > 0: - announcement += f'{minutes} minute' + announcement += f'{minutes} minute ' if seconds > 0: - announcement += f'{seconds} second' - announcement += ' timer started.' + announcement += f'{seconds} second ' + announcement += 'timer started.' say(announcement) def handle_method_request(request): diff --git a/6-consumer/lessons/3-spoken-feedback/code-timer/virtual-iot-device/smart-timer/app.py b/6-consumer/lessons/3-spoken-feedback/code-timer/virtual-iot-device/smart-timer/app.py index f6f8ed0..8d45eaf 100644 --- a/6-consumer/lessons/3-spoken-feedback/code-timer/virtual-iot-device/smart-timer/app.py +++ b/6-consumer/lessons/3-spoken-feedback/code-timer/virtual-iot-device/smart-timer/app.py @@ -4,7 +4,7 @@ import time from azure.cognitiveservices.speech import SpeechConfig, SpeechRecognizer from azure.iot.device import IoTHubDeviceClient, Message, MethodResponse -api_key = '' +speech_api_key = '' location = '' language = '' connection_string = '' @@ -15,7 +15,7 @@ print('Connecting') device_client.connect() print('Connected') -recognizer_config = SpeechConfig(subscription=api_key, +recognizer_config = SpeechConfig(subscription=speech_api_key, region=location, speech_recognition_language=language) @@ -33,13 +33,13 @@ recognizer.start_continuous_recognition() def say(text): print(text) -def announce_timer(minutes, seconds): +def announce_timer(minutes, seconds): announcement = 'Times up on your ' if minutes > 0: - announcement += f'{minutes} minute' + announcement += f'{minutes} minute ' if seconds > 0: - announcement += f'{seconds} second' - announcement += ' timer.' + announcement += f'{seconds} second ' + announcement += 'timer.' say(announcement) def create_timer(total_seconds): @@ -47,10 +47,10 @@ def create_timer(total_seconds): threading.Timer(total_seconds, announce_timer, args=[minutes, seconds]).start() announcement = '' if minutes > 0: - announcement += f'{minutes} minute' + announcement += f'{minutes} minute ' if seconds > 0: - announcement += f'{seconds} second' - announcement += ' timer started.' + announcement += f'{seconds} second ' + announcement += 'timer started.' say(announcement) def handle_method_request(request): diff --git a/6-consumer/lessons/3-spoken-feedback/pi-text-to-speech.md b/6-consumer/lessons/3-spoken-feedback/pi-text-to-speech.md index 2c961e3..2447dc5 100644 --- a/6-consumer/lessons/3-spoken-feedback/pi-text-to-speech.md +++ b/6-consumer/lessons/3-spoken-feedback/pi-text-to-speech.md @@ -10,6 +10,8 @@ Each language supports a range of different voices, and you can make a REST requ ### Task - get a voice +1. Open the `smart-timer` project in VS Code. + 1. Add the following code above the `say` function to request the list of voices for a language: ```python @@ -27,7 +29,7 @@ Each language supports a range of different voices, and you can make a REST requ return first_voice['ShortName'] voice = get_voice() - print(f"Using voice {voice}") + print(f'Using voice {voice}') ``` This code defines a function called `get_voice` that uses the speech service to get a list of voices. It then finds the first voice that matches the language that is being used. 
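+    As a reference, the complete lookup can be written as the following minimal sketch. This assumes the `location` and `language` settings and the `get_access_token` function defined earlier in this lesson; the `Locale` and `ShortName` keys are the fields this code reads from the voices list JSON.
+
+    ```python
+    import requests
+
+    # location, language and get_access_token() are assumed to be
+    # defined earlier in the lesson's code
+    def get_voice():
+        url = f'https://{location}.tts.speech.microsoft.com/cognitiveservices/voices/list'
+
+        headers = {
+            'Authorization': 'Bearer ' + get_access_token()
+        }
+
+        # The response is a JSON array of voices, each with a 'Locale'
+        # (such as 'en-GB') and a 'ShortName'
+        voices = requests.get(url, headers=headers).json()
+
+        # Pick the first voice whose locale matches the configured language
+        first_voice = next(x for x in voices if x['Locale'].lower() == language.lower())
+        return first_voice['ShortName']
+    ```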
diff --git a/6-consumer/lessons/3-spoken-feedback/single-board-computer-set-timer.md b/6-consumer/lessons/3-spoken-feedback/single-board-computer-set-timer.md index efa3b9e..5f11422 100644 --- a/6-consumer/lessons/3-spoken-feedback/single-board-computer-set-timer.md +++ b/6-consumer/lessons/3-spoken-feedback/single-board-computer-set-timer.md @@ -31,10 +31,10 @@ Timers can be set using the Python `threading.Timer` class. This class takes a d def announce_timer(minutes, seconds): announcement = 'Times up on your ' if minutes > 0: - announcement += f'{minutes} minute' + announcement += f'{minutes} minute ' if seconds > 0: - announcement += f'{seconds} second' - announcement += ' timer.' + announcement += f'{seconds} second ' + announcement += 'timer.' say(announcement) ``` @@ -55,10 +55,10 @@ Timers can be set using the Python `threading.Timer` class. This class takes a d ```python announcement = '' if minutes > 0: - announcement += f'{minutes} minute' + announcement += f'{minutes} minute ' if seconds > 0: - announcement += f'{seconds} second' - announcement += ' timer started.' + announcement += f'{seconds} second ' + announcement += 'timer started.' say(announcement) ``` @@ -88,8 +88,8 @@ Timers can be set using the Python `threading.Timer` class. This class takes a d Connecting Connected Set a one minute 4 second timer. - 1 minute, 4 second timer started - Times up on your 1 minute, 4 second timer + 1 minute 4 second timer started. + Times up on your 1 minute 4 second timer. ``` > 💁 You can find this code in the [code-timer/pi](code-timer/pi) or [code-timer/virtual-iot-device](code-timer/virtual-iot-device) folder. diff --git a/6-consumer/lessons/3-spoken-feedback/virtual-device-text-to-speech.md b/6-consumer/lessons/3-spoken-feedback/virtual-device-text-to-speech.md index df71c4a..8c3ff68 100644 --- a/6-consumer/lessons/3-spoken-feedback/virtual-device-text-to-speech.md +++ b/6-consumer/lessons/3-spoken-feedback/virtual-device-text-to-speech.md @@ -10,6 +10,8 @@ Each language supports a range of different voices, and you can get the list of ### Task - convert text to speech +1. Open the `smart-timer` project in VS Code, and ensure the virtual environment is loaded in the terminal. + 1. Import the `SpeechSynthesizer` from the `azure.cognitiveservices.speech` package by adding it to the existing imports: ```python @@ -19,7 +21,7 @@ Each language supports a range of different voices, and you can get the list of 1. Above the `say` function, create a speech configuration to use with the speech synthesizer: ```python - speech_config = SpeechConfig(subscription=api_key, + speech_config = SpeechConfig(subscription=speech_api_key, region=location) speech_config.speech_synthesis_language = language speech_synthesizer = SpeechSynthesizer(speech_config=speech_config) diff --git a/6-consumer/lessons/4-multiple-language-support/README.md b/6-consumer/lessons/4-multiple-language-support/README.md index 6258d85..7a6bf1d 100644 --- a/6-consumer/lessons/4-multiple-language-support/README.md +++ b/6-consumer/lessons/4-multiple-language-support/README.md @@ -14,24 +14,157 @@ This video gives an overview of the Azure speech services, covering speech to te ## Introduction -In this lesson you will learn about +In the last 3 lessons you learned about converting speech to text, language understanding, and converting text to speech, all powered by AI. One other area of human communication that AI can help with is language translation - converting from one language to another, such as from English to French. 
+ +In this lesson you will learn about using AI to translate text, allowing your smart timer to interact with users in multiple languages. In this lesson we'll cover: -* [Thing 1](#thing-1) +* [Translate text](#translate-text) +* [Translation services](#translation-services) +* [Create a translator resource](#create-a-translator-resource) +* [Support multiple languages in applications with translations](#support-multiple-languages-in-applications-with-translations) +* [Translate text using an AI service](#translate-text-using-an-ai-service) + +## Translate text + +Text translation is a computer science problem that has been researched for over 70 years, and only now, thanks to advances in AI and computing power, is it close to being solved to a point where it is almost as good as human translators. + +> 💁 The origins can be traced back even further, to [Al-Kindi](https://wikipedia.org/wiki/Al-Kindi), a 9th century Arabic cryptographer who developed techniques for language translation. + +### Machine translations + +Text translation started out as a technology known as Machine Translation (MT), which can translate between different language pairs. MT works by substituting words in one language with another, adding techniques to select the correct ways of translating phrases or parts of sentences when a simple word-for-word translation doesn't make sense. + +> 🎓 When translators support translating between one language and another, these are known as *language pairs*. Different tools support different language pairs, and these may not be complete. For example, a translator may support English to Spanish as a language pair, and Spanish to Italian as a language pair, but not English to Italian. + +For example, translating "Hello world" from English into French can be performed with a substitution - "Bonjour" for "Hello", and "le monde" for "world", leading to the correct translation of "Bonjour le monde". + +Substitutions don't work when different languages use different ways of saying the same thing. For example, the English sentence "My name is Jim" translates into "Je m'appelle Jim" in French - literally "I call myself Jim". "Je" is French for "I", "me" means "myself" but is shortened to "m'" and joined to the verb because the verb starts with a vowel, "appelle" is a form of "appeler", to call, and "Jim" isn't translated as it's a name, and not a word that can be translated. Word ordering also becomes an issue - a simple substitution of "Je m'appelle Jim" becomes "I myself call Jim", with a different word order to English. + +> 💁 Some words are never translated - my name is Jim regardless of which language is used to introduce me. + +Idioms are also a problem for translation. These are phrases that have an understood meaning that is different from a direct interpretation of the words. For example, in English the idiom "I've got ants in my pants" does not literally refer to having ants in your clothing, but to being restless. If you translated this word for word into German, you would end up confusing the listener, as the German equivalent is "I have bumble bees in the bottom". + +> 💁 Different locales add different complexities. With the idiom "ants in your pants", in American English "pants" refers to outerwear, in British English, "pants" is underwear.
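+To see concretely why the word-for-word substitution described above breaks down, here is a minimal, purely illustrative Python sketch. It is a toy dictionary lookup, not how any production translation service works, and the tiny English to French dictionary is made up for this example.
+
+```python
+# A toy word-for-word substitution 'translator' - illustration only
+english_to_french = {
+    'hello': 'bonjour',
+    'world': 'le monde',
+    'my': 'mon',
+    'name': 'nom',
+    'is': 'est',
+}
+
+def substitute_translate(sentence):
+    # Translate each word on its own, keeping unknown words (such as names) unchanged
+    words = sentence.lower().rstrip('.').split()
+    return ' '.join(english_to_french.get(word, word) for word in words)
+
+print(substitute_translate('Hello world'))    # bonjour le monde - a correct translation
+print(substitute_translate('My name is Jim')) # mon nom est jim - not the idiomatic "Je m'appelle Jim"
+```
+
+Substitution gets "Hello world" right, but produces word-for-word French for "My name is Jim" instead of the idiomatic "Je m'appelle Jim", which is why real systems layer phrase rules and statistical methods on top of simple substitution.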
+ +✅ If you speak multiple languages, think of some phrases that don't directly translate. + +Machine translation systems rely on large databases of rules that describe how to translate certain phrases and idioms, along with statistical methods to pick the appropriate translations from possible options. These statistical methods use huge databases of works translated by humans into multiple languages to pick the most likely translation, a technique called *statistical machine translation*. A number of these use intermediate representations of the language, allowing one language to be translated to the intermediate, then from the intermediate to another language. This way, adding more languages involves translations to and from the intermediate only, not to and from all other languages. + +### Neural translations + +Neural translations involve using the power of AI to translate, typically translating entire sentences using one model. These models are trained on huge data sets that have been human translated, such as web pages, books and United Nations documentation. + +Neural translation models are usually smaller than machine translation models due to not needing huge databases of phrases and idioms. Modern AI services that provide translations often combine multiple techniques, mixing statistical machine translation and neural translation. + +There is no 1:1 translation for any language pair. Different translation models will produce slightly different results depending on the data used to train the model. Translations are not always symmetrical - if you translate a sentence from one language to another, then back to the first language, you may see a slightly different sentence as the result. + +✅ Try out different online translators such as [Bing Translate](https://www.bing.com/translator), [Google Translate](https://translate.google.com), or the Apple translate app. Compare the translated versions of a few sentences. Also try translating a sentence with one, then translating the result back with another. + +## Translation services + +There are a number of AI services that can be used from your applications to translate speech and text. + +### Cognitive services Speech service + +![The speech service logo](../../../images/azure-speech-logo.png) + +The speech service you've been using over the past few lessons has translation capabilities for speech recognition. When you recognize speech, you can request not only the text of the speech in the same language, but also in other languages. + +> 💁 This is only available from the speech SDK; the REST API doesn't have translations built in. + +### Cognitive services Translator service + +![The translator service logo](../../../images/azure-translator-logo.png) + +The Translator service is a dedicated translation service that can translate text from one language to one or more target languages. As well as translating, it supports a wide range of extra features including masking profanity. It also allows you to provide a specific translation for a particular word or sentence, to work with terms you don't want translated, or that have a specific well-known translation. + +For example, when translating the sentence "I have a Raspberry Pi", referring to the single-board computer, into another language such as French, you would want to keep the name "Raspberry Pi" as is, and not translate it, giving "J’ai un Raspberry Pi" instead of "J’ai une pi aux framboises". -## Thing 1 +## Create a translator resource + +For this lesson you will need a Translator resource.
You will use the REST API to translate text. + +### Task - create a translator resource + +1. From your terminal or command prompt, run the following command to create a translator resource in your `smart-timer` resource group. + + ```sh + az cognitiveservices account create --name smart-timer-translator \ + --resource-group smart-timer \ + --kind TextTranslation \ + --sku F0 \ + --yes \ + --location <location> + ``` + + Replace `<location>` with the location you used when creating the Resource Group. + +1. Get the key for the translator service: + + ```sh + az cognitiveservices account keys list --name smart-timer-translator \ + --resource-group smart-timer \ + --output table + ``` + + Take a copy of one of the keys. + +## Support multiple languages in applications with translations + +In an ideal world, your whole application should understand as many different languages as possible, from listening for speech, to language understanding, to responding with speech. This is a lot of work, so translation services can speed up the time to delivery of your application. + +![A smart timer architecture translating Japanese to English, processing in English then translating back to Japanese](../../../images/translated-smart-timer.png) + +***A smart timer architecture translating Japanese to English, processing in English then translating back to Japanese. Microcontroller by Template / recording by Aybige / Speaker by Gregor Cresnar - all from the [Noun Project](https://thenounproject.com)*** + +Imagine you are building a smart timer that uses English end-to-end, understanding spoken English and converting that to text, running the language understanding in English, building up responses in English and replying with English speech. If you wanted to add support for Japanese, you could start by translating spoken Japanese to English text, keep the core of the application the same, then translate the response text to Japanese before speaking the response. This would allow you to quickly add Japanese support, and you can expand to providing full end-to-end Japanese support later. + +> 💁 The downside to relying on machine translation is that different languages and cultures have different ways of saying the same things, so the translation may not match the expression you are expecting. + +Machine translations also open up possibilities for apps and devices that can translate user-created content as it is created. Science fiction regularly features 'universal translators', devices that can translate from alien languages into (typically) American English. These devices are less science fiction, more science fact, if you ignore the alien part. There are already apps and devices that provide real-time translation of speech and written text, using combinations of speech and translation services. + +One example is the [Microsoft Translator](https://www.microsoft.com/translator/apps/?WT.mc_id=academic-17441-jabenn) mobile phone app, demonstrated in this video: + +[![Microsoft Translator live feature in action](https://img.youtube.com/vi/16yAGeP2FuM/0.jpg)](https://www.youtube.com/watch?v=16yAGeP2FuM) + +> 🎥 Click the image above to watch a video + +Imagine having such a device available to you, especially when travelling or interacting with folks whose language you don't know. Having automatic translation devices in airports or hospitals would provide much needed accessibility improvements. + +✅ Do some research: Are there any translation IoT devices commercially available?
What about translation capabilities built into smart devices? + +> 👽 Although there are no true universal translators that allow us to talk to aliens, the [Microsoft translator does support Klingon](https://www.microsoft.com/translator/blog/2013/05/14/announcing-klingon-for-bing-translator/?WT.mc_id=academic-17441-jabenn). Qapla’! + +## Translate text using an AI service + +You can use an AI service to add this translation capability to your smart timer. + +### Task - translate text using an AI service + +Work through the relevant guide to translate text on your IoT device: + +* [Arduino - Wio Terminal](wio-terminal-translate-speech.md) +* [Single-board computer - Raspberry Pi](pi-translate-speech.md) +* [Single-board computer - Virtual device](virtual-device-translate-speech.md) --- ## 🚀 Challenge +How can machine translations benefit other IoT applications beyond smart devices? Think of different ways translations can help, not just with spoken words but with text. + ## Post-lecture quiz [Post-lecture quiz](https://brave-island-0b7c7f50f.azurestaticapps.net/quiz/48) ## Review & Self Study +* Read more on machine translation on the [Machine translation page on Wikipedia](https://wikipedia.org/wiki/Machine_translation) +* Read more on neural machine translation on the [Neural machine translation page on Wikipedia](https://wikipedia.org/wiki/Neural_machine_translation) +* Check out the list of supported languages for the Microsoft speech services in the [Language and voice support for the Speech service documentation on Microsoft Docs](https://docs.microsoft.com/azure/cognitive-services/speech-service/language-support?WT.mc_id=academic-17441-jabenn) ## Assignment -[](assignment.md) +[Build a universal translator](assignment.md) diff --git a/6-consumer/lessons/4-multiple-language-support/assignment.md b/6-consumer/lessons/4-multiple-language-support/assignment.md index da157d5..ca22074 100644 --- a/6-consumer/lessons/4-multiple-language-support/assignment.md +++ b/6-consumer/lessons/4-multiple-language-support/assignment.md @@ -1,9 +1,17 @@ -# +# Build a universal translator ## Instructions +A universal translator is a device that can translate between multiple languages, allowing folks who speak different languages to communicate. Use what you have learned over the past few lessons to build a universal translator using 2 IoT devices. + +> If you do not have 2 devices, follow the steps in the previous few lessons to set up a virtual IoT device as one of the IoT devices. + +You should configure one device for one language, and one for another. Each device should accept speech, convert it to text, send it to the other device via IoT Hub and a Functions app, then translate it and play the translated speech. + +> 💁 Tip: When sending the speech from one device to another, send the language it is in as well, making it easier to translate, as sketched below. You could even have each device register using IoT Hub and a Functions app first, passing the language they support to be stored in Azure Storage. You could then use a Functions app to do the translations, sending the translated text to the IoT device.
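+As a sketch of that tip, the message sent between devices could carry the sender's language alongside the recognized text. This builds on the `Message` code used in the lessons; the exact payload shape is an illustrative assumption, not a required format, and `text`, `language` and `device_client` are assumed to be set up as in the lesson code.
+
+```python
+import json
+from azure.iot.device import Message
+
+# Include the sender's locale so the receiving side knows
+# which language pair to translate between
+message = Message(json.dumps({
+    'speech': text,       # the recognized speech as text
+    'language': language  # the sender's locale, for example 'fr-FR'
+}))
+device_client.send_message(message)
+```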
+ ## Rubric | Criteria | Exemplary | Adequate | Needs Improvement | | -------- | --------- | -------- | ----------------- | -| | | | | +| Create a universal translator | Was able to build a universal translator, converting speech detected by one device into speech played by another device in a different language | Was able to get some components working, such as capturing speech, or translating, but was unable to build the end to end solution | Was unable to build any parts of a working universal translator | diff --git a/6-consumer/lessons/4-multiple-language-support/code/pi/smart-timer/app.py b/6-consumer/lessons/4-multiple-language-support/code/pi/smart-timer/app.py new file mode 100644 index 0000000..2ba641a --- /dev/null +++ b/6-consumer/lessons/4-multiple-language-support/code/pi/smart-timer/app.py @@ -0,0 +1,212 @@ +import io +import json +import pyaudio +import requests +import time +import wave +import threading + +from azure.iot.device import IoTHubDeviceClient, Message, MethodResponse + +from grove.factory import Factory +button = Factory.getButton('GPIO-HIGH', 5) + +audio = pyaudio.PyAudio() +microphone_card_number = 1 +speaker_card_number = 1 +rate = 16000 + +def capture_audio(): + stream = audio.open(format = pyaudio.paInt16, + rate = rate, + channels = 1, + input_device_index = microphone_card_number, + input = True, + frames_per_buffer = 4096) + + frames = [] + + while button.is_pressed(): + frames.append(stream.read(4096)) + + stream.stop_stream() + stream.close() + + wav_buffer = io.BytesIO() + with wave.open(wav_buffer, 'wb') as wavefile: + wavefile.setnchannels(1) + wavefile.setsampwidth(audio.get_sample_size(pyaudio.paInt16)) + wavefile.setframerate(rate) + wavefile.writeframes(b''.join(frames)) + wav_buffer.seek(0) + + return wav_buffer + +speech_api_key = '' +translator_api_key = '' +location = '' +language = '' +server_language = '' +connection_string = '' + +device_client = IoTHubDeviceClient.create_from_connection_string(connection_string) + +print('Connecting') +device_client.connect() +print('Connected') + +def get_access_token(): + headers = { + 'Ocp-Apim-Subscription-Key': speech_api_key + } + + token_endpoint = f'https://{location}.api.cognitive.microsoft.com/sts/v1.0/issuetoken' + response = requests.post(token_endpoint, headers=headers) + return str(response.text) + +def convert_speech_to_text(buffer): + url = f'https://{location}.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1' + + headers = { + 'Authorization': 'Bearer ' + get_access_token(), + 'Content-Type': f'audio/wav; codecs=audio/pcm; samplerate={rate}', + 'Accept': 'application/json;text/xml' + } + + params = { + 'language': language + } + + response = requests.post(url, headers=headers, params=params, data=buffer) + response_json = json.loads(response.text) + + if response_json['RecognitionStatus'] == 'Success': + return response_json['DisplayText'] + else: + return '' + +def translate_text(text, from_language, to_language): + url = f'https://api.cognitive.microsofttranslator.com/translate?api-version=3.0' + + headers = { + 'Ocp-Apim-Subscription-Key': translator_api_key, + 'Ocp-Apim-Subscription-Region': location, + 'Content-type': 'application/json' + } + + params = { + 'from': from_language, + 'to': to_language + } + + body = [{ + 'text' : text + }] + + response = requests.post(url, headers=headers, params=params, json=body) + return response.json()[0]['translations'][0]['text'] + +def get_voice(): + url = 
f'https://{location}.tts.speech.microsoft.com/cognitiveservices/voices/list' + + headers = { + 'Authorization': 'Bearer ' + get_access_token() + } + + response = requests.get(url, headers=headers) + voices_json = json.loads(response.text) + + first_voice = next(x for x in voices_json if x['Locale'].lower() == language.lower()) + return first_voice['ShortName'] + +voice = get_voice() +print(f'Using voice {voice}') + +playback_format = 'riff-48khz-16bit-mono-pcm' + +def get_speech(text): + url = f'https://{location}.tts.speech.microsoft.com/cognitiveservices/v1' + + headers = { + 'Authorization': 'Bearer ' + get_access_token(), + 'Content-Type': 'application/ssml+xml', + 'X-Microsoft-OutputFormat': playback_format + } + + ssml = f'<speak version=\'1.0\' xml:lang=\'{language}\'>' + ssml += f'<voice xml:lang=\'{language}\' name=\'{voice}\'>' + ssml += text + ssml += '</voice>' + ssml += '</speak>' + + response = requests.post(url, headers=headers, data=ssml.encode('utf-8')) + return io.BytesIO(response.content) + +def play_speech(speech): + with wave.open(speech, 'rb') as wave_file: + stream = audio.open(format=audio.get_format_from_width(wave_file.getsampwidth()), + channels=wave_file.getnchannels(), + rate=wave_file.getframerate(), + output_device_index=speaker_card_number, + output=True) + + data = wave_file.readframes(4096) + + while len(data) > 0: + stream.write(data) + data = wave_file.readframes(4096) + + stream.stop_stream() + stream.close() + +def say(text): + print('Original:', text) + text = translate_text(text, server_language, language) + print('Translated:', text) + speech = get_speech(text) + play_speech(speech) + +def announce_timer(minutes, seconds): + announcement = 'Times up on your ' + if minutes > 0: + announcement += f'{minutes} minute ' + if seconds > 0: + announcement += f'{seconds} second ' + announcement += 'timer.' + say(announcement) + +def create_timer(total_seconds): + minutes, seconds = divmod(total_seconds, 60) + threading.Timer(total_seconds, announce_timer, args=[minutes, seconds]).start() + announcement = '' + if minutes > 0: + announcement += f'{minutes} minute ' + if seconds > 0: + announcement += f'{seconds} second ' + announcement += 'timer started.'
+ say(announcement) + +def handle_method_request(request): + payload = json.loads(request.payload) + seconds = payload['seconds'] + if seconds > 0: + create_timer(payload['seconds']) + + method_response = MethodResponse.create_from_method_request(request, 200) + device_client.send_method_response(method_response) + +device_client.on_method_request_received = handle_method_request + +while True: + while not button.is_pressed(): + time.sleep(.1) + + buffer = capture_audio() + text = convert_speech_to_text(buffer) + if len(text) > 0: + print('Original:', text) + text = translate_text(text, language, server_language) + print('Translated:', text) + + message = Message(json.dumps({ 'speech': text })) + device_client.send_message(message) \ No newline at end of file diff --git a/6-consumer/lessons/4-multiple-language-support/code/virtual-iot-device/smart-timer/app.py b/6-consumer/lessons/4-multiple-language-support/code/virtual-iot-device/smart-timer/app.py new file mode 100644 index 0000000..7eb08e3 --- /dev/null +++ b/6-consumer/lessons/4-multiple-language-support/code/virtual-iot-device/smart-timer/app.py @@ -0,0 +1,124 @@ +import json +import requests +import threading +import time +from azure.cognitiveservices import speech +from azure.cognitiveservices.speech import SpeechConfig, SpeechRecognizer, SpeechSynthesizer +from azure.cognitiveservices.speech.translation import SpeechTranslationConfig, TranslationRecognizer +from azure.iot.device import IoTHubDeviceClient, Message, MethodResponse + +speech_api_key = '' +translator_api_key = '' +location = '' +language = '' +server_language = '' +connection_string = '' + +device_client = IoTHubDeviceClient.create_from_connection_string(connection_string) + +print('Connecting') +device_client.connect() +print('Connected') + +translation_config = SpeechTranslationConfig(subscription=speech_api_key, + region=location, + speech_recognition_language=language, + target_languages=(language, server_language)) + +recognizer = TranslationRecognizer(translation_config=translation_config) + +def recognized(args): + if args.result.reason == speech.ResultReason.TranslatedSpeech: + language_match = next(l for l in args.result.translations if server_language.lower().startswith(l.lower())) + text = args.result.translations[language_match] + + if (len(text) > 0): + print(f'Translated text: {text}') + + message = Message(json.dumps({ 'speech': text })) + device_client.send_message(message) + +recognizer.recognized.connect(recognized) + +recognizer.start_continuous_recognition() + +speech_config = SpeechTranslationConfig(subscription=speech_api_key, + region=location) +speech_config.speech_synthesis_language = language +speech_synthesizer = SpeechSynthesizer(speech_config=speech_config) + +voices = speech_synthesizer.get_voices_async().get().voices +first_voice = next(x for x in voices if x.locale.lower() == language.lower()) +speech_config.speech_synthesis_voice_name = first_voice.short_name + +def translate_text(text): + url = f'https://api.cognitive.microsofttranslator.com/translate?api-version=3.0' + + headers = { + 'Ocp-Apim-Subscription-Key': translator_api_key, + 'Ocp-Apim-Subscription-Region': location, + 'Content-type': 'application/json' + } + + params = { + 'from': server_language, + 'to': language + } + + body = [{ + 'text' : text + }] + + response = requests.post(url, headers=headers, params=params, json=body) + + return response.json()[0]['translations'][0]['text'] + +def say(text): + print('Original:', text) + text = translate_text(text) + 
print('Translated:', text) + + ssml = f'<speak version=\'1.0\' xml:lang=\'{language}\'>' + ssml += f'<voice xml:lang=\'{language}\' name=\'{first_voice.short_name}\'>' + ssml += text + ssml += '</voice>' + ssml += '</speak>' + + recognizer.stop_continuous_recognition() + speech_synthesizer.speak_ssml(ssml) + recognizer.start_continuous_recognition() + +def announce_timer(minutes, seconds): + announcement = 'Times up on your ' + if minutes > 0: + announcement += f'{minutes} minute ' + if seconds > 0: + announcement += f'{seconds} second ' + announcement += 'timer.' + say(announcement) + +def create_timer(total_seconds): + minutes, seconds = divmod(total_seconds, 60) + threading.Timer(total_seconds, announce_timer, args=[minutes, seconds]).start() + announcement = '' + if minutes > 0: + announcement += f'{minutes} minute ' + if seconds > 0: + announcement += f'{seconds} second ' + announcement += 'timer started.' + say(announcement) + +def handle_method_request(request): + if request.name == 'set-timer': + payload = json.loads(request.payload) + seconds = payload['seconds'] + if seconds > 0: + create_timer(payload['seconds']) + + method_response = MethodResponse.create_from_method_request(request, 200) + device_client.send_method_response(method_response) + +device_client.on_method_request_received = handle_method_request + +while True: + time.sleep(1) \ No newline at end of file diff --git a/6-consumer/lessons/4-multiple-language-support/pi-translate-speech.md b/6-consumer/lessons/4-multiple-language-support/pi-translate-speech.md new file mode 100644 index 0000000..5130597 --- /dev/null +++ b/6-consumer/lessons/4-multiple-language-support/pi-translate-speech.md @@ -0,0 +1,150 @@ +# Translate speech - Raspberry Pi + +In this part of the lesson, you will write code to translate text using the translator service. + +## Translate text using the translator service + +The speech service REST API doesn't support direct translations. Instead, you can use the Translator service to translate the text generated by the speech to text service, and the text of the spoken response. This service has a REST API you can use to translate the text. + +### Task - use the translator resource to translate text + +1. Your smart timer will have 2 languages set - the language of the server that was used to train LUIS, and the language spoken by the user. Update the `language` variable to be the language that will be spoken by the user, and add a new variable called `server_language` for the language used to train LUIS: + + ```python + language = '' + server_language = '' + ``` + + Replace `<language>` with the locale name for the language you will be speaking in, for example `fr-FR` for French, or `zh-HK` for Cantonese. + + Replace `<server language>` with the locale name for the language used to train LUIS. + + You can find a list of the supported languages and their locale names in the [Language and voice support documentation on Microsoft docs](https://docs.microsoft.com/azure/cognitive-services/speech-service/language-support?WT.mc_id=academic-17441-jabenn#speech-to-text). + + > 💁 If you don't speak multiple languages you can use a service like [Bing Translate](https://www.bing.com/translator) or [Google Translate](https://translate.google.com) to translate from your preferred language to a language of your choice. These services can then play audio of the translated text.
> + > For example, if you train LUIS in English, but want to use French as the user language, you can translate sentences like "set a 2 minute and 27 second timer" from English into French using Bing Translate, then use the **Listen translation** button to speak the translation into your microphone. + > + > ![The listen translation button on Bing translate](../../../images/bing-translate.png) + +1. Add the translator API key below the `speech_api_key`: + + ```python + translator_api_key = '' + ``` + + Replace `<key>` with the API key for your translator service resource. + +1. Above the `say` function, define a `translate_text` function that will translate text from the server language to the user language: + + ```python + def translate_text(text, from_language, to_language): + ``` + + The from and to languages are passed to this function - your app needs to convert from user language to server language when recognizing speech, and from server language to user language when providing spoken feedback. + +1. Inside this function, define the URL and headers for the REST API call: + + ```python + url = f'https://api.cognitive.microsofttranslator.com/translate?api-version=3.0' + + headers = { + 'Ocp-Apim-Subscription-Key': translator_api_key, + 'Ocp-Apim-Subscription-Region': location, + 'Content-type': 'application/json' + } + ``` + + The URL for this API is not location specific; instead, the location is passed in as a header. The API key is used directly, so unlike the speech service there is no need to get an access token from the token issuer API. + +1. Below this, define the parameters and body for the call: + + ```python + params = { + 'from': from_language, + 'to': to_language + } + + body = [{ + 'text' : text + }] + ``` + + The `params` defines the parameters to pass to the API call, passing the from and to languages. This call will translate text in the `from` language into the `to` language. + + The `body` contains the text to translate. This is an array, as multiple blocks of text can be translated in the same call. + +1. Make the call to the REST API, and get the response: + + ```python + response = requests.post(url, headers=headers, params=params, json=body) + ``` + + The response that comes back is a JSON array, with one item that contains the translations. This item has an array for translations of all the items passed in the body. + + ```json + [ + { + "translations": [ + { + "text": "Set a 2 minute 27 second timer.", + "to": "en" + } + ] + } + ] + ``` + +1. Return the `text` property from the first translation of the first item in the array: + + ```python + return response.json()[0]['translations'][0]['text'] + ``` + +1. Update the `while True` loop to translate the text from the call to `convert_speech_to_text` from the user language to the server language: + + ```python + if len(text) > 0: + print('Original:', text) + text = translate_text(text, language, server_language) + print('Translated:', text) + + message = Message(json.dumps({ 'speech': text })) + device_client.send_message(message) + ``` + + This code also prints the original and translated versions of the text to the console. + +1. Update the `say` function to translate the text to say from the server language to the user language: + + ```python + def say(text): + print('Original:', text) + text = translate_text(text, server_language, language) + print('Translated:', text) + speech = get_speech(text) + play_speech(speech) + ``` + + This code also prints the original and translated versions of the text to the console.
+1. Run your code. Ensure your function app is running, and request a timer in the user language, either by speaking that language yourself, or using a translation app. + + ```output + pi@raspberrypi:~/smart-timer $ python3 app.py + Connecting + Connected + Using voice fr-FR-DeniseNeural + Original: Définir une minuterie de 2 minutes et 27 secondes. + Translated: Set a timer of 2 minutes and 27 seconds. + Original: 2 minute 27 second timer started. + Translated: 2 minute 27 seconde minute a commencé. + Original: Times up on your 2 minute 27 second timer. + Translated: Chronométrant votre minuterie de 2 minutes 27 secondes. + ``` + + > 💁 Due to the different ways of saying something in different languages, you may get translations that are slightly different to the examples you gave LUIS. If this is the case, add more examples to LUIS, retrain, then re-publish the model. + +> 💁 You can find this code in the [code/pi](code/pi) folder. + +😀 Your multi-lingual timer program was a success! diff --git a/6-consumer/lessons/4-multiple-language-support/virtual-device-translate-speech.md b/6-consumer/lessons/4-multiple-language-support/virtual-device-translate-speech.md new file mode 100644 index 0000000..837a091 --- /dev/null +++ b/6-consumer/lessons/4-multiple-language-support/virtual-device-translate-speech.md @@ -0,0 +1,190 @@ +# Translate speech - Virtual IoT Device + +In this part of the lesson, you will write code to translate speech when converting to text using the speech service, then translate text using the Translator service before generating a spoken response. + +## Use the speech service to translate speech + +The speech service can take speech and not only convert to text in the same language, but also translate the output to other languages. + +### Task - use the speech service to translate speech + +1. Open the `smart-timer` project in VS Code, and ensure the virtual environment is loaded in the terminal. + +1. Add the following import statements below the existing imports: + + ```python + from azure.cognitiveservices import speech + from azure.cognitiveservices.speech.translation import SpeechTranslationConfig, TranslationRecognizer + import requests + ``` + + This imports classes used to translate speech, and the `requests` library that will be used to make a call to the Translator service later in this lesson. + +1. Your smart timer will have 2 languages set - the language of the server that was used to train LUIS, and the language spoken by the user. Update the `language` variable to be the language that will be spoken by the user, and add a new variable called `server_language` for the language used to train LUIS: + + ```python + language = '' + server_language = '' + ``` + + Replace `<language>` with the locale name for the language you will be speaking in, for example `fr-FR` for French, or `zh-HK` for Cantonese. + + Replace `<server language>` with the locale name for the language used to train LUIS. + + You can find a list of the supported languages and their locale names in the [Language and voice support documentation on Microsoft docs](https://docs.microsoft.com/azure/cognitive-services/speech-service/language-support?WT.mc_id=academic-17441-jabenn#speech-to-text). + + > 💁 If you don't speak multiple languages you can use a service like [Bing Translate](https://www.bing.com/translator) or [Google Translate](https://translate.google.com) to translate from your preferred language to a language of your choice. These services can then play audio of the translated text.
Be aware that the speech recognizer will ignore some audio output from your device, so you may need to use an additional device to play the translated text. + > + > For example, if you train LUIS in English, but want to use French as the user language, you can translate sentences like "set a 2 minute and 27 second timer" from English into French using Bing Translate, then use the **Listen translation** button to speak the translation into your microphone. + > + > ![The listen translation button on Bing translate](../../../images/bing-translate.png) + +1. Replace the `recognizer_config` and `recognizer` declarations with the following: + + ```python + translation_config = SpeechTranslationConfig(subscription=speech_api_key, + region=location, + speech_recognition_language=language, + target_languages=(language, server_language)) + + recognizer = TranslationRecognizer(translation_config=translation_config) + ``` + + This creates a translation config to recognize speech in the user language, and to create translations in the user and server languages. It then uses this config to create a translation recognizer - a speech recognizer that can translate the output of the speech recognition into multiple languages. + + > 💁 The original language needs to be specified in the `target_languages`, otherwise you won't get any translations. + +1. Update the `recognized` function, replacing the entire contents of the function with the following: + + ```python + if args.result.reason == speech.ResultReason.TranslatedSpeech: + language_match = next(l for l in args.result.translations if server_language.lower().startswith(l.lower())) + text = args.result.translations[language_match] + if (len(text) > 0): + print(f'Translated text: {text}') + + message = Message(json.dumps({ 'speech': text })) + device_client.send_message(message) + ``` + + This code checks to see if the recognized event was fired because speech was translated (this event can fire at other times, such as when the speech is recognized but not translated). If the speech was translated, it finds the translation in the `args.result.translations` dictionary that matches the server language. + + The `args.result.translations` dictionary is keyed off the language part of the locale setting, not the whole setting. For example, if you request a translation into `fr-FR` for French, the dictionary will contain an entry for `fr`, not `fr-FR`. + + The translated text is then sent to the IoT Hub. + +1. Run this code to test the translations. Ensure your function app is running, and request a timer in the user language, either by speaking that language yourself, or using a translation app. + + ```output + (.venv) ➜ smart-timer python app.py + Connecting + Connected + Translated text: Set a timer of 2 minutes and 27 seconds. + ``` + +## Translate text using the translator service + +The speech service doesn't support translation of text back to speech. Instead, you can use the Translator service to translate the text. This service has a REST API you can use to translate the text. + +### Task - use the translator resource to translate text + +1. Add the translator API key below the `speech_api_key`: + + ```python + translator_api_key = '' + ``` + + Replace `<key>` with the API key for your translator service resource. + +1. Above the `say` function, define a `translate_text` function that will translate text from the server language to the user language: + + ```python + def translate_text(text): + ``` + +1.
Inside this function, define the URL and headers for the REST API call: + + ```python + url = f'https://api.cognitive.microsofttranslator.com/translate?api-version=3.0' + + headers = { + 'Ocp-Apim-Subscription-Key': translator_api_key, + 'Ocp-Apim-Subscription-Region': location, + 'Content-type': 'application/json' + } + ``` + + The URL for this API is not location specific; instead, the location is passed in as a header. The API key is used directly, so unlike the speech service there is no need to get an access token from the token issuer API. + +1. Below this, define the parameters and body for the call: + + ```python + params = { + 'from': server_language, + 'to': language + } + + body = [{ + 'text' : text + }] + ``` + + The `params` defines the parameters to pass to the API call, passing the from and to languages. This call will translate text in the `from` language into the `to` language. + + The `body` contains the text to translate. This is an array, as multiple blocks of text can be translated in the same call. + +1. Make the call to the REST API, and get the response: + + ```python + response = requests.post(url, headers=headers, params=params, json=body) + ``` + + The response that comes back is a JSON array, with one item that contains the translations. This item has an array for translations of all the items passed in the body. + + ```json + [ + { + "translations": [ + { + "text": "Chronométrant votre minuterie de 2 minutes 27 secondes.", + "to": "fr" + } + ] + } + ] + ``` + +1. Return the `text` property from the first translation of the first item in the array: + + ```python + return response.json()[0]['translations'][0]['text'] + ``` + +1. Update the `say` function to translate the text to say before the SSML is generated: + + ```python + print('Original:', text) + text = translate_text(text) + print('Translated:', text) + ``` + + This code also prints the original and translated versions of the text to the console. + +1. Run your code. Ensure your function app is running, and request a timer in the user language, either by speaking that language yourself, or using a translation app. + + ```output + (.venv) ➜ smart-timer python app.py + Connecting + Connected + Translated text: Set a timer of 2 minutes and 27 seconds. + Original: 2 minute 27 second timer started. + Translated: 2 minute 27 seconde minute a commencé. + Original: Times up on your 2 minute 27 second timer. + Translated: Chronométrant votre minuterie de 2 minutes 27 secondes. + ``` + + > 💁 Due to the different ways of saying something in different languages, you may get translations that are slightly different to the examples you gave LUIS. If this is the case, add more examples to LUIS, retrain, then re-publish the model. + +> 💁 You can find this code in the [code/virtual-iot-device](code/virtual-iot-device) folder. + +😀 Your multi-lingual timer program was a success! diff --git a/6-consumer/lessons/4-multiple-language-support/wio-terminal-translate-speech.md b/6-consumer/lessons/4-multiple-language-support/wio-terminal-translate-speech.md new file mode 100644 index 0000000..7d25fa0 --- /dev/null +++ b/6-consumer/lessons/4-multiple-language-support/wio-terminal-translate-speech.md @@ -0,0 +1,3 @@ +# Translate speech - Wio Terminal + +Coming soon! diff --git a/README.md b/README.md index 8640308..4b890e2 100644 --- a/README.md +++ b/README.md @@ -22,7 +22,7 @@ The projects cover the journey of food from farm to table.
This includes farming **Hearty thanks to our authors [Jen Fox](https://github.com/jenfoxbot), [Jen Looper](https://github.com/jlooper), [Jim Bennett](https://github.com/jimbobbennett), and our sketchnote artist [Nitya Narasimhan](https://github.com/nitya).** -**Thanks as well to our team of [Microsoft Learn Student Ambassadors](https://studentambassadors.microsoft.com?WT.mc_id=academic-17441-jabenn) who have been reviewing and translating this curriculum - [Bhavesh Suneja](https://github.com/EliteWarrior315), [Lateefah Bello](https://www.linkedin.com/in/lateefah-bello/), [Manvi Jha](https://github.com/Severus-Matthew), [Mireille Tan](https://www.linkedin.com/in/mireille-tan-a4834819a/), [Mohammad Iftekher (Iftu) Ebne Jalal](https://github.com/Iftu119), [Priyanshu Srivastav](https://www.linkedin.com/in/priyanshu-srivastav-b067241ba), and [Zina Kamel](https://www.linkedin.com/in/zina-kamel/).** +**Thanks as well to our team of [Microsoft Learn Student Ambassadors](https://studentambassadors.microsoft.com?WT.mc_id=academic-17441-jabenn) who have been reviewing and translating this curriculum - [Aditya Garg](https://github.com/AdityaGarg00), [Aryan Jain](https://www.linkedin.com/in/aryan-jain-47a4a1145/), [Bhavesh Suneja](https://github.com/EliteWarrior315), [Lateefah Bello](https://www.linkedin.com/in/lateefah-bello/), [Manvi Jha](https://github.com/Severus-Matthew), [Mireille Tan](https://www.linkedin.com/in/mireille-tan-a4834819a/), [Mohammad Iftekher (Iftu) Ebne Jalal](https://github.com/Iftu119), [Priyanshu Srivastav](https://www.linkedin.com/in/priyanshu-srivastav-b067241ba), [Thanmai Gowducheruvu](https://github.com/innovation-platform), and [Zina Kamel](https://www.linkedin.com/in/zina-kamel/).** > **Teachers**, we have [included some suggestions](for-teachers.md) on how to use this curriculum. If you would like to create your own lessons, we have also included a [lesson template](lesson-template/README.md). diff --git a/images/Diagrams.sketch b/images/Diagrams.sketch index 468d852..ee1db29 100644 Binary files a/images/Diagrams.sketch and b/images/Diagrams.sketch differ diff --git a/images/azure-speech-logo.png b/images/azure-speech-logo.png new file mode 100644 index 0000000..0b37c36 Binary files /dev/null and b/images/azure-speech-logo.png differ diff --git a/images/azure-translator-logo.png b/images/azure-translator-logo.png new file mode 100644 index 0000000..a3dc18d Binary files /dev/null and b/images/azure-translator-logo.png differ diff --git a/images/bing-translate.png b/images/bing-translate.png new file mode 100644 index 0000000..3597ed4 Binary files /dev/null and b/images/bing-translate.png differ diff --git a/images/translated-smart-timer.png b/images/translated-smart-timer.png new file mode 100644 index 0000000..cad807d Binary files /dev/null and b/images/translated-smart-timer.png differ