diff --git a/1-getting-started/lessons/2-deeper-dive/README.md b/1-getting-started/lessons/2-deeper-dive/README.md index 175df64..c044759 100644 --- a/1-getting-started/lessons/2-deeper-dive/README.md +++ b/1-getting-started/lessons/2-deeper-dive/README.md @@ -261,6 +261,7 @@ The challenge in the last lesson was to list as many IoT devices as you can that * Read the [Arduino getting started guide](https://www.arduino.cc/en/Guide/Introduction) to understand more about the Arduino platform. * Read the [introduction to the Raspberry Pi 4](https://www.raspberrypi.org/products/raspberry-pi-4-model-b/) to learn more about Raspberry Pis. +* Learn more about some of the concepts and acronyms in the [What the FAQ are CPUs, MPUs, MCUs, and GPUs article in the Electrical Engineering Journal](https://www.eejournal.com/article/what-the-faq-are-cpus-mpus-mcus-and-gpus/). ✅ Use these guides, along with the costs shown by following the links in the [hardware guide](../../../hardware.md) to decide on what hardware platform you want to use, or if you would rather use a virtual device. diff --git a/5-retail/lessons/1-train-stock-detector/README.md b/5-retail/lessons/1-train-stock-detector/README.md index ada15c3..7a680a9 100644 --- a/5-retail/lessons/1-train-stock-detector/README.md +++ b/5-retail/lessons/1-train-stock-detector/README.md @@ -111,7 +111,7 @@ You can train an object detector using Custom Vision, in a similar way to how yo ![The settings for the custom vision project with the name set to fruit-quality-detector, no description, the resource set to fruit-quality-detector-training, the project type set to classification, the classification types set to multi class and the domains set to food](../../../images/custom-vision-create-object-detector-project.png) - > 💁 The products on shelves domain is specifically targeted for detecting stock on store shelves. + ✅ The products on shelves domain is specifically targeted for detecting stock on store shelves. Read more on the different domains in the [Select a domain documentation on Microsoft Docs](https://docs.microsoft.com/azure/cognitive-services/custom-vision-service/select-domain?WT.mc_id=academic-17441-jabenn#object-detection) ✅ Take some time to explore the Custom Vision UI for your object detector. @@ -119,7 +119,7 @@ You can train an object detector using Custom Vision, in a similar way to how yo To train your model you will need a set of images containing the objects you want to detect. -1. Gather images that contain the object to detect. You will need at least 15 images containing each object to detect from a variety of different angles and in different lighting conditions, but the more the better. You will also need a few images to test the model. If you are detecting more than one object, you will want some testing images that contain all the objects. +1. Gather images that contain the object to detect. You will need at least 15 images containing each object to detect from a variety of different angles and in different lighting conditions, but the more the better. This object detector uses the *Products on shelves* domain, so try to set up the objects as if they were on a store shelf. You will also need a few images to test the model. If you are detecting more than one object, you will want some testing images that contain all the objects. > 💁 Images with multiple different objects count towards the 15 image minimum for all the objects in the image.
@@ -188,4 +188,4 @@ If you have any similar looking items, test it out by adding images of them to y ## Assignment -[](assignment.md) +[Compare domains](assignment.md) diff --git a/5-retail/lessons/1-train-stock-detector/assignment.md b/5-retail/lessons/1-train-stock-detector/assignment.md index da157d5..13c342d 100644 --- a/5-retail/lessons/1-train-stock-detector/assignment.md +++ b/5-retail/lessons/1-train-stock-detector/assignment.md @@ -1,9 +1,14 @@ -# +# Compare domains ## Instructions +When you created your object detector, you had a choice of multiple domains. Compare how well they work for your stock detector, and describe which gives better results. + +To change the domain, select the **Settings** button on the top menu, select a new domain, select the **Save changes** button, then retrain the model. Make sure you test with the new iteration of the model trained with the new domain. + ## Rubric | Criteria | Exemplary | Adequate | Needs Improvement | | -------- | --------- | -------- | ----------------- | -| | | | | +| Train the model with a different domain | Was able to change the domain and re-train the model | Was able to change the domain and re-train the model | Was unable to change the domain or re-train the model | +| Test the model and compare the results | Was able to test the model with different domains, compare results, and describe which is better | Was able to test the model with different domains, but was unable to compare the results and describe which is better | Was unable to test the model with different domains | diff --git a/6-consumer/lessons/1-speech-recognition/README.md b/6-consumer/lessons/1-speech-recognition/README.md index 4bfa8ce..b3c2634 100644 --- a/6-consumer/lessons/1-speech-recognition/README.md +++ b/6-consumer/lessons/1-speech-recognition/README.md @@ -91,6 +91,22 @@ These samples are taken many thousands of times per second, using well-defined s ✅ Do some research: If you use a streaming music service, what sample rate and size does it use? If you use CDs, what is the sample rate and size of CD audio? +There are a number of different formats for audio data. You've probably heard of mp3 files - audio data that is compressed to make it smaller without losing too much quality. Uncompressed audio is often stored as a WAV file - this is a file with 44 bytes of header information, followed by raw audio data. The header contains information such as the sample rate (for example 16000 for 16KHz) and sample size (16 for 16-bit), and the number of channels. After the header, the WAV file contains the raw audio data. + +> 🎓 Channels refers to how many different audio streams make up the audio. For example, for stereo audio with left and right, there would be 2 channels. For 7.1 surround sound for a home theater system this would be 8. + +### Audio data size + +Audio data is relatively large. For example, capturing uncompressed 16-bit audio at 16KHz (a good enough rate for use with a speech to text model), takes 32KB of data for each second of audio: + +* 16-bit means 2 bytes per sample (1 byte is 8 bits). +* 16KHz is 16,000 samples per second. +* 16,000 x 2 bytes = 32,000 bytes per second. + +This sounds like a small amount of data, but if you are using a microcontroller with limited memory, this can be a lot. For example, the Wio Terminal has 192KB of memory, and that needs to store program code and variables. Even if your program code was tiny, you couldn't capture more than 5 seconds of audio.
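+
+You can sketch this arithmetic in a few lines of code. The snippet below is only an illustration of the numbers in this section - the constant names are made up for this example and are not part of any lesson code:
+
+```cpp
+// Illustration only - these names are not used in the lesson code
+constexpr int sample_rate = 16000;       // 16KHz is 16,000 samples per second
+constexpr int bytes_per_sample = 2;      // 16-bit samples are 2 bytes each
+constexpr int bytes_per_second = sample_rate * bytes_per_sample;   // 32,000 bytes
+
+// The Wio Terminal has 192KB of memory in total
+constexpr int wio_terminal_memory = 192 * 1024;
+
+// Around 6 seconds would fit if every byte were free - program code and
+// variables need space too, so in practice you can capture less than this
+constexpr int max_seconds = wio_terminal_memory / bytes_per_second;
+```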
+ +Microcontrollers can access additional storage, such as SD cards or flash memory. When building an IoT device that captures audio you will need to ensure not only you have additional storage, but your code writes the audio captured from your microphone directly to that storage, and when sending it to the cloud, you stream from storage to the web request. That way you can avoid running out of memory by trying to hold the entire block of audio data in memory at once. + ## Capture audio from your IoT device Your IoT device can be connected to a microphone to capture audio, ready for conversion to text. It can also be connected to speakers to output audio. In later lessons this will be used to give audio feedback, but it is useful to set up speakers now to test the microphone. @@ -186,26 +202,6 @@ Work through the relevant guide to convert speech to text on your IoT device: * [Single-board computer - Raspberry Pi](pi-speech-to-text.md) * [Single-board computer - Virtual device](virtual-device-speech-to-text.md) -### Task - send converted speech to an IoT services - -To use the results of the speech to text conversion, you need to send it to the cloud. There it will be interpreted and responses sent back to the IoT device as commands. - -1. Create a new IoT Hub in the `smart-timer` resource group, and register a new device called `smart-timer`. - -1. Connect your IoT device to this IoT Hub using what you have learned in previous lessons, and send the speech as telemetry. Use a JSON document in this format: - - ```json - { - "speech" : "" - } - ``` - - Where `` is the output from the speech to text call. You only need to send speech that has content, if the call returns an empty string it can be ignored. - -1. Verify that messages are being sent by monitoring the Event Hub compatible endpoint using the `az iot hub monitor-events` command. - -> 💁 You can find this code in the [code-iot-hub/virtual-iot-device](code-iot-hub/virtual-iot-device), [code-iot-hub/pi](code-iot-hub/pi), or [code-iot-hub/wio-terminal](code-iot-hub/wio-terminal) folder. 
- --- ## 🚀 Challenge diff --git a/6-consumer/lessons/1-speech-recognition/code-iot-hub/pi/smart-timer/app.py b/6-consumer/lessons/1-speech-recognition/code-iot-hub/pi/smart-timer/app.py deleted file mode 100644 index 81c16e6..0000000 --- a/6-consumer/lessons/1-speech-recognition/code-iot-hub/pi/smart-timer/app.py +++ /dev/null @@ -1,93 +0,0 @@ -import io -import json -import pyaudio -import requests -import time -import wave - -from azure.iot.device import IoTHubDeviceClient, Message - -from grove.factory import Factory -button = Factory.getButton('GPIO-HIGH', 5) - -audio = pyaudio.PyAudio() -microphone_card_number = 1 -speaker_card_number = 1 -rate = 48000 - -def capture_audio(): - stream = audio.open(format = pyaudio.paInt16, - rate = rate, - channels = 1, - input_device_index = microphone_card_number, - input = True, - frames_per_buffer = 4096) - - frames = [] - - while button.is_pressed(): - frames.append(stream.read(4096)) - - stream.stop_stream() - stream.close() - - wav_buffer = io.BytesIO() - with wave.open(wav_buffer, 'wb') as wavefile: - wavefile.setnchannels(1) - wavefile.setsampwidth(audio.get_sample_size(pyaudio.paInt16)) - wavefile.setframerate(rate) - wavefile.writeframes(b''.join(frames)) - wav_buffer.seek(0) - - return wav_buffer - -speech_api_key = '' -location = '' -language = '' -connection_string = '' - -device_client = IoTHubDeviceClient.create_from_connection_string(connection_string) - -print('Connecting') -device_client.connect() -print('Connected') - -def get_access_token(): - headers = { - 'Ocp-Apim-Subscription-Key': speech_api_key - } - - token_endpoint = f'https://{location}.api.cognitive.microsoft.com/sts/v1.0/issuetoken' - response = requests.post(token_endpoint, headers=headers) - return str(response.text) - -def convert_speech_to_text(buffer): - url = f'https://{location}.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1' - - headers = { - 'Authorization': 'Bearer ' + get_access_token(), - 'Content-Type': f'audio/wav; codecs=audio/pcm; samplerate={rate}', - 'Accept': 'application/json;text/xml' - } - - params = { - 'language': language - } - - response = requests.post(url, headers=headers, params=params, data=buffer) - response_json = json.loads(response.text) - - if response_json['RecognitionStatus'] == 'Success': - return response_json['DisplayText'] - else: - return '' - -while True: - while not button.is_pressed(): - time.sleep(.1) - - buffer = capture_audio() - text = convert_speech_to_text(buffer) - if len(text) > 0: - message = Message(json.dumps({ 'speech': text })) - device_client.send_message(message) \ No newline at end of file diff --git a/6-consumer/lessons/1-speech-recognition/code-iot-hub/virtual-iot-device/smart-timer/app.py b/6-consumer/lessons/1-speech-recognition/code-iot-hub/virtual-iot-device/smart-timer/app.py deleted file mode 100644 index 2b2f2c2..0000000 --- a/6-consumer/lessons/1-speech-recognition/code-iot-hub/virtual-iot-device/smart-timer/app.py +++ /dev/null @@ -1,33 +0,0 @@ -import json -import time -from azure.cognitiveservices.speech import SpeechConfig, SpeechRecognizer -from azure.iot.device import IoTHubDeviceClient, Message - -speech_api_key = '' -location = '' -language = '' -connection_string = '' - -device_client = IoTHubDeviceClient.create_from_connection_string(connection_string) - -print('Connecting') -device_client.connect() -print('Connected') - -recognizer_config = SpeechConfig(subscription=speech_api_key, - region=location, - speech_recognition_language=language) - -recognizer = 
SpeechRecognizer(speech_config=recognizer_config) - -def recognized(args): - if len(args.result.text) > 0: - message = Message(json.dumps({ 'speech': args.result.text })) - device_client.send_message(message) - -recognizer.recognized.connect(recognized) - -recognizer.start_continuous_recognition() - -while True: - time.sleep(1) \ No newline at end of file diff --git a/6-consumer/lessons/1-speech-recognition/code-record/wio-terminal/smart-timer/include/README b/6-consumer/lessons/1-speech-recognition/code-record/wio-terminal/smart-timer/include/README new file mode 100644 index 0000000..194dcd4 --- /dev/null +++ b/6-consumer/lessons/1-speech-recognition/code-record/wio-terminal/smart-timer/include/README @@ -0,0 +1,39 @@ + +This directory is intended for project header files. + +A header file is a file containing C declarations and macro definitions +to be shared between several project source files. You request the use of a +header file in your project source file (C, C++, etc) located in `src` folder +by including it, with the C preprocessing directive `#include'. + +```src/main.c + +#include "header.h" + +int main (void) +{ + ... +} +``` + +Including a header file produces the same results as copying the header file +into each source file that needs it. Such copying would be time-consuming +and error-prone. With a header file, the related declarations appear +in only one place. If they need to be changed, they can be changed in one +place, and programs that include the header file will automatically use the +new version when next recompiled. The header file eliminates the labor of +finding and changing all the copies as well as the risk that a failure to +find one copy will result in inconsistencies within a program. + +In C, the usual convention is to give header files names that end with `.h'. +It is most portable to use only letters, digits, dashes, and underscores in +header file names, and at most one dot. + +Read more about using header files in official GCC documentation: + +* Include Syntax +* Include Operation +* Once-Only Headers +* Computed Includes + +https://gcc.gnu.org/onlinedocs/cpp/Header-Files.html diff --git a/6-consumer/lessons/1-speech-recognition/code-record/wio-terminal/smart-timer/lib/README b/6-consumer/lessons/1-speech-recognition/code-record/wio-terminal/smart-timer/lib/README new file mode 100644 index 0000000..6debab1 --- /dev/null +++ b/6-consumer/lessons/1-speech-recognition/code-record/wio-terminal/smart-timer/lib/README @@ -0,0 +1,46 @@ + +This directory is intended for project specific (private) libraries. +PlatformIO will compile them to static libraries and link into executable file. + +The source code of each library should be placed in a an own separate directory +("lib/your_library_name/[here are source files]"). + +For example, see a structure of the following two libraries `Foo` and `Bar`: + +|--lib +| | +| |--Bar +| | |--docs +| | |--examples +| | |--src +| | |- Bar.c +| | |- Bar.h +| | |- library.json (optional, custom build options, etc) https://docs.platformio.org/page/librarymanager/config.html +| | +| |--Foo +| | |- Foo.c +| | |- Foo.h +| | +| |- README --> THIS FILE +| +|- platformio.ini +|--src + |- main.c + +and a contents of `src/main.c`: +``` +#include +#include + +int main (void) +{ + ... +} + +``` + +PlatformIO Library Dependency Finder will find automatically dependent +libraries scanning project source files. 
+ +More information about PlatformIO Library Dependency Finder +- https://docs.platformio.org/page/librarymanager/ldf.html diff --git a/6-consumer/lessons/1-speech-recognition/code-record/wio-terminal/smart-timer/platformio.ini b/6-consumer/lessons/1-speech-recognition/code-record/wio-terminal/smart-timer/platformio.ini new file mode 100644 index 0000000..c5999f1 --- /dev/null +++ b/6-consumer/lessons/1-speech-recognition/code-record/wio-terminal/smart-timer/platformio.ini @@ -0,0 +1,19 @@ +; PlatformIO Project Configuration File +; +; Build options: build flags, source filter +; Upload options: custom upload port, speed and extra flags +; Library options: dependencies, extra library storages +; Advanced options: extra scripting +; +; Please visit documentation for the other options and examples +; https://docs.platformio.org/page/projectconf.html + +[env:seeed_wio_terminal] +platform = atmelsam +board = seeed_wio_terminal +framework = arduino +lib_deps = + seeed-studio/Seeed Arduino FS @ 2.0.3 + seeed-studio/Seeed Arduino SFUD @ 2.0.1 +build_flags = + -DSFUD_USING_QSPI diff --git a/6-consumer/lessons/1-speech-recognition/code-record/wio-terminal/smart-timer/src/flash_writer.h b/6-consumer/lessons/1-speech-recognition/code-record/wio-terminal/smart-timer/src/flash_writer.h new file mode 100644 index 0000000..87fdff2 --- /dev/null +++ b/6-consumer/lessons/1-speech-recognition/code-record/wio-terminal/smart-timer/src/flash_writer.h @@ -0,0 +1,60 @@ +#pragma once + +#include +#include + +class FlashWriter +{ +public: + void init() + { + _flash = sfud_get_device_table() + 0; + _sfudBufferSize = _flash->chip.erase_gran; + _sfudBuffer = new byte[_sfudBufferSize]; + _sfudBufferPos = 0; + _sfudBufferWritePos = 0; + } + + void reset() + { + _sfudBufferPos = 0; + _sfudBufferWritePos = 0; + } + + void writeSfudBuffer(byte b) + { + _sfudBuffer[_sfudBufferPos++] = b; + if (_sfudBufferPos == _sfudBufferSize) + { + sfud_erase_write(_flash, _sfudBufferWritePos, _sfudBufferSize, _sfudBuffer); + _sfudBufferWritePos += _sfudBufferSize; + _sfudBufferPos = 0; + } + } + + void flushSfudBuffer() + { + if (_sfudBufferPos > 0) + { + sfud_erase_write(_flash, _sfudBufferWritePos, _sfudBufferSize, _sfudBuffer); + _sfudBufferWritePos += _sfudBufferSize; + _sfudBufferPos = 0; + } + } + + void writeSfudBuffer(byte *b, size_t len) + { + for (size_t i = 0; i < len; ++i) + { + writeSfudBuffer(b[i]); + } + } + +private: + byte *_sfudBuffer; + size_t _sfudBufferSize; + size_t _sfudBufferPos; + size_t _sfudBufferWritePos; + + const sfud_flash *_flash; +}; \ No newline at end of file diff --git a/6-consumer/lessons/1-speech-recognition/code-record/wio-terminal/smart-timer/src/main.cpp b/6-consumer/lessons/1-speech-recognition/code-record/wio-terminal/smart-timer/src/main.cpp new file mode 100644 index 0000000..0f77c9b --- /dev/null +++ b/6-consumer/lessons/1-speech-recognition/code-record/wio-terminal/smart-timer/src/main.cpp @@ -0,0 +1,49 @@ +#include +#include +#include + +#include "mic.h" + +void setup() +{ + Serial.begin(9600); + + while (!Serial) + ; // Wait for Serial to be ready + + delay(1000); + + while (!(sfud_init() == SFUD_SUCCESS)) + ; + + sfud_qspi_fast_read_enable(sfud_get_device(SFUD_W25Q32_DEVICE_INDEX), 2); + + pinMode(WIO_KEY_C, INPUT_PULLUP); + + mic.init(); + + Serial.println("Ready."); +} + +void processAudio() +{ + +} + +void loop() +{ + if (digitalRead(WIO_KEY_C) == LOW && !mic.isRecording()) + { + Serial.println("Starting recording..."); + mic.startRecording(); + } + + if (!mic.isRecording() && 
mic.isRecordingReady()) + { + Serial.println("Finished recording"); + + processAudio(); + + mic.reset(); + } +} \ No newline at end of file diff --git a/6-consumer/lessons/1-speech-recognition/code-record/wio-terminal/smart-timer/src/mic.h b/6-consumer/lessons/1-speech-recognition/code-record/wio-terminal/smart-timer/src/mic.h new file mode 100644 index 0000000..ecdeb41 --- /dev/null +++ b/6-consumer/lessons/1-speech-recognition/code-record/wio-terminal/smart-timer/src/mic.h @@ -0,0 +1,248 @@ +#pragma once + +#include + +#include "flash_writer.h" + +#define RATE 16000 +#define SAMPLE_LENGTH_SECONDS 4 +#define SAMPLES RATE * SAMPLE_LENGTH_SECONDS +#define BUFFER_SIZE (SAMPLES * 2) + 44 +#define ADC_BUF_LEN 1600 + +class Mic +{ +public: + Mic() + { + _isRecording = false; + _isRecordingReady = false; + } + + void startRecording() + { + _isRecording = true; + _isRecordingReady = false; + } + + bool isRecording() + { + return _isRecording; + } + + bool isRecordingReady() + { + return _isRecordingReady; + } + + void init() + { + analogReference(AR_INTERNAL2V23); + + _writer.init(); + + initBufferHeader(); + configureDmaAdc(); + } + + void reset() + { + _isRecordingReady = false; + _isRecording = false; + + _writer.reset(); + + initBufferHeader(); + } + + void dmaHandler() + { + static uint8_t count = 0; + static uint16_t idx = 0; + + if (DMAC->Channel[1].CHINTFLAG.bit.SUSP) + { + DMAC->Channel[1].CHCTRLB.reg = DMAC_CHCTRLB_CMD_RESUME; + DMAC->Channel[1].CHINTFLAG.bit.SUSP = 1; + + if (count) + { + audioCallback(_adc_buf_0, ADC_BUF_LEN); + } + else + { + audioCallback(_adc_buf_1, ADC_BUF_LEN); + } + + count = (count + 1) % 2; + } + } + +private: + volatile bool _isRecording; + volatile bool _isRecordingReady; + FlashWriter _writer; + +typedef struct + { + uint16_t btctrl; + uint16_t btcnt; + uint32_t srcaddr; + uint32_t dstaddr; + uint32_t descaddr; + } dmacdescriptor; + + // Globals - DMA and ADC + volatile dmacdescriptor _wrb[DMAC_CH_NUM] __attribute__((aligned(16))); + dmacdescriptor _descriptor_section[DMAC_CH_NUM] __attribute__((aligned(16))); + dmacdescriptor _descriptor __attribute__((aligned(16))); + + void configureDmaAdc() + { + // Configure DMA to sample from ADC at a regular interval (triggered by timer/counter) + DMAC->BASEADDR.reg = (uint32_t)_descriptor_section; // Specify the location of the descriptors + DMAC->WRBADDR.reg = (uint32_t)_wrb; // Specify the location of the write back descriptors + DMAC->CTRL.reg = DMAC_CTRL_DMAENABLE | DMAC_CTRL_LVLEN(0xf); // Enable the DMAC peripheral + DMAC->Channel[1].CHCTRLA.reg = DMAC_CHCTRLA_TRIGSRC(TC5_DMAC_ID_OVF) | // Set DMAC to trigger on TC5 timer overflow + DMAC_CHCTRLA_TRIGACT_BURST; // DMAC burst transfer + + _descriptor.descaddr = (uint32_t)&_descriptor_section[1]; // Set up a circular descriptor + _descriptor.srcaddr = (uint32_t)&ADC1->RESULT.reg; // Take the result from the ADC0 RESULT register + _descriptor.dstaddr = (uint32_t)_adc_buf_0 + sizeof(uint16_t) * ADC_BUF_LEN; // Place it in the adc_buf_0 array + _descriptor.btcnt = ADC_BUF_LEN; // Beat count + _descriptor.btctrl = DMAC_BTCTRL_BEATSIZE_HWORD | // Beat size is HWORD (16-bits) + DMAC_BTCTRL_DSTINC | // Increment the destination address + DMAC_BTCTRL_VALID | // Descriptor is valid + DMAC_BTCTRL_BLOCKACT_SUSPEND; // Suspend DMAC channel 0 after block transfer + memcpy(&_descriptor_section[0], &_descriptor, sizeof(_descriptor)); // Copy the descriptor to the descriptor section + + _descriptor.descaddr = (uint32_t)&_descriptor_section[0]; // Set up a circular descriptor + 
_descriptor.srcaddr = (uint32_t)&ADC1->RESULT.reg; // Take the result from the ADC0 RESULT register + _descriptor.dstaddr = (uint32_t)_adc_buf_1 + sizeof(uint16_t) * ADC_BUF_LEN; // Place it in the adc_buf_1 array + _descriptor.btcnt = ADC_BUF_LEN; // Beat count + _descriptor.btctrl = DMAC_BTCTRL_BEATSIZE_HWORD | // Beat size is HWORD (16-bits) + DMAC_BTCTRL_DSTINC | // Increment the destination address + DMAC_BTCTRL_VALID | // Descriptor is valid + DMAC_BTCTRL_BLOCKACT_SUSPEND; // Suspend DMAC channel 0 after block transfer + memcpy(&_descriptor_section[1], &_descriptor, sizeof(_descriptor)); // Copy the descriptor to the descriptor section + + // Configure NVIC + NVIC_SetPriority(DMAC_1_IRQn, 0); // Set the Nested Vector Interrupt Controller (NVIC) priority for DMAC1 to 0 (highest) + NVIC_EnableIRQ(DMAC_1_IRQn); // Connect DMAC1 to Nested Vector Interrupt Controller (NVIC) + + // Activate the suspend (SUSP) interrupt on DMAC channel 1 + DMAC->Channel[1].CHINTENSET.reg = DMAC_CHINTENSET_SUSP; + + // Configure ADC + ADC1->INPUTCTRL.bit.MUXPOS = ADC_INPUTCTRL_MUXPOS_AIN12_Val; // Set the analog input to ADC0/AIN2 (PB08 - A4 on Metro M4) + while (ADC1->SYNCBUSY.bit.INPUTCTRL) + ; // Wait for synchronization + ADC1->SAMPCTRL.bit.SAMPLEN = 0x00; // Set max Sampling Time Length to half divided ADC clock pulse (2.66us) + while (ADC1->SYNCBUSY.bit.SAMPCTRL) + ; // Wait for synchronization + ADC1->CTRLA.reg = ADC_CTRLA_PRESCALER_DIV128; // Divide Clock ADC GCLK by 128 (48MHz/128 = 375kHz) + ADC1->CTRLB.reg = ADC_CTRLB_RESSEL_12BIT | // Set ADC resolution to 12 bits + ADC_CTRLB_FREERUN; // Set ADC to free run mode + while (ADC1->SYNCBUSY.bit.CTRLB) + ; // Wait for synchronization + ADC1->CTRLA.bit.ENABLE = 1; // Enable the ADC + while (ADC1->SYNCBUSY.bit.ENABLE) + ; // Wait for synchronization + ADC1->SWTRIG.bit.START = 1; // Initiate a software trigger to start an ADC conversion + while (ADC1->SYNCBUSY.bit.SWTRIG) + ; // Wait for synchronization + + // Enable DMA channel 1 + DMAC->Channel[1].CHCTRLA.bit.ENABLE = 1; + + // Configure Timer/Counter 5 + GCLK->PCHCTRL[TC5_GCLK_ID].reg = GCLK_PCHCTRL_CHEN | // Enable perhipheral channel for TC5 + GCLK_PCHCTRL_GEN_GCLK1; // Connect generic clock 0 at 48MHz + + TC5->COUNT16.WAVE.reg = TC_WAVE_WAVEGEN_MFRQ; // Set TC5 to Match Frequency (MFRQ) mode + TC5->COUNT16.CC[0].reg = 3000 - 1; // Set the trigger to 16 kHz: (4Mhz / 16000) - 1 + while (TC5->COUNT16.SYNCBUSY.bit.CC0) + ; // Wait for synchronization + + // Start Timer/Counter 5 + TC5->COUNT16.CTRLA.bit.ENABLE = 1; // Enable the TC5 timer + while (TC5->COUNT16.SYNCBUSY.bit.ENABLE) + ; // Wait for synchronization + } + + uint16_t _adc_buf_0[ADC_BUF_LEN]; + uint16_t _adc_buf_1[ADC_BUF_LEN]; + + // WAV files have a header. 
This struct defines that header + struct wavFileHeader + { + char riff[4]; /* "RIFF" */ + long flength; /* file length in bytes */ + char wave[4]; /* "WAVE" */ + char fmt[4]; /* "fmt " */ + long chunk_size; /* size of FMT chunk in bytes (usually 16) */ + short format_tag; /* 1=PCM, 257=Mu-Law, 258=A-Law, 259=ADPCM */ + short num_chans; /* 1=mono, 2=stereo */ + long srate; /* Sampling rate in samples per second */ + long bytes_per_sec; /* bytes per second = srate*bytes_per_samp */ + short bytes_per_samp; /* 2=16-bit mono, 4=16-bit stereo */ + short bits_per_samp; /* Number of bits per sample */ + char data[4]; /* "data" */ + long dlength; /* data length in bytes (filelength - 44) */ + }; + + void initBufferHeader() + { + wavFileHeader wavh; + + strncpy(wavh.riff, "RIFF", 4); + strncpy(wavh.wave, "WAVE", 4); + strncpy(wavh.fmt, "fmt ", 4); + strncpy(wavh.data, "data", 4); + + wavh.chunk_size = 16; + wavh.format_tag = 1; // PCM + wavh.num_chans = 1; // mono + wavh.srate = RATE; + wavh.bytes_per_sec = (RATE * 1 * 16 * 1) / 8; + wavh.bytes_per_samp = 2; + wavh.bits_per_samp = 16; + wavh.dlength = RATE * 2 * 1 * 16 / 2; + wavh.flength = wavh.dlength + 44; + + _writer.writeSfudBuffer((byte *)&wavh, 44); + } + + void audioCallback(uint16_t *buf, uint32_t buf_len) + { + static uint32_t idx = 44; + + if (_isRecording) + { + for (uint32_t i = 0; i < buf_len; i++) + { + int16_t audio_value = ((int16_t)buf[i] - 2048) * 16; + + _writer.writeSfudBuffer(audio_value & 0xFF); + _writer.writeSfudBuffer((audio_value >> 8) & 0xFF); + } + + idx += buf_len; + + if (idx >= BUFFER_SIZE) + { + _writer.flushSfudBuffer(); + idx = 44; + _isRecording = false; + _isRecordingReady = true; + } + } + } +}; + +Mic mic; + +void DMAC_1_Handler() +{ + mic.dmaHandler(); +} diff --git a/6-consumer/lessons/1-speech-recognition/code-record/wio-terminal/smart-timer/test/README b/6-consumer/lessons/1-speech-recognition/code-record/wio-terminal/smart-timer/test/README new file mode 100644 index 0000000..b94d089 --- /dev/null +++ b/6-consumer/lessons/1-speech-recognition/code-record/wio-terminal/smart-timer/test/README @@ -0,0 +1,11 @@ + +This directory is intended for PlatformIO Unit Testing and project tests. + +Unit Testing is a software testing method by which individual units of +source code, sets of one or more MCU program modules together with associated +control data, usage procedures, and operating procedures, are tested to +determine whether they are fit for use. Unit testing finds problems early +in the development cycle. 
+ +More information about PlatformIO Unit Testing: +- https://docs.platformio.org/page/plus/unit-testing.html diff --git a/6-consumer/lessons/1-speech-recognition/code-speech-to-text/pi/smart-timer/app.py b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/pi/smart-timer/app.py index 3a56b2f..b3bd252 100644 --- a/6-consumer/lessons/1-speech-recognition/code-speech-to-text/pi/smart-timer/app.py +++ b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/pi/smart-timer/app.py @@ -1,5 +1,4 @@ import io -import json import pyaudio import requests import time @@ -66,17 +65,20 @@ def convert_speech_to_text(buffer): } response = requests.post(url, headers=headers, params=params, data=buffer) - response_json = json.loads(response.text) + response_json = response.json() if response_json['RecognitionStatus'] == 'Success': return response_json['DisplayText'] else: return '' +def process_text(text): + print(text) + while True: while not button.is_pressed(): time.sleep(.1) buffer = capture_audio() text = convert_speech_to_text(buffer) - print(text) \ No newline at end of file + process_text(text) \ No newline at end of file diff --git a/6-consumer/lessons/1-speech-recognition/code-speech-to-text/virtual-iot-device/smart-timer/app.py b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/virtual-iot-device/smart-timer/app.py index 6d282ad..4c9ea0a 100644 --- a/6-consumer/lessons/1-speech-recognition/code-speech-to-text/virtual-iot-device/smart-timer/app.py +++ b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/virtual-iot-device/smart-timer/app.py @@ -11,8 +11,11 @@ recognizer_config = SpeechConfig(subscription=speech_api_key, recognizer = SpeechRecognizer(speech_config=recognizer_config) +def process_text(text): + print(text) + def recognized(args): - print(args.result.text) + process_text(args.result.text) recognizer.recognized.connect(recognized) diff --git a/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/include/README b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/include/README new file mode 100644 index 0000000..194dcd4 --- /dev/null +++ b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/include/README @@ -0,0 +1,39 @@ + +This directory is intended for project header files. + +A header file is a file containing C declarations and macro definitions +to be shared between several project source files. You request the use of a +header file in your project source file (C, C++, etc) located in `src` folder +by including it, with the C preprocessing directive `#include'. + +```src/main.c + +#include "header.h" + +int main (void) +{ + ... +} +``` + +Including a header file produces the same results as copying the header file +into each source file that needs it. Such copying would be time-consuming +and error-prone. With a header file, the related declarations appear +in only one place. If they need to be changed, they can be changed in one +place, and programs that include the header file will automatically use the +new version when next recompiled. The header file eliminates the labor of +finding and changing all the copies as well as the risk that a failure to +find one copy will result in inconsistencies within a program. + +In C, the usual convention is to give header files names that end with `.h'. +It is most portable to use only letters, digits, dashes, and underscores in +header file names, and at most one dot. 
+ +Read more about using header files in official GCC documentation: + +* Include Syntax +* Include Operation +* Once-Only Headers +* Computed Includes + +https://gcc.gnu.org/onlinedocs/cpp/Header-Files.html diff --git a/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/lib/README b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/lib/README new file mode 100644 index 0000000..6debab1 --- /dev/null +++ b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/lib/README @@ -0,0 +1,46 @@ + +This directory is intended for project specific (private) libraries. +PlatformIO will compile them to static libraries and link into executable file. + +The source code of each library should be placed in a an own separate directory +("lib/your_library_name/[here are source files]"). + +For example, see a structure of the following two libraries `Foo` and `Bar`: + +|--lib +| | +| |--Bar +| | |--docs +| | |--examples +| | |--src +| | |- Bar.c +| | |- Bar.h +| | |- library.json (optional, custom build options, etc) https://docs.platformio.org/page/librarymanager/config.html +| | +| |--Foo +| | |- Foo.c +| | |- Foo.h +| | +| |- README --> THIS FILE +| +|- platformio.ini +|--src + |- main.c + +and a contents of `src/main.c`: +``` +#include +#include + +int main (void) +{ + ... +} + +``` + +PlatformIO Library Dependency Finder will find automatically dependent +libraries scanning project source files. + +More information about PlatformIO Library Dependency Finder +- https://docs.platformio.org/page/librarymanager/ldf.html diff --git a/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/platformio.ini b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/platformio.ini new file mode 100644 index 0000000..5adbe73 --- /dev/null +++ b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/platformio.ini @@ -0,0 +1,22 @@ +; PlatformIO Project Configuration File +; +; Build options: build flags, source filter +; Upload options: custom upload port, speed and extra flags +; Library options: dependencies, extra library storages +; Advanced options: extra scripting +; +; Please visit documentation for the other options and examples +; https://docs.platformio.org/page/projectconf.html + +[env:seeed_wio_terminal] +platform = atmelsam +board = seeed_wio_terminal +framework = arduino +lib_deps = + seeed-studio/Seeed Arduino FS @ 2.0.3 + seeed-studio/Seeed Arduino SFUD @ 2.0.1 + seeed-studio/Seeed Arduino rpcWiFi @ 1.0.5 + seeed-studio/Seeed Arduino rpcUnified @ 2.1.3 + seeed-studio/Seeed_Arduino_mbedtls @ 3.0.1 + seeed-studio/Seeed Arduino RTC @ 2.0.0 + bblanchon/ArduinoJson @ 6.17.3 diff --git a/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/src/config.h b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/src/config.h new file mode 100644 index 0000000..cca25e6 --- /dev/null +++ b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/src/config.h @@ -0,0 +1,89 @@ +#pragma once + +#define RATE 16000 +#define SAMPLE_LENGTH_SECONDS 4 +#define SAMPLES RATE * SAMPLE_LENGTH_SECONDS +#define BUFFER_SIZE (SAMPLES * 2) + 44 +#define ADC_BUF_LEN 1600 + +const char *SSID = ""; +const char *PASSWORD = ""; + +const char *SPEECH_API_KEY = ""; +const char *SPEECH_LOCATION = ""; +const char *LANGUAGE = ""; + +const char *TOKEN_URL = 
"https://%s.api.cognitive.microsoft.com/sts/v1.0/issuetoken"; +const char *SPEECH_URL = "https://%s.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language=%s"; + +const char *TOKEN_CERTIFICATE = + "-----BEGIN CERTIFICATE-----\r\n" + "MIIF8zCCBNugAwIBAgIQAueRcfuAIek/4tmDg0xQwDANBgkqhkiG9w0BAQwFADBh\r\n" + "MQswCQYDVQQGEwJVUzEVMBMGA1UEChMMRGlnaUNlcnQgSW5jMRkwFwYDVQQLExB3\r\n" + "d3cuZGlnaWNlcnQuY29tMSAwHgYDVQQDExdEaWdpQ2VydCBHbG9iYWwgUm9vdCBH\r\n" + "MjAeFw0yMDA3MjkxMjMwMDBaFw0yNDA2MjcyMzU5NTlaMFkxCzAJBgNVBAYTAlVT\r\n" + "MR4wHAYDVQQKExVNaWNyb3NvZnQgQ29ycG9yYXRpb24xKjAoBgNVBAMTIU1pY3Jv\r\n" + "c29mdCBBenVyZSBUTFMgSXNzdWluZyBDQSAwNjCCAiIwDQYJKoZIhvcNAQEBBQAD\r\n" + "ggIPADCCAgoCggIBALVGARl56bx3KBUSGuPc4H5uoNFkFH4e7pvTCxRi4j/+z+Xb\r\n" + "wjEz+5CipDOqjx9/jWjskL5dk7PaQkzItidsAAnDCW1leZBOIi68Lff1bjTeZgMY\r\n" + "iwdRd3Y39b/lcGpiuP2d23W95YHkMMT8IlWosYIX0f4kYb62rphyfnAjYb/4Od99\r\n" + "ThnhlAxGtfvSbXcBVIKCYfZgqRvV+5lReUnd1aNjRYVzPOoifgSx2fRyy1+pO1Uz\r\n" + "aMMNnIOE71bVYW0A1hr19w7kOb0KkJXoALTDDj1ukUEDqQuBfBxReL5mXiu1O7WG\r\n" + "0vltg0VZ/SZzctBsdBlx1BkmWYBW261KZgBivrql5ELTKKd8qgtHcLQA5fl6JB0Q\r\n" + "gs5XDaWehN86Gps5JW8ArjGtjcWAIP+X8CQaWfaCnuRm6Bk/03PQWhgdi84qwA0s\r\n" + "sRfFJwHUPTNSnE8EiGVk2frt0u8PG1pwSQsFuNJfcYIHEv1vOzP7uEOuDydsmCjh\r\n" + "lxuoK2n5/2aVR3BMTu+p4+gl8alXoBycyLmj3J/PUgqD8SL5fTCUegGsdia/Sa60\r\n" + "N2oV7vQ17wjMN+LXa2rjj/b4ZlZgXVojDmAjDwIRdDUujQu0RVsJqFLMzSIHpp2C\r\n" + "Zp7mIoLrySay2YYBu7SiNwL95X6He2kS8eefBBHjzwW/9FxGqry57i71c2cDAgMB\r\n" + "AAGjggGtMIIBqTAdBgNVHQ4EFgQU1cFnOsKjnfR3UltZEjgp5lVou6UwHwYDVR0j\r\n" + "BBgwFoAUTiJUIBiV5uNu5g/6+rkS7QYXjzkwDgYDVR0PAQH/BAQDAgGGMB0GA1Ud\r\n" + "JQQWMBQGCCsGAQUFBwMBBggrBgEFBQcDAjASBgNVHRMBAf8ECDAGAQH/AgEAMHYG\r\n" + "CCsGAQUFBwEBBGowaDAkBggrBgEFBQcwAYYYaHR0cDovL29jc3AuZGlnaWNlcnQu\r\n" + "Y29tMEAGCCsGAQUFBzAChjRodHRwOi8vY2FjZXJ0cy5kaWdpY2VydC5jb20vRGln\r\n" + "aUNlcnRHbG9iYWxSb290RzIuY3J0MHsGA1UdHwR0MHIwN6A1oDOGMWh0dHA6Ly9j\r\n" + "cmwzLmRpZ2ljZXJ0LmNvbS9EaWdpQ2VydEdsb2JhbFJvb3RHMi5jcmwwN6A1oDOG\r\n" + "MWh0dHA6Ly9jcmw0LmRpZ2ljZXJ0LmNvbS9EaWdpQ2VydEdsb2JhbFJvb3RHMi5j\r\n" + "cmwwHQYDVR0gBBYwFDAIBgZngQwBAgEwCAYGZ4EMAQICMBAGCSsGAQQBgjcVAQQD\r\n" + "AgEAMA0GCSqGSIb3DQEBDAUAA4IBAQB2oWc93fB8esci/8esixj++N22meiGDjgF\r\n" + "+rA2LUK5IOQOgcUSTGKSqF9lYfAxPjrqPjDCUPHCURv+26ad5P/BYtXtbmtxJWu+\r\n" + "cS5BhMDPPeG3oPZwXRHBJFAkY4O4AF7RIAAUW6EzDflUoDHKv83zOiPfYGcpHc9s\r\n" + "kxAInCedk7QSgXvMARjjOqdakor21DTmNIUotxo8kHv5hwRlGhBJwps6fEVi1Bt0\r\n" + "trpM/3wYxlr473WSPUFZPgP1j519kLpWOJ8z09wxay+Br29irPcBYv0GMXlHqThy\r\n" + "8y4m/HyTQeI2IMvMrQnwqPpY+rLIXyviI2vLoI+4xKE4Rn38ZZ8m\r\n" + "-----END CERTIFICATE-----\r\n"; + +const char *SPEECH_CERTIFICATE = + "-----BEGIN CERTIFICATE-----\r\n" + "MIIF8zCCBNugAwIBAgIQCq+mxcpjxFFB6jvh98dTFzANBgkqhkiG9w0BAQwFADBh\r\n" + "MQswCQYDVQQGEwJVUzEVMBMGA1UEChMMRGlnaUNlcnQgSW5jMRkwFwYDVQQLExB3\r\n" + "d3cuZGlnaWNlcnQuY29tMSAwHgYDVQQDExdEaWdpQ2VydCBHbG9iYWwgUm9vdCBH\r\n" + "MjAeFw0yMDA3MjkxMjMwMDBaFw0yNDA2MjcyMzU5NTlaMFkxCzAJBgNVBAYTAlVT\r\n" + "MR4wHAYDVQQKExVNaWNyb3NvZnQgQ29ycG9yYXRpb24xKjAoBgNVBAMTIU1pY3Jv\r\n" + "c29mdCBBenVyZSBUTFMgSXNzdWluZyBDQSAwMTCCAiIwDQYJKoZIhvcNAQEBBQAD\r\n" + "ggIPADCCAgoCggIBAMedcDrkXufP7pxVm1FHLDNA9IjwHaMoaY8arqqZ4Gff4xyr\r\n" + "RygnavXL7g12MPAx8Q6Dd9hfBzrfWxkF0Br2wIvlvkzW01naNVSkHp+OS3hL3W6n\r\n" + "l/jYvZnVeJXjtsKYcXIf/6WtspcF5awlQ9LZJcjwaH7KoZuK+THpXCMtzD8XNVdm\r\n" + "GW/JI0C/7U/E7evXn9XDio8SYkGSM63aLO5BtLCv092+1d4GGBSQYolRq+7Pd1kR\r\n" + "EkWBPm0ywZ2Vb8GIS5DLrjelEkBnKCyy3B0yQud9dpVsiUeE7F5sY8Me96WVxQcb\r\n" + 
"OyYdEY/j/9UpDlOG+vA+YgOvBhkKEjiqygVpP8EZoMMijephzg43b5Qi9r5UrvYo\r\n" + "o19oR/8pf4HJNDPF0/FJwFVMW8PmCBLGstin3NE1+NeWTkGt0TzpHjgKyfaDP2tO\r\n" + "4bCk1G7pP2kDFT7SYfc8xbgCkFQ2UCEXsaH/f5YmpLn4YPiNFCeeIida7xnfTvc4\r\n" + "7IxyVccHHq1FzGygOqemrxEETKh8hvDR6eBdrBwmCHVgZrnAqnn93JtGyPLi6+cj\r\n" + "WGVGtMZHwzVvX1HvSFG771sskcEjJxiQNQDQRWHEh3NxvNb7kFlAXnVdRkkvhjpR\r\n" + "GchFhTAzqmwltdWhWDEyCMKC2x/mSZvZtlZGY+g37Y72qHzidwtyW7rBetZJAgMB\r\n" + "AAGjggGtMIIBqTAdBgNVHQ4EFgQUDyBd16FXlduSzyvQx8J3BM5ygHYwHwYDVR0j\r\n" + "BBgwFoAUTiJUIBiV5uNu5g/6+rkS7QYXjzkwDgYDVR0PAQH/BAQDAgGGMB0GA1Ud\r\n" + "JQQWMBQGCCsGAQUFBwMBBggrBgEFBQcDAjASBgNVHRMBAf8ECDAGAQH/AgEAMHYG\r\n" + "CCsGAQUFBwEBBGowaDAkBggrBgEFBQcwAYYYaHR0cDovL29jc3AuZGlnaWNlcnQu\r\n" + "Y29tMEAGCCsGAQUFBzAChjRodHRwOi8vY2FjZXJ0cy5kaWdpY2VydC5jb20vRGln\r\n" + "aUNlcnRHbG9iYWxSb290RzIuY3J0MHsGA1UdHwR0MHIwN6A1oDOGMWh0dHA6Ly9j\r\n" + "cmwzLmRpZ2ljZXJ0LmNvbS9EaWdpQ2VydEdsb2JhbFJvb3RHMi5jcmwwN6A1oDOG\r\n" + "MWh0dHA6Ly9jcmw0LmRpZ2ljZXJ0LmNvbS9EaWdpQ2VydEdsb2JhbFJvb3RHMi5j\r\n" + "cmwwHQYDVR0gBBYwFDAIBgZngQwBAgEwCAYGZ4EMAQICMBAGCSsGAQQBgjcVAQQD\r\n" + "AgEAMA0GCSqGSIb3DQEBDAUAA4IBAQAlFvNh7QgXVLAZSsNR2XRmIn9iS8OHFCBA\r\n" + "WxKJoi8YYQafpMTkMqeuzoL3HWb1pYEipsDkhiMnrpfeYZEA7Lz7yqEEtfgHcEBs\r\n" + "K9KcStQGGZRfmWU07hPXHnFz+5gTXqzCE2PBMlRgVUYJiA25mJPXfB00gDvGhtYa\r\n" + "+mENwM9Bq1B9YYLyLjRtUz8cyGsdyTIG/bBM/Q9jcV8JGqMU/UjAdh1pFyTnnHEl\r\n" + "Y59Npi7F87ZqYYJEHJM2LGD+le8VsHjgeWX2CJQko7klXvcizuZvUEDTjHaQcs2J\r\n" + "+kPgfyMIOY1DMJ21NxOJ2xPRC/wAh/hzSBRVtoAnyuxtkZ4VjIOh\r\n" + "-----END CERTIFICATE-----\r\n"; diff --git a/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/src/flash_stream.h b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/src/flash_stream.h new file mode 100644 index 0000000..b841f1d --- /dev/null +++ b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/src/flash_stream.h @@ -0,0 +1,69 @@ +#pragma once + +#include +#include +#include + +#include "config.h" + +class FlashStream : public Stream +{ +public: + FlashStream() + { + _pos = 0; + _flash_address = 0; + _flash = sfud_get_device_table() + 0; + + populateBuffer(); + } + + virtual size_t write(uint8_t val) + { + return 0; + } + + virtual int available() + { + int remaining = BUFFER_SIZE - ((_flash_address - HTTP_TCP_BUFFER_SIZE) + _pos); + int bytes_available = min(HTTP_TCP_BUFFER_SIZE, remaining); + + if (bytes_available == 0) + { + bytes_available = -1; + } + + return bytes_available; + } + + virtual int read() + { + int retVal = _buffer[_pos++]; + + if (_pos == HTTP_TCP_BUFFER_SIZE) + { + populateBuffer(); + } + + return retVal; + } + + virtual int peek() + { + return _buffer[_pos]; + } + +private: + void populateBuffer() + { + sfud_read(_flash, _flash_address, HTTP_TCP_BUFFER_SIZE, _buffer); + _flash_address += HTTP_TCP_BUFFER_SIZE; + _pos = 0; + } + + size_t _pos; + size_t _flash_address; + const sfud_flash *_flash; + + byte _buffer[HTTP_TCP_BUFFER_SIZE]; +}; diff --git a/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/src/flash_writer.h b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/src/flash_writer.h new file mode 100644 index 0000000..87fdff2 --- /dev/null +++ b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/src/flash_writer.h @@ -0,0 +1,60 @@ +#pragma once + +#include +#include + +class FlashWriter +{ +public: + void init() + { + _flash = sfud_get_device_table() + 0; + 
_sfudBufferSize = _flash->chip.erase_gran; + _sfudBuffer = new byte[_sfudBufferSize]; + _sfudBufferPos = 0; + _sfudBufferWritePos = 0; + } + + void reset() + { + _sfudBufferPos = 0; + _sfudBufferWritePos = 0; + } + + void writeSfudBuffer(byte b) + { + _sfudBuffer[_sfudBufferPos++] = b; + if (_sfudBufferPos == _sfudBufferSize) + { + sfud_erase_write(_flash, _sfudBufferWritePos, _sfudBufferSize, _sfudBuffer); + _sfudBufferWritePos += _sfudBufferSize; + _sfudBufferPos = 0; + } + } + + void flushSfudBuffer() + { + if (_sfudBufferPos > 0) + { + sfud_erase_write(_flash, _sfudBufferWritePos, _sfudBufferSize, _sfudBuffer); + _sfudBufferWritePos += _sfudBufferSize; + _sfudBufferPos = 0; + } + } + + void writeSfudBuffer(byte *b, size_t len) + { + for (size_t i = 0; i < len; ++i) + { + writeSfudBuffer(b[i]); + } + } + +private: + byte *_sfudBuffer; + size_t _sfudBufferSize; + size_t _sfudBufferPos; + size_t _sfudBufferWritePos; + + const sfud_flash *_flash; +}; \ No newline at end of file diff --git a/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/src/main.cpp b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/src/main.cpp new file mode 100644 index 0000000..37924a6 --- /dev/null +++ b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/src/main.cpp @@ -0,0 +1,69 @@ +#include +#include +#include +#include + +#include "config.h" +#include "mic.h" +#include "speech_to_text.h" + +void connectWiFi() +{ + while (WiFi.status() != WL_CONNECTED) + { + Serial.println("Connecting to WiFi.."); + WiFi.begin(SSID, PASSWORD); + delay(500); + } + + Serial.println("Connected!"); +} + +void setup() +{ + Serial.begin(9600); + + while (!Serial) + ; // Wait for Serial to be ready + + delay(1000); + + connectWiFi(); + + while (!(sfud_init() == SFUD_SUCCESS)) + ; + + sfud_qspi_fast_read_enable(sfud_get_device(SFUD_W25Q32_DEVICE_INDEX), 2); + + pinMode(WIO_KEY_C, INPUT_PULLUP); + + mic.init(); + + speechToText.init(); + + Serial.println("Ready."); +} + +void processAudio() +{ + String text = speechToText.convertSpeechToText(); + Serial.println(text); +} + +void loop() +{ + if (digitalRead(WIO_KEY_C) == LOW && !mic.isRecording()) + { + Serial.println("Starting recording..."); + mic.startRecording(); + } + + if (!mic.isRecording() && mic.isRecordingReady()) + { + Serial.println("Finished recording"); + + processAudio(); + + mic.reset(); + } +} \ No newline at end of file diff --git a/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/src/mic.h b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/src/mic.h new file mode 100644 index 0000000..5f0815d --- /dev/null +++ b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/src/mic.h @@ -0,0 +1,242 @@ +#pragma once + +#include + +#include "config.h" +#include "flash_writer.h" + +class Mic +{ +public: + Mic() + { + _isRecording = false; + _isRecordingReady = false; + } + + void startRecording() + { + _isRecording = true; + _isRecordingReady = false; + } + + bool isRecording() + { + return _isRecording; + } + + bool isRecordingReady() + { + return _isRecordingReady; + } + + void init() + { + analogReference(AR_INTERNAL2V23); + + _writer.init(); + + initBufferHeader(); + configureDmaAdc(); + } + + void reset() + { + _isRecordingReady = false; + _isRecording = false; + + _writer.reset(); + + initBufferHeader(); + } + + void dmaHandler() + { + static uint8_t count = 0; + + if 
(DMAC->Channel[1].CHINTFLAG.bit.SUSP) + { + DMAC->Channel[1].CHCTRLB.reg = DMAC_CHCTRLB_CMD_RESUME; + DMAC->Channel[1].CHINTFLAG.bit.SUSP = 1; + + if (count) + { + audioCallback(_adc_buf_0, ADC_BUF_LEN); + } + else + { + audioCallback(_adc_buf_1, ADC_BUF_LEN); + } + + count = (count + 1) % 2; + } + } + +private: + volatile bool _isRecording; + volatile bool _isRecordingReady; + FlashWriter _writer; + +typedef struct + { + uint16_t btctrl; + uint16_t btcnt; + uint32_t srcaddr; + uint32_t dstaddr; + uint32_t descaddr; + } dmacdescriptor; + + // Globals - DMA and ADC + volatile dmacdescriptor _wrb[DMAC_CH_NUM] __attribute__((aligned(16))); + dmacdescriptor _descriptor_section[DMAC_CH_NUM] __attribute__((aligned(16))); + dmacdescriptor _descriptor __attribute__((aligned(16))); + + void configureDmaAdc() + { + // Configure DMA to sample from ADC at a regular interval (triggered by timer/counter) + DMAC->BASEADDR.reg = (uint32_t)_descriptor_section; // Specify the location of the descriptors + DMAC->WRBADDR.reg = (uint32_t)_wrb; // Specify the location of the write back descriptors + DMAC->CTRL.reg = DMAC_CTRL_DMAENABLE | DMAC_CTRL_LVLEN(0xf); // Enable the DMAC peripheral + DMAC->Channel[1].CHCTRLA.reg = DMAC_CHCTRLA_TRIGSRC(TC5_DMAC_ID_OVF) | // Set DMAC to trigger on TC5 timer overflow + DMAC_CHCTRLA_TRIGACT_BURST; // DMAC burst transfer + + _descriptor.descaddr = (uint32_t)&_descriptor_section[1]; // Set up a circular descriptor + _descriptor.srcaddr = (uint32_t)&ADC1->RESULT.reg; // Take the result from the ADC0 RESULT register + _descriptor.dstaddr = (uint32_t)_adc_buf_0 + sizeof(uint16_t) * ADC_BUF_LEN; // Place it in the adc_buf_0 array + _descriptor.btcnt = ADC_BUF_LEN; // Beat count + _descriptor.btctrl = DMAC_BTCTRL_BEATSIZE_HWORD | // Beat size is HWORD (16-bits) + DMAC_BTCTRL_DSTINC | // Increment the destination address + DMAC_BTCTRL_VALID | // Descriptor is valid + DMAC_BTCTRL_BLOCKACT_SUSPEND; // Suspend DMAC channel 0 after block transfer + memcpy(&_descriptor_section[0], &_descriptor, sizeof(_descriptor)); // Copy the descriptor to the descriptor section + + _descriptor.descaddr = (uint32_t)&_descriptor_section[0]; // Set up a circular descriptor + _descriptor.srcaddr = (uint32_t)&ADC1->RESULT.reg; // Take the result from the ADC0 RESULT register + _descriptor.dstaddr = (uint32_t)_adc_buf_1 + sizeof(uint16_t) * ADC_BUF_LEN; // Place it in the adc_buf_1 array + _descriptor.btcnt = ADC_BUF_LEN; // Beat count + _descriptor.btctrl = DMAC_BTCTRL_BEATSIZE_HWORD | // Beat size is HWORD (16-bits) + DMAC_BTCTRL_DSTINC | // Increment the destination address + DMAC_BTCTRL_VALID | // Descriptor is valid + DMAC_BTCTRL_BLOCKACT_SUSPEND; // Suspend DMAC channel 0 after block transfer + memcpy(&_descriptor_section[1], &_descriptor, sizeof(_descriptor)); // Copy the descriptor to the descriptor section + + // Configure NVIC + NVIC_SetPriority(DMAC_1_IRQn, 0); // Set the Nested Vector Interrupt Controller (NVIC) priority for DMAC1 to 0 (highest) + NVIC_EnableIRQ(DMAC_1_IRQn); // Connect DMAC1 to Nested Vector Interrupt Controller (NVIC) + + // Activate the suspend (SUSP) interrupt on DMAC channel 1 + DMAC->Channel[1].CHINTENSET.reg = DMAC_CHINTENSET_SUSP; + + // Configure ADC + ADC1->INPUTCTRL.bit.MUXPOS = ADC_INPUTCTRL_MUXPOS_AIN12_Val; // Set the analog input to ADC0/AIN2 (PB08 - A4 on Metro M4) + while (ADC1->SYNCBUSY.bit.INPUTCTRL) + ; // Wait for synchronization + ADC1->SAMPCTRL.bit.SAMPLEN = 0x00; // Set max Sampling Time Length to half divided ADC clock pulse (2.66us) + while 
(ADC1->SYNCBUSY.bit.SAMPCTRL) + ; // Wait for synchronization + ADC1->CTRLA.reg = ADC_CTRLA_PRESCALER_DIV128; // Divide Clock ADC GCLK by 128 (48MHz/128 = 375kHz) + ADC1->CTRLB.reg = ADC_CTRLB_RESSEL_12BIT | // Set ADC resolution to 12 bits + ADC_CTRLB_FREERUN; // Set ADC to free run mode + while (ADC1->SYNCBUSY.bit.CTRLB) + ; // Wait for synchronization + ADC1->CTRLA.bit.ENABLE = 1; // Enable the ADC + while (ADC1->SYNCBUSY.bit.ENABLE) + ; // Wait for synchronization + ADC1->SWTRIG.bit.START = 1; // Initiate a software trigger to start an ADC conversion + while (ADC1->SYNCBUSY.bit.SWTRIG) + ; // Wait for synchronization + + // Enable DMA channel 1 + DMAC->Channel[1].CHCTRLA.bit.ENABLE = 1; + + // Configure Timer/Counter 5 + GCLK->PCHCTRL[TC5_GCLK_ID].reg = GCLK_PCHCTRL_CHEN | // Enable perhipheral channel for TC5 + GCLK_PCHCTRL_GEN_GCLK1; // Connect generic clock 0 at 48MHz + + TC5->COUNT16.WAVE.reg = TC_WAVE_WAVEGEN_MFRQ; // Set TC5 to Match Frequency (MFRQ) mode + TC5->COUNT16.CC[0].reg = 3000 - 1; // Set the trigger to 16 kHz: (4Mhz / 16000) - 1 + while (TC5->COUNT16.SYNCBUSY.bit.CC0) + ; // Wait for synchronization + + // Start Timer/Counter 5 + TC5->COUNT16.CTRLA.bit.ENABLE = 1; // Enable the TC5 timer + while (TC5->COUNT16.SYNCBUSY.bit.ENABLE) + ; // Wait for synchronization + } + + uint16_t _adc_buf_0[ADC_BUF_LEN]; + uint16_t _adc_buf_1[ADC_BUF_LEN]; + + // WAV files have a header. This struct defines that header + struct wavFileHeader + { + char riff[4]; /* "RIFF" */ + long flength; /* file length in bytes */ + char wave[4]; /* "WAVE" */ + char fmt[4]; /* "fmt " */ + long chunk_size; /* size of FMT chunk in bytes (usually 16) */ + short format_tag; /* 1=PCM, 257=Mu-Law, 258=A-Law, 259=ADPCM */ + short num_chans; /* 1=mono, 2=stereo */ + long srate; /* Sampling rate in samples per second */ + long bytes_per_sec; /* bytes per second = srate*bytes_per_samp */ + short bytes_per_samp; /* 2=16-bit mono, 4=16-bit stereo */ + short bits_per_samp; /* Number of bits per sample */ + char data[4]; /* "data" */ + long dlength; /* data length in bytes (filelength - 44) */ + }; + + void initBufferHeader() + { + wavFileHeader wavh; + + strncpy(wavh.riff, "RIFF", 4); + strncpy(wavh.wave, "WAVE", 4); + strncpy(wavh.fmt, "fmt ", 4); + strncpy(wavh.data, "data", 4); + + wavh.chunk_size = 16; + wavh.format_tag = 1; // PCM + wavh.num_chans = 1; // mono + wavh.srate = RATE; + wavh.bytes_per_sec = (RATE * 1 * 16 * 1) / 8; + wavh.bytes_per_samp = 2; + wavh.bits_per_samp = 16; + wavh.dlength = RATE * 2 * 1 * 16 / 2; + wavh.flength = wavh.dlength + 44; + + _writer.writeSfudBuffer((byte *)&wavh, 44); + } + + void audioCallback(uint16_t *buf, uint32_t buf_len) + { + static uint32_t idx = 44; + + if (_isRecording) + { + for (uint32_t i = 0; i < buf_len; i++) + { + int16_t audio_value = ((int16_t)buf[i] - 2048) * 16; + + _writer.writeSfudBuffer(audio_value & 0xFF); + _writer.writeSfudBuffer((audio_value >> 8) & 0xFF); + } + + idx += buf_len; + + if (idx >= BUFFER_SIZE) + { + _writer.flushSfudBuffer(); + idx = 44; + _isRecording = false; + _isRecordingReady = true; + } + } + } +}; + +Mic mic; + +void DMAC_1_Handler() +{ + mic.dmaHandler(); +} diff --git a/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/src/speech_to_text.h b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/src/speech_to_text.h new file mode 100644 index 0000000..a7ce075 --- /dev/null +++ 
b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/src/speech_to_text.h @@ -0,0 +1,102 @@ +#pragma once + +#include +#include +#include +#include + +#include "config.h" +#include "flash_stream.h" + +class SpeechToText +{ +public: + void init() + { + _token_client.setCACert(TOKEN_CERTIFICATE); + _speech_client.setCACert(SPEECH_CERTIFICATE); + _access_token = getAccessToken(); + } + + String convertSpeechToText() + { + char url[128]; + sprintf(url, SPEECH_URL, SPEECH_LOCATION, LANGUAGE); + + HTTPClient httpClient; + httpClient.begin(_speech_client, url); + + httpClient.addHeader("Authorization", String("Bearer ") + _access_token); + httpClient.addHeader("Content-Type", String("audio/wav; codecs=audio/pcm; samplerate=") + String(RATE)); + httpClient.addHeader("Accept", "application/json;text/xml"); + + Serial.println("Sending speech..."); + + FlashStream stream; + int httpResponseCode = httpClient.sendRequest("POST", &stream, BUFFER_SIZE); + + Serial.println("Speech sent!"); + + String text = ""; + + if (httpResponseCode == 200) + { + String result = httpClient.getString(); + Serial.println(result); + + DynamicJsonDocument doc(1024); + deserializeJson(doc, result.c_str()); + + JsonObject obj = doc.as(); + text = obj["DisplayText"].as(); + } + else if (httpResponseCode == 401) + { + Serial.println("Access token expired, trying again with a new token"); + _access_token = getAccessToken(); + return convertSpeechToText(); + } + else + { + Serial.print("Failed to convert text to speech - error "); + Serial.println(httpResponseCode); + } + + httpClient.end(); + + return text; + } + +private: + String getAccessToken() + { + char url[128]; + sprintf(url, TOKEN_URL, SPEECH_LOCATION); + + HTTPClient httpClient; + httpClient.begin(_token_client, url); + + httpClient.addHeader("Ocp-Apim-Subscription-Key", SPEECH_API_KEY); + int httpResultCode = httpClient.POST("{}"); + + if (httpResultCode != 200) + { + Serial.println("Error getting access token, trying again..."); + delay(10000); + return getAccessToken(); + } + + Serial.println("Got access token."); + String result = httpClient.getString(); + + httpClient.end(); + + return result; + } + + WiFiClientSecure _token_client; + WiFiClientSecure _speech_client; + String _access_token; +}; + +SpeechToText speechToText; diff --git a/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/test/README b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/test/README new file mode 100644 index 0000000..b94d089 --- /dev/null +++ b/6-consumer/lessons/1-speech-recognition/code-speech-to-text/wio-terminal/smart-timer/test/README @@ -0,0 +1,11 @@ + +This directory is intended for PlatformIO Unit Testing and project tests. + +Unit Testing is a software testing method by which individual units of +source code, sets of one or more MCU program modules together with associated +control data, usage procedures, and operating procedures, are tested to +determine whether they are fit for use. Unit testing finds problems early +in the development cycle. 
+ +More information about PlatformIO Unit Testing: +- https://docs.platformio.org/page/plus/unit-testing.html diff --git a/6-consumer/lessons/1-speech-recognition/pi-speech-to-text.md b/6-consumer/lessons/1-speech-recognition/pi-speech-to-text.md index 5e9eac9..1f18f11 100644 --- a/6-consumer/lessons/1-speech-recognition/pi-speech-to-text.md +++ b/6-consumer/lessons/1-speech-recognition/pi-speech-to-text.md @@ -12,11 +12,10 @@ The audio can be sent to the speech service using the REST API. To use the speec 1. Remove the `play_audio` function. This is no longer needed as you don't want a smart timer to repeat back to you what you said. -1. Add the following imports to the top of the `app.py` file: +1. Add the following import to the top of the `app.py` file: ```python import requests - import json ``` 1. Add the following code above the `while True` loop to declare some settings for the speech service: @@ -74,7 +73,7 @@ The audio can be sent to the speech service using the REST API. To use the speec ```python response = requests.post(url, headers=headers, params=params, data=buffer) - response_json = json.loads(response.text) + response_json = response.json() if response_json['RecognitionStatus'] == 'Success': return response_json['DisplayText'] @@ -84,11 +83,18 @@ The audio can be sent to the speech service using the REST API. To use the speec This calls the URL and decodes the JSON value that comes in the response. The `RecognitionStatus` value in the response indicates if the call was able to extract speech into text successfully, and if this is `Success` then the text is returned from the function, otherwise an empty string is returned. -1. Finally replace the call to `play_audio` in the `while True` loop with a call to the `convert_speech_to_text` function, as well as printing the text to the console: +1. Above the `while True:` loop, define a function to process the text returned from the speech to text service. This function will just print the text to the console for now. + + ```python + def process_text(text): + print(text) + ``` + +1. Finally replace the call to `play_audio` in the `while True` loop with a call to the `convert_speech_to_text` function, passing the text to the `process_text` function: ```python text = convert_speech_to_text(buffer) - print(text) + process_text(text) ``` 1. Run the code. Press the button and speak into the microphone. Release the button when you are done, and the audio will be converted to text and printed to the console. diff --git a/6-consumer/lessons/1-speech-recognition/virtual-device-speech-to-text.md b/6-consumer/lessons/1-speech-recognition/virtual-device-speech-to-text.md index 8ec5b8a..5aaf102 100644 --- a/6-consumer/lessons/1-speech-recognition/virtual-device-speech-to-text.md +++ b/6-consumer/lessons/1-speech-recognition/virtual-device-speech-to-text.md @@ -32,6 +32,7 @@ On Windows, Linux, and macOS, the speech services Python SDK can be used to list 1. Add the following imports to the `app,py` file: ```python + import requests import time from azure.cognitiveservices.speech import SpeechConfig, SpeechRecognizer ``` @@ -62,11 +63,14 @@ On Windows, Linux, and macOS, the speech services Python SDK can be used to list recognizer = SpeechRecognizer(speech_config=recognizer_config) ``` -1. The speech recognizer runs on a background thread, listening for audio and converting any speech in it to text. You can get the text using a callback function - a function you define and pass to the recognizer. 
Every time speech is detected, the callback is called. Add the following code to define a callback that prints the text to the console, and pass this callback to the recognizer:
+1. The speech recognizer runs on a background thread, listening for audio and converting any speech in it to text. You can get the text using a callback function - a function you define and pass to the recognizer. Every time speech is detected, the callback is called. Add the following code to define a callback, and pass this callback to the recognizer, as well as defining a function to process the text, writing it to the console:
 
     ```python
+    def process_text(text):
+        print(text)
+
     def recognized(args):
-        print(args.result.text)
+        process_text(args.result.text)
 
     recognizer.recognized.connect(recognized)
     ```
diff --git a/6-consumer/lessons/1-speech-recognition/wio-terminal-audio.md b/6-consumer/lessons/1-speech-recognition/wio-terminal-audio.md
index 9c643d6..aca2922 100644
--- a/6-consumer/lessons/1-speech-recognition/wio-terminal-audio.md
+++ b/6-consumer/lessons/1-speech-recognition/wio-terminal-audio.md
@@ -1,3 +1,536 @@
 # Capture audio - Wio Terminal
 
-Coming soon!
+In this part of the lesson, you will write code to capture audio on your Wio Terminal. Audio capture will be controlled by one of the buttons on the top of the Wio Terminal.
+
+## Program the device to capture audio
+
+You can capture audio from the microphone using C++ code. The Wio Terminal only has 192KB of RAM, not enough to capture more than a couple of seconds of audio. It also has 4MB of flash memory, so this can be used instead, saving captured audio to the flash memory.
+
+The built-in microphone captures an analog signal, which gets converted to a digital signal that the Wio Terminal can use. When capturing audio, the data needs to be captured at the correct time - for example to capture audio at 16KHz, the audio needs to be captured exactly 16,000 times per second, with equal intervals between each sample. Rather than use your code to do this, you can use the direct memory access controller (DMAC). This is circuitry that can capture a signal from somewhere and write it to memory, without interrupting your code running on the processor.
+
+✅ Read more on DMA on the [direct memory access page on Wikipedia](https://wikipedia.org/wiki/Direct_memory_access).
+
+![Audio from the mic goes to an ADC then to the DMAC. This writes to one buffer. When this buffer is full, it is processed and the DMAC writes to a second buffer](../../../images/dmac-adc-buffers.png)
+
+The DMAC can capture audio from the ADC at fixed intervals, such as 16,000 times a second for 16KHz audio. It can write this captured data to a pre-allocated memory buffer, and when this is full, make it available to your code to process. Using this memory can delay capturing audio, but you can set up multiple buffers. The DMAC writes to buffer 1, then when it's full, notifies your code to process buffer 1, whilst the DMAC writes to buffer 2. When buffer 2 is full, it notifies your code, and goes back to writing to buffer 1. That way, as long as you process each buffer in less time than it takes to fill one, you will not lose any data.
+
+Once each buffer has been captured, it can be written to the flash memory. Flash memory needs to be written to using defined addresses, specifying where to write and how large to write, similar to updating an array of bytes in memory.
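+
+To make this flow concrete, here is a minimal sketch of the double-buffer (ping-pong) pattern described above, with each captured buffer handed straight to a flash write. This is not the code you will build in this lesson - the real implementation below uses the DMAC registers and the SFUD flash library - and the `flashWrite` helper and `ready_buffer` flag are made-up stand-ins for illustration only:
+
+```cpp
+#include <Arduino.h>
+
+const size_t ADC_BUF_LEN = 1600;   // samples per DMA buffer
+
+uint16_t adc_buf_0[ADC_BUF_LEN];   // filled by the DMAC while your code runs
+uint16_t adc_buf_1[ADC_BUF_LEN];   // filled next, while adc_buf_0 is processed
+
+volatile int ready_buffer = -1;    // set to 0 or 1 by the DMAC when a buffer is full
+size_t flash_address = 0;          // next flash address to write to
+
+// Stand-in for the flash driver - the real code uses the SFUD library, and has
+// to size and align its writes to the flash erase granularity described next.
+void flashWrite(size_t address, const uint8_t *data, size_t len)
+{
+    // write len bytes of data to the flash memory at the given address
+}
+
+void processBuffer(uint16_t *buf)
+{
+    flashWrite(flash_address, reinterpret_cast<uint8_t *>(buf), ADC_BUF_LEN * sizeof(uint16_t));
+    flash_address += ADC_BUF_LEN * sizeof(uint16_t);
+}
+
+void setup()
+{
+    // the real code configures the ADC, the DMAC and a timer here
+}
+
+void loop()
+{
+    // While one buffer is processed, the DMAC is already filling the other,
+    // so no samples are lost as long as this completes before the buffers swap.
+    if (ready_buffer == 0)
+    {
+        ready_buffer = -1;
+        processBuffer(adc_buf_0);
+    }
+    else if (ready_buffer == 1)
+    {
+        ready_buffer = -1;
+        processBuffer(adc_buf_1);
+    }
+}
+```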
Flash memory has granularity, meaning erase and writing operations rely not only on being of a fixed size, but aligning to that size. For example, if the granularity is 4096 bytes and you request an erase at address 4200, it could erase all the data from address 4096 to 8192. This means when you write the audio data to flash memory, it has to be in chunks of the correct size. + +### Task - configure flash memory + +1. Create a brand new Wio Terminal project using PlatformIO. Call this project `smart-timer`. Add code in the `setup` function to configure the serial port. + +1. Add the following library dependencies to the `platformio.ini` file to provide access to the flash memory: + + ```ini + lib_deps = + seeed-studio/Seeed Arduino FS @ 2.0.3 + seeed-studio/Seeed Arduino SFUD @ 2.0.1 + ``` + +1. Open the `main.cpp` file and add the following include directive for the flash memory library to the top of the file: + + ```cpp + #include + #include + ``` + + > 🎓 SFUD stands for Serial Flash Universal Driver, and is a library designed to work with all flash memory chips + +1. In the `setup` function, add the following code to set up the flash storage library: + + ```cpp + while (!(sfud_init() == SFUD_SUCCESS)) + ; + + sfud_qspi_fast_read_enable(sfud_get_device(SFUD_W25Q32_DEVICE_INDEX), 2); + ``` + + This loops until the SFUD library is initialized, then turns on fast reads. The built-in flash memory can be accessed using a Queued Serial Peripheral Interface (QSPI), a type of SPI controller that allows continuous access via a queue with minimal processor usage. This makes it faster to read and write to flash memory. + +1. Create a new file in the `src` folder called `flash_writer.h`. + +1. Add the following to the top of this file: + + ```cpp + #pragma once + + #include + #include + ``` + + This includes some needed header files, including the header file for the SFUD library to interact with flash memory + +1. Define a class in this new header file called `FlashWriter`: + + ```cpp + class FlashWriter + { + public: + + private: + }; + ``` + +1. In the `private` section, add the following code: + + ```cpp + byte *_sfudBuffer; + size_t _sfudBufferSize; + size_t _sfudBufferPos; + size_t _sfudBufferWritePos; + + const sfud_flash *_flash; + ``` + + This defines some fields for the buffer to use to store data before writing it to the flash memory. There is a byte array, `_sfudBuffer`, to write data to, and when this is full, the data is written to flash memory. The `_sfudBufferPos` field stores the current location to write to in this buffer, and `_sfudBufferWritePos` stores the location in flash memory to write to. `_flash` is a pointer the flash memory to write to - some microcontrollers have multiple flash memory chips. + +1. Add the following method to the `public` section to initialize this class: + + ```cpp + void init() + { + _flash = sfud_get_device_table() + 0; + _sfudBufferSize = _flash->chip.erase_gran; + _sfudBuffer = new byte[_sfudBufferSize]; + _sfudBufferPos = 0; + _sfudBufferWritePos = 0; + } + ``` + + This configures the flash memory on teh Wio Terminal to write to, and sets up the buffers based off the grain size of the flash memory. This is in an `init` method, rather than a constructor as this needs to be called after the flash memory has been set up in the `setup` function. + +1. 
Add the following code to the `public` section: + + ```cpp + void writeSfudBuffer(byte b) + { + _sfudBuffer[_sfudBufferPos++] = b; + if (_sfudBufferPos == _sfudBufferSize) + { + sfud_erase_write(_flash, _sfudBufferWritePos, _sfudBufferSize, _sfudBuffer); + _sfudBufferWritePos += _sfudBufferSize; + _sfudBufferPos = 0; + } + } + + void writeSfudBuffer(byte *b, size_t len) + { + for (size_t i = 0; i < len; ++i) + { + writeSfudBuffer(b[i]); + } + } + + void flushSfudBuffer() + { + if (_sfudBufferPos > 0) + { + sfud_erase_write(_flash, _sfudBufferWritePos, _sfudBufferSize, _sfudBuffer); + _sfudBufferWritePos += _sfudBufferSize; + _sfudBufferPos = 0; + } + } + ``` + + This code defines methods to write bytes to the flash storage system. It works by writing to an in-memory buffer that is the right size for the flash memory, and when this is full, this is written to the flash memory, erasing any existing data at that location. There is also a `flushSfudBuffer` to write an incomplete buffer, as the data being captured won't be exact multiples of the grain size, so the end part of the data needs to be written. + + > 💁 The end part of the data will write additional unwanted data, but this is ok as only the data needed will be read. + +### Task - set up audio capture + +1. Create a new file in the `src` folder called `config.h`. + +1. Add the following to the top of this file: + + ```cpp + #pragma once + + #define RATE 16000 + #define SAMPLE_LENGTH_SECONDS 4 + #define SAMPLES RATE * SAMPLE_LENGTH_SECONDS + #define BUFFER_SIZE (SAMPLES * 2) + 44 + #define ADC_BUF_LEN 1600 + ``` + + This code sets up some constants for the audio capture. + + | Constant | Value | Description | + | --------------------- | -----: | - | + | RATE | 16000 | The sample rate for the audio. !6,000 is 16KHz | + | SAMPLE_LENGTH_SECONDS | 4 | The length of audio to capture. This is set to 4 seconds. To record longer audio, increase this. | + | SAMPLES | 64000 | The total number of audio samples that will be captured. Set to the sample rate * the number of seconds | + | BUFFER_SIZE | 128044 | The size of the audio buffer to capture. Audio will be captured as a WAV file, which is 44 bytes of header, then 128,000 bytes of audio date (each sample is 2 bytes) | + | ADC_BUF_LEN | 1600 | The size of the buffers to use to capture audio from the DMAC | + + > 💁 If you find 4 seconds is too short to request a timer, you can increase the `SAMPLE_LENGTH_SECONDS` value, and all the other values will recalculate. + +1. Create a new file in the `src` folder called `mic.h`. + +1. Add the following to the top of this file: + + ```cpp + #pragma once + + #include + + #include "config.h" + #include "flash_writer.h" + ``` + + This includes some needed header files, including the `config.h` and `FlashWriter` header files. + +1. Add the following to define a `Mic` class that can capture from the microphone: + + ```cpp + class Mic + { + public: + Mic() + { + _isRecording = false; + _isRecordingReady = false; + } + + void startRecording() + { + _isRecording = true; + _isRecordingReady = false; + } + + bool isRecording() + { + return _isRecording; + } + + bool isRecordingReady() + { + return _isRecordingReady; + } + + private: + volatile bool _isRecording; + volatile bool _isRecordingReady; + FlashWriter _writer; + }; + + Mic mic; + ``` + + This class currently only has a couple of fields to track if recording has started, and if a recording is ready to be used. 
When the DMAC is set up, it continuously writes to memory buffers, so the `_isRecording` flag determines if these should be processed or ignored. The `_isRecordingReady` flag will be set when the required 4 seconds of audio has been captured. The `_writer` field is used to save the audio data to flash memory. + + A global variable is then declared for an instance of the `Mic` class. + +1. Add the following code to the `private` section of the `Mic` class: + + ```cpp + typedef struct + { + uint16_t btctrl; + uint16_t btcnt; + uint32_t srcaddr; + uint32_t dstaddr; + uint32_t descaddr; + } dmacdescriptor; + + // Globals - DMA and ADC + volatile dmacdescriptor _wrb[DMAC_CH_NUM] __attribute__((aligned(16))); + dmacdescriptor _descriptor_section[DMAC_CH_NUM] __attribute__((aligned(16))); + dmacdescriptor _descriptor __attribute__((aligned(16))); + + void configureDmaAdc() + { + // Configure DMA to sample from ADC at a regular interval (triggered by timer/counter) + DMAC->BASEADDR.reg = (uint32_t)_descriptor_section; // Specify the location of the descriptors + DMAC->WRBADDR.reg = (uint32_t)_wrb; // Specify the location of the write back descriptors + DMAC->CTRL.reg = DMAC_CTRL_DMAENABLE | DMAC_CTRL_LVLEN(0xf); // Enable the DMAC peripheral + DMAC->Channel[1].CHCTRLA.reg = DMAC_CHCTRLA_TRIGSRC(TC5_DMAC_ID_OVF) | // Set DMAC to trigger on TC5 timer overflow + DMAC_CHCTRLA_TRIGACT_BURST; // DMAC burst transfer + + _descriptor.descaddr = (uint32_t)&_descriptor_section[1]; // Set up a circular descriptor + _descriptor.srcaddr = (uint32_t)&ADC1->RESULT.reg; // Take the result from the ADC0 RESULT register + _descriptor.dstaddr = (uint32_t)_adc_buf_0 + sizeof(uint16_t) * ADC_BUF_LEN; // Place it in the adc_buf_0 array + _descriptor.btcnt = ADC_BUF_LEN; // Beat count + _descriptor.btctrl = DMAC_BTCTRL_BEATSIZE_HWORD | // Beat size is HWORD (16-bits) + DMAC_BTCTRL_DSTINC | // Increment the destination address + DMAC_BTCTRL_VALID | // Descriptor is valid + DMAC_BTCTRL_BLOCKACT_SUSPEND; // Suspend DMAC channel 0 after block transfer + memcpy(&_descriptor_section[0], &_descriptor, sizeof(_descriptor)); // Copy the descriptor to the descriptor section + + _descriptor.descaddr = (uint32_t)&_descriptor_section[0]; // Set up a circular descriptor + _descriptor.srcaddr = (uint32_t)&ADC1->RESULT.reg; // Take the result from the ADC0 RESULT register + _descriptor.dstaddr = (uint32_t)_adc_buf_1 + sizeof(uint16_t) * ADC_BUF_LEN; // Place it in the adc_buf_1 array + _descriptor.btcnt = ADC_BUF_LEN; // Beat count + _descriptor.btctrl = DMAC_BTCTRL_BEATSIZE_HWORD | // Beat size is HWORD (16-bits) + DMAC_BTCTRL_DSTINC | // Increment the destination address + DMAC_BTCTRL_VALID | // Descriptor is valid + DMAC_BTCTRL_BLOCKACT_SUSPEND; // Suspend DMAC channel 0 after block transfer + memcpy(&_descriptor_section[1], &_descriptor, sizeof(_descriptor)); // Copy the descriptor to the descriptor section + + // Configure NVIC + NVIC_SetPriority(DMAC_1_IRQn, 0); // Set the Nested Vector Interrupt Controller (NVIC) priority for DMAC1 to 0 (highest) + NVIC_EnableIRQ(DMAC_1_IRQn); // Connect DMAC1 to Nested Vector Interrupt Controller (NVIC) + + // Activate the suspend (SUSP) interrupt on DMAC channel 1 + DMAC->Channel[1].CHINTENSET.reg = DMAC_CHINTENSET_SUSP; + + // Configure ADC + ADC1->INPUTCTRL.bit.MUXPOS = ADC_INPUTCTRL_MUXPOS_AIN12_Val; // Set the analog input to ADC0/AIN2 (PB08 - A4 on Metro M4) + while (ADC1->SYNCBUSY.bit.INPUTCTRL) + ; // Wait for synchronization + ADC1->SAMPCTRL.bit.SAMPLEN = 0x00; // Set max Sampling Time 
Length to half divided ADC clock pulse (2.66us) + while (ADC1->SYNCBUSY.bit.SAMPCTRL) + ; // Wait for synchronization + ADC1->CTRLA.reg = ADC_CTRLA_PRESCALER_DIV128; // Divide Clock ADC GCLK by 128 (48MHz/128 = 375kHz) + ADC1->CTRLB.reg = ADC_CTRLB_RESSEL_12BIT | // Set ADC resolution to 12 bits + ADC_CTRLB_FREERUN; // Set ADC to free run mode + while (ADC1->SYNCBUSY.bit.CTRLB) + ; // Wait for synchronization + ADC1->CTRLA.bit.ENABLE = 1; // Enable the ADC + while (ADC1->SYNCBUSY.bit.ENABLE) + ; // Wait for synchronization + ADC1->SWTRIG.bit.START = 1; // Initiate a software trigger to start an ADC conversion + while (ADC1->SYNCBUSY.bit.SWTRIG) + ; // Wait for synchronization + + // Enable DMA channel 1 + DMAC->Channel[1].CHCTRLA.bit.ENABLE = 1; + + // Configure Timer/Counter 5 + GCLK->PCHCTRL[TC5_GCLK_ID].reg = GCLK_PCHCTRL_CHEN | // Enable perhipheral channel for TC5 + GCLK_PCHCTRL_GEN_GCLK1; // Connect generic clock 0 at 48MHz + + TC5->COUNT16.WAVE.reg = TC_WAVE_WAVEGEN_MFRQ; // Set TC5 to Match Frequency (MFRQ) mode + TC5->COUNT16.CC[0].reg = 3000 - 1; // Set the trigger to 16 kHz: (4Mhz / 16000) - 1 + while (TC5->COUNT16.SYNCBUSY.bit.CC0) + ; // Wait for synchronization + + // Start Timer/Counter 5 + TC5->COUNT16.CTRLA.bit.ENABLE = 1; // Enable the TC5 timer + while (TC5->COUNT16.SYNCBUSY.bit.ENABLE) + ; // Wait for synchronization + } + + uint16_t _adc_buf_0[ADC_BUF_LEN]; + uint16_t _adc_buf_1[ADC_BUF_LEN]; + ``` + + This code defines a `configureDmaAdc` method that configures the DMAC, connecting it to the ADC and setting it to populate two different alternating buffers, `_adc_buf_0` and `_adc_buf_0`. + + > 💁 One of the downsides of microcontroller development is the complexity of the code needed to interact with hardware, as your code runs at a very low level interacting with hardware directly. This code is more complex than what you would write for a single-board computer or desktop computer as there is no operating system to help. There are some libraries available that can simplify this, but there is still a lot of complexity. + +1. Below this, add the following code: + + ```cpp + // WAV files have a header. This struct defines that header + struct wavFileHeader + { + char riff[4]; /* "RIFF" */ + long flength; /* file length in bytes */ + char wave[4]; /* "WAVE" */ + char fmt[4]; /* "fmt " */ + long chunk_size; /* size of FMT chunk in bytes (usually 16) */ + short format_tag; /* 1=PCM, 257=Mu-Law, 258=A-Law, 259=ADPCM */ + short num_chans; /* 1=mono, 2=stereo */ + long srate; /* Sampling rate in samples per second */ + long bytes_per_sec; /* bytes per second = srate*bytes_per_samp */ + short bytes_per_samp; /* 2=16-bit mono, 4=16-bit stereo */ + short bits_per_samp; /* Number of bits per sample */ + char data[4]; /* "data" */ + long dlength; /* data length in bytes (filelength - 44) */ + }; + + void initBufferHeader() + { + wavFileHeader wavh; + + strncpy(wavh.riff, "RIFF", 4); + strncpy(wavh.wave, "WAVE", 4); + strncpy(wavh.fmt, "fmt ", 4); + strncpy(wavh.data, "data", 4); + + wavh.chunk_size = 16; + wavh.format_tag = 1; // PCM + wavh.num_chans = 1; // mono + wavh.srate = RATE; + wavh.bytes_per_sec = (RATE * 1 * 16 * 1) / 8; + wavh.bytes_per_samp = 2; + wavh.bits_per_samp = 16; + wavh.dlength = RATE * 2 * 1 * 16 / 2; + wavh.flength = wavh.dlength + 44; + + _writer.writeSfudBuffer((byte *)&wavh, 44); + } + ``` + + This code defines the WAV header as a struct that takes up 44 bytes of memory. It writes details to it about the audio file rate, size, and number of channels. 
This header is then written to the flash memory + +1. Below this code, add the following to declare a method to be called when the audio buffers are ready to process: + + ```cpp + void audioCallback(uint16_t *buf, uint32_t buf_len) + { + static uint32_t idx = 44; + + if (_isRecording) + { + for (uint32_t i = 0; i < buf_len; i++) + { + int16_t audio_value = ((int16_t)buf[i] - 2048) * 16; + + _writer.writeSfudBuffer(audio_value & 0xFF); + _writer.writeSfudBuffer((audio_value >> 8) & 0xFF); + } + + idx += buf_len; + + if (idx >= BUFFER_SIZE) + { + _writer.flushSfudBuffer(); + idx = 44; + _isRecording = false; + _isRecordingReady = true; + } + } + } + ``` + + The audio buffers are arrays of 16-bit integers containing the audio from the ADC. The ADC returns 12-bit unsigned values (0-1023), so these need to be converted to 16-bit signed values, and then converted into 2 bytes to be stored as raw binary data. + + These bytes are written to the flash memory buffers. The write starts at index 44 - this is the offset from the 44 bytes written as the WAV file header. Once all the bytes needed for the required audio length have been captured, the remaing data is written to the flash memory. + +1. In the `public` section of the `Mic` class, add the following code: + + ```cpp + void dmaHandler() + { + static uint8_t count = 0; + + if (DMAC->Channel[1].CHINTFLAG.bit.SUSP) + { + DMAC->Channel[1].CHCTRLB.reg = DMAC_CHCTRLB_CMD_RESUME; + DMAC->Channel[1].CHINTFLAG.bit.SUSP = 1; + + if (count) + { + audioCallback(_adc_buf_0, ADC_BUF_LEN); + } + else + { + audioCallback(_adc_buf_1, ADC_BUF_LEN); + } + + count = (count + 1) % 2; + } + } + ``` + + This code will be called by the DMAC to tell your code to process the buffers. It checks that there is data to process, and calls the `audioCallback` method with the relevant buffer. + +1. Outside the class, after the `Mic mic;` declaration, add the following code: + + ```cpp + void DMAC_1_Handler() + { + mic.dmaHandler(); + } + ``` + + The `DMAC_1_Handler` will be called by the DMAC when there the buffers are ready to process. This function is found by name, so just needs to exist to be called. + +1. Add the following two methods to the `public` section of the `Mic` class: + + ```cpp + void init() + { + analogReference(AR_INTERNAL2V23); + + _writer.init(); + + initBufferHeader(); + configureDmaAdc(); + } + + void reset() + { + _isRecordingReady = false; + _isRecording = false; + + _writer.reset(); + + initBufferHeader(); + } + ``` + + The `init` method contain code to initialize the `Mic` class. This method sets the correct voltage for the Mic pin, sets up the flash memory writer, writes the WAV file header, and configures the DMAC. The `reset` method resets the flash memory and re-writes the header after the audio has been captured and used. + +### Task - capture audio + +1. In the `main.cpp` file, and an include directive for the `mic.h` header file: + + ```cpp + #include "mic.h" + ``` + +1. In the `setup` function, initialize the C button. Audio capture will start when this button is pressed, and continue for 4 seconds: + + ```cpp + pinMode(WIO_KEY_C, INPUT_PULLUP); + ``` + +1. Below this, initialize the microphone, then print to the console that audio is ready to be captured: + + ```cpp + mic.init(); + + Serial.println("Ready."); + ``` + +1. Above the `loop` function, define a function to process the captured audio. 
For now this does nothing, but later in this lesson it will send the speech to be converted to text: + + ```cpp + void processAudio() + { + + } + ``` + +1. Add the following to the `loop` function: + + ```cpp + void loop() + { + if (digitalRead(WIO_KEY_C) == LOW && !mic.isRecording()) + { + Serial.println("Starting recording..."); + mic.startRecording(); + } + + if (!mic.isRecording() && mic.isRecordingReady()) + { + Serial.println("Finished recording"); + + processAudio(); + + mic.reset(); + } + } + ``` + + This code checks hte C button, and if this is pressed and recording hasn't started, then the `_isRecording` field of the `Mic` class is set to true. This will cause the `audioCallback` method of the `Mic` class to store audio until 4 seconds has been captured. Once 4 seconds of audio has been captured, the `_isRecording` field is set to false, and the `_isRecordingReady` field is set to true. This is then checked in the `loop` function, and when true the `processAudio` function is called, then the mic class is reset. + +1. Build this code, upload it to your Wio Terminal and test it out through the serial monitor. Press the C button (the one on the left-hand side, closest to the power switch), and speak. 4 seconds of audio will be captured. + + ```output + --- Available filters and text transformations: colorize, debug, default, direct, hexlify, log2file, nocontrol, printable, send_on_enter, time + --- More details at http://bit.ly/pio-monitor-filters + --- Miniterm on /dev/cu.usbmodem1101 9600,8,N,1 --- + --- Quit: Ctrl+C | Menu: Ctrl+T | Help: Ctrl+T followed by Ctrl+H --- + Ready. + Starting recording... + Finished recording + ``` + +> 💁 You can find this code in the [code-record/wio-terminal](code-record/wio-terminal) folder. + +😀 Your audio recording program was a success! diff --git a/6-consumer/lessons/1-speech-recognition/wio-terminal-microphone.md b/6-consumer/lessons/1-speech-recognition/wio-terminal-microphone.md index 1e9bb93..4212f41 100644 --- a/6-consumer/lessons/1-speech-recognition/wio-terminal-microphone.md +++ b/6-consumer/lessons/1-speech-recognition/wio-terminal-microphone.md @@ -1,3 +1,11 @@ # Configure your microphone and speakers - Wio Terminal -Coming soon! +In this part of the lesson, you will add and speakers to your Wio Terminal. The Wio Terminal already has a microphone built-in, and this can be used to capture speech. + +## Hardware + +Coming soon + +### Task - connect speakers + +Coming soon diff --git a/6-consumer/lessons/1-speech-recognition/wio-terminal-speech-to-text.md b/6-consumer/lessons/1-speech-recognition/wio-terminal-speech-to-text.md index e89f1ca..a1e2cdd 100644 --- a/6-consumer/lessons/1-speech-recognition/wio-terminal-speech-to-text.md +++ b/6-consumer/lessons/1-speech-recognition/wio-terminal-speech-to-text.md @@ -1,3 +1,521 @@ # Speech to text - Wio Terminal -Coming soon! +In this part of the lesson, you will write code to convert speech in the captured audio to text using the speech service. + +## Send the audio to the speech service + +The audio can be sent to the speech service using the REST API. To use the speech service, first you need to request an access token, then use that token to access the REST API. These access tokens expire after 10 minutes, so your code should request them on a regular basis to ensure they are always up to date. + +### Task - get an access token + +1. Open the `smart-timer` project if it's not already open. + +1. 
Add the following library dependencies to the `platformio.ini` file to access WiFi and handle JSON: + + ```ini + seeed-studio/Seeed Arduino rpcWiFi @ 1.0.5 + seeed-studio/Seeed Arduino rpcUnified @ 2.1.3 + seeed-studio/Seeed_Arduino_mbedtls @ 3.0.1 + seeed-studio/Seeed Arduino RTC @ 2.0.0 + bblanchon/ArduinoJson @ 6.17.3 + ``` + +1. Add the following code to the `config.h` header file: + + ```cpp + const char *SSID = ""; + const char *PASSWORD = ""; + + const char *SPEECH_API_KEY = ""; + const char *SPEECH_LOCATION = ""; + const char *LANGUAGE = ""; + + const char *TOKEN_URL = "https://%s.api.cognitive.microsoft.com/sts/v1.0/issuetoken"; + ``` + + Replace `` and `` with the relevant values for your WiFi. + + Replace `` with the API key for your speech service resource. Replace `` with the location you used when you created the speech service resource. + + Replace `` with the locale name for language you will be speaking in, for example `en-GB` for English, or `zn-HK` for Cantonese. You can find a list of the supported languages and their locale names in the [Language and voice support documentation on Microsoft docs](https://docs.microsoft.com/azure/cognitive-services/speech-service/language-support?WT.mc_id=academic-17441-jabenn#speech-to-text). + + The `TOKEN_URL` constant is the URL of the token issuer without the location. This will be combined with the location later to get the full URL. + +1. Just like connecting to Custom Vision, you will need to use an HTTPS connection to connect to the token issuing service. To the end of `config.h`, add the following code: + + ```cpp + const char *TOKEN_CERTIFICATE = + "-----BEGIN CERTIFICATE-----\r\n" + "MIIF8zCCBNugAwIBAgIQAueRcfuAIek/4tmDg0xQwDANBgkqhkiG9w0BAQwFADBh\r\n" + "MQswCQYDVQQGEwJVUzEVMBMGA1UEChMMRGlnaUNlcnQgSW5jMRkwFwYDVQQLExB3\r\n" + "d3cuZGlnaWNlcnQuY29tMSAwHgYDVQQDExdEaWdpQ2VydCBHbG9iYWwgUm9vdCBH\r\n" + "MjAeFw0yMDA3MjkxMjMwMDBaFw0yNDA2MjcyMzU5NTlaMFkxCzAJBgNVBAYTAlVT\r\n" + "MR4wHAYDVQQKExVNaWNyb3NvZnQgQ29ycG9yYXRpb24xKjAoBgNVBAMTIU1pY3Jv\r\n" + "c29mdCBBenVyZSBUTFMgSXNzdWluZyBDQSAwNjCCAiIwDQYJKoZIhvcNAQEBBQAD\r\n" + "ggIPADCCAgoCggIBALVGARl56bx3KBUSGuPc4H5uoNFkFH4e7pvTCxRi4j/+z+Xb\r\n" + "wjEz+5CipDOqjx9/jWjskL5dk7PaQkzItidsAAnDCW1leZBOIi68Lff1bjTeZgMY\r\n" + "iwdRd3Y39b/lcGpiuP2d23W95YHkMMT8IlWosYIX0f4kYb62rphyfnAjYb/4Od99\r\n" + "ThnhlAxGtfvSbXcBVIKCYfZgqRvV+5lReUnd1aNjRYVzPOoifgSx2fRyy1+pO1Uz\r\n" + "aMMNnIOE71bVYW0A1hr19w7kOb0KkJXoALTDDj1ukUEDqQuBfBxReL5mXiu1O7WG\r\n" + "0vltg0VZ/SZzctBsdBlx1BkmWYBW261KZgBivrql5ELTKKd8qgtHcLQA5fl6JB0Q\r\n" + "gs5XDaWehN86Gps5JW8ArjGtjcWAIP+X8CQaWfaCnuRm6Bk/03PQWhgdi84qwA0s\r\n" + "sRfFJwHUPTNSnE8EiGVk2frt0u8PG1pwSQsFuNJfcYIHEv1vOzP7uEOuDydsmCjh\r\n" + "lxuoK2n5/2aVR3BMTu+p4+gl8alXoBycyLmj3J/PUgqD8SL5fTCUegGsdia/Sa60\r\n" + "N2oV7vQ17wjMN+LXa2rjj/b4ZlZgXVojDmAjDwIRdDUujQu0RVsJqFLMzSIHpp2C\r\n" + "Zp7mIoLrySay2YYBu7SiNwL95X6He2kS8eefBBHjzwW/9FxGqry57i71c2cDAgMB\r\n" + "AAGjggGtMIIBqTAdBgNVHQ4EFgQU1cFnOsKjnfR3UltZEjgp5lVou6UwHwYDVR0j\r\n" + "BBgwFoAUTiJUIBiV5uNu5g/6+rkS7QYXjzkwDgYDVR0PAQH/BAQDAgGGMB0GA1Ud\r\n" + "JQQWMBQGCCsGAQUFBwMBBggrBgEFBQcDAjASBgNVHRMBAf8ECDAGAQH/AgEAMHYG\r\n" + "CCsGAQUFBwEBBGowaDAkBggrBgEFBQcwAYYYaHR0cDovL29jc3AuZGlnaWNlcnQu\r\n" + "Y29tMEAGCCsGAQUFBzAChjRodHRwOi8vY2FjZXJ0cy5kaWdpY2VydC5jb20vRGln\r\n" + "aUNlcnRHbG9iYWxSb290RzIuY3J0MHsGA1UdHwR0MHIwN6A1oDOGMWh0dHA6Ly9j\r\n" + "cmwzLmRpZ2ljZXJ0LmNvbS9EaWdpQ2VydEdsb2JhbFJvb3RHMi5jcmwwN6A1oDOG\r\n" + "MWh0dHA6Ly9jcmw0LmRpZ2ljZXJ0LmNvbS9EaWdpQ2VydEdsb2JhbFJvb3RHMi5j\r\n" + 
"cmwwHQYDVR0gBBYwFDAIBgZngQwBAgEwCAYGZ4EMAQICMBAGCSsGAQQBgjcVAQQD\r\n" + "AgEAMA0GCSqGSIb3DQEBDAUAA4IBAQB2oWc93fB8esci/8esixj++N22meiGDjgF\r\n" + "+rA2LUK5IOQOgcUSTGKSqF9lYfAxPjrqPjDCUPHCURv+26ad5P/BYtXtbmtxJWu+\r\n" + "cS5BhMDPPeG3oPZwXRHBJFAkY4O4AF7RIAAUW6EzDflUoDHKv83zOiPfYGcpHc9s\r\n" + "kxAInCedk7QSgXvMARjjOqdakor21DTmNIUotxo8kHv5hwRlGhBJwps6fEVi1Bt0\r\n" + "trpM/3wYxlr473WSPUFZPgP1j519kLpWOJ8z09wxay+Br29irPcBYv0GMXlHqThy\r\n" + "8y4m/HyTQeI2IMvMrQnwqPpY+rLIXyviI2vLoI+4xKE4Rn38ZZ8m\r\n" + "-----END CERTIFICATE-----\r\n"; + ``` + + This is the same certificate you used when connecting to Custom Vision. + +1. Add an include for the WiFi header file and the config header file to the top of the `main.cpp` file: + + ```cpp + #include + + #include "config.h" + ``` + +1. Add code to connect to WiFi in `main.cpp` above the `setup` function: + + ```cpp + void connectWiFi() + { + while (WiFi.status() != WL_CONNECTED) + { + Serial.println("Connecting to WiFi.."); + WiFi.begin(SSID, PASSWORD); + delay(500); + } + + Serial.println("Connected!"); + } + ``` + +1. Call this function from the `setup` function after the serial connection has been established: + + ```cpp + connectWiFi(); + ``` + +1. Create a new header file in the `src` folder called `speech_to_text.h`. In this header file, add the following code: + + ```cpp + #pragma once + + #include + #include + #include + #include + + #include "config.h" + #include "mic.h" + + class SpeechToText + { + public: + + private: + + }; + + SpeechToText speechToText; + ``` + + This includes some necessary header files for an HTTP connection, configuration and the `mic.h` header file, and defines a class called `SpeechToText`, before declaring an instance of that class that can be used later. + +1. Add the following 2 fields to the `private` section of this class: + + ```cpp + WiFiClientSecure _token_client; + String _access_token; + ``` + + The `_token_client` is a WiFi Client that uses HTTPS and will be used to get the access token. This token will then be stored in `_access_token`. + +1. Add the following method to the `private` section: + + ```cpp + String getAccessToken() + { + char url[128]; + sprintf(url, TOKEN_URL, SPEECH_LOCATION); + + HTTPClient httpClient; + httpClient.begin(_token_client, url); + + httpClient.addHeader("Ocp-Apim-Subscription-Key", SPEECH_API_KEY); + int httpResultCode = httpClient.POST("{}"); + + if (httpResultCode != 200) + { + Serial.println("Error getting access token, trying again..."); + delay(10000); + return getAccessToken(); + } + + Serial.println("Got access token."); + String result = httpClient.getString(); + + httpClient.end(); + + return result; + } + ``` + + This code builds the URL for the token issuer API using the location of the speech resource. It then creates an `HTTPClient` to make the web request, setting it up to use the WiFi client configured with the token endpoints certificate. It sets the API key as a header for the call. It then makes a POST request to get the certificate, retrying if it gets any errors. Finally the access token is returned. + +1. To the `public` section, add an `init` method that sets up the token client: + + ```cpp + void init() + { + _token_client.setCACert(TOKEN_CERTIFICATE); + _access_token = getAccessToken(); + } + ``` + + This sets the certificate on the WiFi client, then gets the access token. + +1. In `main.cpp`, add this new header file to the include directives: + + ```cpp + #include "speech_to_text.h" + ``` + +1. 
Initialize the `SpeechToText` class at the end of the `setup` function, after the `mic.init` call but before `Ready` is written to the serial monitor: + + ```cpp + speechToText.init(); + ``` + +### Task - read audio from flash memory + +1. In an earlier part of this lesson, the audio was recorded to the flash memory. This audio will need to be sent to the Speech Services REST API, so it needs to be read from the flash memory. It can't be loaded into an in-memory buffer as it would be too large. The `HTTPClient` class that makes REST calls can stream data using an Arduino Stream - a class that can load data in small chunks, sending the chunks one at a time as part of the request. Every time you call `read` on a stream it returns the next block of data. An Arduino stream can be created that can read from the flash memory. Create a new file called `flash_stream.h` in the `src` folder, and add the following code to it: + + ```cpp + #pragma once + + #include + #include + #include + + #include "config.h" + + class FlashStream : public Stream + { + public: + virtual size_t write(uint8_t val) + { + } + + virtual int available() + { + } + + virtual int read() + { + } + + virtual int peek() + { + } + private: + + }; + ``` + + This declares the `FlashStream` class, deriving from the Arduino `Stream` class. This is an abstract class - derived classes have to implement a few methods before the class can be instantiated, and these methods are defined in this class. + + ✅ Read more on Arduino Streams in the [Arduino Stream documentation](https://www.arduino.cc/reference/en/language/functions/communication/stream/) + +1. Add the following fields to the `private` section: + + ```cpp + size_t _pos; + size_t _flash_address; + const sfud_flash *_flash; + + byte _buffer[HTTP_TCP_BUFFER_SIZE]; + ``` + + This defines a temporary buffer to store data read from the flash memory, along with fields to store the current position when reading from the buffer, the current address to read from the flash memory, and the flash memory device. + +1. In the `private` section, add the following method: + + ```cpp + void populateBuffer() + { + sfud_read(_flash, _flash_address, HTTP_TCP_BUFFER_SIZE, _buffer); + _flash_address += HTTP_TCP_BUFFER_SIZE; + _pos = 0; + } + ``` + + This code reads from the flash memory at the current address and stores the data in a buffer. It then increments the address, so the next call reads the next block of memory. The buffer is sized based on the largest chunk that the `HTTPClient` will send to the REST API at one time. + + > 💁 Erasing flash memory has to be done using the grain size, reading on the other hand does not. + +1. In the `public` section of this class, add a constructor: + + ```cpp + FlashStream() + { + _pos = 0; + _flash_address = 0; + _flash = sfud_get_device_table() + 0; + + populateBuffer(); + } + ``` + + This constructor sets up all the fields to start reading from the start of the flash memory block, and loads the first chunk of data into the buffer. + +1. Implement the `write` method. This stream will only read data, so this can do nothing and return 0: + + ```cpp + virtual size_t write(uint8_t val) + { + return 0; + } + ``` + +1. Implement the `peek` method. This returns the data at the current position without moving the stream along. Calling `peek` multiple times will always return the same data as long as no data is read from the stream. + + ```cpp + virtual int peek() + { + return _buffer[_pos]; + } + ``` + +1. Implement the `available` function. 
This returns how many bytes can be read from the stream, or -1 if the stream is complete. For this class, the maximum available will be no more than the HTTPClient's chunk size. When this stream is used in the HTTP client it calls this function to see how much data is available, then requests that much data to send to the REST API. We don't want each chunk to be more than the HTTP clients chunk size, so if more than that is available, the chunk size is returned. If less, then what is available is returned. Once all the data has been streamed, -1 is returned. + + ```cpp + virtual int available() + { + int remaining = BUFFER_SIZE - ((_flash_address - HTTP_TCP_BUFFER_SIZE) + _pos); + int bytes_available = min(HTTP_TCP_BUFFER_SIZE, remaining); + + if (bytes_available == 0) + { + bytes_available = -1; + } + + return bytes_available; + } + ``` + +1. Implement the `read` method to return the next byte from the buffer, incrementing the position. If the position exceeds the size of the buffer, it populates the buffer with the next block from the flash memory and resets the position. + + ```cpp + virtual int read() + { + int retVal = _buffer[_pos++]; + + if (_pos == HTTP_TCP_BUFFER_SIZE) + { + populateBuffer(); + } + + return retVal; + } + ``` + +1. In the `speech_to_text.h` header file, add an include directive for this new header file: + + ```cpp + #include "flash_stream.h" + ``` + +### Task - convert the speech to text + +1. The speech can be converted to text by sending the audio to the Speech Service via a REST API. This REST API has a different certificate to the token issuer, so add the following code to the `config.h` header file to define this certificate: + + ```cpp + const char *SPEECH_CERTIFICATE = + "-----BEGIN CERTIFICATE-----\r\n" + "MIIF8zCCBNugAwIBAgIQCq+mxcpjxFFB6jvh98dTFzANBgkqhkiG9w0BAQwFADBh\r\n" + "MQswCQYDVQQGEwJVUzEVMBMGA1UEChMMRGlnaUNlcnQgSW5jMRkwFwYDVQQLExB3\r\n" + "d3cuZGlnaWNlcnQuY29tMSAwHgYDVQQDExdEaWdpQ2VydCBHbG9iYWwgUm9vdCBH\r\n" + "MjAeFw0yMDA3MjkxMjMwMDBaFw0yNDA2MjcyMzU5NTlaMFkxCzAJBgNVBAYTAlVT\r\n" + "MR4wHAYDVQQKExVNaWNyb3NvZnQgQ29ycG9yYXRpb24xKjAoBgNVBAMTIU1pY3Jv\r\n" + "c29mdCBBenVyZSBUTFMgSXNzdWluZyBDQSAwMTCCAiIwDQYJKoZIhvcNAQEBBQAD\r\n" + "ggIPADCCAgoCggIBAMedcDrkXufP7pxVm1FHLDNA9IjwHaMoaY8arqqZ4Gff4xyr\r\n" + "RygnavXL7g12MPAx8Q6Dd9hfBzrfWxkF0Br2wIvlvkzW01naNVSkHp+OS3hL3W6n\r\n" + "l/jYvZnVeJXjtsKYcXIf/6WtspcF5awlQ9LZJcjwaH7KoZuK+THpXCMtzD8XNVdm\r\n" + "GW/JI0C/7U/E7evXn9XDio8SYkGSM63aLO5BtLCv092+1d4GGBSQYolRq+7Pd1kR\r\n" + "EkWBPm0ywZ2Vb8GIS5DLrjelEkBnKCyy3B0yQud9dpVsiUeE7F5sY8Me96WVxQcb\r\n" + "OyYdEY/j/9UpDlOG+vA+YgOvBhkKEjiqygVpP8EZoMMijephzg43b5Qi9r5UrvYo\r\n" + "o19oR/8pf4HJNDPF0/FJwFVMW8PmCBLGstin3NE1+NeWTkGt0TzpHjgKyfaDP2tO\r\n" + "4bCk1G7pP2kDFT7SYfc8xbgCkFQ2UCEXsaH/f5YmpLn4YPiNFCeeIida7xnfTvc4\r\n" + "7IxyVccHHq1FzGygOqemrxEETKh8hvDR6eBdrBwmCHVgZrnAqnn93JtGyPLi6+cj\r\n" + "WGVGtMZHwzVvX1HvSFG771sskcEjJxiQNQDQRWHEh3NxvNb7kFlAXnVdRkkvhjpR\r\n" + "GchFhTAzqmwltdWhWDEyCMKC2x/mSZvZtlZGY+g37Y72qHzidwtyW7rBetZJAgMB\r\n" + "AAGjggGtMIIBqTAdBgNVHQ4EFgQUDyBd16FXlduSzyvQx8J3BM5ygHYwHwYDVR0j\r\n" + "BBgwFoAUTiJUIBiV5uNu5g/6+rkS7QYXjzkwDgYDVR0PAQH/BAQDAgGGMB0GA1Ud\r\n" + "JQQWMBQGCCsGAQUFBwMBBggrBgEFBQcDAjASBgNVHRMBAf8ECDAGAQH/AgEAMHYG\r\n" + "CCsGAQUFBwEBBGowaDAkBggrBgEFBQcwAYYYaHR0cDovL29jc3AuZGlnaWNlcnQu\r\n" + "Y29tMEAGCCsGAQUFBzAChjRodHRwOi8vY2FjZXJ0cy5kaWdpY2VydC5jb20vRGln\r\n" + "aUNlcnRHbG9iYWxSb290RzIuY3J0MHsGA1UdHwR0MHIwN6A1oDOGMWh0dHA6Ly9j\r\n" + "cmwzLmRpZ2ljZXJ0LmNvbS9EaWdpQ2VydEdsb2JhbFJvb3RHMi5jcmwwN6A1oDOG\r\n" + 
"MWh0dHA6Ly9jcmw0LmRpZ2ljZXJ0LmNvbS9EaWdpQ2VydEdsb2JhbFJvb3RHMi5j\r\n" + "cmwwHQYDVR0gBBYwFDAIBgZngQwBAgEwCAYGZ4EMAQICMBAGCSsGAQQBgjcVAQQD\r\n" + "AgEAMA0GCSqGSIb3DQEBDAUAA4IBAQAlFvNh7QgXVLAZSsNR2XRmIn9iS8OHFCBA\r\n" + "WxKJoi8YYQafpMTkMqeuzoL3HWb1pYEipsDkhiMnrpfeYZEA7Lz7yqEEtfgHcEBs\r\n" + "K9KcStQGGZRfmWU07hPXHnFz+5gTXqzCE2PBMlRgVUYJiA25mJPXfB00gDvGhtYa\r\n" + "+mENwM9Bq1B9YYLyLjRtUz8cyGsdyTIG/bBM/Q9jcV8JGqMU/UjAdh1pFyTnnHEl\r\n" + "Y59Npi7F87ZqYYJEHJM2LGD+le8VsHjgeWX2CJQko7klXvcizuZvUEDTjHaQcs2J\r\n" + "+kPgfyMIOY1DMJ21NxOJ2xPRC/wAh/hzSBRVtoAnyuxtkZ4VjIOh\r\n" + "-----END CERTIFICATE-----\r\n"; + ``` + +1. Add a constant to this file for the speech URL without the location. This will be combined with the location and language later to get the full URL. + + ```cpp + const char *SPEECH_URL = "https://%s.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language=%s"; + ``` + +1. In the `speech_to_text.h` header file, in the `private` section of the `SpeechToText` class, define a field for a WiFi Client using the speech certificate: + + ```cpp + WiFiClientSecure _speech_client; + ``` + +1. In the `init` method, set the certificate on this WiFi Client: + + ```cpp + _speech_client.setCACert(SPEECH_CERTIFICATE); + ``` + +1. Add the following code to the `public` section of the `SpeechToText` class to define a method to convert speech to text: + + ```cpp + String convertSpeechToText() + { + + } + ``` + +1. Add the following code to this method to create an HTTP client using the WiFi client configured with the speech certificate, and using the speech URL set with the location and language: + + ```cpp + char url[128]; + sprintf(url, SPEECH_URL, SPEECH_LOCATION, LANGUAGE); + + HTTPClient httpClient; + httpClient.begin(_speech_client, url); + ``` + +1. Some headers need to be set on the connection: + + ```cpp + httpClient.addHeader("Authorization", String("Bearer ") + _access_token); + httpClient.addHeader("Content-Type", String("audio/wav; codecs=audio/pcm; samplerate=") + String(RATE)); + httpClient.addHeader("Accept", "application/json;text/xml"); + ``` + + This sets headers for the authorization using the access token, the audio format using the sample rate, and sets that the client expects the result as JSON. + +1. After this, add the following code to make the REST API call: + + ```cpp + Serial.println("Sending speech..."); + + FlashStream stream; + int httpResponseCode = httpClient.sendRequest("POST", &stream, BUFFER_SIZE); + + Serial.println("Speech sent!"); + ``` + + This creates a `FlashStream` and uses it to stream data to the REST API. + +1. Below this, add the following code: + + ```cpp + String text = ""; + + if (httpResponseCode == 200) + { + String result = httpClient.getString(); + Serial.println(result); + + DynamicJsonDocument doc(1024); + deserializeJson(doc, result.c_str()); + + JsonObject obj = doc.as(); + text = obj["DisplayText"].as(); + } + else if (httpResponseCode == 401) + { + Serial.println("Access token expired, trying again with a new token"); + _access_token = getAccessToken(); + return convertSpeechToText(); + } + else + { + Serial.print("Failed to convert text to speech - error "); + Serial.println(httpResponseCode); + } + ``` + + This code checks the response code. + + If it is 200, the code for success, then the result is retrieved, decoded from JSON, and the `DisplayText` property is set into the `text` variable. This is the property that the text version of the speech is returned in. 
+ + If the response code is 401, then the access token has expired (these tokens only last 10 minutes). A new access token is requested, and the call is made again. + + Otherwise, an error is sent to the serial monitor, and the `text` is left blank. + +1. Add the following code to the end of this method to close the HTTP client and return the text: + + ```cpp + httpClient.end(); + + return text; + ``` + +1. In `main.cpp` call this new `convertSpeechToText` method in the `processAudio` function, then log out the speech to the serial monitor: + + ```cpp + String text = speechToText.convertSpeechToText(); + Serial.println(text); + ``` + +1. Build this code, upload it to your Wio Terminal and test it out through the serial monitor. Press the C button (the one on the left-hand side, closest to the power switch), and speak. 4 seconds of audio will be captured, then converted to text. + + ```output + --- Available filters and text transformations: colorize, debug, default, direct, hexlify, log2file, nocontrol, printable, send_on_enter, time + --- More details at http://bit.ly/pio-monitor-filters + --- Miniterm on /dev/cu.usbmodem1101 9600,8,N,1 --- + --- Quit: Ctrl+C | Menu: Ctrl+T | Help: Ctrl+T followed by Ctrl+H --- + Connecting to WiFi.. + Connected! + Got access token. + Ready. + Starting recording... + Finished recording + Sending speech... + Speech sent! + {"RecognitionStatus":"Success","DisplayText":"Set a 2 minute and 27 second timer.","Offset":4700000,"Duration":35300000} + Set a 2 minute and 27 second timer. + ``` + +> 💁 You can find this code in the [code-speech-to-text/wio-terminal](code-speech-to-text/wio-terminal) folder. + +😀 Your speech to text program was a success! diff --git a/6-consumer/lessons/2-language-understanding/README.md b/6-consumer/lessons/2-language-understanding/README.md index 29f99ae..a217a0c 100644 --- a/6-consumer/lessons/2-language-understanding/README.md +++ b/6-consumer/lessons/2-language-understanding/README.md @@ -256,22 +256,38 @@ To use this model from code, you need to publish it. When publishing from LUIS, ## Use the language understanding model -Once published, the LUIS model can be called from code. In the last lesson you sent the recognized speech to an IoT Hub, and you can use serverless code to respond to this and understand what was sent. +Once published, the LUIS model can be called from code. In previous lessons, you have used an IoT Hub to handle communication with cloud services, sending telemetry and listening for commands. This is very asynchronous - once telemetry is sent your code doesn't wait for a response, and if the cloud service is down, you wouldn't know. + +For a smart timer, we want a response straight away, so we can tell the user that a timer is set, or alert them that the cloud services are unavailable. To do this, our IoT device will call a web endpoint directly, instead of relying on an IoT Hub. + +Rather than calling LUIS from the IoT device, you can use serverless code with a different type of trigger - an HTTP trigger. This allows your function app to listen for REST requests, and respond to them. This function will be a REST endpoint your device can call. + +> 💁 Although you can call LUIS directly from your IoT device, it's better to use something like serverless code. 
This way when you want to change the LUIS app that you call, for example when you train a better model or train a model in a different language, you only have to update your cloud code, not re-deploy code to potentially thousands or millions of IoT devices.
 
 ### Task - create a serverless functions app
 
-1. Create an Azure Functions app called `smart-timer-trigger`.
+1. Create an Azure Functions app called `smart-timer-trigger`, and open this in VS Code
+
+1. Add an HTTP trigger to this app called `text-to-timer` using the following command from inside the VS Code terminal:
+
+    ```sh
+    func new --name text-to-timer --template "HTTP trigger"
+    ```
-1. Add an IoT Hub event trigger to this app called `speech-trigger`.
+
+    This will create an HTTP trigger called `text-to-timer`.
-1. Set the Event Hub compatible endpoint connection string for your IoT Hub in the `local.settings.json` file, and use the key for that entry in the `function.json` file.
+
+1. Test the HTTP trigger by running the functions app. When it runs you will see the endpoint listed in the output:
-1. Use the Azurite app as a local storage emulator.
+
+    ```output
+    Functions:
+
+            text-to-timer: [GET,POST] http://localhost:7071/api/text-to-timer
+    ```
-1. Run your functions app and your IoT device to ensure speech is arriving at the IoT Hub.
+
+    Test this by loading the [http://localhost:7071/api/text-to-timer](http://localhost:7071/api/text-to-timer) URL in your browser.
 
     ```output
-    Python EventHub trigger processed an event: {"speech": "Set a 3 minute timer."}
+    This HTTP triggered function executed successfully. Pass a name in the query string or in the request body for a personalized response.
     ```
 
 ### Task - use the language understanding model
 
@@ -288,6 +304,12 @@ Once published, the LUIS model can be called from code. In the last lesson you s
     pip install -r requirements.txt
     ```
 
+    > 💁 If you get errors, you may need to upgrade pip with the following command:
+    >
+    > ```sh
+    > pip install --upgrade pip
+    > ```
+
 1. Add new entries to the `local.settings.json` file for your LUIS API Key, Endpoint URL, and App ID from the **MANAGE** tab of the LUIS portal:
 
     ```JSON
@@ -313,7 +335,7 @@ Once published, the LUIS model can be called from code. In the last lesson you s
     This imports some system libraries, as well as the libraries to interact with LUIS.
 
-1. In the `main` method, before it loops through all the events, add the following code:
+1. Delete the contents of the `main` method, and add the following code:
 
     ```python
     luis_key = os.environ['LUIS_KEY']
@@ -326,14 +348,18 @@ Once published, the LUIS model can be called from code. In the last lesson you s
     This loads the values you added to the `local.settings.json` file for your LUIS app, creates a credentials object with your API key, then creates a LUIS client object to interact with your LUIS app.
 
-1. Predictions are requested from LUIS by sending a prediction request - a JSON document containing the text to predict. Create this with the following code inside the `for event in events` loop:
+1. This HTTP trigger will be called passing the text to understand as an HTTP parameter. These are key/value pairs sent as part of the URL. For this app, the key will be `text` and the value will be the text to understand. The following code extracts the value from the HTTP request, and logs it to the console.
Add this code to the `main` function: ```python - event_body = json.loads(event.get_body().decode('utf-8')) - prediction_request = { 'query' : event_body['speech'] } + text = req.params.get('text') + logging.info(f'Request - {text}') ``` - This code extracts the speech that was sent to the IoT Hub and uses it to build the prediction request. +1. Predictions are requested from LUIS by sending a prediction request - a JSON document containing the text to predict. Create this with the following code: + + ```python + prediction_request = { 'query' : text } + ``` 1. This request can then be sent to LUIS, using the staging slot that your app was published to: @@ -373,7 +399,7 @@ Once published, the LUIS model can be called from code. In the last lesson you s * *"Set a 30 second timer"* - this will have one number, `30`, and one time unit, `second` so the single number will match the single time unit. * *"Set a 2 minute and 30 second timer"* - this will have two numbers, `2` and `30`, and two time units, `minute` and `second` so the first number will be for the first time unit (2 minutes), and the second number for the second time unit (30 seconds). - The following code gets the count of items in the number entities, and uses that to extract the first item from each array, then the second and so on: + The following code gets the count of items in the number entities, and uses that to extract the first item from each array, then the second and so on. Add this inside the `if` block. ```python for i in range(0, len(numbers)): @@ -397,24 +423,69 @@ Once published, the LUIS model can be called from code. In the last lesson you s total_seconds += number ``` -1. Finally, outside this loop through the entities, log the total time for the timer: +1. Outside this loop through the entities, log the total time for the timer: ```python logging.info(f'Timer required for {total_seconds} seconds') ``` -1. Run the function app and speak into your IoT device. You will see the total time for the timer in the function app output: +1. The number of seconds needs to be returned from the function as an HTTP response. At the end of the `if` block, add the following: + + ```python + payload = { + 'seconds': total_seconds + } + return func.HttpResponse(json.dumps(payload), status_code=200) + ``` + + This code creates a payload containing the total number of seconds for the timer, converts it to a JSON string and returns it as an HTTP result with a status code of 200, which means the call was successful. + +1. Finally, outside the `if` block, handle if the intent was not recognized by returning an error code: + + ```python + return func.HttpResponse(status_code=404) + ``` + + 404 is the status code for *not found*. + +1. Run the function app and test it out by passing text to the URL. URLs cannot contain spaces, so you will need to encode spaces in a way that URLs can use. The encoding for a space is `%20`, so replace all the spaces in the text with `%20`. 
For example, to test "Set a 2 minutes 27 second timer", use the following URL:
+
+    [http://localhost:7071/api/text-to-timer?text=Set%20a%202%20minutes%2027%20second%20timer](http://localhost:7071/api/text-to-timer?text=Set%20a%202%20minutes%2027%20second%20timer)
 
     ```output
-    [2021-06-16T01:38:33.316Z] Executing 'Functions.speech-trigger' (Reason='(null)', Id=39720c37-b9f1-47a9-b213-3650b4d0b034)
-    [2021-06-16T01:38:33.329Z] Trigger Details: PartionId: 0, Offset: 3144-3144, EnqueueTimeUtc: 2021-06-16T01:38:32.7970000Z-2021-06-16T01:38:32.7970000Z, SequenceNumber: 8-8, Count: 1
-    [2021-06-16T01:38:33.605Z] Python EventHub trigger processed an event: {"speech": "Set a four minute 17 second timer."}
-    [2021-06-16T01:38:35.076Z] Timer required for 257 seconds
-    [2021-06-16T01:38:35.128Z] Executed 'Functions.speech-trigger' (Succeeded, Id=39720c37-b9f1-47a9-b213-3650b4d0b034, Duration=1894ms)
+    Functions:
+
+            text-to-timer: [GET,POST] http://localhost:7071/api/text-to-timer
+
+    For detailed output, run func with --verbose flag.
+    [2021-06-26T19:45:14.502Z] Worker process started and initialized.
+    [2021-06-26T19:45:19.338Z] Host lock lease acquired by instance ID '000000000000000000000000951CAE4E'.
+    [2021-06-26T19:45:52.059Z] Executing 'Functions.text-to-timer' (Reason='This function was programmatically called via the host APIs.', Id=f68bfb90-30e4-47a5-99da-126b66218e81)
+    [2021-06-26T19:45:53.577Z] Timer required for 147 seconds
+    [2021-06-26T19:45:53.746Z] Executed 'Functions.text-to-timer' (Succeeded, Id=f68bfb90-30e4-47a5-99da-126b66218e81, Duration=1750ms)
     ```
 
 > 💁 You can find this code in the [code/functions](code/functions) folder.
 
+### Task - make your function available to your IoT device
+
+1. For your IoT device to call your REST endpoint, it will need to know the URL. When you accessed it earlier, you used `localhost`, which is a shortcut to access REST endpoints on your local machine. To allow your IoT device to get access, you need to either:
+
+    * Publish the Functions app - follow the instructions in earlier lessons to publish your functions app to the cloud. Once published, the URL will be `http://<functions app name>.azurewebsites.net/api/text-to-timer`, where `<functions app name>` will be the name of your functions app.
+    * Run the functions app locally, and access it using the IP address - you can get the IP address of your computer on your local network, and use that to build the URL.
+
+    Find your IP address:
+
+    * On Windows 10, follow the [Find your IP address guide](https://support.microsoft.com/windows/find-your-ip-address-f21a9bbc-c582-55cd-35e0-73431160a1b9?WT.mc_id=academic-17441-jabenn)
+    * On macOS, follow the [How to find your IP address on a Mac guide](https://www.hellotech.com/guide/for/how-to-find-ip-address-on-mac)
+    * On Linux, follow the section on finding your private IP address in the [How to find your IP address in Linux guide](https://opensource.com/article/18/5/how-find-ip-address-linux)
+
+    Once you have your IP address, you will be able to access the function at `http://<IP address>:7071/api/text-to-timer`, where `<IP address>` will be your IP address, for example `http://192.168.1.10:7071/api/text-to-timer`.
+
+    > 💁 This will only work if your IoT device is on the same network as your computer.
+
+1. Test the endpoint by accessing it using your browser.
+
 ---
 
 ## 🚀 Challenge
 
@@ -429,6 +500,7 @@ There are many ways to request the same thing, such as setting a timer.
Think of * Read more about LUIS and it's capabilities on the [Language Understanding (LUIS) documentation page on Microsoft docs](https://docs.microsoft.com/azure/cognitive-services/luis/?WT.mc_id=academic-17441-jabenn) * Read more about language understanding on the [Natural-language understanding page on Wikipedia](https://wikipedia.org/wiki/Natural-language_understanding) +* Read more on HTTP triggers in the [Azure Functions HTTP trigger documentation on Microsoft docs](https://docs.microsoft.com/azure/azure-functions/functions-bindings-http-webhook-trigger?tabs=python&WT.mc_id=academic-17441-jabenn) ## Assignment diff --git a/6-consumer/lessons/2-language-understanding/assignment.md b/6-consumer/lessons/2-language-understanding/assignment.md index f3c5e30..acfdd3a 100644 --- a/6-consumer/lessons/2-language-understanding/assignment.md +++ b/6-consumer/lessons/2-language-understanding/assignment.md @@ -4,7 +4,7 @@ So far in this lesson you have trained a model to understand setting a timer. Another useful feature is cancelling a timer - maybe your bread is ready and can be taken out of the oven before the timer is elapsed. -Add a new intent to your LUIS app to cancel the timer. It won't need any entities, but will need some example sentences. Handle this in your serverless code if it is the top intent, logging that the intent was recognized. +Add a new intent to your LUIS app to cancel the timer. It won't need any entities, but will need some example sentences. Handle this in your serverless code if it is the top intent, logging that the intent was recognized and returning an appropriate response. ## Rubric diff --git a/6-consumer/lessons/2-language-understanding/code/functions/smart-timer-trigger/local.settings.json b/6-consumer/lessons/2-language-understanding/code/functions/smart-timer-trigger/local.settings.json index abde93a..ee6b34c 100644 --- a/6-consumer/lessons/2-language-understanding/code/functions/smart-timer-trigger/local.settings.json +++ b/6-consumer/lessons/2-language-understanding/code/functions/smart-timer-trigger/local.settings.json @@ -2,8 +2,7 @@ "IsEncrypted": false, "Values": { "FUNCTIONS_WORKER_RUNTIME": "python", - "AzureWebJobsStorage": "UseDevelopmentStorage=true", - "IOT_HUB_CONNECTION_STRING": "", + "AzureWebJobsStorage": "", "LUIS_KEY": "", "LUIS_ENDPOINT_URL": "", "LUIS_APP_ID": "" diff --git a/6-consumer/lessons/2-language-understanding/code/functions/smart-timer-trigger/speech-trigger/__init__.py b/6-consumer/lessons/2-language-understanding/code/functions/smart-timer-trigger/speech-trigger/__init__.py deleted file mode 100644 index 1b9f3ac..0000000 --- a/6-consumer/lessons/2-language-understanding/code/functions/smart-timer-trigger/speech-trigger/__init__.py +++ /dev/null @@ -1,43 +0,0 @@ -from typing import List -import logging - -import azure.functions as func - -import json -import os -from azure.cognitiveservices.language.luis.runtime import LUISRuntimeClient -from msrest.authentication import CognitiveServicesCredentials - -def main(events: List[func.EventHubEvent]): - luis_key = os.environ['LUIS_KEY'] - endpoint_url = os.environ['LUIS_ENDPOINT_URL'] - app_id = os.environ['LUIS_APP_ID'] - - credentials = CognitiveServicesCredentials(luis_key) - client = LUISRuntimeClient(endpoint=endpoint_url, credentials=credentials) - - for event in events: - logging.info('Python EventHub trigger processed an event: %s', - event.get_body().decode('utf-8')) - - event_body = json.loads(event.get_body().decode('utf-8')) - prediction_request = { 'query' : 
event_body['speech'] } - - prediction_response = client.prediction.get_slot_prediction(app_id, 'Staging', prediction_request) - - if prediction_response.prediction.top_intent == 'set timer': - numbers = prediction_response.prediction.entities['number'] - time_units = prediction_response.prediction.entities['time unit'] - total_seconds = 0 - - for i in range(0, len(numbers)): - number = numbers[i] - time_unit = time_units[i][0] - - if time_unit == 'minute': - total_seconds += number * 60 - else: - total_seconds += number - - logging.info(f'Timer required for {total_seconds} seconds') - diff --git a/6-consumer/lessons/2-language-understanding/code/functions/smart-timer-trigger/speech-trigger/function.json b/6-consumer/lessons/2-language-understanding/code/functions/smart-timer-trigger/speech-trigger/function.json deleted file mode 100644 index 0117bdf..0000000 --- a/6-consumer/lessons/2-language-understanding/code/functions/smart-timer-trigger/speech-trigger/function.json +++ /dev/null @@ -1,15 +0,0 @@ -{ - "scriptFile": "__init__.py", - "bindings": [ - { - "type": "eventHubTrigger", - "name": "events", - "direction": "in", - "eventHubName": "samples-workitems", - "connection": "IOT_HUB_CONNECTION_STRING", - "cardinality": "many", - "consumerGroup": "$Default", - "dataType": "binary" - } - ] -} \ No newline at end of file diff --git a/6-consumer/lessons/2-language-understanding/code/functions/smart-timer-trigger/text-to-timer/__init__.py b/6-consumer/lessons/2-language-understanding/code/functions/smart-timer-trigger/text-to-timer/__init__.py new file mode 100644 index 0000000..84d0df4 --- /dev/null +++ b/6-consumer/lessons/2-language-understanding/code/functions/smart-timer-trigger/text-to-timer/__init__.py @@ -0,0 +1,44 @@ +import logging + +import azure.functions as func +import json +import os +from azure.cognitiveservices.language.luis.runtime import LUISRuntimeClient +from msrest.authentication import CognitiveServicesCredentials + + +def main(req: func.HttpRequest) -> func.HttpResponse: + luis_key = os.environ['LUIS_KEY'] + endpoint_url = os.environ['LUIS_ENDPOINT_URL'] + app_id = os.environ['LUIS_APP_ID'] + + credentials = CognitiveServicesCredentials(luis_key) + client = LUISRuntimeClient(endpoint=endpoint_url, credentials=credentials) + + text = req.params.get('text') + prediction_request = { 'query' : text } + + prediction_response = client.prediction.get_slot_prediction(app_id, 'Staging', prediction_request) + + if prediction_response.prediction.top_intent == 'set timer': + numbers = prediction_response.prediction.entities['number'] + time_units = prediction_response.prediction.entities['time unit'] + total_seconds = 0 + + for i in range(0, len(numbers)): + number = numbers[i] + time_unit = time_units[i][0] + + if time_unit == 'minute': + total_seconds += number * 60 + else: + total_seconds += number + + logging.info(f'Timer required for {total_seconds} seconds') + + payload = { + 'seconds': total_seconds + } + return func.HttpResponse(json.dumps(payload), status_code=200) + + return func.HttpResponse(status_code=404) \ No newline at end of file diff --git a/6-consumer/lessons/2-language-understanding/code/functions/smart-timer-trigger/text-to-timer/function.json b/6-consumer/lessons/2-language-understanding/code/functions/smart-timer-trigger/text-to-timer/function.json new file mode 100644 index 0000000..d901965 --- /dev/null +++ b/6-consumer/lessons/2-language-understanding/code/functions/smart-timer-trigger/text-to-timer/function.json @@ -0,0 +1,20 @@ +{ + "scriptFile": 
"__init__.py", + "bindings": [ + { + "authLevel": "function", + "type": "httpTrigger", + "direction": "in", + "name": "req", + "methods": [ + "get", + "post" + ] + }, + { + "type": "http", + "direction": "out", + "name": "$return" + } + ] +} \ No newline at end of file diff --git a/6-consumer/lessons/3-spoken-feedback/README.md b/6-consumer/lessons/3-spoken-feedback/README.md index 9fecc41..ef17d4c 100644 --- a/6-consumer/lessons/3-spoken-feedback/README.md +++ b/6-consumer/lessons/3-spoken-feedback/README.md @@ -72,34 +72,11 @@ These large ML models are being trained to combine all three steps into end-to-e ## Set the timer -The timer can be set by sending a command from the serverless code, instructing the IoT device to set the timer. This command will contain the time in seconds till the timer needs to go off. +To set the timer, your IoT device needs to call the REST endpoint you created using serverless code, then use the resulting number of seconds to set a timer. -### Task - set the timer using a command +### Task - call the serverless function to get the timer time -1. In your serverless code, add code to send a direct method request to your IoT device - - > ⚠️ You can refer to [the instructions for sending direct method requests in lesson 5 of the farm project if needed](../../../2-farm/lessons/5-migrate-application-to-the-cloud/README.md#send-direct-method-requests-from-serverless-code). - - You will need to set up the connection string for the IoT Hub with the service policy (*NOT* the device) in your `local.settings.json` file and add the `azure-iot-hub` pip package to your `requirements.txt` file. The device ID can be extracted from the event. - -1. The direct method you send needs to be called `set-timer`, and will need to send the length of the timer as a JSON property called `seconds`. Use the following code to build the `CloudToDeviceMethod` using the `total_seconds` calculated from the data extracted by LUIS: - - ```python - payload = { - 'seconds': total_seconds - } - direct_method = CloudToDeviceMethod(method_name='set-timer', payload=json.dumps(payload)) - ``` - -> 💁 You can find this code in the [code-command/functions](code-command/functions) folder. - -### Task - respond to the command on the IoT device - -1. On your IoT device, respond to the command. - - > ⚠️ You can refer to [the instructions for handling direct method requests from IoT devices in lesson 4 of the farm project if needed](../../../2-farm/lessons/4-migrate-your-plant-to-the-cloud#task---connect-your-iot-device-to-the-cloud). - -1. 
Work through the relevant guide to set a timer for the required time: +Follow the relevant guide to call the REST endpoint from your IoT device and set a timer for the required time: * [Arduino - Wio Terminal](wio-terminal-set-timer.md) * [Single-board computer - Raspberry Pi/Virtual IoT device](single-board-computer-set-timer.md) diff --git a/6-consumer/lessons/3-spoken-feedback/code-command/functions/smart-timer-trigger/host.json b/6-consumer/lessons/3-spoken-feedback/code-command/functions/smart-timer-trigger/host.json deleted file mode 100644 index 291065f..0000000 --- a/6-consumer/lessons/3-spoken-feedback/code-command/functions/smart-timer-trigger/host.json +++ /dev/null @@ -1,15 +0,0 @@ -{ - "version": "2.0", - "logging": { - "applicationInsights": { - "samplingSettings": { - "isEnabled": true, - "excludedTypes": "Request" - } - } - }, - "extensionBundle": { - "id": "Microsoft.Azure.Functions.ExtensionBundle", - "version": "[2.*, 3.0.0)" - } -} \ No newline at end of file diff --git a/6-consumer/lessons/3-spoken-feedback/code-command/functions/smart-timer-trigger/local.settings.json b/6-consumer/lessons/3-spoken-feedback/code-command/functions/smart-timer-trigger/local.settings.json deleted file mode 100644 index 8b5b956..0000000 --- a/6-consumer/lessons/3-spoken-feedback/code-command/functions/smart-timer-trigger/local.settings.json +++ /dev/null @@ -1,12 +0,0 @@ -{ - "IsEncrypted": false, - "Values": { - "FUNCTIONS_WORKER_RUNTIME": "python", - "AzureWebJobsStorage": "UseDevelopmentStorage=true", - "IOT_HUB_CONNECTION_STRING": "", - "LUIS_KEY": "", - "LUIS_ENDPOINT_URL": "", - "LUIS_APP_ID": "", - "REGISTRY_MANAGER_CONNECTION_STRING": "" - } -} \ No newline at end of file diff --git a/6-consumer/lessons/3-spoken-feedback/code-command/functions/smart-timer-trigger/requirements.txt b/6-consumer/lessons/3-spoken-feedback/code-command/functions/smart-timer-trigger/requirements.txt deleted file mode 100644 index d0405a3..0000000 --- a/6-consumer/lessons/3-spoken-feedback/code-command/functions/smart-timer-trigger/requirements.txt +++ /dev/null @@ -1,4 +0,0 @@ -# Do not include azure-functions-worker as it may conflict with the Azure Functions platform - -azure-functions -azure-cognitiveservices-language-luis \ No newline at end of file diff --git a/6-consumer/lessons/3-spoken-feedback/code-command/functions/smart-timer-trigger/speech-trigger/__init__.py b/6-consumer/lessons/3-spoken-feedback/code-command/functions/smart-timer-trigger/speech-trigger/__init__.py deleted file mode 100644 index be8e5ee..0000000 --- a/6-consumer/lessons/3-spoken-feedback/code-command/functions/smart-timer-trigger/speech-trigger/__init__.py +++ /dev/null @@ -1,60 +0,0 @@ -from typing import List -import logging - -import azure.functions as func - -import json -import os -from azure.cognitiveservices.language.luis.runtime import LUISRuntimeClient -from msrest.authentication import CognitiveServicesCredentials - -from azure.iot.hub import IoTHubRegistryManager -from azure.iot.hub.models import CloudToDeviceMethod - -def main(events: List[func.EventHubEvent]): - luis_key = os.environ['LUIS_KEY'] - endpoint_url = os.environ['LUIS_ENDPOINT_URL'] - app_id = os.environ['LUIS_APP_ID'] - registry_manager_connection_string = os.environ['REGISTRY_MANAGER_CONNECTION_STRING'] - - credentials = CognitiveServicesCredentials(luis_key) - client = LUISRuntimeClient(endpoint=endpoint_url, credentials=credentials) - - for event in events: - logging.info('Python EventHub trigger processed an event: %s', - 
event.get_body().decode('utf-8')) - - device_id = event.iothub_metadata['connection-device-id'] - - event_body = json.loads(event.get_body().decode('utf-8')) - prediction_request = { 'query' : event_body['speech'] } - - prediction_response = client.prediction.get_slot_prediction(app_id, 'Staging', prediction_request) - - if prediction_response.prediction.top_intent == 'set timer': - numbers = prediction_response.prediction.entities['number'] - time_units = prediction_response.prediction.entities['time unit'] - total_seconds = 0 - - for i in range(0, len(numbers)): - number = numbers[i] - time_unit = time_units[i][0] - - if time_unit == 'minute': - total_seconds += number * 60 - else: - total_seconds += number - - logging.info(f'Timer required for {total_seconds} seconds') - - payload = { - 'seconds': total_seconds - } - direct_method = CloudToDeviceMethod(method_name='set-timer', payload=json.dumps(payload)) - - registry_manager_connection_string = os.environ['REGISTRY_MANAGER_CONNECTION_STRING'] - registry_manager = IoTHubRegistryManager(registry_manager_connection_string) - - registry_manager.invoke_device_method(device_id, direct_method) - - diff --git a/6-consumer/lessons/3-spoken-feedback/code-command/functions/smart-timer-trigger/speech-trigger/function.json b/6-consumer/lessons/3-spoken-feedback/code-command/functions/smart-timer-trigger/speech-trigger/function.json deleted file mode 100644 index 0117bdf..0000000 --- a/6-consumer/lessons/3-spoken-feedback/code-command/functions/smart-timer-trigger/speech-trigger/function.json +++ /dev/null @@ -1,15 +0,0 @@ -{ - "scriptFile": "__init__.py", - "bindings": [ - { - "type": "eventHubTrigger", - "name": "events", - "direction": "in", - "eventHubName": "samples-workitems", - "connection": "IOT_HUB_CONNECTION_STRING", - "cardinality": "many", - "consumerGroup": "$Default", - "dataType": "binary" - } - ] -} \ No newline at end of file diff --git a/6-consumer/lessons/3-spoken-feedback/code-timer/pi/smart-timer/app.py b/6-consumer/lessons/3-spoken-feedback/code-timer/pi/smart-timer/app.py index afef5b7..4744056 100644 --- a/6-consumer/lessons/3-spoken-feedback/code-timer/pi/smart-timer/app.py +++ b/6-consumer/lessons/3-spoken-feedback/code-timer/pi/smart-timer/app.py @@ -1,12 +1,9 @@ import io -import json import pyaudio import requests +import threading import time import wave -import threading - -from azure.iot.device import IoTHubDeviceClient, Message, MethodResponse from grove.factory import Factory button = Factory.getButton('GPIO-HIGH', 5) @@ -45,13 +42,6 @@ def capture_audio(): speech_api_key = '' location = '' language = '' -connection_string = '' - -device_client = IoTHubDeviceClient.create_from_connection_string(connection_string) - -print('Connecting') -device_client.connect() -print('Connected') def get_access_token(): headers = { @@ -76,13 +66,28 @@ def convert_speech_to_text(buffer): } response = requests.post(url, headers=headers, params=params, data=buffer) - response_json = json.loads(response.text) + response_json = response.json() if response_json['RecognitionStatus'] == 'Success': return response_json['DisplayText'] else: return '' +def get_timer_time(text): + url = '' + + params = { + 'text': text + } + + response = requests.post(url, params=params) + + if response.status_code != 200: + return 0 + + payload = response.json() + return payload['seconds'] + def say(text): print(text) @@ -98,6 +103,7 @@ def announce_timer(minutes, seconds): def create_timer(total_seconds): minutes, seconds = divmod(total_seconds, 60) 
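# divmod gives whole minutes and leftover seconds (for example divmod(147, 60) is (2, 27)), and the Timer below then calls announce_timer with those values on a background thread once total_seconds have passed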
threading.Timer(total_seconds, announce_timer, args=[minutes, seconds]).start() + announcement = '' if minutes > 0: announcement += f'{minutes} minute ' @@ -106,17 +112,12 @@ def create_timer(total_seconds): announcement += 'timer started.' say(announcement) -def handle_method_request(request): - if request.name == 'set-timer': - payload = json.loads(request.payload) - seconds = payload['seconds'] - if seconds > 0: - create_timer(payload['seconds']) - - method_response = MethodResponse.create_from_method_request(request, 200) - device_client.send_method_response(method_response) - -device_client.on_method_request_received = handle_method_request +def process_text(text): + print(text) + + seconds = get_timer_time(text) + if seconds > 0: + create_timer(seconds) while True: while not button.is_pressed(): @@ -124,7 +125,4 @@ while True: buffer = capture_audio() text = convert_speech_to_text(buffer) - if len(text) > 0: - print(text) - message = Message(json.dumps({ 'speech': text })) - device_client.send_message(message) \ No newline at end of file + process_text(text) \ No newline at end of file diff --git a/6-consumer/lessons/3-spoken-feedback/code-timer/virtual-iot-device/smart-timer/app.py b/6-consumer/lessons/3-spoken-feedback/code-timer/virtual-iot-device/smart-timer/app.py index 8d45eaf..0b20fd8 100644 --- a/6-consumer/lessons/3-spoken-feedback/code-timer/virtual-iot-device/smart-timer/app.py +++ b/6-consumer/lessons/3-spoken-feedback/code-timer/virtual-iot-device/smart-timer/app.py @@ -1,19 +1,11 @@ -import json +import requests import threading import time from azure.cognitiveservices.speech import SpeechConfig, SpeechRecognizer -from azure.iot.device import IoTHubDeviceClient, Message, MethodResponse speech_api_key = '' location = '' language = '' -connection_string = '' - -device_client = IoTHubDeviceClient.create_from_connection_string(connection_string) - -print('Connecting') -device_client.connect() -print('Connected') recognizer_config = SpeechConfig(subscription=speech_api_key, region=location, @@ -21,19 +13,25 @@ recognizer_config = SpeechConfig(subscription=speech_api_key, recognizer = SpeechRecognizer(speech_config=recognizer_config) -def recognized(args): - if len(args.result.text) > 0: - message = Message(json.dumps({ 'speech': args.result.text })) - device_client.send_message(message) +def get_timer_time(text): + url = '' -recognizer.recognized.connect(recognized) + params = { + 'text': text + } -recognizer.start_continuous_recognition() + response = requests.post(url, params=params) + + if response.status_code != 200: + return 0 + + payload = response.json() + return payload['seconds'] def say(text): print(text) -def announce_timer(minutes, seconds): +def announce_timer(minutes, seconds): announcement = 'Times up on your ' if minutes > 0: announcement += f'{minutes} minute ' @@ -45,6 +43,7 @@ def announce_timer(minutes, seconds): def create_timer(total_seconds): minutes, seconds = divmod(total_seconds, 60) threading.Timer(total_seconds, announce_timer, args=[minutes, seconds]).start() + announcement = '' if minutes > 0: announcement += f'{minutes} minute ' @@ -53,17 +52,19 @@ def create_timer(total_seconds): announcement += 'timer started.' 
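# say() just prints the announcement for now - it is replaced with spoken output later in the lesson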
say(announcement) -def handle_method_request(request): - if request.name == 'set-timer': - payload = json.loads(request.payload) - seconds = payload['seconds'] - if seconds > 0: - create_timer(payload['seconds']) +def process_text(text): + print(text) + + seconds = get_timer_time(text) + if seconds > 0: + create_timer(seconds) - method_response = MethodResponse.create_from_method_request(request, 200) - device_client.send_method_response(method_response) +def recognized(args): + process_text(args.result.text) -device_client.on_method_request_received = handle_method_request +recognizer.recognized.connect(recognized) + +recognizer.start_continuous_recognition() while True: time.sleep(1) \ No newline at end of file diff --git a/6-consumer/lessons/3-spoken-feedback/single-board-computer-set-timer.md b/6-consumer/lessons/3-spoken-feedback/single-board-computer-set-timer.md index 5f11422..72b0a6b 100644 --- a/6-consumer/lessons/3-spoken-feedback/single-board-computer-set-timer.md +++ b/6-consumer/lessons/3-spoken-feedback/single-board-computer-set-timer.md @@ -4,21 +4,59 @@ In this part of the lesson, you will set a timer on your virtual IoT device or R ## Set a timer -The command sent from the serverless function contains the time for the timer in seconds as the payload. This time can be used to set a timer. +The text that comes back from the speech to text call needs to be sent to your serverless code to be processed by LUIS, getting back the number of seconds for the timer. This number of seconds can be used to set a timer. Timers can be set using the Python `threading.Timer` class. This class takes a delay time and a function, and after the delay time, the function is executed. -### Task - set a timer +### Task - send the text to the serverless function 1. Open the `smart-timer` project in VS Code, and ensure the virtual environment is loaded in the terminal if you are using a virtual IoT device. +1. Above the `process_text` function, declare a function called `get_timer_time` to call the REST endpoint you created: + + ```python + def get_timer_time(text): + ``` + +1. Add the following code to this function to define the URL to call: + + ```python + url = '' + ``` + + Replace `` with the URL of the REST endpoint that you built in the last lesson, either on your computer or in the cloud. + +1. Add the following code to set the text as a parameter on the URL and make the API call: + + ```python + params = { + 'text': text + } + + response = requests.post(url, params=params) + ``` + +1. Below this, retrieve the `seconds` from the response payload, returning 0 if the call failed: + + ```python + if response.status_code != 200: + return 0 + + payload = response.json() + return payload['seconds'] + ``` + + Successful HTTP calls return a status code in the 200 range, and your serverless code returns 200 if the text was processed and recognized as the set timer intent. + +### Task - set a timer on a background thread + 1. Add the following import statement at the top of the file to import the threading Python library: ```python import threading ``` -1. Above the `handle_method_request` function that handles the method request, add a function to speak a response. Fow now this will just write to the console, but later in this lesson this will speak the text. +1. Above the `process_text` function, add a function to speak a response. For now this will just write to the console, but later in this lesson this will speak the text. 
```python def say(text): @@ -43,9 +81,9 @@ Timers can be set using the Python `threading.Timer` class. This class takes a d 1. Below this, add the following `create_timer` function to create a timer: ```python - def create_timer(seconds): - minutes, seconds = divmod(seconds, 60) - threading.Timer(seconds, announce_timer, args=[minutes, seconds]).start() + def create_timer(total_seconds): + minutes, seconds = divmod(total_seconds, 60) + threading.Timer(total_seconds, announce_timer, args=[minutes, seconds]).start() ``` This function takes the total number of seconds for the timer that will be sent in the command, and converts this to minutes and seconds. It then creates and starts a timer object using the total number of seconds, passing in the `announce_timer` function and a list containing the minutes and seconds. When the timer elapses, it will call the `announce_timer` function, and pass the contents of this list as the parameters - so the first item in the list gets passed as the `minutes` parameter, and the second item as the `seconds` parameter. @@ -64,32 +102,23 @@ Timers can be set using the Python `threading.Timer` class. This class takes a d -1. At the start of the `handle_method_request` function, add the following code to check that the `set-timer` direct method was requested: - - ```python - if request.name == 'set-timer': - ``` - -1. Inside this `if` statement, extract the timer time in seconds from the payload and use this to create a timer: +1. Add the following to the end of the `process_text` function to get the time for the timer from the text, then create the timer: ```python - payload = json.loads(request.payload) - seconds = payload['seconds'] + seconds = get_timer_time(text) if seconds > 0: - create_timer(payload['seconds']) + create_timer(seconds) ``` - The timer is only created if the number of seconds is greater than 0 + The timer is only created if the number of seconds is greater than 0. 1. Run the app, and ensure the function app is also running. Set some timers, and the output will show the timer being set, and then will show when it elapses: ```output pi@raspberrypi:~/smart-timer $ python3 app.py - Connecting - Connected - Set a one minute 4 second timer. - 1 minute 4 second timer started. - Times up on your 1 minute 4 second timer. + Set a two minute 27 second timer. + 2 minute 27 second timer started. + Times up on your 2 minute 27 second timer. ``` > 💁 You can find this code in the [code-timer/pi](code-timer/pi) or [code-timer/virtual-iot-device](code-timer/virtual-iot-device) folder. diff --git a/README.md b/README.md index 4b890e2..aa5e02c 100644 --- a/README.md +++ b/README.md @@ -22,7 +22,7 @@ The projects cover the journey of food from farm to table. 
This includes farming **Hearty thanks to our authors [Jen Fox](https://github.com/jenfoxbot), [Jen Looper](https://github.com/jlooper), [Jim Bennett](https://github.com/jimbobbennett), and our sketchnote artist [Nitya Narasimhan](https://github.com/nitya).** -**Thanks as well to our team of [Microsoft Learn Student Ambassadors](https://studentambassadors.microsoft.com?WT.mc_id=academic-17441-jabenn) who have been reviewing and translating this curriculum - [Aditya Garg](https://github.com/AdityaGarg00), [Aryan Jain](https://www.linkedin.com/in/aryan-jain-47a4a1145/), [Bhavesh Suneja](https://github.com/EliteWarrior315), [Lateefah Bello](https://www.linkedin.com/in/lateefah-bello/), [Manvi Jha](https://github.com/Severus-Matthew), [Mireille Tan](https://www.linkedin.com/in/mireille-tan-a4834819a/), [Mohammad Iftekher (Iftu) Ebne Jalal](https://github.com/Iftu119), [Priyanshu Srivastav](https://www.linkedin.com/in/priyanshu-srivastav-b067241ba), [Thanmai Gowducheruvu](https://github.com/innovation-platform), and [Zina Kamel](https://www.linkedin.com/in/zina-kamel/).** +**Thanks as well to our team of [Microsoft Learn Student Ambassadors](https://studentambassadors.microsoft.com?WT.mc_id=academic-17441-jabenn) who have been reviewing and translating this curriculum - [Aditya Garg](https://github.com/AdityaGarg00), [Arpita Das](https://github.com/Arpiiitaaa), [Aryan Jain](https://www.linkedin.com/in/aryan-jain-47a4a1145/), [Bhavesh Suneja](https://github.com/EliteWarrior315), [Lateefah Bello](https://www.linkedin.com/in/lateefah-bello/), [Manvi Jha](https://github.com/Severus-Matthew), [Mireille Tan](https://www.linkedin.com/in/mireille-tan-a4834819a/), [Mohammad Iftekher (Iftu) Ebne Jalal](https://github.com/Iftu119), [Priyanshu Srivastav](https://www.linkedin.com/in/priyanshu-srivastav-b067241ba), [Thanmai Gowducheruvu](https://github.com/innovation-platform), and [Zina Kamel](https://www.linkedin.com/in/zina-kamel/).** > **Teachers**, we have [included some suggestions](for-teachers.md) on how to use this curriculum. If you would like to create your own lessons, we have also included a [lesson template](lesson-template/README.md). diff --git a/images/Diagrams.sketch b/images/Diagrams.sketch index 1bf0842..44f09f1 100644 Binary files a/images/Diagrams.sketch and b/images/Diagrams.sketch differ diff --git a/images/dmac-adc-buffers.png b/images/dmac-adc-buffers.png new file mode 100644 index 0000000..3424cd6 Binary files /dev/null and b/images/dmac-adc-buffers.png differ