diff --git a/8-Reinforcement/1-QLearning/README.md b/8-Reinforcement/1-QLearning/README.md
index 86819de0..e2ce8168 100644
--- a/8-Reinforcement/1-QLearning/README.md
+++ b/8-Reinforcement/1-QLearning/README.md
@@ -25,10 +25,6 @@ You can open [the lesson notebook](notebook.ipynb) and walk through this lesson
 
 In this lesson, we will explore the world of **[Peter and the Wolf](https://en.wikipedia.org/wiki/Peter_and_the_Wolf)**, inspired by a musical fairy tale by a Russian composer, [Sergei Prokofiev](https://en.wikipedia.org/wiki/Sergei_Prokofiev). We will use **Reinforcement Learning** to let Peter explore his environment, collect tasty apples and avoid meeting the wolf.
 
-![peter and the wolf](images/peter.png)
-
-> Peter and his friends need to escape the hungry wolf! Image by [Jen Looper](https://twitter.com/jenlooper)
-
 **Reinforcement Learning** (RL) is a learning technique that allows us to learn an optimal behavior of an **agent** in some **environment** by running many experiments. An agent in this environment should have some **goal**, defined by a **reward function**.
 
 ## The environment
diff --git a/8-Reinforcement/README.md b/8-Reinforcement/README.md
index 464f6aeb..c9881eb4 100644
--- a/8-Reinforcement/README.md
+++ b/8-Reinforcement/README.md
@@ -4,9 +4,9 @@ Reinforcement learning, RL, is seen as one of the basic machine learning paradig
 
 Imagine you have a simulated environment, like the stock market for example. What happens if you impose this or that regulation does it have a positive or negative effect? The whole point is being able to change course if something negative happen, so called _negative reinforcement_ or if it's a positive outcome, to keep building on that, so called _positive reinforcement_.
 
-[![Peter and the Wolf](https://img.youtube.com/vi/Fmi5zHg4QSM/0.jpg)](https://www.youtube.com/watch?v=Fmi5zHg4QSM)
+![peter and the wolf](images/peter.png)
 
-> 🎥 Click the image above to listen to Peter and the Wolf by Prokofiev
+> Peter and his friends need to escape the hungry wolf! Image by [Jen Looper](https://twitter.com/jenlooper)
 
 ## Regional topic: Peter and the Wolf (Russia)
 
@@ -15,6 +15,10 @@ Imagine you have a simulated environment, like the stock market for example. Wha
 - **Explore** the surrounding area and build an optimal navigation map
 - **Learn** how to use a skateboard and balance on it, in order to move around faster.
 
+[![Peter and the Wolf](https://img.youtube.com/vi/Fmi5zHg4QSM/0.jpg)](https://www.youtube.com/watch?v=Fmi5zHg4QSM)
+
+> 🎥 Click the image above to listen to Peter and the Wolf by Prokofiev
+
 ## Reinforcement learning
 
 In previous sections, you have seen two examples of machine learning problems:
diff --git a/8-Reinforcement/1-QLearning/images/peter.png b/8-Reinforcement/images/peter.png
similarity index 100%
rename from 8-Reinforcement/1-QLearning/images/peter.png
rename to 8-Reinforcement/images/peter.png