From e062028016ede3c0006a9b7ba9a2f6c57ca382b7 Mon Sep 17 00:00:00 2001 From: edgargonarr <35715904+edgargonarr@users.noreply.github.com> Date: Tue, 13 Jul 2021 17:41:46 -0500 Subject: [PATCH 01/51] Update README.md --- 3-Web-App/1-Web-App/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/3-Web-App/1-Web-App/README.md b/3-Web-App/1-Web-App/README.md index 6150aece..7c91acb5 100644 --- a/3-Web-App/1-Web-App/README.md +++ b/3-Web-App/1-Web-App/README.md @@ -165,7 +165,7 @@ Now you can build a Flask app to call your model and return similar results, but web-app/ static/ css/ - templates/ + templates/ notebook.ipynb ufo-model.pkl ``` From 721ec86311781b23a4f35a0c145bfa8d323305f5 Mon Sep 17 00:00:00 2001 From: edgargonarr <35715904+edgargonarr@users.noreply.github.com> Date: Tue, 13 Jul 2021 23:25:37 -0500 Subject: [PATCH 02/51] Update README.md In line 178, the templates folder appears to be nested inside the static folder, which causes a TemplateNotFound error. I suggest removing one tab. I found this fix by looking at the solutions folder on GitHub, where the folder locations are correct. --- 3-Web-App/1-Web-App/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/3-Web-App/1-Web-App/README.md b/3-Web-App/1-Web-App/README.md index 6150aece..7c91acb5 100644 --- a/3-Web-App/1-Web-App/README.md +++ b/3-Web-App/1-Web-App/README.md @@ -165,7 +165,7 @@ Now you can build a Flask app to call your model and return similar results, but web-app/ static/ css/ - templates/ + templates/ notebook.ipynb ufo-model.pkl ``` From e975db0a74cf8ba14780cc2775a1d0d99f13e275 Mon Sep 17 00:00:00 2001 From: lty <247969917@qq.com> Date: Wed, 14 Jul 2021 18:14:43 +0800 Subject: [PATCH 03/51] Fix a spelling error in README.md --- 6-NLP/1-Introduction-to-NLP/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/6-NLP/1-Introduction-to-NLP/README.md b/6-NLP/1-Introduction-to-NLP/README.md index 0d47a1d7..227ad589 100644 --- 
a/6-NLP/1-Introduction-to-NLP/README.md +++ b/6-NLP/1-Introduction-to-NLP/README.md @@ -81,7 +81,7 @@ This gave the impression that Eliza understood the statement and was asking a fo ## Exercise - coding a basic conversational bot -A conversational bot, like Eliza, is a program that elicits user input and seems to understand and respond intelligently. Unlike Eliza, our bot will not have several rules giving it the appearance of having an intelligent conversation. Instead, out bot will have one ability only, to keep the conversation going with random responses that might work in almost any trivial conversation. +A conversational bot, like Eliza, is a program that elicits user input and seems to understand and respond intelligently. Unlike Eliza, our bot will not have several rules giving it the appearance of having an intelligent conversation. Instead, our bot will have one ability only, to keep the conversation going with random responses that might work in almost any trivial conversation. ### The plan From 5f225d0063e3bce27429a00df83ac2b973b47925 Mon Sep 17 00:00:00 2001 From: lty <247969917@qq.com> Date: Wed, 14 Jul 2021 18:19:45 +0800 Subject: [PATCH 04/51] Add 6.1 Chinese and English README.md --- .../translations/README.zh-cn.md | 225 ++++++++++++++++++ 1 file changed, 225 insertions(+) create mode 100644 6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md diff --git a/6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md b/6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md new file mode 100644 index 00000000..06ed9ca0 --- /dev/null +++ b/6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md @@ -0,0 +1,225 @@ +# Introduction to natural language processing +# 自然语言处理介绍 +This lesson covers a brief history and important concepts of *natural language processing*, a subfield of *computational linguistics*. 
+这节课讲解了*自然语言处理*简要历史和重要概念,*自然语言处理*是计算语言学的一个子领域。 +## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/31/) + +## Introduction +## 介绍 +NLP, as it is commonly known, is one of the best-known areas where machine learning has been applied and used in production software. +众所周知,自然语言处理(Natural Language Processing, NLP)是机器学习在生产软件中应用最广泛的领域之一。 + +✅ Can you think of software that you use every day that probably has some NLP embedded? What about your word processing programs or mobile apps that you use regularly? + +✅你能想到哪些你日常生活中使用的软件嵌入了某些自然语言处理技术呢?你经常使用的文字处理程序或移动应用程序是否嵌入了自然语言处理技术呢? + +You will learn about: +你将会学习到: + +- **The idea of languages**. How languages developed and what the major areas of study have been. +- **Definition and concepts**. You will also learn definitions and concepts about how computers process text, including parsing, grammar, and identifying nouns and verbs. There are some coding tasks in this lesson, and several important concepts are introduced that you will learn to code later on in the next lessons. +- **语言的思想**. 语言的发展历程及主要研究领域. +- **定义和概念**. 你还将学习到有关计算机如何处理文本的定义和概念,包括解析、语法以及名词和动词的识别。本节课程包含一些编码任务并介绍了几个重要的概念,你将在下一节课中学习编码实现这些概念。 + +## Computational linguistics +## 计算语言学 + +Computational linguistics is an area of research and development over many decades that studies how computers can work with, and even understand, translate, and communicate with languages. natural language processing (NLP) is a related field focused on how computers can process 'natural', or human, languages. +计算语言学是一个经过几十年研究和发展的领域,它研究计算机如何使用语言、理解语言、翻译语言及使用语言交流。自然语言处理(NLP)是计算语言学中一个专注于计算机如何处理“自然”或人类语言的相关领域, + +### Example - phone dictation +### 例子 - 电话号码识别 + +If you have ever dictated to your phone instead of typing or asked a virtual assistant a question, your speech was converted into a text form and then processed or *parsed* from the language you spoke. 
The detected keywords were then processed into a format that the phone or assistant could understand and act on. +如果你曾经在手机上使用语音输入替代键盘输入或者向语音助手小娜提问,那么你的语音将被转录为文本形式后进行处理或者叫*解析*。被检测到的关键字最后将被处理成手机或语音助手可以理解并采取行动的格式。 + +![comprehension](images/comprehension.png) +> Real linguistic comprehension is hard! Image by [Jen Looper](https://twitter.com/jenlooper) +> 真实的语言理解十分困难!图源:[Jen Looper](https://twitter.com/jenlooper) + +### How is this technology made possible? +### 这项技术是如何实现的? + +This is possible because someone wrote a computer program to do this. A few decades ago, some science fiction writers predicted that people would mostly speak to their computers, and the computers would always understand exactly what they meant. Sadly, it turned out to be a harder problem that many imagined, and while it is a much better understood problem today, there are significant challenges in achieving 'perfect' natural language processing when it comes to understanding the meaning of a sentence. This is a particularly hard problem when it comes to understanding humour or detecting emotions such as sarcasm in a sentence. +有人编写了一个计算机程序来实现这项技术。几十年前,一些科幻作家预测人类很大可能会和他们的电脑对话,而电脑总是能准确地理解人类的意思。可惜的是,事实证明这是一个比许多人想象中更难实现的问题,虽然今天这个问题已经被初步解决,但在理解句子的含义时,要实现“完美”的自然语言处理仍然存在重大挑战。句子中的幽默理解或讽刺等情绪的检测是一个特别困难的问题。 + +At this point, you may be remembering school classes where the teacher covered the parts of grammar in a sentence. In some countries, students are taught grammar and linguistics as a dedicated subject, but in many, these topics are included as part of learning a language: either your first language in primary school (learning to read and write) and perhaps a second language in post-primary, or high school. Don't worry if you are not an expert at differentiating nouns from verbs or adverbs from adjectives! +此时,你可能会想起学校课堂上老师讲解的部分句子语法。在某些国家/地区,语法和语言学知识是学生的专题课内容。但在另一些国家/地区,不管是在小学时的第一语言(学习阅读和写作),或者在高年级及高中时学习的第二语言中,语法及语言学知识是作为学习语言的一部分教学的。如果你不能很好地区分名词与动词或者区分副词与形容词,请不要担心! 
+ +If you struggle with the difference between the *simple present* and *present progressive*, you are not alone. This is a challenging thing for many people, even native speakers of a language. The good news is that computers are really good at applying formal rules, and you will learn to write code that can *parse* a sentence as well as a human. The greater challenge you will examine later is understanding the *meaning*, and *sentiment*, of a sentence. +如果你还为区分*一般现在时*与*现在进行时*而烦恼,你并不是一个人。即使是对以这门语言为母语的人在内的很多人来说这都是一项有挑战性的任务。好消息是,计算机非常善于应用标准的规则,你将学会编写可以像人一样"解析"句子的代码。稍后你将面对的更大挑战是理解句子的*语义*和*情绪*。 + +## Prerequisites +## 前提 + +For this lesson, the main prerequisite is being able to read and understand the language of this lesson. There are no math problems or equations to solve. While the original author wrote this lesson in English, it is also translated into other languages, so you could be reading a translation. There are examples where a number of different languages are used (to compare the different grammar rules of different languages). These are *not* translated, but the explanatory text is, so the meaning should be clear. +本节教程的主要先决条件是能够阅读和理解本节教程的语言。本节中没有数学问题或方程需要解决。虽然原作者用英文写了这教程,但它也被翻译成其他语言,所以你可能在阅读翻译内容。有使用多种不同语言的示例(以比较不同语言的不同语法规则)。这些是*未*翻译的,但解释性文本是翻译内容,所以表义应当是清晰的。 + +For the coding tasks, you will use Python and the examples are using Python 3.8. +编程任务中,你将会使用Python语言,示例使用的是Python 3.8版本。 + +In this section, you will need, and use: +在本节中你将需要并使用: + +- **Python 3 comprehension**. Programming language comprehension in Python 3, this lesson uses input, loops, file reading, arrays. +- **Visual Studio Code + extension**. We will use Visual Studio Code and its Python extension. You can also use a Python IDE of your choice. +- **TextBlob**. [TextBlob](https://github.com/sloria/TextBlob) is a simplified text processing library for Python. 
Follow the instructions on the TextBlob site to install it on your system (install the corpora as well, as shown below): +- **Python 3 理解**. Python 3中的编程语言理解,本课使用输入、循环、文件读取、数组。 +- **Visual Studio Code + 扩展**. 我们将使用 Visual Studio Code 及其 Python 扩展。你还可以使用你选择的 Python IDE。 +- **TextBlob**. [TextBlob](https://github.com/sloria/TextBlob)是一个简化的 Python 文本处理库。按照 TextBlob 网站上的说明在您的系统上安装它(也安装语料库,如下所示): +- + ```bash + pip install -U textblob + python -m textblob.download_corpora + ``` + +> 💡 Tip: You can run Python directly in VS Code environments. Check the [docs](https://code.visualstudio.com/docs/languages/python?WT.mc_id=academic-15963-cxa) for more information. +> 💡 提示:可以在 VS Code 环境中直接运行 Python。 点击[docs](https://code.visualstudio.com/docs/languages/python?WT.mc_id=academic-15963-cxa)查看更多信息。 + +## Talking to machines +## 与机器对话 + +The history of trying to make computers understand human language goes back decades, and one of the earliest scientists to consider natural language processing was *Alan Turing*. +试图让计算机理解人类语言的历史可以追溯到几十年前,最早考虑自然语言处理的科学家之一是 *Alan Turing*。 + +### The 'Turing test' +### 图灵测试 + + +When Turing was researching *artificial intelligence* in the 1950's, he considered if a conversational test could be given to a human and computer (via typed correspondence) where the human in the conversation was not sure if they were conversing with another human or a computer. +当图灵在1950年代研究*人工智能*时,他考虑是否可以对人和计算机进行对话测试(通过打字对应),其中对话中的人不确定他们是在与另一个人交谈还是与计算机交谈. + +If, after a certain length of conversation, the human could not determine that the answers were from a computer or not, then could the computer be said to be *thinking*? +如果经过一定时间的交谈,人类无法确定答案是否来自计算机,那么是否可以说计算机正在“思考”? + +### The inspiration - 'the imitation game' +### 灵感 - “模仿游戏” + +The idea for this came from a party game called *The Imitation Game* where an interrogator is alone in a room and tasked with determining which of two people (in another room) are male and female respectively. 
The interrogator can send notes, and must try to think of questions where the written answers reveal the gender of the mystery person. Of course, the players in the other room are trying to trick the interrogator by answering questions in such as way as to mislead or confuse the interrogator, whilst also giving the appearance of answering honestly. +这个想法来自一个名为 *模仿游戏* 的派对游戏,其中一名审讯者独自一人在一个房间里,负责确定两个人(在另一个房间里)是男性还是女性。审讯者可以传递笔记,并且需要想出能够揭示神秘人性别的问题。当然,另一个房间的玩家试图通过回答问题的方式来欺骗审讯者,例如误导或迷惑审讯者,同时表现出诚实回答的样子。 + +### Eliza的研发 + +In the 1960's an MIT scientist called *Joseph Weizenbaum* developed [*Eliza*](https:/wikipedia.org/wiki/ELIZA), a computer 'therapist' that would ask the human questions and give the appearance of understanding their answers. However, while Eliza could parse a sentence and identify certain grammatical constructs and keywords so as to give a reasonable answer, it could not be said to *understand* the sentence. If Eliza was presented with a sentence following the format "**I am** sad" it might rearrange and substitute words in the sentence to form the response "How long have **you been** sad". +在 1960 年代,一位名叫 *Joseph Weizenbaum* 的麻省理工学院科学家开发了[*Eliza*](https:/wikipedia.org/wiki/ELIZA),Eliza是一位计算机“治疗师”,它可以向人类提出问题并表现出理解他们的答案。然而,虽然 Eliza 可以解析句子并识别某些语法结构和关键字以给出合理的答案,但不能说它*理解*了句子。如果 Eliza 看到的句子格式为“**I am** sad”,它可能会重新排列并替换句子中的单词以形成响应“How long have ** you been** sad"。 + +This gave the impression that Eliza understood the statement and was asking a follow-on question, whereas in reality, it was changing the tense and adding some words. If Eliza could not identify a keyword that it had a response for, it would instead give a random response that could be applicable to many different statements. Eliza could be easily tricked, for instance if a user wrote "**You are** a bicycle" it might respond with "How long have **I been** a bicycle?", instead of a more reasoned response. 
+这给人的印象是伊丽莎理解了这句话,并在问一个后续问题,而实际上,它是在改变时态并添加一些词。如果 Eliza 无法识别它有响应的关键字,它会给出一个随机响应,该响应可以适用于许多不同的语句。 Eliza 很容易被欺骗,例如,如果用户写了**You are** a bicycle",它可能会回复"How long have **I been** a bicycle?",而不是更合理的回答。 + +[![Chatting with Eliza](https://img.youtube.com/vi/RMK9AphfLco/0.jpg)](https://youtu.be/RMK9AphfLco "Chatting with Eliza") + +> 🎥 Click the image above for a video about original ELIZA program +> 🎥 点击上方的图片查看真实的ELIZA程序视频 + +> Note: You can read the original description of [Eliza](https://cacm.acm.org/magazines/1966/1/13317-elizaa-computer-program-for-the-study-of-natural-language-communication-between-man-and-machine/abstract) published in 1966 if you have an ACM account. Alternately, read about Eliza on [wikipedia](https://wikipedia.org/wiki/ELIZA) +> 注意:如果你拥有ACM账户,你可以阅读1996年发表的[Eliza](https://cacm.acm.org/magazines/1966/1/13317-elizaa-computer-program-for-the-study-of-natural-language-communication-between-man-and-machine/abstract)的原始介绍。或者,在[wikipedia](https://wikipedia.org/wiki/ELIZA)阅读有关 Eliza 的信息 + +## Exercise - coding a basic conversational bot +## 联系 - 编码实现一个基础的对话机器人 + +A conversational bot, like Eliza, is a program that elicits user input and seems to understand and respond intelligently. Unlike Eliza, our bot will not have several rules giving it the appearance of having an intelligent conversation. Instead, out bot will have one ability only, to keep the conversation going with random responses that might work in almost any trivial conversation. +像 Eliza 一样的对话机器人是一个似乎可以智能地理解和响应用户输入的程序。与 Eliza 不同的是,我们的机器人不会用规则让它看起来像是在进行智能对话。取而代之的是,我们的对话机器人将只有一种能力,通过几乎在所有琐碎对话中都适用的随机响应保持对话的进行。 + +### The plan +### 计划 + +Your steps when building a conversational bot: +搭建聊天机器人的步骤 + +1. Print instructions advising the user how to interact with the bot +2. Start a loop + 1. Accept user input + 2. If user has asked to exit, then exit + 3. Process user input and determine response (in this case, the response is a random choice from a list of possible generic responses) + 4. 
Print response +3. loop back to step 2 +1. 打印指导用户如何与机器人交互的说明 +2. 开启循环 + 1. 获取用户输入 + 2. 如果用户要求退出,就退出 + 3. 处理用户输入并选择一个回答(在这个例子中,回答从一个可能的通用回答列表中随机选择) + 4. 打印回答 +3. 重复步骤2 + +### Building the bot +### 构建聊天机器人 + +接下来让我们构建聊天机器人。我们将从定义一些短语开始。 + +1. 使用以下随机响应在 Python 中自己创建此机器人: + + ```python + random_responses = ["That is quite interesting, please tell me more.", + "I see. Do go on.", + "Why do you say that?", + "Funny weather we've been having, isn't it?", + "Let's change the subject.", + "Did you catch the game last night?"] + ``` + + Here is some sample output to guide you (user input is on the lines starting with `>`): + + ```output + Hello, I am Marvin, the simple robot. + You can end this conversation at any time by typing 'bye' + After typing each answer, press 'enter' + How are you today? + > I am good thanks + That is quite interesting, please tell me more. + > today I went for a walk + Did you catch the game last night? + > I did, but my team lost + Funny weather we've been having, isn't it? + > yes but I hope next week is better + Let's change the subject. + > ok, lets talk about music + Why do you say that? + > because I like music! + Why do you say that? + > bye + It was nice talking to you, goodbye! + ``` + + 该任务的一种可能解决方案在[这里](solution/bot.py) + + ✅ Stop and consider + ✅ 停止并思考 + + 1. Do you think the random responses would 'trick' someone into thinking that the bot actually understood them? + 2. What features would the bot need to be more effective? + 3. If a bot could really 'understand' the meaning of a sentence, would it need to 'remember' the meaning of previous sentences in a conversation too? + + 1. 你认为随机响应会“欺骗”某人认为机器人实际上理解他们吗? + 2. 机器人需要哪些功能才能更有效? + 3. 如果机器人真的可以“理解”一个句子的意思,它是否也需要“记住”对话中前面句子的意思? + +--- + +## 🚀Challenge +## 🚀挑战 + +Choose one of the "stop and consider" elements above and either try to implement them in code or write a solution on paper using pseudocode. 
+选择上面的“停止并思考”元素之一,然后尝试在代码中实现它们或使用伪代码在纸上编写解决方案。 + +In the next lesson, you'll learn about a number of other approaches to parsing natural language and machine learning. +在下一课中,您将了解解析自然语言和机器学习的许多其他方法。 + +## [Post-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/32/) +## [课后测验](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/32/) + +## Review & Self Study +## 复习与自学 + +Take a look at the references below as further reading opportunities. +看看下面的参考资料作为进一步的阅读机会。 + +### References +### 参考 + +1. Schubert, Lenhart, "Computational Linguistics", *The Stanford Encyclopedia of Philosophy* (Spring 2020 Edition), Edward N. Zalta (ed.), URL = . +2. Princeton University "About WordNet." [WordNet](https://wordnet.princeton.edu/). Princeton University. 2010. + +## Assignment +## 任务 + +[查找一个机器人](assignment.md) From a0e4826c25c771f1c8559def049b955347854ade Mon Sep 17 00:00:00 2001 From: lty <247969917@qq.com> Date: Wed, 14 Jul 2021 18:24:33 +0800 Subject: [PATCH 05/51] Add Chinese README for 6.1 --- .../translations/README.zh-cn.md | 69 +------------------ 1 file changed, 2 insertions(+), 67 deletions(-) diff --git a/6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md b/6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md index 06ed9ca0..8dbfebc2 100644 --- a/6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md +++ b/6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md @@ -1,69 +1,41 @@ -# Introduction to natural language processing # 自然语言处理介绍 -This lesson covers a brief history and important concepts of *natural language processing*, a subfield of *computational linguistics*. 这节课讲解了*自然语言处理*简要历史和重要概念,*自然语言处理*是计算语言学的一个子领域。 -## [Pre-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/31/) -## Introduction +## [课前测验]](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/31/) + ## 介绍 -NLP, as it is commonly known, is one of the best-known areas where machine learning has been applied and used in production software. 
众所周知,自然语言处理(Natural Language Processing, NLP)是机器学习在生产软件中应用最广泛的领域之一。 -✅ Can you think of software that you use every day that probably has some NLP embedded? What about your word processing programs or mobile apps that you use regularly? - ✅你能想到哪些你日常生活中使用的软件嵌入了某些自然语言处理技术呢?你经常使用的文字处理程序或移动应用程序是否嵌入了自然语言处理技术呢? -You will learn about: 你将会学习到: -- **The idea of languages**. How languages developed and what the major areas of study have been. -- **Definition and concepts**. You will also learn definitions and concepts about how computers process text, including parsing, grammar, and identifying nouns and verbs. There are some coding tasks in this lesson, and several important concepts are introduced that you will learn to code later on in the next lessons. - **语言的思想**. 语言的发展历程及主要研究领域. - **定义和概念**. 你还将学习到有关计算机如何处理文本的定义和概念,包括解析、语法以及名词和动词的识别。本节课程包含一些编码任务并介绍了几个重要的概念,你将在下一节课中学习编码实现这些概念。 - -## Computational linguistics ## 计算语言学 -Computational linguistics is an area of research and development over many decades that studies how computers can work with, and even understand, translate, and communicate with languages. natural language processing (NLP) is a related field focused on how computers can process 'natural', or human, languages. 计算语言学是一个经过几十年研究和发展的领域,它研究计算机如何使用语言、理解语言、翻译语言及使用语言交流。自然语言处理(NLP)是计算语言学中一个专注于计算机如何处理“自然”或人类语言的相关领域, - -### Example - phone dictation ### 例子 - 电话号码识别 -If you have ever dictated to your phone instead of typing or asked a virtual assistant a question, your speech was converted into a text form and then processed or *parsed* from the language you spoke. The detected keywords were then processed into a format that the phone or assistant could understand and act on. 如果你曾经在手机上使用语音输入替代键盘输入或者向语音助手小娜提问,那么你的语音将被转录为文本形式后进行处理或者叫*解析*。被检测到的关键字最后将被处理成手机或语音助手可以理解并采取行动的格式。 ![comprehension](images/comprehension.png) -> Real linguistic comprehension is hard! 
Image by [Jen Looper](https://twitter.com/jenlooper) > 真实的语言理解十分困难!图源:[Jen Looper](https://twitter.com/jenlooper) - -### How is this technology made possible? ### 这项技术是如何实现的? -This is possible because someone wrote a computer program to do this. A few decades ago, some science fiction writers predicted that people would mostly speak to their computers, and the computers would always understand exactly what they meant. Sadly, it turned out to be a harder problem that many imagined, and while it is a much better understood problem today, there are significant challenges in achieving 'perfect' natural language processing when it comes to understanding the meaning of a sentence. This is a particularly hard problem when it comes to understanding humour or detecting emotions such as sarcasm in a sentence. 有人编写了一个计算机程序来实现这项技术。几十年前,一些科幻作家预测人类很大可能会和他们的电脑对话,而电脑总是能准确地理解人类的意思。可惜的是,事实证明这是一个比许多人想象中更难实现的问题,虽然今天这个问题已经被初步解决,但在理解句子的含义时,要实现“完美”的自然语言处理仍然存在重大挑战。句子中的幽默理解或讽刺等情绪的检测是一个特别困难的问题。 -At this point, you may be remembering school classes where the teacher covered the parts of grammar in a sentence. In some countries, students are taught grammar and linguistics as a dedicated subject, but in many, these topics are included as part of learning a language: either your first language in primary school (learning to read and write) and perhaps a second language in post-primary, or high school. Don't worry if you are not an expert at differentiating nouns from verbs or adverbs from adjectives! 此时,你可能会想起学校课堂上老师讲解的部分句子语法。在某些国家/地区,语法和语言学知识是学生的专题课内容。但在另一些国家/地区,不管是在小学时的第一语言(学习阅读和写作),或者在高年级及高中时学习的第二语言中,语法及语言学知识是作为学习语言的一部分教学的。如果你不能很好地区分名词与动词或者区分副词与形容词,请不要担心! -If you struggle with the difference between the *simple present* and *present progressive*, you are not alone. This is a challenging thing for many people, even native speakers of a language. 
The good news is that computers are really good at applying formal rules, and you will learn to write code that can *parse* a sentence as well as a human. The greater challenge you will examine later is understanding the *meaning*, and *sentiment*, of a sentence. 如果你还为区分*一般现在时*与*现在进行时*而烦恼,你并不是一个人。即使是对以这门语言为母语的人在内的很多人来说这都是一项有挑战性的任务。好消息是,计算机非常善于应用标准的规则,你将学会编写可以像人一样"解析"句子的代码。稍后你将面对的更大挑战是理解句子的*语义*和*情绪*。 - -## Prerequisites ## 前提 -For this lesson, the main prerequisite is being able to read and understand the language of this lesson. There are no math problems or equations to solve. While the original author wrote this lesson in English, it is also translated into other languages, so you could be reading a translation. There are examples where a number of different languages are used (to compare the different grammar rules of different languages). These are *not* translated, but the explanatory text is, so the meaning should be clear. 本节教程的主要先决条件是能够阅读和理解本节教程的语言。本节中没有数学问题或方程需要解决。虽然原作者用英文写了这教程,但它也被翻译成其他语言,所以你可能在阅读翻译内容。有使用多种不同语言的示例(以比较不同语言的不同语法规则)。这些是*未*翻译的,但解释性文本是翻译内容,所以表义应当是清晰的。 -For the coding tasks, you will use Python and the examples are using Python 3.8. 编程任务中,你将会使用Python语言,示例使用的是Python 3.8版本。 -In this section, you will need, and use: 在本节中你将需要并使用: -- **Python 3 comprehension**. Programming language comprehension in Python 3, this lesson uses input, loops, file reading, arrays. -- **Visual Studio Code + extension**. We will use Visual Studio Code and its Python extension. You can also use a Python IDE of your choice. -- **TextBlob**. [TextBlob](https://github.com/sloria/TextBlob) is a simplified text processing library for Python. Follow the instructions on the TextBlob site to install it on your system (install the corpora as well, as shown below): - **Python 3 理解**. Python 3中的编程语言理解,本课使用输入、循环、文件读取、数组。 - **Visual Studio Code + 扩展**. 我们将使用 Visual Studio Code 及其 Python 扩展。你还可以使用你选择的 Python IDE。 - **TextBlob**. 
[TextBlob](https://github.com/sloria/TextBlob)是一个简化的 Python 文本处理库。按照 TextBlob 网站上的说明在您的系统上安装它(也安装语料库,如下所示): @@ -73,66 +45,40 @@ In this section, you will need, and use: python -m textblob.download_corpora ``` -> 💡 Tip: You can run Python directly in VS Code environments. Check the [docs](https://code.visualstudio.com/docs/languages/python?WT.mc_id=academic-15963-cxa) for more information. > 💡 提示:可以在 VS Code 环境中直接运行 Python。 点击[docs](https://code.visualstudio.com/docs/languages/python?WT.mc_id=academic-15963-cxa)查看更多信息。 -## Talking to machines ## 与机器对话 -The history of trying to make computers understand human language goes back decades, and one of the earliest scientists to consider natural language processing was *Alan Turing*. 试图让计算机理解人类语言的历史可以追溯到几十年前,最早考虑自然语言处理的科学家之一是 *Alan Turing*。 - -### The 'Turing test' ### 图灵测试 - -When Turing was researching *artificial intelligence* in the 1950's, he considered if a conversational test could be given to a human and computer (via typed correspondence) where the human in the conversation was not sure if they were conversing with another human or a computer. 当图灵在1950年代研究*人工智能*时,他考虑是否可以对人和计算机进行对话测试(通过打字对应),其中对话中的人不确定他们是在与另一个人交谈还是与计算机交谈. -If, after a certain length of conversation, the human could not determine that the answers were from a computer or not, then could the computer be said to be *thinking*? 如果经过一定时间的交谈,人类无法确定答案是否来自计算机,那么是否可以说计算机正在“思考”? - -### The inspiration - 'the imitation game' ### 灵感 - “模仿游戏” -The idea for this came from a party game called *The Imitation Game* where an interrogator is alone in a room and tasked with determining which of two people (in another room) are male and female respectively. The interrogator can send notes, and must try to think of questions where the written answers reveal the gender of the mystery person. 
Of course, the players in the other room are trying to trick the interrogator by answering questions in such as way as to mislead or confuse the interrogator, whilst also giving the appearance of answering honestly. 这个想法来自一个名为 *模仿游戏* 的派对游戏,其中一名审讯者独自一人在一个房间里,负责确定两个人(在另一个房间里)是男性还是女性。审讯者可以传递笔记,并且需要想出能够揭示神秘人性别的问题。当然,另一个房间的玩家试图通过回答问题的方式来欺骗审讯者,例如误导或迷惑审讯者,同时表现出诚实回答的样子。 ### Eliza的研发 -In the 1960's an MIT scientist called *Joseph Weizenbaum* developed [*Eliza*](https:/wikipedia.org/wiki/ELIZA), a computer 'therapist' that would ask the human questions and give the appearance of understanding their answers. However, while Eliza could parse a sentence and identify certain grammatical constructs and keywords so as to give a reasonable answer, it could not be said to *understand* the sentence. If Eliza was presented with a sentence following the format "**I am** sad" it might rearrange and substitute words in the sentence to form the response "How long have **you been** sad". 在 1960 年代,一位名叫 *Joseph Weizenbaum* 的麻省理工学院科学家开发了[*Eliza*](https:/wikipedia.org/wiki/ELIZA),Eliza是一位计算机“治疗师”,它可以向人类提出问题并表现出理解他们的答案。然而,虽然 Eliza 可以解析句子并识别某些语法结构和关键字以给出合理的答案,但不能说它*理解*了句子。如果 Eliza 看到的句子格式为“**I am** sad”,它可能会重新排列并替换句子中的单词以形成响应“How long have ** you been** sad"。 -This gave the impression that Eliza understood the statement and was asking a follow-on question, whereas in reality, it was changing the tense and adding some words. If Eliza could not identify a keyword that it had a response for, it would instead give a random response that could be applicable to many different statements. Eliza could be easily tricked, for instance if a user wrote "**You are** a bicycle" it might respond with "How long have **I been** a bicycle?", instead of a more reasoned response. 
这给人的印象是伊丽莎理解了这句话,并在问一个后续问题,而实际上,它是在改变时态并添加一些词。如果 Eliza 无法识别它有响应的关键字,它会给出一个随机响应,该响应可以适用于许多不同的语句。 Eliza 很容易被欺骗,例如,如果用户写了**You are** a bicycle",它可能会回复"How long have **I been** a bicycle?",而不是更合理的回答。 [![Chatting with Eliza](https://img.youtube.com/vi/RMK9AphfLco/0.jpg)](https://youtu.be/RMK9AphfLco "Chatting with Eliza") -> 🎥 Click the image above for a video about original ELIZA program > 🎥 点击上方的图片查看真实的ELIZA程序视频 -> Note: You can read the original description of [Eliza](https://cacm.acm.org/magazines/1966/1/13317-elizaa-computer-program-for-the-study-of-natural-language-communication-between-man-and-machine/abstract) published in 1966 if you have an ACM account. Alternately, read about Eliza on [wikipedia](https://wikipedia.org/wiki/ELIZA) > 注意:如果你拥有ACM账户,你可以阅读1996年发表的[Eliza](https://cacm.acm.org/magazines/1966/1/13317-elizaa-computer-program-for-the-study-of-natural-language-communication-between-man-and-machine/abstract)的原始介绍。或者,在[wikipedia](https://wikipedia.org/wiki/ELIZA)阅读有关 Eliza 的信息 -## Exercise - coding a basic conversational bot ## 联系 - 编码实现一个基础的对话机器人 -A conversational bot, like Eliza, is a program that elicits user input and seems to understand and respond intelligently. Unlike Eliza, our bot will not have several rules giving it the appearance of having an intelligent conversation. Instead, out bot will have one ability only, to keep the conversation going with random responses that might work in almost any trivial conversation. 像 Eliza 一样的对话机器人是一个似乎可以智能地理解和响应用户输入的程序。与 Eliza 不同的是,我们的机器人不会用规则让它看起来像是在进行智能对话。取而代之的是,我们的对话机器人将只有一种能力,通过几乎在所有琐碎对话中都适用的随机响应保持对话的进行。 -### The plan ### 计划 -Your steps when building a conversational bot: 搭建聊天机器人的步骤 -1. Print instructions advising the user how to interact with the bot -2. Start a loop - 1. Accept user input - 2. If user has asked to exit, then exit - 3. Process user input and determine response (in this case, the response is a random choice from a list of possible generic responses) - 4. Print response -3. 
loop back to step 2 1. 打印指导用户如何与机器人交互的说明 2. 开启循环 1. 获取用户输入 @@ -141,7 +87,6 @@ Your steps when building a conversational bot: 4. 打印回答 3. 重复步骤2 -### Building the bot ### 构建聊天机器人 接下来让我们构建聊天机器人。我们将从定义一些短语开始。 @@ -194,32 +139,22 @@ Your steps when building a conversational bot: 3. 如果机器人真的可以“理解”一个句子的意思,它是否也需要“记住”对话中前面句子的意思? --- - -## 🚀Challenge ## 🚀挑战 -Choose one of the "stop and consider" elements above and either try to implement them in code or write a solution on paper using pseudocode. 选择上面的“停止并思考”元素之一,然后尝试在代码中实现它们或使用伪代码在纸上编写解决方案。 -In the next lesson, you'll learn about a number of other approaches to parsing natural language and machine learning. 在下一课中,您将了解解析自然语言和机器学习的许多其他方法。 -## [Post-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/32/) ## [课后测验](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/32/) -## Review & Self Study ## 复习与自学 -Take a look at the references below as further reading opportunities. 看看下面的参考资料作为进一步的阅读机会。 - -### References ### 参考 1. Schubert, Lenhart, "Computational Linguistics", *The Stanford Encyclopedia of Philosophy* (Spring 2020 Edition), Edward N. Zalta (ed.), URL = . 2. Princeton University "About WordNet." [WordNet](https://wordnet.princeton.edu/). Princeton University. 2010. 
-## Assignment ## 任务 [查找一个机器人](assignment.md) From 4bccc6ae2fe5cdbaf3dd91be4bf53ba3ae3d0236 Mon Sep 17 00:00:00 2001 From: lty <247969917@qq.com> Date: Wed, 14 Jul 2021 18:31:38 +0800 Subject: [PATCH 06/51] Fix a image path error --- 6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md b/6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md index 8dbfebc2..0b5b83f5 100644 --- a/6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md +++ b/6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md @@ -1,7 +1,7 @@ # 自然语言处理介绍 这节课讲解了*自然语言处理*简要历史和重要概念,*自然语言处理*是计算语言学的一个子领域。 -## [课前测验]](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/31/) +## [课前测验](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/31/) ## 介绍 众所周知,自然语言处理(Natural Language Processing, NLP)是机器学习在生产软件中应用最广泛的领域之一。 @@ -19,7 +19,7 @@ 如果你曾经在手机上使用语音输入替代键盘输入或者向语音助手小娜提问,那么你的语音将被转录为文本形式后进行处理或者叫*解析*。被检测到的关键字最后将被处理成手机或语音助手可以理解并采取行动的格式。 -![comprehension](images/comprehension.png) +![comprehension](../images/comprehension.png) > 真实的语言理解十分困难!图源:[Jen Looper](https://twitter.com/jenlooper) ### 这项技术是如何实现的? @@ -127,7 +127,6 @@ 该任务的一种可能解决方案在[这里](solution/bot.py) - ✅ Stop and consider ✅ 停止并思考 1. Do you think the random responses would 'trick' someone into thinking that the bot actually understood them? 
From 31b963c1d8621b6f05e64e5361924ea3ce8645c4 Mon Sep 17 00:00:00 2001 From: lty <247969917@qq.com> Date: Wed, 14 Jul 2021 18:33:15 +0800 Subject: [PATCH 07/51] ind --- 6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md b/6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md index 0b5b83f5..9d1e0665 100644 --- a/6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md +++ b/6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md @@ -102,8 +102,8 @@ "Did you catch the game last night?"] ``` - Here is some sample output to guide you (user input is on the lines starting with `>`): - + 以下是一些指导你的示例输出(用户输入位于以 `>` 开头的行上): + ```output Hello, I am Marvin, the simple robot. You can end this conversation at any time by typing 'bye' @@ -128,10 +128,6 @@ 该任务的一种可能解决方案在[这里](solution/bot.py) ✅ 停止并思考 - - 1. Do you think the random responses would 'trick' someone into thinking that the bot actually understood them? - 2. What features would the bot need to be more effective? - 3. If a bot could really 'understand' the meaning of a sentence, would it need to 'remember' the meaning of previous sentences in a conversation too? 1. 你认为随机响应会“欺骗”某人认为机器人实际上理解他们吗? 2. 机器人需要哪些功能才能更有效? 
From 4a7ab265381462067304e844d37d3ed7d3642736 Mon Sep 17 00:00:00 2001 From: lty <247969917@qq.com> Date: Wed, 14 Jul 2021 18:35:01 +0800 Subject: [PATCH 08/51] ind --- 6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md b/6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md index 9d1e0665..75cd69da 100644 --- a/6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md +++ b/6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md @@ -45,7 +45,7 @@ python -m textblob.download_corpora ``` -> 💡 提示:可以在 VS Code 环境中直接运行 Python。 点击[docs](https://code.visualstudio.com/docs/languages/python?WT.mc_id=academic-15963-cxa)查看更多信息。 +> 💡 提示:可以在 VS Code 环境中直接运行 Python。 点击[docs](https://code.visualstudio.com/docs/languages/python?WT.mc_id=academic-15963-cxa)查看更多信息。 ## 与机器对话 @@ -61,7 +61,7 @@ ### Eliza的研发 -在 1960 年代,一位名叫 *Joseph Weizenbaum* 的麻省理工学院科学家开发了[*Eliza*](https:/wikipedia.org/wiki/ELIZA),Eliza是一位计算机“治疗师”,它可以向人类提出问题并表现出理解他们的答案。然而,虽然 Eliza 可以解析句子并识别某些语法结构和关键字以给出合理的答案,但不能说它*理解*了句子。如果 Eliza 看到的句子格式为“**I am** sad”,它可能会重新排列并替换句子中的单词以形成响应“How long have ** you been** sad"。 +在 1960 年代,一位名叫 *Joseph Weizenbaum* 的麻省理工学院科学家开发了[*Eliza*](https:/wikipedia.org/wiki/ELIZA),Eliza是一位计算机“治疗师”,它可以向人类提出问题并表现出理解他们的答案。然而,虽然 Eliza 可以解析句子并识别某些语法结构和关键字以给出合理的答案,但不能说它*理解*了句子。如果 Eliza 看到的句子格式为“**I am** sad”,它可能会重新排列并替换句子中的单词以形成响应“How long have **you been** sad"。 这给人的印象是伊丽莎理解了这句话,并在问一个后续问题,而实际上,它是在改变时态并添加一些词。如果 Eliza 无法识别它有响应的关键字,它会给出一个随机响应,该响应可以适用于许多不同的语句。 Eliza 很容易被欺骗,例如,如果用户写了**You are** a bicycle",它可能会回复"How long have **I been** a bicycle?",而不是更合理的回答。 @@ -103,7 +103,7 @@ ``` 以下是一些指导你的示例输出(用户输入位于以 `>` 开头的行上): - + ```output Hello, I am Marvin, the simple robot. You can end this conversation at any time by typing 'bye' From eb6f65e3a8e8b9ba5ea8b555d314690558b0a236 Mon Sep 17 00:00:00 2001 From: "Charles Emmanuel S. 
Ndiaye" Date: Wed, 14 Jul 2021 11:41:50 +0000 Subject: [PATCH 09/51] propose Regression french translation readme add a README.fr.md for Regression base README --- 2-Regression/translations/README.fr.md | 33 ++++++++++++++++++++++++++ 1 file changed, 33 insertions(+) create mode 100644 2-Regression/translations/README.fr.md diff --git a/2-Regression/translations/README.fr.md b/2-Regression/translations/README.fr.md new file mode 100644 index 00000000..eaed5756 --- /dev/null +++ b/2-Regression/translations/README.fr.md @@ -0,0 +1,33 @@ +# Modèles de régression pour le machine learning +## Sujet régional : Modèles de régression des prix des citrouilles en Amérique du Nord 🎃 + +En Amérique du Nord, les citrouilles sont souvent sculptées en visages effrayants pour Halloween. Découvrons-en plus sur ces légumes fascinants! + +![jack-o-lanterns](../images/jack-o-lanterns.jpg) +> Photo de Beth Teutschmann sur Unsplash + +## Ce que vous apprendrez + +Les leçons de cette section couvrent les types de régression dans le contexte du machine learning. Les modèles de régression peuvent aider à déterminer la _relation_ entre les variables. Ce type de modèle peut prédire des valeurs telles que la longueur, la température ou l'âge, découvrant ainsi les relations entre les variables lors de l'analyse des points de données. + +Dans cette série de leçons, vous découvrirez la différence entre la régression linéaire et la régression logistique, et quand vous devriez utiliser l'une ou l'autre. + +Dans ce groupe de leçons, vous serez préparé afin de commencer les tâches de machine learning, y compris la configuration de Visual Studio Code pour gérer les blocs-notes, l'environnement commun pour les scientifiques des données. Vous découvrirez Scikit-learn, une bibliothèque pour le machine learning, et vous construirez vos premiers modèles, en vous concentrant sur les modèles de régression dans ce chapitre. 
+ +> Il existe des outils low-code utiles qui peuvent vous aider à apprendre à travailler avec des modèles de régression. Essayez [Azure ML pour cette tâche](https://docs.microsoft.com/learn/modules/create-regression-model-azure-machine-learning-designer/?WT.mc_id=academic-15963-cxa) + +### Cours + +1. [Outils du métier](1-Tools/README.md) +2. [Gestion des données](2-Data/README.md) +3. [Régression linéaire et polynomiale](3-Linear/README.md) +4. [Régression logistique](4-Logistic/README.md) + +--- +### Crédits + +"ML avec régression" a été écrit avec ♥️ par [Jen Looper](https://twitter.com/jenlooper) + +♥️ Les contributeurs du quiz incluent : [Muhammad Sakib Khan Inan](https://twitter.com/Sakibinan) et [Ornella Altunyan](https://twitter.com/ornelladotcom) + +L'ensemble de données sur la citrouille est suggéré par [ce projet sur Kaggle](https://www.kaggle.com/usda/a-year-of-pumpkin-prices) et ses données proviennent des [Rapports standard des marchés terminaux des cultures spécialisées](https://www.marketnews.usda.gov/mnp/fv-report-config-step1?type=termPrice) distribué par le département américain de l'Agriculture. Nous avons ajouté quelques points autour de la couleur en fonction de la variété pour normaliser la distribution. Ces données sont dans le domaine public. 
From ac061faf5fdf6e409e94b2da01bfc0beb825fa29 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=96=87=E4=BD=93=E4=B8=A4=E5=BC=80=E8=8A=B1=E7=94=9F?= <56857145+loap-a@users.noreply.github.com> Date: Wed, 14 Jul 2021 20:40:40 +0800 Subject: [PATCH 10/51] Add files via upload --- 2-Regression/translations/Readme.zh-cn.md | 34 +++++++++++++++++++++++ 1 file changed, 34 insertions(+) create mode 100644 2-Regression/translations/Readme.zh-cn.md diff --git a/2-Regression/translations/Readme.zh-cn.md b/2-Regression/translations/Readme.zh-cn.md new file mode 100644 index 00000000..853c73b5 --- /dev/null +++ b/2-Regression/translations/Readme.zh-cn.md @@ -0,0 +1,34 @@ +# 机器学习中的回归模型 +## 本节主题: 北美南瓜价格的回归模型 🎃 + +在北美,南瓜经常在万圣节被刻上吓人的鬼脸。让我们来深入研究一下这种奇妙的蔬菜 + +![jack-o-lanterns](./images/jack-o-lanterns.jpg) +> Photo by Beth Teutschmann on Unsplash + +##你会学到什么 + +这节的课程包括机器学习领域中的多种回归模型。回归模型可以明确多种变量间的_关系_。这种模型可以用来预测类似长度、温度和年龄之类的值, 通过分析数据点来揭示变量之间的关系。 + +在本节的一系列课程中,你会学到线性回归和逻辑回归之间的区别,并且你将知道对于特定问题如何在这两种模型中进行选择 + +在这组课程中,你会准备好包括为管理笔记而设置VS Code、配置数据科学家常用的环境等机器学习的初始任务。你会开始上手Scikit-learn学习项目(一个机器学习的百科),并且你会以回归模型为主构建起你的第一种机器学习模型 + +> 这里有一些代码难度较低但很有用的工具可以帮助你学习使用回归模型。 试一下 [Azure ML for this task](https://docs.microsoft.com/learn/modules/create-regression-model-azure-machine-learning-designer/?WT.mc_id=academic-15963-cxa) + + +### Lessons + +1. [交易的工具](1-Tools/README.md) +2. [管理数据](2-Data/README.md) +3. [线性和多项式回归](3-Linear/README.md) +4. 
[逻辑回归](4-Logistic/README.md) + +--- +### Credits + +"机器学习中的回归" 由[Jen Looper](https://twitter.com/jenlooper)♥️ 撰写 + +♥️ 测试的贡献者: [Muhammad Sakib Khan Inan](https://twitter.com/Sakibinan) 和 [Ornella Altunyan](https://twitter.com/ornelladotcom) + +南瓜数据集受此启发 [this project on Kaggle](https://www.kaggle.com/usda/a-year-of-pumpkin-prices) 并且其数据源自 [Specialty Crops Terminal Markets Standard Reports](https://www.marketnews.usda.gov/mnp/fv-report-config-step1?type=termPrice) 由美国农业部上传分享。我们根据种类 \ No newline at end of file From aa5048fb0e5dba984d5f7bd949301b3031d513e2 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=96=87=E4=BD=93=E4=B8=A4=E5=BC=80=E8=8A=B1=E7=94=9F?= <56857145+loap-a@users.noreply.github.com> Date: Wed, 14 Jul 2021 21:04:12 +0800 Subject: [PATCH 11/51] Update Readme.zh-cn.md --- 2-Regression/translations/Readme.zh-cn.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/2-Regression/translations/Readme.zh-cn.md b/2-Regression/translations/Readme.zh-cn.md index 853c73b5..ee1243a2 100644 --- a/2-Regression/translations/Readme.zh-cn.md +++ b/2-Regression/translations/Readme.zh-cn.md @@ -31,4 +31,4 @@ ♥️ 测试的贡献者: [Muhammad Sakib Khan Inan](https://twitter.com/Sakibinan) 和 [Ornella Altunyan](https://twitter.com/ornelladotcom) -南瓜数据集受此启发 [this project on Kaggle](https://www.kaggle.com/usda/a-year-of-pumpkin-prices) 并且其数据源自 [Specialty Crops Terminal Markets Standard Reports](https://www.marketnews.usda.gov/mnp/fv-report-config-step1?type=termPrice) 由美国农业部上传分享。我们根据种类 \ No newline at end of file +南瓜数据集受此启发 [this project on Kaggle](https://www.kaggle.com/usda/a-year-of-pumpkin-prices) 并且其数据源自 [Specialty Crops Terminal Markets Standard Reports](https://www.marketnews.usda.gov/mnp/fv-report-config-step1?type=termPrice) 由美国农业部上传分享。我们根据种类添加了围绕颜色的一些数据点。这些数据处在公共的域名上。 From 10a7a93f1d050f22fcb63c8b4f93c29baddab0d0 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=96=87=E4=BD=93=E4=B8=A4=E5=BC=80=E8=8A=B1=E7=94=9F?= <56857145+loap-a@users.noreply.github.com> Date: Wed, 14 Jul 
2021 21:12:46 +0800 Subject: [PATCH 12/51] Add files via upload --- 2-Regression/translations/Readme.zh-cn.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/2-Regression/translations/Readme.zh-cn.md b/2-Regression/translations/Readme.zh-cn.md index ee1243a2..81c2c9d4 100644 --- a/2-Regression/translations/Readme.zh-cn.md +++ b/2-Regression/translations/Readme.zh-cn.md @@ -31,4 +31,4 @@ ♥️ 测试的贡献者: [Muhammad Sakib Khan Inan](https://twitter.com/Sakibinan) 和 [Ornella Altunyan](https://twitter.com/ornelladotcom) -南瓜数据集受此启发 [this project on Kaggle](https://www.kaggle.com/usda/a-year-of-pumpkin-prices) 并且其数据源自 [Specialty Crops Terminal Markets Standard Reports](https://www.marketnews.usda.gov/mnp/fv-report-config-step1?type=termPrice) 由美国农业部上传分享。我们根据种类添加了围绕颜色的一些数据点。这些数据处在公共的域名上。 +南瓜数据集受此启发 [this project on Kaggle](https://www.kaggle.com/usda/a-year-of-pumpkin-prices) 并且其数据源自 [Specialty Crops Terminal Markets Standard Reports](https://www.marketnews.usda.gov/mnp/fv-report-config-step1?type=termPrice) 由美国农业部上传分享。我们根据种类添加了围绕颜色的一些数据点。这些数据处在公共的域名上。 \ No newline at end of file From 11567b58cc5f88bfcd1fca645c1cc83fab48a8e9 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=96=87=E4=BD=93=E4=B8=A4=E5=BC=80=E8=8A=B1=E7=94=9F?= <56857145+loap-a@users.noreply.github.com> Date: Wed, 14 Jul 2021 21:13:48 +0800 Subject: [PATCH 13/51] Add files via upload --- 2-Regression/Readme.zh-cn.md | 34 ++++++++++++++++++++++++++++++++++ 1 file changed, 34 insertions(+) create mode 100644 2-Regression/Readme.zh-cn.md diff --git a/2-Regression/Readme.zh-cn.md b/2-Regression/Readme.zh-cn.md new file mode 100644 index 00000000..4972c594 --- /dev/null +++ b/2-Regression/Readme.zh-cn.md @@ -0,0 +1,34 @@ +# 机器学习中的回归模型 +## 本节主题: 北美南瓜价格的回归模型 🎃 + +在北美,南瓜经常在万圣节被刻上吓人的鬼脸。让我们来深入研究一下这种奇妙的蔬菜 + +![jack-o-lanterns](../images/jack-o-lanterns.jpg) +> Photo by Beth Teutschmann on Unsplash + +##你会学到什么 + +这节的课程包括机器学习领域中的多种回归模型。回归模型可以明确多种变量间的_关系_。这种模型可以用来预测类似长度、温度和年龄之类的值, 通过分析数据点来揭示变量之间的关系。 + 
+在本节的一系列课程中,你会学到线性回归和逻辑回归之间的区别,并且你将知道对于特定问题如何在这两种模型中进行选择 + +在这组课程中,你会准备好包括为管理笔记而设置VS Code、配置数据科学家常用的环境等机器学习的初始任务。你会开始上手Scikit-learn学习项目(一个机器学习的百科),并且你会以回归模型为主构建起你的第一种机器学习模型 + +> 这里有一些代码难度较低但很有用的工具可以帮助你学习使用回归模型。 试一下 [Azure ML for this task](https://docs.microsoft.com/learn/modules/create-regression-model-azure-machine-learning-designer/?WT.mc_id=academic-15963-cxa) + + +### Lessons + +1. [交易的工具](1-Tools/README.md) +2. [管理数据](2-Data/README.md) +3. [线性和多项式回归](3-Linear/README.md) +4. [逻辑回归](4-Logistic/README.md) + +--- +### Credits + +"机器学习中的回归" 由[Jen Looper](https://twitter.com/jenlooper)♥️ 撰写 + +♥️ 测试的贡献者: [Muhammad Sakib Khan Inan](https://twitter.com/Sakibinan) 和 [Ornella Altunyan](https://twitter.com/ornelladotcom) + +南瓜数据集受此启发 [this project on Kaggle](https://www.kaggle.com/usda/a-year-of-pumpkin-prices) 并且其数据源自 [Specialty Crops Terminal Markets Standard Reports](https://www.marketnews.usda.gov/mnp/fv-report-config-step1?type=termPrice) 由美国农业部上传分享。我们根据种类添加了围绕颜色的一些数据点。这些数据处在公共的域名上。 \ No newline at end of file From 7d9da49081ce7960872e84820c3adc205daa3b39 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=96=87=E4=BD=93=E4=B8=A4=E5=BC=80=E8=8A=B1=E7=94=9F?= <56857145+loap-a@users.noreply.github.com> Date: Wed, 14 Jul 2021 21:15:26 +0800 Subject: [PATCH 14/51] Delete Readme.zh-cn.md --- 2-Regression/Readme.zh-cn.md | 34 ---------------------------------- 1 file changed, 34 deletions(-) delete mode 100644 2-Regression/Readme.zh-cn.md diff --git a/2-Regression/Readme.zh-cn.md b/2-Regression/Readme.zh-cn.md deleted file mode 100644 index 4972c594..00000000 --- a/2-Regression/Readme.zh-cn.md +++ /dev/null @@ -1,34 +0,0 @@ -# 机器学习中的回归模型 -## 本节主题: 北美南瓜价格的回归模型 🎃 - -在北美,南瓜经常在万圣节被刻上吓人的鬼脸。让我们来深入研究一下这种奇妙的蔬菜 - -![jack-o-lanterns](../images/jack-o-lanterns.jpg) -> Photo by Beth Teutschmann on Unsplash - -##你会学到什么 - -这节的课程包括机器学习领域中的多种回归模型。回归模型可以明确多种变量间的_关系_。这种模型可以用来预测类似长度、温度和年龄之类的值, 通过分析数据点来揭示变量之间的关系。 - -在本节的一系列课程中,你会学到线性回归和逻辑回归之间的区别,并且你将知道对于特定问题如何在这两种模型中进行选择 - 
-在这组课程中,你会准备好包括为管理笔记而设置VS Code、配置数据科学家常用的环境等机器学习的初始任务。你会开始上手Scikit-learn学习项目(一个机器学习的百科),并且你会以回归模型为主构建起你的第一种机器学习模型 - -> 这里有一些代码难度较低但很有用的工具可以帮助你学习使用回归模型。 试一下 [Azure ML for this task](https://docs.microsoft.com/learn/modules/create-regression-model-azure-machine-learning-designer/?WT.mc_id=academic-15963-cxa) - - -### Lessons - -1. [交易的工具](1-Tools/README.md) -2. [管理数据](2-Data/README.md) -3. [线性和多项式回归](3-Linear/README.md) -4. [逻辑回归](4-Logistic/README.md) - ---- -### Credits - -"机器学习中的回归" 由[Jen Looper](https://twitter.com/jenlooper)♥️ 撰写 - -♥️ 测试的贡献者: [Muhammad Sakib Khan Inan](https://twitter.com/Sakibinan) 和 [Ornella Altunyan](https://twitter.com/ornelladotcom) - -南瓜数据集受此启发 [this project on Kaggle](https://www.kaggle.com/usda/a-year-of-pumpkin-prices) 并且其数据源自 [Specialty Crops Terminal Markets Standard Reports](https://www.marketnews.usda.gov/mnp/fv-report-config-step1?type=termPrice) 由美国农业部上传分享。我们根据种类添加了围绕颜色的一些数据点。这些数据处在公共的域名上。 \ No newline at end of file From f9f69077acc76f317c8c3ce3aa0e5837ccb2dc40 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=96=87=E4=BD=93=E4=B8=A4=E5=BC=80=E8=8A=B1=E7=94=9F?= <56857145+loap-a@users.noreply.github.com> Date: Wed, 14 Jul 2021 21:16:48 +0800 Subject: [PATCH 15/51] Add files via upload --- 2-Regression/translations/Readme.zh-cn.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/2-Regression/translations/Readme.zh-cn.md b/2-Regression/translations/Readme.zh-cn.md index 81c2c9d4..3802d540 100644 --- a/2-Regression/translations/Readme.zh-cn.md +++ b/2-Regression/translations/Readme.zh-cn.md @@ -3,8 +3,8 @@ 在北美,南瓜经常在万圣节被刻上吓人的鬼脸。让我们来深入研究一下这种奇妙的蔬菜 -![jack-o-lanterns](./images/jack-o-lanterns.jpg) -> Photo by Beth Teutschmann on Unsplash +![jack-o-lantern](../images/jack-o-lanterns.jpg) +> Foto oleh Beth Teutschmann di Unsplash ##你会学到什么 From 489eb93d5563f0f4e15a5cb23a62159bdc7d4747 Mon Sep 17 00:00:00 2001 From: "Charles Emmanuel S. 
Ndiaye" Date: Wed, 14 Jul 2021 15:35:28 +0000 Subject: [PATCH 16/51] Update links for future translated readme Update links for future translated readme --- 2-Regression/translations/README.fr.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/2-Regression/translations/README.fr.md b/2-Regression/translations/README.fr.md index eaed5756..1b252f3f 100644 --- a/2-Regression/translations/README.fr.md +++ b/2-Regression/translations/README.fr.md @@ -18,10 +18,10 @@ Dans ce groupe de leçons, vous serez préparé afin de commencer les tâches de ### Cours -1. [Outils du métier](1-Tools/README.md) -2. [Gestion des données](2-Data/README.md) -3. [Régression linéaire et polynomiale](3-Linear/README.md) -4. [Régression logistique](4-Logistic/README.md) +1. [Outils du métier](1-Tools/translations/README.fr.md) +2. [Gestion des données](2-Data/translations/README.fr.md) +3. [Régression linéaire et polynomiale](3-Linear/translations/README.fr.md) +4. [Régression logistique](4-Logistic/translations/README.fr.md) --- ### Crédits From c53bc86acdbfe218775faf6bf61903b216d9ba24 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=96=87=E4=BD=93=E4=B8=A4=E5=BC=80=E8=8A=B1=E7=94=9F?= <56857145+loap-a@users.noreply.github.com> Date: Wed, 14 Jul 2021 23:37:01 +0800 Subject: [PATCH 17/51] Rename Readme.zh-cn.md to Readme.zh-ch.md --- 2-Regression/translations/{Readme.zh-cn.md => Readme.zh-ch.md} | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) rename 2-Regression/translations/{Readme.zh-cn.md => Readme.zh-ch.md} (98%) diff --git a/2-Regression/translations/Readme.zh-cn.md b/2-Regression/translations/Readme.zh-ch.md similarity index 98% rename from 2-Regression/translations/Readme.zh-cn.md rename to 2-Regression/translations/Readme.zh-ch.md index 3802d540..f25a542a 100644 --- a/2-Regression/translations/Readme.zh-cn.md +++ b/2-Regression/translations/Readme.zh-ch.md @@ -31,4 +31,4 @@ ♥️ 测试的贡献者: [Muhammad Sakib Khan Inan](https://twitter.com/Sakibinan) 和 [Ornella 
Altunyan](https://twitter.com/ornelladotcom) -南瓜数据集受此启发 [this project on Kaggle](https://www.kaggle.com/usda/a-year-of-pumpkin-prices) 并且其数据源自 [Specialty Crops Terminal Markets Standard Reports](https://www.marketnews.usda.gov/mnp/fv-report-config-step1?type=termPrice) 由美国农业部上传分享。我们根据种类添加了围绕颜色的一些数据点。这些数据处在公共的域名上。 \ No newline at end of file +南瓜数据集受此启发 [this project on Kaggle](https://www.kaggle.com/usda/a-year-of-pumpkin-prices) 并且其数据源自 [Specialty Crops Terminal Markets Standard Reports](https://www.marketnews.usda.gov/mnp/fv-report-config-step1?type=termPrice) 由美国农业部上传分享。我们根据种类添加了围绕颜色的一些数据点。这些数据处在公共的域名上。 From 95aa36df517eff69aabc23a583e384ffced6708d Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=96=87=E4=BD=93=E4=B8=A4=E5=BC=80=E8=8A=B1=E7=94=9F?= <56857145+loap-a@users.noreply.github.com> Date: Wed, 14 Jul 2021 23:38:15 +0800 Subject: [PATCH 18/51] Update Readme.zh-ch.md --- 2-Regression/translations/Readme.zh-ch.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/2-Regression/translations/Readme.zh-ch.md b/2-Regression/translations/Readme.zh-ch.md index f25a542a..7ce096c3 100644 --- a/2-Regression/translations/Readme.zh-ch.md +++ b/2-Regression/translations/Readme.zh-ch.md @@ -19,10 +19,10 @@ ### Lessons -1. [交易的工具](1-Tools/README.md) -2. [管理数据](2-Data/README.md) -3. [线性和多项式回归](3-Linear/README.md) -4. [逻辑回归](4-Logistic/README.md) +1. [交易的工具](../1-Tools/README.md) +2. [管理数据](../2-Data/README.md) +3. [线性和多项式回归](../3-Linear/README.md) +4. 
[逻辑回归](../4-Logistic/README.md) --- ### Credits From 50a94e9ed8622fd6eb52489db602d710ba822f43 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=96=87=E4=BD=93=E4=B8=A4=E5=BC=80=E8=8A=B1=E7=94=9F?= <56857145+loap-a@users.noreply.github.com> Date: Wed, 14 Jul 2021 23:44:01 +0800 Subject: [PATCH 19/51] Rename Readme.zh-ch.md to README.zh-ch.md --- 2-Regression/translations/{Readme.zh-ch.md => README.zh-ch.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename 2-Regression/translations/{Readme.zh-ch.md => README.zh-ch.md} (100%) diff --git a/2-Regression/translations/Readme.zh-ch.md b/2-Regression/translations/README.zh-ch.md similarity index 100% rename from 2-Regression/translations/Readme.zh-ch.md rename to 2-Regression/translations/README.zh-ch.md From c2f2beed8e8bcbfbfb006dfc62322a54223c39d5 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=96=87=E4=BD=93=E4=B8=A4=E5=BC=80=E8=8A=B1=E7=94=9F?= <56857145+loap-a@users.noreply.github.com> Date: Wed, 14 Jul 2021 23:50:08 +0800 Subject: [PATCH 20/51] Update README.zh-ch.md --- 2-Regression/translations/README.zh-ch.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/2-Regression/translations/README.zh-ch.md b/2-Regression/translations/README.zh-ch.md index 7ce096c3..24c7a26c 100644 --- a/2-Regression/translations/README.zh-ch.md +++ b/2-Regression/translations/README.zh-ch.md @@ -19,10 +19,10 @@ ### Lessons -1. [交易的工具](../1-Tools/README.md) -2. [管理数据](../2-Data/README.md) -3. [线性和多项式回归](../3-Linear/README.md) -4. [逻辑回归](../4-Logistic/README.md) +1. [交易的工具](../1-Tools/translations/README.zh-cn.md) +2. [管理数据](../2-Data/translations/README.zh-cn.md) +3. [线性和多项式回归](../3-Linear/translations/README.zh-cn.md) +4. 
[逻辑回归](../4-Logistic/translations/README.zh-cn.md) --- ### Credits From 38c3dfa0c332e03c8e4cefb52c7af5c014b9237e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=96=87=E4=BD=93=E4=B8=A4=E5=BC=80=E8=8A=B1=E7=94=9F?= <56857145+loap-a@users.noreply.github.com> Date: Thu, 15 Jul 2021 01:19:49 +0800 Subject: [PATCH 21/51] rename --- 2-Regression/translations/{README.zh-ch.md => README.zh-cn.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename 2-Regression/translations/{README.zh-ch.md => README.zh-cn.md} (100%) diff --git a/2-Regression/translations/README.zh-ch.md b/2-Regression/translations/README.zh-cn.md similarity index 100% rename from 2-Regression/translations/README.zh-ch.md rename to 2-Regression/translations/README.zh-cn.md From 078846d8064d7fc2453e7c69a936b342426979bf Mon Sep 17 00:00:00 2001 From: edgargonarr <35715904+edgargonarr@users.noreply.github.com> Date: Wed, 14 Jul 2021 14:22:26 -0500 Subject: [PATCH 22/51] Update README.md The table for classification_report is not correctly aligned --- 4-Classification/2-Classifiers-1/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/4-Classification/2-Classifiers-1/README.md b/4-Classification/2-Classifiers-1/README.md index 15800922..bdff6bc9 100644 --- a/4-Classification/2-Classifiers-1/README.md +++ b/4-Classification/2-Classifiers-1/README.md @@ -217,7 +217,7 @@ Since you are using the multiclass case, you need to choose what _scheme_ to use print(classification_report(y_test,y_pred)) ``` - | precision | recall | f1-score | support | | | | | | | | | | | | | | | | | | | + | | precision | recall | f1-score| support | | | | | | | | | | | | | | | | | | | ------------ | ------ | -------- | ------- | ---- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | | chinese | 0.73 | 0.71 | 0.72 | 229 | | | | | | | | | | | | | | | | | | | indian | 0.91 | 0.93 | 0.92 | 254 | | | | | | | | | | | | | | | | | | From 
ce68d9bf2214b9194a85fb3ab25227bd9bd20381 Mon Sep 17 00:00:00 2001 From: Jen Looper Date: Wed, 14 Jul 2021 17:31:56 -0400 Subject: [PATCH 23/51] Update README.md --- quiz-app/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/quiz-app/README.md b/quiz-app/README.md index 042d53ca..83b30d1d 100644 --- a/quiz-app/README.md +++ b/quiz-app/README.md @@ -1,6 +1,6 @@ # Quizzes -These quizzes are the pre- and post-lecture quizzes for the web development for ml curriculum at https://aka.ms/ml-beginners +These quizzes are the pre- and post-lecture quizzes for the ML curriculum at https://aka.ms/ml-beginners ## Project setup From 1be8c54849e3213b590f3ad8a986271c6aea0a07 Mon Sep 17 00:00:00 2001 From: simplg <81249731+simplg@users.noreply.github.com> Date: Wed, 14 Jul 2021 23:38:49 +0200 Subject: [PATCH 24/51] Create README.fr.md --- .../1-intro-to-ML/translations/README.fr.md | 109 ++++++++++++++++++ 1 file changed, 109 insertions(+) create mode 100644 1-Introduction/1-intro-to-ML/translations/README.fr.md diff --git a/1-Introduction/1-intro-to-ML/translations/README.fr.md b/1-Introduction/1-intro-to-ML/translations/README.fr.md new file mode 100644 index 00000000..511f3764 --- /dev/null +++ b/1-Introduction/1-intro-to-ML/translations/README.fr.md @@ -0,0 +1,109 @@ +# Introduction au machine learning + +[![ML, AI, deep learning - Quelle est la différence ?](https://img.youtube.com/vi/lTd9RSxS9ZE/0.jpg)](https://youtu.be/lTd9RSxS9ZE "ML, AI, deep learning - What's the difference?") + +> 🎥 Cliquer sur l'image ci-dessus afin de regarder une vidéo expliquant la différence entre machine learning, AI et deep learning. + +## [Quizz de pré-conférence](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/1/) + +### Introduction + +Bienvenue à ce cours sur le machine learning classique pour débutant ! 
Que vous soyez complètement nouveau sur ce sujet ou que vous soyez un professonnel du ML expérimenté cherchant à peaufiner vos connaissances, nous sommes heureux de vous avoir avec nous ! Nous voulons créer un tremplin chaleureux pour vos études en ML et serions ravis d'évaluer, de répondre et d'apprendre de vos retours d'[expériences](https://github.com/microsoft/ML-For-Beginners/discussions). + +[![Introduction au ML](https://img.youtube.com/vi/h0e2HAPTGF4/0.jpg)](https://youtu.be/h0e2HAPTGF4 "Introduction to ML") + +> 🎥 Cliquer sur l'image ci-dessus afin de regarder une vidéo: John Guttag du MIT introduit le machine learning +### Débuter avec le machine learning + +Avant de commencer avec ce cours, vous aurez besoin d'un ordinateur configuré et prêt à faire tourner des notebooks (jupyter) localement. + +- **Configurer votre ordinateur avec ces vidéos**. Apprendre comment configurer votre ordinateur avec cette [série de vidéos](https://www.youtube.com/playlist?list=PLlrxD0HtieHhS8VzuMCfQD4uJ9yne1mE6). +- **Apprendre Python**. Il est aussi recommandé d'avoir une connaissance basique de [Python](https://docs.microsoft.com/learn/paths/python-language/?WT.mc_id=academic-15963-cxa), un langage de programmaton utile pour les data scientist que nous utilisons tout au long de ce cours. +- **Apprendre Node.js et Javascript**. Nous utilisons aussi Javascript par moment dans ce cours afin de construire des applications WEB, vous aurez donc besoin de [node](https://nodejs.org) et [npm](https://www.npmjs.com/) installé, ainsi que de [Visual Studio Code](https://code.visualstudio.com/) pour développer en Python et Javascript. +- **Créer un compte GitHub**. Comme vous nous avez trouvé sur [GitHub](https://github.com), vous y avez sûrement un compte, mais si non, créez en un et répliquez ce cours afin de l'utiliser à votre grés. (N'oublier pas de nous donner une étoile aussi 😊) +- **Explorer Scikit-learn**. 
Familiariser vous avec [Scikit-learn](https://scikit-learn.org/stable/user_guide.html), un ensemble de librairies ML que nous mentionnons dans nos leçons. + +### Qu'est-ce que le machine learning + +Le terme `machine learning` est un des mots les plus populaire et le plus utilisé ces derniers temps. Il y a une probabilité accrue que vous l'ayez entendu au moins une fois si vous avez une appétence pour la technologie indépendamment du domaine dans lequel vous travaillez. Le fonctionnement du machine learning, cependant, reste un mystère pour la plupart des personnes. Pour un débutant en machine learning, le sujet peut nous submerger. Ainsi, il est important de comprendre ce qu'est le machine learning et de l'apprendre petit à petit au travers d'exemples pratiques. + +![ml hype curve](images/hype.png) + +> Google Trends montre la récente 'courbe de popularité' pour le mot 'machine learning' + +Nous vivons dans un univers rempli de mystères fascinants. De grands scientifiques comme Stephen Hawking, Albert Einstein et pleins d'autres ont dévoués leur vie à la recherche d'informations utiles afin de dévoiler les mystères qui nous entourent. C'est la condition humaine pour apprendre : un enfant apprend de nouvelles choses et découvre la structure du monde année après année jusqu'à qu'ils deviennent adultes. + +Le cerveau d'un enfant et ses sens perçoivent l'environnement qui les entourent et apprennent graduellement des schémas secrets de la vie qui vont l'aider à fabriquer des règles logiques afin d'identifier les schémas appris. Le processus d'apprentissage du cerveau humain est ce que rend les hommes comme la créature la plus sophistiquée du monde vivant. Apprendre continuellement par la découverte de schémas cachés et ensuite innover sur ces schémas nous permet de nous améliorer tout au long de notre vie. 
Cette capacité d'apprendre et d'évoluer est liée au concept de [plasticité neuronale](https://www.simplypsychology.org/brain-plasticity.html), nous pouvons tirer quelques motivations similaires entre le processus d'apprentissage du cerveau humain et le concept de machine learning. + +Le [cerveau humain](https://www.livescience.com/29365-human-brain.html) perçoit des choses du monde réel, assimile les informations perçues, fait des décisions rationnelles et entreprend certaines actions selon le contexte. C'est ce que l'on appelle se comporter intelligemment. Lorsque nous programmons une reproduction du processus de ce comportement à une machine, c'est ce que l'on appelle intelligence artificielle (IA). + +Bien que le terme peut être confu, machine learning (ML) est un important sous-ensemble de l'intelligence artificielle. **ML se réfère à l'utilisation d'algorithmes spécialisés afin de découvrir des informations utiles et de trouver des schémas cachés depuis des données perçues pour corroborer un processus de décision rationnel**. + +![AI, ML, deep learning, data science](images/ai-ml-ds.png) + +> Un diagramme montrant les relations entre AI, ML, deep learning et data science. Infographie par [Jen Looper](https://twitter.com/jenlooper) et inspiré par [ce graphique](https://softwareengineering.stackexchange.com/questions/366996/distinction-between-ai-ml-neural-networks-deep-learning-and-data-mining) + +## Ce que vous allez apprendre dans ce cours + +Dans ce cours, nous allons nous concentrer sur les concepts clés du machine learning qu'un débutant se doit de connaître. Nous parlerons ce que l'on appelle le 'machine learning classique' en utilisant principalement Scikit-learn, une excellente librairie que beaucoup d'étudiants utilisent afin d'apprendre les bases. Afin de comprendre les concepts plus larges de l'intelligence artificielle ou du deep learning, une profonde connaissance en machine learning est indispensable, et c'est ce que nous aimerions fournir ici. 
+ +Dans ce cours, vous allez apprendre : + +- Les concepts clés du machine learning +- L'histoire du ML +- ML et équité (fairness) +- Les techniques de régression ML +- Les techniques de classification ML +- Les techniques de regroupement (clustering) ML +- Les techniques du traitement automatique des langues (NLP) ML +- Les techniques de prédictions à partir de séries chronologiques ML +- Apprentissage renforcé +- D'applications réels du ML + +## Ce que nous ne couvrirons pas + +- Deep learning +- Neural networks +- IA + +Afin d'avoir la meilleur expérience d'apprentissage, nous éviterons les complexités des réseaux neuronaux, du 'deep learning' (construire un modèle utilisant plusieurs couches de réseaux neuronaux) et IA, dont nous parlerons dans un cours différent. Nous offirons aussi un cours à venir sur la data science pour concentrer sur cet aspect de champs très large. + +## Pourquoi etudier le machine learning ? + +Le machine learning, depuis une perspective systémique, est défini comme la création de systèmes automatiques pouvant apprendre des schémas cachés depuis des données afin d'aider à prendre des décisions intelligentes. + +Ce but est faiblement inspiré de la manière dont le cerveau humain apprend certaines choses depuis les données qu'il perçoit du monde extérieur. + +✅ Penser une minute aux raisons qu'une entreprise aurait d'essayer d'utiliser des stratégies de machine learning au lieu de créer des règles codés en dur. + +### Les applications du machine learning + +Les applications du machine learning sont maintenant pratiquement partout, et sont aussi omniprésentes que les données qui circulent autour de notre société (générés par nos smartphones, appareils connectés ou autres systèmes). 
En prenant en considération l'immense potentiel des algorithmes dernier cri de machine learning, les chercheurs ont pu exploités leurs capacités afin de résoudre des problèmes multidimensionnels et interdisciplinaires de la vie avec d'important retours positifs + +**Vous pouvez utiliser le machine learning de plusieurs manières** : + +- Afin de prédire la possibilité d'avoir une maladie à partir des données médicales d'un patient. +- Pour tirer parti des données météorologiques afin de prédire les événements météorologiques. +- Afin de comprendre le sentiment d'un texte. +- Afin de détecter les fake news pour stopper la propagation de la propagande. + +La finance, l'économie, les sciences de la terre, l'exploration spatiale, le génie biomédical, les sciences cognitives et même les domaines des sciences humaines ont adapté le machine learning pour résoudre les problèmes ardus et lourds de traitement des données dans leur domaine respectif. + +Le machine learning automatise le processus de découverte de modèles en trouvant des informations significatives à partir de données réelles ou générées. Il s'est avéré très utile dans les applications commerciales, de santé et financières, entre autres. + +Dans un avenir proche, comprendre les bases du machine learning sera indispensable pour les personnes de tous les domaines en raison de son adoption généralisée. + +--- +## 🚀 Challenge + +Esquisser, sur papier ou à l'aide d'une application en ligne comme [Excalidraw](https://excalidraw.com/), votre compréhension des différences entre l'IA, le ML, le deep learning et la data science. Ajouter quelques idées de problèmes que chacune de ces techniques est bonne à résoudre. 
+ +## [Quizz de post-conférence](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/2/) + +## Révision et auto-apprentissage + +Pour en savoir plus sur la façon dont vous pouvez utiliser les algorithmes de ML dans le cloud, suivez ce [Parcours d'apprentissage](https://docs.microsoft.com/learn/paths/create-no-code-predictive-models-azure-machine-learning/?WT.mc_id=academic-15963-cxa). + +## Devoir + +[Être opérationnel](assignment.md) From 9439709d5b2a3a1311d6fbbd3f44bb56ae53112b Mon Sep 17 00:00:00 2001 From: simplg <81249731+simplg@users.noreply.github.com> Date: Wed, 14 Jul 2021 23:47:41 +0200 Subject: [PATCH 25/51] Editing assignment link --- 1-Introduction/1-intro-to-ML/translations/README.fr.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/1-Introduction/1-intro-to-ML/translations/README.fr.md b/1-Introduction/1-intro-to-ML/translations/README.fr.md index 511f3764..19af588d 100644 --- a/1-Introduction/1-intro-to-ML/translations/README.fr.md +++ b/1-Introduction/1-intro-to-ML/translations/README.fr.md @@ -106,4 +106,4 @@ Pour en savoir plus sur la façon dont vous pouvez utiliser les algorithmes de M ## Devoir -[Être opérationnel](assignment.md) +[Être opérationnel](../assignment.md) From 5775c4b0690771414a7bdfb4f013beaaafa2df7f Mon Sep 17 00:00:00 2001 From: Jen Looper Date: Wed, 14 Jul 2021 20:16:08 -0400 Subject: [PATCH 26/51] table tidy-up --- 4-Classification/2-Classifiers-1/README.md | 66 +++++++++++----------- 1 file changed, 33 insertions(+), 33 deletions(-) diff --git a/4-Classification/2-Classifiers-1/README.md b/4-Classification/2-Classifiers-1/README.md index bdff6bc9..0db1aeba 100644 --- a/4-Classification/2-Classifiers-1/README.md +++ b/4-Classification/2-Classifiers-1/README.md @@ -21,15 +21,14 @@ Assuming you completed [Lesson 1](../1-Introduction/README.md), make sure that a The data looks like this: - ```output - | | Unnamed: 0 | cuisine | almond | angelica | anise | anise_seed | apple | apple_brandy | apricot | armagnac
| ... | whiskey | white_bread | white_wine | whole_grain_wheat_flour | wine | wood | yam | yeast | yogurt | zucchini | - | --- | ---------- | ------- | ------ | -------- | ----- | ---------- | ----- | ------------ | ------- | -------- | --- | ------- | ----------- | ---------- | ----------------------- | ---- | ---- | --- | ----- | ------ | -------- | - | 0 | 0 | indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | - | 1 | 1 | indian | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | - | 2 | 2 | indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | - | 3 | 3 | indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | - | 4 | 4 | indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | - ``` +| | Unnamed: 0 | cuisine | almond | angelica | anise | anise_seed | apple | apple_brandy | apricot | armagnac | ... | whiskey | white_bread | white_wine | whole_grain_wheat_flour | wine | wood | yam | yeast | yogurt | zucchini | +| --- | ---------- | ------- | ------ | -------- | ----- | ---------- | ----- | ------------ | ------- | -------- | --- | ------- | ----------- | ---------- | ----------------------- | ---- | ---- | --- | ----- | ------ | -------- | +| 0 | 0 | indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | +| 1 | 1 | indian | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | +| 2 | 2 | indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | +| 3 | 3 | indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | +| 4 | 4 | indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | + 1. 
Now, import several more libraries: @@ -68,13 +67,13 @@ Assuming you completed [Lesson 1](../1-Introduction/README.md), make sure that a Your features look like this: - | almond | angelica | anise | anise_seed | apple | apple_brandy | apricot | armagnac | artemisia | artichoke | ... | whiskey | white_bread | white_wine | whole_grain_wheat_flour | wine | wood | yam | yeast | yogurt | zucchini | | - | -----: | -------: | ----: | ---------: | ----: | -----------: | ------: | -------: | --------: | --------: | ---: | ------: | ----------: | ---------: | ----------------------: | ---: | ---: | ---: | ----: | -----: | -------: | --- | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | - | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | - | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | - | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | - | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | +| almond | angelica | anise | anise_seed | apple | apple_brandy | apricot | armagnac | artemisia | artichoke | ... | whiskey | white_bread | white_wine | whole_grain_wheat_flour | wine | wood | yam | yeast | yogurt | zucchini | +| -----: | -------: | ----: | ---------: | ----: | -----------: | ------: | -------: | --------: | --------: | ---: | ------: | ----------: | ---------: | ----------------------: | ---: | ---: | ---: | ----: | -----: | -------: | +| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | +| 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | +| 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | +| 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | +| 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... 
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | Now you are ready to train your model! @@ -200,13 +199,13 @@ Since you are using the multiclass case, you need to choose what _scheme_ to use The result is printed - Indian cuisine is its best guess, with good probability: - | | 0 | | | | | | | | | | | | | | | | | | | | | - | -------: | -------: | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | - | indian | 0.715851 | | | | | | | | | | | | | | | | | | | | | - | chinese | 0.229475 | | | | | | | | | | | | | | | | | | | | | - | japanese | 0.029763 | | | | | | | | | | | | | | | | | | | | | - | korean | 0.017277 | | | | | | | | | | | | | | | | | | | | | - | thai | 0.007634 | | | | | | | | | | | | | | | | | | | | | + | | 0 | + | -------: | -------: | + | indian | 0.715851 | + | chinese | 0.229475 | + | japanese | 0.029763 | + | korean | 0.017277 | + | thai | 0.007634 | ✅ Can you explain why the model is pretty sure this is an Indian cuisine? 
@@ -217,22 +216,23 @@ Since you are using the multiclass case, you need to choose what _scheme_ to use print(classification_report(y_test,y_pred)) ``` - | | precision | recall | f1-score| support | | | | | | | | | | | | | | | | | | - | ------------ | ------ | -------- | ------- | ---- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | - | chinese | 0.73 | 0.71 | 0.72 | 229 | | | | | | | | | | | | | | | | | | - | indian | 0.91 | 0.93 | 0.92 | 254 | | | | | | | | | | | | | | | | | | - | japanese | 0.70 | 0.75 | 0.72 | 220 | | | | | | | | | | | | | | | | | | - | korean | 0.86 | 0.76 | 0.81 | 242 | | | | | | | | | | | | | | | | | | - | thai | 0.79 | 0.85 | 0.82 | 254 | | | | | | | | | | | | | | | | | | - | accuracy | 0.80 | 1199 | | | | | | | | | | | | | | | | | | | | - | macro avg | 0.80 | 0.80 | 0.80 | 1199 | | | | | | | | | | | | | | | | | | - | weighted avg | 0.80 | 0.80 | 0.80 | 1199 | | | | | | | | | | | | | | | | | | + | | precision | recall | f1-score | support | + | ------------ | ------ | -------- | ------- | ---- | + | chinese | 0.73 | 0.71 | 0.72 | 229 | + | indian | 0.91 | 0.93 | 0.92 | 254 | + | japanese | 0.70 | 0.75 | 0.72 | 220 | + | korean | 0.86 | 0.76 | 0.81 | 242 | + | thai | 0.79 | 0.85 | 0.82 | 254 | + | accuracy | 0.80 | 1199 | | | + | macro avg | 0.80 | 0.80 | 0.80 | 1199 | + | weighted avg | 0.80 | 0.80 | 0.80 | 1199 | ## 🚀Challenge In this lesson, you used your cleaned data to build a machine learning model that can predict a national cuisine based on a series of ingredients. Take some time to read through the many options Scikit-learn provides to classify data. Dig deeper into the concept of 'solver' to understand what goes on behind the scenes. 
## [Post-lecture quiz](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/22/) + ## Review & Self Study Dig a little more into the math behind logistic regression in [this lesson](https://people.eecs.berkeley.edu/~russell/classes/cs194/f11/lectures/CS194%20Fall%202011%20Lecture%2006.pdf) From 59cefad031d40dba0abdc293a4ed912a492934ff Mon Sep 17 00:00:00 2001 From: ahaliu1 <247969917@qq.com> Date: Thu, 15 Jul 2021 11:07:53 +0800 Subject: [PATCH 27/51] Fix two path error --- 6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md b/6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md index 75cd69da..3d122be6 100644 --- a/6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md +++ b/6-NLP/1-Introduction-to-NLP/translations/README.zh-cn.md @@ -125,7 +125,7 @@ It was nice talking to you, goodbye! ``` - 该任务的一种可能解决方案在[这里](solution/bot.py) + 该任务的一种可能解决方案在[这里](../solution/bot.py) ✅ 停止并思考 @@ -152,4 +152,4 @@ ## 任务 -[查找一个机器人](assignment.md) +[查找一个机器人](../assignment.md) From 4b18d4f145dbe5c584da60ee82993e0610ff439d Mon Sep 17 00:00:00 2001 From: ahaliu1 <247969917@qq.com> Date: Thu, 15 Jul 2021 11:08:36 +0800 Subject: [PATCH 28/51] Fix two file path error --- 6-NLP/1-Introduction-to-NLP/README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/6-NLP/1-Introduction-to-NLP/README.md b/6-NLP/1-Introduction-to-NLP/README.md index 227ad589..ea244c74 100644 --- a/6-NLP/1-Introduction-to-NLP/README.md +++ b/6-NLP/1-Introduction-to-NLP/README.md @@ -133,7 +133,7 @@ Let's create the bot next. We'll start by defining some phrases. It was nice talking to you, goodbye! ``` - One possible solution to the task is [here](solution/bot.py) + One possible solution to the task is [here](../solution/bot.py) ✅ Stop and consider @@ -162,4 +162,4 @@ Take a look at the references below as further reading opportunities. 
## Assignment -[Search for a bot](assignment.md) +[Search for a bot](../assignment.md) From 27d16669d59fc09c8af6b651ff4bf9414b37029d Mon Sep 17 00:00:00 2001 From: Vishvanathan K Date: Thu, 15 Jul 2021 10:24:23 +0530 Subject: [PATCH 29/51] Update Markdown error --- 8-Reinforcement/1-QLearning/README.md | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/8-Reinforcement/1-QLearning/README.md b/8-Reinforcement/1-QLearning/README.md index bfa07ffe..6301c46e 100644 --- a/8-Reinforcement/1-QLearning/README.md +++ b/8-Reinforcement/1-QLearning/README.md @@ -229,8 +229,7 @@ We are now ready to implement the learning algorithm. Before we do that, we also We add a few `eps` to the original vector in order to avoid division by 0 in the initial case, when all components of the vector are identical. Run them learning algorithm through 5000 experiments, also called **epochs**: (code block 8) - - ```python +```python for epoch in range(5000): # Pick initial point @@ -255,11 +254,11 @@ Run them learning algorithm through 5000 experiments, also called **epochs**: (c ai = action_idx[a] Q[x,y,ai] = (1 - alpha) * Q[x,y,ai] + alpha * (r + gamma * Q[x+dpos[0], y+dpos[1]].max()) n+=1 - ``` +``` - After executing this algorithm, the Q-Table should be updated with values that define the attractiveness of different actions at each step. We can try to visualize the Q-Table by plotting a vector at each cell that will point in the desired direction of movement. For simplicity, we draw a small circle instead of an arrow head. +After executing this algorithm, the Q-Table should be updated with values that define the attractiveness of different actions at each step. We can try to visualize the Q-Table by plotting a vector at each cell that will point in the desired direction of movement. For simplicity, we draw a small circle instead of an arrow head. 
- + ## Checking the policy From 829ac30282d356b1b4b601452c1ab48cd1da390e Mon Sep 17 00:00:00 2001 From: simplg <81249731+simplg@users.noreply.github.com> Date: Thu, 15 Jul 2021 09:30:48 +0200 Subject: [PATCH 30/51] Changed url for images --- 1-Introduction/1-intro-to-ML/translations/README.fr.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/1-Introduction/1-intro-to-ML/translations/README.fr.md b/1-Introduction/1-intro-to-ML/translations/README.fr.md index 19af588d..a65367b5 100644 --- a/1-Introduction/1-intro-to-ML/translations/README.fr.md +++ b/1-Introduction/1-intro-to-ML/translations/README.fr.md @@ -27,7 +27,7 @@ Avant de commencer avec ce cours, vous aurez besoin d'un ordinateur configuré e Le terme `machine learning` est un des mots les plus populaire et le plus utilisé ces derniers temps. Il y a une probabilité accrue que vous l'ayez entendu au moins une fois si vous avez une appétence pour la technologie indépendamment du domaine dans lequel vous travaillez. Le fonctionnement du machine learning, cependant, reste un mystère pour la plupart des personnes. Pour un débutant en machine learning, le sujet peut nous submerger. Ainsi, il est important de comprendre ce qu'est le machine learning et de l'apprendre petit à petit au travers d'exemples pratiques. -![ml hype curve](images/hype.png) +![ml hype curve](../images/hype.png) > Google Trends montre la récente 'courbe de popularité' pour le mot 'machine learning' @@ -39,7 +39,7 @@ Le [cerveau humain](https://www.livescience.com/29365-human-brain.html) perçoit Bien que le terme peut être confu, machine learning (ML) est un important sous-ensemble de l'intelligence artificielle. **ML se réfère à l'utilisation d'algorithmes spécialisés afin de découvrir des informations utiles et de trouver des schémas cachés depuis des données perçues pour corroborer un processus de décision rationnel**. 
-![AI, ML, deep learning, data science](images/ai-ml-ds.png) +![AI, ML, deep learning, data science](../images/ai-ml-ds.png) > Un diagramme montrant les relations entre AI, ML, deep learning et data science. Infographie par [Jen Looper](https://twitter.com/jenlooper) et inspiré par [ce graphique](https://softwareengineering.stackexchange.com/questions/366996/distinction-between-ai-ml-neural-networks-deep-learning-and-data-mining) From 091d096eb19e3d8229e9556e3217c6c7823bb7fa Mon Sep 17 00:00:00 2001 From: simplg <81249731+simplg@users.noreply.github.com> Date: Thu, 15 Jul 2021 09:33:50 +0200 Subject: [PATCH 31/51] Corrected Quizz to Quiz --- 1-Introduction/1-intro-to-ML/translations/README.fr.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/1-Introduction/1-intro-to-ML/translations/README.fr.md b/1-Introduction/1-intro-to-ML/translations/README.fr.md index a65367b5..828e7c62 100644 --- a/1-Introduction/1-intro-to-ML/translations/README.fr.md +++ b/1-Introduction/1-intro-to-ML/translations/README.fr.md @@ -4,7 +4,7 @@ > 🎥 Cliquer sur l'image ci-dessus afin de regarder une vidéo expliquant la différence entre machine learning, AI et deep learning. -## [Quizz de pré-conférence](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/1/) +## [Quiz de pré-conférence](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/1/) ### Introduction @@ -98,7 +98,7 @@ Dans un avenir proche, comprendre les bases du machine learning sera indispensab Esquisser, sur papier ou à l'aide d'une application en ligne comme [Excalidraw](https://excalidraw.com/), votre compréhension des différences entre l'IA, le ML, le deep learning et la data science. Ajouter quelques idées de problèmes que chacune de ces techniques est bonne à résoudre. 
-## [Quizz de post-conférence](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/2/) +## [Quiz de post-conférence](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/2/) ## Révision et auto-apprentissage From a0c925fdaf878c8c1017fdef25b339297b9e9c38 Mon Sep 17 00:00:00 2001 From: simplg <81249731+simplg@users.noreply.github.com> Date: Thu, 15 Jul 2021 12:22:53 +0200 Subject: [PATCH 32/51] Correction de traduction MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * Post-conférence -> Postlecture * Pré-conférence -> Prélecture * Hidden patterns -> Schémas non observés Comme discuté dans le groupe --- 1-Introduction/1-intro-to-ML/translations/README.fr.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/1-Introduction/1-intro-to-ML/translations/README.fr.md b/1-Introduction/1-intro-to-ML/translations/README.fr.md index 828e7c62..fd396d69 100644 --- a/1-Introduction/1-intro-to-ML/translations/README.fr.md +++ b/1-Introduction/1-intro-to-ML/translations/README.fr.md @@ -4,7 +4,7 @@ > 🎥 Cliquer sur l'image ci-dessus afin de regarder une vidéo expliquant la différence entre machine learning, AI et deep learning. -## [Quiz de pré-conférence](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/1/) +## [Quiz prélecture](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/1/) ### Introduction @@ -33,11 +33,11 @@ Le terme `machine learning` est un des mots les plus populaire et le plus utilis Nous vivons dans un univers rempli de mystères fascinants. De grands scientifiques comme Stephen Hawking, Albert Einstein et pleins d'autres ont dévoués leur vie à la recherche d'informations utiles afin de dévoiler les mystères qui nous entourent. C'est la condition humaine pour apprendre : un enfant apprend de nouvelles choses et découvre la structure du monde année après année jusqu'à qu'ils deviennent adultes. 
-Le cerveau d'un enfant et ses sens perçoivent l'environnement qui les entourent et apprennent graduellement des schémas secrets de la vie qui vont l'aider à fabriquer des règles logiques afin d'identifier les schémas appris. Le processus d'apprentissage du cerveau humain est ce que rend les hommes comme la créature la plus sophistiquée du monde vivant. Apprendre continuellement par la découverte de schémas cachés et ensuite innover sur ces schémas nous permet de nous améliorer tout au long de notre vie. Cette capacité d'apprendre et d'évoluer est liée au concept de [plasticité neuronale](https://www.simplypsychology.org/brain-plasticity.html), nous pouvons tirer quelques motivations similaires entre le processus d'apprentissage du cerveau humain et le concept de machine learning. +Le cerveau d'un enfant et ses sens perçoivent l'environnement qui les entourent et apprennent graduellement des schémas non observés de la vie qui vont l'aider à fabriquer des règles logiques afin d'identifier les schémas appris. Le processus d'apprentissage du cerveau humain est ce que rend les hommes comme la créature la plus sophistiquée du monde vivant. Apprendre continuellement par la découverte de schémas non observés et ensuite innover sur ces schémas nous permet de nous améliorer tout au long de notre vie. Cette capacité d'apprendre et d'évoluer est liée au concept de [plasticité neuronale](https://www.simplypsychology.org/brain-plasticity.html), nous pouvons tirer quelques motivations similaires entre le processus d'apprentissage du cerveau humain et le concept de machine learning. Le [cerveau humain](https://www.livescience.com/29365-human-brain.html) perçoit des choses du monde réel, assimile les informations perçues, fait des décisions rationnelles et entreprend certaines actions selon le contexte. C'est ce que l'on appelle se comporter intelligemment. 
Lorsque nous programmons une reproduction du processus de ce comportement à une machine, c'est ce que l'on appelle intelligence artificielle (IA). -Bien que le terme peut être confu, machine learning (ML) est un important sous-ensemble de l'intelligence artificielle. **ML se réfère à l'utilisation d'algorithmes spécialisés afin de découvrir des informations utiles et de trouver des schémas cachés depuis des données perçues pour corroborer un processus de décision rationnel**. +Bien que le terme peut être confu, machine learning (ML) est un important sous-ensemble de l'intelligence artificielle. **ML se réfère à l'utilisation d'algorithmes spécialisés afin de découvrir des informations utiles et de trouver des schémas non observés depuis des données perçues pour corroborer un processus de décision rationnel**. ![AI, ML, deep learning, data science](../images/ai-ml-ds.png) @@ -70,7 +70,7 @@ Afin d'avoir la meilleur expérience d'apprentissage, nous éviterons les comple ## Pourquoi etudier le machine learning ? -Le machine learning, depuis une perspective systémique, est défini comme la création de systèmes automatiques pouvant apprendre des schémas cachés depuis des données afin d'aider à prendre des décisions intelligentes. +Le machine learning, depuis une perspective systémique, est défini comme la création de systèmes automatiques pouvant apprendre des schémas non observés depuis des données afin d'aider à prendre des décisions intelligentes. Ce but est faiblement inspiré de la manière dont le cerveau humain apprend certaines choses depuis les données qu'il perçoit du monde extérieur. @@ -98,7 +98,7 @@ Dans un avenir proche, comprendre les bases du machine learning sera indispensab Esquisser, sur papier ou à l'aide d'une application en ligne comme [Excalidraw](https://excalidraw.com/), votre compréhension des différences entre l'IA, le ML, le deep learning et la data science. Ajouter quelques idées de problèmes que chacune de ces techniques est bonne à résoudre. 
-## [Quiz de post-conférence](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/2/) +## [Quiz postlecture](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/2/) ## Révision et auto-apprentissage From bd08fb25c2e3f49a95aefabfd976ea77d44775d6 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=96=87=E4=BD=93=E4=B8=A4=E5=BC=80=E8=8A=B1=E7=94=9F?= <56857145+loap-a@users.noreply.github.com> Date: Thu, 15 Jul 2021 19:04:56 +0800 Subject: [PATCH 33/51] Add files via upload --- .../translations/README.zh-cn.md | 294 ++++++++++++++++++ 1 file changed, 294 insertions(+) create mode 100644 4-Classification/1-Introduction/translations/README.zh-cn.md diff --git a/4-Classification/1-Introduction/translations/README.zh-cn.md b/4-Classification/1-Introduction/translations/README.zh-cn.md new file mode 100644 index 00000000..d85c266c --- /dev/null +++ b/4-Classification/1-Introduction/translations/README.zh-cn.md @@ -0,0 +1,294 @@ +# 对分类方法的介绍 + +在这四节课程中,你将会学习机器学习中一个基本的重点 - _分类_. 我们会在关于亚洲和印度的神奇的美食的数据集上尝试使用多种分类算法。希望你有点饿了。 + +![一个桃子!](../images/pinch.png) + +>在学习的课程中赞叹泛亚地区的美食吧! 图片由 [Jen Looper](https://twitter.com/jenlooper)提供 + +分类算法是[监督学习](https://wikipedia.org/wiki/Supervised_learning) 的一种。它与回归算法在很多方面都有相同之处。如果机器学习所有的目标都是使用数据集来预测数值或物品的名字,那么分类算法通常可以分为两类 _二元分类_ 和 _多元分类_。 + +[![对分类算法的介绍](https://img.youtube.com/vi/eg8DJYwdMyg/0.jpg)](https://youtu.be/eg8DJYwdMyg "对分类算法的介绍") + +> 🎥 点击上方给的图片可以跳转到一个视频-MIT的John对分类算法的介绍 + +请记住: + +- **线性回归** 帮助你预测变量之间的关系并对一个新的数据点会落在哪条线上做出精确的预测。因此,你可以预测 _南瓜在九月的价格和十月的价格_。 +- **逻辑回归** 帮助你发现“二元范畴”:即在当前这个价格, _这个南瓜是不是橙色_? + +分类方法采用多种算法来确定其他可以用来确定一个数据点的标签或类别的方法。让我们来研究一下这个数据集,看看我们能否通过观察菜肴的原料来确定它的源头。 + +## [课程前的小问题](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/19/) + +分类是机器学习研究者和数据科学家使用的一种基本方法。从基本的二元分类(这是不是一份垃圾邮件?)到复杂的图片分类和使用计算机视觉的分割技术,它都是将数据分类并提出相关问题的有效工具。 + +![二元分类 vs 多元分类](../images/binary-multiclass.png) + +> 需要分类算法解决的二元分类和多元分类问题的对比. 
信息图由[Jen Looper](https://twitter.com/jenlooper)提供 + +在开始清洗数据、数据可视化和调整数据以适应机器学习的任务前,让我们来了解一下多种可用来数据分类的机器学习方法。 + +派生自[统计数学](https://wikipedia.org/wiki/Statistical_classification),分类算法使用经典的机器学习的一些特征,比如通过'吸烟者'、'体重'和'年龄'来推断 _罹患某种疾病的可能性_。作为一个与你刚刚实践过的回归算法很相似的监督学习算法,你的数据是被标记过的并且算法通过采集这些标签来进行分类和预测并进行输出。 + +✅ 花一点时间来想象一下一个关于菜肴的数据集。一个多元分类的模型应该能回答什么问题?一个二元分类的模型又应该能回答什么?如果你想确定一个给定的菜肴是否会用到葫芦巴(一种植物,种子用来调味)该怎么做?如果你想知道给你一个装满了八角茴香、花椰菜和辣根的购物袋你能否做出一道代表性的印度菜又该怎么做? + +[![Crazy mystery baskets](https://img.youtube.com/vi/GuTeDbaNoEU/0.jpg)](https://youtu.be/GuTeDbaNoEU "疯狂的神秘篮子") + +> 🎥 点击图像观看视频。整个'Chopped'节目的前提都是建立在神秘的篮子上,在这个节目中厨师必须利用随机给定的食材做菜。可见一个机器学习模型能起到不小的作用 + +## 初见-分类器 + +我们关于这个菜肴数据集想要提出的问题其实是一个 **多元问题**,因为我们有很多潜在的具有代表性的菜肴。给定一系列食材数据,数据能够符合这些类别中的哪一类? + +Scikit-learn项目提供多种对数据进行分类的算法,你需要根据问题的具体类型来进行选择。在下两节课程中你会学到这些算法中的几个。 + +## 练习 - 清洗并平衡你的数据 + +在你开始进行这个项目前的第一个上手的任务就是清洗和 **平衡**你的数据来得到更好的结果。从当前目录的根目录中的 _notebook.ipynb_ 开始。 + +第一个需要安装的东西是 [imblearn](https://imbalanced-learn.org/stable/)这是一个Scikit-learn项目中的一个包,它可以让你更好的平衡数据 (关于这个任务你很快就会学到更多)。 + +1. 安装 `imblearn`, 运行命令 `pip install`: + + ```python + pip install imblearn + ``` + +1. 为了导入和可视化数据你需要导入下面的这些包, 你还需要从`imblearn`导入`SMOTE` + + ```python + import pandas as pd + import matplotlib.pyplot as plt + import matplotlib as mpl + import numpy as np + from imblearn.over_sampling import SMOTE + ``` + + 现在你已经准备好导入数据了。 + +1. 下一项任务是导入数据: + + ```python + df = pd.read_csv('../data/cuisines.csv') + ``` + + 使用函数 `read_csv()` 会读取csv文件的内容 _cuisines.csv_ 并将内容放置在 变量`df`中。 + +1. 检查数据的形状是否正确: + + ```python + df.head() + ``` + + 前五行输出应该是这样的: + + ```output + | | Unnamed: 0 | cuisine | almond | angelica | anise | anise_seed | apple | apple_brandy | apricot | armagnac | ...
| whiskey | white_bread | white_wine | whole_grain_wheat_flour | wine | wood | yam | yeast | yogurt | zucchini | + | --- | ---------- | ------- | ------ | -------- | ----- | ---------- | ----- | ------------ | ------- | -------- | --- | ------- | ----------- | ---------- | ----------------------- | ---- | ---- | --- | ----- | ------ | -------- | + | 0 | 65 | indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | + | 1 | 66 | indian | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | + | 2 | 67 | indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | + | 3 | 68 | indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | + | 4 | 69 | indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | + ``` + +1. 调用函数 `info()` 可以获得有关这个数据集的信息: + + ```python + df.info() + ``` + + Your output resembles: + + ```output + <class 'pandas.core.frame.DataFrame'> + RangeIndex: 2448 entries, 0 to 2447 + Columns: 385 entries, Unnamed: 0 to zucchini + dtypes: int64(384), object(1) + memory usage: 7.2+ MB + ``` + + ## 练习 - 了解这些菜肴 + +现在任务变得更有趣了,让我们来探索如何将数据分配给各个菜肴 + +1. 调用函数 `barh()`可以绘制出数据的条形图: + + ```python + df.cuisine.value_counts().plot.barh() + ``` + + ![菜肴数据分配](../images/cuisine-dist.png) + + 这里有有限的一些菜肴,但是数据的分配是不平均的。但是你可以修正这一现象!在这样做之前再稍微探索一下。 + +1.
找出对于每个菜肴有多少数据是有效的并将其打印出来: + + ```python + thai_df = df[(df.cuisine == "thai")] + japanese_df = df[(df.cuisine == "japanese")] + chinese_df = df[(df.cuisine == "chinese")] + indian_df = df[(df.cuisine == "indian")] + korean_df = df[(df.cuisine == "korean")] + + print(f'thai df: {thai_df.shape}') + print(f'japanese df: {japanese_df.shape}') + print(f'chinese df: {chinese_df.shape}') + print(f'indian df: {indian_df.shape}') + print(f'korean df: {korean_df.shape}') + ``` + + 输出应该是这样的 : + + ```output + thai df: (289, 385) + japanese df: (320, 385) + chinese df: (442, 385) + indian df: (598, 385) + korean df: (799, 385) + ``` +## 探索有关食材的内容 + +现在你可以在数据中探索的更深一点并了解每道菜肴的代表性食材。你需要将反复出现的、容易造成混淆的数据清理出去,那么让我们来学习解决这个问题。 + +1. 在Python中创建一个函数 `create_ingredient_df()` 来创建一个食材的数据帧。这个函数会去掉数据中无用的列并按食材的数量进行分类。 + + ```python + def create_ingredient_df(df): + ingredient_df = df.T.drop(['cuisine','Unnamed: 0']).sum(axis=1).to_frame('value') + ingredient_df = ingredient_df[(ingredient_df.T != 0).any()] + ingredient_df = ingredient_df.sort_values(by='value', ascending=False, + inplace=False) + return ingredient_df + ``` +现在你可以使用这个函数来得到理想的每道菜肴最重要的10种食材。 + +1. 调用函数 `create_ingredient_df()` 然后通过函数`barh()`来绘制图像: + + ```python + thai_ingredient_df = create_ingredient_df(thai_df) + thai_ingredient_df.head(10).plot.barh() + ``` + + ![thai](../images/thai.png) + +1. 对日本的数据进行相同的操作: + + ```python + japanese_ingredient_df = create_ingredient_df(japanese_df) + japanese_ingredient_df.head(10).plot.barh() + ``` + + ![日本](../images/japanese.png) + +1. 现在处理中国的数据: + + ```python + chinese_ingredient_df = create_ingredient_df(chinese_df) + chinese_ingredient_df.head(10).plot.barh() + ``` + + ![中国](../images/chinese.png) + +1. 绘制印度食材的数据: + + ```python + indian_ingredient_df = create_ingredient_df(indian_df) + indian_ingredient_df.head(10).plot.barh() + ``` + + ![印度](../images/indian.png) + +1.
最后,绘制韩国的食材的数据: + + ```python + korean_ingredient_df = create_ingredient_df(korean_df) + korean_ingredient_df.head(10).plot.barh() + ``` + + ![韩国](../images/korean.png) + +1. 现在,去除在不同的菜肴间最普遍的容易造成混乱的食材,调用函数 `drop()`: + + 大家都喜欢米饭、大蒜和生姜 + + ```python + feature_df= df.drop(['cuisine','Unnamed: 0','rice','garlic','ginger'], axis=1) + labels_df = df.cuisine #.unique() + feature_df.head() + ``` + +## 平衡数据集 + +现在你已经清理过数据集了, 使用 [SMOTE](https://imbalanced-learn.org/dev/references/generated/imblearn.over_sampling.SMOTE.html) - "Synthetic Minority Over-sampling Technique" - 来平衡数据集。 + +1. 调用函数 `fit_resample()`, 此方法通过插入数据来生成新的样本 + + ```python + oversample = SMOTE() + transformed_feature_df, transformed_label_df = oversample.fit_resample(feature_df, labels_df) + ``` + + 通过对数据集的平衡,当你对数据进行分类时能够得到更好的结果。现在考虑一个二元分类的问题,如果你的数据集中的大部分数据都属于其中一个类别,那么机器学习的模型就会因为在那个类别的数据更多而判断那个类别更为常见。平衡数据能够去除不公平的数据点。 + +1. 现在你可以查看每个食材的标签数量: + + ```python + print(f'new label count: {transformed_label_df.value_counts()}') + print(f'old label count: {df.cuisine.value_counts()}') + ``` + + 输出应该是这样的 : + + ```output + new label count: korean 799 + chinese 799 + indian 799 + japanese 799 + thai 799 + Name: cuisine, dtype: int64 + old label count: korean 799 + indian 598 + chinese 442 + japanese 320 + thai 289 + Name: cuisine, dtype: int64 + ``` + + 现在这个数据集不仅干净、平衡而且还很“美味” ! + +1. 最后一步是保存你处理过后的平衡的数据(包括标签和特征),将其保存为一个可以被输出到文件中的数据帧。 + + ```python + transformed_df = pd.concat([transformed_label_df,transformed_feature_df],axis=1, join='outer') + ``` + +1. 
你可以通过调用函数 `transformed_df.head()` 和 `transformed_df.info()`再检查一下你的数据。 接下来要将数据保存以供在未来的课程中使用: + + ```python + transformed_df.head() + transformed_df.info() + transformed_df.to_csv("../data/cleaned_cuisine.csv") + ``` + + 这个全新的CSV文件可以在数据根目录中被找到。 + +--- + +## 🚀小练习 + +本项目的全部课程含有很多有趣的数据集。 探索一下 `data`文件夹,看看这里面有没有适合二元分类、多元分类算法的数据集,再想一下你对这些数据集有没有什么想问的问题。 + +## [课后练习](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/20/) + +## Review & Self Study + +探索一下 SMOTE的API文档。思考一下它最适合于什么样的情况、它能够解决什么样的问题。 + +## Assignment + +[探索一下分类方法](../assignment.md) +{"mode":"full","isActive":false} + + From bb1821234ebf62cb6f3b9827d5e7b0216b7b00ce Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=96=87=E4=BD=93=E4=B8=A4=E5=BC=80=E8=8A=B1=E7=94=9F?= <56857145+loap-a@users.noreply.github.com> Date: Thu, 15 Jul 2021 21:04:43 +0800 Subject: [PATCH 34/51] Add files via upload --- .../1-Introduction/translations/README.zh-cn.md | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/4-Classification/1-Introduction/translations/README.zh-cn.md b/4-Classification/1-Introduction/translations/README.zh-cn.md index d85c266c..1dbc3598 100644 --- a/4-Classification/1-Introduction/translations/README.zh-cn.md +++ b/4-Classification/1-Introduction/translations/README.zh-cn.md @@ -282,13 +282,11 @@ Scikit-learn项目提供多种对数据进行分类的算法,你需要根据 ## [课后练习](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/20/) -## Review & Self Study +## 回顾 & 自学 探索一下 SMOTE的API文档。思考一下它最适合于什么样的情况、它能够解决什么样的问题。 -## Assignment +## 课后作业 [探索一下分类方法](../assignment.md) -{"mode":"full","isActive":false} - - +{"mode":"full","isActive":false} \ No newline at end of file From 05b028a5b2755a92c3c1f43fb7dd1be6cae01615 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=96=87=E4=BD=93=E4=B8=A4=E5=BC=80=E8=8A=B1=E7=94=9F?= <56857145+loap-a@users.noreply.github.com> Date: Thu, 15 Jul 2021 21:07:43 +0800 Subject: [PATCH 36/51] Update README.zh-cn.md --- 4-Classification/1-Introduction/translations/README.zh-cn.md | 1 - 1 file changed, 1 deletion(-) 
diff --git a/4-Classification/1-Introduction/translations/README.zh-cn.md b/4-Classification/1-Introduction/translations/README.zh-cn.md index 1dbc3598..2e258f3f 100644 --- a/4-Classification/1-Introduction/translations/README.zh-cn.md +++ b/4-Classification/1-Introduction/translations/README.zh-cn.md @@ -289,4 +289,3 @@ Scikit-learn项目提供多种对数据进行分类的算法,你需要根据 ## 课后作业 [探索一下分类方法](../assignment.md) -{"mode":"full","isActive":false} \ No newline at end of file From baa5cabaae13c85ac031ba836f37c416e8b1a875 Mon Sep 17 00:00:00 2001 From: simplg <81249731+simplg@users.noreply.github.com> Date: Thu, 15 Jul 2021 15:41:26 +0200 Subject: [PATCH 37/51] Corrected to accepted words MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Postconférence Préconférence --- 1-Introduction/1-intro-to-ML/translations/README.fr.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/1-Introduction/1-intro-to-ML/translations/README.fr.md b/1-Introduction/1-intro-to-ML/translations/README.fr.md index fd396d69..08ad7a09 100644 --- a/1-Introduction/1-intro-to-ML/translations/README.fr.md +++ b/1-Introduction/1-intro-to-ML/translations/README.fr.md @@ -4,7 +4,7 @@ > 🎥 Cliquer sur l'image ci-dessus afin de regarder une vidéo expliquant la différence entre machine learning, AI et deep learning. -## [Quiz prélecture](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/1/) +## [Quiz préconférence](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/1/) ### Introduction @@ -98,7 +98,7 @@ Dans un avenir proche, comprendre les bases du machine learning sera indispensab Esquisser, sur papier ou à l'aide d'une application en ligne comme [Excalidraw](https://excalidraw.com/), votre compréhension des différences entre l'IA, le ML, le deep learning et la data science. Ajouter quelques idées de problèmes que chacune de ces techniques est bonne à résoudre. 
-## [Quiz postlecture](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/2/) +## [Quiz postconférence](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/2/) ## Révision et auto-apprentissage From f49981f258791b6a7735f090367bae716c2bfba3 Mon Sep 17 00:00:00 2001 From: simplg <81249731+simplg@users.noreply.github.com> Date: Thu, 15 Jul 2021 15:55:38 +0200 Subject: [PATCH 38/51] Update README.fr.md --- 1-Introduction/1-intro-to-ML/translations/README.fr.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/1-Introduction/1-intro-to-ML/translations/README.fr.md b/1-Introduction/1-intro-to-ML/translations/README.fr.md index 08ad7a09..5fae99cd 100644 --- a/1-Introduction/1-intro-to-ML/translations/README.fr.md +++ b/1-Introduction/1-intro-to-ML/translations/README.fr.md @@ -4,7 +4,7 @@ > 🎥 Cliquer sur l'image ci-dessus afin de regarder une vidéo expliquant la différence entre machine learning, AI et deep learning. -## [Quiz préconférence](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/1/) +## [Quiz de préconférence](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/1/) ### Introduction @@ -98,7 +98,7 @@ Dans un avenir proche, comprendre les bases du machine learning sera indispensab Esquisser, sur papier ou à l'aide d'une application en ligne comme [Excalidraw](https://excalidraw.com/), votre compréhension des différences entre l'IA, le ML, le deep learning et la data science. Ajouter quelques idées de problèmes que chacune de ces techniques est bonne à résoudre. 
-## [Quiz postconférence](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/2/) +## [Quiz de postconférence](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/2/) ## Révision et auto-apprentissage From 93bcc7ff1b5416161067655086fdb7710405c30e Mon Sep 17 00:00:00 2001 From: JudyZhangYifan Date: Thu, 15 Jul 2021 17:54:36 -0400 Subject: [PATCH 39/51] 8-Reinforcement intro README translation --- 8-Reinforcement/translations/README.zh-cn.md | 53 ++++++++++++++++++++ 1 file changed, 53 insertions(+) create mode 100644 8-Reinforcement/translations/README.zh-cn.md diff --git a/8-Reinforcement/translations/README.zh-cn.md b/8-Reinforcement/translations/README.zh-cn.md new file mode 100644 index 00000000..c033d1db --- /dev/null +++ b/8-Reinforcement/translations/README.zh-cn.md @@ -0,0 +1,53 @@ +# 强化学习介绍 + +强化学习(Reinforcement learning,RL)被视为基础机器学习除监督学习以及无监督学习之外的范式之一。强化学习是完全关于决策的,它可以提供正确的决策或者至少能从他们中学习。 + +想象你现在有一个例如股票市场的模拟环境。如果你施加了一条给定的规章制度的话,将会发生什么呢?这条规章制度会带来积极还是消极的影响呢?如果产生了负面影响的话,那么你就需要接受这种 _负强化_ ,从中学习并改变方针。如果产生了正面的成果,那么你就需要基于这种 _正强化_ 越做越好。 + +![彼得与狼](../images/peter.png) + +> 彼得和他的朋友们需要逃离饥饿的狼!(图片来自:[Jen Looper](https://twitter.com/jenlooper)) + +## 区域主题:彼得与狼(俄罗斯) + +[彼得与狼](https://zh.wikipedia.org/wiki/%E5%BD%BC%E5%BE%97%E5%92%8C%E7%8B%BC) 是前苏联作曲家[普罗科菲耶夫](https://zh.wikipedia.org/wiki/%E8%B0%A2%E5%B0%94%E7%9B%96%C2%B7%E6%99%AE%E7%BD%97%E7%A7%91%E8%8F%B2%E8%80%B6%E5%A4%AB)写的一部交响童话。它讲述的是少先队员彼得勇敢地离家到森林空地去追捕狼的故事。在本节中,我们将训练可以帮助彼得的机器学习算法: + +- **探索** 周边区域并构建一张最佳的导航地图 +- **学习** 如何使用滑板并在上面保持平衡,以便更加快速地移动。 + +[![彼得与狼](https://img.youtube.com/vi/Fmi5zHg4QSM/0.jpg)](https://www.youtube.com/watch?v=Fmi5zHg4QSM) + +> 🎥 点击上图聆听普罗科菲耶夫的《彼得与狼》 + +## 强化学习 + +在之前的章节中,你已经看到了两个机器学习问题的例子: + +- **有监督的**——我们有数据集可以为我们想要解决的问题提出示例解决方案。[分类模型](../../4-Classification/README.md)与[回归模型](../../2-Regression/translations/README.zh-cn.md)都是有监督的任务。 +- **无监督的**——我们的训练数据没有标签。无监督学习的一个主要例子就是[聚类分析](../../5-Clustering/README.md)。 + +在本节中,我们会向你介绍一种新的学习问题。这种问题不需要有标签的训练数据,它们有以下几类问题: + +- 
**[半监督学习](https://wikipedia.org/wiki/Semi-supervised_learning)**——我们有很多没有标签的数据可以用于预先训练模型。 +- **[强化学习](https://wikipedia.org/wiki/Reinforcement_learning)**——一个智能体(agent)在某些模拟环境中进行实验并以此学习如何表现。 + +### 例子 - 电脑游戏 + +假设你想要教会电脑如何玩一个例如国际象棋或者[超级马里奥](https://wikipedia.org/wiki/Super_Mario)的游戏。对于电脑来说,我们需要让它预测在每个游戏状态下它的动作才能使它成功地玩游戏。虽然这看上去像是个分类问题,但是事实并非如此——因为我们没有包含(游戏)状态和相应动作的数据集。虽然我们可能有一些现有的国际象棋比赛数据或者玩家玩超级马里奥的记录,但是那些数据很可能无法包含足够多的潜在(游戏)状态。 + +**强化学习** (RL) 不是寻找现有的游戏数据,而是基于一种*想让电脑玩* 多次并观察结果的想法。因此,我们需要做以下两件事来应用强化学习: + +- **环境** 和 **模拟器** ——可以让我们多次玩游戏。这个模拟器将定义所有游戏的规则、可能的状态以及动作。 + +- **奖励函数** ——会告诉我们在每个动作或游戏中的表现如何。 + +其他机器学习和强化学习(RL)的主要差别就是在RL中我们通常无法在完成游戏之前知道我们是赢还是输。因此,我们无法评价游戏中的某一个特定动作是好是坏——我们只会在游戏结束时才得到奖励。我们的目标是设计一种可以在不确定条件下帮我们训练模型的算法。接下来我们将要学习一种叫**Q-learning**的RL算法。 + +## 课程 + +1. [强化学习与Q-Learning介绍](../1-QLearning/README.md) +2. [使用Gym模拟环境](../2-Gym/README.md) + +## Credits + +"强化学习介绍"由[Dmitry Soshnikov](http://soshnikov.com)撰写 ♥️ From 034b28edb4c39e4d56ba5a8377988f2c448c227d Mon Sep 17 00:00:00 2001 From: unknown Date: Fri, 16 Jul 2021 11:24:28 +0800 Subject: [PATCH 40/51] Translated 1-intro-to-ML assignment.md into Simplified Chinese --- .../1-intro-to-ML/translations/assignment.zh-cn.md | 9 +++++++++ 1 file changed, 9 insertions(+) create mode 100644 1-Introduction/1-intro-to-ML/translations/assignment.zh-cn.md diff --git a/1-Introduction/1-intro-to-ML/translations/assignment.zh-cn.md b/1-Introduction/1-intro-to-ML/translations/assignment.zh-cn.md new file mode 100644 index 00000000..fd59f691 --- /dev/null +++ b/1-Introduction/1-intro-to-ML/translations/assignment.zh-cn.md @@ -0,0 +1,9 @@ +# 启动和运行 + +## 说明 + +在这个不评分的作业中,你应该温习一下 Python,将 Python 环境能够运行起来,并且可以运行 notebooks。 + +学习这个 [Python 学习路径](https://docs.microsoft.com/learn/paths/python-language/?WT.mc_id=academic-15963-cxa),然后通过这些介绍性的视频将你的系统环境设置好: + +https://www.youtube.com/playlist?list=PLlrxD0HtieHhS8VzuMCfQD4uJ9yne1mE6 From 89ea1f05fa130042de972a6d4cfac97a7fca12b0 Mon Sep 17 00:00:00 2001 From: unknown Date: Fri, 16 Jul 
2021 14:05:43 +0800 Subject: [PATCH 41/51] Translated 2-history-of-ML assignment.md into Simplified Chinese --- .../2-history-of-ML/translations/assignment.zh-cn.md | 11 +++++++++++ 1 file changed, 11 insertions(+) create mode 100644 1-Introduction/2-history-of-ML/translations/assignment.zh-cn.md diff --git a/1-Introduction/2-history-of-ML/translations/assignment.zh-cn.md b/1-Introduction/2-history-of-ML/translations/assignment.zh-cn.md new file mode 100644 index 00000000..adf3ee15 --- /dev/null +++ b/1-Introduction/2-history-of-ML/translations/assignment.zh-cn.md @@ -0,0 +1,11 @@ +# 建立一个时间轴 + +## 说明 + +使用这个 [仓库](https://github.com/Digital-Humanities-Toolkit/timeline-builder),创建一个关于算法、数学、统计学、人工智能、机器学习的某个方面或者可以综合多个以上学科来讲。你可以着重介绍某个人,某个想法,或者一个经久不衰的思想。请确保添加了多媒体元素在你的时间线中。 + +## 评判标准 + +| 标准 | 优秀 | 中规中矩 | 仍需努力 | +| ------------ | ---------------------------------- | ---------------------- | ------------------------------------------ | +| | 有一个用 GitHub page 展示的 timeline | 代码还不完整并且没有部署 | 时间线不完整,没有经过充分的研究,并且没有部署 | From 8f99e59b7722930941719b50905f854483b87fe7 Mon Sep 17 00:00:00 2001 From: unknown Date: Fri, 16 Jul 2021 15:16:20 +0800 Subject: [PATCH 42/51] Translated 3-fairness assignment.md into Simplified Chinese --- .../3-fairness/translations/assignment.zh-cn.md | 11 +++++++++++ 1 file changed, 11 insertions(+) create mode 100644 1-Introduction/3-fairness/translations/assignment.zh-cn.md diff --git a/1-Introduction/3-fairness/translations/assignment.zh-cn.md b/1-Introduction/3-fairness/translations/assignment.zh-cn.md new file mode 100644 index 00000000..a8124199 --- /dev/null +++ b/1-Introduction/3-fairness/translations/assignment.zh-cn.md @@ -0,0 +1,11 @@ +# 探索 Fairlearn + +## 说明 + +在这节课中,你了解了 Fairlearn,一个“开源的,社区驱动的项目,旨在帮助数据科学家们提高人工智能系统的公平性”。在这项作业中,探索 Fairlearn [笔记本](https://fairlearn.org/v0.6.2/auto_examples/index.html)中的一个例子,之后你可以用论文或者 ppt 的形式叙述你学习后的发现。 + +## 评判标准 + +| 标准 | 优秀 | 中规中矩 | 仍需努力 | +| -------- | --------- | -------- | ----------------- | +| | 
提交了一篇论文或者ppt 关于讨论 Fairlearn 系统、挑选运行的例子、和运行这个例子后所得出来的心得结论 | 提交了一篇没有结论的论文 | 没有提交论文 | From 01c53fb14613d1fcb248e0bd4500a4674ea95fea Mon Sep 17 00:00:00 2001 From: unknown Date: Fri, 16 Jul 2021 17:28:11 +0800 Subject: [PATCH 43/51] Translated 4-techniques-of-ML assignment.md into Simplified Chinese --- .../translations/assignment.zh-cn.md | 11 +++++++++++ 1 file changed, 11 insertions(+) create mode 100644 1-Introduction/4-techniques-of-ML/translations/assignment.zh-cn.md diff --git a/1-Introduction/4-techniques-of-ML/translations/assignment.zh-cn.md b/1-Introduction/4-techniques-of-ML/translations/assignment.zh-cn.md new file mode 100644 index 00000000..ba28b554 --- /dev/null +++ b/1-Introduction/4-techniques-of-ML/translations/assignment.zh-cn.md @@ -0,0 +1,11 @@ +# 采访一位数据科学家 + +## 说明 + +在你的公司、你所在的社群、或者在你的朋友和同学中,找到一位从事数据科学专业工作的人,与他或她交流一下。写一篇关于他们工作日常的小短文(500字左右)。他们是专家,还是说他们是“全栈”开发者? + +## 评判标准 + +| 标准 | 优秀 | 中规中矩 | 仍需努力 | +| -------- | ------------------------------------------------------------------------------------ | ------------------------------------------------------------------ | --------------------- | +| | 提交一篇清晰描述了职业属性且字数符合规范的word文档 | 提交的文档职业属性描述得不清晰或者字数不合规范 | 啥都没有交 | From 39bf19bd05443b2b56111eb20de6ef02937d114d Mon Sep 17 00:00:00 2001 From: simplg <81249731+simplg@users.noreply.github.com> Date: Fri, 16 Jul 2021 11:39:42 +0200 Subject: [PATCH 44/51] Modifying Quiz translation MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit pre-lecture: Quiz préalable post-lecture: Quiz de validation des connaissances --- 1-Introduction/1-intro-to-ML/translations/README.fr.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/1-Introduction/1-intro-to-ML/translations/README.fr.md b/1-Introduction/1-intro-to-ML/translations/README.fr.md index 5fae99cd..1e27ca32 100644 --- a/1-Introduction/1-intro-to-ML/translations/README.fr.md +++ b/1-Introduction/1-intro-to-ML/translations/README.fr.md @@ -4,7 +4,7 @@ > 🎥 
Cliquer sur l'image ci-dessus afin de regarder une vidéo expliquant la différence entre machine learning, AI et deep learning. -## [Quiz de préconférence](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/1/) +## [Quiz préalable](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/1/) ### Introduction @@ -98,7 +98,7 @@ Dans un avenir proche, comprendre les bases du machine learning sera indispensab Esquisser, sur papier ou à l'aide d'une application en ligne comme [Excalidraw](https://excalidraw.com/), votre compréhension des différences entre l'IA, le ML, le deep learning et la data science. Ajouter quelques idées de problèmes que chacune de ces techniques est bonne à résoudre. -## [Quiz de postconférence](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/2/) +## [Quiz de validation des connaissances](https://jolly-sea-0a877260f.azurestaticapps.net/quiz/2/) ## Révision et auto-apprentissage From d5ef7e07218b043b967282b0656767f61684b771 Mon Sep 17 00:00:00 2001 From: simplg <81249731+simplg@users.noreply.github.com> Date: Fri, 16 Jul 2021 11:48:42 +0200 Subject: [PATCH 45/51] Create assignment.fr.md --- .../1-intro-to-ML/translations/assignment.fr.md | 10 ++++++++++ 1 file changed, 10 insertions(+) create mode 100644 1-Introduction/1-intro-to-ML/translations/assignment.fr.md diff --git a/1-Introduction/1-intro-to-ML/translations/assignment.fr.md b/1-Introduction/1-intro-to-ML/translations/assignment.fr.md new file mode 100644 index 00000000..0d703d26 --- /dev/null +++ b/1-Introduction/1-intro-to-ML/translations/assignment.fr.md @@ -0,0 +1,10 @@ +# Être opérationnel + + +## Instructions + +Dans ce devoir non noté, vous devez vous familiariser avec Python et rendre votre environnement opérationnel et capable d'exécuter des notebook. 
+ +Suivez ce [parcours d'apprentissage Python](https://docs.microsoft.com/learn/paths/python-language/?WT.mc_id=academic-15963-cxa), puis configurez votre système en parcourant ces vidéos introductives : + +https://www.youtube.com/playlist?list=PLlrxD0HtieHhS8VzuMCfQD4uJ9yne1mE6 From 2a0c80f8c0d687016e8a737a537ae935e36723b4 Mon Sep 17 00:00:00 2001 From: simplg <81249731+simplg@users.noreply.github.com> Date: Fri, 16 Jul 2021 11:49:44 +0200 Subject: [PATCH 46/51] Changed url to assignment translation --- 1-Introduction/1-intro-to-ML/translations/README.fr.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/1-Introduction/1-intro-to-ML/translations/README.fr.md b/1-Introduction/1-intro-to-ML/translations/README.fr.md index 1e27ca32..a178790c 100644 --- a/1-Introduction/1-intro-to-ML/translations/README.fr.md +++ b/1-Introduction/1-intro-to-ML/translations/README.fr.md @@ -106,4 +106,4 @@ Pour en savoir plus sur la façon dont vous pouvez utiliser les algorithmes de M ## Devoir -[Être opérationnel](../assignment.md) +[Être opérationnel](assignment.fr.md) From c3074d622c0fe323c9f3d94a58c345f684431246 Mon Sep 17 00:00:00 2001 From: simplg <81249731+simplg@users.noreply.github.com> Date: Fri, 16 Jul 2021 14:39:00 +0200 Subject: [PATCH 47/51] Update README.fr.md --- 1-Introduction/1-intro-to-ML/translations/README.fr.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/1-Introduction/1-intro-to-ML/translations/README.fr.md b/1-Introduction/1-intro-to-ML/translations/README.fr.md index a178790c..e762c7f6 100644 --- a/1-Introduction/1-intro-to-ML/translations/README.fr.md +++ b/1-Introduction/1-intro-to-ML/translations/README.fr.md @@ -8,7 +8,7 @@ ### Introduction -Bienvenue à ce cours sur le machine learning classique pour débutant ! Que vous soyez complètement nouveau sur ce sujet ou que vous soyez un professonnel du ML expérimenté cherchant à peaufiner vos connaissances, nous sommes heureux de vous avoir avec nous ! 
Nous voulons créer un tremplin chaleureux pour vos études en ML et serions ravis d'évaluer, de répondre et d'apprendre de vos retours d'[expériences](https://github.com/microsoft/ML-For-Beginners/discussions). +Bienvenue à ce cours sur le machine learning classique pour débutant ! Que vous soyez complètement nouveau sur ce sujet ou que vous soyez un professionnel du ML expérimenté cherchant à peaufiner vos connaissances, nous sommes heureux de vous avoir avec nous ! Nous voulons créer un tremplin chaleureux pour vos études en ML et serions ravis d'évaluer, de répondre et d'apprendre de vos retours d'[expériences](https://github.com/microsoft/ML-For-Beginners/discussions). [![Introduction au ML](https://img.youtube.com/vi/h0e2HAPTGF4/0.jpg)](https://youtu.be/h0e2HAPTGF4 "Introduction to ML") @@ -45,7 +45,7 @@ Bien que le terme peut être confu, machine learning (ML) est un important sous- ## Ce que vous allez apprendre dans ce cours -Dans ce cours, nous allons nous concentrer sur les concepts clés du machine learning qu'un débutant se doit de connaître. Nous parlerons ce que l'on appelle le 'machine learning classique' en utilisant principalement Scikit-learn, une excellente librairie que beaucoup d'étudiants utilisent afin d'apprendre les bases. Afin de comprendre les concepts plus larges de l'intelligence artificielle ou du deep learning, une profonde connaissance en machine learning est indispensable, et c'est ce que nous aimerions fournir ici. +Dans ce cours, nous allons nous concentrer sur les concepts clés du machine learning qu'un débutant se doit de connaître. Nous parlerons de ce que l'on appelle le 'machine learning classique' en utilisant principalement Scikit-learn, une excellente librairie que beaucoup d'étudiants utilisent afin d'apprendre les bases. Afin de comprendre les concepts plus larges de l'intelligence artificielle ou du deep learning, une profonde connaissance en machine learning est indispensable, et c'est ce que nous aimerions fournir ici. 
Dans ce cours, vous allez apprendre : From 4dab6027cc9af7640a5403f6d8f77d1ffcae106c Mon Sep 17 00:00:00 2001 From: feiyun0112 Date: Fri, 16 Jul 2021 21:18:52 +0800 Subject: [PATCH 48/51] Update README.zh-cn.md --- 2-Regression/4-Logistic/translations/README.zh-cn.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/2-Regression/4-Logistic/translations/README.zh-cn.md b/2-Regression/4-Logistic/translations/README.zh-cn.md index 52453de5..b4397856 100644 --- a/2-Regression/4-Logistic/translations/README.zh-cn.md +++ b/2-Regression/4-Logistic/translations/README.zh-cn.md @@ -120,7 +120,7 @@ Seaborn提供了一些巧妙的方法来可视化你的数据。例如,你可 sns.swarmplot(x="Color", y="Item Size", data=new_pumpkins) ``` - ![分类散点图可视化数据](images/swarm.png) + ![分类散点图可视化数据](../images/swarm.png) ### 小提琴图 @@ -133,7 +133,7 @@ Seaborn提供了一些巧妙的方法来可视化你的数据。例如,你可 kind="violin", data=new_pumpkins) ``` - ![小提琴图](images/violin.png) + ![小提琴图](../images/violin.png) ✅ 尝试使用其他变量创建此图和其他Seaborn图。 From 580cfc314c2348e52849a90be12d6b9e74decd5e Mon Sep 17 00:00:00 2001 From: Jen Looper Date: Fri, 16 Jul 2021 10:14:21 -0400 Subject: [PATCH 49/51] small link format error --- 1-Introduction/1-intro-to-ML/translations/README.fr.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/1-Introduction/1-intro-to-ML/translations/README.fr.md b/1-Introduction/1-intro-to-ML/translations/README.fr.md index e762c7f6..0d07a5e8 100644 --- a/1-Introduction/1-intro-to-ML/translations/README.fr.md +++ b/1-Introduction/1-intro-to-ML/translations/README.fr.md @@ -102,7 +102,7 @@ Esquisser, sur papier ou à l'aide d'une application en ligne comme [Excalidraw] ## Révision et auto-apprentissage -Pour en savoir plus sur la façon dont vous pouvez utiliser les algorithmes de ML dans le cloud, suivez ce [Parcours d'apprentissage](https://docs.microsoft.com/learn/paths/create-no-code-predictive-models-azure-machine- learning/?WT.mc_id=academic-15963-cxa). 
+Pour en savoir plus sur la façon dont vous pouvez utiliser les algorithmes de ML dans le cloud, suivez ce [Parcours d'apprentissage](https://docs.microsoft.com/learn/paths/create-no-code-predictive-models-azure-machine-learning/?WT.mc_id=academic-15963-cxa). ## Devoir From c67123ad7322722cfab87a0666ef8c5417f3c9aa Mon Sep 17 00:00:00 2001 From: Jen Looper Date: Fri, 16 Jul 2021 10:23:19 -0400 Subject: [PATCH 50/51] linking assignment --- 1-Introduction/1-intro-to-ML/translations/README.zh-cn.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/1-Introduction/1-intro-to-ML/translations/README.zh-cn.md b/1-Introduction/1-intro-to-ML/translations/README.zh-cn.md index 8693ff20..45ec79be 100644 --- a/1-Introduction/1-intro-to-ML/translations/README.zh-cn.md +++ b/1-Introduction/1-intro-to-ML/translations/README.zh-cn.md @@ -104,4 +104,4 @@ ## 任务 -[启动并运行](../assignment.md) +[启动并运行](assignment.zh-cn.md) From d07d80730ddc9c8eda3dc294be54fb51130f5e24 Mon Sep 17 00:00:00 2001 From: Jen Looper Date: Fri, 16 Jul 2021 10:25:47 -0400 Subject: [PATCH 51/51] linking Chinese language assignments --- .../translations/README.zh-cn.md | 2 +- .../3-fairness/translations/README.zh-cn.md | 22 +++++++++---------- .../translations/README.zh-cn.md | 4 ++-- 3 files changed, 14 insertions(+), 14 deletions(-) diff --git a/1-Introduction/2-history-of-ML/translations/README.zh-cn.md b/1-Introduction/2-history-of-ML/translations/README.zh-cn.md index 51e66ecd..8ca7e690 100644 --- a/1-Introduction/2-history-of-ML/translations/README.zh-cn.md +++ b/1-Introduction/2-history-of-ML/translations/README.zh-cn.md @@ -113,4 +113,4 @@ Alan Turing,一个真正杰出的人,[在2019年被公众投票选出](https ## 任务 -[创建时间线](../assignment.md) +[创建时间线](assignment.zh-cn.md) diff --git a/1-Introduction/3-fairness/translations/README.zh-cn.md b/1-Introduction/3-fairness/translations/README.zh-cn.md index 3b75ddab..22204544 100644 --- a/1-Introduction/3-fairness/translations/README.zh-cn.md +++ 
b/1-Introduction/3-fairness/translations/README.zh-cn.md @@ -89,11 +89,11 @@ ✅ **讨论**:重温一些例子,看看它们是否显示出不同的危害。 -| | 分配 | 服务质量 | 刻板印象 | 诋毁 | 代表性过高或过低 | -| ----------------------- | :--------: | :----------------: | :----------: | :---------: | :----------------------------: | -| 自动招聘系统 | x | x | x | | x | -| 机器翻译 | | | | | | -| 照片加标签 | | | | | | +| | 分配 | 服务质量 | 刻板印象 | 诋毁 | 代表性过高或过低 | +| ------------ | :---: | :------: | :------: | :---: | :--------------: | +| 自动招聘系统 | x | x | x | | x | +| 机器翻译 | | | | | | +| 照片加标签 | | | | | | ## 检测不公平 @@ -138,11 +138,11 @@ ✅ 在以后关于聚类的课程中,你将看到如何在代码中构建这个“混淆矩阵” -| | 假阳性率 | 假阴性率 | 数量 | -| ---------- | ------------------- | ------------------- | ----- | -| 女性 | 0.37 | 0.27 | 54032 | -| 男性 | 0.31 | 0.35 | 28620 | -| 未列出性别 | 0.33 | 0.31 | 1266 | +| | 假阳性率 | 假阴性率 | 数量 | +| ---------- | -------- | -------- | ----- | +| 女性 | 0.37 | 0.27 | 54032 | +| 男性 | 0.31 | 0.35 | 28620 | +| 未列出性别 | 0.33 | 0.31 | 1266 | 这个表格告诉我们几件事。首先,我们注意到数据中的未列出性别的人相对较少。数据是有偏差的,所以你需要小心解释这些数字。 @@ -211,4 +211,4 @@ ## 任务 -[探索Fairlearn](../assignment.md) +[探索Fairlearn](assignment.zh-cn.md) diff --git a/1-Introduction/4-techniques-of-ML/translations/README.zh-cn.md b/1-Introduction/4-techniques-of-ML/translations/README.zh-cn.md index d01d5bbf..373602f3 100644 --- a/1-Introduction/4-techniques-of-ML/translations/README.zh-cn.md +++ b/1-Introduction/4-techniques-of-ML/translations/README.zh-cn.md @@ -54,7 +54,7 @@ - **训练**。这部分数据集适合你的模型进行训练。这个集合构成了原始数据集的大部分。 - **测试**。测试数据集是一组独立的数据,通常从原始数据中收集,用于确认构建模型的性能。 -- **验证**。验证集是一个较小的独立示例组,用于调整模型的超参数或架构,以改进模型。根据你的数据大小和你提出的问题,你可能不需要构建第三组(正如我们在[时间序列预测](../../7-TimeSeries/1-Introduction/README.md)中所述)。 +- **验证**。验证集是一个较小的独立示例组,用于调整模型的超参数或架构,以改进模型。根据你的数据大小和你提出的问题,你可能不需要构建第三组(正如我们在[时间序列预测](../../../7-TimeSeries/1-Introduction/README.md)中所述)。 ## 建立模型 @@ -105,4 +105,4 @@ ## 任务 -[采访一名数据科学家](../assignment.md) +[采访一名数据科学家](assignment.zh-cn.md)