Merge branch 'main' of github.com:microsoft/Data-Science-For-Beginners into feat/section1-translation-ES

pull/173/head
Angel Mendez 3 years ago
commit 686040f1a3

@ -1,8 +1,8 @@
# Defining Data Science
|![ Sketchnote by [(@sketchthedocs)](https://sketchthedocs.dev) ](../../sketchnotes/01-Definitions.png)|
|:---:|
|Defining Data Science - _Sketchnote by [@nitya](https://twitter.com/nitya)_ |
| ![ Sketchnote by [(@sketchthedocs)](https://sketchthedocs.dev) ](../../sketchnotes/01-Definitions.png) |
| :----------------------------------------------------------------------------------------------------: |
| Defining Data Science - _Sketchnote by [@nitya](https://twitter.com/nitya)_ |
---
@ -69,11 +69,11 @@ Vast amounts of data are incomprehensible for a human being, but once we create
As we have already mentioned, data is everywhere; we just need to capture it in the right way! It is useful to distinguish between **structured** and **unstructured** data. The former is typically represented in some well-structured form, often as a table or a number of tables, while the latter is just a collection of files. Sometimes we can also talk about **semi-structured** data, which has some sort of structure that may vary greatly.
| Structured | Semi-structured | Unstructured |
|----------- |-----------------|--------------|
| List of people with their phone numbers | Wikipedia pages with links | Text of Encyclopaedia Britannica |
| Temperature in all rooms of a building at every minute for the last 20 years | Collection of scientific papers in JSON format with authors, date of publication, and abstract | File share with corporate documents |
| Data for age and gender of all people entering the building | Internet pages | Raw video feed from surveillance camera |
| Structured | Semi-structured | Unstructured |
| ---------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------- | --------------------------------------- |
| List of people with their phone numbers | Wikipedia pages with links | Text of Encyclopaedia Britannica |
| Temperature in all rooms of a building at every minute for the last 20 years | Collection of scientific papers in JSON format with authors, date of publication, and abstract | File share with corporate documents |
| Data for age and gender of all people entering the building | Internet pages | Raw video feed from surveillance camera |
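The distinction in the table can be illustrated with a small sketch. The records below are hypothetical sample data, not taken from the lesson: the phone list has the same fields in every record, while the JSON collection of papers has fields that vary from record to record.

```python
import json

# Structured: every record has exactly the same fields (hypothetical sample data).
phone_book = [
    {"name": "Alice", "phone": "555-0100"},
    {"name": "Bob", "phone": "555-0101"},
]

# Semi-structured: JSON records whose set of fields may differ (hypothetical sample data).
papers_json = """
[
  {"title": "Paper A", "authors": ["X", "Y"], "published": "2020-03-01", "abstract": "Short summary"},
  {"title": "Paper B", "authors": ["Z"]}
]
"""
papers = json.loads(papers_json)

# Count how many distinct field layouts each collection contains.
structured_fields = {frozenset(record) for record in phone_book}
paper_fields = {frozenset(paper) for paper in papers}

print(len(structured_fields))  # 1: a single, fixed layout
print(len(paper_fields))       # 2: the layout varies between records
```

A single layout is what lets structured data fit directly into a table, whereas the varying layouts need some handling (for example, default values) before they can be tabulated.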
## Where to get Data
@ -107,7 +107,7 @@ First step is to collect the data. While in many cases it can be a straightforwa
Storing the data can be challenging, especially if we are talking about big data. When deciding how to store data, it makes sense to anticipate how you will want to query it later. There are several ways data can be stored:
<ul>
<li>Relational database stores a collection of tables, and uses a special language called SQL to query them. Typically, tables would be connected to each other using some schema. In many cases we need to convert the data from original form to fit the schema.</li>
<li><a href="https://en.wikipedia.org/wiki/NoSQL">NoSQL</a> database, such as <a href="https://azure.microsoft.com/services/cosmos-db/?WT.mc_id=acad-31812-dmitryso">CosmosDB</a>, does not enforce schema on data, and allows storing more complex data, for example, hierarchical JSON documents or graphs. However, NoSQL database does not have rich querying capabilities of SQL, and cannot enforce referential integrity between data.</li>
<li><a href="https://en.wikipedia.org/wiki/NoSQL">NoSQL</a> database, such as <a href="https://azure.microsoft.com/services/cosmos-db/?WT.mc_id=academic-31812-dmitryso">CosmosDB</a>, does not enforce schema on data, and allows storing more complex data, for example, hierarchical JSON documents or graphs. However, NoSQL database does not have rich querying capabilities of SQL, and cannot enforce referential integrity between data.</li>
<li><a href="https://en.wikipedia.org/wiki/Data_lake">Data Lake</a> storage is used for large collections of data in raw form. Data lakes are often used with big data, where all data cannot fit into one machine, and has to be stored and processed by a cluster. <a href="https://en.wikipedia.org/wiki/Apache_Parquet">Parquet</a> is the data format that is often used in conjunction with big data.</li>
</ul>
</dd>

@ -1,8 +1,8 @@
# Working with Data: Python and the Pandas Library
|![ Sketchnote by [(@sketchthedocs)](https://sketchthedocs.dev) ](../../sketchnotes/07-WorkWithPython.png)|
|:---:|
|Working With Python - _Sketchnote by [@nitya](https://twitter.com/nitya)_ |
| ![ Sketchnote by [(@sketchthedocs)](https://sketchthedocs.dev) ](../../sketchnotes/07-WorkWithPython.png) |
| :-------------------------------------------------------------------------------------------------------: |
| Working With Python - _Sketchnote by [@nitya](https://twitter.com/nitya)_ |
[![Intro Video](images/video-ds-python.png)](https://youtu.be/dZjWOGbsN4Y)
@ -16,7 +16,7 @@ Data processing can be programmed in any programming language, but there are cer
In this lesson, we will focus on using Python for simple data processing. We will assume basic familiarity with the language. If you want a deeper tour of Python, you can refer to one of the following resources:
* [Learn Python in a Fun Way with Turtle Graphics and Fractals](https://github.com/shwars/pycourse) - GitHub-based quick intro course into Python Programming
* [Take your First Steps with Python](https://docs.microsoft.com/en-us/learn/paths/python-first-steps/?WT.mc_id=acad-31812-dmitryso) Learning Path on [Microsoft Learn](http://learn.microsoft.com/?WT.mc_id=acad-31812-dmitryso)
* [Take your First Steps with Python](https://docs.microsoft.com/en-us/learn/paths/python-first-steps/?WT.mc_id=academic-31812-dmitryso) Learning Path on [Microsoft Learn](http://learn.microsoft.com/?WT.mc_id=academic-31812-dmitryso)
Data can come in many forms. In this lesson, we will consider three forms of data - **tabular data**, **text** and **images**.
@ -97,10 +97,10 @@ b = pd.Series(["I","like","to","play","games","and","will","not","change"],index
df = pd.DataFrame([a,b])
```
This will create a horizontal table like this:
| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| 1 | I | like | to | use | Python | and | Pandas | very | much |
| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
| --- | --- | ---- | --- | --- | ------ | --- | ------ | ---- | ---- |
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| 1 | I | like | to | use | Python | and | Pandas | very | much |
We can also use Series as columns, and specify column names using a dictionary:
```python
@ -108,17 +108,17 @@ df = pd.DataFrame({ 'A' : a, 'B' : b })
```
This will give us a table like this:
| | A | B |
|---|---|---|
| 0 | 1 | I |
| 1 | 2 | like |
| 2 | 3 | to |
| 3 | 4 | use |
| 4 | 5 | Python |
| 5 | 6 | and |
| 6 | 7 | Pandas |
| 7 | 8 | very |
| 8 | 9 | much |
| | A | B |
| --- | --- | ------ |
| 0 | 1 | I |
| 1 | 2 | like |
| 2 | 3 | to |
| 3 | 4 | use |
| 4 | 5 | Python |
| 5 | 6 | and |
| 6 | 7 | Pandas |
| 7 | 8 | very |
| 8 | 9 | much |
**Note** that we can also get this table layout by transposing the previous table, e.g. by writing
```python
@ -154,17 +154,17 @@ df['LenB'] = df['B'].apply(len)
After the operations above, we will end up with the following DataFrame:
| | A | B | DivA | LenB |
|---|---|---|---|---|
| 0 | 1 | I | -4.0 | 1 |
| 1 | 2 | like | -3.0 | 4 |
| 2 | 3 | to | -2.0 | 2 |
| 3 | 4 | use | -1.0 | 3 |
| 4 | 5 | Python | 0.0 | 6 |
| 5 | 6 | and | 1.0 | 3 |
| 6 | 7 | Pandas | 2.0 | 6 |
| 7 | 8 | very | 3.0 | 4 |
| 8 | 9 | much | 4.0 | 4 |
| | A | B | DivA | LenB |
| --- | --- | ------ | ---- | ---- |
| 0 | 1 | I | -4.0 | 1 |
| 1 | 2 | like | -3.0 | 4 |
| 2 | 3 | to | -2.0 | 2 |
| 3 | 4 | use | -1.0 | 3 |
| 4 | 5 | Python | 0.0 | 6 |
| 5 | 6 | and | 1.0 | 3 |
| 6 | 7 | Pandas | 2.0 | 6 |
| 7 | 8 | very | 3.0 | 4 |
| 8 | 9 | much | 4.0 | 4 |
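As a rough sketch, the table above can be reproduced as follows. The `LenB` line appears in the diff context; the `DivA` formula (each value of `A` minus the column mean) is an assumption inferred from the values in the table, and the construction of `a` and `b` is reconstructed from the table contents.

```python
import pandas as pd

# Columns reconstructed from the table above.
a = pd.Series(range(1, 10))
b = pd.Series(["I", "like", "to", "use", "Python", "and", "Pandas", "very", "much"])
df = pd.DataFrame({"A": a, "B": b})

# Assumption: DivA is the deviation of A from its mean (the mean of 1..9 is 5.0).
df["DivA"] = df["A"] - df["A"].mean()

# Length of each string in column B, as in the diff context line.
df["LenB"] = df["B"].apply(len)

print(df)
```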
**Selecting rows based on numbers** can be done using the `iloc` construct. For example, to select the first 5 rows from the DataFrame:
```python
@ -183,13 +183,13 @@ df.groupby(by='LenB') \
```
This gives us the following table:
| LenB | Count | Mean |
|------|-------|------|
| 1 | 1 | 1.000000 |
| 2 | 1 | 3.000000 |
| 3 | 2 | 5.000000 |
| 4 | 3 | 6.333333 |
| 6 | 2 | 6.000000 |
| LenB | Count | Mean |
| ---- | ----- | -------- |
| 1 | 1 | 1.000000 |
| 2 | 1 | 3.000000 |
| 3 | 2 | 5.000000 |
| 4 | 3 | 6.333333 |
| 6 | 2 | 6.000000 |
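The grouped table above can be approximated with the sketch below. It uses pandas named aggregation, which is not necessarily the form used in the lesson (only the `df.groupby(by='LenB')` call is visible in the diff), and it rebuilds the DataFrame from the earlier table so that it runs on its own.

```python
import pandas as pd

# Rebuild the example DataFrame from the earlier table.
df = pd.DataFrame({
    "A": range(1, 10),
    "B": ["I", "like", "to", "use", "Python", "and", "Pandas", "very", "much"],
})
df["LenB"] = df["B"].apply(len)

# Group rows by string length, then count the rows and average A within each group.
summary = df.groupby(by="LenB").agg(
    Count=("A", "count"),
    Mean=("A", "mean"),
)

print(summary)
```

Named aggregation sets the output column names directly, so no separate rename step is needed.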
### Getting Data
@ -230,7 +230,7 @@ While data very often comes in tabular form, in some cases we need to deal with
In this challenge, we will continue with the topic of COVID pandemic, and focus on processing scientific papers on the subject. There is [CORD-19 Dataset](https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge) with more than 7000 (at the time of writing) papers on COVID, available with metadata and abstracts (and for about half of them there is also full text provided).
A full example of analyzing this dataset using [Text Analytics for Health](https://docs.microsoft.com/azure/cognitive-services/text-analytics/how-tos/text-analytics-for-health/?WT.mc_id=acad-31812-dmitryso) cognitive service is described [in this blog post](https://soshnikov.com/science/analyzing-medical-papers-with-azure-and-text-analytics-for-health/). We will discuss simplified version of this analysis.
A full example of analyzing this dataset using [Text Analytics for Health](https://docs.microsoft.com/azure/cognitive-services/text-analytics/how-tos/text-analytics-for-health/?WT.mc_id=academic-31812-dmitryso) cognitive service is described [in this blog post](https://soshnikov.com/science/analyzing-medical-papers-with-azure-and-text-analytics-for-health/). We will discuss simplified version of this analysis.
> **NOTE**: We do not provide a copy of the dataset as part of this repository. You may first need to download the [`metadata.csv`](https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge?select=metadata.csv) file from [this dataset on Kaggle](https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge). Registration with Kaggle may be required. You may also download the dataset without registration [from here](https://ai2-semanticscholar-cord-19.s3-us-west-2.amazonaws.com/historical_releases.html), but it will include all full texts in addition to metadata file.
@ -242,15 +242,15 @@ Open [`notebook-papers.ipynb`](notebook-papers.ipynb) and read it from top to bo
Recently, very powerful AI models have been developed that allow us to understand images. There are many tasks that can be solved using pre-trained neural networks, or cloud services. Some examples include:
* **Image Classification**, which can help you categorize the image into one of the pre-defined classes. You can easily train your own image classifiers using services such as [Custom Vision](https://azure.microsoft.com/services/cognitive-services/custom-vision-service/?WT.mc_id=acad-31812-dmitryso)
* **Object Detection** to detect different objects in the image. Services such as [computer vision](https://azure.microsoft.com/services/cognitive-services/computer-vision/?WT.mc_id=acad-31812-dmitryso) can detect a number of common objects, and you can train [Custom Vision](https://azure.microsoft.com/services/cognitive-services/custom-vision-service/?WT.mc_id=acad-31812-dmitryso) model to detect some specific objects of interest.
* **Face Detection**, including Age, Gender and Emotion detection. This can be done via [Face API](https://azure.microsoft.com/services/cognitive-services/face/?WT.mc_id=acad-31812-dmitryso).
* **Image Classification**, which can help you categorize the image into one of the pre-defined classes. You can easily train your own image classifiers using services such as [Custom Vision](https://azure.microsoft.com/services/cognitive-services/custom-vision-service/?WT.mc_id=academic-31812-dmitryso)
* **Object Detection** to detect different objects in the image. Services such as [computer vision](https://azure.microsoft.com/services/cognitive-services/computer-vision/?WT.mc_id=academic-31812-dmitryso) can detect a number of common objects, and you can train [Custom Vision](https://azure.microsoft.com/services/cognitive-services/custom-vision-service/?WT.mc_id=academic-31812-dmitryso) model to detect some specific objects of interest.
* **Face Detection**, including Age, Gender and Emotion detection. This can be done via [Face API](https://azure.microsoft.com/services/cognitive-services/face/?WT.mc_id=academic-31812-dmitryso).
All those cloud services can be called using [Python SDKs](https://docs.microsoft.com/samples/azure-samples/cognitive-services-python-sdk-samples/cognitive-services-python-sdk-samples/?WT.mc_id=acad-31812-dmitryso), and thus can be easily incorporated into your data exploration workflow.
All those cloud services can be called using [Python SDKs](https://docs.microsoft.com/samples/azure-samples/cognitive-services-python-sdk-samples/cognitive-services-python-sdk-samples/?WT.mc_id=academic-31812-dmitryso), and thus can be easily incorporated into your data exploration workflow.
Here are some examples of exploring data from Image data sources:
* In the blog post [How to Learn Data Science without Coding](https://soshnikov.com/azure/how-to-learn-data-science-without-coding/) we explore Instagram photos, trying to understand what makes people give more likes to a photo. We first extract as much information from pictures as possible using [computer vision](https://azure.microsoft.com/services/cognitive-services/computer-vision/?WT.mc_id=acad-31812-dmitryso), and then use [Azure Machine Learning AutoML](https://docs.microsoft.com/azure/machine-learning/concept-automated-ml/?WT.mc_id=acad-31812-dmitryso) to build interpretable model.
* In [Facial Studies Workshop](https://github.com/CloudAdvocacy/FaceStudies) we use [Face API](https://azure.microsoft.com/services/cognitive-services/face/?WT.mc_id=acad-31812-dmitryso) to extract emotions on people on photographs from events, in order to try to understand what makes people happy.
* In the blog post [How to Learn Data Science without Coding](https://soshnikov.com/azure/how-to-learn-data-science-without-coding/) we explore Instagram photos, trying to understand what makes people give more likes to a photo. We first extract as much information from pictures as possible using [computer vision](https://azure.microsoft.com/services/cognitive-services/computer-vision/?WT.mc_id=academic-31812-dmitryso), and then use [Azure Machine Learning AutoML](https://docs.microsoft.com/azure/machine-learning/concept-automated-ml/?WT.mc_id=academic-31812-dmitryso) to build interpretable model.
* In [Facial Studies Workshop](https://github.com/CloudAdvocacy/FaceStudies) we use [Face API](https://azure.microsoft.com/services/cognitive-services/face/?WT.mc_id=academic-31812-dmitryso) to extract emotions on people on photographs from events, in order to try to understand what makes people happy.
## Conclusion
@ -271,7 +271,7 @@ Whether you already have structured or unstructured data, using Python you can p
**Learning Python**
* [Learn Python in a Fun Way with Turtle Graphics and Fractals](https://github.com/shwars/pycourse)
* [Take your First Steps with Python](https://docs.microsoft.com/learn/paths/python-first-steps/?WT.mc_id=acad-31812-dmitryso) Learning Path on [Microsoft Learn](http://learn.microsoft.com/?WT.mc_id=acad-31812-dmitryso)
* [Take your First Steps with Python](https://docs.microsoft.com/learn/paths/python-first-steps/?WT.mc_id=academic-31812-dmitryso) Learning Path on [Microsoft Learn](http://learn.microsoft.com/?WT.mc_id=academic-31812-dmitryso)
## Assignment

@ -7,7 +7,7 @@
"\r\n",
"In this challenge, we will continue with the topic of COVID pandemic, and focus on processing scientific papers on the subject. There is [CORD-19 Dataset](https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge) with more than 7000 (at the time of writing) papers on COVID, available with metadata and abstracts (and for about half of them there is also full text provided).\r\n",
"\r\n",
"A full example of analyzing this dataset using [Text Analytics for Health](https://docs.microsoft.com/azure/cognitive-services/text-analytics/how-tos/text-analytics-for-health/?WT.mc_id=acad-31812-dmitryso) cognitive service is described [in this blog post](https://soshnikov.com/science/analyzing-medical-papers-with-azure-and-text-analytics-for-health/). We will discuss simplified version of this analysis."
"A full example of analyzing this dataset using [Text Analytics for Health](https://docs.microsoft.com/azure/cognitive-services/text-analytics/how-tos/text-analytics-for-health/?WT.mc_id=academic-31812-dmitryso) cognitive service is described [in this blog post](https://soshnikov.com/science/analyzing-medical-papers-with-azure-and-text-analytics-for-health/). We will discuss simplified version of this analysis."
],
"metadata": {}
},

File diff suppressed because it is too large

@ -0,0 +1,33 @@
# Visualizations
![A bee on a lavender flower](../images/bee.jpg)
> Photo by <a href="https://unsplash.com/@jenna2980?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Jenna Lee</a> on <a href="https://unsplash.com/s/photos/bees-in-a-meadow?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a>
Visualizing data is one of the most important tasks of a data scientist. Images are worth 1000 words, and a visualization can help you identify all kinds of interesting parts of your data, such as spikes, outliers, groupings, tendencies, and more, which can help you understand the story your data is trying to tell.
In these five lessons, you will explore data sourced from nature and create interesting and beautiful visualizations using various techniques.
### Topics
1. [Visualizing Quantities](09-visualization-quantities/README.md)
1. [Visualizing Distributions](10-visualization-distributions/README.md)
1. [Visualizing Proportions](11-visualization-proportions/README.md)
1. [Visualizing Relationships](12-visualization-relationships/README.md)
1. [Making Meaningful Visualizations](13-meaningful-visualizations/README.md)
### Credits
These visualization lessons were written with 🌸 by [Jen Looper](https://twitter.com/jenlooper)
🍯 Data for US Honey Production is sourced from Jessica Li's project on [Kaggle](https://www.kaggle.com/jessicali9530/honey-production). The [data](https://usda.library.cornell.edu/concern/publications/rn301137d) is derived from the [United States Department of Agriculture](https://www.nass.usda.gov/About_NASS/index.php).
🍄 Data for mushrooms is also sourced from [Kaggle](https://www.kaggle.com/hatterasdunton/mushroom-classification-updated-dataset), revised by Hatteras Dunton. This dataset includes descriptions of hypothetical samples corresponding to 23 species of gilled mushrooms in the Agaricus and Lepiota family. The mushrooms are drawn from The Audubon Society Field Guide to North American Mushrooms (1981). This dataset was donated to UCI ML 27 in 1987.
🦆 Data for Minnesota Birds is from [Kaggle](https://www.kaggle.com/hannahcollins/minnesota-birds), scraped from [Wikipedia](https://en.wikipedia.org/wiki/List_of_birds_of_Minnesota) by Hannah Collins.
All these datasets are licensed as [CC0: Creative Commons](https://creativecommons.org/publicdomain/zero/1.0/).

@ -0,0 +1,13 @@
# The Data Science Lifecycle
![Communication](../images/communication.jpg)
> Photo by <a href="https://unsplash.com/@headwayio?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Headway</a> on <a href="https://unsplash.com/s/photos/communication?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a>
In these lessons, you will explore some aspects of the data science lifecycle, including analysis and communication around data.
### Topics
1. [Introduction](../14-Introduction/README.md)
2. [Analyzing](../15-analyzing/README.md)
3. [Communication](../16-communication/README.md)
### Credits
These lessons were written with ❤️ by [Jalen McGee](https://twitter.com/JalenMCG) and [Jasmine Greenaway](https://twitter.com/paladique).

@ -14,7 +14,7 @@ Azure Cloud Advocates at Microsoft are pleased to offer a 10-week, 20-lesson cur
**Hearty thanks to our authors:** [Jasmine Greenaway](https://www.twitter.com/paladique), [Dmitry Soshnikov](http://soshnikov.com), [Nitya Narasimhan](https://twitter.com/nitya), [Jalen McGee](https://twitter.com/JalenMcG), [Jen Looper](https://twitter.com/jenlooper), [Maud Levy](https://twitter.com/maudstweets), [Tiffany Souterre](https://twitter.com/TiffanySouterre), [Christopher Harrison](https://www.twitter.com/geektrainer).
**🙏 Special thanks 🙏 to our Microsoft Student Ambassador authors, reviewers and content contributors,** notably [Raymond Wangsa Putra](https://www.linkedin.com/in/raymond-wp/), [Ankita Singh](https://www.linkedin.com/in/ankitasingh007), [Rohit Yadav](https://www.linkedin.com/in/rty2423), [Arpita Das](https://www.linkedin.com/in/arpitadas01/), [Mohamma Iftekher (Iftu) Ebne Jalal](https://twitter.com/iftu119), [Dishita Bhasin](https://www.linkedin.com/in/dishita-bhasin-7065281bb), [Miguel Correa](https://www.linkedin.com/in/miguelmque/), [Nawrin Tabassum](https://www.linkedin.com/in/nawrin-tabassum), [Sanya Sinha](https://www.linkedin.com/mwlite/in/sanya-sinha-13aab1200), [Majd Safi](https://www.linkedin.com/in/majd-s/), [Sheena Narula](https://www.linkedin.com/in/sheena-narula-n/), [Anupam Mishra](https://www.linkedin.com/in/anupam--mishra/), [Dibri Nsofor](https://www.linkedin.com/in/dibrinsofor), [Aditya Garg](https://github.com/AdityaGarg00), [Alondra Sanchez](https://www.linkedin.com/in/alondra-sanchez-molina/), Yogendrasingh Pawar, Max Blum, Samridhi Sharma, Tauqeer Ahmad, Aaryan Arora, ChhailBihari Dubey
**🙏 Special thanks 🙏 to our [Microsoft Student Ambassador](https://studentambassadors.microsoft.com/) authors, reviewers and content contributors,** notably [Raymond Wangsa Putra](https://www.linkedin.com/in/raymond-wp/), [Ankita Singh](https://www.linkedin.com/in/ankitasingh007), [Rohit Yadav](https://www.linkedin.com/in/rty2423), [Arpita Das](https://www.linkedin.com/in/arpitadas01/), [Mohamma Iftekher (Iftu) Ebne Jalal](https://twitter.com/iftu119), [Dishita Bhasin](https://www.linkedin.com/in/dishita-bhasin-7065281bb), [Miguel Correa](https://www.linkedin.com/in/miguelmque/), [Nawrin Tabassum](https://www.linkedin.com/in/nawrin-tabassum), [Sanya Sinha](https://www.linkedin.com/mwlite/in/sanya-sinha-13aab1200), [Majd Safi](https://www.linkedin.com/in/majd-s/), [Sheena Narula](https://www.linkedin.com/in/sheena-narula-n/), [Anupam Mishra](https://www.linkedin.com/in/anupam--mishra/), [Dibri Nsofor](https://www.linkedin.com/in/dibrinsofor), [Aditya Garg](https://github.com/AdityaGarg00), [Alondra Sanchez](https://www.linkedin.com/in/alondra-sanchez-molina/), [Max Blum](https://www.linkedin.com/in/max-blum-6036a1186/), Yogendrasingh Pawar, Samridhi Sharma, Tauqeer Ahmad, Aaryan Arora, ChhailBihari Dubey
|![ Sketchnote by [(@sketchthedocs)](https://sketchthedocs.dev) ](./sketchnotes/00-Title.png)|
|:---:|
@ -78,8 +78,8 @@ In addition, a low-stakes quiz before a class sets the intention of the student
| 12 | Visualizing Relationships | [Data Visualization](3-Data-Visualization/README.md) | Visualizing connections and correlations between sets of data and their variables. | [lesson](3-Data-Visualization/12-visualization-relationships/README.md) | [Jen](https://twitter.com/jenlooper) |
| 13 | Meaningful Visualizations | [Data Visualization](3-Data-Visualization/README.md) | Techniques and guidance for making your visualizations valuable for effective problem solving and insights. | [lesson](3-Data-Visualization/13-meaningful-visualizations/README.md) | [Jen](https://twitter.com/jenlooper) |
| 14 | Introduction to the Data Science lifecycle | [Lifecycle](4-Data-Science-Lifecycle/README.md) | Introduction to the data science lifecycle and its first step of acquiring and extracting data. | [lesson](4-Data-Science-Lifecycle/14-Introduction/README.md) | [Jasmine](https://twitter.com/paladique) |
| 15 | Analyzing | [Lifecycle](4-Data-Science-Lifecycle/README.md) | This phase of the data science lifecycle focuses on techniques to analyze data. | [lesson](4-Data-Science-Lifecycle/15-Analyzing/README.md) | [Jasmine](https://twitter.com/paladique) | | |
| 16 | Communication | [Lifecycle](4-Data-Science-Lifecycle/README.md) | This phase of the data science lifecycle focuses on presenting the insights from the data in a way that makes it easier for decision makers to understand. | [lesson](4-Data-Science-Lifecycle/16-Communication/README.md) | [Jalen](https://twitter.com/JalenMcG) | | |
| 15 | Analyzing | [Lifecycle](4-Data-Science-Lifecycle/README.md) | This phase of the data science lifecycle focuses on techniques to analyze data. | [lesson](4-Data-Science-Lifecycle/15-analyzing/README.md) | [Jasmine](https://twitter.com/paladique) | | |
| 16 | Communication | [Lifecycle](4-Data-Science-Lifecycle/README.md) | This phase of the data science lifecycle focuses on presenting the insights from the data in a way that makes it easier for decision makers to understand. | [lesson](4-Data-Science-Lifecycle/16-communication/README.md) | [Jalen](https://twitter.com/JalenMcG) | | |
| 17 | Data Science in the Cloud | [Cloud Data](5-Data-Science-In-Cloud/README.md) | This series of lessons introduces data science in the cloud and its benefits. | [lesson](5-Data-Science-In-Cloud/17-Introduction/README.md) | [Tiffany](https://twitter.com/TiffanySouterre) and [Maud](https://twitter.com/maudstweets) |
| 18 | Data Science in the Cloud | [Cloud Data](5-Data-Science-In-Cloud/README.md) | Training models using Low Code tools. |[lesson](5-Data-Science-In-Cloud/18-Low-Code/README.md) | [Tiffany](https://twitter.com/TiffanySouterre) and [Maud](https://twitter.com/maudstweets) |
| 19 | Data Science in the Cloud | [Cloud Data](5-Data-Science-In-Cloud/README.md) | Deploying models with Azure Machine Learning Studio. | [lesson](5-Data-Science-In-Cloud/19-Azure/README.md)| [Tiffany](https://twitter.com/TiffanySouterre) and [Maud](https://twitter.com/maudstweets) |

@ -0,0 +1,111 @@
<div dir="rtl">
# Data Science for Beginners - A Curriculum
[![GitHub license](https://img.shields.io/github/license/microsoft/Data-Science-For-Beginners.svg)](https://github.com/microsoft/Data-Science-For-Beginners/blob/master/LICENSE)
[![GitHub contributors](https://img.shields.io/github/contributors/microsoft/Data-Science-For-Beginners.svg)](https://GitHub.com/microsoft/Data-Science-For-Beginners/graphs/contributors/)
[![GitHub issues](https://img.shields.io/github/issues/microsoft/Data-Science-For-Beginners.svg)](https://GitHub.com/microsoft/Data-Science-For-Beginners/issues/)
[![GitHub pull-requests](https://img.shields.io/github/issues-pr/microsoft/Data-Science-For-Beginners.svg)](https://GitHub.com/microsoft/Data-Science-For-Beginners/pulls/)
[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=flat-square)](http://makeapullrequest.com)
[![GitHub watchers](https://img.shields.io/github/watchers/microsoft/Data-Science-For-Beginners.svg?style=social&label=Watch)](https://GitHub.com/microsoft/Data-Science-For-Beginners/watchers/)
[![GitHub forks](https://img.shields.io/github/forks/microsoft/Data-Science-For-Beginners.svg?style=social&label=Fork)](https://GitHub.com/microsoft/Data-Science-For-Beginners/network/)
[![GitHub stars](https://img.shields.io/github/stars/microsoft/Data-Science-For-Beginners.svg?style=social&label=Star)](https://GitHub.com/microsoft/Data-Science-For-Beginners/stargazers/)
Azure Cloud Advocates at Microsoft are pleased to offer a 10-week, 20-lesson curriculum all about data science. Each lesson includes pre-lesson and post-lesson quizzes, written instructions to complete the lesson, a solution, and an assignment. Our project-based pedagogy allows you to learn while building, a proven way for new skills to "stick".
**Hearty thanks to our authors:** [Jasmine Greenaway](https://www.twitter.com/paladique), [Dmitry Soshnikov](http://soshnikov.com), [Nitya Narasimhan](https://twitter.com/nitya), [Jalen McGee](https://twitter.com/JalenMcG), [Jen Looper](https://twitter.com/jenlooper), [Maud Levy](https://twitter.com/maudstweets), [Tiffany Souterre](https://twitter.com/TiffanySouterre), [Christopher Harrison](https://www.twitter.com/geektrainer).
**🙏 Special thanks 🙏 to our Microsoft Student Ambassador authors, reviewers and content contributors,** notably [Raymond Wangsa Putra](https://www.linkedin.com/in/raymond-wp/), [Ankita Singh](https://www.linkedin.com/in/ankitasingh007), [Rohit Yadav](https://www.linkedin.com/in/rty2423), [Arpita Das](https://www.linkedin.com/in/arpitadas01/), [Mohamma Iftekher (Iftu) Ebne Jalal](https://twitter.com/iftu119), [Dishita Bhasin](https://www.linkedin.com/in/dishita-bhasin-7065281bb), [Miguel Correa](https://www.linkedin.com/in/miguelmque/), [Nawrin Tabassum](https://www.linkedin.com/in/nawrin-tabassum), [Sanya Sinha](https://www.linkedin.com/mwlite/in/sanya-sinha-13aab1200), [Majd Safi](https://www.linkedin.com/in/majd-s/), [Sheena Narula](https://www.linkedin.com/in/sheena-narula-n/), [Anupam Mishra](https://www.linkedin.com/in/anupam--mishra/), [Dibri Nsofor](https://www.linkedin.com/in/dibrinsofor), [Aditya Garg](https://github.com/AdityaGarg00), [Alondra Sanchez](https://www.linkedin.com/in/alondra-sanchez-molina/), Yogendrasingh Pawar, Max Blum, Samridhi Sharma, Tauqeer Ahmad, Aaryan Arora, ChhailBihari Dubey
|![ Sketchnote by [(@sketchthedocs)](https://sketchthedocs.dev) ](../sketchnotes/00-Title.png)|
|:---:|
| Data Science For Beginners - _Sketchnote by [@nitya](https://twitter.com/nitya)_ |
# Getting Started
> **Teachers**, we have [included some suggestions](../for-teachers.md) on how to use this curriculum. We would love to have your feedback in our [discussion forum](https://github.com/microsoft/Data-Science-For-Beginners/discussions)!
> **Students**, if you plan to use this curriculum on your own, fork the entire repo and complete the exercises by yourself. Start with the pre-lesson quiz, then read the lesson and complete the rest of the activities. Try to create the projects by understanding the lessons rather than copying the solution code. That said, the solution code is available in the /solutions folders of each project-oriented lesson. Another idea is to form a study group with friends so you can go through the content together. For further study, we recommend [Microsoft Learn](https://docs.microsoft.com/en-us/users/jenlooper-2911/collections/qprpajyoy3x0g7?WT.mc_id=academic-40229-cxa).
<!--[![Promo video](../screenshot.png)]( "Promo video")
> 🎥 Click the image above for a video about the project and the folks who created it!-->
## Pedagogy
We chose two pedagogical tenets while building this curriculum: ensuring that it is project-based and that it includes frequent quizzes. By the end of this series, students will have learned the basic principles of data science, including ethical concepts, data preparation, different ways of working with data, data visualization, data analysis, real-world use cases of data science, and more.
In addition, a low-stakes quiz before a class sets the intention of the student towards learning a topic, while a second quiz after class ensures further retention. This curriculum was designed to be flexible and fun, and it can be taken in whole or in part. The projects start small and become increasingly complex by the end of the 10-week cycle.
> See our [Code of Conduct](../CODE_OF_CONDUCT.md), [Contributing](../CONTRIBUTING.md), and [Translation](../TRANSLATIONS.md) guidelines. We welcome your constructive feedback!
## هر درس شامل:
- یادداشت های بصری (sketchnote) اختیاری
- فیلم های مکمل اختیاری
- کوییز های دست گرمی قبل از درس
- درسنامه مکتوب
- راهنمای گام به گام نحوه ساخت پروژه برای درس های مبتنی بر پروژه
- بررسی دانش
- یک چالش
- منابع خواندنی مکمل
- تمرین
- کوییز پس از درس
> **نکته ای در مورد آزمونها**: همه آزمون ها در [این برنامه](https://red-water-0103e7a0f.azurestaticapps.net/) موجود هستند؛ در مجموع ۴۰ کوییز که هرکدام شامل سه سوال می باشد. کوییزها از داخل درسنامه لینک داده شده اند اما برنامه کوییز را می توان به صورت محلی نیز اجرا کرد. برای اینکار، دستورالعمل موجود در پوشه `quiz-app` را دنبال کنید. سوالات به تدریج در حال محلی سازی هستند.
## درسنامه
|![ یادداشت بصری (Sketchnote) از [(@sketchthedocs)](https://sketchthedocs.dev) ](../sketchnotes/00-Roadmap.png)|
|:---:|
| علم داده برای مبتدیان: نقشه راه - _یادداشت بصری از [@nitya](https://twitter.com/nitya)_ |
| شماره درس | موضوع | گروه بندی درس | اهداف یادگیری | درس پیوند شده | نویسنده |
| :-----------: | :----------------------------------------: | :--------------------------------------------------: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------------: | :---------------------------------------------------------------------: | :----: |
| ۱ | تعریف علم داده | [معرفی](../1-Introduction/README.md) | مفاهیم اساسی علم داده و نحوه ارتباط آن با هوش مصنوعی، یادگیری ماشین و کلان داده را بیاموزید. | [درسنامه](../1-Introduction/01-defining-data-science/README.md) [ویدیو](https://youtu.be/pqqsm5reGvs) | [Dmitry](http://soshnikov.com) |
| ۲ | اصول اخلاقی علم داده | [معرفی](../1-Introduction/README.md) | مفاهیم اخلاق داده ها، چالش ها و چارچوب ها. | [درسنامه](../1-Introduction/02-ethics/README.md) | [Nitya](https://twitter.com/nitya) |
| ۳ | تعریف داده | [معرفی](../1-Introduction/README.md) | نحوه دسته بندی داده ها و منابع رایج آن. | [درسنامه](../1-Introduction/03-defining-data/README.md) | [Jasmine](https://www.twitter.com/paladique) |
| ۴ | مقدمه ای بر آمار و احتمال | [معرفی](../1-Introduction/README.md) | تکنیک های ریاضی آمار و احتمال برای درک داده ها. | [درسنامه](../1-Introduction/04-stats-and-probability/README.md) [ویدیو](https://youtu.be/Z5Zy85g4Yjw) | [Dmitry](http://soshnikov.com) |
| ۵ | کار با داده های رابطه ای | [کار با داده ها](../2-Working-With-Data/README.md) | مقدمه ای بر داده های رابطه ای و مبانی اکتشاف و تجزیه و تحلیل داده های رابطه ای با زبان پرس و جوی ساختار یافته، که به SQL نیز معروف است (تلفظ کنید “see-quell”). | [درسنامه](../2-Working-With-Data/05-relational-databases/README.md) | [Christopher](https://www.twitter.com/geektrainer) |
| ۶ | کار با داده های NoSQL | [کار با داده ها](../2-Working-With-Data/README.md) | مقدمه ای بر داده های غیر رابطه ای، انواع مختلف آن و مبانی کاوش و تجزیه و تحلیل پایگاه داده های اسناد(document databases). | [درسنامه](../2-Working-With-Data/06-non-relational/README.md) | [Jasmine](https://twitter.com/paladique)|
| ۷ | کار با پایتون | [کار با داده ها](../2-Working-With-Data/README.md) | اصول استفاده از پایتون برای کاوش داده با کتابخانه هایی مانند Pandas. توصیه می شود مبانی برنامه نویسی پایتون را بلد باشید. | [درسنامه](../2-Working-With-Data/07-python/README.md) [ویدیو](https://youtu.be/dZjWOGbsN4Y) | [Dmitry](http://soshnikov.com) |
| ۸ | آماده سازی داده ها | [کار با داده ها](../2-Working-With-Data/README.md) | مباحث مربوط به تکنیک های داده ای برای پاکسازی و تبدیل داده ها به منظور رسیدگی به چالش های داده های مفقود شده، نادرست یا ناقص. | [درسنامه](../2-Working-With-Data/08-data-preparation/README.md) | [Jasmine](https://www.twitter.com/paladique) |
| ۹ | تصویرسازی مقادیر | [تصویرسازی داده ها](../3-Data-Visualization/README.md) | نحوه استفاده از Matplotlib برای تصویرسازی داده های پرندگان را می آموزید. 🦆 | [درسنامه](../3-Data-Visualization/09-visualization-quantities/README.md) | [Jen](https://twitter.com/jenlooper) |
| ۱۰ | تصویرسازی توزیع داده ها | [تصویرسازی داده ها](../3-Data-Visualization/README.md) | تصویرسازی مشاهدات و روندها در یک بازه زمانی. | [درسنامه](../3-Data-Visualization/10-visualization-distributions/README.md) | [Jen](https://twitter.com/jenlooper) |
| ۱۱ | تصویرسازی نسبت ها | [تصویرسازی داده ها](../3-Data-Visualization/README.md) | تصویرسازی درصدهای مجزا و گروهی. | [درسنامه](../3-Data-Visualization/11-visualization-proportions/README.md) | [Jen](https://twitter.com/jenlooper) |
| ۱۲ | تصویرسازی روابط | [تصویرسازی داده ها](../3-Data-Visualization/README.md) | تصویرسازی ارتباطات و همبستگی بین مجموعه داده ها و متغیرهای آنها. | [درسنامه](../3-Data-Visualization/12-visualization-relationships/README.md) | [Jen](https://twitter.com/jenlooper) |
| ۱۳ | تصویرسازی های معنی دار | [تصویرسازی داده ها](../3-Data-Visualization/README.md) | تکنیک ها و راهنمایی هایی برای تبدیل تصویرسازی های شما به خروجی های ارزشمندی جهت حل موثرتر مشکلات و بینش ها. | [درسنامه](../3-Data-Visualization/13-meaningful-visualizations/README.md) | [Jen](https://twitter.com/jenlooper) |
| ۱۴ | مقدمه ای بر چرخه حیات علم داده | [چرخه حیات](../4-Data-Science-Lifecycle/README.md) | مقدمه ای بر چرخه حیات علم داده و اولین گام آن برای دستیابی به داده ها و استخراج آن ها. | [درسنامه](../4-Data-Science-Lifecycle/14-Introduction/README.md) | [Jasmine](https://twitter.com/paladique) |
| ۱۵ | تجزیه و تحلیل | [چرخه حیات](../4-Data-Science-Lifecycle/README.md) | این مرحله از چرخه حیات علم داده بر تکنیک های تجزیه و تحلیل داده ها متمرکز است. | [درسنامه](../4-Data-Science-Lifecycle/15-Analyzing/README.md) | [Jasmine](https://twitter.com/paladique) |
| ۱۶ | ارتباطات | [چرخه حیات](../4-Data-Science-Lifecycle/README.md) | این مرحله از چرخه حیات علم داده بر روی ارائه بینش از داده ها به نحوی که درک آنها را برای تصمیم گیرندگان آسان تر بکند، متمرکز است. | [درسنامه](../4-Data-Science-Lifecycle/16-Communication/README.md) | [Jalen](https://twitter.com/JalenMcG) |
| ۱۷ | علم داده در فضای ابری | [داده های ابری](../5-Data-Science-In-Cloud/README.md) | این سری از درسنامه ها علم داده در فضای ابری و مزایای آن را معرفی می کند. | [درسنامه](../5-Data-Science-In-Cloud/17-Introduction/README.md) | [Tiffany](https://twitter.com/TiffanySouterre) و [Maud](https://twitter.com/maudstweets) |
| ۱۸ | علم داده در فضای ابری | [داده های ابری](../5-Data-Science-In-Cloud/README.md) | آموزش مدل ها با استفاده از ابزارهای کد کمتر(low code). |[درسنامه](../5-Data-Science-In-Cloud/18-Low-Code/README.md) | [Tiffany](https://twitter.com/TiffanySouterre) و [Maud](https://twitter.com/maudstweets) |
| ۱۹ | علم داده در فضای ابری | [داده های ابری](../5-Data-Science-In-Cloud/README.md) | استقرار(Deploy) مدل ها با استفاده از استودیوی یادگیری ماشین آژور(Azure Machine Learning Studio). | [درسنامه](../5-Data-Science-In-Cloud/19-Azure/README.md)| [Tiffany](https://twitter.com/TiffanySouterre) و [Maud](https://twitter.com/maudstweets) |
| ۲۰ | علم داده در طبیعت | [در طبیعت](../6-Data-Science-In-Wild/README.md) | پروژه های علم داده در دنیای واقعی. | [درسنامه](../6-Data-Science-In-Wild/20-Real-World-Examples/README.md) | [Nitya](https://twitter.com/nitya) |
## دسترسی آفلاین
شما می توانید این سند را با استفاده از [Docsify](https://docsify.js.org/#/) به صورت آفلاین اجرا کنید. این مخزن را فورک کنید، [Docsify را روی دستگاه محلی خود نصب کنید](https://docsify.js.org/#/quickstart)، سپس در شاخه اصلی(root) این مخزن، بنویسید `docsify serve`. وب سایت در پورت 3000 روی localhost شما ارائه می شود: `localhost:3000`.
> توجه داشته باشید، نوت بوک ها توسط Docsify نمایش داده نمی شوند، بنابراین هنگامی که شما نیاز به اجرای یک نوت بوک دارید، این کار را به صورت جداگانه در VS Code با اجرای یک کرنل پایتون انجام دهید.
## پی دی اف
یک پی دی اف شامل همه درسها را می توان [اینجا](https://microsoft.github.io/Data-Science-For-Beginners/pdf/readme.pdf) یافت.
## به کمک شما نیازمندیم!
اگر می خواهید تمام یا بخشی از برنامه درسی را ترجمه کنید، لطفاً طبق راهنمای [ترجمه ها](../TRANSLATIONS.md)ی ما عمل کنید.
## سایر برنامه های درسی
تیم ما برنامه های درسی دیگری نیز تولید می کند! بدین منظور ببینید:
- [یادگیری ماشین برای مبتدیان](https://aka.ms/ml-beginners)
- [اینترنت اشیا برای مبتدیان](https://aka.ms/iot-beginners)
- [توسعه سایت برای مبتدیان](https://aka.ms/webdev-beginners)
</div>

@ -0,0 +1,106 @@
# La Data Science pour les débutants - Curriculum
[![GitHub license](https://img.shields.io/github/license/microsoft/Data-Science-For-Beginners.svg)](https://github.com/microsoft/Data-Science-For-Beginners/blob/master/LICENSE)
[![GitHub contributors](https://img.shields.io/github/contributors/microsoft/Data-Science-For-Beginners.svg)](https://GitHub.com/microsoft/Data-Science-For-Beginners/graphs/contributors/)
[![GitHub issues](https://img.shields.io/github/issues/microsoft/Data-Science-For-Beginners.svg)](https://GitHub.com/microsoft/Data-Science-For-Beginners/issues/)
[![GitHub pull-requests](https://img.shields.io/github/issues-pr/microsoft/Data-Science-For-Beginners.svg)](https://GitHub.com/microsoft/Data-Science-For-Beginners/pulls/)
[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=flat-square)](http://makeapullrequest.com)
[![GitHub watchers](https://img.shields.io/github/watchers/microsoft/Data-Science-For-Beginners.svg?style=social&label=Watch)](https://GitHub.com/microsoft/Data-Science-For-Beginners/watchers/)
[![GitHub forks](https://img.shields.io/github/forks/microsoft/Data-Science-For-Beginners.svg?style=social&label=Fork)](https://GitHub.com/microsoft/Data-Science-For-Beginners/network/)
[![GitHub stars](https://img.shields.io/github/stars/microsoft/Data-Science-For-Beginners.svg?style=social&label=Star)](https://GitHub.com/microsoft/Data-Science-For-Beginners/stargazers/)
L'équipe Azure Cloud Advocates de Microsoft a le plaisir de vous offrir un curriculum d'apprentissage de la Data Science, ou "science des données" en français, comprenant vingt cours à étudier sur une durée d'environ dix semaines. Chaque cours comprend un quiz préalable, un quiz à effectuer après le cours, ainsi que des instructions, un exercice et une solution. Notre pédagogie vous permet d'apprendre tout en réalisant des projets, ce qui aide à bien intégrer les nouvelles compétences que vous allez acquérir.
**Un grand merci à nos auteurs :** [Jasmine Greenaway](https://www.twitter.com/paladique), [Dmitry Soshnikov](http://soshnikov.com), [Nitya Narasimhan](https://twitter.com/nitya), [Jalen McGee](https://twitter.com/JalenMcG), [Jen Looper](https://twitter.com/jenlooper), [Maud Levy](https://twitter.com/maudstweets), [Tiffany Souterre](https://twitter.com/TiffanySouterre), [Christopher Harrison](https://www.twitter.com/geektrainer).
**🙏 Nous remercions également particulièrement 🙏 les auteurs, correcteurs et contributeurs membres du programme Microsoft Learn Student Ambassadors**, notamment [Raymond Wangsa Putra](https://www.linkedin.com/in/raymond-wp/), [Ankita Singh](https://www.linkedin.com/in/ankitasingh007), [Rohit Yadav](https://www.linkedin.com/in/rty2423), [Arpita Das](https://www.linkedin.com/in/arpitadas01/), [Mohamma Iftekher (Iftu) Ebne Jalal](https://twitter.com/iftu119), [Dishita Bhasin](https://www.linkedin.com/in/dishita-bhasin-7065281bb), [Miguel Correa](https://www.linkedin.com/in/miguelmque/), [Nawrin Tabassum](https://www.linkedin.com/in/nawrin-tabassum), [Sanya Sinha](https://www.linkedin.com/mwlite/in/sanya-sinha-13aab1200), [Majd Safi](https://www.linkedin.com/in/majd-s/), [Sheena Narula](https://www.linkedin.com/in/sheena-narula-n/), [Anupam Mishra](https://www.linkedin.com/in/anupam--mishra/), [Dibri Nsofor](https://www.linkedin.com/in/dibrinsofor), [Aditya Garg](https://github.com/AdityaGarg00), [Alondra Sanchez](https://www.linkedin.com/in/alondra-sanchez-molina/), Yogendrasingh Pawar, Max Blum, Samridhi Sharma, Tauqeer Ahmad, Aaryan Arora, ChhailBihari Dubey
|![ Sketchnote by [(@sketchthedocs)](https://sketchthedocs.dev) ](../sketchnotes/00-Title.png)|
|:---:|
| Data Science For Beginners - _Sketchnote réalisé par [@nitya](https://twitter.com/nitya)_ |
# Prise en main
> **Enseignants**, nous avons [inclus des suggestions](../for-teachers.md) concernant la manière dont vous pouvez utiliser ce curriculum. Nous aimerions beaucoup lire vos feedbacks [dans notre forum de discussion](https://github.com/microsoft/Data-Science-For-Beginners/discussions) !
> **Étudiants**, pour suivre ce curriculum, la première chose à faire est de forker ce repository en entier ; vous pourrez ensuite réaliser les exercices de votre côté, en commençant par un quiz préalable, en lisant le contenu des cours, et en complétant le reste des activités. Essayez de créer les projets en intégrant bien les cours, plutôt qu'en copiant les solutions. Vous verrez que chaque cours orienté projet contient un dossier /solutions dans lequel vous trouverez la solution des exercices. Vous pouvez aussi former un groupe d'apprentissage avec des amis et vous former ensemble. Pour poursuivre votre apprentissage, nous recommandons d'aller consulter [Microsoft Learn](https://docs.microsoft.com/en-us/users/jenlooper-2911/collections/qprpajyoy3x0g7?WT.mc_id=academic-40229-cxa).
<!--[![Promo video](../screenshot.png)]( "Promo video")
> 🎥 Cliquez sur l'image ci-dessus pour regarder la vidéo de présentation du projet réalisée par les auteurs du curriculum !-->
## Pédagogie
Nous avons choisi deux principes pédagogiques lors de la création de ce programme d'études : veiller à ce qu'il soit basé sur des projets et à ce qu'il comprenne des quiz fréquents. À la fin de cette série, les élèves auront appris les principes de base de la data science, notamment les concepts éthiques, la préparation des données, les différentes façons de travailler avec les données, la visualisation des données, l'analyse des données, des cas d'utilisation réels de data science, etc.
De plus, un quiz à faible enjeu à réaliser avant chaque cours permet de préparer l'étudiant à l'apprentissage du sujet, et un second quiz après le cours permet de fixer encore davantage le contenu dans l'esprit des apprenants. Ce curriculum se veut flexible et amusant et il peut être suivi dans son intégralité ou en partie. Les premiers projets sont modestes et deviennent de plus en plus ardus au fil des dix semaines.
> Quelques liens utiles : [Code de conduite](../CODE_OF_CONDUCT.md), [Comment contribuer](../CONTRIBUTING.md), [Traductions](../TRANSLATIONS.md). Tout feedback constructif sera le bienvenu !
## Chaque cours comprend :
- Un sketchnote optionnel
- Une vidéo complémentaire optionnelle
- Un quiz préalable
- Un cours écrit
- Pour les cours basés sur des projets à réaliser : un guide de création du projet
- Des vérifications de connaissances
- Un challenge
- De la lecture complémentaire
- Un exercice
- Un quiz de fin
> **Concernant les quiz** : Vous pourrez retrouver tous les quiz [dans cette application](https://red-water-0103e7a0f.azurestaticapps.net/). Il y a 40 quiz, avec trois questions chacun. Vous les retrouverez dans chaque cours correspondant, mais vous pouvez aussi utiliser l'application de quiz en local en suivant les instructions disponibles dans le dossier `quiz-app`. Les quiz sont en cours de localisation.
## Cours
|![ Sketchnote réalisé par [(@sketchthedocs)](https://sketchthedocs.dev) ](../sketchnotes/00-Roadmap.png)|
|:---:|
| Data Science For Beginners: Roadmap - _Sketchnote réalisé par [@nitya](https://twitter.com/nitya)_ |
| Numéro du cours | Sujet | Chapitre | Objectifs d'apprentissage | Liens vers les cours | Auteurs |
| :-----------: | :----------------------------------------: | :--------------------------------------------------: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------------: | :---------------------------------------------------------------------: | :----: |
| 01 | Qu'est-ce que la Data Science ? | [Introduction](../1-Introduction/README.md) | Apprenez les concepts de base de la data science et le lien entre la data science, l'intelligence artificielle, le machine learning et la big data. | [cours](../1-Introduction/01-defining-data-science/README.md) [vidéo](https://youtu.be/pqqsm5reGvs) | [Dmitry](http://soshnikov.com) |
| 02 | Data Science et éthique | [Introduction](../1-Introduction/README.md) | Les concepts d'éthique dans le domaine des données, les challenges et les principes d'encadrement. | [cours](../1-Introduction/02-ethics/README.md) | [Nitya](https://twitter.com/nitya) |
| 03 | Définition de la data | [Introduction](../1-Introduction/README.md) | Comment classifier les données et d'où viennent-elles principalement ? | [cours](../1-Introduction/03-defining-data/README.md) | [Jasmine](https://www.twitter.com/paladique) |
| 04 | Introduction aux statistiques et aux probabilités | [Introduction](../1-Introduction/README.md) | Techniques mathématiques de probabilités et de statistiques au service de la data. | [cours](../1-Introduction/04-stats-and-probability/README.md) [vidéo](https://youtu.be/Z5Zy85g4Yjw) | [Dmitry](http://soshnikov.com) |
| 05 | Utilisation de données relationnelles | [Exploiter des données](../2-Working-With-Data/README.md) | Introduction aux données relationnelles et aux bases d'exploration et d'analyse des données relationnelles avec le Structured Query Language, alias SQL (prononcé “sicouel”). | [cours](../2-Working-With-Data/05-relational-databases/README.md) | [Christopher](https://www.twitter.com/geektrainer) |
| 06 | Utilisation de données NoSQL | [Exploiter des données](../2-Working-With-Data/README.md) | Présentation des données non relationnelles, les types de données et les fondamentaux de l'exploration et de l'analyse de bases de données documentaires. | [cours](../2-Working-With-Data/06-non-relational/README.md) | [Jasmine](https://twitter.com/paladique)|
| 07 | Utilisation de Python | [Exploiter des données](../2-Working-With-Data/README.md) | Les principes de base de Python pour l'exploration de données, et les librairies courantes telles que Pandas. Des connaissances de base de la programmation Python sont recommandées pour ce cours.| [cours](../2-Working-With-Data/07-python/README.md) [vidéo](https://youtu.be/dZjWOGbsN4Y) | [Dmitry](http://soshnikov.com) |
| 08 | Préparation des données | [Exploiter des données](../2-Working-With-Data/README.md) | Techniques de nettoyage et de transformation des données pour gérer des données manquantes, inexactes ou incomplètes. | [cours](../2-Working-With-Data/08-data-preparation/README.md) | [Jasmine](https://www.twitter.com/paladique) |
| 09 | Visualisation des quantités | [Data Visualization](../3-Data-Visualization/README.md) | Apprendre à utiliser Matplotlib pour visualiser des données sur les oiseaux 🦆 | [cours](../3-Data-Visualization/09-visualization-quantities/README.md) | [Jen](https://twitter.com/jenlooper) |
| 10 | Visualisation de la distribution des données | [Data Visualization](../3-Data-Visualization/README.md) | Visualisation d'observations et de tendances dans un intervalle. | [cours](../3-Data-Visualization/10-visualization-distributions/README.md) | [Jen](https://twitter.com/jenlooper) |
| 11 | Visualiser des proportions | [Data Visualization](../3-Data-Visualization/README.md) | Visualisation de pourcentages discrets et groupés. | [cours](../3-Data-Visualization/11-visualization-proportions/README.md) | [Jen](https://twitter.com/jenlooper) |
| 12 | Visualisation de relations | [Data Visualization](../3-Data-Visualization/README.md) | Visualisation de connexions et de corrélations entre différents sets de données et leurs variables. | [cours](../3-Data-Visualization/12-visualization-relationships/README.md) | [Jen](https://twitter.com/jenlooper) |
| 13 | Visualisations significatives | [Data Visualization](../3-Data-Visualization/README.md) | Techniques et conseils pour donner de la valeur à vos visualisations, les rendre utiles à la compréhension et à la résolution de problèmes. | [cours](../3-Data-Visualization/13-meaningful-visualizations/README.md) | [Jen](https://twitter.com/jenlooper) |
| 14 | Introduction au cycle de vie de la Data Science | [Cycle de vie](../4-Data-Science-Lifecycle/README.md) | Présentation du cycle de la data science et des premières étapes d'acquisition et d'extraction des données. | [cours](../4-Data-Science-Lifecycle/14-Introduction/README.md) | [Jasmine](https://twitter.com/paladique) |
| 15 | Analyse | [Cycle de vie](../4-Data-Science-Lifecycle/README.md) | Cette étape du cycle de vie de la data science se concentre sur les techniques d'analyse des données. | [cours](../4-Data-Science-Lifecycle/15-Analyzing/README.md) | [Jasmine](https://twitter.com/paladique) |
| 16 | Communication | [Cycle de vie](../4-Data-Science-Lifecycle/README.md) | Cette étape du cycle de vie de la data science se concentre sur la présentation des informations tirées des données de manière à faciliter la compréhension d'une situation par des décisionnaires. | [cours](../4-Data-Science-Lifecycle/16-Communication/README.md) | [Jalen](https://twitter.com/JalenMcG) |
| 17 | La Data Science dans le Cloud | [Cloud Data](../5-Data-Science-In-Cloud/README.md) | Ce cours présente le Cloud et l'intérêt du Cloud pour la Data Science. | [cours](../5-Data-Science-In-Cloud/17-Introduction/README.md) | [Tiffany](https://twitter.com/TiffanySouterre) et [Maud](https://twitter.com/maudstweets) |
| 18 | La Data Science dans le Cloud | [Cloud Data](../5-Data-Science-In-Cloud/README.md) | Entraîner un modèle avec des outils de low code. |[cours](../5-Data-Science-In-Cloud/18-Low-Code/README.md) | [Tiffany](https://twitter.com/TiffanySouterre) et [Maud](https://twitter.com/maudstweets) |
| 19 | La Data Science dans le Cloud | [Cloud Data](../5-Data-Science-In-Cloud/README.md) | Déployer des modèles avec Azure Machine Learning Studio. | [cours](../5-Data-Science-In-Cloud/19-Azure/README.md)| [Tiffany](https://twitter.com/TiffanySouterre) et [Maud](https://twitter.com/maudstweets) |
| 20 | La Data Science dans la nature | [In the Wild](../6-Data-Science-In-Wild/README.md) | Des projets concrets de data science sur le terrain. | [cours](../6-Data-Science-In-Wild/20-Real-World-Examples/README.md) | [Nitya](https://twitter.com/nitya) |
## Accès hors ligne
Vous pouvez retrouver cette documentation hors ligne à l'aide de [Docsify](https://docsify.js.org/#/). Forkez ce repository, [installez Docsify](https://docsify.js.org/#/quickstart) sur votre machine locale, et tapez `docsify serve` dans le dossier racine de ce repository. Vous retrouverez le site web sur le port 3000 de votre localhost : `localhost:3000`.
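À titre d'illustration, les étapes ci-dessus peuvent se résumer aux commandes suivantes. Il s'agit d'une simple esquisse : elle suppose que Node.js et npm sont déjà installés et que votre fork a été cloné dans un dossier nommé `Data-Science-For-Beginners` (nom hypothétique, à adapter). La commande d'installation est celle proposée par le guide de démarrage de Docsify ; vérifiez-la pour votre version.

```shell
# Hypothèse : Node.js et npm sont déjà installés sur la machine.
# Installe l'outil en ligne de commande de Docsify (à faire une seule fois).
npm i docsify-cli -g

# Hypothèse : le fork a été cloné dans ce dossier (nom à adapter).
cd Data-Science-For-Beginners

# Sert le contenu du dossier courant, par défaut sur http://localhost:3000.
docsify serve
```

Arrêtez le serveur avec `Ctrl+C` lorsque vous avez terminé.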
> Remarque : les notebooks ne sont pas affichés par Docsify. Si vous souhaitez utiliser un notebook, vous pouvez le faire séparément à l'aide d'un kernel Python dans VS Code.
## PDF
Vous trouverez un PDF contenant tous les cours du curriculum [ici](https://microsoft.github.io/Data-Science-For-Beginners/pdf/readme.pdf).
## Appel à contribution
Si vous souhaitez traduire le curriculum entier ou en partie, veuillez suivre notre guide de [traduction](../TRANSLATIONS.md).
## Autres Curricula
Notre équipe a créé d'autres cours ! Ne manquez pas :
- [Le Machine Learning pour les débutants](https://aka.ms/ml-beginners)
- [L'IoT pour les débutants](https://aka.ms/iot-beginners)
- [Le développement Web pour les débutants](https://aka.ms/webdev-beginners)