From 3503f048609497bb058313989c6fcefe936e544b Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Fri, 3 Oct 2025 10:26:20 +0000 Subject: [PATCH] Add comprehensive documentation: installation, usage, and troubleshooting guides Co-authored-by: leestott <2511341+leestott@users.noreply.github.com> --- CONTRIBUTING.md | 348 +++++++++++++++++++++++++- INSTALLATION.md | 239 ++++++++++++++++++ README.md | 22 +- TROUBLESHOOTING.md | 611 +++++++++++++++++++++++++++++++++++++++++++++ USAGE.md | 360 ++++++++++++++++++++++++++ 5 files changed, 1575 insertions(+), 5 deletions(-) create mode 100644 INSTALLATION.md create mode 100644 TROUBLESHOOTING.md create mode 100644 USAGE.md diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index ebf23aca..2cfa8e2e 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -1,4 +1,338 @@ -# Contributing +# Contributing to Data Science for Beginners + +Thank you for your interest in contributing to the Data Science for Beginners curriculum! We welcome contributions from the community. + +## Table of Contents + +- [Code of Conduct](#code-of-conduct) +- [How Can I Contribute?](#how-can-i-contribute) +- [Getting Started](#getting-started) +- [Contribution Guidelines](#contribution-guidelines) +- [Pull Request Process](#pull-request-process) +- [Style Guidelines](#style-guidelines) +- [Contributor License Agreement](#contributor-license-agreement) + +## Code of Conduct + +This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). +For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) +or contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments. + +## How Can I Contribute? + +### Reporting Bugs + +Before creating bug reports, please check the existing issues to avoid duplicates. 
When you create a bug report, include as many details as possible: + +- **Use a clear and descriptive title** +- **Describe the exact steps to reproduce the problem** +- **Provide specific examples** (code snippets, screenshots) +- **Describe the behavior you observed and what you expected** +- **Include your environment details** (OS, Python version, browser) + +### Suggesting Enhancements + +Enhancement suggestions are welcome! When suggesting enhancements: + +- **Use a clear and descriptive title** +- **Provide a detailed description of the suggested enhancement** +- **Explain why this enhancement would be useful** +- **List any similar features in other projects, if applicable** + +### Contributing to Documentation + +Documentation improvements are always appreciated: + +- **Fix typos and grammatical errors** +- **Improve clarity of explanations** +- **Add missing documentation** +- **Update outdated information** +- **Add examples or use cases** + +### Contributing Code + +We welcome code contributions including: + +- **New lessons or exercises** +- **Bug fixes** +- **Improvements to existing notebooks** +- **New datasets or examples** +- **Quiz application enhancements** + +## Getting Started + +### Prerequisites + +Before contributing, ensure you have: + +1. A GitHub account +2. Git installed on your system +3. Python 3.7+ and Jupyter installed +4. Node.js and npm (for quiz app contributions) +5. Familiarity with the curriculum structure + +See [INSTALLATION.md](INSTALLATION.md) for detailed setup instructions. + +### Fork and Clone + +1. **Fork the repository** on GitHub +2. **Clone your fork** locally: + ```bash + git clone https://github.com/YOUR-USERNAME/Data-Science-For-Beginners.git + cd Data-Science-For-Beginners + ``` +3. 
**Add upstream remote**: + ```bash + git remote add upstream https://github.com/microsoft/Data-Science-For-Beginners.git + ``` + +### Create a Branch + +Create a new branch for your work: + +```bash +git checkout -b feature/your-feature-name +# or +git checkout -b fix/your-bug-fix +``` + +Branch naming conventions: +- `feature/` - New features or lessons +- `fix/` - Bug fixes +- `docs/` - Documentation changes +- `refactor/` - Code refactoring + +## Contribution Guidelines + +### For Lesson Content + +When contributing lessons or modifying existing ones: + +1. **Follow the existing structure**: + - README.md with lesson content + - Jupyter notebook with exercises + - Assignment (if applicable) + - Link to pre and post quizzes + +2. **Include these elements**: + - Clear learning objectives + - Step-by-step explanations + - Code examples with comments + - Exercises for practice + - Links to additional resources + +3. **Ensure accessibility**: + - Use clear, simple language + - Provide alt text for images + - Include code comments + - Consider different learning styles + +### For Jupyter Notebooks + +1. **Clear all outputs** before committing: + ```bash + jupyter nbconvert --clear-output --inplace notebook.ipynb + ``` + +2. **Include markdown cells** with explanations + +3. **Use consistent formatting**: + ```python + # Import libraries at the top + import pandas as pd + import numpy as np + import matplotlib.pyplot as plt + + # Use meaningful variable names + # Add comments for complex operations + # Follow PEP 8 style guidelines + ``` + +4. **Test your notebook** completely before submitting + +### For Python Code + +Follow [PEP 8](https://www.python.org/dev/peps/pep-0008/) style guidelines: + +```python +# Good practices +import pandas as pd + +def calculate_mean(data): + """Calculate the mean of a dataset. 
+ + Args: + data (list): List of numerical values + + Returns: + float: Mean of the dataset + """ + return sum(data) / len(data) +``` + +### For Quiz App Contributions + +When modifying the quiz application: + +1. **Test locally**: + ```bash + cd quiz-app + npm install + npm run serve + ``` + +2. **Run linter**: + ```bash + npm run lint + ``` + +3. **Build successfully**: + ```bash + npm run build + ``` + +4. **Follow Vue.js style guide** and existing patterns + +### For Translations + +When adding or updating translations: + +1. Follow the structure in `translations/` folder +2. Use the language code as folder name (e.g., `fr` for French) +3. Maintain the same file structure as English version +4. Update quiz links to include language parameter: `?loc=fr` +5. Test all links and formatting + +## Pull Request Process + +### Before Submitting + +1. **Update your branch** with latest changes: + ```bash + git fetch upstream + git rebase upstream/main + ``` + +2. **Test your changes**: + - Run all modified notebooks + - Test quiz app if modified + - Verify all links work + - Check for spelling and grammar errors + +3. **Commit your changes**: + ```bash + git add . + git commit -m "Brief description of changes" + ``` + + Write clear commit messages: + - Use present tense ("Add feature" not "Added feature") + - Use imperative mood ("Move cursor to..." not "Moves cursor to...") + - Limit first line to 72 characters + - Reference issues and pull requests when relevant + +4. **Push to your fork**: + ```bash + git push origin feature/your-feature-name + ``` + +### Creating the Pull Request + +1. Go to the [repository](https://github.com/microsoft/Data-Science-For-Beginners) +2. Click "Pull requests" → "New pull request" +3. Click "compare across forks" +4. Select your fork and branch +5. 
Click "Create pull request" + +### PR Title Format + +Use clear, descriptive titles following this format: + +``` +[Component] Brief description +``` + +Examples: +- `[Lesson 7] Fix Python notebook import error` +- `[Quiz App] Add German translation` +- `[Docs] Update README with new prerequisites` +- `[Fix] Correct data path in visualization lesson` + +### PR Description + +Include in your PR description: + +- **What**: What changes did you make? +- **Why**: Why are these changes necessary? +- **How**: How did you implement the changes? +- **Testing**: How did you test the changes? +- **Screenshots**: Include screenshots for visual changes +- **Related Issues**: Link to related issues (e.g., "Fixes #123") + +### Review Process + +1. **Automated checks** will run on your PR +2. **Maintainers will review** your contribution +3. **Address feedback** by making additional commits +4. Once approved, a **maintainer will merge** your PR + +### After Your PR is Merged + +1. Delete your branch: + ```bash + git branch -d feature/your-feature-name + git push origin --delete feature/your-feature-name + ``` + +2. 
Update your fork: + ```bash + git checkout main + git pull upstream main + git push origin main + ``` + +## Style Guidelines + +### Markdown + +- Use consistent heading levels +- Include blank lines between sections +- Use code blocks with language specifiers: + ````markdown + ```python + import pandas as pd + ``` + ```` +- Add alt text to images: `![Alt text](image.png)` +- Keep line lengths reasonable (around 80-100 characters) + +### Python + +- Follow PEP 8 style guide +- Use meaningful variable names +- Add docstrings to functions +- Include type hints where appropriate: + ```python + def process_data(df: pd.DataFrame) -> pd.DataFrame: + """Process the input dataframe.""" + return df + ``` + +### JavaScript/Vue.js + +- Follow Vue.js 2 style guide +- Use ESLint configuration provided +- Write modular, reusable components +- Add comments for complex logic + +### File Organization + +- Keep related files together +- Use descriptive file names +- Follow existing directory structure +- Don't commit unnecessary files (.DS_Store, .pyc, node_modules, etc.) + +## Contributor License Agreement This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, @@ -9,6 +343,12 @@ When you submit a pull request, a CLA-bot will automatically determine whether y to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repositories using our CLA. -This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). -For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) -or contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments. +## Questions? 
+ +- Check our [GitHub Discussions](https://github.com/microsoft/Data-Science-For-Beginners/discussions) +- Join our [Discord community](https://aka.ms/ds4beginners/discord) +- Review existing [issues](https://github.com/microsoft/Data-Science-For-Beginners/issues) and [pull requests](https://github.com/microsoft/Data-Science-For-Beginners/pulls) + +## Thank You! + +Your contributions make this curriculum better for everyone. Thank you for taking the time to contribute! diff --git a/INSTALLATION.md b/INSTALLATION.md new file mode 100644 index 00000000..43c20133 --- /dev/null +++ b/INSTALLATION.md @@ -0,0 +1,239 @@ +# Installation Guide + +This guide will help you set up your environment to work with the Data Science for Beginners curriculum. + +## Table of Contents + +- [Prerequisites](#prerequisites) +- [Quick Start Options](#quick-start-options) +- [Local Installation](#local-installation) +- [Verify Your Installation](#verify-your-installation) + +## Prerequisites + +Before you begin, you should have: + +- Basic familiarity with command line/terminal +- A GitHub account (free) +- Stable internet connection for initial setup + +## Quick Start Options + +### Option 1: GitHub Codespaces (Recommended for Beginners) + +The easiest way to get started is with GitHub Codespaces, which provides a complete development environment in your browser. + +1. Navigate to the [repository](https://github.com/microsoft/Data-Science-For-Beginners) +2. Click the **Code** dropdown menu +3. Select the **Codespaces** tab +4. Click **Create codespace on main** +5. Wait for the environment to initialize (2-3 minutes) + +Your environment is now ready with all dependencies pre-installed! + +### Option 2: Local Development + +For working on your own computer, follow the detailed instructions below. + +## Local Installation + +### Step 1: Install Git + +Git is required to clone the repository and track your changes. 
+ +**Windows:** +- Download from [git-scm.com](https://git-scm.com/download/win) +- Run the installer with default settings + +**macOS:** +- Install via Homebrew: `brew install git` +- Or download from [git-scm.com](https://git-scm.com/download/mac) + +**Linux:** +```bash +# Debian/Ubuntu +sudo apt-get update +sudo apt-get install git + +# Fedora +sudo dnf install git + +# Arch +sudo pacman -S git +``` + +### Step 2: Clone the Repository + +```bash +# Clone the repository +git clone https://github.com/microsoft/Data-Science-For-Beginners.git + +# Navigate to the directory +cd Data-Science-For-Beginners +``` + +### Step 3: Install Python and Jupyter + +Python 3.7 or higher is required for the data science lessons. + +**Windows:** +1. Download Python from [python.org](https://www.python.org/downloads/) +2. During installation, check "Add Python to PATH" +3. Verify installation: +```bash +python --version +``` + +**macOS:** +```bash +# Using Homebrew +brew install python3 + +# Verify installation +python3 --version +``` + +**Linux:** +```bash +# Most Linux distributions come with Python pre-installed +python3 --version + +# If not installed: +# Debian/Ubuntu +sudo apt-get install python3 python3-pip + +# Fedora +sudo dnf install python3 python3-pip +``` + +### Step 4: Set Up Python Environment + +It's recommended to use a virtual environment to keep dependencies isolated. + +```bash +# Create a virtual environment +python -m venv venv + +# Activate the virtual environment +# On Windows: +venv\Scripts\activate + +# On macOS/Linux: +source venv/bin/activate +``` + +### Step 5: Install Python Packages + +Install the required data science libraries: + +```bash +pip install jupyter pandas numpy matplotlib seaborn scikit-learn +``` + +### Step 6: Install Node.js and npm (For Quiz App) + +The quiz application requires Node.js and npm. 
**Windows/macOS:** +- Download from [nodejs.org](https://nodejs.org/) (LTS version recommended) +- Run the installer + +**Linux:** +```bash +# Debian/Ubuntu +curl -fsSL https://deb.nodesource.com/setup_lts.x | sudo -E bash - +sudo apt-get install -y nodejs + +# Fedora +sudo dnf install nodejs + +# Verify installation +node --version +npm --version +``` + +### Step 7: Install Quiz App Dependencies + +```bash +# Navigate to quiz app directory +cd quiz-app + +# Install dependencies +npm install + +# Return to root directory +cd .. +``` + +### Step 8: Install Docsify (Optional) + +For offline access to documentation: + +```bash +npm install -g docsify-cli +``` + +## Verify Your Installation + +### Test Python and Jupyter + +```bash +# Activate your virtual environment if not already activated +# On Windows: +venv\Scripts\activate +# On macOS/Linux: +source venv/bin/activate + +# Start Jupyter Notebook +jupyter notebook +``` + +Your browser should open with the Jupyter interface. You can now navigate to any lesson's `.ipynb` file. + +### Test Quiz Application + +```bash +# Navigate to quiz app +cd quiz-app + +# Start development server +npm run serve +``` + +The quiz app should be available at `http://localhost:8080` (or another port if 8080 is busy). + +### Test Documentation Server + +```bash +# From the root directory of the repository +docsify serve +``` + +The documentation should be available at `http://localhost:3000`. + +## Using VS Code Dev Containers + +If you have Docker installed, you can use VS Code Dev Containers: + +1. Install [Docker Desktop](https://www.docker.com/products/docker-desktop) +2. Install [Visual Studio Code](https://code.visualstudio.com/) +3. Install the [Dev Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers) (formerly named "Remote - Containers") +4. Open the repository in VS Code +5. Press `F1` and select "Dev Containers: Reopen in Container" ("Remote-Containers: Reopen in Container" in older versions of the extension) +6. 
Wait for the container to build (first time only) + +## Next Steps + +- Explore the [README.md](README.md) for an overview of the curriculum +- Read [USAGE.md](USAGE.md) for common workflows and examples +- Check [TROUBLESHOOTING.md](TROUBLESHOOTING.md) if you encounter issues +- Review [CONTRIBUTING.md](CONTRIBUTING.md) if you want to contribute + +## Getting Help + +If you encounter issues: + +1. Check the [TROUBLESHOOTING.md](TROUBLESHOOTING.md) guide +2. Search existing [GitHub Issues](https://github.com/microsoft/Data-Science-For-Beginners/issues) +3. Join our [Discord community](https://aka.ms/ds4beginners/discord) +4. Create a new issue with detailed information about your problem diff --git a/README.md b/README.md index cdffd920..db1465a7 100644 --- a/README.md +++ b/README.md @@ -51,10 +51,28 @@ Get started with the following resources: # Getting Started -> **Teachers**: we have [included some suggestions](for-teachers.md) on how to use this curriculum. We'd love your feedback [in our discussion forum](https://github.com/microsoft/Data-Science-For-Beginners/discussions)! +## 📚 Documentation + +- **[Installation Guide](INSTALLATION.md)** - Step-by-step setup instructions for beginners +- **[Usage Guide](USAGE.md)** - Examples and common workflows +- **[Troubleshooting](TROUBLESHOOTING.md)** - Solutions to common issues +- **[Contributing Guide](CONTRIBUTING.md)** - How to contribute to this project +- **[For Teachers](for-teachers.md)** - Teaching guidance and classroom resources + +## 👨‍🎓 For Students > **[Students](https://aka.ms/student-page)**: to use this curriculum on your own, fork the entire repo and complete the exercises on your own, starting with a pre-lecture quiz. Then read the lecture and complete the rest of the activities. Try to create the projects by comprehending the lessons rather than copying the solution code; however, that code is available in the /solutions folders in each project-oriented lesson. 
Another idea would be to form a study group with friends and go through the content together. For further study, we recommend [Microsoft Learn](https://docs.microsoft.com/en-us/users/jenlooper-2911/collections/qprpajyoy3x0g7?WT.mc_id=academic-77958-bethanycheum). +**Quick Start:** +1. Check the [Installation Guide](INSTALLATION.md) to set up your environment +2. Review the [Usage Guide](USAGE.md) to learn how to work with the curriculum +3. Start with Lesson 1 and work through sequentially +4. Join our [Discord community](https://aka.ms/ds4beginners/discord) for support + +## 👩‍🏫 For Teachers + +> **Teachers**: we have [included some suggestions](for-teachers.md) on how to use this curriculum. We'd love your feedback [in our discussion forum](https://github.com/microsoft/Data-Science-For-Beginners/discussions)! + ## Meet the Team [![Promo video](ds-for-beginners.gif)](https://youtu.be/8mzavjQSMM4 "Promo video") @@ -171,6 +189,8 @@ Our team produces other curricula! Check out: ## Getting Help +**Encountering issues?** Check our [Troubleshooting Guide](TROUBLESHOOTING.md) for solutions to common problems. + If you get stuck or have any questions about building AI apps, join: [![Azure AI Foundry Discord](https://img.shields.io/badge/Discord-Azure_AI_Foundry_Community_Discord-blue?style=for-the-badge&logo=discord&color=5865f2&logoColor=fff)](https://aka.ms/foundry/discord) diff --git a/TROUBLESHOOTING.md b/TROUBLESHOOTING.md new file mode 100644 index 00000000..d0a66f06 --- /dev/null +++ b/TROUBLESHOOTING.md @@ -0,0 +1,611 @@ +# Troubleshooting Guide + +This guide provides solutions to common issues you might encounter while working with the Data Science for Beginners curriculum. 
+ +## Table of Contents + +- [Python and Jupyter Issues](#python-and-jupyter-issues) +- [Package and Dependency Issues](#package-and-dependency-issues) +- [Jupyter Notebook Issues](#jupyter-notebook-issues) +- [Quiz Application Issues](#quiz-application-issues) +- [Git and GitHub Issues](#git-and-github-issues) +- [Docsify Documentation Issues](#docsify-documentation-issues) +- [Data and File Issues](#data-and-file-issues) +- [Performance Issues](#performance-issues) +- [Getting Additional Help](#getting-additional-help) + +## Python and Jupyter Issues + +### Python Not Found or Wrong Version + +**Problem:** `python: command not found` or wrong Python version + +**Solution:** + +```bash +# Check Python version +python --version +python3 --version + +# If Python 3 is installed as 'python3', create an alias +# On macOS/Linux, add to ~/.bashrc or ~/.zshrc: +alias python=python3 +alias pip=pip3 + +# Or use python3 explicitly +python3 -m pip install jupyter +``` + +**Windows Solution:** +1. Reinstall Python from [python.org](https://www.python.org/) +2. During installation, check "Add Python to PATH" +3. 
Restart your terminal/command prompt + +### Virtual Environment Activation Issues + +**Problem:** Virtual environment won't activate + +**Solution:** + +**Windows:** +```powershell +# If PowerShell reports an execution policy error, allow locally created scripts for your user +Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser + +# Then activate +venv\Scripts\activate +``` + +**macOS/Linux:** +```bash +# Ensure the activate script is executable +chmod +x venv/bin/activate + +# Then activate +source venv/bin/activate +``` + +**Verify activation:** +```bash +# Your prompt should show (venv) +# Check Python location +which python # Should point to venv (on Windows, use: where python) +``` + +### Jupyter Kernel Issues + +**Problem:** "Kernel not found" or "Kernel keeps dying" + +**Solution:** + +```bash +# Reinstall kernel +python -m ipykernel install --user --name=datascience --display-name="Python (Data Science)" + +# Or use the default kernel +python -m ipykernel install --user + +# Restart Jupyter +jupyter notebook +``` + +**Problem:** Wrong Python version in Jupyter + +**Solution:** +```bash +# Install Jupyter in your virtual environment +source venv/bin/activate # Activate first +pip install jupyter ipykernel + +# Register the kernel +python -m ipykernel install --user --name=venv --display-name="Python (venv)" + +# In Jupyter, select Kernel -> Change kernel -> Python (venv) +``` + +## Package and Dependency Issues + +### Import Errors + +**Problem:** `ModuleNotFoundError: No module named 'pandas'` (or other packages) + +**Solution:** + +```bash +# Ensure virtual environment is activated +source venv/bin/activate # macOS/Linux +venv\Scripts\activate # Windows + +# Install missing package +pip install pandas + +# Install all common packages +pip install jupyter pandas numpy matplotlib seaborn scikit-learn + +# Verify installation +python -c "import pandas; print(pandas.__version__)" +``` + +### Pip Installation Failures + +**Problem:** `pip install` fails with permission errors + +**Solution:** + +```bash +# Use --user flag +pip install
--user package-name + +# Or use virtual environment (recommended) +python -m venv venv +source venv/bin/activate +pip install package-name +``` + +**Problem:** `pip install` fails with SSL certificate errors + +**Solution:** + +```bash +# Update pip first +python -m pip install --upgrade pip + +# Try installing with trusted host (temporary workaround) +pip install --trusted-host pypi.org --trusted-host files.pythonhosted.org package-name +``` + +### Package Version Conflicts + +**Problem:** Incompatible package versions + +**Solution:** + +```bash +# Create fresh virtual environment +python -m venv venv-new +source venv-new/bin/activate # or venv-new\Scripts\activate on Windows + +# Install packages with specific versions if needed +pip install pandas==1.3.0 +pip install numpy==1.21.0 + +# Or let pip resolve dependencies +pip install jupyter pandas numpy matplotlib seaborn scikit-learn +``` + +## Jupyter Notebook Issues + +### Jupyter Won't Start + +**Problem:** `jupyter notebook` command not found + +**Solution:** + +```bash +# Install Jupyter +pip install jupyter + +# Or use python -m +python -m jupyter notebook + +# Add to PATH if needed (macOS/Linux) +export PATH="$HOME/.local/bin:$PATH" +``` + +### Notebook Won't Load or Save + +**Problem:** "Notebook failed to load" or save errors + +**Solution:** + +1. Check file permissions +```bash +# Make sure you have write permissions +ls -l notebook.ipynb +chmod 644 notebook.ipynb # If needed +``` + +2. Check for file corruption +```bash +# Try opening in text editor to check JSON structure +# Copy content to new notebook if corrupted +``` + +3. Clear Jupyter cache +```bash +jupyter notebook --clear-cache +``` + +### Cell Won't Execute + +**Problem:** Cell stuck on "In [*]" or takes forever + +**Solution:** + +1. **Interrupt the kernel**: Click "Interrupt" button or press `I, I` +2. **Restart kernel**: Kernel menu → Restart +3. **Check for infinite loops** in your code +4. 
**Clear output**: Cell → All Output → Clear + +### Plots Not Displaying + +**Problem:** `matplotlib` plots don't show in notebook + +**Solution:** + +```python +# Add magic command at the top of notebook +%matplotlib inline + +import matplotlib.pyplot as plt + +# Create plot +plt.plot([1, 2, 3, 4]) +plt.show() # Make sure to call show() +``` + +**Alternative for interactive plots:** +```python +%matplotlib notebook +# Or (requires the ipympl package) +%matplotlib widget +``` + +## Quiz Application Issues + +### npm install Fails + +**Problem:** Errors during `npm install` + +**Solution:** + +```bash +# Clear npm cache +npm cache clean --force + +# Remove node_modules and package-lock.json +rm -rf node_modules package-lock.json + +# Reinstall +npm install + +# If still failing, try with legacy peer deps +npm install --legacy-peer-deps +``` + +### Quiz App Won't Start + +**Problem:** `npm run serve` fails + +**Solution:** + +```bash +# Check Node.js version +node --version # Should be 12.x or higher + +# Reinstall dependencies +cd quiz-app +rm -rf node_modules package-lock.json +npm install + +# Try different port +npm run serve -- --port 8081 +``` + +### Port Already in Use + +**Problem:** "Port 8080 is already in use" + +**Solution:** + +```bash +# Find and kill process on port 8080 +# macOS/Linux: +lsof -ti:8080 | xargs kill -9 + +# Windows (replace <PID> with the process ID shown in the last column of the netstat output): +netstat -ano | findstr :8080 +taskkill /PID <PID> /F + +# Or use a different port +npm run serve -- --port 8081 +``` + +### Quiz Not Loading or Blank Page + +**Problem:** Quiz app loads but shows blank page + +**Solution:** + +1. Check browser console for errors (F12) +2. Clear browser cache and cookies +3. Try a different browser +4. Ensure JavaScript is enabled +5. 
Check for ad blockers interfering + +```bash +# Rebuild the app +npm run build +npm run serve +``` + +## Git and GitHub Issues + +### Git Not Recognized + +**Problem:** `git: command not found` + +**Solution:** + +**Windows:** +- Install Git from [git-scm.com](https://git-scm.com/) +- Restart terminal after installation + +**macOS:** +```bash +# Install via Homebrew +brew install git + +# Or install Xcode Command Line Tools +xcode-select --install +``` + +**Linux:** +```bash +sudo apt-get install git # Debian/Ubuntu +sudo dnf install git # Fedora +``` + +### Clone Fails + +**Problem:** `git clone` fails with authentication errors + +**Solution:** + +```bash +# Use HTTPS URL +git clone https://github.com/microsoft/Data-Science-For-Beginners.git + +# If you have 2FA enabled on GitHub, use Personal Access Token +# Create token at: https://github.com/settings/tokens +# Use token as password when prompted +``` + +### Permission Denied (publickey) + +**Problem:** SSH key authentication fails + +**Solution:** + +```bash +# Generate SSH key +ssh-keygen -t ed25519 -C "your_email@example.com" + +# Add key to ssh-agent +eval "$(ssh-agent -s)" +ssh-add ~/.ssh/id_ed25519 + +# Add public key to GitHub +# Copy key: cat ~/.ssh/id_ed25519.pub +# Add at: https://github.com/settings/keys +``` + +## Docsify Documentation Issues + +### Docsify Command Not Found + +**Problem:** `docsify: command not found` + +**Solution:** + +```bash +# Install globally +npm install -g docsify-cli + +# If permission error on macOS/Linux +sudo npm install -g docsify-cli + +# Verify installation +docsify --version + +# If still not found, add npm global path +# Find npm global path +npm config get prefix + +# Add to PATH (add to ~/.bashrc or ~/.zshrc) +export PATH="$PATH:/usr/local/bin" +``` + +### Documentation Not Loading + +**Problem:** Docsify serves but content doesn't load + +**Solution:** + +```bash +# Ensure you're in the repository root +cd Data-Science-For-Beginners + +# Check for index.html +ls 
index.html + +# Serve with specific port +docsify serve --port 3000 + +# Check browser console for errors (F12) +``` + +### Images Not Displaying + +**Problem:** Images show broken link icon + +**Solution:** + +1. Check image paths are relative +2. Ensure image files exist in the repository +3. Clear browser cache +4. Verify file extensions match (case-sensitive on some systems) + +## Data and File Issues + +### File Not Found Errors + +**Problem:** `FileNotFoundError` when loading data + +**Solution:** + +```python +import os + +# Check current working directory +print(os.getcwd()) + +# Use absolute path +data_path = os.path.join(os.getcwd(), 'data', 'filename.csv') +df = pd.read_csv(data_path) + +# Or use relative path from notebook location +df = pd.read_csv('../data/filename.csv') + +# Verify file exists +print(os.path.exists('data/filename.csv')) +``` + +### CSV Reading Errors + +**Problem:** Errors reading CSV files + +**Solution:** + +```python +import pandas as pd + +# Try different encodings +df = pd.read_csv('file.csv', encoding='utf-8') +# or +df = pd.read_csv('file.csv', encoding='latin-1') +# or +df = pd.read_csv('file.csv', encoding='ISO-8859-1') + +# Handle missing values +df = pd.read_csv('file.csv', na_values=['NA', 'N/A', '']) + +# Specify delimiter if not comma +df = pd.read_csv('file.csv', delimiter=';') +``` + +### Memory Errors with Large Datasets + +**Problem:** `MemoryError` when loading large files + +**Solution:** + +```python +# Read in chunks +chunk_size = 10000 +chunks = [] +for chunk in pd.read_csv('large_file.csv', chunksize=chunk_size): + # Process chunk + chunks.append(chunk) +df = pd.concat(chunks) + +# Or read specific columns only +df = pd.read_csv('file.csv', usecols=['col1', 'col2']) + +# Use more efficient data types +df = pd.read_csv('file.csv', dtype={'column_name': 'int32'}) +``` + +## Performance Issues + +### Slow Notebook Performance + +**Problem:** Notebooks run very slowly + +**Solution:** + +1. 
**Restart kernel and clear output** + - Kernel → Restart & Clear Output + +2. **Close unused notebooks** + +3. **Optimize code:** +```python +# Use vectorized operations instead of loops +# Bad: +result = [] +for x in data: + result.append(x * 2) + +# Good: +result = data * 2 # NumPy/Pandas vectorization +``` + +4. **Sample large datasets:** +```python +# Work with sample during development +df_sample = df.sample(n=1000) # or df.head(1000) +``` + +### Browser Crashes + +**Problem:** Browser crashes or becomes unresponsive + +**Solution:** + +1. Close unused tabs +2. Clear browser cache +3. Increase browser memory (Chrome: `chrome://settings/system`) +4. Use JupyterLab instead: +```bash +pip install jupyterlab +jupyter lab +``` + +## Getting Additional Help + +### Before Asking for Help + +1. Check this troubleshooting guide +2. Search [GitHub Issues](https://github.com/microsoft/Data-Science-For-Beginners/issues) +3. Review [INSTALLATION.md](INSTALLATION.md) and [USAGE.md](USAGE.md) +4. Try searching the error message online + +### How to Ask for Help + +When creating an issue or asking for help, include: + +1. **Operating System**: Windows, macOS, or Linux (which distribution) +2. **Python Version**: Run `python --version` +3. **Error Message**: Copy the complete error message +4. **Steps to Reproduce**: What you did before the error occurred +5. **What You've Tried**: Solutions you've already attempted + +**Example:** +``` +**Operating System:** macOS 12.0 +**Python Version:** 3.9.7 +**Error Message:** ModuleNotFoundError: No module named 'pandas' +**Steps to Reproduce:** +1. Activated virtual environment +2. Started Jupyter notebook +3. 
Tried to import pandas + +**What I've Tried:** +- Ran pip install pandas +- Restarted Jupyter +``` + +### Community Resources + +- **GitHub Issues**: [Create an issue](https://github.com/microsoft/Data-Science-For-Beginners/issues/new) +- **Discord**: [Join our community](https://aka.ms/ds4beginners/discord) +- **Discussions**: [GitHub Discussions](https://github.com/microsoft/Data-Science-For-Beginners/discussions) +- **Microsoft Learn**: [Q&A Forums](https://docs.microsoft.com/answers/) + +### Related Documentation + +- [INSTALLATION.md](INSTALLATION.md) - Setup instructions +- [USAGE.md](USAGE.md) - How to use the curriculum +- [CONTRIBUTING.md](CONTRIBUTING.md) - How to contribute +- [README.md](README.md) - Project overview diff --git a/USAGE.md b/USAGE.md new file mode 100644 index 00000000..857dfe01 --- /dev/null +++ b/USAGE.md @@ -0,0 +1,360 @@ +# Usage Guide + +This guide provides examples and common workflows for using the Data Science for Beginners curriculum. + +## Table of Contents + +- [How to Use This Curriculum](#how-to-use-this-curriculum) +- [Working with Lessons](#working-with-lessons) +- [Working with Jupyter Notebooks](#working-with-jupyter-notebooks) +- [Using the Quiz Application](#using-the-quiz-application) +- [Common Workflows](#common-workflows) +- [Tips for Self-Learners](#tips-for-self-learners) +- [Tips for Teachers](#tips-for-teachers) + +## How to Use This Curriculum + +This curriculum is designed to be flexible and can be used in multiple ways: + +- **Self-paced learning**: Work through lessons independently at your own speed +- **Classroom instruction**: Use as a structured course with guided instruction +- **Study groups**: Learn collaboratively with peers +- **Workshop format**: Intensive short-term learning sessions + +## Working with Lessons + +Each lesson follows a consistent structure to maximize learning: + +### Lesson Structure + +1. **Pre-lesson Quiz**: Test your existing knowledge +2. 
**Sketchnote** (Optional): Visual summary of key concepts +3. **Video** (Optional): Supplemental video content +4. **Written Lesson**: Core concepts and explanations +5. **Jupyter Notebook**: Hands-on coding exercises +6. **Assignment**: Practice what you've learned +7. **Post-lesson Quiz**: Reinforce your understanding + +### Example Workflow for a Lesson + +```bash +# 1. Navigate to the lesson directory +cd 1-Introduction/01-defining-data-science + +# 2. Read the README.md +# Open README.md in your browser or editor + +# 3. Take the pre-lesson quiz +# Click the quiz link in the README + +# 4. Open the Jupyter notebook (if available) +jupyter notebook + +# 5. Complete the exercises in the notebook + +# 6. Work on the assignment + +# 7. Take the post-lesson quiz +``` + +## Working with Jupyter Notebooks + +### Starting Jupyter + +```bash +# Activate your virtual environment +source venv/bin/activate # On macOS/Linux +# OR +venv\Scripts\activate # On Windows + +# Start Jupyter from the repository root +jupyter notebook +``` + +### Running Notebook Cells + +1. **Execute a cell**: Press `Shift + Enter` or click the "Run" button +2. **Execute all cells**: Select "Cell" → "Run All" from the menu +3. 
**Restart kernel**: Select "Kernel" → "Restart" if you encounter issues + +### Example: Working with Data in a Notebook + +```python +# Import required libraries +import pandas as pd +import numpy as np +import matplotlib.pyplot as plt + +# Load a dataset +df = pd.read_csv('data/sample.csv') + +# Explore the data +df.head() +df.info() +df.describe() + +# Create a visualization +plt.figure(figsize=(10, 6)) +plt.plot(df['column_name']) +plt.title('Sample Visualization') +plt.xlabel('X-axis Label') +plt.ylabel('Y-axis Label') +plt.show() +``` + +### Saving Your Work + +- Jupyter auto-saves periodically +- Manually save: Press `Ctrl + S` (or `Cmd + S` on macOS) +- Your progress is saved in the `.ipynb` file + +## Using the Quiz Application + +### Running the Quiz App Locally + +```bash +# Navigate to quiz app directory +cd quiz-app + +# Start the development server +npm run serve + +# Access at http://localhost:8080 +``` + +### Taking Quizzes + +1. Pre-lesson quizzes are linked at the top of each lesson +2. Post-lesson quizzes are linked at the bottom of each lesson +3. Each quiz has 3 questions +4. Quizzes are designed to reinforce learning, not to test exhaustively + +### Quiz Numbering + +- Quizzes are numbered 0-39 (40 total quizzes) +- Each lesson typically has a pre and post quiz +- Quiz URLs include the quiz number: `https://ff-quizzes.netlify.app/en/ds/quiz/0` + +## Common Workflows + +### Workflow 1: Complete Beginner Path + +```bash +# 1. Set up your environment (see INSTALLATION.md) + +# 2. Start with Lesson 1 +cd 1-Introduction/01-defining-data-science + +# 3. For each lesson: +# - Take pre-lesson quiz +# - Read the lesson content +# - Work through the notebook +# - Complete the assignment +# - Take post-lesson quiz + +# 4. 
Progress through all 20 lessons sequentially +``` + +### Workflow 2: Topic-Specific Learning + +If you're interested in a specific topic: + +```bash +# Example: Focus on Data Visualization +cd 3-Data-Visualization + +# Explore lessons 9-13: +# - Lesson 9: Visualizing Quantities +# - Lesson 10: Visualizing Distributions +# - Lesson 11: Visualizing Proportions +# - Lesson 12: Visualizing Relationships +# - Lesson 13: Meaningful Visualizations +``` + +### Workflow 3: Project-Based Learning + +```bash +# 1. Review the Data Science Lifecycle lessons (14-16) +cd 4-Data-Science-Lifecycle + +# 2. Work through a real-world example (Lesson 20) +cd ../6-Data-Science-In-Wild/20-Real-World-Examples + +# 3. Apply concepts to your own project +``` + +### Workflow 4: Cloud-Based Data Science + +```bash +# Learn about cloud data science (Lessons 17-19) +cd 5-Data-Science-In-Cloud + +# 17: Introduction to Cloud Data Science +# 18: Low-Code ML Tools +# 19: Azure Machine Learning Studio +``` + +## Tips for Self-Learners + +### Stay Organized + +```bash +# Create a learning journal +mkdir my-learning-journal + +# For each lesson, create notes +echo "# Lesson 1 Notes" > my-learning-journal/lesson-01-notes.md +``` + +### Practice Regularly + +- Set aside dedicated time each day or week +- Complete at least one lesson per week +- Review previous lessons periodically + +### Engage with the Community + +- Join the [Discord community](https://aka.ms/ds4beginners/discord) +- Participate in [GitHub Discussions](https://github.com/microsoft/Data-Science-For-Beginners/discussions) +- Share your progress and ask questions + +### Build Your Own Projects + +After completing lessons, apply concepts to personal projects: + +```python +# Example: Analyze your own dataset +import pandas as pd + +# Load your own data +my_data = pd.read_csv('my-project/data.csv') + +# Apply techniques learned +# - Data cleaning (Lesson 8) +# - Exploratory data analysis (Lesson 7) +# - Visualization (Lessons 9-13) +# - 
Analysis (Lesson 15) +``` + +## Tips for Teachers + +### Classroom Setup + +1. Review [for-teachers.md](for-teachers.md) for detailed guidance +2. Set up a shared environment (GitHub Classroom or Codespaces) +3. Establish a communication channel (Discord, Slack, or Teams) + +### Lesson Planning + +**Suggested 10-Week Schedule:** + +- **Week 1-2**: Introduction (Lessons 1-4) +- **Week 3-4**: Working with Data (Lessons 5-8) +- **Week 5-6**: Data Visualization (Lessons 9-13) +- **Week 7-8**: Data Science Lifecycle (Lessons 14-16) +- **Week 9**: Cloud Data Science (Lessons 17-19) +- **Week 10**: Real-World Applications & Final Projects (Lesson 20) + +### Running Docsify for Offline Access + +```bash +# Serve documentation locally for classroom use +docsify serve + +# Students can access at localhost:3000 +# No internet required after initial setup +``` + +### Assignment Grading + +- Review student notebooks for completed exercises +- Check for understanding through quiz scores +- Evaluate final projects using data science lifecycle principles + +### Creating Assignments + +```python +# Example custom assignment template +""" +Assignment: [Topic] + +Objective: [Learning goal] + +Dataset: [Provide or have students find one] + +Tasks: +1. Load and explore the dataset +2. Clean and prepare the data +3. Create at least 3 visualizations +4. Perform analysis +5. 
Communicate findings + +Deliverables: +- Jupyter notebook with code and explanations +- Written summary of findings +""" +``` + +## Working Offline + +### Download Resources + +```bash +# Clone the entire repository +git clone https://github.com/microsoft/Data-Science-For-Beginners.git + +# Download datasets in advance +# Most datasets are included in the repository +``` + +### Run Documentation Locally + +```bash +# Serve with Docsify +docsify serve + +# Access at localhost:3000 +``` + +### Run Quiz App Locally + +```bash +cd quiz-app +npm run serve +``` + +## Accessing Translated Content + +Translations are available in 40+ languages: + +```bash +# Access translated lessons +cd translations/fr # French +cd translations/es # Spanish +cd translations/de # German +# ... and many more +``` + +Each translation maintains the same structure as the English version. + +## Additional Resources + +### Continue Learning + +- [Microsoft Learn](https://docs.microsoft.com/learn/) - Additional learning paths +- [Student Hub](https://docs.microsoft.com/learn/student-hub) - Resources for students +- [Azure AI Foundry](https://aka.ms/foundry/forum) - Community forum + +### Related Curricula + +- [AI for Beginners](https://aka.ms/ai-beginners) +- [ML for Beginners](https://aka.ms/ml-beginners) +- [Web Dev for Beginners](https://aka.ms/webdev-beginners) +- [Generative AI for Beginners](https://aka.ms/genai-beginners) + +## Getting Help + +- Check [TROUBLESHOOTING.md](TROUBLESHOOTING.md) for common issues +- Search [GitHub Issues](https://github.com/microsoft/Data-Science-For-Beginners/issues) +- Join our [Discord](https://aka.ms/ds4beginners/discord) +- Review [CONTRIBUTING.md](CONTRIBUTING.md) to report issues or contribute
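
When reporting an issue, it helps to include the environment details the guides ask for (operating system and Python version). The following is a minimal sketch of one way to collect them. It assumes Python 3.8 or later for `importlib.metadata`; the function name `environment_report` and the default package list are illustrative and not part of the curriculum.

```python
import platform
from importlib import metadata


def environment_report(packages=("pandas", "numpy", "matplotlib", "notebook")):
    """Return a Markdown-style summary of the local environment.

    The package list is a hypothetical default; pass the packages
    that are relevant to the error you are reporting.
    """
    lines = [
        f"**Operating System:** {platform.system()} {platform.release()}",
        f"**Python Version:** {platform.python_version()}",
    ]
    for name in packages:
        try:
            # Look up the installed version without importing the package
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            version = "not installed"
        lines.append(f"**{name}:** {version}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Paste the printed text into the issue description
    print(environment_report())
```

Running the script in the same virtual environment that produced the error keeps the reported versions consistent with what Jupyter actually used.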