Add comprehensive documentation: installation, usage, and troubleshooting guides

Co-authored-by: leestott <2511341+leestott@users.noreply.github.com>
copilot/fix-0de5e46c-afe2-43ab-8c38-67d5a3358ccc
copilot-swe-agent[bot] 2 months ago
parent fc45572aa6
commit 3503f04860

@ -1,4 +1,338 @@
# Contributing
# Contributing to Data Science for Beginners
Thank you for your interest in contributing to the Data Science for Beginners curriculum! We welcome contributions from the community.
## Table of Contents
- [Code of Conduct](#code-of-conduct)
- [How Can I Contribute?](#how-can-i-contribute)
- [Getting Started](#getting-started)
- [Contribution Guidelines](#contribution-guidelines)
- [Pull Request Process](#pull-request-process)
- [Style Guidelines](#style-guidelines)
- [Contributor License Agreement](#contributor-license-agreement)
## Code of Conduct
This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/)
or contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
## How Can I Contribute?
### Reporting Bugs
Before creating bug reports, please check the existing issues to avoid duplicates. When you create a bug report, include as many details as possible:
- **Use a clear and descriptive title**
- **Describe the exact steps to reproduce the problem**
- **Provide specific examples** (code snippets, screenshots)
- **Describe the behavior you observed and what you expected**
- **Include your environment details** (OS, Python version, browser)
### Suggesting Enhancements
Enhancement suggestions are welcome! When suggesting enhancements:
- **Use a clear and descriptive title**
- **Provide a detailed description of the suggested enhancement**
- **Explain why this enhancement would be useful**
- **List any similar features in other projects, if applicable**
### Contributing to Documentation
Documentation improvements are always appreciated:
- **Fix typos and grammatical errors**
- **Improve clarity of explanations**
- **Add missing documentation**
- **Update outdated information**
- **Add examples or use cases**
### Contributing Code
We welcome code contributions including:
- **New lessons or exercises**
- **Bug fixes**
- **Improvements to existing notebooks**
- **New datasets or examples**
- **Quiz application enhancements**
## Getting Started
### Prerequisites
Before contributing, ensure you have:
1. A GitHub account
2. Git installed on your system
3. Python 3.7+ and Jupyter installed
4. Node.js and npm (for quiz app contributions)
5. Familiarity with the curriculum structure
See [INSTALLATION.md](INSTALLATION.md) for detailed setup instructions.
### Fork and Clone
1. **Fork the repository** on GitHub
2. **Clone your fork** locally:
```bash
git clone https://github.com/YOUR-USERNAME/Data-Science-For-Beginners.git
cd Data-Science-For-Beginners
```
3. **Add upstream remote**:
```bash
git remote add upstream https://github.com/microsoft/Data-Science-For-Beginners.git
```
### Create a Branch
Create a new branch for your work:
```bash
git checkout -b feature/your-feature-name
# or
git checkout -b fix/your-bug-fix
```
Branch naming conventions:
- `feature/` - New features or lessons
- `fix/` - Bug fixes
- `docs/` - Documentation changes
- `refactor/` - Code refactoring
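If you want to check a branch name against these prefixes before pushing, a small script can help. The following is only a sketch, not part of the repository's tooling: `is_valid_branch_name` is a hypothetical helper, and the rule that the part after the prefix uses lowercase letters, digits, and hyphens is an assumption made for this example.

```python
import re

# Prefixes taken from the conventions listed above.
ALLOWED_PREFIXES = ("feature", "fix", "docs", "refactor")

# Assumption: the part after the prefix uses lowercase letters, digits, and hyphens.
BRANCH_PATTERN = re.compile(
    r"^(?:%s)/[a-z0-9]+(?:-[a-z0-9]+)*$" % "|".join(ALLOWED_PREFIXES)
)


def is_valid_branch_name(name: str) -> bool:
    """Return True if the branch name follows the prefix convention."""
    return BRANCH_PATTERN.match(name) is not None


print(is_valid_branch_name("feature/your-feature-name"))
print(is_valid_branch_name("my-branch"))
```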
## Contribution Guidelines
### For Lesson Content
When contributing lessons or modifying existing ones:
1. **Follow the existing structure**:
   - README.md with lesson content
   - Jupyter notebook with exercises
   - Assignment (if applicable)
   - Link to pre and post quizzes
2. **Include these elements**:
   - Clear learning objectives
   - Step-by-step explanations
   - Code examples with comments
   - Exercises for practice
   - Links to additional resources
3. **Ensure accessibility**:
   - Use clear, simple language
   - Provide alt text for images
   - Include code comments
   - Consider different learning styles
### For Jupyter Notebooks
1. **Clear all outputs** before committing:
```bash
jupyter nbconvert --clear-output --inplace notebook.ipynb
```
2. **Include markdown cells** with explanations
3. **Use consistent formatting**:
```python
# Import libraries at the top
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Use meaningful variable names
# Add comments for complex operations
# Follow PEP 8 style guidelines
```
4. **Test your notebook** completely before submitting
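To see what clearing outputs in step 1 actually changes, it can help to look at the notebook as JSON. The sketch below assumes the nbformat 4 layout (a top-level `cells` list whose code cells carry `outputs` and `execution_count`). `clear_outputs` and `clear_outputs_in_file` are hypothetical helpers for illustration; `nbconvert` remains the recommended tool.

```python
import json
from pathlib import Path


def clear_outputs(notebook: dict) -> dict:
    """Remove outputs and execution counts from every code cell (nbformat 4 layout)."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def clear_outputs_in_file(path: str) -> None:
    """Rewrite the notebook at `path` with outputs cleared."""
    nb_path = Path(path)
    notebook = json.loads(nb_path.read_text(encoding="utf-8"))
    nb_path.write_text(
        json.dumps(clear_outputs(notebook), indent=1, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
```

Markdown cells are left untouched, so the explanations you wrote stay in place.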
### For Python Code
Follow [PEP 8](https://www.python.org/dev/peps/pep-0008/) style guidelines:
```python
# Good practices
import pandas as pd


def calculate_mean(data):
    """Calculate the mean of a dataset.

    Args:
        data (list): List of numerical values

    Returns:
        float: Mean of the dataset
    """
    return sum(data) / len(data)
```
### For Quiz App Contributions
When modifying the quiz application:
1. **Test locally**:
```bash
cd quiz-app
npm install
npm run serve
```
2. **Run linter**:
```bash
npm run lint
```
3. **Build successfully**:
```bash
npm run build
```
4. **Follow Vue.js style guide** and existing patterns
### For Translations
When adding or updating translations:
1. Follow the structure in `translations/` folder
2. Use the language code as folder name (e.g., `fr` for French)
3. Maintain the same file structure as English version
4. Update quiz links to include language parameter: `?loc=fr`
5. Test all links and formatting
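For step 4, the language parameter can be added to a link programmatically when many links need updating. This is a minimal sketch using only the standard library. `with_language` is a hypothetical helper, and the `example.com` URL is a placeholder rather than a real quiz address.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_language(url: str, lang_code: str) -> str:
    """Return `url` with its `loc` query parameter set to `lang_code`."""
    parts = urlsplit(url)
    # Drop any existing `loc` value so the parameter is not duplicated.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "loc"
    ]
    query.append(("loc", lang_code))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL for illustration only.
print(with_language("https://example.com/quiz/1", "fr"))
```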
## Pull Request Process
### Before Submitting
1. **Update your branch** with latest changes:
```bash
git fetch upstream
git rebase upstream/main
```
2. **Test your changes**:
- Run all modified notebooks
- Test quiz app if modified
- Verify all links work
- Check for spelling and grammar errors
3. **Commit your changes**:
```bash
git add .
git commit -m "Brief description of changes"
```
Write clear commit messages:
- Use present tense ("Add feature" not "Added feature")
- Use imperative mood ("Move cursor to..." not "Moves cursor to...")
- Limit first line to 72 characters
- Reference issues and pull requests when relevant
4. **Push to your fork**:
```bash
git push origin feature/your-feature-name
```
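The commit message rules in step 3 can be checked mechanically before you commit. The sketch below is an illustration only. `check_commit_subject` is a hypothetical helper, and its small list of non-imperative first words is an assumption chosen for the example, not a complete grammar check.

```python
# Assumption: a short, incomplete list of first words that signal
# past tense or third person rather than the imperative mood.
NON_IMPERATIVE_WORDS = {
    "added", "adds", "fixed", "fixes", "moved", "moves", "updated", "updates",
}


def check_commit_subject(subject: str, max_length: int = 72) -> list:
    """Return a list of problems found in a commit subject line."""
    problems = []
    if not subject.strip():
        return ["subject is empty"]
    if len(subject) > max_length:
        problems.append(
            f"subject is {len(subject)} characters; the limit is {max_length}"
        )
    first_word = subject.split()[0].lower()
    if first_word in NON_IMPERATIVE_WORDS:
        problems.append(f"'{first_word}' is not in the imperative mood")
    return problems


print(check_commit_subject("Add feature"))
print(check_commit_subject("Added feature"))
```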
### Creating the Pull Request
1. Go to the [repository](https://github.com/microsoft/Data-Science-For-Beginners)
2. Click "Pull requests" → "New pull request"
3. Click "compare across forks"
4. Select your fork and branch
5. Click "Create pull request"
### PR Title Format
Use clear, descriptive titles following this format:
```
[Component] Brief description
```
Examples:
- `[Lesson 7] Fix Python notebook import error`
- `[Quiz App] Add German translation`
- `[Docs] Update README with new prerequisites`
- `[Fix] Correct data path in visualization lesson`
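The title format can also be expressed as a regular expression, for example in a local pre-submit check. This is a sketch under the assumption that the component is any text inside square brackets followed by a single space. `parse_pr_title` is a hypothetical helper, not something the repository provides.

```python
import re

# Matches "[Component] Brief description": a bracketed component,
# one space, then a non-empty description.
PR_TITLE_PATTERN = re.compile(r"^\[(?P<component>[^\[\]]+)\] (?P<description>\S.*)$")


def parse_pr_title(title: str):
    """Return (component, description) if the title fits the format, else None."""
    match = PR_TITLE_PATTERN.match(title)
    if match is None:
        return None
    return match.group("component"), match.group("description")


print(parse_pr_title("[Lesson 7] Fix Python notebook import error"))
print(parse_pr_title("Fix stuff"))
```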
### PR Description
Include in your PR description:
- **What**: What changes did you make?
- **Why**: Why are these changes necessary?
- **How**: How did you implement the changes?
- **Testing**: How did you test the changes?
- **Screenshots**: Include screenshots for visual changes
- **Related Issues**: Link to related issues (e.g., "Fixes #123")
### Review Process
1. **Automated checks** will run on your PR
2. **Maintainers will review** your contribution
3. **Address feedback** by making additional commits
4. Once approved, a **maintainer will merge** your PR
### After Your PR is Merged
1. Update your fork:
```bash
git checkout main
git pull upstream main
git push origin main
```
2. Delete your branch (use `-D` instead of `-d` if the PR was squash-merged and Git reports the branch as not fully merged):
```bash
git branch -d feature/your-feature-name
git push origin --delete feature/your-feature-name
```
## Style Guidelines
### Markdown
- Use consistent heading levels
- Include blank lines between sections
- Use code blocks with language specifiers:
````markdown
```python
import pandas as pd
```
````
- Add alt text to images: `![Alt text](image.png)`
- Keep line lengths reasonable (around 80-100 characters)
### Python
- Follow PEP 8 style guide
- Use meaningful variable names
- Add docstrings to functions
- Include type hints where appropriate:
```python
import pandas as pd


def process_data(df: pd.DataFrame) -> pd.DataFrame:
    """Process the input dataframe."""
    return df
```
### JavaScript/Vue.js
- Follow Vue.js 2 style guide
- Use ESLint configuration provided
- Write modular, reusable components
- Add comments for complex logic
### File Organization
- Keep related files together
- Use descriptive file names
- Follow existing directory structure
- Don't commit unnecessary files (.DS_Store, .pyc, node_modules, etc.)
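A `.gitignore` file is the usual way to keep such files out of commits. If you want to see what is lying around in your working copy first, a short scan can list it. The sketch below is illustrative. `find_unwanted` is a hypothetical helper, and the name lists only mirror the examples given above.

```python
import os

# Names taken from the list above; extend as needed.
UNWANTED_NAMES = {".DS_Store", "node_modules"}
UNWANTED_SUFFIXES = (".pyc",)


def find_unwanted(root: str) -> list:
    """Return paths under `root` that match the unwanted names or suffixes."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in list(dirnames):
            if name in UNWANTED_NAMES:
                found.append(os.path.join(dirpath, name))
                dirnames.remove(name)  # do not descend into it
        for name in filenames:
            if name in UNWANTED_NAMES or name.endswith(UNWANTED_SUFFIXES):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


for path in find_unwanted("."):
    print(path)
```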
## Contributor License Agreement
This project welcomes contributions and suggestions. Most contributions require you to
agree to a Contributor License Agreement (CLA) declaring that you have the right to,
@ -9,6 +343,12 @@ When you submit a pull request, a CLA-bot will automatically determine whether y
to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the
instructions provided by the bot. You will only need to do this once across all repositories using our CLA.
This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/)
or contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
## Questions?
- Check our [GitHub Discussions](https://github.com/microsoft/Data-Science-For-Beginners/discussions)
- Join our [Discord community](https://aka.ms/ds4beginners/discord)
- Review existing [issues](https://github.com/microsoft/Data-Science-For-Beginners/issues) and [pull requests](https://github.com/microsoft/Data-Science-For-Beginners/pulls)
## Thank You!
Your contributions make this curriculum better for everyone. Thank you for taking the time to contribute!

@ -0,0 +1,239 @@
# Installation Guide
This guide will help you set up your environment to work with the Data Science for Beginners curriculum.
## Table of Contents
- [Prerequisites](#prerequisites)
- [Quick Start Options](#quick-start-options)
- [Local Installation](#local-installation)
- [Verify Your Installation](#verify-your-installation)
## Prerequisites
Before you begin, you should have:
- Basic familiarity with command line/terminal
- A GitHub account (free)
- Stable internet connection for initial setup
## Quick Start Options
### Option 1: GitHub Codespaces (Recommended for Beginners)
The easiest way to get started is with GitHub Codespaces, which provides a complete development environment in your browser.
1. Navigate to the [repository](https://github.com/microsoft/Data-Science-For-Beginners)
2. Click the **Code** dropdown menu
3. Select the **Codespaces** tab
4. Click **Create codespace on main**
5. Wait for the environment to initialize (2-3 minutes)
Your environment is now ready with all dependencies pre-installed!
### Option 2: Local Development
For working on your own computer, follow the detailed instructions below.
## Local Installation
### Step 1: Install Git
Git is required to clone the repository and track your changes.
**Windows:**
- Download from [git-scm.com](https://git-scm.com/download/win)
- Run the installer with default settings
**macOS:**
- Install via Homebrew: `brew install git`
- Or download from [git-scm.com](https://git-scm.com/download/mac)
**Linux:**
```bash
# Debian/Ubuntu
sudo apt-get update
sudo apt-get install git
# Fedora
sudo dnf install git
# Arch
sudo pacman -S git
```
### Step 2: Clone the Repository
```bash
# Clone the repository
git clone https://github.com/microsoft/Data-Science-For-Beginners.git
# Navigate to the directory
cd Data-Science-For-Beginners
```
### Step 3: Install Python and Jupyter
Python 3.7 or higher is required for the data science lessons.
**Windows:**
1. Download Python from [python.org](https://www.python.org/downloads/)
2. During installation, check "Add Python to PATH"
3. Verify installation:
```bash
python --version
```
**macOS:**
```bash
# Using Homebrew
brew install python3
# Verify installation
python3 --version
```
**Linux:**
```bash
# Most Linux distributions come with Python pre-installed
python3 --version
# If not installed:
# Debian/Ubuntu
sudo apt-get install python3 python3-pip
# Fedora
sudo dnf install python3 python3-pip
```
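You can also confirm the version requirement from inside Python itself, which is handy when several interpreters are installed. This is a minimal sketch. `meets_minimum` is a hypothetical helper, and the 3.7 minimum comes from the requirement stated at the start of this step.

```python
import sys

MINIMUM_VERSION = (3, 7)  # from "Python 3.7 or higher" above


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM_VERSION) -> bool:
    """Return True if the interpreter version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


print("Python", sys.version.split()[0], "OK" if meets_minimum() else "too old")
```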
### Step 4: Set Up Python Environment
It's recommended to use a virtual environment to keep dependencies isolated.
```bash
# Create a virtual environment (use python3 if the python command is not available)
python -m venv venv
# Activate the virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate
```
### Step 5: Install Python Packages
Install the required data science libraries:
```bash
pip install jupyter pandas numpy matplotlib seaborn scikit-learn
```
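After installing, you may want to confirm that each package can be found by the interpreter you are using. The sketch below checks import names with the standard library. `missing_modules` is a hypothetical helper. Note that scikit-learn is imported as `sklearn`, and the script checks `jupyter_core`, which is assumed to be installed as a dependency of the `jupyter` package.

```python
import importlib.util

# Import names for the packages in the pip command above.
# Assumption: `jupyter_core` is present whenever `jupyter` is installed.
REQUIRED_MODULES = [
    "jupyter_core", "pandas", "numpy", "matplotlib", "seaborn", "sklearn",
]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
print("Missing:", ", ".join(missing) if missing else "none")
```

Run it with the virtual environment activated so that it inspects the same interpreter Jupyter will use.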
### Step 6: Install Node.js and npm (For Quiz App)
The quiz application requires Node.js and npm.
**Windows/macOS:**
- Download from [nodejs.org](https://nodejs.org/) (LTS version recommended)
- Run the installer
**Linux:**
```bash
# Debian/Ubuntu
curl -fsSL https://deb.nodesource.com/setup_lts.x | sudo -E bash -
sudo apt-get install -y nodejs
# Fedora
sudo dnf install nodejs
# Verify installation
node --version
npm --version
```
### Step 7: Install Quiz App Dependencies
```bash
# Navigate to quiz app directory
cd quiz-app
# Install dependencies
npm install
# Return to root directory
cd ..
```
### Step 8: Install Docsify (Optional)
For offline access to documentation:
```bash
npm install -g docsify-cli
```
## Verify Your Installation
### Test Python and Jupyter
```bash
# Activate your virtual environment if not already activated
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate
# Start Jupyter Notebook
jupyter notebook
```
Your browser should open with the Jupyter interface. You can now navigate to any lesson's `.ipynb` file.
### Test Quiz Application
```bash
# Navigate to quiz app
cd quiz-app
# Start development server
npm run serve
```
The quiz app should be available at `http://localhost:8080` (or another port if 8080 is busy).
### Test Documentation Server
```bash
# From the root directory of the repository
docsify serve
```
The documentation should be available at `http://localhost:3000`.
## Using VS Code Dev Containers
If you have Docker installed, you can use VS Code Dev Containers:
1. Install [Docker Desktop](https://www.docker.com/products/docker-desktop)
2. Install [Visual Studio Code](https://code.visualstudio.com/)
3. Install the [Dev Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers) (formerly named "Remote - Containers")
4. Open the repository in VS Code
5. Press `F1` and select "Dev Containers: Reopen in Container"
6. Wait for the container to build (first time only)
## Next Steps
- Explore the [README.md](README.md) for an overview of the curriculum
- Read [USAGE.md](USAGE.md) for common workflows and examples
- Check [TROUBLESHOOTING.md](TROUBLESHOOTING.md) if you encounter issues
- Review [CONTRIBUTING.md](CONTRIBUTING.md) if you want to contribute
## Getting Help
If you encounter issues:
1. Check the [TROUBLESHOOTING.md](TROUBLESHOOTING.md) guide
2. Search existing [GitHub Issues](https://github.com/microsoft/Data-Science-For-Beginners/issues)
3. Join our [Discord community](https://aka.ms/ds4beginners/discord)
4. Create a new issue with detailed information about your problem

@ -51,10 +51,28 @@ Get started with the following resources:
# Getting Started
> **Teachers**: we have [included some suggestions](for-teachers.md) on how to use this curriculum. We'd love your feedback [in our discussion forum](https://github.com/microsoft/Data-Science-For-Beginners/discussions)!
## 📚 Documentation
- **[Installation Guide](INSTALLATION.md)** - Step-by-step setup instructions for beginners
- **[Usage Guide](USAGE.md)** - Examples and common workflows
- **[Troubleshooting](TROUBLESHOOTING.md)** - Solutions to common issues
- **[Contributing Guide](CONTRIBUTING.md)** - How to contribute to this project
- **[For Teachers](for-teachers.md)** - Teaching guidance and classroom resources
## 👨‍🎓 For Students
> **[Students](https://aka.ms/student-page)**: to use this curriculum on your own, fork the entire repo and complete the exercises on your own, starting with a pre-lecture quiz. Then read the lecture and complete the rest of the activities. Try to create the projects by comprehending the lessons rather than copying the solution code; however, that code is available in the /solutions folders in each project-oriented lesson. Another idea would be to form a study group with friends and go through the content together. For further study, we recommend [Microsoft Learn](https://docs.microsoft.com/en-us/users/jenlooper-2911/collections/qprpajyoy3x0g7?WT.mc_id=academic-77958-bethanycheum).
**Quick Start:**
1. Check the [Installation Guide](INSTALLATION.md) to set up your environment
2. Review the [Usage Guide](USAGE.md) to learn how to work with the curriculum
3. Start with Lesson 1 and work through sequentially
4. Join our [Discord community](https://aka.ms/ds4beginners/discord) for support
## 👩‍🏫 For Teachers
> **Teachers**: we have [included some suggestions](for-teachers.md) on how to use this curriculum. We'd love your feedback [in our discussion forum](https://github.com/microsoft/Data-Science-For-Beginners/discussions)!
## Meet the Team
[![Promo video](ds-for-beginners.gif)](https://youtu.be/8mzavjQSMM4 "Promo video")
@ -171,6 +189,8 @@ Our team produces other curricula! Check out:
## Getting Help
**Encountering issues?** Check our [Troubleshooting Guide](TROUBLESHOOTING.md) for solutions to common problems.
If you get stuck or have any questions about building AI apps, join:
[![Azure AI Foundry Discord](https://img.shields.io/badge/Discord-Azure_AI_Foundry_Community_Discord-blue?style=for-the-badge&logo=discord&color=5865f2&logoColor=fff)](https://aka.ms/foundry/discord)

@ -0,0 +1,611 @@
# Troubleshooting Guide
This guide provides solutions to common issues you might encounter while working with the Data Science for Beginners curriculum.
## Table of Contents
- [Python and Jupyter Issues](#python-and-jupyter-issues)
- [Package and Dependency Issues](#package-and-dependency-issues)
- [Jupyter Notebook Issues](#jupyter-notebook-issues)
- [Quiz Application Issues](#quiz-application-issues)
- [Git and GitHub Issues](#git-and-github-issues)
- [Docsify Documentation Issues](#docsify-documentation-issues)
- [Data and File Issues](#data-and-file-issues)
- [Performance Issues](#performance-issues)
- [Getting Additional Help](#getting-additional-help)
## Python and Jupyter Issues
### Python Not Found or Wrong Version
**Problem:** `python: command not found` or wrong Python version
**Solution:**
```bash
# Check Python version
python --version
python3 --version
# If Python 3 is installed as 'python3', create an alias
# On macOS/Linux, add to ~/.bashrc or ~/.zshrc:
alias python=python3
alias pip=pip3
# Or use python3 explicitly
python3 -m pip install jupyter
```
**Windows Solution:**
1. Reinstall Python from [python.org](https://www.python.org/)
2. During installation, check "Add Python to PATH"
3. Restart your terminal/command prompt
### Virtual Environment Activation Issues
**Problem:** Virtual environment won't activate
**Solution:**
**Windows:**
```powershell
# If you get an execution policy error in PowerShell
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
# Then activate
venv\Scripts\activate
```
**macOS/Linux:**
```bash
# `source` reads the script directly, so it does not need execute permission.
# Confirm the environment exists, then activate it
ls venv/bin/activate
source venv/bin/activate
```
**Verify activation:**
```bash
# Your prompt should show (venv)
# Check Python location
which python # Should point to venv (on Windows, use: where python)
```
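The same check can be made from within Python, which also works in a notebook cell. This is a small sketch. `in_virtual_environment` is a hypothetical helper that relies on `sys.prefix` differing from `sys.base_prefix` inside an environment created with `venv`. It does not detect conda environments.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when running inside a venv-style virtual environment."""
    # In a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Virtual environment active:", in_virtual_environment())
```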
### Jupyter Kernel Issues
**Problem:** "Kernel not found" or "Kernel keeps dying"
**Solution:**
```bash
# Reinstall kernel
python -m ipykernel install --user --name=datascience --display-name="Python (Data Science)"
# Or use the default kernel
python -m ipykernel install --user
# Restart Jupyter
jupyter notebook
```
**Problem:** Wrong Python version in Jupyter
**Solution:**
```bash
# Install Jupyter in your virtual environment
source venv/bin/activate # Activate first
pip install jupyter ipykernel
# Register the kernel
python -m ipykernel install --user --name=venv --display-name="Python (venv)"
# In Jupyter, select Kernel -> Change kernel -> Python (venv)
```
## Package and Dependency Issues
### Import Errors
**Problem:** `ModuleNotFoundError: No module named 'pandas'` (or other packages)
**Solution:**
```bash
# Ensure virtual environment is activated
source venv/bin/activate # macOS/Linux
venv\Scripts\activate # Windows
# Install missing package
pip install pandas
# Install all common packages
pip install jupyter pandas numpy matplotlib seaborn scikit-learn
# Verify installation
python -c "import pandas; print(pandas.__version__)"
```
### Pip Installation Failures
**Problem:** `pip install` fails with permission errors
**Solution:**
```bash
# Use --user flag
pip install --user package-name
# Or use virtual environment (recommended)
python -m venv venv
source venv/bin/activate
pip install package-name
```
**Problem:** `pip install` fails with SSL certificate errors
**Solution:**
```bash
# Update pip first
python -m pip install --upgrade pip
# Try installing with trusted host (temporary workaround)
pip install --trusted-host pypi.org --trusted-host files.pythonhosted.org package-name
```
### Package Version Conflicts
**Problem:** Incompatible package versions
**Solution:**
```bash
# Create fresh virtual environment
python -m venv venv-new
source venv-new/bin/activate # or venv-new\Scripts\activate on Windows
# Install packages with specific versions if needed
pip install pandas==1.3.0
pip install numpy==1.21.0
# Or let pip resolve dependencies
pip install jupyter pandas numpy matplotlib seaborn scikit-learn
```
## Jupyter Notebook Issues
### Jupyter Won't Start
**Problem:** `jupyter notebook` command not found
**Solution:**
```bash
# Install Jupyter
pip install jupyter
# Or use python -m
python -m jupyter notebook
# Add to PATH if needed (macOS/Linux)
export PATH="$HOME/.local/bin:$PATH"
```
### Notebook Won't Load or Save
**Problem:** "Notebook failed to load" or save errors
**Solution:**
1. Check file permissions
```bash
# Make sure you have write permissions
ls -l notebook.ipynb
chmod 644 notebook.ipynb # If needed
```
2. Check for file corruption
```bash
# Try opening in text editor to check JSON structure
# Copy content to new notebook if corrupted
```
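3. Remove stale runtime files
```bash
# `jupyter notebook` has no cache-clearing option.
# Stop all Jupyter servers, then locate the runtime directory
# and delete any leftover files in it:
jupyter --runtime-dir
```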
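A notebook is a JSON file, so the corruption check in step 2 can be scripted instead of done by eye. The sketch below is illustrative. `check_notebook` is a hypothetical helper, and it only confirms that the file parses as JSON and has a top-level `cells` entry, not that every cell is valid.

```python
import json


def check_notebook(path: str) -> str:
    """Return 'ok' or a short description of why the notebook looks damaged."""
    try:
        with open(path, encoding="utf-8") as handle:
            notebook = json.load(handle)
    except json.JSONDecodeError as error:
        return f"invalid JSON at line {error.lineno}, column {error.colno}: {error.msg}"
    except UnicodeDecodeError as error:
        return f"file is not valid UTF-8: {error.reason}"
    except OSError as error:
        return f"cannot read file: {error}"
    if not isinstance(notebook, dict) or "cells" not in notebook:
        return "valid JSON, but no top-level 'cells' entry"
    return "ok"
```

For example, `print(check_notebook("notebook.ipynb"))` reports the line and column where parsing failed, which is usually where the file was truncated or edited by hand.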
### Cell Won't Execute
**Problem:** Cell stuck on "In [*]" or takes forever
**Solution:**
1. **Interrupt the kernel**: Click "Interrupt" button or press `I, I`
2. **Restart kernel**: Kernel menu → Restart
3. **Check for infinite loops** in your code
4. **Clear output**: Cell → All Output → Clear
### Plots Not Displaying
**Problem:** `matplotlib` plots don't show in notebook
**Solution:**
```python
# Add magic command at the top of notebook
%matplotlib inline
import matplotlib.pyplot as plt
# Create plot
plt.plot([1, 2, 3, 4])
plt.show() # Make sure to call show()
```
**Alternative for interactive plots:**
```python
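# Classic Jupyter Notebook (before version 7) only
%matplotlib notebook
# Or, in JupyterLab and Notebook 7+ (requires: pip install ipympl)
%matplotlib widget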
```
## Quiz Application Issues
### npm install Fails
**Problem:** Errors during `npm install`
**Solution:**
```bash
# Clear npm cache
npm cache clean --force
# Remove node_modules and package-lock.json
rm -rf node_modules package-lock.json
# Reinstall
npm install
# If still failing, try with legacy peer deps
npm install --legacy-peer-deps
```
### Quiz App Won't Start
**Problem:** `npm run serve` fails
**Solution:**
```bash
# Check Node.js version
node --version # Should be 12.x or higher
# Reinstall dependencies
cd quiz-app
rm -rf node_modules package-lock.json
npm install
# Try different port
npm run serve -- --port 8081
```
### Port Already in Use
**Problem:** "Port 8080 is already in use"
**Solution:**
```bash
# Find and kill process on port 8080
# macOS/Linux:
lsof -ti:8080 | xargs kill -9
# Windows:
netstat -ano | findstr :8080
taskkill /PID <PID> /F
# Or use a different port
npm run serve -- --port 8081
```
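To check whether a port is taken without remembering the platform-specific commands, you can try connecting to it from Python. This is a minimal sketch. `port_in_use` is a hypothetical helper that only tests whether something accepts TCP connections on the local machine.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


print("Port 8080 in use:", port_in_use(8080))
```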
### Quiz Not Loading or Blank Page
**Problem:** Quiz app loads but shows blank page
**Solution:**
1. Check browser console for errors (F12)
2. Clear browser cache and cookies
3. Try a different browser
4. Ensure JavaScript is enabled
5. Check for ad blockers interfering
```bash
# Rebuild the app
npm run build
npm run serve
```
## Git and GitHub Issues
### Git Not Recognized
**Problem:** `git: command not found`
**Solution:**
**Windows:**
- Install Git from [git-scm.com](https://git-scm.com/)
- Restart terminal after installation
**macOS:**
```bash
# Install via Homebrew
brew install git
# Or install Xcode Command Line Tools
xcode-select --install
```
**Linux:**
```bash
sudo apt-get install git # Debian/Ubuntu
sudo dnf install git # Fedora
```
### Clone Fails
**Problem:** `git clone` fails with authentication errors
**Solution:**
```bash
# Use HTTPS URL
git clone https://github.com/microsoft/Data-Science-For-Beginners.git
# GitHub does not accept account passwords for Git over HTTPS.
# If you are prompted for credentials, use a personal access token as the password
# Create a token at: https://github.com/settings/tokens
```
### Permission Denied (publickey)
**Problem:** SSH key authentication fails
**Solution:**
```bash
# Generate SSH key
ssh-keygen -t ed25519 -C "your_email@example.com"
# Add key to ssh-agent
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_ed25519
# Add public key to GitHub
# Copy key: cat ~/.ssh/id_ed25519.pub
# Add at: https://github.com/settings/keys
```
## Docsify Documentation Issues
### Docsify Command Not Found
**Problem:** `docsify: command not found`
**Solution:**
```bash
# Install globally
npm install -g docsify-cli
# If permission error on macOS/Linux
sudo npm install -g docsify-cli
# Verify installation
docsify --version
# If still not found, add npm's global bin directory to PATH
# Find the npm global prefix
npm config get prefix
# Add its bin directory to PATH (add to ~/.bashrc or ~/.zshrc)
export PATH="$PATH:$(npm config get prefix)/bin"
```
### Documentation Not Loading
**Problem:** Docsify serves but content doesn't load
**Solution:**
```bash
# Ensure you're in the repository root
cd Data-Science-For-Beginners
# Check for index.html
ls index.html
# Serve with specific port
docsify serve --port 3000
# Check browser console for errors (F12)
```
### Images Not Displaying
**Problem:** Images show broken link icon
**Solution:**
1. Check image paths are relative
2. Ensure image files exist in the repository
3. Clear browser cache
4. Verify file extensions match (case-sensitive on some systems)
## Data and File Issues
### File Not Found Errors
**Problem:** `FileNotFoundError` when loading data
**Solution:**
```python
import os
import pandas as pd
# Check current working directory
print(os.getcwd())
# Use absolute path
data_path = os.path.join(os.getcwd(), 'data', 'filename.csv')
df = pd.read_csv(data_path)
# Or use relative path from notebook location
df = pd.read_csv('../data/filename.csv')
# Verify file exists
print(os.path.exists('data/filename.csv'))
```
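Because notebooks sit at different depths in the repository, the right relative path depends on where Jupyter was started. One way around this is to search upward from the current directory. The sketch below is only an illustration. `find_data_file` is a hypothetical helper, and the `data` folder name is an assumption carried over from the example above.

```python
from pathlib import Path
from typing import Optional


def find_data_file(
    filename: str, start: Optional[Path] = None, folder: str = "data"
) -> Optional[Path]:
    """Search `start` and its parent directories for `folder/filename`."""
    current = (start or Path.cwd()).resolve()
    for directory in (current, *current.parents):
        candidate = directory / folder / filename
        if candidate.is_file():
            return candidate
    return None


path = find_data_file("filename.csv")
print(path if path else "filename.csv not found in any parent 'data' folder")
```

The returned path can be passed straight to `pd.read_csv`.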
### CSV Reading Errors
**Problem:** Errors reading CSV files
**Solution:**
```python
import pandas as pd
# Try different encodings
df = pd.read_csv('file.csv', encoding='utf-8')
# or
df = pd.read_csv('file.csv', encoding='latin-1')
# or Windows-1252 ('ISO-8859-1' is an alias of 'latin-1', so it would behave identically)
df = pd.read_csv('file.csv', encoding='cp1252')
# Handle missing values
df = pd.read_csv('file.csv', na_values=['NA', 'N/A', ''])
# Specify delimiter if not comma
df = pd.read_csv('file.csv', delimiter=';')
```
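Rather than editing the `encoding` argument by hand, you can try a few candidates on the raw bytes first. This is a sketch using only the standard library. `detect_encoding` is a hypothetical helper, and the candidate list is an assumption. A successful decode does not prove the encoding is the right one, and `latin-1` accepts any byte sequence, so it acts as a last resort.

```python
def detect_encoding(raw: bytes, candidates=("utf-8", "cp1252", "latin-1")) -> str:
    """Return the first candidate encoding that decodes `raw` without error."""
    for encoding in candidates:
        try:
            raw.decode(encoding)
        except UnicodeDecodeError:
            continue
        return encoding
    raise ValueError("none of the candidate encodings could decode the data")


sample = "café".encode("utf-8")
print(detect_encoding(sample))
```

In practice you would read the file in binary mode, for example `open('file.csv', 'rb').read()`, and pass the result to `pd.read_csv('file.csv', encoding=...)`.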
### Memory Errors with Large Datasets
**Problem:** `MemoryError` when loading large files
**Solution:**
```python
import pandas as pd

# Read in chunks
chunk_size = 10000
chunks = []
for chunk in pd.read_csv('large_file.csv', chunksize=chunk_size):
    # Process chunk (filter or aggregate it here; concatenating
    # unmodified chunks needs as much memory as reading the whole file)
    chunks.append(chunk)
df = pd.concat(chunks)
# Or read specific columns only
df = pd.read_csv('file.csv', usecols=['col1', 'col2'])
# Use more efficient data types
df = pd.read_csv('file.csv', dtype={'column_name': 'int32'})
```
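Chunking saves memory only when each chunk is reduced before the next one is read. The sketch below shows that pattern with the standard library's `csv` module instead of pandas, so the mechanics are visible. `iter_chunks` and `column_mean` are hypothetical helpers, and the file is assumed to be UTF-8 with a header row.

```python
import csv
from itertools import islice


def iter_chunks(rows, chunk_size):
    """Yield lists of at most `chunk_size` rows from any iterable of rows."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def column_mean(path: str, column: str, chunk_size: int = 10000) -> float:
    """Compute the mean of a numeric column while holding one chunk in memory."""
    total, count = 0.0, 0
    with open(path, newline="", encoding="utf-8") as handle:
        for chunk in iter_chunks(csv.DictReader(handle), chunk_size):
            # Reduce the chunk to two numbers, then let it be discarded.
            values = [float(row[column]) for row in chunk if row[column] != ""]
            total += sum(values)
            count += len(values)
    if count == 0:
        raise ValueError(f"no numeric values found in column {column!r}")
    return total / count
```

The same idea applies to `pd.read_csv(..., chunksize=...)`: accumulate a running result per chunk instead of collecting every chunk in a list.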
## Performance Issues
### Slow Notebook Performance
**Problem:** Notebooks run very slowly
**Solution:**
1. **Restart kernel and clear output**
- Kernel → Restart & Clear Output
2. **Close unused notebooks**
3. **Optimize code:**
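```python
import numpy as np

data = np.array([1, 2, 3, 4])

# Use vectorized operations instead of loops
# Bad:
result = []
for x in data:
    result.append(x * 2)
# Good (needs a NumPy array or pandas Series;
# on a plain list, `data * 2` repeats the list instead):
result = data * 2 # NumPy/Pandas vectorization
```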
4. **Sample large datasets:**
```python
# Work with sample during development
df_sample = df.sample(n=1000) # or df.head(1000)
```
### Browser Crashes
**Problem:** Browser crashes or becomes unresponsive
**Solution:**
1. Close unused tabs
2. Clear browser cache
3. Reduce browser memory use: disable unneeded extensions and close other memory-heavy applications
4. Use JupyterLab instead:
```bash
pip install jupyterlab
jupyter lab
```
## Getting Additional Help
### Before Asking for Help
1. Check this troubleshooting guide
2. Search [GitHub Issues](https://github.com/microsoft/Data-Science-For-Beginners/issues)
3. Review [INSTALLATION.md](INSTALLATION.md) and [USAGE.md](USAGE.md)
4. Try searching the error message online
### How to Ask for Help
When creating an issue or asking for help, include:
1. **Operating System**: Windows, macOS, or Linux (which distribution)
2. **Python Version**: Run `python --version`
3. **Error Message**: Copy the complete error message
4. **Steps to Reproduce**: What you did before the error occurred
5. **What You've Tried**: Solutions you've already attempted
**Example:**
```
**Operating System:** macOS 12.0
**Python Version:** 3.9.7
**Error Message:** ModuleNotFoundError: No module named 'pandas'
**Steps to Reproduce:**
1. Activated virtual environment
2. Started Jupyter notebook
3. Tried to import pandas
**What I've Tried:**
- Ran pip install pandas
- Restarted Jupyter
```
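The environment details for a report like this can be collected with a few lines of Python, which avoids typos when copying version numbers. This is a minimal sketch. `environment_report` is a hypothetical helper, and the extra "Python Executable" line is an addition that is not part of the template above.

```python
import platform
import sys


def environment_report() -> str:
    """Return OS and Python details formatted like the example above."""
    return "\n".join(
        [
            f"**Operating System:** {platform.system()} {platform.release()}",
            f"**Python Version:** {platform.python_version()}",
            f"**Python Executable:** {sys.executable}",
        ]
    )


print(environment_report())
```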
### Community Resources
- **GitHub Issues**: [Create an issue](https://github.com/microsoft/Data-Science-For-Beginners/issues/new)
- **Discord**: [Join our community](https://aka.ms/ds4beginners/discord)
- **Discussions**: [GitHub Discussions](https://github.com/microsoft/Data-Science-For-Beginners/discussions)
- **Microsoft Learn**: [Q&A Forums](https://docs.microsoft.com/answers/)
### Related Documentation
- [INSTALLATION.md](INSTALLATION.md) - Setup instructions
- [USAGE.md](USAGE.md) - How to use the curriculum
- [CONTRIBUTING.md](CONTRIBUTING.md) - How to contribute
- [README.md](README.md) - Project overview

@ -0,0 +1,360 @@
# Usage Guide
This guide provides examples and common workflows for using the Data Science for Beginners curriculum.
## Table of Contents
- [How to Use This Curriculum](#how-to-use-this-curriculum)
- [Working with Lessons](#working-with-lessons)
- [Working with Jupyter Notebooks](#working-with-jupyter-notebooks)
- [Using the Quiz Application](#using-the-quiz-application)
- [Common Workflows](#common-workflows)
- [Tips for Self-Learners](#tips-for-self-learners)
- [Tips for Teachers](#tips-for-teachers)
## How to Use This Curriculum
This curriculum is designed to be flexible and can be used in multiple ways:
- **Self-paced learning**: Work through lessons independently at your own speed
- **Classroom instruction**: Use as a structured course with guided instruction
- **Study groups**: Learn collaboratively with peers
- **Workshop format**: Intensive short-term learning sessions
## Working with Lessons
Each lesson follows a consistent structure to maximize learning:
### Lesson Structure
1. **Pre-lesson Quiz**: Test your existing knowledge
2. **Sketchnote** (Optional): Visual summary of key concepts
3. **Video** (Optional): Supplemental video content
4. **Written Lesson**: Core concepts and explanations
5. **Jupyter Notebook**: Hands-on coding exercises
6. **Assignment**: Practice what you've learned
7. **Post-lesson Quiz**: Reinforce your understanding
### Example Workflow for a Lesson
```bash
# 1. Navigate to the lesson directory
cd 1-Introduction/01-defining-data-science
# 2. Read the README.md
# Open README.md in your browser or editor
# 3. Take the pre-lesson quiz
# Click the quiz link in the README
# 4. Open the Jupyter notebook (if available)
jupyter notebook
# 5. Complete the exercises in the notebook
# 6. Work on the assignment
# 7. Take the post-lesson quiz
```
## Working with Jupyter Notebooks
### Starting Jupyter
```bash
# Activate your virtual environment
source venv/bin/activate # On macOS/Linux
# OR
venv\Scripts\activate # On Windows
# Start Jupyter from the repository root
jupyter notebook
```
### Running Notebook Cells
1. **Execute a cell**: Press `Shift + Enter` or click the "Run" button
2. **Execute all cells**: Select "Cell" → "Run All" from the menu
3. **Restart kernel**: Select "Kernel" → "Restart" if you encounter issues
### Example: Working with Data in a Notebook
```python
# Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
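# Load a dataset (placeholder path; point this at a real CSV file)
df = pd.read_csv('data/sample.csv')
# Explore the data. A cell displays only its last expression,
# so print intermediate results or run each call in its own cell.
print(df.head())
df.info()
print(df.describe())
# Create a visualization ('column_name' is a placeholder for a numeric column)
plt.figure(figsize=(10, 6))
plt.plot(df['column_name'])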
plt.title('Sample Visualization')
plt.xlabel('X-axis Label')
plt.ylabel('Y-axis Label')
plt.show()
```
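If you do not have a CSV file at hand yet, the same exploration steps can be tried on a small in-memory dataset. This is a minimal sketch: the column names (`day`, `temperature`) and the values are invented for illustration and do not come from the curriculum.

```python
import io

import pandas as pd

# Hypothetical sample data, defined inline so no file is needed
csv_text = """day,temperature
1,21.5
2,23.0
3,19.5
4,22.0
"""

# read_csv accepts any file-like object, not only a path
df = pd.read_csv(io.StringIO(csv_text))

# Print intermediate results explicitly so each one is shown
print(df.head())
print(df.describe())

# Compute a simple summary statistic for one column
mean_temperature = df["temperature"].mean()
print(f"Mean temperature: {mean_temperature}")
```

Once this runs, swap the `io.StringIO(...)` argument for the path to one of the lesson datasets.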
### Saving Your Work
- Jupyter auto-saves periodically
- Manually save: Press `Ctrl + S` (or `Cmd + S` on macOS)
- Your progress is saved in the `.ipynb` file
## Using the Quiz Application
### Running the Quiz App Locally
```bash
# Navigate to quiz app directory
cd quiz-app
# Install dependencies (first run only)
npm install
# Start the development server
npm run serve
# Access at http://localhost:8080
```
### Taking Quizzes
1. Pre-lesson quizzes are linked at the top of each lesson
2. Post-lesson quizzes are linked at the bottom of each lesson
3. Each quiz has 3 questions
4. Quizzes are designed to reinforce learning, not to test exhaustively
### Quiz Numbering
- Quizzes are numbered 0-39 (40 total quizzes)
- Each lesson typically has a pre and post quiz
- Quiz URLs include the quiz number: `https://ff-quizzes.netlify.app/en/ds/quiz/0`
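The numbering can be turned into a small lookup helper. The sketch below assumes that quizzes are assigned strictly in lesson order, two per lesson (pre-quiz first), which is consistent with 20 lessons and 40 quizzes but is not stated explicitly here. The function name `quiz_urls` is hypothetical, and the resulting links should be checked against those in each lesson README.

```python
QUIZ_BASE_URL = "https://ff-quizzes.netlify.app/en/ds/quiz"
TOTAL_LESSONS = 20


def quiz_urls(lesson: int) -> dict:
    """Return pre/post quiz URLs for a 1-based lesson number.

    Assumption: quizzes are numbered in lesson order, two per lesson.
    """
    if not 1 <= lesson <= TOTAL_LESSONS:
        raise ValueError(f"lesson must be between 1 and {TOTAL_LESSONS}")
    pre = 2 * (lesson - 1)  # lesson 1 -> quiz 0, lesson 2 -> quiz 2, ...
    post = pre + 1
    return {
        "pre": f"{QUIZ_BASE_URL}/{pre}",
        "post": f"{QUIZ_BASE_URL}/{post}",
    }


print(quiz_urls(1))
```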
## Common Workflows
### Workflow 1: Complete Beginner Path
```bash
# 1. Set up your environment (see INSTALLATION.md)
# 2. Start with Lesson 1
cd 1-Introduction/01-defining-data-science
# 3. For each lesson:
# - Take pre-lesson quiz
# - Read the lesson content
# - Work through the notebook
# - Complete the assignment
# - Take post-lesson quiz
# 4. Progress through all 20 lessons sequentially
```
### Workflow 2: Topic-Specific Learning
If you're interested in a specific topic:
```bash
# Example: Focus on Data Visualization
cd 3-Data-Visualization
# Explore lessons 9-13:
# - Lesson 9: Visualizing Quantities
# - Lesson 10: Visualizing Distributions
# - Lesson 11: Visualizing Proportions
# - Lesson 12: Visualizing Relationships
# - Lesson 13: Meaningful Visualizations
```
### Workflow 3: Project-Based Learning
```bash
# 1. Review the Data Science Lifecycle lessons (14-16)
cd 4-Data-Science-Lifecycle
# 2. Work through a real-world example (Lesson 20)
cd ../6-Data-Science-In-Wild/20-Real-World-Examples
# 3. Apply concepts to your own project
```
### Workflow 4: Cloud-Based Data Science
```bash
# Learn about cloud data science (Lessons 17-19)
cd 5-Data-Science-In-Cloud
# 17: Introduction to Cloud Data Science
# 18: Low-Code ML Tools
# 19: Azure Machine Learning Studio
```
## Tips for Self-Learners
### Stay Organized
```bash
# Create a learning journal
mkdir my-learning-journal
# For each lesson, create notes
echo "# Lesson 1 Notes" > my-learning-journal/lesson-01-notes.md
```
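If you prefer to set up the whole journal at once, or you are on a system without a Unix shell, a short Python script can do the same. This is a sketch only: the function name `create_journal` is hypothetical, and the file naming simply follows the `lesson-01-notes.md` pattern shown above.

```python
from pathlib import Path


def create_journal(root: str = "my-learning-journal", lessons: int = 20) -> list:
    """Create one Markdown notes file per lesson and return their paths."""
    journal = Path(root)
    journal.mkdir(parents=True, exist_ok=True)
    notes = []
    for number in range(1, lessons + 1):
        note = journal / f"lesson-{number:02d}-notes.md"
        # Never overwrite notes that already exist
        if not note.exists():
            note.write_text(f"# Lesson {number} Notes\n", encoding="utf-8")
        notes.append(note)
    return notes


if __name__ == "__main__":
    for path in create_journal():
        print(path)
```

Because existing files are left untouched, the script is safe to run again later.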
### Practice Regularly
- Set aside dedicated time each day or week
- Complete at least one lesson per week
- Review previous lessons periodically
### Engage with the Community
- Join the [Discord community](https://aka.ms/ds4beginners/discord)
- Participate in [GitHub Discussions](https://github.com/microsoft/Data-Science-For-Beginners/discussions)
- Share your progress and ask questions
### Build Your Own Projects
After completing lessons, apply concepts to personal projects:
```python
# Example: Analyze your own dataset
import pandas as pd
# Load your own data
my_data = pd.read_csv('my-project/data.csv')
# Apply techniques learned
# - Data cleaning (Lesson 8)
# - Exploratory data analysis (Lesson 7)
# - Visualization (Lessons 9-13)
# - Analysis (Lesson 15)
```
## Tips for Teachers
### Classroom Setup
1. Review [for-teachers.md](for-teachers.md) for detailed guidance
2. Set up a shared environment (GitHub Classroom or Codespaces)
3. Establish a communication channel (Discord, Slack, or Teams)
### Lesson Planning
**Suggested 10-Week Schedule:**
- **Weeks 1-2**: Introduction (Lessons 1-4)
- **Weeks 3-4**: Working with Data (Lessons 5-8)
- **Weeks 5-6**: Data Visualization (Lessons 9-13)
- **Weeks 7-8**: Data Science Lifecycle (Lessons 14-16)
- **Week 9**: Cloud Data Science (Lessons 17-19)
- **Week 10**: Real-World Applications & Final Projects (Lesson 20)
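The schedule can also be kept as data, which makes it easy to check the pacing or to look up where a lesson falls. The sketch below copies the week and lesson ranges from the list above; the names `SCHEDULE` and `unit_for_lesson` are hypothetical.

```python
# (first week, last week), unit name, (first lesson, last lesson)
SCHEDULE = [
    ((1, 2), "Introduction", (1, 4)),
    ((3, 4), "Working with Data", (5, 8)),
    ((5, 6), "Data Visualization", (9, 13)),
    ((7, 8), "Data Science Lifecycle", (14, 16)),
    ((9, 9), "Cloud Data Science", (17, 19)),
    ((10, 10), "Real-World Applications & Final Projects", (20, 20)),
]


def unit_for_lesson(lesson: int):
    """Return ((first week, last week), unit name) for a lesson number."""
    for weeks, unit, (first, last) in SCHEDULE:
        if first <= lesson <= last:
            return weeks, unit
    raise ValueError(f"No unit covers lesson {lesson}")


# Print the pacing of each unit in lessons per week
for (start, end), unit, (first, last) in SCHEDULE:
    label = f"Week {start}" if start == end else f"Weeks {start}-{end}"
    pace = (last - first + 1) / (end - start + 1)
    print(f"{label}: {unit}, {pace:.1f} lessons per week")
```

The output shows that the pace is uneven (for example, Week 9 covers three lessons), which may be worth adjusting for your class.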
### Running Docsify for Offline Access
```bash
# Serve documentation locally for classroom use
docsify serve
# Students can access at localhost:3000
# No internet required after initial setup
```
### Assignment Grading
- Review student notebooks for completed exercises
- Check for understanding through quiz scores
- Evaluate final projects using data science lifecycle principles
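A first pass over submitted notebooks can be automated, because an `.ipynb` file is JSON in which every code cell records an `execution_count` (null if the cell was never run). The following is a rough sketch, not a grading tool: the function name `count_executed_cells` and the `submissions` folder are hypothetical, and an executed cell is not necessarily a correct one.

```python
import json
from pathlib import Path


def count_executed_cells(notebook_path):
    """Return (executed, total) counts of code cells in an .ipynb file."""
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    code_cells = [
        cell for cell in notebook.get("cells", [])
        if cell.get("cell_type") == "code"
    ]
    executed = [
        cell for cell in code_cells
        if cell.get("execution_count") is not None
    ]
    return len(executed), len(code_cells)


if __name__ == "__main__":
    # "submissions" is a hypothetical folder of collected student notebooks
    for path in sorted(Path("submissions").glob("*.ipynb")):
        done, total = count_executed_cells(path)
        print(f"{path.name}: {done}/{total} code cells executed")
```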
### Creating Assignments
```python
# Example custom assignment template
"""
Assignment: [Topic]
Objective: [Learning goal]
Dataset: [Provide or have students find one]
Tasks:
1. Load and explore the dataset
2. Clean and prepare the data
3. Create at least 3 visualizations
4. Perform analysis
5. Communicate findings
Deliverables:
- Jupyter notebook with code and explanations
- Written summary of findings
"""
```
## Working Offline
### Download Resources
```bash
# Clone the entire repository
git clone https://github.com/microsoft/Data-Science-For-Beginners.git
# Most datasets ship with the repository;
# download any externally hosted ones in advance
```
### Run Documentation Locally
```bash
# Serve with Docsify
docsify serve
# Access at localhost:3000
```
### Run Quiz App Locally
```bash
cd quiz-app
npm run serve
```
## Accessing Translated Content
Translations are available in 40+ languages:
```bash
# Access translated lessons (run ONE of these from the repository root)
cd translations/fr # French
# cd translations/es # Spanish
# cd translations/de # German
# ... and many more
```
Each translation maintains the same structure as the English version.
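To see which languages are present in your local clone, or to find the translated counterpart of a lesson, a small helper can walk the `translations` folder. This sketch assumes one sub-folder per language code and relies on the statement above that each translation mirrors the English layout; both function names are hypothetical.

```python
from pathlib import Path


def list_translations(repo_root="."):
    """Return sorted language codes found under <repo_root>/translations."""
    translations = Path(repo_root) / "translations"
    if not translations.is_dir():
        return []
    return sorted(p.name for p in translations.iterdir() if p.is_dir())


def translated_path(lesson_path, language, repo_root="."):
    """Map an English lesson path to the matching translated path."""
    return Path(repo_root) / "translations" / language / lesson_path


if __name__ == "__main__":
    print(list_translations())
    print(translated_path("1-Introduction/01-defining-data-science", "fr"))
```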
## Additional Resources
### Continue Learning
- [Microsoft Learn](https://docs.microsoft.com/learn/) - Additional learning paths
- [Student Hub](https://docs.microsoft.com/learn/student-hub) - Resources for students
- [Azure AI Foundry](https://aka.ms/foundry/forum) - Community forum
### Related Curricula
- [AI for Beginners](https://aka.ms/ai-beginners)
- [ML for Beginners](https://aka.ms/ml-beginners)
- [Web Dev for Beginners](https://aka.ms/webdev-beginners)
- [Generative AI for Beginners](https://aka.ms/genai-beginners)
## Getting Help
- Check [TROUBLESHOOTING.md](TROUBLESHOOTING.md) for common issues
- Search [GitHub Issues](https://github.com/microsoft/Data-Science-For-Beginners/issues)
- Join our [Discord](https://aka.ms/ds4beginners/discord)
- Review [CONTRIBUTING.md](CONTRIBUTING.md) to report issues or contribute