You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
ML-For-Beginners/TROUBLESHOOTING.md

597 lines
13 KiB

# Troubleshooting Guide
This guide helps you solve common problems when working with the Machine Learning for Beginners curriculum. If you don't find a solution here, please check our [Discord Discussions](https://aka.ms/foundry/discord) or [open an issue](https://github.com/microsoft/ML-For-Beginners/issues).
## Table of Contents
- [Installation Issues](#installation-issues)
- [Jupyter Notebook Issues](#jupyter-notebook-issues)
- [Python Package Issues](#python-package-issues)
- [R Environment Issues](#r-environment-issues)
- [Quiz Application Issues](#quiz-application-issues)
- [Data and File Path Issues](#data-and-file-path-issues)
- [Common Error Messages](#common-error-messages)
- [Performance Issues](#performance-issues)
- [Environment and Configuration](#environment-and-configuration)
---
## Installation Issues
### Python Installation
**Problem**: `python: command not found`
**Solution**:
1. Install Python 3.8 or higher from [python.org](https://www.python.org/downloads/)
2. Verify installation: `python --version` or `python3 --version`
3. On macOS/Linux, you may need to use `python3` instead of `python`
**Problem**: Multiple Python versions causing conflicts
**Solution**:
```bash
# Use virtual environments to isolate projects
python -m venv ml-env
# Activate virtual environment
# On Windows:
ml-env\Scripts\activate
# On macOS/Linux:
source ml-env/bin/activate
```
### Jupyter Installation
**Problem**: `jupyter: command not found`
**Solution**:
```bash
# Install Jupyter
pip install jupyter
# Or with pip3
pip3 install jupyter
# Verify installation
jupyter --version
```
**Problem**: Jupyter won't launch in browser
**Solution**:
```bash
# Try specifying the browser
jupyter notebook --browser=chrome
# Or copy the URL with token from terminal and paste in browser manually
# Look for: http://localhost:8888/?token=...
```
### R Installation
**Problem**: R packages won't install
**Solution**:
```r
# Ensure you have the latest R version
# Install packages with dependencies
install.packages(c("tidyverse", "tidymodels", "caret"), dependencies = TRUE)
# If compilation fails, try installing binary versions
install.packages("package-name", type = "binary")
```
**Problem**: IRkernel not available in Jupyter
**Solution**:
```r
# In R console
install.packages('IRkernel')
IRkernel::installspec(user = TRUE)
```
---
## Jupyter Notebook Issues
### Kernel Issues
**Problem**: Kernel keeps dying or restarting
**Solution**:
1. Restart the kernel: `Kernel → Restart`
2. Clear output and restart: `Kernel → Restart & Clear Output`
3. Check for memory issues (see [Performance Issues](#performance-issues))
4. Try running cells individually to identify problematic code
**Problem**: Wrong Python kernel selected
**Solution**:
1. Check current kernel: `Kernel → Change Kernel`
2. Select the correct Python version
3. If kernel is missing, create it:
```bash
python -m ipykernel install --user --name=ml-env
```
**Problem**: Kernel won't start
**Solution**:
```bash
# Reinstall ipykernel
pip uninstall ipykernel
pip install ipykernel
# Register the kernel again
python -m ipykernel install --user
```
### Notebook Cell Issues
**Problem**: Cells are running but not showing output
**Solution**:
1. Check if cell is still running (look for `[*]` indicator)
2. Restart kernel and run all cells: `Kernel → Restart & Run All`
3. Check browser console for JavaScript errors (F12)
**Problem**: Can't run cells - no response when clicking "Run"
**Solution**:
1. Check if Jupyter server is still running in terminal
2. Refresh the browser page
3. Close and reopen the notebook
4. Restart Jupyter server
---
## Python Package Issues
### Import Errors
**Problem**: `ModuleNotFoundError: No module named 'sklearn'`
**Solution**:
```bash
pip install scikit-learn
# Common ML packages for this course
pip install scikit-learn pandas numpy matplotlib seaborn
```
**Problem**: `ImportError: cannot import name 'X' from 'sklearn'`
**Solution**:
```bash
# Update scikit-learn to latest version
pip install --upgrade scikit-learn
# Check version
python -c "import sklearn; print(sklearn.__version__)"
```
### Version Conflicts
**Problem**: Package version incompatibility errors
**Solution**:
```bash
# Create a new virtual environment
python -m venv fresh-env
source fresh-env/bin/activate # or fresh-env\Scripts\activate on Windows
# Install packages fresh
pip install jupyter scikit-learn pandas numpy matplotlib seaborn
# If specific version needed
pip install scikit-learn==1.3.0
```
**Problem**: `pip install` fails with permission errors
**Solution**:
```bash
# Install for current user only
pip install --user package-name
# Or use virtual environment (recommended)
python -m venv venv
source venv/bin/activate
pip install package-name
```
### Data Loading Issues
**Problem**: `FileNotFoundError` when loading CSV files
**Solution**:
```python
import os
# Check current working directory
print(os.getcwd())
# Use relative paths from notebook location
df = pd.read_csv('../../data/filename.csv')
# Or use absolute paths
df = pd.read_csv('/full/path/to/data/filename.csv')
```
---
## R Environment Issues
### Package Installation
**Problem**: Package installation fails with compilation errors
**Solution**:
```r
# Install binary version (Windows/macOS)
install.packages("package-name", type = "binary")
# Update R to latest version if packages require it
# Check R version
R.version.string
# Install system dependencies (Linux)
# For Ubuntu/Debian, in terminal:
# sudo apt-get install r-base-dev
```
**Problem**: `tidyverse` won't install
**Solution**:
```r
# Install dependencies first
install.packages(c("rlang", "vctrs", "pillar"))
# Then install tidyverse
install.packages("tidyverse")
# Or install components individually
install.packages(c("dplyr", "ggplot2", "tidyr", "readr"))
```
### RMarkdown Issues
**Problem**: RMarkdown won't render
**Solution**:
```r
# Install/update rmarkdown
install.packages("rmarkdown")
# Install pandoc if needed
install.packages("pandoc")
# For PDF output, install tinytex
install.packages("tinytex")
tinytex::install_tinytex()
```
---
## Quiz Application Issues
### Build and Installation
**Problem**: `npm install` fails
**Solution**:
```bash
# Clear npm cache
npm cache clean --force
# Remove node_modules and package-lock.json
rm -rf node_modules package-lock.json
# Reinstall
npm install
# If still fails, try with legacy peer deps
npm install --legacy-peer-deps
```
**Problem**: Port 8080 already in use
**Solution**:
```bash
# Use different port
npm run serve -- --port 8081
# Or find and kill process using port 8080
# On Linux/macOS:
lsof -ti:8080 | xargs kill -9
# On Windows:
netstat -ano | findstr :8080
taskkill /PID <PID> /F
```
### Build Errors
**Problem**: `npm run build` fails
**Solution**:
```bash
# Check Node.js version (should be 14+)
node --version
# Update Node.js if needed
# Then clean install
rm -rf node_modules package-lock.json
npm install
npm run build
```
**Problem**: Linting errors preventing build
**Solution**:
```bash
# Fix auto-fixable issues
npm run lint -- --fix
# Or temporarily disable linting in build
# (not recommended for production)
```
---
## Data and File Path Issues
### Path Problems
**Problem**: Data files not found when running notebooks
**Solution**:
1. **Always run notebooks from their containing directory**
```bash
cd /path/to/lesson/folder
jupyter notebook
```
2. **Check relative paths in code**
```python
# Correct path from notebook location
df = pd.read_csv('../data/filename.csv')
# Not from your terminal location
```
3. **Use absolute paths if needed**
```python
import os
base_path = os.path.dirname(os.path.abspath(__file__))
data_path = os.path.join(base_path, 'data', 'filename.csv')
```
### Missing Data Files
**Problem**: Dataset files are missing
**Solution**:
1. Check if data should be in the repository - most datasets are included
2. Some lessons may require downloading data - check lesson README
3. Ensure you've pulled the latest changes:
```bash
git pull origin main
```
---
## Common Error Messages
### Memory Errors
**Error**: `MemoryError` or kernel dies when processing data
**Solution**:
```python
# Load data in chunks
for chunk in pd.read_csv('large_file.csv', chunksize=10000):
process(chunk)
# Or read only needed columns
df = pd.read_csv('file.csv', usecols=['col1', 'col2'])
# Free memory when done
del large_dataframe
import gc
gc.collect()
```
### Convergence Warnings
**Warning**: `ConvergenceWarning: Maximum number of iterations reached`
**Solution**:
```python
from sklearn.linear_model import LogisticRegression
# Increase max iterations
model = LogisticRegression(max_iter=1000)
# Or scale your features first
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
```
### Plotting Issues
**Problem**: Plots not showing in Jupyter
**Solution**:
```python
# Enable inline plotting
%matplotlib inline
# Import pyplot
import matplotlib.pyplot as plt
# Show plot explicitly
plt.plot(data)
plt.show()
```
**Problem**: Seaborn plots look different or throw errors
**Solution**:
```python
import warnings
warnings.filterwarnings('ignore', category=UserWarning)
# Update to compatible version
# pip install --upgrade seaborn matplotlib
```
### Unicode/Encoding Errors
**Problem**: `UnicodeDecodeError` when reading files
**Solution**:
```python
# Specify encoding explicitly
df = pd.read_csv('file.csv', encoding='utf-8')
# Or try different encoding
df = pd.read_csv('file.csv', encoding='latin-1')
# For errors='ignore' to skip problematic characters
df = pd.read_csv('file.csv', encoding='utf-8', errors='ignore')
```
---
## Performance Issues
### Slow Notebook Execution
**Problem**: Notebooks are very slow to run
**Solution**:
1. **Restart kernel to free memory**: `Kernel → Restart`
2. **Close unused notebooks** to free resources
3. **Use smaller data samples for testing**:
```python
# Work with subset during development
df_sample = df.sample(n=1000)
```
4. **Profile your code** to find bottlenecks:
```python
%time operation() # Time single operation
%timeit operation() # Time with multiple runs
```
### High Memory Usage
**Problem**: System running out of memory
**Solution**:
```python
# Check memory usage
df.info(memory_usage='deep')
# Optimize data types
df['column'] = df['column'].astype('int32') # Instead of int64
# Drop unnecessary columns
df = df[['col1', 'col2']] # Keep only needed columns
# Process in batches
for batch in np.array_split(df, 10):
process(batch)
```
---
## Environment and Configuration
### Virtual Environment Issues
**Problem**: Virtual environment not activating
**Solution**:
```bash
# Windows
python -m venv venv
venv\Scripts\activate.bat
# macOS/Linux
python3 -m venv venv
source venv/bin/activate
# Check if activated (should show venv name in prompt)
which python # Should point to venv python
```
**Problem**: Packages installed but not found in notebook
**Solution**:
```bash
# Ensure notebook uses the correct kernel
# Install ipykernel in your venv
pip install ipykernel
python -m ipykernel install --user --name=ml-env --display-name="Python (ml-env)"
# In Jupyter: Kernel → Change Kernel → Python (ml-env)
```
### Git Issues
**Problem**: Can't pull latest changes - merge conflicts
**Solution**:
```bash
# Stash your changes
git stash
# Pull latest
git pull origin main
# Reapply your changes
git stash pop
# If conflicts, resolve manually or:
git checkout --theirs path/to/file # Take remote version
git checkout --ours path/to/file # Keep your version
```
### VS Code Integration
**Problem**: Jupyter notebooks won't open in VS Code
**Solution**:
1. Install Python extension in VS Code
2. Install Jupyter extension in VS Code
3. Select correct Python interpreter: `Ctrl+Shift+P` → "Python: Select Interpreter"
4. Restart VS Code
---
## Additional Resources
- **Discord Discussions**: [Ask questions and share solutions in the #ml-for-beginners channel](https://aka.ms/foundry/discord)
- **Microsoft Learn**: [ML for Beginners modules](https://learn.microsoft.com/en-us/collections/qrqzamz1nn2wx3?WT.mc_id=academic-77952-bethanycheum)
- **Video Tutorials**: [YouTube Playlist](https://aka.ms/ml-beginners-videos)
- **Issue Tracker**: [Report bugs](https://github.com/microsoft/ML-For-Beginners/issues)
---
## Still Having Issues?
If you've tried the solutions above and still experiencing problems:
1. **Search existing issues**: [GitHub Issues](https://github.com/microsoft/ML-For-Beginners/issues)
2. **Check discussions in Discord**: [Discord Discussions](https://aka.ms/foundry/discord)
3. **Open a new issue**: Include:
- Your operating system and version
- Python/R version
- Error message (full traceback)
- Steps to reproduce the problem
- What you've already tried
We're here to help! 🚀