# Troubleshooting Guide

This guide provides solutions to common issues you might encounter while working with the Data Science for Beginners curriculum.

## Table of Contents

- [Python and Jupyter Issues](#python-and-jupyter-issues)
- [Package and Dependency Issues](#package-and-dependency-issues)
- [Jupyter Notebook Issues](#jupyter-notebook-issues)
- [Quiz Application Issues](#quiz-application-issues)
- [Git and GitHub Issues](#git-and-github-issues)
- [Docsify Documentation Issues](#docsify-documentation-issues)
- [Data and File Issues](#data-and-file-issues)
- [Performance Issues](#performance-issues)
- [Getting Additional Help](#getting-additional-help)

## Python and Jupyter Issues

### Python Not Found or Wrong Version

**Problem:** `python: command not found` or wrong Python version

**Solution:**

```bash
# Check Python version
python --version
python3 --version

# If Python 3 is installed as 'python3', create an alias
# On macOS/Linux, add to ~/.bashrc or ~/.zshrc:
alias python=python3
alias pip=pip3

# Or use python3 explicitly
python3 -m pip install jupyter
```

**Windows Solution:**
1. Reinstall Python from [python.org](https://www.python.org/)
2. During installation, check "Add Python to PATH"
3. Restart your terminal/command prompt

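When it is unclear which interpreter a command actually resolves to, Python can report this itself. The following is a minimal sketch using only the standard library. The `MIN_VERSION` threshold of 3.8 is an assumption chosen for illustration, not a requirement stated by the curriculum.

```python
import sys

# Hypothetical minimum version, chosen only for illustration
MIN_VERSION = (3, 8)


def describe_interpreter():
    """Return the running interpreter's path and version as a dict."""
    return {
        "executable": sys.executable,
        "version": tuple(sys.version_info[:3]),
        "meets_minimum": sys.version_info[:2] >= MIN_VERSION,
    }


info = describe_interpreter()
print(f"Interpreter: {info['executable']}")
print("Version: " + ".".join(str(part) for part in info["version"]))
print(f"Meets minimum {MIN_VERSION}: {info['meets_minimum']}")
```

Running the script with both `python` and `python3` shows whether the two commands point at the same installation.
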
### Virtual Environment Activation Issues

**Problem:** Virtual environment won't activate

**Solution:**

**Windows:**
```powershell
# If you get an execution policy error (PowerShell)
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser

# Then activate
venv\Scripts\activate
```

**macOS/Linux:**
```bash
# `source` only needs read access to the script, so confirm it exists
ls -l venv/bin/activate

# If it is missing, recreate the environment
python3 -m venv venv

# Then activate
source venv/bin/activate
```

**Verify activation:**
```bash
# Your prompt should show (venv)
# Check Python location
which python      # macOS/Linux: should point to venv
where.exe python  # Windows: the venv path should be listed first
```

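Activation can also be confirmed from inside Python, which helps when the prompt is customized and does not show `(venv)`. This is a small sketch that relies on `sys.prefix` and `sys.base_prefix`. It assumes an environment created with `venv` (or a recent `virtualenv`); conda environments are not detected this way.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter runs inside a venv-style environment."""
    # In a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


print(f"Virtual environment active: {in_virtual_environment()}")
print(f"Environment prefix: {sys.prefix}")
```
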
### Jupyter Kernel Issues

**Problem:** "Kernel not found" or "Kernel keeps dying"

**Solution:**

```bash
# Reinstall kernel
python -m ipykernel install --user --name=datascience --display-name="Python (Data Science)"

# Or use the default kernel
python -m ipykernel install --user

# Restart Jupyter
jupyter notebook
```

**Problem:** Wrong Python version in Jupyter

**Solution:**
```bash
# Install Jupyter in your virtual environment
source venv/bin/activate  # Activate first
pip install jupyter ipykernel

# Register the kernel
python -m ipykernel install --user --name=venv --display-name="Python (venv)"

# In Jupyter, select Kernel -> Change kernel -> Python (venv)
```

## Package and Dependency Issues

### Import Errors

**Problem:** `ModuleNotFoundError: No module named 'pandas'` (or other packages)

**Solution:**

```bash
# Ensure virtual environment is activated
source venv/bin/activate  # macOS/Linux
venv\Scripts\activate     # Windows

# Install missing package
pip install pandas

# Install all common packages
pip install jupyter pandas numpy matplotlib seaborn scikit-learn

# Verify installation
python -c "import pandas; print(pandas.__version__)"
```

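To check several packages at once, without importing each one by hand, a short script can ask the import system which modules it can locate. The sketch below uses `importlib.util.find_spec` from the standard library. The `PACKAGES` list is an assumption based on the install command above and can be adjusted freely.

```python
import importlib.util

# Import names for the packages installed above. Note that the
# scikit-learn distribution is imported as "sklearn".
PACKAGES = ["pandas", "numpy", "matplotlib", "seaborn", "sklearn"]


def find_missing(import_names):
    """Return the import names that cannot be located in this environment."""
    return [name for name in import_names
            if importlib.util.find_spec(name) is None]


missing = find_missing(PACKAGES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All packages are importable.")
```

Run it from a notebook cell as well as from the terminal: if the two results differ, the notebook kernel is probably using a different environment than the shell.
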
### Pip Installation Failures

**Problem:** `pip install` fails with permission errors

**Solution:**

```bash
# Use --user flag
pip install --user package-name

# Or use virtual environment (recommended)
python -m venv venv
source venv/bin/activate
pip install package-name
```

**Problem:** `pip install` fails with SSL certificate errors

**Solution:**

```bash
# Update pip first
python -m pip install --upgrade pip

# Try installing with trusted host (temporary workaround)
pip install --trusted-host pypi.org --trusted-host files.pythonhosted.org package-name
```

### Package Version Conflicts

**Problem:** Incompatible package versions

**Solution:**

```bash
# Create fresh virtual environment
python -m venv venv-new
source venv-new/bin/activate  # or venv-new\Scripts\activate on Windows

# Install packages with specific versions if needed
pip install pandas==1.3.0
pip install numpy==1.21.0

# Or let pip resolve dependencies
pip install jupyter pandas numpy matplotlib seaborn scikit-learn
```

## Jupyter Notebook Issues

### Jupyter Won't Start

**Problem:** `jupyter notebook` command not found

**Solution:**

```bash
# Install Jupyter
pip install jupyter

# Or use python -m
python -m jupyter notebook

# Add to PATH if needed (macOS/Linux)
export PATH="$HOME/.local/bin:$PATH"
```

### Notebook Won't Load or Save

**Problem:** "Notebook failed to load" or save errors

**Solution:**

1. Check file permissions
   ```bash
   # Make sure you have write permissions
   ls -l notebook.ipynb
   chmod 644 notebook.ipynb  # If needed
   ```

2. Check for file corruption
   ```bash
   # Try opening in text editor to check JSON structure
   # Copy content to new notebook if corrupted
   ```

3. Clear Jupyter cache
   ```bash
   # Not every Jupyter version accepts this flag; if it is rejected,
   # run `jupyter notebook --help` to list the supported options
   jupyter notebook --clear-cache
   ```

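A notebook is a JSON file, so the corruption check in step 2 can be automated. The sketch below parses the file and reports where the JSON breaks. It assumes the nbformat 4 layout, in which cells live under a top-level `cells` key, and `notebook.ipynb` is a placeholder name.

```python
import json
from pathlib import Path


def check_notebook(path):
    """Return (ok, message) describing whether the file parses as a notebook."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, UnicodeDecodeError) as exc:
        return False, f"Could not read file: {exc}"
    except json.JSONDecodeError as exc:
        return False, f"Invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"
    if not isinstance(data, dict) or "cells" not in data:
        return False, "Valid JSON, but no top-level 'cells' list was found"
    return True, f"Looks fine: {len(data['cells'])} cells"


# "notebook.ipynb" is a placeholder file name
ok, message = check_notebook("notebook.ipynb")
print(message)
```

The reported line and column usually point at or just after the damaged region, which is a reasonable place to start when repairing the file in a text editor.
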
### Cell Won't Execute

**Problem:** Cell stuck on "In [*]" or takes forever

**Solution:**

1. **Interrupt the kernel**: Click "Interrupt" button or press `I, I`
2. **Restart kernel**: Kernel menu → Restart
3. **Check for infinite loops** in your code
4. **Clear output**: Cell → All Output → Clear

### Plots Not Displaying

**Problem:** `matplotlib` plots don't show in notebook

**Solution:**

```python
# Add magic command at the top of notebook
%matplotlib inline

import matplotlib.pyplot as plt

# Create plot
plt.plot([1, 2, 3, 4])
plt.show()  # Make sure to call show()
```

**Alternative for interactive plots:**
```python
%matplotlib notebook  # Classic Jupyter Notebook only
# Or
%matplotlib widget    # Requires the ipympl package
```

## Quiz Application Issues

### npm install Fails

**Problem:** Errors during `npm install`

**Solution:**

```bash
# Clear npm cache
npm cache clean --force

# Remove node_modules and package-lock.json
rm -rf node_modules package-lock.json

# Reinstall
npm install

# If still failing, try with legacy peer deps
npm install --legacy-peer-deps
```

### Quiz App Won't Start

**Problem:** `npm run serve` fails

**Solution:**

```bash
# Check Node.js version
node --version  # Should be 12.x or higher

# Reinstall dependencies
cd quiz-app
rm -rf node_modules package-lock.json
npm install

# Try different port
npm run serve -- --port 8081
```

### Port Already in Use

**Problem:** "Port 8080 is already in use"

**Solution:**

```bash
# Find and kill process on port 8080
# macOS/Linux:
lsof -ti:8080 | xargs kill -9

# Windows:
netstat -ano | findstr :8080
taskkill /PID <PID> /F

# Or use a different port
npm run serve -- --port 8081
```

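Before killing a process, it can be useful to confirm that the port really is occupied and to find a free alternative. The following sketch attempts to bind a TCP socket on the loopback address. The helper names `port_is_free` and `first_free_port` are hypothetical, and the result only reflects the moment of the check, since another process may claim the port afterwards.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can bind to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


def first_free_port(start=8080, attempts=20):
    """Return the first free port in [start, start + attempts), or None."""
    for port in range(start, start + attempts):
        if port_is_free(port):
            return port
    return None


print(f"Port 8080 free: {port_is_free(8080)}")
print(f"First free port from 8080: {first_free_port()}")
```

The port it reports can then be passed to `npm run serve -- --port <port>`.
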
### Quiz Not Loading or Blank Page

**Problem:** Quiz app loads but shows blank page

**Solution:**

1. Check browser console for errors (F12)
2. Clear browser cache and cookies
3. Try a different browser
4. Ensure JavaScript is enabled
5. Check for ad blockers interfering

```bash
# Rebuild the app
npm run build
npm run serve
```

## Git and GitHub Issues

### Git Not Recognized

**Problem:** `git: command not found`

**Solution:**

**Windows:**
- Install Git from [git-scm.com](https://git-scm.com/)
- Restart terminal after installation

**macOS:**

> **Note:** If you do not have Homebrew installed, follow the instructions at [https://brew.sh/](https://brew.sh/) to install it first.
```bash
# Install via Homebrew
brew install git

# Or install Xcode Command Line Tools
xcode-select --install
```

**Linux:**
```bash
sudo apt-get install git  # Debian/Ubuntu
sudo dnf install git      # Fedora
```

### Clone Fails

**Problem:** `git clone` fails with authentication errors

**Solution:**

```bash
# Use HTTPS URL
git clone https://github.com/microsoft/Data-Science-For-Beginners.git

# GitHub no longer accepts account passwords for Git over HTTPS,
# so if you are prompted to log in, use a Personal Access Token
# Create token at: https://github.com/settings/tokens
# Use token as password when prompted
```

### Permission Denied (publickey)

**Problem:** SSH key authentication fails

**Solution:**

```bash
# Generate SSH key
ssh-keygen -t ed25519 -C "your_email@example.com"

# Add key to ssh-agent
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_ed25519

# Add public key to GitHub
# Copy key: cat ~/.ssh/id_ed25519.pub
# Add at: https://github.com/settings/keys
```

## Docsify Documentation Issues

### Docsify Command Not Found

**Problem:** `docsify: command not found`

**Solution:**

```bash
# Install globally
npm install -g docsify-cli

# If permission error on macOS/Linux
sudo npm install -g docsify-cli

# Verify installation
docsify --version

# If still not found, add the npm global bin directory to PATH
# Find the npm global prefix
npm config get prefix

# Add its bin directory to PATH (add to ~/.bashrc or ~/.zshrc)
export PATH="$(npm config get prefix)/bin:$PATH"
```

### Documentation Not Loading

**Problem:** Docsify serves but content doesn't load

**Solution:**

```bash
# Ensure you're in the repository root
cd Data-Science-For-Beginners

# Check for index.html
ls index.html

# Serve with specific port
docsify serve --port 3000

# Check browser console for errors (F12)
```

### Images Not Displaying

**Problem:** Images show broken link icon

**Solution:**

1. Check image paths are relative
2. Ensure image files exist in the repository
3. Clear browser cache
4. Verify file extensions match (case-sensitive on some systems)

## Data and File Issues

### File Not Found Errors

**Problem:** `FileNotFoundError` when loading data

**Solution:**

```python
import os

import pandas as pd

# Check current working directory
print(os.getcwd())

# Use absolute path
data_path = os.path.join(os.getcwd(), 'data', 'filename.csv')
df = pd.read_csv(data_path)

# Or use relative path from notebook location
df = pd.read_csv('../data/filename.csv')

# Verify file exists
print(os.path.exists('data/filename.csv'))
```

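Because a notebook's working directory depends on where Jupyter was started, a relative path that works in one lesson can fail in another. One possible approach is to search the current folder and its parents for the data file. The sketch below uses `pathlib`; the helper name `resolve_data_file` and the `data` folder name are assumptions for illustration.

```python
from pathlib import Path


def resolve_data_file(filename, data_dir="data", start=None):
    """Search start and its parents for data_dir/filename; return a Path or None."""
    start = Path(start).resolve() if start is not None else Path.cwd()
    for folder in [start, *start.parents]:
        candidate = folder / data_dir / filename
        if candidate.is_file():
            return candidate
    return None


# "filename.csv" is the placeholder name used above
path = resolve_data_file("filename.csv")
if path is None:
    print(f"filename.csv not found in a 'data' folder at or above {Path.cwd()}")
else:
    print(f"Found: {path}")
```

The returned path can be passed directly to `pd.read_csv`, which accepts `Path` objects.
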
### CSV Reading Errors

**Problem:** Errors reading CSV files

**Solution:**

```python
import pandas as pd

# Try different encodings
df = pd.read_csv('file.csv', encoding='utf-8')
# or
df = pd.read_csv('file.csv', encoding='latin-1')  # same encoding as ISO-8859-1
# or
df = pd.read_csv('file.csv', encoding='cp1252')   # common for files saved on Windows

# Handle missing values
df = pd.read_csv('file.csv', na_values=['NA', 'N/A', ''])

# Specify delimiter if not comma
df = pd.read_csv('file.csv', delimiter=';')
```

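Rather than editing the `encoding` argument by hand, the trial-and-error can be scripted. The sketch below decodes the raw bytes with each candidate in turn and returns the first encoding that succeeds. The candidate list and its order are assumptions. Because `latin-1` accepts any byte sequence, it only serves as a last resort and may produce wrong characters. The whole file is read into memory, so this suits small and medium files.

```python
import tempfile
from pathlib import Path

# Candidate encodings, tried in order (utf-8-sig also reads plain UTF-8)
CANDIDATES = ("utf-8-sig", "cp1252", "latin-1")


def first_working_encoding(path, candidates=CANDIDATES):
    """Return the first candidate encoding that decodes the whole file."""
    raw = Path(path).read_bytes()
    for encoding in candidates:
        try:
            raw.decode(encoding)
        except UnicodeDecodeError:
            continue
        return encoding
    return None


# Demonstration with a temporary file written in cp1252
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.csv"
    sample.write_bytes("name,city\nZoë,Málaga\n".encode("cp1252"))
    print(first_working_encoding(sample))  # prints cp1252
```

The result can be passed on as `pd.read_csv(path, encoding=first_working_encoding(path))`.
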
### Memory Errors with Large Datasets

**Problem:** `MemoryError` when loading large files

**Solution:**

```python
import pandas as pd

# Read in chunks
chunk_size = 10000
chunks = []
for chunk in pd.read_csv('large_file.csv', chunksize=chunk_size):
    # Process chunk (for example, filter rows) so less data is kept
    chunks.append(chunk)
df = pd.concat(chunks)

# Or read specific columns only
df = pd.read_csv('file.csv', usecols=['col1', 'col2'])

# Use more efficient data types
df = pd.read_csv('file.csv', dtype={'column_name': 'int32'})
```

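Collecting every chunk and concatenating them still ends with the full dataset in memory, so chunking helps most when each chunk is reduced before it is kept. The following sketch sums a column per category chunk by chunk. The column names and the in-memory CSV are made up for illustration and stand in for a large file on disk.

```python
import io

import pandas as pd

# Small in-memory CSV standing in for a large file on disk
csv_text = "category,value\n" + "\n".join(
    f"{'ab'[i % 2]},{i}" for i in range(10)
)

totals = {}
# chunksize=4 is tiny for demonstration; real files would use thousands of rows
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=4):
    # Reduce each chunk to a small summary instead of keeping the chunk itself
    for category, subtotal in chunk.groupby("category")["value"].sum().items():
        totals[category] = totals.get(category, 0) + int(subtotal)

print(totals)  # {'a': 20, 'b': 25}
```

Only the running totals persist between iterations, so peak memory depends on the chunk size rather than on the file size.
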
## Performance Issues

### Slow Notebook Performance

**Problem:** Notebooks run very slowly

**Solution:**

1. **Restart kernel and clear output**
   - Kernel → Restart & Clear Output

2. **Close unused notebooks**

3. **Optimize code:**
   ```python
   import numpy as np

   data = np.arange(5)  # example data

   # Use vectorized operations instead of loops
   # Bad:
   result = []
   for x in data:
       result.append(x * 2)

   # Good:
   result = data * 2  # NumPy/Pandas vectorization
   ```

4. **Sample large datasets:**
   ```python
   # Work with sample during development
   df_sample = df.sample(n=1000)  # or df.head(1000)
   ```

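To see whether vectorization pays off for a particular piece of code, both versions can be timed side by side. This is a rough sketch using `timeit` from the standard library. The array size and repeat count are arbitrary choices, and the measured times will vary by machine.

```python
import timeit

import numpy as np

data = np.arange(100_000)


def double_with_loop(values):
    """Double each element one at a time in a Python loop."""
    result = []
    for x in values:
        result.append(x * 2)
    return result


def double_vectorized(values):
    """Double every element in a single NumPy operation."""
    return values * 2


# Both approaches produce the same values
assert np.array_equal(double_with_loop(data), double_vectorized(data))

loop_time = timeit.timeit(lambda: double_with_loop(data), number=5)
vector_time = timeit.timeit(lambda: double_vectorized(data), number=5)
print(f"Loop: {loop_time:.4f} s, vectorized: {vector_time:.4f} s")
```
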
### Browser Crashes

**Problem:** Browser crashes or becomes unresponsive

**Solution:**

1. Close unused tabs
2. Clear browser cache
3. Review the browser's memory and performance settings (Chrome: `chrome://settings/system`)
4. Use JupyterLab instead:
   ```bash
   pip install jupyterlab
   jupyter lab
   ```

## Getting Additional Help

### Before Asking for Help

1. Check this troubleshooting guide
2. Search [GitHub Issues](https://github.com/microsoft/Data-Science-For-Beginners/issues)
3. Review [INSTALLATION.md](INSTALLATION.md) and [USAGE.md](USAGE.md)
4. Try searching the error message online

### How to Ask for Help

When creating an issue or asking for help, include:

1. **Operating System**: Windows, macOS, or Linux (which distribution)
2. **Python Version**: Run `python --version`
3. **Error Message**: Copy the complete error message
4. **Steps to Reproduce**: What you did before the error occurred
5. **What You've Tried**: Solutions you've already attempted

**Example:**
```
**Operating System:** macOS 12.0
**Python Version:** 3.9.7
**Error Message:** ModuleNotFoundError: No module named 'pandas'
**Steps to Reproduce:**
1. Activated virtual environment
2. Started Jupyter notebook
3. Tried to import pandas

**What I've Tried:**
- Ran pip install pandas
- Restarted Jupyter
```

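The operating system and version details for such a report can be gathered with a short script instead of being typed by hand. The sketch below uses only the standard library (Python 3.8 or newer, for `importlib.metadata`). The `PACKAGES` list is an assumption and should be adjusted to the packages involved in the error.

```python
import platform
import sys
from importlib import metadata

# Distribution names to report; adjust to the packages involved in the error
PACKAGES = ["pandas", "numpy", "matplotlib", "jupyter"]


def package_version(name):
    """Return the installed version of a distribution, or 'not installed'."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def environment_report(packages=PACKAGES):
    """Build a short plain-text summary suitable for pasting into an issue."""
    lines = [
        f"Operating System: {platform.system()} {platform.release()}",
        f"Python Version: {platform.python_version()}",
        f"Python Executable: {sys.executable}",
    ]
    lines += [f"{name}: {package_version(name)}" for name in packages]
    return "\n".join(lines)


print(environment_report())
```

Running it from the same environment (or notebook kernel) that produced the error keeps the reported versions accurate.
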
### Community Resources

- **GitHub Issues**: [Create an issue](https://github.com/microsoft/Data-Science-For-Beginners/issues/new)
- **Discord**: [Join our community](https://aka.ms/ds4beginners/discord)
- **Discussions**: [GitHub Discussions](https://github.com/microsoft/Data-Science-For-Beginners/discussions)
- **Microsoft Learn**: [Q&A Forums](https://docs.microsoft.com/answers/)

### Related Documentation

- [INSTALLATION.md](INSTALLATION.md) - Setup instructions
- [USAGE.md](USAGE.md) - How to use the curriculum
- [CONTRIBUTING.md](CONTRIBUTING.md) - How to contribute
- [README.md](README.md) - Project overview