You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
Data-Science-For-Beginners/INSTALLATION.md

248 lines
5.8 KiB

# Installation Guide
This guide will help you set up your environment to work with the Data Science for Beginners curriculum.
## Table of Contents
- [Prerequisites](#prerequisites)
- [Quick Start Options](#quick-start-options)
- [Local Installation](#local-installation)
- [Verify Your Installation](#verify-your-installation)
## Prerequisites
Before you begin, you should have:
- Basic familiarity with command line/terminal
- A GitHub account (free)
- Stable internet connection for initial setup
## Quick Start Options
### Option 1: GitHub Codespaces (Recommended for Beginners)
The easiest way to get started is with GitHub Codespaces, which provides a complete development environment in your browser.
1. Navigate to the [repository](https://github.com/microsoft/Data-Science-For-Beginners)
2. Click the **Code** dropdown menu
3. Select the **Codespaces** tab
4. Click **Create codespace on main**
5. Wait for the environment to initialize (2-3 minutes)
Your environment is now ready with all dependencies pre-installed!
### Option 2: Local Development
For working on your own computer, follow the detailed instructions below.
## Local Installation
### Step 1: Install Git
Git is required to clone the repository and track your changes.
**Windows:**
- Download from [git-scm.com](https://git-scm.com/download/win)
- Run the installer with default settings
**macOS:**
- Install via Homebrew: `brew install git`
- Or download from [git-scm.com](https://git-scm.com/download/mac)
**Linux:**
```bash
# Debian/Ubuntu
sudo apt-get update
sudo apt-get install git
# Fedora
sudo dnf install git
# Arch
sudo pacman -S git
```
### Step 2: Clone the Repository
```bash
# Clone the repository
git clone https://github.com/microsoft/Data-Science-For-Beginners.git
# Navigate to the directory
cd Data-Science-For-Beginners
```
### Step 3: Install Python and Jupyter
Python 3.7 or higher is required for the data science lessons.
**Windows:**
1. Download Python from [python.org](https://www.python.org/downloads/)
2. During installation, check "Add Python to PATH"
3. Verify installation:
```bash
python --version
```
**macOS:**
```bash
# Using Homebrew
brew install python3
# Verify installation
python3 --version
```
**Linux:**
```bash
# Most Linux distributions come with Python pre-installed
python3 --version
# If not installed:
# Debian/Ubuntu
sudo apt-get install python3 python3-pip
# Fedora
sudo dnf install python3 python3-pip
```
### Step 4: Set Up Python Environment
It's recommended to use a virtual environment to keep dependencies isolated.
```bash
# Create a virtual environment
python -m venv venv
# Activate the virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate
```
### Step 5: Install Python Packages
Install the required data science libraries:
```bash
pip install jupyter pandas numpy matplotlib seaborn scikit-learn
```
### Step 6: Install Node.js and npm (For Quiz App)
The quiz application requires Node.js and npm.
**Windows/macOS:**
- Download from [nodejs.org](https://nodejs.org/) (LTS version recommended)
- Run the installer
**Linux:**
```bash
# Debian/Ubuntu
# WARNING: Piping scripts from the internet directly into bash can be a security risk.
# It is recommended to review the script before running it:
# curl -fsSL https://deb.nodesource.com/setup_lts.x -o setup_lts.x
# less setup_lts.x
# Then run:
# sudo -E bash setup_lts.x
#
# Alternatively, you can use the one-liner below at your own risk:
curl -fsSL https://deb.nodesource.com/setup_lts.x | sudo -E bash -
sudo apt-get install -y nodejs
# Fedora
sudo dnf install nodejs
# Verify installation
node --version
npm --version
```
### Step 7: Install Quiz App Dependencies
```bash
# Navigate to quiz app directory
cd quiz-app
# Install dependencies
npm install
# Return to root directory
cd ..
```
### Step 8: Install Docsify (Optional)
For offline access to documentation:
```bash
npm install -g docsify-cli
```
## Verify Your Installation
### Test Python and Jupyter
```bash
# Activate your virtual environment if not already activated
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate
# Start Jupyter Notebook
jupyter notebook
```
Your browser should open with the Jupyter interface. You can now navigate to any lesson's `.ipynb` file.
### Test Quiz Application
```bash
# Navigate to quiz app
cd quiz-app
# Start development server
npm run serve
```
The quiz app should be available at `http://localhost:8080` (or another port if 8080 is busy).
### Test Documentation Server
```bash
# From the root directory of the repository
docsify serve
```
The documentation should be available at `http://localhost:3000`.
## Using VS Code Dev Containers
If you have Docker installed, you can use VS Code Dev Containers:
1. Install [Docker Desktop](https://www.docker.com/products/docker-desktop)
2. Install [Visual Studio Code](https://code.visualstudio.com/)
3. Install the [Remote - Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers)
4. Open the repository in VS Code
5. Press `F1` and select "Remote-Containers: Reopen in Container"
6. Wait for the container to build (first time only)
## Next Steps
- Explore the [README.md](README.md) for an overview of the curriculum
- Read [USAGE.md](USAGE.md) for common workflows and examples
- Check [TROUBLESHOOTING.md](TROUBLESHOOTING.md) if you encounter issues
- Review [CONTRIBUTING.md](CONTRIBUTING.md) if you want to contribute
## Getting Help
If you encounter issues:
1. Check the [TROUBLESHOOTING.md](TROUBLESHOOTING.md) guide
2. Search existing [GitHub Issues](https://github.com/microsoft/Data-Science-For-Beginners/issues)
3. Join our [Discord community](https://aka.ms/ds4beginners/discord)
4. Create a new issue with detailed information about your problem