Co-authored-by: BethanyJep <44121227+BethanyJep@users.noreply.github.com>copilot/fix-fc133524-4bd1-478f-96ca-5db4b0edf20c
parent
bc7cd7128b
commit
db1da61e30
@ -0,0 +1,358 @@
# AGENTS.md

## Project Overview

Data Science for Beginners is a comprehensive 10-week, 20-lesson curriculum created by Microsoft Azure Cloud Advocates. The repository is a learning resource that teaches foundational data science concepts through project-based lessons, including Jupyter notebooks, interactive quizzes, and hands-on assignments.

**Key Technologies:**
- **Jupyter Notebooks**: Primary learning medium using Python 3
- **Python Libraries**: pandas, numpy, matplotlib for data analysis and visualization
- **Vue.js 2**: Quiz application (quiz-app folder)
- **Docsify**: Documentation site generator for offline access
- **Node.js/npm**: Package management for JavaScript components
- **Markdown**: All lesson content and documentation

**Architecture:**
- Multi-language educational repository with extensive translations
- Structured into lesson modules (1-Introduction through 6-Data-Science-In-Wild)
- Each lesson includes README, notebooks, assignments, and quizzes
- Standalone Vue.js quiz application for pre/post-lesson assessments
- GitHub Codespaces and VS Code dev containers support

## Setup Commands

### Repository Setup
```bash
# Clone the repository (if not already cloned)
git clone https://github.com/microsoft/Data-Science-For-Beginners.git
cd Data-Science-For-Beginners
```

### Python Environment Setup
```bash
# Create a virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install common data science libraries (no requirements.txt exists)
pip install jupyter pandas numpy matplotlib seaborn scikit-learn
```
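
Because no requirements.txt exists, it can help to confirm that the packages from the install command are importable in the active environment. The following is a minimal sketch and not part of the repository: the helper name `find_missing` is hypothetical, scikit-learn is checked under its import name `sklearn`, and `jupyter_core` is used as a stand-in for the `jupyter` metapackage on the assumption that it is pulled in as a dependency.

```python
import importlib.util

# Import names for the packages installed above.
# Assumptions: scikit-learn is imported as "sklearn"; "jupyter_core" is a
# dependency installed alongside the "jupyter" metapackage.
REQUIRED_MODULES = ["jupyter_core", "pandas", "numpy", "matplotlib", "seaborn", "sklearn"]


def find_missing(modules):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules are importable.")
```

Run it with the virtual environment activated; any name it reports can be installed with `pip install`.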

### Quiz Application Setup
```bash
# Navigate to quiz app
cd quiz-app

# Install dependencies
npm install

# Start development server
npm run serve

# Build for production
npm run build

# Lint and fix files
npm run lint
```

### Docsify Documentation Server
```bash
# Install Docsify globally
npm install -g docsify-cli

# Serve documentation locally
docsify serve

# Documentation will be available at localhost:3000
```

### Visualization Projects Setup
For visualization projects like meaningful-visualizations (lesson 13):
```bash
# Navigate to starter or solution folder
cd 3-Data-Visualization/13-meaningful-visualizations/starter

# Install dependencies
npm install

# Start development server
npm run serve

# Build for production
npm run build

# Lint files
npm run lint
```

## Development Workflow

### Working with Jupyter Notebooks
1. Start Jupyter in the repository root: `jupyter notebook`
2. Navigate to the desired lesson folder
3. Open `.ipynb` files to work through exercises
4. Notebooks are self-contained with explanations and code cells
5. Most notebooks use pandas, numpy, and matplotlib - ensure these are installed

### Lesson Structure
Each lesson typically contains:
- `README.md` - Main lesson content with theory and examples
- `notebook.ipynb` - Hands-on Jupyter notebook exercises
- `assignment.ipynb` or `assignment.md` - Practice assignments
- `solution/` folder - Solution notebooks and code
- `images/` folder - Supporting visual materials
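
The typical layout above can be checked for a given lesson folder with a few lines of Python. This is an illustrative sketch only: `describe_lesson` is a hypothetical helper, and since lessons only *typically* follow this layout, a missing entry is not necessarily an error.

```python
from pathlib import Path


def describe_lesson(lesson_dir):
    """Report which of the typical lesson files and folders exist in lesson_dir."""
    lesson = Path(lesson_dir)
    return {
        "README.md": (lesson / "README.md").is_file(),
        "notebook.ipynb": (lesson / "notebook.ipynb").is_file(),
        # An assignment may be a notebook or a Markdown file.
        "assignment": (lesson / "assignment.ipynb").is_file()
        or (lesson / "assignment.md").is_file(),
        "solution/": (lesson / "solution").is_dir(),
        "images/": (lesson / "images").is_dir(),
    }


if __name__ == "__main__":
    # Example lesson path taken from the visualization setup section.
    report = describe_lesson("3-Data-Visualization/13-meaningful-visualizations")
    for name, present in report.items():
        print(f"{name}: {'found' if present else 'not found'}")
```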

### Quiz Application Development
- Vue.js 2 application with hot-reload during development
- Quizzes stored in `quiz-app/src/assets/translations/`
- Each language has its own translation folder (en, fr, es, etc.)
- Quiz numbering starts at 0 and goes up to 39 (40 quizzes total)
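
The numbering works out to two quizzes for each of the 20 lessons (one pre-lesson, one post-lesson). The sketch below shows the arithmetic under an assumed mapping in which lesson `n` uses quiz `2(n-1)` before and `2(n-1)+1` after; that ordering is an assumption made for illustration and should be confirmed against the quiz data before relying on it. The function name is hypothetical.

```python
LESSON_COUNT = 20        # from the project overview
QUIZZES_PER_LESSON = 2   # one pre-lesson and one post-lesson quiz


def quiz_ids_for_lesson(lesson_number):
    """Return (pre_quiz_id, post_quiz_id) for a 1-based lesson number.

    ASSUMPTION: quizzes are numbered consecutively in lesson order,
    with the pre-lesson quiz first. Verify against the quiz-app data.
    """
    if not 1 <= lesson_number <= LESSON_COUNT:
        raise ValueError(f"lesson_number must be between 1 and {LESSON_COUNT}")
    pre_quiz_id = (lesson_number - 1) * QUIZZES_PER_LESSON
    return pre_quiz_id, pre_quiz_id + 1


if __name__ == "__main__":
    print(quiz_ids_for_lesson(1))
    print(quiz_ids_for_lesson(LESSON_COUNT))
```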

### Adding Translations
- Translations go in the `translations/` folder at the repository root
- Each language has a complete lesson structure mirrored from English
- Automated translation runs via GitHub Actions (`.github/workflows/co-op-translator.yml`)

## Testing Instructions

### Quiz Application Testing
```bash
cd quiz-app

# Run lint checks
npm run lint

# Test build process
npm run build

# Manual testing: Start dev server and verify quiz functionality
npm run serve
```

### Notebook Testing
- No automated test framework exists for notebooks
- Manual validation: Run all cells in sequence to ensure no errors
- Verify data files are accessible and outputs are generated correctly
- Check that visualizations render properly
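
As a lightweight complement to running every cell by hand, a saved notebook can be scanned for code cells whose stored outputs contain an error. This sketch does not execute anything; it only reads the `.ipynb` JSON and assumes the nbformat 4 layout (a top-level `cells` list, with error outputs marked by `"output_type": "error"`). The helper names are hypothetical.

```python
import json
from pathlib import Path


def load_notebook(path):
    """Load a .ipynb file as a plain dictionary."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def find_error_cells(notebook):
    """Return the indexes of code cells whose saved outputs include an error."""
    failed = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # Markdown and raw cells have no outputs.
        outputs = cell.get("outputs", [])
        if any(output.get("output_type") == "error" for output in outputs):
            failed.append(index)
    return failed


if __name__ == "__main__":
    import sys

    for notebook_path in sys.argv[1:]:
        errors = find_error_cells(load_notebook(notebook_path))
        status = f"errors in cells {errors}" if errors else "no saved errors"
        print(f"{notebook_path}: {status}")
```

A notebook that was saved without being run reports no errors, so this check is only meaningful after a fresh "Run All".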

### Documentation Testing
```bash
# Verify Docsify renders correctly
docsify serve

# Check for broken links manually by navigating through content
# Verify all lesson links work in the rendered documentation
```
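
Part of the manual link check can be approximated with a script that looks for relative Markdown links pointing at files that do not exist. The following is a rough sketch under stated assumptions, not a repository tool: `find_broken_relative_links` is a hypothetical helper, it only handles inline `[text](target)` links, it skips external URLs, and it does not decode URL-escaped paths.

```python
import re
from pathlib import Path

# Matches inline Markdown links and images, capturing the target up to the
# first space or closing parenthesis (so optional link titles are ignored).
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+)[^)]*\)")


def find_broken_relative_links(markdown_file):
    """Return relative link targets in markdown_file that do not exist on disk."""
    md_path = Path(markdown_file)
    broken = []
    for target in LINK_PATTERN.findall(md_path.read_text(encoding="utf-8")):
        # Skip external URLs, mail links, and in-page anchors.
        if target.startswith(("http://", "https://", "mailto:", "#")):
            continue
        # Drop any fragment or query string before checking the file system.
        relative = target.split("#", 1)[0].split("?", 1)[0]
        if relative and not (md_path.parent / relative).exists():
            broken.append(target)
    return broken


if __name__ == "__main__":
    for path in Path(".").rglob("README.md"):
        for target in find_broken_relative_links(path):
            print(f"{path}: broken link -> {target}")
```

External links and rendering problems still need the manual pass through the served site.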

### Code Quality Checks
```bash
# Vue.js projects (quiz-app and visualization projects)
cd quiz-app  # or visualization project folder
npm run lint

# Python notebooks - manual verification recommended
# Ensure imports work and cells execute without errors
```

## Code Style Guidelines

### Python (Jupyter Notebooks)
- Follow PEP 8 style guidelines for Python code
- Use clear variable names that explain the data being analyzed
- Include markdown cells with explanations before code cells
- Keep code cells focused on single concepts or operations
- Use pandas for data manipulation, matplotlib for visualization
- Common import pattern:
  ```python
  import pandas as pd
  import numpy as np
  import matplotlib.pyplot as plt
  ```

### JavaScript/Vue.js
- Follow Vue.js 2 style guide and best practices
- ESLint configuration in `quiz-app/package.json`
- Use Vue single-file components (.vue files)
- Maintain component-based architecture
- Run `npm run lint` before committing changes

### Markdown Documentation
- Use a clear heading hierarchy (# ## ### etc.)
- Include code blocks with language specifiers
- Add alt text for images
- Link to related lessons and resources
- Keep line lengths reasonable for readability

### File Organization
- Lesson content in numbered folders (01-defining-data-science, etc.)
- Solutions in dedicated `solution/` subfolders
- Translations mirror English structure in `translations/` folder
- Keep data files in `data/` or lesson-specific folders

## Build and Deployment

### Quiz Application Deployment
```bash
cd quiz-app

# Build production version
npm run build

# Output is in dist/ folder
# Deploy dist/ folder to static hosting (Azure Static Web Apps, Netlify, etc.)
```

### Azure Static Web Apps Deployment
The quiz-app can be deployed to Azure Static Web Apps:
1. Create Azure Static Web App resource
2. Connect to GitHub repository
3. Configure build settings:
   - App location: `quiz-app`
   - Output location: `dist`
4. GitHub Actions workflow will auto-deploy on push

### Documentation Site
```bash
# Build PDF from Docsify (optional)
npm run convert

# Docsify documentation is served directly from markdown files
# No build step required for deployment
# Deploy repository to static hosting with Docsify
```

### GitHub Codespaces
- Repository includes dev container configuration
- Codespaces automatically sets up a Python and Node.js environment
- Open repository in Codespace via GitHub UI
- All dependencies install automatically

## Pull Request Guidelines

### Before Submitting
```bash
# For Vue.js changes in quiz-app
cd quiz-app
npm run lint
npm run build

# Test changes locally
npm run serve
```

### PR Title Format
- Use clear, descriptive titles
- Format: `[Component] Brief description`
- Examples:
  - `[Lesson 7] Fix Python notebook import error`
  - `[Quiz App] Add German translation`
  - `[Docs] Update README with new prerequisites`
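
The title format can be checked mechanically. The sketch below is one possible reading of `[Component] Brief description` as a regular expression; the pattern and the `parse_pr_title` helper are illustrative assumptions, not an enforced repository rule.

```python
import re

# "[Component] Brief description": a bracketed component, one space,
# then a description that starts with a non-space character.
TITLE_PATTERN = re.compile(r"^\[(?P<component>[^\[\]]+)\] (?P<description>\S.*)$")


def parse_pr_title(title):
    """Return (component, description), or None if the title does not match."""
    match = TITLE_PATTERN.match(title)
    if match is None:
        return None
    return match.group("component"), match.group("description")


if __name__ == "__main__":
    for title in ["[Quiz App] Add German translation", "Fix typo"]:
        print(title, "->", parse_pr_title(title))
```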

### Required Checks
- Ensure all code runs without errors
- Verify notebooks execute completely
- Confirm Vue.js apps build successfully
- Check that documentation links work
- Test quiz application if modified
- Verify translations maintain consistent structure

### Contribution Guidelines
- Follow existing code style and patterns
- Add explanatory comments for complex logic
- Update relevant documentation
- Test changes across different lesson modules if applicable
- Review the CONTRIBUTING.md file

## Additional Notes

### Common Libraries Used
- **pandas**: Data manipulation and analysis
- **numpy**: Numerical computing
- **matplotlib**: Data visualization and plotting
- **seaborn**: Statistical data visualization (some lessons)
- **scikit-learn**: Machine learning (advanced lessons)

### Working with Data Files
- Data files are located in the `data/` folder or lesson-specific directories
- Most notebooks expect data files at relative paths
- CSV is the primary data format
- Some lessons use JSON for non-relational data examples
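
Relative paths are resolved against the process's working directory, which for a Jupyter kernel is normally the folder containing the notebook, so a notebook opened from an unexpected location can fail to find its data. The sketch below makes that resolution explicit. It uses the standard-library `csv` module only to stay dependency-free (the lessons themselves use pandas), and `read_csv_rows` is a hypothetical helper.

```python
import csv
from pathlib import Path


def read_csv_rows(csv_path, base_dir=None):
    """Read a CSV file into a list of dicts, resolving csv_path against base_dir.

    When base_dir is None, the current working directory is used, which
    mirrors how a relative path behaves inside a notebook.
    """
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    path = (base / csv_path).resolve()
    if not path.is_file():
        # Show the absolute path so a wrong working directory is easy to spot.
        raise FileNotFoundError(f"Data file not found: {path}")
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

When a `FileNotFoundError` appears, the absolute path in the message shows which directory the relative path was actually resolved from.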

### Multilingual Support
- 40+ language translations via automated GitHub Actions
- Translation workflow in `.github/workflows/co-op-translator.yml`
- Translations in `translations/` folder with language codes
- Quiz translations in `quiz-app/src/assets/translations/`

### Development Environment Options
1. **Local Development**: Install Python, Jupyter, and Node.js locally
2. **GitHub Codespaces**: Cloud-based instant development environment
3. **VS Code Dev Containers**: Local container-based development
4. **Binder**: Launch notebooks in the cloud (if configured)

### Lesson Content Guidelines
- Each lesson can be completed on its own but builds on concepts from earlier lessons
- Pre-lesson quizzes test prior knowledge
- Post-lesson quizzes reinforce learning
- Assignments provide hands-on practice
- Sketchnotes provide visual summaries

### Troubleshooting Common Issues

**Jupyter Kernel Issues:**
```bash
# Ensure correct kernel is installed
python -m ipykernel install --user --name=datascience
```

**npm Install Failures:**
```bash
# Clear npm cache and retry
npm cache clean --force
rm -rf node_modules package-lock.json
npm install
```

**Import Errors in Notebooks:**
- Verify all required libraries are installed
- Check Python version compatibility (Python 3.7+ recommended)
- Ensure virtual environment is activated
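
Two of these checks can be run from inside the notebook itself. The sketch below is a minimal example with hypothetical helper names; the virtual environment test relies on `sys.prefix` differing from `sys.base_prefix`, which holds for environments created with `python -m venv`.

```python
import sys

MIN_VERSION = (3, 7)  # recommended minimum stated above


def python_version_ok(version_info=None, minimum=MIN_VERSION):
    """Return True if the (major, minor) version meets the minimum."""
    info = sys.version_info if version_info is None else version_info
    return tuple(info[:2]) >= minimum


def in_virtual_environment():
    """Return True when the interpreter is running inside a venv."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Version OK:", python_version_ok())
    print("Virtual environment active:", in_virtual_environment())
```

If `sys.executable` points outside the `venv` folder, the notebook kernel is probably using a different interpreter than the one where the libraries were installed.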

**Docsify Not Loading:**
- Verify you're serving from repository root
- Check that `index.html` exists
- Ensure proper network access (port 3000)
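
The last two checks can be scripted. This is a small sketch with hypothetical helper names; it assumes the server is expected on `127.0.0.1` at port 3000, as stated in the setup section, and only tests whether something accepts TCP connections there.

```python
import socket
from pathlib import Path


def has_index_html(root="."):
    """Return True if index.html exists directly under root."""
    return (Path(root) / "index.html").is_file()


def port_is_listening(port=3000, host="127.0.0.1"):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("index.html present:", has_index_html())
    print("Something listening on port 3000:", port_is_listening())
```

Run it from the repository root while `docsify serve` is running in another terminal.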

### Performance Considerations
- Large datasets may take time to load in notebooks
- Visualization rendering can be slow for complex plots
- Vue.js dev server enables hot-reload for quick iteration
- Production builds are optimized and minified

### Security Notes
- No sensitive data or credentials should be committed
- Use environment variables for any API keys in cloud lessons
- Azure-related lessons may require Azure account credentials
- Keep dependencies updated for security patches
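
Reading a key from the environment keeps it out of the notebook and out of version control. The following is a minimal sketch: the variable name `EXAMPLE_API_KEY` and the helper `get_required_secret` are placeholders, not names used by the lessons.

```python
import os


def get_required_secret(name):
    """Return the value of an environment variable, failing clearly if unset."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; "
            "set it in your shell instead of hard-coding the key."
        )
    return value


if __name__ == "__main__":
    # EXAMPLE_API_KEY is a hypothetical variable name used for illustration.
    api_key = get_required_secret("EXAMPLE_API_KEY")
    print("Key loaded; length:", len(api_key))
```

Set the variable in the shell before starting Jupyter so the kernel inherits it.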

## Contributing to Translations
- Automated translations managed via GitHub Actions
- Manual corrections welcomed for translation accuracy
- Follow existing translation folder structure
- Update quiz links to include language parameter: `?loc=fr`
- Test translated lessons for proper rendering
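
Adding the language parameter by hand is error-prone when a link already carries a query string. The sketch below sets `loc` with the standard library; the `with_language` helper and the `example.com` quiz URL are placeholders for illustration, not the real quiz address.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_language(url, language_code):
    """Return url with its loc query parameter set to language_code."""
    parts = urlsplit(url)
    # Keep every existing parameter except a previous loc value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "loc"
    ]
    query.append(("loc", language_code))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Placeholder URL; substitute the actual quiz link from the lesson.
    print(with_language("https://example.com/quiz/5", "fr"))
```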

## Related Resources
- Main curriculum: https://aka.ms/datascience-beginners
- Microsoft Learn: https://docs.microsoft.com/learn/
- Student Hub: https://docs.microsoft.com/learn/student-hub
- Discussion Forum: https://github.com/microsoft/Data-Science-For-Beginners/discussions
- Other Microsoft curricula: ML for Beginners, AI for Beginners, Web Dev for Beginners

## Project Maintenance
- Regular updates to keep content current
- Community contributions welcome
- Issues tracked on GitHub
- PRs reviewed by curriculum maintainers
- Monthly content reviews and updates