| name | project-guidelines |
| description | Comprehensive Python project development guidelines using uv and Jupyter notebooks. Use when creating modules, notebooks, tests, or working with Python code, data analysis, visualization, or package management. |
Project Guidelines - Python/uv/Jupyter
Purpose
Establish consistency and best practices for Python development using uv (fast package manager) and Jupyter notebooks for data analysis and experimentation.
When to Use This Skill
Automatically activates when working on:
- Creating or modifying Python modules in
src/ - Working with Jupyter notebooks in
notebooks/ - Writing tests with pytest
- Managing dependencies with uv
- Data analysis and visualization tasks
Quick Start
New Feature Checklist
- Plan: Define requirements and approach
- Module: Create Python module in
src/with type hints - Notebook: Create exploratory notebook in
notebooks/(if needed) - Tests: Add pytest tests in
tests/ - Document: Add docstrings and update README
- Type Check: Run
uv run mypy src/ - Lint: Run
uv run ruff check . - Format: Run
uv run ruff format . - Test: Run
uv run pytest - Commit: Use conventional commit message
Project Overview
Project Name: jupyter-analysis-template
Description: A Python project template for data analysis and exploration using Jupyter notebooks, managed with uv.
Primary Language: Python 3.12+
Tech Stack:
- Runtime: Python 3.12+ (managed via uv)
- Package Manager: uv (fast, modern Python package manager)
- Notebooks: Jupyter Lab
- Testing: pytest
- Linting: ruff
- Type Checking: mypy
- Formatting: ruff format
Key Features:
- Fast dependency management with uv
- Interactive data exploration with Jupyter
- Type-safe Python code with mypy
- Modern linting and formatting with ruff
Architecture Overview
Directory Structure
jupyter-analysis-template/
├── src/
│ └── analysis/ # Python modules for reusable code
│ ├── __init__.py
│ ├── data.py # Data loading utilities
│ ├── viz.py # Visualization helpers
│ └── utils.py # General utilities
├── notebooks/ # Jupyter notebooks for exploration
│ ├── 01_data_exploration.ipynb
│ └── 02_analysis.ipynb
├── tests/ # pytest tests
│ ├── __init__.py
│ ├── test_data.py
│ └── test_viz.py
├── .claude/ # Claude Code infrastructure
├── pyproject.toml # Project metadata and dependencies (uv)
├── .python-version # Python version for uv
├── uv.lock # Lockfile (auto-generated)
├── .gitignore
└── README.md
Key Principles:
- Separation: Notebooks for exploration,
src/for reusable code - Type Safety: Use type hints everywhere
- Testing: Test reusable code in
src/, not notebooks - Clean Notebooks: Keep notebooks focused and well-documented
See structure.md for detailed organization.
Coding Standards
Python Style
PEP 8 with ruff:
- 4 spaces for indentation
- 88 character line length (Black-compatible)
- Use ruff for linting and formatting
Naming Conventions
# Modules and packages: lowercase with underscores
# data_loader.py, visualization_utils.py
# Classes: PascalCase
class DataProcessor:
pass
class VisualizationHelper:
pass
# Functions and variables: snake_case
def load_dataset(file_path: str) -> pd.DataFrame:
pass
user_count = 42
is_valid = True
# Constants: UPPER_SNAKE_CASE
MAX_RETRIES = 3
DEFAULT_TIMEOUT = 30
API_BASE_URL = "https://api.example.com"
# Private: leading underscore
def _internal_helper():
pass
_cache = {}
Type Hints
Always use type hints:
from typing import Optional, List, Dict, Any
import pandas as pd
def process_data(
df: pd.DataFrame,
columns: List[str],
threshold: float = 0.5
) -> pd.DataFrame:
"""Process dataframe with specified columns."""
...
def fetch_user(user_id: int) -> Optional[Dict[str, Any]]:
"""Fetch user by ID, returns None if not found."""
...
Docstrings
Use Google-style docstrings:
def calculate_statistics(data: List[float]) -> Dict[str, float]:
"""Calculate basic statistics for a dataset.
Args:
data: List of numeric values to analyze.
Returns:
Dictionary containing mean, median, and std deviation.
Raises:
ValueError: If data list is empty.
Example:
>>> calculate_statistics([1, 2, 3, 4, 5])
{'mean': 3.0, 'median': 3.0, 'std': 1.41}
"""
if not data:
raise ValueError("Data list cannot be empty")
...
Error Handling
# Explicit error types
try:
result = process_data(df)
except FileNotFoundError as e:
logger.error(f"Data file not found: {e}")
raise
except pd.errors.EmptyDataError:
logger.warning("Empty dataset, using defaults")
result = pd.DataFrame()
# Context in errors
if value < 0:
raise ValueError(f"Expected positive value, got {value}")
# Use logging, not print
import logging
logger = logging.getLogger(__name__)
logger.info("Processing started")
Jupyter Notebooks Best Practices
Notebook Organization
# Notebook Title
**Author**: Your Name
**Created**: 2025-11-15
**Purpose**: Brief description of analysis
## Setup
Import libraries and load data
## Data Exploration
Initial data inspection and visualization
## Analysis
Main analytical work
## Conclusions
Key findings and next steps
Code in Notebooks
# ✅ Good: Import reusable code from src/
from analysis.data import load_dataset
from analysis.viz import plot_distribution
df = load_dataset("data.csv")
plot_distribution(df['column'])
# ❌ Bad: Defining complex functions in notebooks
# (Move to src/ modules instead)
Notebook Hygiene
- Clear outputs before committing:
jupyter nbconvert --clear-output --inplace notebook.ipynb - Keep cells focused: One task per cell
- Add markdown: Explain your thinking
- Restart and run all: Before finalizing, restart kernel and run all cells
- Extract reusable code: Move functions to
src/when you reuse them
Dependency Management with uv
Why uv?
- Fast: 10-100x faster than pip
- Reliable: Deterministic resolution like Poetry
- Simple: Uses
pyproject.toml, no new concepts - Compatible: Works with existing Python ecosystem
Common Commands
# Create new project
uv init
# Add dependency
uv add pandas jupyter
# Add dev dependency
uv add --dev pytest mypy ruff
# Install all dependencies
uv sync
# Run command in venv
uv run jupyter lab
uv run pytest
uv run mypy src/
# Update dependencies
uv lock --upgrade
# Remove dependency
uv remove pandas
pyproject.toml Structure
[project]
name = "jupyter-analysis-template"
version = "0.1.0"
description = "Python data analysis template"
requires-python = ">=3.12"
dependencies = [
"pandas>=2.0.0",
"jupyter>=1.0.0",
"matplotlib>=3.7.0",
"seaborn>=0.12.0",
]
[project.optional-dependencies]
dev = [
"pytest>=7.0.0",
"mypy>=1.0.0",
"ruff>=0.1.0",
"ipykernel>=6.25.0",
]
[tool.ruff]
line-length = 88
target-version = "py312"
[tool.mypy]
python_version = "3.12"
strict = true
Testing Strategy
Testing Pyramid for Data Projects
/\
/E2E\ ← Few: Full pipeline tests
/------\
/Integr.\ ← Some: Module interaction tests
/----------\
/Unit Tests \ ← Many: Function-level tests
/--------------\
Test Coverage Goals
- Unit Tests: 80%+ coverage for
src/modules - Integration Tests: Cover data pipelines
- Notebook Tests: Not typically tested (exploratory)
Writing Tests with pytest
# tests/test_data.py
import pytest
import pandas as pd
from analysis.data import load_dataset, clean_data
def test_load_dataset_success():
"""Test successful dataset loading."""
df = load_dataset("test_data.csv")
assert isinstance(df, pd.DataFrame)
assert not df.empty
def test_load_dataset_file_not_found():
"""Test handling of missing file."""
with pytest.raises(FileNotFoundError):
load_dataset("nonexistent.csv")
def test_clean_data_removes_nulls():
"""Test that clean_data removes null values."""
df = pd.DataFrame({'a': [1, None, 3], 'b': [4, 5, None]})
cleaned = clean_data(df)
assert cleaned.isnull().sum().sum() == 0
@pytest.fixture
def sample_dataframe():
"""Fixture providing sample data for tests."""
return pd.DataFrame({
'value': [1, 2, 3, 4, 5],
'category': ['A', 'B', 'A', 'B', 'A']
})
def test_analysis_with_fixture(sample_dataframe):
"""Test analysis using fixture data."""
result = analyze(sample_dataframe)
assert result['mean'] == 3.0
Running Tests
# Run all tests
uv run pytest
# Run with coverage
uv run pytest --cov=src --cov-report=html
# Run specific test file
uv run pytest tests/test_data.py
# Run specific test
uv run pytest tests/test_data.py::test_load_dataset_success
# Run in verbose mode
uv run pytest -v
See testing.md for comprehensive guide.
Code Quality Tools
Linting with ruff
# Check for issues
uv run ruff check .
# Auto-fix issues
uv run ruff check --fix .
# Format code
uv run ruff format .
# Check specific file
uv run ruff check src/analysis/data.py
Type Checking with mypy
# Check all source code
uv run mypy src/
# Check specific file
uv run mypy src/analysis/data.py
# Strict mode (recommended)
uv run mypy --strict src/
Pre-commit Workflow
# Before committing
uv run ruff format . # Format code
uv run ruff check --fix . # Fix linting issues
uv run mypy src/ # Type check
uv run pytest # Run tests
Git Workflow
Branching
- Main Branch:
main(stable code) - Feature Branches:
feature/data-pipeline,feature/visualization
Commit Messages
Conventional Commits:
feat(data): add CSV loading with validation
fix(viz): correct color scale in heatmap
docs(readme): update installation instructions
test(data): add tests for data cleaning
refactor(utils): simplify date parsing logic
.gitignore for Python
# Python
__pycache__/
*.py[cod]
*$py.class
.Python
*.so
# Virtual environments
.venv/
venv/
env/
# Jupyter
.ipynb_checkpoints/
*.ipynb (optionally clear outputs instead)
# uv
uv.lock (commit this!)
# IDEs
.vscode/
.idea/
# Data (usually)
data/*.csv
data/*.parquet
!data/sample.csv # Keep small examples
# Outputs
outputs/
figures/
*.png (or commit them)
# OS
.DS_Store
Development Workflows
Local Development
# 1. Setup (first time)
uv sync # Install dependencies
uv run pre-commit install # Setup git hooks (optional)
# 2. Daily workflow
uv run jupyter lab # Start Jupyter
# Work in notebooks...
# Extract reusable code to src/
# Write tests for new functions
uv run pytest # Run tests
uv run mypy src/ # Type check
uv run ruff check . # Lint
# 3. Commit
git add .
git commit -m "feat(analysis): add customer segmentation"
Adding New Analysis
- Create notebook:
notebooks/03_new_analysis.ipynb - Explore data interactively
- Extract reusable functions to
src/analysis/ - Add type hints and docstrings
- Write tests in
tests/ - Run quality checks
- Commit notebook + code + tests
Progressive Disclosure Resources
For detailed guidance on specific topics, see:
- structure.md: Python project organization patterns
- workflows.md: CI/CD for Python projects
- testing.md: pytest strategies and patterns
- docs.md: Python documentation with Sphinx
- release.md: Versioning and PyPI publishing
- security.md: Python-specific security practices
Common Patterns
Data Loading Pattern
# src/analysis/data.py
from pathlib import Path
import pandas as pd
from typing import Optional
def load_dataset(
file_path: str | Path,
*,
nrows: Optional[int] = None
) -> pd.DataFrame:
"""Load dataset from CSV with validation.
Args:
file_path: Path to CSV file
nrows: Optional limit on rows to load
Returns:
Loaded and validated DataFrame
Raises:
FileNotFoundError: If file doesn't exist
ValueError: If data validation fails
"""
path = Path(file_path)
if not path.exists():
raise FileNotFoundError(f"File not found: {path}")
df = pd.read_csv(path, nrows=nrows)
# Validate
if df.empty:
raise ValueError("Dataset is empty")
return df
Visualization Helper Pattern
# src/analysis/viz.py
import matplotlib.pyplot as plt
import seaborn as sns
from typing import Optional
def plot_distribution(
data: pd.Series,
title: Optional[str] = None,
**kwargs
) -> plt.Figure:
"""Plot distribution with KDE overlay.
Args:
data: Series to plot
title: Optional plot title
**kwargs: Additional arguments for seaborn
Returns:
Matplotlib figure
"""
fig, ax = plt.subplots(figsize=(10, 6))
sns.histplot(data, kde=True, ax=ax, **kwargs)
if title:
ax.set_title(title)
return fig
Troubleshooting
Common Issues
Issue: uv: command not found
- Solution: Install uv:
curl -LsSf https://astral.sh/uv/install.sh | sh
Issue: Jupyter kernel not found
- Solution:
uv run python -m ipykernel install --user --name=venv
Issue: Import errors in notebooks
- Solution: Ensure project is in path or use:
uv run jupyter lab
Issue: Type errors in tests
- Solution: Install dev dependencies:
uv sync --all-extras
References
- uv Documentation: https://docs.astral.sh/uv/
- Jupyter: https://jupyter.org/
- ruff: https://docs.astral.sh/ruff/
- mypy: https://mypy.readthedocs.io/
- pytest: https://docs.pytest.org/
Last Updated: 2025-11-15
Version: 1.0.0