| name | code-quality-tools |
| description | Automated code quality tools for scientific Python using ruff, mypy, and pre-commit hooks |
Code Quality Tools for Scientific Python
Master the essential code quality tools that keep scientific Python projects maintainable, consistent, and error-free. Learn how to configure ruff for lightning-fast linting and formatting, mypy for static type checking, and pre-commit hooks for automated quality gates. These tools help catch bugs early, enforce consistent style across teams, and make code reviews focus on logic rather than formatting.
Key Tools:
- Ruff: Ultra-fast Python linter and formatter (replaces flake8, black, isort, and more)
- MyPy: Static type checker for Python
- Pre-commit: Git hook framework for automated checks
Quick Reference Card
Installation & Setup
# Using pixi (recommended for scientific projects)
pixi add --feature dev ruff mypy pre-commit
# Using pip
pip install ruff mypy pre-commit
# Initialize pre-commit
pre-commit install
Essential Ruff Commands
# Check code (linting)
ruff check .
# Fix auto-fixable issues
ruff check --fix .
# Format code
ruff format .
# Check and format together
ruff check --fix . && ruff format .
Essential MyPy Commands
# Type check entire project
mypy src/
# Type check with strict mode
mypy --strict src/
# Type check specific file
mypy src/mymodule/analysis.py
# Generate type coverage report
mypy --html-report mypy-report src/
Essential Pre-commit Commands
# Run all hooks on all files
pre-commit run --all-files
# Run hooks on staged files only
pre-commit run
# Update hook versions
pre-commit autoupdate
# Skip hooks temporarily (not recommended)
git commit --no-verify
Quick Decision Tree
Need to enforce code style and catch common errors?
YES → Use Ruff (linting + formatting)
NO → Skip to type checking
Want to catch type-related bugs before runtime?
YES → Add MyPy
NO → Ruff alone is sufficient
Need to ensure checks run automatically?
YES → Set up pre-commit hooks
NO → Run tools manually (not recommended for teams)
Working with legacy code without type hints?
YES → Start with Ruff only, add MyPy gradually
NO → Use both Ruff and MyPy from the start
When to Use This Skill
Use this skill when you need to establish or improve code quality practices in scientific Python projects:
- Starting a new scientific Python project and want to establish code quality standards from day one
- Maintaining existing research code that needs consistency and error prevention
- Collaborating with multiple contributors who need automated style enforcement
- Preparing code for publication or package distribution
- Catching bugs early through static type checking before runtime
- Automating code reviews to focus on logic rather than style
- Integrating with CI/CD for automated quality checks
- Migrating from older tools like black, flake8, or isort to modern alternatives
Core Concepts
1. Ruff: The All-in-One Linter and Formatter
Ruff is a blazingly fast Python linter and formatter written in Rust that replaces multiple tools you might be using today.
What Ruff Replaces:
- flake8 (linting)
- black (formatting)
- isort (import sorting)
- pyupgrade (syntax modernization)
- pydocstyle (docstring linting)
- And 50+ other tools
Why Ruff for Scientific Python:
Ruff is 10-100x faster than traditional tools, which matters when you have large codebases with thousands of lines of numerical code. Instead of managing multiple configuration files and tool versions, you get a single tool that handles everything. Ruff can auto-fix most issues automatically, saving time during development. It includes NumPy-aware docstring checking, understanding the conventions used throughout the scientific Python ecosystem. Best of all, it's compatible with existing black and flake8 configurations, making migration straightforward.
Example:
# Before ruff format
import sys
import os
import numpy as np
def calculate_mean(data):
return np.mean(data)
# After ruff format
import os
import sys
import numpy as np
def calculate_mean(data):
return np.mean(data)
Ruff automatically organizes imports (standard library, third party, local) and applies consistent formatting.
2. MyPy: Static Type Checking
MyPy analyzes type hints to catch errors before your code ever runs. This is especially valuable in scientific computing where dimension mismatches and type errors can lead to subtle bugs in numerical calculations.
Example of what MyPy catches:
import numpy as np
from numpy.typing import NDArray
def calculate_mean(data: NDArray[np.float64]) -> float:
"""Calculate mean of array."""
return float(np.mean(data))
# MyPy catches this error at type-check time:
result: int = calculate_mean(np.array([1.0, 2.0, 3.0]))
# Error: Incompatible types (expression has type "float", variable has type "int")
Benefits for Scientific Code:
Type hints catch dimension mismatches in array operations before you run expensive computations. They validate function signatures, ensuring you pass the right types to numerical functions. Type hints serve as documentation, making it clear what types functions expect and return. They prevent None-related bugs that can crash long-running simulations. Modern IDEs use type hints to provide better autocomplete and inline documentation.
3. Pre-commit: Automated Quality Gates
Pre-commit runs checks automatically before each commit, ensuring code quality standards are maintained without manual intervention.
How it works:
# .pre-commit-config.yaml
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.6.0
hooks:
- id: ruff
args: [--fix]
- id: ruff-format
- repo: https://github.com/pre-commit/mirrors-mypy
rev: v1.11.0
hooks:
- id: mypy
Workflow:
- Developer runs
git commit - Pre-commit automatically runs ruff, mypy, and other checks
- If checks fail, commit is blocked
- Developer fixes issues and commits again
- Once all checks pass, commit succeeds
This ensures that code quality issues are caught immediately, before they enter the codebase.
Decision Trees
Choosing Between Ruff and Legacy Tools
Already using black + flake8 + isort?
YES → Migrate to Ruff (single tool, much faster)
Ruff is compatible with black formatting
NO → Start with Ruff directly
Need custom linting rules?
YES → Check if Ruff supports them (700+ rules available)
Supported → Use Ruff
Not supported → Consider pylint as supplement
NO → Ruff covers 99% of use cases
Performance matters (large codebase)?
Always → Ruff is 10-100x faster
MyPy Strictness Levels
Starting a new project?
YES → Use --strict mode from day one
NO → Adding types to existing code?
Start with basic mypy (no flags)
Add --check-untyped-defs
Add --disallow-untyped-defs
Eventually reach --strict
Scientific library with complex NumPy types?
YES → Install numpy type stubs: pip install types-numpy
NO → Standard mypy is sufficient
Patterns and Examples
Pattern 1: Basic Ruff Configuration
Configure ruff in pyproject.toml for your scientific Python project:
[tool.ruff]
# Target Python 3.10+
target-version = "py310"
# Line length (match black default)
line-length = 88
# Exclude common directories
exclude = [
".git",
".mypy_cache",
".ruff_cache",
".venv",
"build",
"dist",
]
[tool.ruff.lint]
# Enable rule sets
select = [
"E", # pycodestyle errors
"W", # pycodestyle warnings
"F", # pyflakes
"I", # isort (import sorting)
"N", # pep8-naming
"UP", # pyupgrade
"B", # flake8-bugbear
"C4", # flake8-comprehensions
"NPY", # NumPy-specific rules
"PD", # pandas-vet
]
# Ignore specific rules
ignore = [
"E501", # Line too long (handled by formatter)
]
# Allow autofix for all enabled rules
fixable = ["ALL"]
[tool.ruff.lint.per-file-ignores]
# Ignore imports in __init__.py
"__init__.py" = ["F401"]
# Allow print statements in scripts
"scripts/*.py" = ["T201"]
[tool.ruff.format]
# Use double quotes
quote-style = "double"
# Indent with spaces
indent-style = "space"
Usage:
# Check and fix
ruff check --fix .
# Format
ruff format .
Pattern 2: MyPy Configuration for Scientific Python
Configure mypy in pyproject.toml with appropriate strictness for scientific code:
[tool.mypy]
# Python version
python_version = "3.10"
# Strictness options (start lenient, increase gradually)
check_untyped_defs = true
disallow_untyped_defs = false # Set to true when ready
warn_return_any = true
warn_unused_configs = true
warn_redundant_casts = true
# Output options
show_error_codes = true
pretty = true
# Ignore missing imports for packages without type stubs
[[tool.mypy.overrides]]
module = [
"scipy.*",
"matplotlib.*",
]
ignore_missing_imports = true
Install type stubs for common libraries:
pixi add --feature dev types-requests types-PyYAML
# NumPy and pandas have built-in type hints (Python 3.9+)
Example typed scientific function:
import numpy as np
from typing import Optional
from numpy.typing import NDArray
def normalize_data(
data: NDArray[np.float64],
method: str = "zscore",
axis: Optional[int] = None
) -> NDArray[np.float64]:
"""
Normalize numerical data.
Parameters
----------
data : NDArray[np.float64]
Input data array.
method : str, default "zscore"
Normalization method: "zscore" or "minmax".
axis : int, optional
Axis along which to normalize.
Returns
-------
NDArray[np.float64]
Normalized data.
Raises
------
ValueError
If method is not recognized.
"""
if method == "zscore":
mean = np.mean(data, axis=axis, keepdims=True)
std = np.std(data, axis=axis, keepdims=True)
return (data - mean) / std
elif method == "minmax":
min_val = np.min(data, axis=axis, keepdims=True)
max_val = np.max(data, axis=axis, keepdims=True)
return (data - min_val) / (max_val - min_val)
else:
raise ValueError(f"Unknown method: {method}")
Pattern 3: Pre-commit Configuration
Set up pre-commit hooks to automatically enforce quality standards:
# .pre-commit-config.yaml
# See https://pre-commit.com for more information
repos:
# Ruff linter and formatter
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.6.0
hooks:
# Run the linter
- id: ruff
args: [--fix]
# Run the formatter
- id: ruff-format
# MyPy type checking
- repo: https://github.com/pre-commit/mirrors-mypy
rev: v1.11.0
hooks:
- id: mypy
additional_dependencies:
- types-requests
- types-PyYAML
args: [--ignore-missing-imports]
# General pre-commit hooks
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.6.0
hooks:
- id: trailing-whitespace
- id: end-of-file-fixer
- id: check-yaml
- id: check-toml
- id: check-added-large-files
args: [--maxkb=1000]
- id: check-merge-conflict
# Jupyter notebook cleaning
- repo: https://github.com/kynan/nbstripout
rev: 0.7.1
hooks:
- id: nbstripout
Setup:
# Install pre-commit
pixi add --feature dev pre-commit
# Install git hooks
pre-commit install
# Run on all files (first time)
pre-commit run --all-files
Pattern 4: Pixi Integration
Integrate quality tools with pixi for reproducible development environments:
[project]
name = "my-science-project"
version = "0.1.0"
dependencies = [
"numpy>=1.24",
"pandas>=2.0",
]
[tool.pixi.project]
channels = ["conda-forge"]
platforms = ["linux-64", "osx-64", "osx-arm64", "win-64"]
[tool.pixi.dependencies]
python = ">=3.10"
numpy = ">=1.24"
pandas = ">=2.0"
[tool.pixi.feature.dev.dependencies]
ruff = ">=0.6.0"
mypy = ">=1.11"
pre-commit = ">=3.5"
pytest = ">=7.0"
[tool.pixi.feature.dev.tasks]
# Linting and formatting
lint = "ruff check ."
format = "ruff format ."
check = { depends-on = ["lint", "format"] }
# Type checking
typecheck = "mypy src/"
# Run all quality checks
quality = { depends-on = ["check", "typecheck"] }
# Testing
test = "pytest tests/"
# Full validation (run before committing)
validate = { depends-on = ["quality", "test"] }
Usage:
# Run quality checks
pixi run quality
# Run full validation
pixi run validate
# Format code
pixi run format
# Type check
pixi run typecheck
Pattern 5: CI/CD Integration (GitHub Actions)
Ensure quality checks run automatically in continuous integration:
# .github/workflows/quality.yml
name: Code Quality
on:
push:
branches: [main]
pull_request:
branches: [main]
jobs:
quality:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.11"
- name: Install dependencies
run: |
pip install ruff mypy pytest
pip install -e .
- name: Run Ruff
run: |
ruff check .
ruff format --check .
- name: Run MyPy
run: mypy src/
- name: Run tests
run: pytest tests/
Pattern 6: Gradual Type Hint Adoption
Add type hints incrementally to existing scientific code:
Step 1: Start with function signatures
import numpy as np
from numpy.typing import NDArray
def calculate_statistics(data: NDArray) -> dict:
"""Calculate basic statistics."""
return {
"mean": np.mean(data),
"std": np.std(data),
"min": np.min(data),
"max": np.max(data),
}
Step 2: Add return type details
from typing import TypedDict
import numpy as np
from numpy.typing import NDArray
class Statistics(TypedDict):
mean: float
std: float
min: float
max: float
def calculate_statistics(data: NDArray) -> Statistics:
"""Calculate basic statistics."""
return {
"mean": float(np.mean(data)),
"std": float(np.std(data)),
"min": float(np.min(data)),
"max": float(np.max(data)),
}
Step 3: Add internal variable types (optional)
from typing import TypedDict
import numpy as np
from numpy.typing import NDArray
class Statistics(TypedDict):
mean: float
std: float
min: float
max: float
def calculate_statistics(data: NDArray) -> Statistics:
"""Calculate basic statistics."""
mean_val: float = float(np.mean(data))
std_val: float = float(np.std(data))
min_val: float = float(np.min(data))
max_val: float = float(np.max(data))
return {
"mean": mean_val,
"std": std_val,
"min": min_val,
"max": max_val,
}
Pattern 7: NumPy Array Type Hints
Use numpy.typing for proper array annotations in scientific code:
import numpy as np
from numpy.typing import NDArray
# Generic array
def process_array(data: NDArray) -> NDArray:
"""Process numerical array."""
return data * 2
# Specific dtype
def process_float_array(data: NDArray[np.float64]) -> NDArray[np.float64]:
"""Process float64 array."""
return data * 2.0
# Multiple dimensions
Vector = NDArray[np.float64] # 1D array
Matrix = NDArray[np.float64] # 2D array
def matrix_multiply(a: Matrix, b: Matrix) -> Matrix:
"""Multiply two matrices."""
return np.matmul(a, b)
# More specific shape hints
def normalize_vector(v: NDArray[np.float64]) -> NDArray[np.float64]:
"""
Normalize a vector to unit length.
Parameters
----------
v : NDArray[np.float64]
Input vector of shape (n,).
Returns
-------
NDArray[np.float64]
Normalized vector of shape (n,).
"""
norm = np.linalg.norm(v)
if norm == 0:
return v
return v / norm
Pattern 8: Handling Optional and Union Types
Properly type functions with optional parameters and multiple accepted types:
import numpy as np
from typing import Optional, Union
from pathlib import Path
from numpy.typing import NDArray
def load_data(
filepath: Union[str, Path],
delimiter: str = ",",
skip_rows: Optional[int] = None
) -> NDArray:
"""
Load data from file.
Parameters
----------
filepath : str or Path
Path to data file.
delimiter : str, default ","
Column delimiter.
skip_rows : int, optional
Number of rows to skip at start.
Returns
-------
NDArray
Loaded data array.
"""
# Convert to Path if string
path = Path(filepath) if isinstance(filepath, str) else filepath
# Load with optional skip_rows
kwargs = {"delimiter": delimiter}
if skip_rows is not None:
kwargs["skiprows"] = skip_rows
return np.loadtxt(path, **kwargs)
Pattern 9: Ruff Rule Selection for Scientific Python
Configure ruff with rules appropriate for scientific computing:
[tool.ruff.lint]
select = [
# Essential
"E", # pycodestyle errors
"F", # pyflakes
"I", # isort
# Code quality
"B", # flake8-bugbear (common bugs)
"C4", # flake8-comprehensions
"UP", # pyupgrade (modern syntax)
# Scientific Python specific
"NPY", # NumPy-specific rules
"PD", # pandas-vet
# Documentation
"D", # pydocstyle (docstrings)
# Type hints
"ANN", # flake8-annotations
]
# Customize docstring rules for NumPy style
[tool.ruff.lint.pydocstyle]
convention = "numpy"
# Common rules to ignore in scientific code
ignore = [
"E501", # Line too long (formatter handles this)
"ANN101", # Missing type annotation for self
"ANN102", # Missing type annotation for cls
"D100", # Missing docstring in public module (optional)
"D104", # Missing docstring in public package (optional)
]
Pattern 10: Fixing Common Ruff Warnings
Learn to fix the most common issues ruff identifies:
F401: Unused import
# Before
import numpy as np
import pandas as pd # Not used
# After
import numpy as np
F841: Unused variable
# Before
def process_data(data):
result = expensive_computation(data)
return data # Oops, should return result
# After
def process_data(data):
result = expensive_computation(data)
return result
E711: Comparison to None
# Before
if value == None:
pass
# After
if value is None:
pass
B006: Mutable default argument
# Before (dangerous!)
def append_data(value, data=[]):
data.append(value)
return data
# After
def append_data(value, data=None):
if data is None:
data = []
data.append(value)
return data
NPY002: Legacy NumPy random
# Before (old style)
import numpy as np
data = np.random.rand(100)
# After (new style, better for reproducibility)
import numpy as np
rng = np.random.default_rng(seed=42)
data = rng.random(100)
Best Practices Checklist
Initial Setup
- Install ruff, mypy, and pre-commit in dev environment
- Create
pyproject.tomlwith tool configurations - Create
.pre-commit-config.yaml - Run
pre-commit installto enable git hooks - Run
pre-commit run --all-filesto check existing code - Add quality check tasks to pixi configuration
Configuration
- Set appropriate Python version target
- Enable NumPy-specific rules (NPY) for scientific code
- Configure NumPy-style docstring checking
- Set up per-file ignores for special cases (init.py, scripts)
- Configure mypy strictness appropriate for project maturity
- Install type stubs for third-party libraries
Workflow Integration
- Add quality checks to CI/CD pipeline
- Document quality standards in CONTRIBUTING.md
- Create pixi tasks for common quality checks
- Set up IDE integration (VS Code, PyCharm)
- Configure editor to run ruff on save
- Add quality check badge to README
Team Practices
- Run
ruff check --fixbefore committing - Run
ruff formatbefore committing - Address mypy errors (don't use
# type: ignorewithout reason) - Review pre-commit failures before using
--no-verify - Keep pre-commit hooks updated (
pre-commit autoupdate) - Add type hints to new functions
- Gradually add types to existing code
Maintenance
- Update ruff regularly (fast-moving project)
- Update pre-commit hook versions monthly
- Review and adjust ignored rules as project matures
- Increase mypy strictness gradually
- Monitor CI/CD for quality check failures
- Refactor code flagged by quality tools
Common Issues and Solutions
Issue 1: Ruff and Black Formatting Conflicts
Problem: Using both ruff format and black causes conflicts.
Solution: Choose one formatter. Ruff format is compatible with black's style:
[tool.ruff.format]
# Use black-compatible formatting
quote-style = "double"
indent-style = "space"
line-ending = "auto"
Remove black from dependencies and pre-commit hooks.
Issue 2: MyPy Can't Find Imports
Problem: error: Cannot find implementation or library stub for module named 'scipy'
Solution: Install type stubs or ignore missing imports:
[[tool.mypy.overrides]]
module = ["scipy.*", "matplotlib.*"]
ignore_missing_imports = true
Or install stubs:
pixi add --feature dev types-requests types-PyYAML
Issue 3: Pre-commit Hooks Too Slow
Problem: Pre-commit takes too long on large codebases.
Solution:
Use ruff instead of multiple tools (much faster). Limit hooks to staged files only (default behavior). Skip expensive checks in pre-commit, run in CI instead by removing mypy from .pre-commit-config.yaml and keeping it in CI workflow.
Issue 4: Too Many Ruff Errors on Legacy Code
Problem: Hundreds of ruff errors on existing codebase.
Solution: Gradual adoption strategy:
# 1. Start with auto-fixable issues only
ruff check --fix .
# 2. Add baseline to ignore existing issues
ruff check --add-noqa .
# 3. Fix new code going forward
# 4. Gradually remove # noqa comments
Issue 5: Type Hints Break at Runtime
Problem: Code with type hints fails with NameError in Python < 3.10.
Solution: Use from __future__ import annotations:
from __future__ import annotations # Must be first import
import numpy as np
def process(data: np.ndarray) -> np.ndarray:
"""This works in Python 3.7+"""
return data * 2
Issue 6: MyPy Errors in Test Files
Problem: MyPy complains about pytest fixtures and dynamic test generation.
Solution: Configure mypy to be lenient with tests:
[[tool.mypy.overrides]]
module = "tests.*"
disallow_untyped_defs = false
Issue 7: Ruff Conflicts with Project Style
Problem: Team prefers single quotes, but ruff uses double quotes.
Solution: Configure ruff to match team preferences:
[tool.ruff.format]
quote-style = "single"
Issue 8: Pre-commit Fails in CI
Problem: Pre-commit hooks pass locally but fail in CI.
Solution: Ensure consistent environments:
# In CI, use same Python version and dependencies
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.11" # Match local version
# Or use pre-commit's CI action
- uses: pre-commit/action@v3.0.0
Integration with Other Tools
VS Code Integration
Install extensions for seamless integration with your editor:
Extensions:
- Ruff (charliermarsh.ruff)
- Mypy Type Checker (ms-python.mypy-type-checker)
Settings (.vscode/settings.json):
{
"[python]": {
"editor.defaultFormatter": "charliermarsh.ruff",
"editor.formatOnSave": true,
"editor.codeActionsOnSave": {
"source.fixAll": "explicit",
"source.organizeImports": "explicit"
}
},
"ruff.lint.args": ["--config=pyproject.toml"],
"mypy-type-checker.args": ["--config-file=pyproject.toml"]
}
PyCharm Integration
Ruff:
- Install Ruff plugin from marketplace
- Configure: Settings → Tools → Ruff
- Enable "Run ruff on save"
MyPy:
- Install Mypy plugin
- Configure: Settings → Tools → Mypy
- Set mypy executable path
Jupyter Notebook Integration
Use nbqa to run quality tools on notebooks:
# Install nbqa
pixi add --feature dev nbqa
# Run ruff on notebooks
nbqa ruff notebooks/
# Run mypy on notebooks
nbqa mypy notebooks/
Pre-commit config for notebooks:
repos:
- repo: https://github.com/nbQA-dev/nbQA
rev: 1.8.5
hooks:
- id: nbqa-ruff
args: [--fix]
- id: nbqa-mypy
pytest Integration
Type checking in tests ensures your test code is also correct:
import numpy as np
from numpy.typing import NDArray
def test_normalize_data():
"""Test data normalization."""
data: NDArray[np.float64] = np.array([1.0, 2.0, 3.0])
result = normalize_data(data)
# MyPy ensures types match
assert isinstance(result, np.ndarray)
assert result.dtype == np.float64
Documentation Integration
Ruff checks docstrings for completeness and correctness:
def calculate_mean(data: np.ndarray) -> float:
"""
Calculate arithmetic mean.
Parameters
----------
data : np.ndarray
Input data array.
Returns
-------
float
Mean value.
Examples
--------
>>> calculate_mean(np.array([1, 2, 3]))
2.0
"""
return float(np.mean(data))
Ruff validates docstring presence, NumPy-style formatting, parameter documentation matches signature, and return type documentation.
Real-World Examples
Example 1: Complete Scientific Python Project Setup
Set up a new project with all quality tools configured:
Project structure:
my-science-project/
├── src/
│ └── my_project/
│ ├── __init__.py
│ ├── analysis.py
│ └── visualization.py
├── tests/
│ └── test_analysis.py
├── pyproject.toml
├── .pre-commit-config.yaml
└── README.md
pyproject.toml:
[project]
name = "my-science-project"
version = "0.1.0"
requires-python = ">=3.10"
dependencies = [
"numpy>=1.24",
"pandas>=2.0",
"matplotlib>=3.7",
]
[tool.pixi.project]
channels = ["conda-forge"]
platforms = ["linux-64", "osx-64", "osx-arm64"]
[tool.pixi.dependencies]
python = "3.11.*"
numpy = ">=1.24"
pandas = ">=2.0"
matplotlib = ">=3.7"
[tool.pixi.feature.dev.dependencies]
ruff = ">=0.6.0"
mypy = ">=1.11"
pre-commit = ">=3.5"
pytest = ">=7.0"
pytest-cov = ">=4.0"
[tool.pixi.feature.dev.tasks]
lint = "ruff check ."
format = "ruff format ."
typecheck = "mypy src/"
test = "pytest tests/ --cov=src/"
quality = { depends-on = ["lint", "format", "typecheck"] }
all = { depends-on = ["quality", "test"] }
[tool.ruff]
target-version = "py310"
line-length = 88
[tool.ruff.lint]
select = ["E", "F", "I", "N", "UP", "B", "C4", "NPY", "D"]
ignore = ["E501", "D100", "D104"]
[tool.ruff.lint.pydocstyle]
convention = "numpy"
[tool.mypy]
python_version = "3.10"
check_untyped_defs = true
warn_return_any = true
warn_unused_configs = true
show_error_codes = true
Usage:
# Setup
pixi install
pre-commit install
# Development workflow
pixi run format # Format code
pixi run lint # Check for issues
pixi run typecheck # Type check
pixi run test # Run tests
pixi run all # Run everything
# Before committing (automatic via pre-commit)
git commit -m "Add new analysis function"
Example 2: Adding Types to Existing Scientific Code
Transform untyped code into well-typed, documented code:
Before (no types):
import numpy as np
def calculate_correlation(x, y, method="pearson"):
"""Calculate correlation between two arrays."""
if method == "pearson":
return np.corrcoef(x, y)[0, 1]
elif method == "spearman":
from scipy.stats import spearmanr
return spearmanr(x, y)[0]
else:
raise ValueError(f"Unknown method: {method}")
After (with types):
import numpy as np
from numpy.typing import NDArray
from typing import Literal
CorrelationMethod = Literal["pearson", "spearman"]
def calculate_correlation(
x: NDArray[np.float64],
y: NDArray[np.float64],
method: CorrelationMethod = "pearson"
) -> float:
"""
Calculate correlation between two arrays.
Parameters
----------
x : NDArray[np.float64]
First data array.
y : NDArray[np.float64]
Second data array.
method : {"pearson", "spearman"}, default "pearson"
Correlation method to use.
Returns
-------
float
Correlation coefficient.
Raises
------
ValueError
If method is not recognized.
Examples
--------
>>> x = np.array([1.0, 2.0, 3.0])
>>> y = np.array([2.0, 4.0, 6.0])
>>> calculate_correlation(x, y)
1.0
"""
if method == "pearson":
corr_matrix: NDArray[np.float64] = np.corrcoef(x, y)
return float(corr_matrix[0, 1])
elif method == "spearman":
from scipy.stats import spearmanr
result = spearmanr(x, y)
return float(result.statistic)
else:
raise ValueError(f"Unknown method: {method}")
Benefits:
MyPy catches invalid method names at type-check time. IDE provides autocomplete for method parameter. Clear documentation of expected types. Runtime errors caught before execution.
Example 3: Pre-commit Hook Workflow
See how pre-commit catches issues before they enter the codebase:
Scenario: Developer commits code with issues
$ git add src/analysis.py
$ git commit -m "Add new analysis function"
# Pre-commit runs automatically
ruff....................................................................Failed
hook id: ruff
exit code: 1
src/analysis.py:15:1: F401 [*] `numpy` imported but unused
src/analysis.py:23:5: E711 Comparison to `None` should be `cond is None`
Found 2 errors.
mypy....................................................................Failed
hook id: mypy
exit code: 1
src/analysis.py:30: error: Incompatible return value type (got "None", expected "float")
# Fix the issues
$ ruff check --fix src/analysis.py # Auto-fix F401, E711
$ # Manually fix mypy error
# Commit again
$ git commit -m "Add new analysis function"
ruff....................................................................Passed
mypy....................................................................Passed
[feature/new-analysis abc123] Add new analysis function
1 file changed, 25 insertions(+)
Migration Guides
Migrating from Black + Flake8 + isort
Replace multiple tools with ruff for better performance and simpler configuration:
Step 1: Install ruff
pixi add --feature dev ruff
Step 2: Convert configuration
# Old: pyproject.toml
[tool.black]
line-length = 88
[tool.isort]
profile = "black"
# Old: setup.cfg
[flake8]
max-line-length = 88
# New: pyproject.toml
[tool.ruff]
line-length = 88
[tool.ruff.lint]
select = ["E", "F", "I"] # pycodestyle, pyflakes, isort
Step 3: Update pre-commit
# Remove these
# - repo: https://github.com/psf/black
# - repo: https://github.com/pycqa/flake8
# - repo: https://github.com/pycqa/isort
# Add this
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.6.0
hooks:
- id: ruff
args: [--fix]
- id: ruff-format
Step 4: Remove old tools
pixi remove --feature dev black flake8 isort
Migrating from Pylint
Ruff covers most pylint rules with better performance:
[tool.ruff.lint]
select = [
"E", # pycodestyle errors
"W", # pycodestyle warnings
"F", # pyflakes
"C90", # mccabe complexity
"I", # isort
"N", # pep8-naming
"UP", # pyupgrade
"B", # flake8-bugbear
"A", # flake8-builtins
"C4", # flake8-comprehensions
"PL", # pylint rules
]
Keep pylint only if you need specific rules:
# Check what pylint rules you use
pylint --list-msgs
# See if ruff supports them
ruff rule <rule-code>
Resources and References
Official Documentation
- Ruff: https://docs.astral.sh/ruff/
- MyPy: https://mypy.readthedocs.io/
- Pre-commit: https://pre-commit.com/
- NumPy Typing: https://numpy.org/devdocs/reference/typing.html
Ruff Resources
- Rule reference: https://docs.astral.sh/ruff/rules/
- Configuration: https://docs.astral.sh/ruff/configuration/
- Editor integrations: https://docs.astral.sh/ruff/integrations/
- Migration guide: https://docs.astral.sh/ruff/faq/#how-does-ruff-compare-to-flake8
MyPy Resources
- Type hints cheat sheet: https://mypy.readthedocs.io/en/stable/cheat_sheet_py3.html
- Common issues: https://mypy.readthedocs.io/en/stable/common_issues.html
- Running mypy: https://mypy.readthedocs.io/en/stable/running_mypy.html
- Type stubs: https://mypy.readthedocs.io/en/stable/stubs.html
Pre-commit Resources
- Supported hooks: https://pre-commit.com/hooks.html
- Creating hooks: https://pre-commit.com/index.html#creating-new-hooks
- CI integration: https://pre-commit.ci/
Scientific Python Resources
- Scientific Python Development Guide: https://learn.scientific-python.org/development/
- NumPy documentation style: https://numpydoc.readthedocs.io/
- Type hints for scientific code: https://numpy.org/devdocs/reference/typing.html
Community Examples
- Scientific Python Cookie: https://github.com/scientific-python/cookie
- NumPy: https://github.com/numpy/numpy (see pyproject.toml)
- SciPy: https://github.com/scipy/scipy (see pyproject.toml)
- Pandas: https://github.com/pandas-dev/pandas (see pyproject.toml)
Quick Start Template
Copy-paste starter configuration for immediate use:
# pyproject.toml
[tool.ruff]
target-version = "py310"
line-length = 88
[tool.ruff.lint]
select = ["E", "F", "I", "B", "NPY"]
ignore = ["E501"]
[tool.ruff.lint.pydocstyle]
convention = "numpy"
[tool.mypy]
python_version = "3.10"
check_untyped_defs = true
warn_return_any = true
show_error_codes = true
# .pre-commit-config.yaml
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.6.0
hooks:
- id: ruff
args: [--fix]
- id: ruff-format
- repo: https://github.com/pre-commit/mirrors-mypy
rev: v1.11.0
hooks:
- id: mypy
args: [--ignore-missing-imports]
# Setup commands
pixi add --feature dev ruff mypy pre-commit
pre-commit install
pre-commit run --all-files
Summary
Code quality tools are essential for maintaining scientific Python projects. Ruff provides fast, comprehensive linting and formatting. MyPy catches type errors before runtime. Pre-commit automates quality checks in your workflow.
Key takeaways:
Start with ruff for immediate impact as it replaces multiple tools with a single fast solution. Add mypy gradually as you add type hints to catch bugs early. Use pre-commit to enforce standards automatically without manual intervention. Integrate with pixi for reproducible development environments. Configure tools in pyproject.toml for centralized management. Run quality checks in CI/CD to maintain standards across the team.
Next steps:
Set up ruff and pre-commit in your project today. Add type hints to new functions you write. Gradually increase mypy strictness as your codebase matures. Share configurations with your team for consistency. Integrate quality checks into your development workflow.
Quality tools save time by catching errors early and maintaining consistency across your scientific codebase. They make code reviews more productive by automating style discussions, allowing reviewers to focus on scientific correctness and algorithmic choices rather than formatting details.