| name | working-with-spreadsheets |
| description | Creates and edits Excel spreadsheets with formulas, formatting, and financial modeling standards. Use when working with .xlsx files, financial models, data analysis, or formula-heavy spreadsheets. Covers formula recalculation, color coding standards, and common pitfalls. |
Working with Spreadsheets
Quick Start
from openpyxl import Workbook
wb = Workbook()
sheet = wb.active
sheet['A1'] = 'Revenue'
sheet['B1'] = 1000
sheet['B2'] = '=B1*1.1' # Use formulas, not hardcoded values!
wb.save('output.xlsx')
Critical Rule: Use Formulas, Not Hardcoded Values
Always use Excel formulas instead of calculating in Python.
# WRONG - Hardcoding calculated values
total = df['Sales'].sum()
sheet['B10'] = total # Hardcodes 5000
# CORRECT - Using Excel formulas
sheet['B10'] = '=SUM(B2:B9)'
Financial Model Color Coding Standards
| Color | RGB | Usage |
|---|---|---|
| Blue text | 0,0,255 | Hardcoded inputs, scenario values |
| Black text | 0,0,0 | ALL formulas and calculations |
| Green text | 0,128,0 | Links from other worksheets |
| Red text | 255,0,0 | External links to other files |
| Yellow background | 255,255,0 | Key assumptions needing attention |
from openpyxl.styles import Font
# Input cell (user changeable)
sheet['B5'].font = Font(color='0000FF') # Blue
# Formula cell
sheet['C5'] = '=B5*1.1'
sheet['C5'].font = Font(color='000000') # Black
# Cross-sheet link
sheet['D5'] = "=Sheet2!A1"
sheet['D5'].font = Font(color='008000') # Green
Number Formatting Standards
# Currency with thousands separator
sheet['B5'].number_format = '$#,##0'
# Zeros display as dash
sheet['B5'].number_format = '$#,##0;($#,##0);-'
# Percentages with one decimal
sheet['C5'].number_format = '0.0%'
# Valuation multiples
sheet['D5'].number_format = '0.0x'
# Years as text (not 2,024)
sheet['A1'] = '2024' # String, not number
Library Selection
| Task | Library | Example |
|---|---|---|
| Data analysis | pandas | df = pd.read_excel('file.xlsx') |
| Formulas & formatting | openpyxl | sheet['A1'] = '=SUM(B:B)' |
| Large files (read) | openpyxl | load_workbook('file.xlsx', read_only=True) |
| Large files (write) | openpyxl | Workbook(write_only=True) |
Reading Excel Files
import pandas as pd
from openpyxl import load_workbook
# pandas - data analysis
df = pd.read_excel('file.xlsx')
all_sheets = pd.read_excel('file.xlsx', sheet_name=None) # Dict of DataFrames
# openpyxl - preserve formulas
wb = load_workbook('file.xlsx')
sheet = wb.active
print(sheet['A1'].value) # Returns formula string
# openpyxl - get calculated values (WARNING: loses formulas on save!)
wb = load_workbook('file.xlsx', data_only=True)
Creating Excel Files
from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill, Alignment
wb = Workbook()
sheet = wb.active
sheet.title = 'Model'
# Headers
sheet['A1'] = 'Metric'
sheet['B1'] = '2024'
sheet['A1'].font = Font(bold=True)
# Data with formulas
sheet['A2'] = 'Revenue'
sheet['B2'] = 1000000
sheet['B2'].font = Font(color='0000FF') # Blue = input
sheet['A3'] = 'Growth'
sheet['B3'] = '=B2*0.1'
sheet['B3'].font = Font(color='000000') # Black = formula
# Formatting
sheet['B2'].number_format = '$#,##0'
sheet.column_dimensions['A'].width = 20
wb.save('model.xlsx')
Editing Existing Files
from openpyxl import load_workbook
wb = load_workbook('existing.xlsx')
sheet = wb['Data'] # Or wb.active
# Modify cells
sheet['A1'] = 'Updated Value'
sheet.insert_rows(2)
sheet.delete_cols(3)
# Add new sheet
new_sheet = wb.create_sheet('Analysis')
new_sheet['A1'] = '=Data!B5' # Cross-sheet reference
wb.save('modified.xlsx')
Formula Recalculation
openpyxl writes formulas but doesn't calculate values. Use LibreOffice to recalculate:
# Recalculate and check for errors
python recalc.py output.xlsx
The script returns JSON:
{
"status": "success", // or "errors_found"
"total_errors": 0,
"total_formulas": 42,
"error_summary": {
"#REF!": {"count": 2, "locations": ["Sheet1!B5", "Sheet1!C10"]}
}
}
Formula Verification Checklist
Before Building
- Test 2-3 sample references first
- Confirm column mapping (column 64 = BL, not BK)
- Remember: DataFrame row 5 = Excel row 6 (1-indexed)
Common Pitfalls
- Check for NaN with
pd.notna()before using values - FY data often in columns 50+ (far right)
- Search ALL occurrences, not just first match
- Check denominators before division (#DIV/0!)
- Verify cross-sheet references use correct format (
Sheet1!A1)
After Building
- Run
recalc.pyand fix any errors - Verify #REF!, #DIV/0!, #VALUE!, #NAME? = 0
Common Errors
| Error | Cause | Fix |
|---|---|---|
| #REF! | Invalid cell reference | Check deleted rows/columns |
| #DIV/0! | Division by zero | Add IF check: =IF(B5=0,0,A5/B5) |
| #VALUE! | Wrong data type | Check cell contains expected type |
| #NAME? | Unknown function | Check spelling, quotes around text |
Verification
Run: python scripts/verify.py
Related Skills
building-nextjs-apps- Frontend for spreadsheet uploadsscaffolding-fastapi-dapr- API for spreadsheet processing