---
name: architect
description: Expert guidance for GabeDA v2.1 architecture (34 modules) - implementing models, features, debugging 4-case logic, and maintaining the /src codebase.
version: 2.1.0
---
# GabeDA Architecture Expert

## Purpose
This skill provides expert guidance for the GabeDA v2.1 refactored architecture. It focuses on implementing models, adding features, debugging execution logic, and maintaining architectural principles across the 34-module /src codebase.
**Core Expertise:**
- `/src` architecture (34 modules in 6 packages)
- 4-case logic execution engine
- Feature implementation (filters, attributes, aggregations)
- Dependency resolution and data flow
- External data integration patterns
- Frontend development (React + TypeScript + Vite)
- Testing strategies and validation
## When to Use This Skill

Invoke this skill when:
- Working with the `/src` refactored codebase (v2.1)
- Implementing new aggregation models (daily, weekly, monthly, customer, product)
- Adding filters, attributes, or computed features
- Debugging 4-case logic execution issues
- Configuring external data joins
- Developing frontend features (React + TypeScript + Vite)
- Troubleshooting blank pages or HMR issues
- Understanding data flow and persistence strategies
- Troubleshooting column naming or dependency resolution
- Ensuring architectural principles are maintained
- Creating tests in the `/test` folder

**NOT for:** Business strategy, marketing content, data analysis notebooks (delegate to the business, marketing, and insights skills)
## Quick Start

**Essential Documents:**
- Feature Implementation Guide - PRIMARY GUIDE for implementation
- Documentation Master Index - Central hub for all documentation
- Test Manifest - Complete test catalog (197 tests)
- Documentation Guidelines - Read before creating any docs

**Key References:**
- `references/module_reference.md` - 34-module structure
- `references/4_case_logic.md` - Critical execution engine
- `references/external_data_integration.md` - Column naming rules
## Core Architecture Overview

### Module Structure (v2.1)

34 modules in 6 packages following the Single Responsibility Principle:

```
src/
├── utils/          # Utilities (7 modules) - 88 tests, 92% coverage
├── core/           # Core infrastructure (5 modules)
├── preprocessing/  # Data preparation (5 modules)
├── features/       # Feature management (4 modules)
├── execution/      # Feature computation (5 modules) - includes 4-case logic
└── export/         # Output generation (2 modules)
```

For complete module details, see `references/module_reference.md`.
### Data Flow Pipeline

```
CSV → DataLoader → SchemaProcessor → SyntheticEnricher →
FeatureStore → DependencyResolver → ModelExecutor → ExcelExporter
```

For detailed flow stages, see `references/data_flow_pipeline.md`.
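The early stages of this pipeline can be sketched as a toy example. This is a hypothetical, simplified stand-in using pandas; the real classes live in `/src` and their signatures may differ:

```python
import pandas as pd

# Toy stand-ins for DataLoader -> SchemaProcessor -> SyntheticEnricher;
# the actual implementations in src/ carry richer configuration.
def load(rows: list[dict]) -> pd.DataFrame:
    return pd.DataFrame(rows)

def apply_schema(df: pd.DataFrame) -> pd.DataFrame:
    # Rename source columns to canonical in_* names and cast dtypes
    out = df.rename(columns={"Fecha venta": "in_dt", "Total": "in_price_total"})
    out["in_dt"] = pd.to_datetime(out["in_dt"])
    out["in_price_total"] = out["in_price_total"].astype(float)
    return out

def enrich(df: pd.DataFrame) -> pd.DataFrame:
    # Auto-infer synthetic dt_* columns from in_dt
    out = df.copy()
    out["dt_year"] = out["in_dt"].dt.year
    out["dt_weekofyear"] = out["in_dt"].dt.isocalendar().week.astype(int)
    return out

rows = [{"Fecha venta": "2025-01-06", "Total": "10.5"},
        {"Fecha venta": "2025-01-07", "Total": "4.5"}]
df = enrich(apply_schema(load(rows)))
weekly = df.groupby(["dt_year", "dt_weekofyear"])["in_price_total"].sum()
```

The later stages (FeatureStore through ExcelExporter) then operate on the enriched frame, with a checkpoint saved after every major transformation.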
### Critical: 4-Case Logic

The GroupByProcessor (`src/execution/groupby.py`) implements single-loop execution with 4 cases:
- **Case 1:** Standard filter (reads `data_in` only)
- **Case 2:** Filter using attributes (reads `data_in` + `agg_results`) - KEY INNOVATION
- **Case 3:** Attribute with aggregation
- **Case 4:** Attribute composition (uses only other attributes)

**Case 2 Example:**

```python
def price_above_avg(price_total: float, prod_price_avg: float) -> bool:
    """Filter that uses an attribute as input."""
    return price_total > prod_price_avg
```

For a deep dive, see `references/4_case_logic.md`.
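For contrast with the Case 2 example above, here are hypothetical sketches of Cases 1, 3, and 4 (function names are illustrative, not taken from the codebase):

```python
import pandas as pd

# Case 1 - standard filter: reads raw row data only
def is_high_ticket(price_total: float) -> bool:
    return price_total > 100.0

# Case 3 - attribute with aggregation: reduces a column to one value per group
def prod_price_avg(price_total: pd.Series) -> float:
    return float(price_total.mean())

# Case 4 - attribute composition: built only from other attributes
def price_spread_ratio(prod_price_max: float, prod_price_avg: float) -> float:
    return prod_price_max / prod_price_avg
```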
## Core Workflows

### Workflow 1: Implementing a New Model

When creating daily, weekly, monthly, customer, or product aggregation models:
1. **Read the primary guide** - Feature Implementation Guide
2. **Define features** - Create filter and attribute functions with type hints
3. **Create the features dictionary** - Register all features
4. **Configure the model** - Set `group_by`, `external_data`, `output_cols`
5. **Verify naming** - Check external column prefixes (join keys are NOT prefixed, other columns ARE)
6. **Test execution** - Verify output shapes and values
7. **Create tests** - Add repeatable tests in the `/test` folder

Detailed guide: `assets/examples/implementing_new_model.md`

**Working examples:**
- `02_1_week.ipynb` - Weekly model with external data
- `01_1_1_day.ipynb` - Daily aggregation
- `03_consolidated_all_models.ipynb` - 9-model pipeline
### Workflow 2: Adding a New Feature

When adding filters (row-level) or attributes (aggregated):
1. **Define the function** - Include type hints and a docstring
2. **Determine the type** - Filter (vectorized) or attribute (aggregated)?
3. **Register in the dictionary** - Add to the features dict
4. **Check dependencies** - Ensure the feature is resolvable via DFS
5. **Verify external data** - If used, check column naming
6. **Update the model config** - Add to `output_cols`
7. **Create tests** - Add to the `/test` folder with sample data

Detailed guide: `assets/examples/adding_new_feature.md`

For feature type details, see `references/feature_types.md`.
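A minimal sketch of steps 1-3, assuming the features dict simply maps a feature name to its function (the real registration schema may carry additional metadata):

```python
import pandas as pd

def is_weekend(dt_dayofweek: int) -> bool:
    """Filter (row-level): evaluated per transaction."""
    return dt_dayofweek >= 5

def weekend_sales_sum(in_price_total: pd.Series, is_weekend: pd.Series) -> float:
    """Attribute (aggregated): one value per group; depends on a filter."""
    return float(in_price_total[is_weekend].sum())

# Step 3: register both in the features dictionary
features = {
    "is_weekend": is_weekend,
    "weekend_sales_sum": weekend_sales_sum,
}
```

Note how the attribute's parameter name `is_weekend` matches the filter's registered name; dependency resolution is driven by argument names.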
### Workflow 3: Configuring External Data

When joining external datasets (daily → weekly, customer → product):
1. **Verify the dataset exists** - Check `ctx.list_datasets()`
2. **Configure in the model** - Add an `external_data` section with `source`, `join_on`, `columns`
3. **Remember the naming rules:**
   - Join keys: NOT prefixed (e.g., `dt_date` stays `dt_date`)
   - Regular columns: ARE prefixed (e.g., `price_total_sum` → `daily_attrs_price_total_sum`)
4. **Write feature functions** - Use the correct prefixed names
5. **Test the join** - Verify the merged data has the expected columns

**Critical naming table:**

| Column Type | Original | After Merge | Prefixed? |
|---|---|---|---|
| Join key | `dt_date` | `dt_date` | ❌ NO |
| Regular column | `price_total_sum` | `daily_attrs_price_total_sum` | ✅ YES |

Detailed guide: `assets/examples/configuring_external_data.md`

For complete naming rules, see `references/external_data_integration.md`.
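The naming rules can be demonstrated with plain pandas. This is a simplified stand-in for what the merge step does, not the project's actual implementation:

```python
import pandas as pd

main = pd.DataFrame({"dt_date": ["2025-01-06"], "in_price_total": [10.0]})
daily_attrs = pd.DataFrame({"dt_date": ["2025-01-06"], "price_total_sum": [120.0]})

# Prefix every external column EXCEPT the join keys, then merge on the keys
join_on = ["dt_date"]
prefixed = daily_attrs.rename(
    columns={c: f"daily_attrs_{c}" for c in daily_attrs.columns if c not in join_on}
)
merged = main.merge(prefixed, on=join_on)
print(merged.columns.tolist())
# ['dt_date', 'in_price_total', 'daily_attrs_price_total_sum']
```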
### Workflow 4: Debugging Execution Issues

When encountering errors during model execution:
1. **Check the error message** - "Argument 'X' not found" is the most common
2. **Verify column naming** - Join keys vs regular column prefixes
3. **Validate external data** - Check that the dataset exists and `join_on` matches
4. **Print available columns** - Use `ctx.get_dataset('name').columns.tolist()`
5. **Test incrementally** - Add features one at a time
6. **Check dependencies** - Ensure DFS can resolve the order

**Common error: "Argument not found"**

Causes:
- Wrong prefix on an external column (join key vs regular)
- Missing external data config
- Typo in the column name

Solution:

```python
# 1. Check the input dataset
print(ctx.get_dataset('transactions_filters').columns.tolist())

# 2. Check the external dataset
print(ctx.get_dataset('daily_attrs').columns.tolist())

# 3. Remember: join keys get NO prefix, other columns DO
```

For complete troubleshooting, see `references/troubleshooting.md`.
### Workflow 5: Frontend Development (React/Vite)

**CRITICAL: Always clean the dev environment BEFORE starting new features.**

When working on frontend features (GabeDA Dashboard - React + TypeScript + Vite):

**Step 0: Clean Dev Environment (MANDATORY)**

```bash
# Kill all node processes to avoid port conflicts and HMR corruption
# (double slashes prevent Git Bash path conversion of Windows flags)
cd C:/Projects/play/gabeda_frontend
taskkill //F //IM node.exe

# Clear the Vite cache
rm -rf node_modules/.vite

# Start a fresh dev server
npm run dev
```

**Why This Matters:**
- **Problem:** Multiple Vite HMR instances can run simultaneously on different ports (5173, 5174, 5175...)
- **Symptom:** Blank pages, "module does not provide export" errors, stuck/corrupted state
- **Root Cause:** Old dev servers hold a corrupted module cache, and new changes start on a different port
- **Solution:** Kill ALL node processes before starting work

**Quick Fix Script:** Use `restart-dev.bat` in the frontend folder (note: in a .bat file the flags use single slashes):

```bat
@echo off
taskkill /F /IM node.exe 2>nul
if exist node_modules\.vite rmdir /s /q node_modules\.vite
npm run dev
```
**Development Workflow:**
1. **CLEAN** - Run `restart-dev.bat` or kill node processes manually
2. **BRANCH** - Create a feature branch (`git checkout -b feature/feature-name`)
3. **IMPLEMENT** - Make code changes
4. **BUILD** - Run `npm run build` to check for TypeScript errors
5. **TEST** - Test locally on http://localhost:5173 (verify the correct port!)
6. **E2E** - Use the Playwright skill for automated testing
7. **COMMIT** - Only after local verification passes
8. **DEPLOY** - Merge to main → auto-deploy to Render

**Common Issues:**
- Blank pages on all routes → multiple Vite instances running; kill all node processes
- "Module does not provide export" errors → HMR cache corruption; clear the `.vite` cache
- Wrong port (5174, 5175 instead of 5173) → old servers still running; kill and restart
- Changes not appearing → browser is accessing an old port; hard refresh (Ctrl+Shift+R) or use incognito

**Port Detection:**

```bash
# Check which port Vite started on (look for "Local: http://localhost:XXXX")
npm run dev

# If it is not 5173, there are stuck processes - kill and restart
```

**Best Practices:**
- Always kill node processes before starting new feature work
- Always verify you are accessing the correct port (check the terminal output)
- Always use incognito/private browsing for testing to avoid browser cache issues
- Always build (`npm run build`) before committing to catch TypeScript errors
- Never commit without local testing on the correct port
## Core Principles (DO NOT BREAK)

- ✅ **Single Responsibility** - Each module does ONE thing
- ✅ **Single Input** - Each model gets exactly 1 dataframe
- ✅ **DFS Resolution** - Features are auto-ordered by dependencies
- ✅ **4-Case Logic** - Filters can use attributes as inputs
- ✅ **Immutable Context** - User config never changes during execution
- ✅ **Save Checkpoints** - Save after every major transformation
- ✅ **Type Annotations** - All functions have type hints
- ✅ **Logging** - Every module uses `get_logger(name)`
- ✅ **Testing** - All tests MUST be in the `/test` folder and be repeatable

For detailed principles, see `references/core_principles.md`.
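To illustrate the DFS Resolution principle, here is a generic post-order depth-first sketch (not the project's DependencyResolver, whose API may differ):

```python
def resolve_order(deps: dict[str, list[str]]) -> list[str]:
    """Return feature names in execution order: dependencies before dependents."""
    order: list[str] = []
    seen: set[str] = set()

    def visit(name: str) -> None:
        if name in seen:
            return
        seen.add(name)
        for dep in deps.get(name, []):
            visit(dep)          # post-order: emit dependencies first
        order.append(name)

    for name in deps:
        visit(name)
    return order

deps = {
    "price_above_avg": ["prod_price_avg"],  # Case 2: a filter that needs an attribute
    "prod_price_avg": [],
}
print(resolve_order(deps))  # ['prod_price_avg', 'price_above_avg']
```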
## Testing Requirements

**Current Statistics:**
- Total Tests: 197 (6 integration, 108 unit, 69 validation, 14 notebook)
- Code Coverage: 85% (target: ≥85%)
- Test Manifest: `ai/testing/TEST_MANIFEST.md` ⭐ Living Document

**Test Rules:**
- **Location:** All tests MUST be in the `/test` folder
- **Repeatability:** Tests MUST be idempotent (run multiple times, same result)
- **Cleanup:** Tests MUST delete temp files/folders
- **Independence:** No external state dependencies
- **Naming:** Use `test_{module_name}.py` or `test_{feature_name}.py`
- **Documentation:** ALWAYS append to the Test Manifest
**Running Tests:**

```bash
pytest test/              # All tests
pytest test/unit/         # Unit tests only
pytest test/integration/  # Integration tests only
pytest test/ -v           # With verbose output
```

For complete testing guidelines, see `references/testing_guidelines.md`.
## Configuration Patterns

**Base Config:**

```python
base_cfg = {
    'input_file': 'path/to/data.csv',
    'client': 'project_name',
    'analysis_dt': 'YYYY-MM-DD',
    'data_schema': {
        'in_dt': {'source_column': 'Fecha venta', 'dtype': 'date'},
        'in_product_id': {'source_column': 'SKU', 'dtype': 'str'},
        'in_price_total': {'source_column': 'Total', 'dtype': 'float'}
    }
}
```
**Model Config (With External Data):**

```python
cfg_model = {
    'model_name': 'weekly',
    'group_by': ['dt_year', 'dt_weekofyear'],
    'row_id': 'in_trans_id',
    'output_cols': list(features.keys()),
    'features': features,
    'external_data': {
        'daily_attrs': {
            'source': 'daily_attrs',
            'join_on': ['dt_date'],
            'columns': None  # None = ALL columns, or a list like ['col1', 'col2']
        }
    }
}
```

For complete patterns, see `references/configuration_patterns.md`.
## Additional Resources

### Reference Documentation
- `module_reference.md` - 34-module structure with coverage stats
- `data_flow_pipeline.md` - 7-stage pipeline flow
- `4_case_logic.md` - Critical execution engine ⭐ KEY INNOVATION
- `feature_types.md` - Filters vs attributes
- `dependency_resolution.md` - DFS traversal
- `configuration_patterns.md` - Config templates
- `external_data_integration.md` - Column naming rules
- `synthetic_enrichment.md` - Auto-infer 17 columns
- `testing_guidelines.md` - Test requirements (197 tests)
- `troubleshooting.md` - Common error patterns
- `core_principles.md` - 9 DO NOT BREAK rules

### Implementation Examples
- `implementing_new_model.md` - Step-by-step model creation
- `adding_new_feature.md` - Filter and attribute addition
- `configuring_external_data.md` - External joins
- `adding_aggregation_level.md` - New aggregation levels

### External Documentation
- Feature Implementation Guide - PRIMARY REFERENCE
- Documentation Master Index - All guides
- Module Reference - Technical module docs
- Model Specifications - Tech specs, aggregation architecture
## Integration with Other Skills

### From Business Skill
- **Receive:** User stories, acceptance criteria, priority rankings, business requirements
- **Provide:** Technical feasibility assessments, effort estimates, architecture proposals
- **Example:** Business defines "VIP customer retention" → Architect implements an RFM model

### From Executive Skill
- **Receive:** Feature requirements, quality standards, timeline constraints
- **Provide:** Implementation plans, trade-off analysis, technical specs
- **Example:** Executive prioritizes the Chilean launch → Architect implements CLP currency support

### To Insights Skill
- **Provide:** Available features, data schema, execution capabilities
- **Receive:** Notebook requirements, visualization needs, metric definitions
- **Example:** Architect adds an RFM model → Insights creates a VIP retention notebook

### To Marketing Skill
- **Provide:** Technical capabilities, feature descriptions, performance metrics
- **Receive:** Feature positioning requirements, technical content needs
- **Example:** Architect implements 4-case logic → Marketing positions it as the "KEY INNOVATION"
## Living Documents (Append Only)

When making changes, ALWAYS append to these 9 living documents:

| Document | When to Use |
|---|---|
| CHANGELOG.md | After modifying any .py file |
| ISSUES.md | After fixing bugs or errors |
| PROJECT_STATUS.md | Weekly updates |
| FEATURE_IMPLEMENTATIONS.md | After implementing features |
| TESTING_RESULTS.md | After running tests |
| TEST_MANIFEST.md | When adding/modifying tests ⭐ |
| ARCHITECTURE_DECISIONS.md | When making architectural choices |
| NOTEBOOK_IMPROVEMENTS.md | When improving notebooks |
| FUTURE_ENHANCEMENTS.md | When proposing enhancements |

**Documentation Workflow:**
1. Check whether the change fits one of these 9 living documents
2. If YES → APPEND to that document (do NOT create a new file)
3. If NO → Check the Documentation Guidelines
4. NEVER create documentation files without checking the guidelines first
## Working Directory

**Architect Workspace:** `.claude/skills/architect/`

**Bundled Resources:**
- `references/` - 11 technical reference documents (module structure, 4-case logic, external data, testing, troubleshooting, core principles)
- `assets/examples/` - 4 implementation guides (new model, new feature, external data, aggregation level)

**Technical Documents (Create Here):**
- `/ai/architect/` - Architecture proposals, spike results, design documents
- Use descriptive names: `integration_analysis.md`, `feature_implementation_guide.md`

**Context Folders (Reference as Needed):**
- `/ai/backend/` - Django backend context
- `/ai/frontend/` - React frontend context
- `/ai/specs/` - Technical specifications (context, edge cases, feature store, model specs)
## When Suggesting Changes

Always explain:
- **Why** - How the change maintains architectural integrity
- **Which modules** - Affected components
- **How** - How it fits into the data flow
- **Where** - Data persistence location
- **What testing** - Tests required in the `/test` folder
- **How repeatable** - Test idempotency strategy

For every change:
1. Identify implementation files
2. Create corresponding tests in the `/test` folder
3. Ensure tests are repeatable and self-contained
4. Use sample data from `data/tests/` when needed
5. Document test execution in code comments
6. Append to the Test Manifest when adding tests

Think like an architect: prioritize maintainability, testability, and adherence to established patterns.
## Version History

**v2.1.0 (2025-10-30)**
- Refactored to use the progressive disclosure pattern
- Extracted detailed content to `references/` (11 files) and `assets/examples/` (4 files)
- Converted to imperative form (removed second-person voice)
- Reduced from 576 lines to ~295 lines
- Enhanced with v2.1 utils package details (7 utility modules)
- Added clear workflow sections with examples

**v2.0.0 (2025-10-28)**
- Updated for v2.1 architecture (34 modules, 6 packages)
- Added comprehensive testing guidelines
- Enhanced external data integration documentation

---
**Last Updated:** 2025-10-30
**Architecture Version:** v2.1 (34 modules in 6 packages)
**Test Coverage:** 197 tests, 85% coverage
**Core Innovation:** 4-case logic engine (filters can use attributes as inputs)