---
name: dbt Development
description: Proactive skill for validating dbt models against coding conventions. Auto-activates when creating, reviewing, or refactoring dbt models in staging, integration, or warehouse layers. Validates naming, SQL structure, field conventions, testing coverage, and documentation. Supports project-specific convention overrides and sqlfluff integration.
---
# dbt Development Skill

## Purpose
This skill automatically activates when working with dbt models to ensure adherence to coding conventions and best practices. It provides validation and recommendations for model structure, naming, SQL style, testing, and documentation.
## When This Skill Activates

### User-Triggered Activation
This skill should activate when users:
- Create new dbt models: "Create a staging model for users from Salesforce"
- Review existing models: "Review this dbt model for issues"
- Refactor models: "Refactor this integration model to follow best practices"
- Work with .sql files in models/: Any read/write operations on dbt model files
- Ask about dbt conventions: "What are the naming conventions for warehouse models?"
- Request schema/test files: "Add tests to this model"
**Keywords to watch for:**
- "dbt model", "staging", "integration", "warehouse", "intermediate"
- "refactor", "review", "validate", "check conventions"
- "stg_", "int_", "_dim", "_fct"
- "schema.yml", "tests", "dbt test"
### Self-Triggered Activation (Proactive)
Activate BEFORE creating or modifying dbt SQL when:
- You're about to suggest creating a model from scratch
- You detect .sql files in a models/ directory structure
- User asks to "write SQL" in a dbt project context
- You're reviewing changes in a dbt project
- Working with files that match dbt patterns (stg_, int_, _dim, _fct)
Example internal triggers:
- "I'll create a staging model for..." → Activate skill first
- User shows dbt SQL file → Validate against conventions
- "Let me write this transformation..." in dbt context → Check conventions first
## Instructions

### 0. Load Convention Source
**Priority Order (2-tier system):**

1. **Project-specific conventions** (highest priority)
   - Check for `.dbt-conventions.md` in the project root
   - Check for `dbt_coding_conventions.md` in the project root
   - Check for `docs/dbt_conventions.md` in the project
2. **PKM user conventions** (fallback)
   - Conventions: `/Users/olivierdupois/dev/PKM/4. 🛠️ Craft/Tools/dbt/dbt-conventions.md`
   - Testing: `/Users/olivierdupois/dev/PKM/4. 🛠️ Craft/Tools/dbt/dbt-testing.md`
Note: The skill's supporting files (conventions-reference.md, testing-reference.md, examples/) are embedded reference documentation that guide validation logic, not convention sources.
**Detection:**
- Use `Glob` to search for convention files in the project root
- If found, use `Read` to load project conventions
- If not found, use PKM conventions as fallback
- Note which source is being used in validation output
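The detection flow above can be sketched as a small helper. This is an illustrative Python sketch only, not part of the skill's tooling; `resolve_convention_source` is a hypothetical name, and the candidate filenames come from the priority list above.

```python
from pathlib import Path

# Candidate project convention files, in priority order (from the list above).
PROJECT_CANDIDATES = [
    ".dbt-conventions.md",
    "dbt_coding_conventions.md",
    "docs/dbt_conventions.md",
]

def resolve_convention_source(project_root: str, pkm_fallback: str) -> tuple[str, str]:
    """Return (source_label, path): the first project file found, else the PKM fallback."""
    root = Path(project_root)
    for candidate in PROJECT_CANDIDATES:
        path = root / candidate
        if path.is_file():
            return ("project-specific", str(path))
    # No project override: fall back to the PKM user conventions.
    return ("PKM fallback", pkm_fallback)
```

The returned label is what the validation report later echoes as its "Convention Source".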
### 1. Identify Model Type and Context
When working with a dbt model, determine:
**Model Type:**
- Staging (`stg_`): First transformation layer, selects from sources
- Integration (`int_`): Combines multiple sources, enriches entities
- Intermediate (`int__<object>__<action>`): Subcomponent of integration
- Warehouse - Dimension (`_dim`): Mutable, noun-based entities
- Warehouse - Fact (`_fct`): Immutable, verb-based events
**Context Information:**
- File location in directory structure
- Source system (for staging models)
- Entity/object name
- Related models (refs)
- Expected materialization
**How to identify:**
- Check filename prefix/suffix
- Review directory structure (staging/, integration/, warehouse/)
- Read model content for `ref()` and `source()` calls
- Look at model configuration blocks
### 2. Validate Naming Conventions
Check the following:
**File and Model Naming:**
- ✓ All objects are singular (e.g., `user` not `users`)
- ✓ Staging: `stg_<source>__<object>.sql` (e.g., `stg_salesforce__user.sql`)
- ✓ Integration: `int__<object>.sql` (e.g., `int__user.sql`)
- ✓ Intermediate: `int__<object>__<action>.sql` (e.g., `int__user__unioned.sql`)
  - Actions should be past-tense verbs (unioned, grouped, filtered, etc.)
- ✓ Warehouse dimensions: `<object>_dim.sql` or `<warehouse>_<object>_dim.sql`
  - Core warehouse has no prefix (e.g., `user_dim.sql`)
  - Other warehouses are prefixed (e.g., `finance_revenue_dim.sql`)
- ✓ Warehouse facts: `<object>_fct.sql` or `<warehouse>_<object>_fct.sql`
  - Same prefix rules as dimensions
**Directory Structure:**

```
models/
├── staging/
│   └── <source_name>/
│       ├── stg_<source>.yml
│       └── stg_<source>__<object>.sql
├── integration/
│   ├── intermediate/
│   │   ├── intermediate.yml
│   │   └── int__<object>__<action>.sql
│   ├── int__<object>.sql
│   └── integration.yml
└── warehouse/
    └── <warehouse_name>/
        ├── <warehouse>.yml
        ├── <object>_dim.sql
        └── <object>_fct.sql
```
**Violations to Flag:**
- Plural object names
- Missing or incorrect prefixes/suffixes
- Non-standard directory structure
- Mismatched filename and directory location
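The filename rules above are regular enough to check mechanically. A minimal Python sketch, assuming the patterns listed in this section (`matching_layers` is a hypothetical helper; the singular-noun rule still needs human judgment and is not checked here):

```python
import re

# One pattern per layer, derived from the naming rules above.
LAYER_PATTERNS = {
    "staging": re.compile(r"^stg_[a-z0-9]+__[a-z0-9_]+\.sql$"),
    "integration": re.compile(r"^int__[a-z0-9_]+\.sql$"),
    "intermediate": re.compile(r"^int__[a-z0-9_]+__[a-z0-9_]+\.sql$"),
    "warehouse_dim": re.compile(r"^([a-z0-9]+_)?[a-z0-9_]+_dim\.sql$"),
    "warehouse_fct": re.compile(r"^([a-z0-9]+_)?[a-z0-9_]+_fct\.sql$"),
}

def matching_layers(filename: str) -> list[str]:
    """Return every layer whose naming pattern the filename satisfies.

    An empty list means the filename matches no convention and should be
    flagged; multiple hits are possible (e.g. intermediate names also
    satisfy the integration pattern) and are disambiguated by directory.
    """
    return [layer for layer, pattern in LAYER_PATTERNS.items()
            if pattern.match(filename)]
```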
### 3. Validate SQL Structure
**Required Structure:**

All refs at the top, in CTEs:

```sql
with

s_source_table as (
    select * from {{ ref('source_model') }}
),

s_another_source as (
    select * from {{ ref('another_model') }}
),
```
**CTE Naming:**
- ✓ Prefix with `s_` for CTEs that select from refs/sources
- ✓ Descriptive names for transformation CTEs (e.g., `filtered_events`, `aggregated_metrics`)
- ✓ One logical unit of work per CTE
**Final CTE Pattern:**

```sql
final as (
    select
        -- fields here
    from s_source_table
    -- joins and where clauses
)

select * from final
```
**Configuration Block (if needed):**

```sql
{{
    config(
        materialized = 'table',
        sort = 'id',
        dist = 'id'
    )
}}
```
**Style Requirements:**
- ✓ 4-space indentation (not tabs)
- ✓ Lines no longer than 80 characters
- ✓ Lowercase field and function names
- ✓ Use the `as` keyword for aliases
- ✓ Fields before aggregates/window functions
- ✓ Group by column name, not number
- ✓ Prefer `union all` to `union distinct`
- ✓ Explicit joins (`inner join`, `left join`, never just `join`)
- ✓ If joining 2+ tables, always prefix column names with the table alias
- ✓ No table alias initialisms (use `customer`, not `c`)
- ✓ Comments for confusing CTEs
**Violations to Flag:**
- `ref()` or `source()` calls outside of top CTEs
- Missing final CTE
- Improper indentation or line length
- Uppercase SQL keywords or functions
- Implicit joins or missing join qualifiers
- Hard-to-understand table aliases
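Two of these violations can be surfaced with simple pattern checks. A hedged Python sketch (hypothetical `structure_violations` helper; this is a heuristic, not a SQL parser, so treat its output as candidates for review):

```python
import re

def structure_violations(sql: str) -> list[str]:
    """Flag two structural issues from the checklist above: a missing
    `final` CTE, and a model that does not end in `select * from final`.
    Regex-based heuristic only; a real check would parse the SQL.
    """
    violations = []
    if not re.search(r"\bfinal\s+as\s*\(", sql, re.IGNORECASE):
        violations.append("missing `final` CTE")
    if not re.search(r"select\s+\*\s+from\s+final\s*$", sql.strip(), re.IGNORECASE):
        violations.append("model does not end with `select * from final`")
    return violations
```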
### 4. Validate Field Naming and Ordering
**Field Naming Conventions:**

**Primary Keys:**
- ✓ Named `<object>_pk` (e.g., `user_pk`, `transaction_pk`)
- ✓ Generated using `dbt_utils.surrogate_key()`
- ✓ Never look up PKs in separate queries

**Foreign Keys:**
- ✓ Named `<referenced_object>_fk` (e.g., `user_fk`, `transaction_fk`)
- ✓ Generated using `dbt_utils.surrogate_key()`

**Natural Keys:**
- ✓ Source system identifiers: `<descriptive_name>_natural_key`
- ✓ Example: `salesforce_user_natural_key`, `stripe_customer_natural_key`

**Timestamps:**
- ✓ Named `<event>_ts` (e.g., `created_ts`, `updated_ts`, `order_placed_ts`)
- ✓ Always in UTC timezone
- ✓ If a different timezone, add a suffix: `created_ts_ct`, `created_ts_pt`

**Booleans:**
- ✓ Prefixed with `is_` or `has_` (e.g., `is_active`, `has_subscription`)

**Prices/Revenue:**
- ✓ In decimal currency (e.g., 19.99 for $19.99)
- ✓ If stored in cents, add a suffix: `price_in_cents`

**Common Fields:**
- ✓ Prefix with the entity name (e.g., `customer_name`, `carrier_name`, not just `name`)

**General Rules:**
- ✓ All names in `snake_case`
- ✓ Use business terminology, not source terminology
- ✓ Avoid SQL reserved words
- ✓ Consistency across models (same field names for same concepts)
**Field Ordering (Staging/Base Models):**

1. Keys (pk, fks, natural keys)
2. Dates and timestamps
3. Attributes (dimensions/slicing fields)
4. Metrics (measures/aggregatable values)
5. Metadata (insert_ts, updated_ts, etc.)

Within each category, sort alphabetically.
**Violations to Flag:**
- Inconsistent naming patterns
- Missing `_pk` or `_fk` suffixes
- Timestamps without a `_ts` suffix
- Booleans without an `is_`/`has_` prefix
- Reserved words as column names
- Incorrect field ordering
### 5. Validate Model Configuration
**Configuration Rules:**

**Warehouse Models:**
- ✓ Always materialized as `table`
- ✓ Consider sort/dist keys for performance

**Other Layers:**
- ✓ Prefer `view` or ephemeral (CTE) materialization
- ✓ Use `table` only if performance requires it

**Configuration Placement:**
- ✓ Model-specific config in the model file (in a `config()` block)
- ✓ Directory-wide config in `dbt_project.yml`
**Example:**

```sql
{{
    config(
        materialized = 'table',
        sort = 'user_pk',
        dist = 'user_pk'
    )
}}
```
**Violations to Flag:**
- Warehouse models not materialized as tables
- Unnecessary table materializations in staging/integration
- Config that should be in `dbt_project.yml` but is in the model
### 6. Validate Testing Coverage
**Minimum Testing Requirements:**

**Every Model:**
- ✓ Has a corresponding entry in a `schema.yml` file
- ✓ Primary key has `unique` and `not_null` tests
- ✓ Integration models with multiple sources: use `dbt_utils.unique_combination_of_columns`

**Schema.yml Location:**
- ✓ Every subdirectory should contain a `.yml` file
- ✓ Filename typically matches the directory (e.g., `stg_salesforce.yml`, `integration.yml`)
**Example:**

```yaml
version: 2

models:
  - name: stg_salesforce__user
    description: Salesforce user records
    columns:
      - name: user_pk
        description: Unique identifier for user
        tests:
          - unique
          - not_null
      - name: email
        description: User email address
        tests:
          - not_null
```
**Additional Tests:**
- ✓ `relationships` tests for foreign keys
- ✓ `accepted_values` for enums/status fields
- ✓ `not_null_where` for conditional requirements
- ✓ Custom data tests in the `tests/` directory for KPI validation
**Violations to Flag:**
- Missing schema.yml file
- Models without test coverage
- Primary keys without unique/not_null tests
- Missing relationships tests on foreign keys
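The primary-key test requirement is easy to verify once `schema.yml` has been parsed. A hedged Python sketch operating on a plain dict (the shape PyYAML would produce for one model entry; `pk_test_violations` is a hypothetical helper, and tests declared as dicts, e.g. `relationships`, are deliberately ignored):

```python
def pk_test_violations(model: dict) -> list[str]:
    """Check that every *_pk column in one parsed schema.yml model entry
    carries both `unique` and `not_null` tests, per the rules above."""
    violations = []
    pk_cols = [c for c in model.get("columns", [])
               if c.get("name", "").endswith("_pk")]
    if not pk_cols:
        violations.append(f"{model.get('name')}: no *_pk column declared")
    for col in pk_cols:
        # Only simple string tests are considered; dict-style tests
        # (relationships, accepted_values with args) are skipped.
        tests = {t for t in col.get("tests", []) if isinstance(t, str)}
        missing = {"unique", "not_null"} - tests
        if missing:
            violations.append(f"{col['name']}: missing tests {sorted(missing)}")
    return violations
```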
### 7. Validate Documentation Coverage

**Documentation Requirements:**

**Staging Models:**
- ✓ 100% documented (all models and columns)
- ✓ Clear descriptions for all fields
- ✓ Business terminology explained

**Warehouse Models:**
- ✓ 100% documented (all models and columns)
- ✓ End-user focused descriptions

**Integration/Intermediate:**
- ✓ Document as needed for clarity
- ✓ Explain any special cases or complex logic
**Doc Blocks:**
- ✓ Use `{% docs %}` blocks for shared documentation
- ✓ Reference doc blocks to maintain consistency
- ✓ Store in a `models/docs/` directory
**Example:**

```yaml
version: 2

models:
  - name: user_dim
    description: |
      User dimension containing customer profile information.
      Updated nightly from Salesforce and Stripe sources.
    columns:
      - name: user_pk
        description: "{{ doc('user_pk') }}"
```
**Violations to Flag:**
- Staging/warehouse models without descriptions
- Missing column documentation
- Vague or unhelpful descriptions
### 8. Run sqlfluff Validation (if available)

**Check for sqlfluff:**

```shell
which sqlfluff
```

**If available:**
- Check for a `.sqlfluff` config in the project root
- Run: `sqlfluff lint <model_file> --dialect <dialect>`
- Include sqlfluff violations in the validation output
- Note: sqlfluff enforces many style conventions automatically

**If not available:**
- Note in output: "sqlfluff not detected - recommend installing for automated linting"
- Provide manual validation of style conventions
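Both branches can be sketched in a few lines of Python. This is illustrative only (`lint_with_sqlfluff` is a hypothetical helper); the command shape follows the `sqlfluff lint <file> --dialect <dialect>` invocation given above:

```python
import shutil
import subprocess

def lint_with_sqlfluff(model_file: str, dialect: str) -> str:
    """Run sqlfluff if it is on PATH; otherwise return the
    recommendation string from the 'not available' branch above."""
    if shutil.which("sqlfluff") is None:
        return "sqlfluff not detected - recommend installing for automated linting"
    result = subprocess.run(
        ["sqlfluff", "lint", model_file, "--dialect", dialect],
        capture_output=True, text=True,
    )
    return result.stdout or "no violations reported"
```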
### 9. Output Validation Report
Structure your validation feedback as:
```markdown
## dbt Model Validation Report

**Model:** `<model_name>.sql`
**Type:** <staging/integration/warehouse-dim/warehouse-fct>
**Convention Source:** <project-specific / PKM defaults>

### Summary
- ✓ X checks passed
- ⚠️ Y issues found (N critical, M important, P nice-to-have)

### Naming Conventions
[✓/⚠️] **File naming:** <details>
[✓/⚠️] **Field naming:** <details>

### SQL Structure
[✓/⚠️] **CTE structure:** <details>
[✓/⚠️] **Style compliance:** <details>
[✓/⚠️] **Field ordering:** <details>

### Configuration
[✓/⚠️] **Materialization:** <details>
[✓/⚠️] **Performance settings:** <details>

### Testing
[✓/⚠️] **Schema.yml exists:** <details>
[✓/⚠️] **Primary key tests:** <details>
[✓/⚠️] **Foreign key tests:** <details>

### Documentation
[✓/⚠️] **Model description:** <details>
[✓/⚠️] **Column descriptions:** <details>

### sqlfluff
[✓/⚠️/N/A] **Linter results:** <details>

---

## Recommendations

### Critical Issues (must fix)
1. <issue description>
   - **Location:** <file:line or section>
   - **Current:** `<current code>`
   - **Should be:** `<correct pattern>`
   - **Reason:** <why this matters>

### Important Issues (should fix)
<same format>

### Nice-to-have Improvements
<same format>

---

## Examples

See `skills/dbt-development/examples/` for reference implementations:
- `staging-model-example.sql` - Compliant staging model
- `integration-model-example.sql` - Compliant integration model
- `warehouse-model-example.sql` - Compliant warehouse model
- `schema-example.yml` - Proper testing setup
```
### 10. Creating New Models
When creating a new dbt model from scratch:
**Step-by-step Process:**

1. **Determine Model Type**
   - Ask the user if not clear: staging/integration/warehouse?
   - What source system (for staging)?
   - What entity/object?
2. **Generate File Structure**
   - Correct filename following conventions
   - Proper directory location
   - Configuration block if needed
3. **Build SQL Structure**
   - Refs/sources in CTEs at the top
   - Transformation CTEs for logic
   - Final CTE
   - Select from final
4. **Apply Field Conventions**
   - Generate `_pk` using `surrogate_key`
   - Name foreign keys with the `_fk` suffix
   - Timestamp fields with the `_ts` suffix
   - Proper field ordering
5. **Create/Update schema.yml**
   - Model description
   - Column descriptions
   - Minimum tests (`unique`/`not_null` on the pk)
6. **Validate Against Conventions**
   - Run through the validation checklist
   - Provide a preview before writing
### 11. Supporting References

**In This Skill Directory:**
- `conventions-reference.md` - Quick reference for naming, style, structure
- `testing-reference.md` - Test requirements and transformation layers
- `examples/staging-model-example.sql` - Staging model template
- `examples/integration-model-example.sql` - Integration model template
- `examples/warehouse-model-example.sql` - Warehouse model template
- `examples/schema-example.yml` - Testing and documentation example

**Convention Sources (2-tier system):**
- Project-specific: `.dbt-conventions.md` (if it exists in the project)
- PKM user conventions: `/Users/olivierdupois/dev/PKM/4. 🛠️ Craft/Tools/dbt/dbt-conventions.md` and `dbt-testing.md`
### 12. Important Guidelines

**Always Validate When:**
- Creating new dbt models
- Reviewing changes to existing models
- User asks for dbt guidance
- Working with .sql files in models/ directory
- Refactoring or cleaning up code
**Validation Mode (Not Auto-fix):**
- Provide clear, actionable feedback
- Show correct patterns with examples
- Explain WHY conventions matter
- Offer to make specific changes if user approves
- Never silently modify without explaining
**Project Awareness:**
- Always check for project-specific conventions first
- Note which convention source is being used
- Respect project overrides while still suggesting the PKM best practices
**Priority Levels:**
- **Critical:** Breaks functionality, violates core principles, missing required tests
- **Important:** Inconsistent with conventions, maintainability issues, missing documentation
- **Nice-to-have:** Style preferences, minor optimizations, enhanced documentation
### 13. Examples of Activation

**Example 1: Creating a Staging Model**
User: "Create a staging model for Hubspot contacts"
Actions:
1. Activate dbt Development skill
2. Load convention source (project-specific or PKM defaults)
3. Determine: staging model, Hubspot source, contact object
4. Generate: stg_hubspot__contact.sql with proper structure
5. Create schema.yml entry with tests
6. Validate against all conventions
7. Present model for review
**Example 2: Reviewing Existing Model**
User: "Review this dbt model" [provides file]
Actions:
1. Activate dbt Development skill
2. Load convention source
3. Identify model type from filename/content
4. Run through validation checklist (naming, structure, fields, tests, docs)
5. Check sqlfluff if available
6. Generate validation report with recommendations
**Example 3: Refactoring**
User: "This integration model needs refactoring to match conventions"
Actions:
1. Activate dbt Development skill
2. Load conventions
3. Analyze current model structure
4. Identify violations
5. Provide detailed refactoring plan with before/after examples
6. Offer to apply changes section by section with user approval
### 14. Skill Deactivation
Do NOT activate this skill when:
- Working with non-dbt SQL (raw queries, database migrations, etc.)
- User explicitly says "ignore conventions" or "quick prototype"
- Files outside models/ directory (analyses, macros have different conventions)
- User is asking about dbt Cloud, dbt Core installation, or infrastructure (not model development)