name	Encapsulating Complexity
description	Hide implementation details behind interfaces - work at domain level (what), not implementation level (how)
when_to_use	When designing any class or interface. When implementation details leak into public API. When storage format (JSON, SQL, files) is exposed. When working with raw data structures (dicts, rows) instead of domain objects. When client code must know HOW things work internally. When changing implementation would break client code. When database queries mixed with business logic. When switching storage type requires interface changes. When tests must know internal structure.
version	1.0.0
languages	all

Encapsulating Complexity

Overview

Hide HOW things work. Expose only WHAT they do. Work at domain level (Users, Orders, Config), not implementation level (dicts, SQL rows, JSON files).

Core principle: The point of encapsulation is to create possibilities (many ways to implement) and restrict possibilities (one way to use). Implementation details hidden = free to change implementation without breaking clients.

Violating the letter of this rule is violating the spirit of information hiding.

When to Use

Apply to every class and interface:

Designing new classes/modules
Creating public APIs
Reviewing code for abstraction leaks
Refactoring to improve maintainability

Warning signs you're violating encapsulation:

Client code knows storage format (JSON, SQL, files)
Interface exposes database/file operations
Returns raw dicts/rows instead of domain objects
Client code constructs SQL queries or file paths
Changing from JSON to YAML breaks client code
Switching databases requires changing all callers
Tests must know internal data structures
Method names reveal implementation (saveToJSON, queryDatabase)
Public fields exposing internal state

The Encapsulation Test

Ask these questions about every public method/field:

Does the interface expose HOW or WHAT?
- "Get user" = WHAT (good)
- "SELECT * FROM users" = HOW (bad)
Can I change implementation without breaking clients?
- JSON → YAML: should work
- PostgreSQL → MongoDB: should work
- If clients break: encapsulation violated
Do clients work at domain level or implementation level?
- Domain: user.email, config.get_timeout()
- Implementation: row[2], json_data['timeout_ms']
Do return values expose internals?
- Domain: User object
- Implementation: dict with database column names

If answers reveal implementation details → encapsulation violated.

Core Pattern: Hide the How

Before (Implementation Exposed)

class ConfigManager:
    def __init__(self, json_path):
        self.json_path = json_path  # ❌ Exposes JSON
        self._data = {}

    def get_value(self, key):
        return self._data.get(key)  # ⚠️ Returns raw value

    def save_to_json(self):  # ❌ "JSON" in method name
        with open(self.json_path, 'w') as f:
            json.dump(self._data, f)

Client code:

config = ConfigManager("/path/to/config.json")  # Must know it's JSON
timeout = config.get_value("timeout_ms")  # Must know exact key format
config.save_to_json()  # Tied to JSON format

If you switch JSON → YAML: Client code breaks. Method names wrong. Constructor signature wrong.

After (Implementation Hidden)

class Config:
    def __init__(self, config_file):  # ✅ No format specified
        self._storage = self._load(config_file)  # ✅ Implementation hidden

    def get_timeout(self):  # ✅ Domain method, not raw key access
        return self._storage.get("timeout_ms", 5000)

    def set_timeout(self, seconds):  # ✅ Domain operation
        self._storage["timeout_ms"] = seconds * 1000  # ms internally

    def save(self):  # ✅ No "JSON" in name
        self._persist(self._storage)

    def _load(self, config_file):  # ✅ Private - can change format
        # Could load JSON, YAML, TOML, etc.
        pass

    def _persist(self, data):  # ✅ Private - implementation detail
        # Format hidden from clients
        pass

Client code:

config = Config("app.config")  # Format agnostic
timeout = config.get_timeout()  # Domain method (seconds)
config.set_timeout(10)
config.save()

Switch JSON → YAML: Client code unchanged. Just change _load() and _persist().

Domain Level vs Implementation Level

Work at the problem domain, not the solution domain:

Example 1: User Management

❌ Implementation Level:

class UserManager:
    def get_user_row(self, user_id):
        # Returns database row as dict
        cur.execute("SELECT * FROM users WHERE id = %s", (user_id,))
        return dict(cur.fetchone())  # ❌ Raw row dict

    def update_user_column(self, user_id, column, value):
        # ❌ Exposes database columns
        cur.execute(f"UPDATE users SET {column} = %s WHERE id = %s", ...)

Client must know:

Database schema (column names)
SQL concepts (rows, columns)
Data types in database

✅ Domain Level:

class UserRepository:
    def get_user(self, user_id: str) -> User:
        # Returns domain object
        row = self._query_user(user_id)
        return User.from_storage(row)  # ✅ Domain object

    def update_email(self, user_id: str, new_email: str) -> None:
        # ✅ Domain operation, no column names
        user = self.get_user(user_id)
        user.email = new_email
        self._persist_user(user)

class User:
    def __init__(self, id, email, name):
        self.id = id
        self.email = email
        self.name = name

Client knows:

Users (domain concept)
Email (domain field)
Nothing about database

Example 2: Configuration

❌ Implementation Level:

class ConfigManager:
    def get(self, key):
        return self._config.get(key)  # ❌ Must know exact key names

    # Client code
    timeout = config.get("server.timeout_milliseconds")  # ❌ Internal structure
    port = config.get("server.port")

✅ Domain Level:

class Config:
    def server_timeout(self):  # ✅ Domain method
        return self._get("server.timeout_ms", 5000) / 1000  # Hide ms→sec

    def server_port(self):  # ✅ Domain method
        return self._get("server.port", 8080)

    # Client code
    timeout = config.server_timeout()  # ✅ Clean domain API
    port = config.server_port()

Separation of Concerns

Don't mix persistence with domain logic:

❌ Mixed (From Baseline Test):

class ReportGenerator:
    def generate(self, start_date, end_date):
        # Database access
        conn = psycopg2.connect(...)
        cursor.execute("SELECT ...")
        rows = cursor.fetchall()

        # Business logic
        total = sum(row[3] for row in rows)

        # HTML formatting
        html = f"<html>...</html>"

        # File I/O
        with open(f"report_{start_date}.html", "w") as f:
            f.write(html)

4 concerns mixed: database, calculation, formatting, file I/O

✅ Separated:

class SalesRepository:
    """Concern: Data access"""
    def get_sales(self, start_date, end_date) -> List[Sale]:
        # Database logic hidden here
        pass

class SalesCalculator:
    """Concern: Business logic"""
    def calculate_metrics(self, sales: List[Sale]) -> SalesMetrics:
        # Pure calculation, no database/file knowledge
        pass

class HTMLFormatter:
    """Concern: Presentation"""
    def format_report(self, metrics: SalesMetrics) -> str:
        # HTML generation, no database/calculation knowledge
        pass

class ReportGenerator:
    """Concern: Orchestration"""
    def __init__(self, repo, calculator, formatter):
        self._repo = repo
        self._calculator = calculator
        self._formatter = formatter

    def generate(self, start_date, end_date, output_path):
        sales = self._repo.get_sales(start_date, end_date)  # What
        metrics = self._calculator.calculate_metrics(sales)  # What
        html = self._formatter.format_report(metrics)  # What

        with open(output_path, "w") as f:
            f.write(html)

Each class can change independently. Database switch doesn't touch formatting. HTML→PDF doesn't touch database.

What to Hide

Hide Storage Format

❌ Exposed:

class DataStore:
    def load_from_json(self):  # ❌ "JSON" in name
        pass

    def save_to_json(self):  # ❌ Tied to JSON
        pass

✅ Hidden:

class DataStore:
    def load(self):  # ✅ Format agnostic
        self._read_storage()  # Private - can be JSON/YAML/SQL

    def save(self):  # ✅ Format agnostic
        self._write_storage()  # Private - implementation hidden

Hide Database Details

❌ Exposed:

def get_users(self):
    cur.execute("SELECT * FROM users")
    return [dict(row) for row in cur.fetchall()]  # ❌ Raw database rows

✅ Hidden:

def get_users(self) -> List[User]:
    rows = self._query_all_users()  # SQL hidden in private method
    return [User.from_row(row) for row in rows]  # ✅ Domain objects

Hide Data Structures

❌ Exposed:

class UserCache:
    def __init__(self):
        self.users_dict = {}  # ❌ Public dict

    # Client code manipulates dict directly
    cache.users_dict[user_id] = user_data

✅ Hidden:

class UserCache:
    def __init__(self):
        self._storage = {}  # ✅ Private implementation

    def add_user(self, user):  # ✅ Domain operation
        self._storage[user.id] = user

    def get_user(self, user_id):  # ✅ Domain operation
        return self._storage.get(user_id)

Internal dict can become Redis, Memcached, or database without breaking clients.

Hide Algorithms and Complexity

❌ Exposed:

def sort_users_by_name(users):
    # ❌ Client must understand sorting implementation
    # Uses quicksort internally
    return quicksort(users, key=lambda u: u.name)

✅ Hidden:

def get_users_sorted_by_name(users):
    # ✅ How it sorts is hidden - could be quicksort, mergesort, timsort
    return sorted(users, key=lambda u: u.name)

Interface Design Principles

1. Name by What, Not How

❌ How (implementation exposed):

save_to_json(), load_from_yaml(), write_to_file()
execute_sql_query(), get_database_connection()
parse_json_response(), build_xml_request()

✅ What (implementation hidden):

save(), load(), persist()
get_users(), find_by_email()
send_request(), get_response()

2. Accept and Return Domain Objects

❌ Raw data structures:

def create_user(self, user_dict: dict) -> dict:
    # ❌ Accepts and returns dicts
    # Client must know dict structure

✅ Domain objects:

def create_user(self, email: str, name: str) -> User:
    # ✅ Clear parameters, returns domain object
    # Internal structure hidden

3. Work at Single Abstraction Level

❌ Mixed levels:

def process_order(self, order):
    validate_order(order)  # High level

    # ❌ Low-level SQL mixed in
    conn.execute("INSERT INTO orders VALUES (%s, %s)", ...)

    send_confirmation(order)  # High level

✅ Consistent level:

def process_order(self, order):
    # All high level - persistence hidden
    validate_order(order)
    saved_order = self._repository.save(order)  # Hides SQL
    send_confirmation(saved_order)

Common Violations from Baseline Testing

Violation 1: Exposing Storage Format

Baseline: ConfigManager exposes JSON in interface, UserManager exposes SQL.

❌ What agents naturally do:

class ConfigManager:
    def __init__(self, json_path):  # Exposes JSON
        self.json_path = json_path  # Public attribute

    def save(self):
        with open(self.json_path, 'w') as f:
            json.dump(self._config, f)  # JSON logic in public view

✅ Hide format:

class Config:
    def __init__(self, config_file):  # ✅ Format neutral
        self._loader = self._create_loader(config_file)  # Private
        self._data = self._loader.load()  # Hidden

    def save(self):
        self._loader.persist(self._data)  # ✅ Format hidden

    def _create_loader(self, config_file):
        # Private - can support JSON, YAML, TOML, etc.
        if config_file.endswith('.yaml'):
            return YAMLLoader(config_file)
        return JSONLoader(config_file)  # Default

Violation 2: Working at Implementation Level

Baseline: UserManager returns dicts, exposes column names.

❌ What agents naturally do:

def get_user(self, user_id):
    cur.execute("SELECT * FROM users WHERE id = %s", (user_id,))
    return dict(cur.fetchone())  # ❌ Raw database row

# Client must know database schema
user_dict = manager.get_user(123)
email = user_dict['email']  # ❌ Column name knowledge required

✅ Work at domain level:

def get_user(self, user_id: str) -> User:
    row = self._fetch_user_row(user_id)  # SQL hidden
    return User(
        id=row['id'],
        email=row['email'],
        name=row['name']
    )  # ✅ Domain object

# Client works with domain concepts
user = repo.get_user("123")
email = user.email  # ✅ Domain field, not column

Violation 3: Mixed Concerns

Baseline: ReportGenerator mixes database, calculation, formatting, file I/O in one method.

❌ What agents naturally do:

def generate_report(self):
    conn = psycopg2.connect(...)  # Database
    rows = cursor.fetchall()  # Database
    total = sum(row[3] for row in rows)  # Calculation
    html = f"<html>...</html>"  # Formatting
    with open("report.html", "w") as f:  # File I/O
        f.write(html)

✅ Separate and encapsulate:

class ReportGenerator:
    def __init__(self, repo, calculator, formatter, file_writer):
        self._repo = repo  # Database concern encapsulated
        self._calculator = calculator  # Calculation encapsulated
        self._formatter = formatter  # Formatting encapsulated
        self._writer = file_writer  # I/O encapsulated

    def generate(self, start_date, end_date):
        # Orchestrate without knowing HOW each works
        sales = self._repo.get_sales(start_date, end_date)
        metrics = self._calculator.calculate(sales)
        content = self._formatter.format(metrics)
        return self._writer.write(content)

Quick Reference

Encapsulation Target	What to Hide	What to Expose
Storage format	JSON/YAML/SQL/files	`load()`, `save()`
Database	SQL queries, connection, schema	Domain operations (`get_user()`)
Data structures	Dict/list/tree internal structure	Domain methods
Algorithms	Sorting/searching implementation	High-level operation
File paths	Internal directory structure	Logical identifiers
External APIs	HTTP/gRPC/REST details	Domain operations
Complex state	State machine internals	Simple operations

Techniques for Encapsulation

Technique 1: Private Implementation Methods

class UserRepository:
    # ✅ Public interface - domain level
    def find_by_email(self, email: str) -> Optional[User]:
        row = self._query_by_email(email)  # Call private method
        return User.from_row(row) if row else None

    # ✅ Private implementation - can change freely
    def _query_by_email(self, email: str):
        # SQL hidden in private method
        # Can change SQL, database, caching without affecting public interface
        pass

Technique 2: Adapter/Wrapper Pattern

# ✅ Wrap complex external library behind simple interface
class EmailService:
    def __init__(self):
        self._smtp_client = smtplib.SMTP(...)  # ✅ Private
        self._templates = self._load_templates()  # ✅ Private

    def send_welcome_email(self, user: User):
        # ✅ Domain operation, SMTP hidden
        template = self._templates['welcome']
        message = template.format(name=user.name)
        self._smtp_client.send(user.email, message)

Technique 3: Abstract Data Types

# ✅ Font operations at problem level, not bit manipulation
class Font:
    def set_bold(self):
        # Hides bit manipulation: attr = attr | 0x02
        self._attributes |= self.BOLD_FLAG

    def is_bold(self):
        return bool(self._attributes & self.BOLD_FLAG)

Client code: font.set_bold() not font.attr = font.attr | 0x02

Technique 4: Facade for Complex Subsystems

# ✅ Simple interface hiding complex subsystem
class OrderProcessor:
    """Facade hiding inventory, payment, shipping complexity."""

    def __init__(self):
        self._inventory = InventorySystem()  # Complex
        self._payment = PaymentGateway()  # Complex
        self._shipping = ShippingService()  # Complex

    def place_order(self, order: Order) -> OrderResult:
        # ✅ Simple interface, complex coordination hidden
        self._inventory.reserve(order.items)
        self._payment.charge(order.payment_info)
        self._shipping.create_shipment(order.address)
        return OrderResult(success=True, order_id=order.id)

Layering and Stratification

Create layers where each hides complexity of layer below:

┌─────────────────────────────┐
│  Application Layer          │  Works with: Users, Orders (domain)
├─────────────────────────────┤
│  Domain Layer               │  Works with: Entities, Value Objects
├─────────────────────────────┤
│  Persistence Layer          │  Works with: Rows, JSON (hidden)
├─────────────────────────────┤
│  Database/Storage           │  SQL, files (fully hidden)
└─────────────────────────────┘

Each layer:

Encapsulates complexity of layers below
Presents abstraction to layers above
Can be changed without affecting other layers

Benefits of Encapsulation

1. Change Implementation Freely

# Can change from JSON to YAML
# Can change from PostgreSQL to MongoDB
# Can change from files to cloud storage
# WITHOUT changing client code

2. Simpler Client Code

# ❌ Without encapsulation
config_data = json.load(open("config.json"))
timeout = config_data.get("server", {}).get("timeout_ms", 5000) / 1000

# ✅ With encapsulation
timeout = config.server_timeout()

3. Easier Testing

# ✅ Can mock at domain level
mock_repo = Mock(UserRepository)
mock_repo.get_user.return_value = User("1", "test@test.com", "Test")

# Don't need to mock:
# - Database connections
# - SQL queries
# - Row dictionaries

4. Prevents Ripple Effects

# Change database schema:
# - With encapsulation: Change private methods only
# - Without: Change every place that accesses row['column_name']

Common Mistakes

❌ Exposing internal structure:

class ShoppingCart:
    def __init__(self):
        self.items = []  # ❌ Public list - clients depend on list structure

✅ Hide structure:

class ShoppingCart:
    def __init__(self):
        self._items = []  # ✅ Private - can change to set, dict, etc.

    def add_item(self, item):  # ✅ Domain operation
        self._items.append(item)

    def get_items(self):  # ✅ Returns copy or iterator
        return list(self._items)

❌ Returning mutable internal state:

def get_config_dict(self):
    return self._config  # ❌ Clients can modify internal state

✅ Return copies or immutable:

def get_config_dict(self):
    return dict(self._config)  # ✅ Copy - changes don't affect internal

❌ Method names revealing implementation:

get_json_data(), save_to_database(), execute_sql(), write_file()

✅ Method names revealing purpose:

get_data(), save(), persist(), execute(), write()

Red Flags - Improve Encapsulation

Interface design:

Method names mention implementation (JSON, SQL, HTTP, file)
Returns raw dicts/rows/JSON instead of domain objects
Accepts raw data instead of domain parameters
Public fields exposing internal state
Client code must know storage format

Implementation:

Database queries in business logic classes
File I/O mixed with calculations
Multiple concerns in one class
Working with rows/dicts instead of domain objects
Switching implementation would break clients

All of these mean: Improve encapsulation.

Common Rationalizations

From baseline testing, agents justify poor encapsulation with:

Excuse	Reality
"Client needs to know the structure"	No, client needs domain operations. Hide structure.
"Returning dict is simpler than creating class"	Simple to write ≠ simple to maintain. Domain objects prevent errors.
"Just a thin wrapper, not worth it"	Wrappers enable change. Worth it.
"Everything's in one place, easier to find"	Easier to find ≠ easier to change. Separation enables modification.
"It works, no need to abstract"	Working now ≠ maintainable later. Abstract anyway.
"YAGNI - we won't change database"	Can't predict future. Encapsulation is cheap insurance.
"Too much boilerplate"	Boilerplate prevents ripple effects. Trade-off worth it.

Verification Checklist

Before marking class/interface design complete:

Method names reveal purpose, not implementation
Returns domain objects, not raw dicts/rows
Accepts domain parameters, not raw data structures
No public fields exposing internal state
Can change storage format without breaking clients
Can change database without changing interface
Client works at domain level (User, Order) not implementation level (dict, row)
Mixed concerns separated (persistence, logic, formatting)
Private methods hide implementation details
Tests don't depend on internal structure

If any "no" → improve encapsulation.

Real-World Impact

From Code Complete:

Information hiding is fundamental design heuristic
Classes should hide their implementation behind interfaces
Work at problem domain level, not solution domain level
Benefits: easier to modify, reuse, and understand

From baseline testing:

Agents naturally use private variables (_config)
BUT expose storage format (JSON, SQL)
Work at implementation level (dicts, rows) not domain (objects)
Mix concerns (database + calculation + formatting)
Don't create domain objects - return raw data

With this skill: Hide implementation, work at domain level, separate concerns.

Integration with Other Skills

For keeping interfaces focused: See skills/coding/keeping-routines-focused - single responsibility applies to classes too

For reducing complexity: See skills/reducing-complexity - encapsulation reduces complexity by hiding details

Install Skill

SKILL.md

Encapsulating Complexity

Overview

When to Use

The Encapsulation Test

Core Pattern: Hide the How

Before (Implementation Exposed)

After (Implementation Hidden)

Domain Level vs Implementation Level

Example 1: User Management

Example 2: Configuration

Separation of Concerns

What to Hide

Hide Storage Format

Hide Database Details

Hide Data Structures

Hide Algorithms and Complexity

Interface Design Principles

1. Name by What, Not How

2. Accept and Return Domain Objects

3. Work at Single Abstraction Level

Common Violations from Baseline Testing

Violation 1: Exposing Storage Format

Violation 2: Working at Implementation Level

Violation 3: Mixed Concerns

Quick Reference

Techniques for Encapsulation

Technique 1: Private Implementation Methods

Technique 2: Adapter/Wrapper Pattern

Technique 3: Abstract Data Types

Technique 4: Facade for Complex Subsystems

Layering and Stratification

Benefits of Encapsulation

1. Change Implementation Freely

2. Simpler Client Code

3. Easier Testing

4. Prevents Ripple Effects

Common Mistakes

Red Flags - Improve Encapsulation

Common Rationalizations

Verification Checklist

Real-World Impact

Integration with Other Skills