| name | unity-catalog-governance |
| description | Unity Catalog governance patterns, permissions models, security best practices, and policy enforcement for enterprise data governance. |
| triggers | unity catalog governance, catalog security, permissions management, data governance patterns |
| category | governance |
Unity Catalog Governance Skill
Overview
Unity Catalog provides centralized governance for data and AI assets across Databricks workspaces. This skill covers governance patterns, security models, access control, and compliance frameworks.
When to Use This Skill
Use this skill when you need to:
- Design Unity Catalog security architecture
- Implement RBAC/ABAC access controls
- Configure row-level security and column masking
- Set up data classification and tagging
- Implement compliance requirements (GDPR, HIPAA, SOC2)
- Audit permissions and access patterns
Core Concepts
Three-Level Namespace
catalog.schema.table
production.customers.profiles
Securable Objects
- Metastore: Top-level container
- Catalog: Database container
- Schema: Table container
- Table/View: Data objects
- Volume: File storage
- Function: Callable routines
- External Location: External storage
- Storage Credential: Cloud credentials
Privilege Model
-- Catalog privileges
USE CATALOG, CREATE SCHEMA, USE SCHEMA, CREATE TABLE,
CREATE FUNCTION, CREATE VOLUME, ALL PRIVILEGES, OWNERSHIP
-- Schema privileges
USE SCHEMA, CREATE TABLE, CREATE FUNCTION, CREATE VOLUME,
SELECT, MODIFY, READ FILES, WRITE FILES
-- Table privileges
SELECT, MODIFY, READ METADATA
-- Function privileges
EXECUTE
-- Volume privileges
READ FILES, WRITE FILES
Governance Patterns
Pattern 1: Environment Isolation
-- Separate catalogs per environment
CREATE CATALOG IF NOT EXISTS dev;
CREATE CATALOG IF NOT EXISTS staging;
CREATE CATALOG IF NOT EXISTS production;
-- Grant appropriate access
GRANT USE CATALOG, CREATE SCHEMA ON CATALOG dev TO `developers`;
GRANT USE CATALOG, USE SCHEMA ON CATALOG staging TO `testers`;
GRANT USE CATALOG, USE SCHEMA, SELECT ON CATALOG production TO `analysts`;
Pattern 2: Data Domain Organization
-- Organize by business domain
CREATE CATALOG business_domains;
CREATE SCHEMA business_domains.customer_domain;
CREATE SCHEMA business_domains.financial_domain;
CREATE SCHEMA business_domains.product_domain;
-- Assign domain ownership
ALTER SCHEMA business_domains.customer_domain
SET OWNER TO `customer_data_team`;
-- Domain-specific access
GRANT SELECT ON SCHEMA business_domains.customer_domain TO `customer_analytics`;
GRANT MODIFY ON SCHEMA business_domains.customer_domain TO `customer_engineering`;
Pattern 3: Medallion with Security
-- Bronze: Raw data - limited access
CREATE SCHEMA production.bronze;
GRANT USE SCHEMA, SELECT, MODIFY ON SCHEMA production.bronze TO `data_engineers`;
-- Silver: Refined data - broader access
CREATE SCHEMA production.silver;
GRANT USE SCHEMA, SELECT ON SCHEMA production.silver TO `data_analysts`;
GRANT MODIFY ON SCHEMA production.silver TO `data_engineers`;
-- Gold: Business data - wide access
CREATE SCHEMA production.gold;
GRANT USE SCHEMA, SELECT ON SCHEMA production.gold TO `business_users`;
GRANT MODIFY ON SCHEMA production.gold TO `analytics_engineers`;
Pattern 4: Row-Level Security
-- Regional data access control
CREATE FUNCTION governance.regional_filter(user_region STRING, data_region STRING)
RETURNS BOOLEAN
RETURN CASE
WHEN IS_ACCOUNT_GROUP_MEMBER('global_access') THEN TRUE
WHEN user_region = data_region THEN TRUE
ELSE FALSE
END;
-- Apply row filter
ALTER TABLE production.customers.orders
SET ROW FILTER governance.regional_filter(current_user_region(), region);
Pattern 5: Column Masking
-- Create masking function
CREATE FUNCTION governance.mask_email(email STRING)
RETURNS STRING
RETURN CASE
WHEN IS_ACCOUNT_GROUP_MEMBER('pii_admin') THEN email
WHEN IS_ACCOUNT_GROUP_MEMBER('pii_viewer') THEN
CONCAT(SUBSTRING(email, 1, 3), '***@', SPLIT(email, '@')[1])
ELSE 'REDACTED'
END;
-- Apply to column
ALTER TABLE production.customers.profiles
ALTER COLUMN email SET MASK governance.mask_email;
Pattern 6: Data Classification
-- Create classification tags
CREATE TAG IF NOT EXISTS governance.sensitivity
VALUES ('PUBLIC', 'INTERNAL', 'CONFIDENTIAL', 'RESTRICTED');
CREATE TAG IF NOT EXISTS governance.data_domain
VALUES ('CUSTOMER', 'FINANCIAL', 'EMPLOYEE', 'PRODUCT');
-- Apply classification
ALTER TABLE production.customers.profiles
SET TAGS (
'governance.sensitivity' = 'RESTRICTED',
'governance.data_domain' = 'CUSTOMER'
);
ALTER TABLE production.customers.profiles
ALTER COLUMN ssn SET TAGS ('governance.sensitivity' = 'RESTRICTED');
Security Best Practices
1. Least Privilege Access
-- ❌ BAD: Overly permissive
GRANT ALL PRIVILEGES ON CATALOG production TO `all_users`;
-- ✅ GOOD: Minimal required permissions
GRANT USE CATALOG ON CATALOG production TO `analysts`;
GRANT USE SCHEMA ON SCHEMA production.gold TO `analysts`;
GRANT SELECT ON SCHEMA production.gold TO `analysts`;
2. Separation of Duties
# Define distinct roles
ROLES = {
"data_consumer": ["USE CATALOG", "USE SCHEMA", "SELECT"],
"data_producer": ["USE CATALOG", "USE SCHEMA", "SELECT", "MODIFY"],
"data_steward": ["USE CATALOG", "USE SCHEMA", "SELECT", "MODIFY", "ALTER"],
"platform_admin": ["ALL PRIVILEGES"] # Minimal assignments
}
3. Service Principal Ownership
-- Use service principals for ownership, not individuals
ALTER SCHEMA production.customers
SET OWNER TO `data-engineering-sp`;
-- Transfer from individual to service principal
ALTER TABLE production.customers.profiles
SET OWNER TO `customer-domain-sp`;
4. Audit and Monitor
-- Query audit logs
SELECT
user_identity.email,
request_params.full_name_arg as object_accessed,
action_name,
event_time
FROM system.access.audit
WHERE event_date >= CURRENT_DATE - INTERVAL 7 DAYS
AND action_name IN ('GET_TABLE', 'READ_TABLE')
ORDER BY event_time DESC;
5. Regular Access Reviews
def quarterly_access_review(catalog: str):
"""Review and recertify access quarterly."""
# Get all grants
grants = spark.sql(f"SHOW GRANTS ON CATALOG {catalog}")
# Identify excessive access
excessive = grants.filter("privilege = 'ALL PRIVILEGES'")
# Find unused permissions
unused = find_unused_grants(grants)
# Generate review report
report = generate_review_report(excessive, unused)
send_to_data_stewards(report)
Compliance Frameworks
GDPR Compliance Pattern
class GDPRCompliance:
"""GDPR compliance implementation."""
def setup_data_inventory(self):
"""Article 30: Records of processing."""
# Tag all PII
self.classify_pii_data()
# Document processing purposes
self.document_purposes()
def implement_right_to_erasure(self):
"""Article 17: Right to erasure."""
# Create deletion workflow
self.create_deletion_api()
# Setup cascade delete
self.configure_lineage()
def enable_data_portability(self):
"""Article 20: Right to data portability."""
# Export user data
self.create_export_api()
HIPAA Compliance Pattern
class HIPAACompliance:
"""HIPAA compliance implementation."""
def implement_access_controls(self):
"""Technical Safeguards: Access Control."""
# Unique user identification
self.enforce_sso()
# Automatic logoff
self.configure_session_timeout()
# Audit controls
self.enable_comprehensive_logging()
def enable_encryption(self):
"""Technical Safeguards: Encryption."""
# Encryption at rest
self.enable_catalog_encryption()
# Encryption in transit
self.enforce_tls()
Templates
See /templates/ directory for:
- governance-framework: Complete governance setup
- rbac-configuration: Role-based access control
- compliance-checklist: Regulatory compliance validation
- audit-procedures: Regular audit workflows
Examples
See /examples/ directory for:
- regional-access-control: Multi-region data isolation
- pii-protection: PII masking and encryption
- compliance-reporting: Automated compliance reports
Related Skills
data-classification: Data classification and taggingcompliance-automation: Automated compliance checksaccess-management: RBAC/ABAC implementation