---
name: ai-ethics
description: Responsible AI development and ethical considerations. Use when evaluating AI bias, implementing fairness measures, conducting ethical assessments, or ensuring AI systems align with human values.
author: Joseph OBrien
status: unpublished
updated: 2025-12-23
version: 1.0.1
tag: skill
type: skill
---
# AI Ethics
Comprehensive AI ethics skill covering bias detection, fairness assessment, responsible AI development, and regulatory compliance.
## When to Use This Skill
- Evaluating AI models for bias
- Implementing fairness measures
- Conducting ethical impact assessments
- Ensuring regulatory compliance (EU AI Act, etc.)
- Designing human-in-the-loop systems
- Creating AI transparency documentation
- Developing AI governance frameworks
## Ethical Principles
### Core AI Ethics Principles
| Principle | Description |
|---|---|
| Fairness | AI should not discriminate against individuals or groups |
| Transparency | AI decisions should be explainable |
| Privacy | Personal data must be protected |
| Accountability | Clear responsibility for AI outcomes |
| Safety | AI should not cause harm |
| Human Agency | Humans should maintain control |
### Stakeholder Considerations
- Users: How does this affect people using the system?
- Subjects: How does this affect people the AI makes decisions about?
- Society: What are broader societal implications?
- Environment: What is the environmental impact?
## Bias Detection & Mitigation
### Types of AI Bias
| Bias Type | Source | Example |
|---|---|---|
| Historical | Training data reflects past discrimination | Hiring models favoring male candidates |
| Representation | Underrepresented groups in training data | Face recognition failing on darker skin |
| Measurement | Proxy variables for protected attributes | ZIP code correlating with race |
| Aggregation | One model for diverse populations | Medical model trained only on one ethnicity |
| Evaluation | Biased evaluation metrics | Accuracy hiding disparate impact |
### Fairness Metrics
**Group Fairness:**
- Demographic Parity: Equal positive rates across groups
- Equalized Odds: Equal true positive and false positive rates across groups
- Predictive Parity: Equal precision across groups
**Individual Fairness:**
- Similar individuals should receive similar predictions
- Counterfactual fairness: Would outcome change if protected attribute differed?
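The group-fairness metrics above can be computed directly from labels, predictions, and group membership. A minimal sketch in plain Python (function and variable names are illustrative, not a standard API):

```python
from collections import defaultdict

def group_rates(y_true, y_pred, groups):
    """Per-group positive rate, true positive rate, and false positive rate."""
    stats = defaultdict(lambda: {"pos": 0, "n": 0, "tp": 0, "p": 0, "fp": 0, "neg": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        s = stats[g]
        s["n"] += 1
        s["pos"] += p
        if t == 1:
            s["p"] += 1
            s["tp"] += p
        else:
            s["neg"] += 1
            s["fp"] += p
    return {
        g: {
            "positive_rate": s["pos"] / s["n"],           # demographic parity
            "tpr": s["tp"] / s["p"] if s["p"] else 0.0,   # equalized odds, part 1
            "fpr": s["fp"] / s["neg"] if s["neg"] else 0.0,  # equalized odds, part 2
        }
        for g, s in stats.items()
    }

# Toy data: group "b" receives positive predictions more often than group "a".
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
rates = group_rates(y_true, y_pred, groups)
dp_gap = abs(rates["a"]["positive_rate"] - rates["b"]["positive_rate"])
```

A demographic-parity gap of zero means equal positive rates; in practice teams set a tolerance (e.g. the "80% rule") rather than demanding exact equality.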
### Bias Mitigation Strategies
**Pre-processing:**
- Resampling/reweighting training data
- Removing biased features
- Data augmentation for underrepresented groups
**In-processing:**
- Fairness constraints in loss function
- Adversarial debiasing
- Fair representation learning
**Post-processing:**
- Threshold adjustment per group
- Calibration
- Reject option classification
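As a sketch of the post-processing approach, the helper below picks a separate score cutoff per group so that each group's positive-prediction rate approximately matches a target (names and the equal-rate target are illustrative; real deployments must also audit the error-rate impact of per-group thresholds):

```python
def per_group_thresholds(scores, groups, target_rate):
    """Choose a score threshold per group so each group's positive
    prediction rate is approximately `target_rate` (post-processing)."""
    by_group = {}
    for s, g in zip(scores, groups):
        by_group.setdefault(g, []).append(s)
    thresholds = {}
    for g, vals in by_group.items():
        vals = sorted(vals, reverse=True)
        k = max(1, round(target_rate * len(vals)))  # how many to accept
        thresholds[g] = vals[k - 1]                 # accept the top-k scores
    return thresholds

scores = [0.9, 0.8, 0.4, 0.3, 0.7, 0.6, 0.5, 0.2]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
th = per_group_thresholds(scores, groups, target_rate=0.5)
preds = [int(s >= th[g]) for s, g in zip(scores, groups)]  # equal rates per group
```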
## Explainability & Transparency
### Explanation Types
| Type | Audience | Purpose |
|---|---|---|
| Global | Developers | Understand overall model behavior |
| Local | End users | Explain specific decisions |
| Counterfactual | Affected parties | What would need to change for different outcome |
### Explainability Techniques
- SHAP: Shapley-value-based feature attributions
- LIME: Local Interpretable Model-agnostic Explanations for individual predictions
- Attention maps: For neural networks
- Decision trees: Inherently interpretable
- Feature importance: Global model understanding
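One model-agnostic route to the global feature importance listed above is permutation importance: shuffle one feature's values and measure how much a metric degrades. A self-contained sketch (the toy model and all names are illustrative):

```python
import random

def permutation_importance(model, X, y, metric, n_repeats=5, seed=0):
    """Global importance: how much does shuffling one feature hurt the metric?"""
    rng = random.Random(seed)
    base = metric(y, [model(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)  # break the feature's link to the target
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            drops.append(base - metric(y, [model(row) for row in X_perm]))
        importances.append(sum(drops) / n_repeats)
    return importances

# Toy model that only looks at feature 0, so feature 1 should score ~0.
model = lambda row: int(row[0] > 0.5)
accuracy = lambda y, p: sum(a == b for a, b in zip(y, p)) / len(y)
X = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.1, 0.9]]
y = [1, 0, 1, 0]
imp = permutation_importance(model, X, y, accuracy)
```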
### Model Cards
Document for each model:
- Model purpose and intended use
- Training data description
- Performance metrics by subgroup
- Limitations and ethical considerations
- Version and update history
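A model card can live alongside the model as structured data rather than a free-form document. A minimal sketch using a Python dataclass (field names and example values are illustrative, not a standard schema):

```python
from dataclasses import dataclass, field, asdict

@dataclass
class ModelCard:
    name: str
    version: str
    intended_use: str
    training_data: str
    subgroup_metrics: dict = field(default_factory=dict)  # metrics broken out by group
    limitations: list = field(default_factory=list)

card = ModelCard(
    name="credit-scorer",
    version="1.0.1",
    intended_use="Ranking loan applications for human review, not auto-approval.",
    training_data="Historical applications; see data provenance documentation.",
    subgroup_metrics={"group_a": {"accuracy": 0.91}, "group_b": {"accuracy": 0.87}},
    limitations=["Not validated for applicants outside the training population."],
)
card_dict = asdict(card)  # serializable for publishing alongside the model
```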
## AI Governance
### AI Risk Assessment
**Risk Categories (EU AI Act):**
| Risk Level | Examples | Requirements |
|---|---|---|
| Unacceptable | Social scoring, manipulation | Prohibited |
| High | Healthcare, employment, credit | Strict requirements |
| Limited | Chatbots | Transparency obligations |
| Minimal | Spam filters | No mandatory requirements |
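A first-pass triage of a use case against these risk levels can be sketched as a keyword lookup, checking the strictest level first (the keyword sets and helper are hypothetical and greatly simplified; the Act's actual criteria, including the Annex III high-risk list, require legal review):

```python
# Hypothetical, simplified keyword sets — not a substitute for legal analysis.
RISK_KEYWORDS = {
    "unacceptable": {"social scoring", "subliminal manipulation"},
    "high": {"healthcare", "employment", "credit", "education"},
    "limited": {"chatbot", "deepfake"},
}

def triage_risk(use_case: str) -> str:
    """Return the strictest risk level whose keywords appear in the description."""
    text = use_case.lower()
    for level in ("unacceptable", "high", "limited"):
        if any(kw in text for kw in RISK_KEYWORDS[level]):
            return level
    return "minimal"
```

Such a triage only flags candidates for a proper assessment; it should route matches to the review workflow, never make the final classification.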
### Governance Framework
- Policy: Define ethical principles and boundaries
- Process: Review and approval workflows
- People: Roles and responsibilities (ethics board)
- Technology: Tools for monitoring and enforcement
### Documentation Requirements
- Data provenance and lineage
- Model training documentation
- Testing and validation results
- Deployment and monitoring plans
- Incident response procedures
## Human Oversight
### Human-in-the-Loop Patterns
| Pattern | Use Case | Example |
|---|---|---|
| Human-in-the-Loop | High-stakes decisions | Medical diagnosis confirmation |
| Human-on-the-Loop | Monitoring with intervention | Content moderation escalation |
| Human-out-of-Loop | Low-risk, high-volume | Spam filtering |
### Designing for Human Control
- Clear escalation paths
- Override capabilities
- Confidence thresholds for automation
- Audit trails
- Feedback mechanisms
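The confidence-threshold pattern above can be sketched as a simple router that maps model confidence to one of the oversight patterns (threshold values and names are illustrative; real systems tune thresholds against measured error rates and log every routing decision for the audit trail):

```python
def route(prediction, confidence, auto_threshold=0.95, review_threshold=0.70):
    """Route a model output by confidence: automate, flag for review, or escalate."""
    if confidence >= auto_threshold:
        return ("automate", prediction)   # human-out-of-loop: low-risk, high-volume
    if confidence >= review_threshold:
        return ("review", prediction)     # human-on-the-loop: applied, but sampled/audited
    return ("escalate", None)             # human-in-the-loop: decision deferred to a person

decisions = [route("spam", c) for c in (0.99, 0.80, 0.50)]
```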
## Privacy Considerations
### Data Minimization
- Collect only necessary data
- Anonymize when possible
- Aggregate rather than individual data
- Delete data when no longer needed
### Privacy-Preserving Techniques
- Differential privacy
- Federated learning
- Secure multi-party computation
- Homomorphic encryption
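As a sketch of differential privacy, the Laplace mechanism adds calibrated noise to a query result: for a counting query (sensitivity 1), Laplace noise with scale 1/ε yields ε-differential privacy. The helper below is illustrative, using the fact that a Laplace sample is the difference of two i.i.d. exponential samples:

```python
import random

def dp_count(true_count, epsilon, seed=None):
    """Release a count with Laplace(1/epsilon) noise; a count query has sensitivity 1."""
    rng = random.Random(seed)
    # Difference of two i.i.d. Exp(epsilon) samples is Laplace(0, 1/epsilon).
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise
```

Smaller ε means stronger privacy but noisier answers; the privacy budget must also account for repeated queries, which is where production libraries earn their keep.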
## Environmental Impact
### Considerations
- Training compute requirements
- Inference energy consumption
- Hardware lifecycle
- Data center energy sources
### Mitigation
- Efficient architectures
- Model distillation
- Transfer learning
- Green hosting providers
## Reference Files
- `references/bias_assessment.md` - Detailed bias evaluation methodology
- `references/regulatory_compliance.md` - AI regulation requirements
## Integration with Other Skills
- `machine-learning` - For model development
- `testing` - For bias testing
- `documentation` - For model cards