| name | ethics-safety-impact |
| description | Use when decisions could affect groups differently and need to anticipate harms/benefits, assess fairness and safety concerns, identify vulnerable populations, propose risk mitigations, define monitoring metrics, or when user mentions ethical review, impact assessment, differential harm, safety analysis, vulnerable groups, bias audit, or responsible AI/tech. |
Ethics, Safety & Impact Assessment
Table of Contents
- Purpose
- When to Use
- What Is It?
- Workflow
- Common Patterns
- Guardrails
- Quick Reference
Purpose
Ethics, Safety & Impact Assessment provides a structured framework for identifying potential harms, benefits, and differential impacts before launching features, implementing policies, or making decisions that affect people. This skill guides you through stakeholder identification, harm/benefit analysis, fairness evaluation, risk mitigation design, and ongoing monitoring to ensure responsible and equitable outcomes.
When to Use
Use this skill when:
- Product launches: New features, algorithm changes, UI redesigns that affect user experience or outcomes
- Policy decisions: Terms of service updates, content moderation rules, data usage policies, pricing changes
- Data & AI systems: Training models, deploying algorithms, using sensitive data, automated decision-making
- Platform changes: Recommendation systems, search ranking, feed algorithms, matching/routing logic
- Access & inclusion: Features affecting accessibility, vulnerable populations, underrepresented groups, global markets
- Safety-critical systems: Health, finance, transportation, security applications where errors have serious consequences
- High-stakes decisions: Hiring, lending, admissions, criminal justice, insurance where outcomes significantly affect lives
- Content & communication: Moderation policies, fact-checking systems, content ranking, amplification rules
Trigger phrases: "ethical review", "impact assessment", "who might be harmed", "differential impact", "vulnerable populations", "bias audit", "fairness check", "safety analysis", "responsible AI", "unintended consequences"
What Is It?
Ethics, Safety & Impact Assessment is a proactive evaluation framework that systematically examines:
- Who is affected (stakeholder mapping, vulnerable groups)
- What could go wrong (harm scenarios, failure modes)
- Why it matters (severity, likelihood, distribution of impacts)
- How to mitigate (design changes, safeguards, monitoring)
- When to escalate (triggers, thresholds, review processes)
Core ethical principles:
- Fairness: Equal treatment, non-discrimination, equitable outcomes across groups
- Autonomy: User choice, informed consent, control over data and experience
- Beneficence: Maximize benefits, design for positive impact
- Non-maleficence: Minimize harms, "do no harm" as baseline
- Transparency: Explain decisions, disclose limitations, build trust
- Accountability: Clear ownership, redress mechanisms, audit trails
- Privacy: Data protection, confidentiality, purpose limitation
- Justice: Equitable distribution of benefits and burdens, address historical inequities
Quick example:
Scenario: Launching a credit-scoring algorithm for loan approvals
Ethical impact assessment:
Stakeholders affected: Loan applicants (diverse demographics), lenders, society (economic mobility)
Potential harms:
- Disparate impact: Algorithm trained on historical data may perpetuate bias against protected groups (race, gender, age)
- Opacity: Applicants denied loans without explanation, cannot contest decision
- Feedback loops: Denying loans to disadvantaged groups → lack of credit history → continued denials
- Economic harm: Incorrect denials prevent wealth building, perpetuate poverty
Vulnerable groups: Racial minorities historically discriminated against in lending, immigrants with thin credit files, young adults, people in poverty
Mitigations:
- Fairness audit: Test for disparate impact and equalized odds across protected classes
- Explainability: Provide reason codes (top 3 factors), allow appeals
- Alternative data: Include rent and utility payments to expand access
- Human review: Flag edge cases for manual review, override capability
- Regular monitoring: Track approval rates by demographic, quarterly bias audits
Monitoring & escalation:
- Metrics: Approval rate parity (within 10% across groups), false positive/negative rates, appeal overturn rate
- Triggers: If disparate impact >20%, escalate to ethics committee
- Review: Quarterly fairness audits, annual independent assessment
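To make the monitoring thresholds above concrete, here is a minimal sketch of the parity check. The group names and counts are illustrative placeholders, and because the example does not say whether "10%"/"20%" mean percentage points or ratios, the sketch assumes percentage-point gaps in approval rate.

```python
# Minimal sketch of the monitoring check in the lending example.
# Group names and counts are illustrative placeholders.

approval_rates = {
    "group_a": 420 / 1000,  # 42% of applicants approved
    "group_b": 310 / 1000,  # 31% of applicants approved
}
reference = max(approval_rates.values())

for group, rate in approval_rates.items():
    gap = reference - rate    # percentage-point gap (assumption)
    ratio = rate / reference  # the "four-fifths rule" flags ratios < 0.8
    if gap > 0.20:
        print(f"{group}: gap {gap:.0%} > 20% -> escalate to ethics committee")
    elif gap > 0.10:
        print(f"{group}: gap {gap:.0%} > 10% parity target -> investigate")
    else:
        print(f"{group}: within parity target (impact ratio {ratio:.2f})")
```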
Workflow
Copy this checklist and track your progress:
Ethics & Safety Assessment Progress:
- [ ] Step 1: Map stakeholders and identify vulnerable groups
- [ ] Step 2: Analyze potential harms and benefits
- [ ] Step 3: Assess fairness and differential impacts
- [ ] Step 4: Evaluate severity and likelihood
- [ ] Step 5: Design mitigations and safeguards
- [ ] Step 6: Define monitoring and escalation protocols
Step 1: Map stakeholders and identify vulnerable groups
Identify all affected parties (direct users, indirect parties, society at large). Prioritize the vulnerable populations most at risk. See resources/template.md for the stakeholder analysis framework.
Step 2: Analyze potential harms and benefits
Brainstorm what could go wrong (harms) and what value is created (benefits) for each stakeholder group. See resources/template.md for structured analysis.
Step 3: Assess fairness and differential impacts
Evaluate whether outcomes, treatment, or access differ across groups. Check for disparate impact. See resources/methodology.md for fairness criteria and measurement.
Step 4: Evaluate severity and likelihood
Score each harm on severity (1-5) and likelihood (1-5), and prioritize high-risk combinations; a minimal scoring sketch follows. See resources/template.md for the prioritization framework.
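The sketch below illustrates this scoring step; the harms and scores are illustrative placeholders.

```python
# Minimal sketch: prioritize harms by severity x likelihood (each 1-5).
# Harm entries and scores are illustrative placeholders.
harms = [
    {"harm": "disparate impact on protected groups", "severity": 5, "likelihood": 3},
    {"harm": "opaque denials without recourse",      "severity": 3, "likelihood": 4},
    {"harm": "data breach of applicant records",     "severity": 4, "likelihood": 2},
]

for h in harms:
    h["risk"] = h["severity"] * h["likelihood"]  # risk score in 1..25

# Highest-risk combinations first; these get mitigations in Step 5.
for h in sorted(harms, key=lambda h: h["risk"], reverse=True):
    print(f'{h["risk"]:>2}  {h["harm"]}')
```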
Step 5: Design mitigations and safeguards
For high-priority harms, propose design changes, policy safeguards, oversight mechanisms. See resources/methodology.md for intervention types.
Step 6: Define monitoring and escalation protocols
Set metrics, thresholds, review cadence, and escalation triggers. Validate using resources/evaluators/rubric_ethics_safety_impact.json (a hedged validation sketch follows). Minimum standard: Average score ≥ 3.5.
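This document does not show the rubric's JSON schema, so the validation sketch below assumes a simple structure (a top-level "criteria" list whose entries each carry a numeric "score"); adjust the field names to match the actual file.

```python
# Hedged sketch: check the assessment against the minimum standard in Step 6.
# ASSUMPTION: the rubric JSON has a top-level "criteria" list whose entries
# each carry a numeric "score"; the real schema may differ.
import json

with open("resources/evaluators/rubric_ethics_safety_impact.json") as f:
    rubric = json.load(f)

scores = [criterion["score"] for criterion in rubric["criteria"]]
average = sum(scores) / len(scores)

if average >= 3.5:  # minimum standard from Step 6
    print(f"Average score {average:.2f} meets the minimum standard")
else:
    print(f"Average score {average:.2f} is below 3.5 -- revise before sign-off")
```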
Common Patterns
Pattern 1: Algorithm Fairness Audit
- Stakeholders: Users receiving algorithmic decisions (hiring, lending, content ranking), protected groups
- Harms: Disparate impact (bias against protected classes), feedback loops amplifying inequality, opacity preventing accountability
- Assessment: Test for demographic parity, equalized odds, calibration across groups; analyze training data for historical bias
- Mitigations: Debiasing techniques, fairness constraints, explainability, human review for edge cases, regular audits
- Monitoring: Disparate impact ratio, false positive/negative rates by group, user appeals and overturn rates
Pattern 2: Data Privacy & Consent
- Stakeholders: Data subjects (users whose data is collected), vulnerable groups (children, marginalized communities)
- Harms: Privacy violations, surveillance, data breaches, lack of informed consent, secondary use without permission, re-identification risk
- Assessment: Map data flows (collection → storage → use → sharing), identify sensitive attributes (PII, health, location), consent adequacy
- Mitigations: Data minimization (collect only necessary), anonymization/differential privacy (see the sketch after this pattern), granular consent, user data controls (export, delete), encryption
- Monitoring: Breach incidents, data access logs, consent withdrawal rates, user data requests (GDPR, CCPA)
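As referenced in the mitigations above, here is a minimal sketch of the textbook Laplace mechanism for releasing a count with epsilon-differential privacy; the epsilon value and the count are illustrative.

```python
# Minimal sketch: textbook Laplace mechanism for a counting query.
# Epsilon and the example count are illustrative placeholders.
import numpy as np

def dp_count(true_count: int, epsilon: float, sensitivity: float = 1.0) -> float:
    """Release a count with epsilon-differential privacy.
    Counting queries have sensitivity 1: adding or removing one person
    changes the true count by at most 1."""
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_count + noise

# Smaller epsilon = stronger privacy but noisier answers.
print(dp_count(true_count=1234, epsilon=1.0))
```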
Pattern 3: Content Moderation & Free Expression
- Stakeholders: Content creators, viewers, vulnerable groups (targets of harassment), society (information integrity)
- Harms: Over-moderation (silencing legitimate speech, especially marginalized voices), under-moderation (allowing harm, harassment, misinformation), inconsistent enforcement
- Assessment: Analyze moderation error rates (false positives/negatives), differential enforcement across groups, cultural context sensitivity
- Mitigations: Clear policies with examples, appeals process, human review, diverse moderators, cultural context training, transparency reports
- Monitoring: Moderation volume and error rates by category, appeal overturn rates, disparate enforcement across languages/regions
Pattern 4: Accessibility & Inclusive Design
- Stakeholders: Users with disabilities (visual, auditory, motor, cognitive), elderly, low-literacy, low-bandwidth users
- Harms: Exclusion (cannot use product), degraded experience, safety risks (cannot access critical features), digital divide
- Assessment: WCAG compliance audit (one automatable check is sketched after this pattern), assistive technology testing, user research with diverse abilities, cross-cultural usability
- Mitigations: Accessible design (WCAG AA/AAA), alt text, keyboard navigation, screen reader support, low-bandwidth mode, multi-language, plain language
- Monitoring: Accessibility test coverage, user feedback from disability communities, task completion rates across abilities
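As referenced in the assessment above, here is a minimal sketch of one automatable WCAG check, flagging <img> tags with no alt attribute (success criterion 1.1.1), using Python's standard-library HTML parser. Automated checks catch only a fraction of accessibility issues; they complement, not replace, assistive-technology testing and user research.

```python
# Minimal sketch: flag <img> tags that lack an alt attribute (WCAG 1.1.1).
# Note: alt="" is valid for decorative images, so only a missing attribute
# is flagged, not an empty one.
from html.parser import HTMLParser

class AltTextChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img" and "alt" not in attributes:
            self.missing.append(attributes.get("src", "<no src>"))

checker = AltTextChecker()
checker.feed('<img src="chart.png"><img src="logo.png" alt="Company logo">')
print("Images missing alt text:", checker.missing)  # ['chart.png']
```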
Pattern 5: Safety-Critical Systems
- Stakeholders: End users (patients, drivers, operators), vulnerable groups (children, elderly, compromised health), public safety
- Harms: Physical harm (injury, death), psychological harm (trauma), property damage, cascade failures affecting many
- Assessment: Failure mode analysis (FMEA; see the RPN sketch after this pattern), fault tree analysis, worst-case scenarios, edge cases that break assumptions
- Mitigations: Redundancy, fail-safes, human oversight, rigorous testing (stress, chaos, adversarial), incident response plans, staged rollouts
- Monitoring: Error rates, near-miss incidents, safety metrics (accidents, adverse events), user-reported issues, compliance audits
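As referenced in the assessment above, here is a minimal FMEA-style sketch: rank failure modes by risk priority number (RPN = severity × occurrence × detection, each scored 1-10). The failure modes and scores are illustrative placeholders.

```python
# Minimal sketch: FMEA-style prioritization by risk priority number.
# Failure modes and scores are illustrative placeholders.
failure_modes = [
    # (failure mode, severity, occurrence, detection)
    # detection: 1 = almost certainly caught, 10 = almost certainly missed
    ("sensor dropout during operation",   9, 3, 4),
    ("stale data served after an outage", 6, 5, 7),
    ("silent rounding error in dosing",  10, 2, 9),
]

ranked = sorted(failure_modes, key=lambda m: m[1] * m[2] * m[3], reverse=True)
for mode, s, o, d in ranked:
    print(f"RPN {s * o * d:>3}  {mode}")
```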
Guardrails
Critical requirements:
Identify vulnerable groups explicitly: Not all stakeholders are equally at risk. Prioritize: children, elderly, people with disabilities, marginalized/discriminated groups, low-income, low-literacy, geographically isolated, politically targeted. If none identified, you're probably missing them.
Consider second-order and long-term effects: First-order obvious harms are just the start. Look for: feedback loops (harm → disadvantage → more harm), normalization (practice becomes standard), precedent (enables worse future behavior), accumulation (small harms compound over time). Ask "what happens next?"
Assess differential impact, not just average: A feature may help the average user but harm specific groups. Check: disparate impact (outcome differences across groups >20% = red flag), intersectionality (combinations of identities may face unique harms), distributive justice (who gets benefits vs. burdens?).
Design mitigations before launch, not after harm: Reactive fixes are too late for those already harmed. Proactive: Build safeguards into design, test with diverse users, staged rollout with monitoring, kill switches, pre-commit to audits. "Move fast and break things" is unethical for systems affecting people's lives.
Provide transparency and recourse: People affected have a right to know and to contest. Minimum: explain decisions (what factors, why this outcome), offer an appeal mechanism (human review, overturn if wrong), provide redress (compensate for harm), and keep audit trails (to investigate complaints). Opacity is often a sign of hidden bias or risk.
Monitor outcomes, not just intentions: Good intentions don't prevent harm. Measure actual impacts: outcome disparities by group, user-reported harms, error rates and their distribution, unintended consequences. Set thresholds that trigger review/shutdown.
Establish clear accountability and escalation: Assign ownership. Define: Who reviews ethics risks before launch? Who monitors post-launch? What triggers escalation? Who can halt harmful features? Document decisions and rationale for later review.
Respect autonomy and consent: Users deserve informed choice (understand what they're agreeing to, in plain language), meaningful alternatives (consent that is not coerced), control (opt out, delete data, configure settings), and purpose limitation (data used only for the stated purpose). Children and vulnerable groups need extra protections.
Common pitfalls:
- ❌ Assuming "we treat everyone the same" = fairness: Equal treatment of unequal groups perpetuates inequality. Fairness often requires differential treatment.
- ❌ Optimization without constraints: Maximizing engagement/revenue unconstrained leads to amplifying outrage, addiction, polarization. Set ethical boundaries.
- ❌ Moving fast and apologizing later: For safety/ethics, prevention > apology. Harms to vulnerable groups are not acceptable experiments.
- ❌ Privacy theater: Requiring consent without explaining risks, or making consent mandatory for service, is not meaningful consent.
- ❌ Sampling bias in testing: Testing only on employees (young, educated, English-speaking) misses how diverse users experience harm.
- ❌ Ethics washing: Performative statements without material changes. Impact assessments must change decisions, not just document them.
Quick Reference
Key resources:
- resources/template.md: Stakeholder mapping, harm/benefit analysis, risk matrix, mitigation planning, monitoring framework
- resources/methodology.md: Fairness metrics, privacy analysis, safety assessment, bias detection, participatory design
- resources/evaluators/rubric_ethics_safety_impact.json: Quality criteria for stakeholder analysis, harm identification, mitigation design, monitoring
Stakeholder Priorities:
High-risk groups to always consider:
- Children (<18, especially <13)
- People with disabilities (visual, auditory, motor, cognitive)
- Racial/ethnic minorities, especially groups with histories of discrimination
- Low-income, unhoused, financially precarious
- LGBTQ+, especially in hostile jurisdictions
- Elderly (>65), especially those with limited digital skills
- Non-English speakers, low-literacy
- Political dissidents, activists, journalists in repressive contexts
- Refugees, immigrants, undocumented
- People with mental illness or cognitive impairment
Harm Categories:
- Physical: Injury, death, health deterioration
- Psychological: Trauma, stress, anxiety, depression, addiction
- Economic: Lost income, debt, poverty, exclusion from opportunity
- Social: Discrimination, harassment, ostracism, loss of relationships
- Autonomy: Coercion, manipulation, loss of control, dignity violation
- Privacy: Surveillance, exposure, data breach, re-identification
- Reputational: Stigma, defamation, loss of standing
- Epistemic: Misinformation, loss of knowledge access, filter bubbles
- Political: Disenfranchisement, censorship, targeted repression
Fairness Definitions (choose the one appropriate to the context; a computation sketch follows this list):
- Demographic parity: Outcome rates equal across groups (e.g., 40% approval rate for all)
- Equalized odds: False positive and false negative rates equal across groups
- Equal opportunity: True positive rate equal across groups (equal access to benefit)
- Calibration: Predicted probabilities match observed frequencies for all groups
- Individual fairness: Similar individuals treated similarly (Lipschitz condition)
- Counterfactual fairness: Outcome same if sensitive attribute (race, gender) were different
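As noted above, here is a minimal sketch computing three of these definitions from per-group confusion matrices: selection rate (demographic parity), true positive rate (equal opportunity), and TPR plus FPR together (equalized odds). All counts are illustrative placeholders.

```python
# Minimal sketch: per-group rates behind three fairness definitions.
# Confusion-matrix counts are illustrative placeholders.

def rates(tp: int, fp: int, tn: int, fn: int) -> dict:
    total = tp + fp + tn + fn
    return {
        "selection_rate": (tp + fp) / total,  # demographic parity
        "tpr": tp / (tp + fn),                # equal opportunity
        "fpr": fp / (fp + tn),                # with tpr: equalized odds
    }

groups = {
    "group_a": rates(tp=80, fp=20, tn=70, fn=30),
    "group_b": rates(tp=50, fp=10, tn=110, fn=30),
}

for metric in ("selection_rate", "tpr", "fpr"):
    a, b = groups["group_a"][metric], groups["group_b"][metric]
    print(f"{metric}: group_a={a:.2f} group_b={b:.2f} gap={abs(a - b):.2f}")
```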
Mitigation Strategies:
- Prevent: Design change eliminates harm (e.g., don't collect sensitive data)
- Reduce: Decrease likelihood or severity (e.g., rate limiting, friction for risky actions)
- Detect: Monitor and alert when harm occurs (e.g., bias dashboard, anomaly detection)
- Respond: Process to address harm when found (e.g., appeals, human review, compensation)
- Safeguard: Redundancy, fail-safes, circuit breakers for critical failures
- Transparency: Explain, educate, build understanding and trust
- Empower: Give users control, choice, ability to opt out or customize
Monitoring Metrics:
- Outcome disparities: Measure by protected class (approval rates, error rates, treatment quality)
- Error distribution: False positives/negatives, who bears burden?
- User complaints: Volume, categories, resolution rates, disparities
- Engagement/retention: Differences across groups (are some excluded?)
- Safety incidents: Volume, severity, affected populations
- Consent/opt-outs: How many decline? Demographics of decliners?
Escalation Triggers:
- Disparate impact >20% without justification
- Safety incidents causing serious harm (injury, death)
- Vulnerable group disproportionately affected (>2× harm rate)
- User complaints spike (>2× baseline)
- Press/regulator attention
- Internal ethics concerns raised
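A minimal sketch of how these triggers might be checked automatically; the thresholds come from the list above, while the metric values are illustrative placeholders.

```python
# Minimal sketch: evaluate escalation triggers against monitoring metrics.
# Thresholds are from the list above; metric values are illustrative.
metrics = {
    "disparate_impact_gap": 0.22,   # outcome gap between groups
    "vulnerable_harm_ratio": 1.4,   # harm rate vs. overall population
    "complaint_ratio": 2.5,         # complaint volume vs. baseline
    "serious_safety_incidents": 0,
}

triggers = {
    "disparate_impact_gap": lambda v: v > 0.20,
    "vulnerable_harm_ratio": lambda v: v > 2.0,
    "complaint_ratio": lambda v: v > 2.0,
    "serious_safety_incidents": lambda v: v > 0,
}

fired = [name for name, test in triggers.items() if test(metrics[name])]
if fired:
    print("Escalate:", ", ".join(fired))
# -> Escalate: disparate_impact_gap, complaint_ratio
```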
When to escalate beyond this skill:
- Legal compliance required (GDPR, ADA, Civil Rights Act, industry regulations)
- Life-or-death safety-critical system (medical, transportation)
- Children or vulnerable populations primary users
- High controversy or political salience
- Novel ethical terrain (new technology, no precedent)
For any of these, consult: legal counsel, ethics board, domain experts, affected communities, regulators
Inputs required:
- Feature or decision (what is being proposed? what changes?)
- Affected groups (who is impacted? direct and indirect?)
- Context (what problem does this solve? why now?)
Outputs produced:
ethics-safety-impact.md: Stakeholder analysis, harm/benefit assessment, fairness evaluation, risk prioritization, mitigation plan, monitoring framework, escalation protocol