| name | entity-resolution |
| description | Identifies and merges duplicate entities in Knowledge Graph projects. Use after extraction to consolidate duplicate entities, or when users ask about potential duplicates. Supports automatic merging for high-confidence matches and user confirmation for medium-confidence candidates. |
Entity Resolution Skill
Identifies and merges duplicate entities in Knowledge Graph projects.
When to Use
- Proactively after extraction: "I extracted 15 entities. Let me check for potential duplicates..."
- On user request: "Can you check for duplicates?" or "These seem like the same person"
- When graph seems noisy: Multiple similar-looking nodes that may be the same entity
Workflow
1. Scan for Duplicates
Use find_duplicate_entities with project_id
The tool uses multiple signals to detect duplicates:
- String similarity (Jaro-Winkler on labels)
- Alias overlap (Jaccard similarity)
- Type matching (same entity type bonus)
- Graph context (shared neighbors)
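A minimal sketch of how the four signals above might be combined into one confidence score. The weights, the use of Jaccard for shared neighbors, and the function names are assumptions made for illustration; the actual scoring inside find_duplicate_entities is not documented here. The label similarity (for example a Jaro-Winkler score) is passed in as a precomputed number.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |a & b| / |a | b|; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical weights, chosen only for illustration; they sum to 1.0.
WEIGHTS = {"label": 0.5, "alias": 0.2, "type": 0.1, "context": 0.2}


def combined_confidence(label_sim, aliases_a, aliases_b,
                        same_type, neighbors_a, neighbors_b) -> float:
    """Weighted sum of the four signals, each scaled to the range 0.0-1.0."""
    signals = {
        "label": label_sim,                            # e.g. Jaro-Winkler on labels
        "alias": jaccard(aliases_a, aliases_b),        # alias overlap
        "type": 1.0 if same_type else 0.0,             # same entity type bonus
        "context": jaccard(neighbors_a, neighbors_b),  # shared neighbors (assumed Jaccard)
    }
    return sum(WEIGHTS[name] * value for name, value in signals.items())
```

With these illustrative weights, two entities that agree on every signal score close to 1.0, while a pair that matches only on type scores 0.1.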
2. Apply Confidence-Based Automation
| Confidence | Action |
|---|---|
| >= 0.9 (HIGH) | Auto-merge with merge_entities_tool. Inform user: "I merged X and Y (95% confident they're the same)" |
| 0.7 to < 0.9 (MEDIUM) | Ask user: "I found potential duplicates: X and Y (82% match). Should I merge them?" |
| < 0.7 (LOW) | Mention if relevant: "X and Y might be related but confidence is low (65%)" |
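The thresholds in the table can be expressed as a small classifier. This is a sketch only: the function and constant names are hypothetical, and it assumes that a score of exactly 0.9 counts as HIGH and exactly 0.7 as MEDIUM.

```python
HIGH_THRESHOLD = 0.9    # at or above: auto-merge
MEDIUM_THRESHOLD = 0.7  # at or above (and below HIGH): ask the user


def classify(confidence: float) -> str:
    """Map a confidence score in [0.0, 1.0] to HIGH, MEDIUM, or LOW."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    if confidence >= HIGH_THRESHOLD:
        return "HIGH"
    if confidence >= MEDIUM_THRESHOLD:
        return "MEDIUM"
    return "LOW"
```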
3. Handle User Decisions
- If user approves: approve_merge or merge_entities_tool
- If user rejects: reject_merge
- If user wants more info: compare_entities_semantic for detailed analysis
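One way to route a user decision to a tool name is sketched below. The Decision enum and the is_pending flag are hypothetical; the choice between approve_merge and merge_entities_tool assumes that approve_merge applies when the pair is already a pending candidate and merge_entities_tool applies otherwise, which matches the tool descriptions but is not stated as a rule.

```python
from enum import Enum


class Decision(Enum):
    APPROVE = "approve"
    REJECT = "reject"
    MORE_INFO = "more_info"


def tool_for_decision(decision: Decision, is_pending: bool = True) -> str:
    """Return the name of the tool to call for a given user decision."""
    if decision is Decision.APPROVE:
        # Assumption: pending candidates are approved, anything else is merged directly.
        return "approve_merge" if is_pending else "merge_entities_tool"
    if decision is Decision.REJECT:
        return "reject_merge"
    return "compare_entities_semantic"
```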
Proactive Triggers
After any extract_to_kg operation, automatically:
- Call find_duplicate_entities
- Auto-merge HIGH confidence matches without asking, then tell the user what was merged
- Report MEDIUM confidence matches to user
- Mention LOW confidence only if user asks
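The reporting half of this routine could look like the sketch below. It only formats scan results into user-facing lines and does not call any tool; the Candidate type, the scan_report function, and the include_low flag are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    label_a: str
    label_b: str
    confidence: float  # 0.0 to 1.0


def scan_report(candidates, include_low: bool = False) -> list:
    """Build one summary line per candidate, highest confidence first."""
    lines = []
    for c in sorted(candidates, key=lambda c: c.confidence, reverse=True):
        pair = f"'{c.label_a}' and '{c.label_b}'"
        pct = f"{c.confidence:.0%}"
        if c.confidence >= 0.9:
            lines.append(f"{pair} ({pct} - auto-merged)")
        elif c.confidence >= 0.7:
            lines.append(f"{pair} ({pct} - awaiting your approval)")
        elif include_low:
            # LOW matches are listed only when the user asked for them.
            lines.append(f"{pair} ({pct} - low confidence, skipped)")
    return lines
```

Calling scan_report with include_low=False leaves LOW matches out of the summary, so they surface only on request.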
Example Responses
After extraction:
"I extracted 12 entities and 8 relationships. I noticed 'Elon Musk' and 'Musk' appear to be the same person (94% confidence), so I merged them. I also found 'SpaceX' and 'Space X' might be duplicates (78% confidence) - should I merge these too?"
On duplicate scan:
"I scanned for duplicates and found 3 potential matches:
- 'OpenAI' and 'Open AI' (91% - auto-merged)
- 'Sam Altman' and 'Samuel Altman' (85% - awaiting your approval)
- 'Microsoft' and 'MS' (68% - low confidence, skipped)"
When comparing entities:
"Comparing 'Dr. John Smith' and 'J. Smith':
| Signal | Score |
|---|---|
| Name similarity | 72% |
| Same type (Person) | Yes |
| Shared connections | 3 |
Overall: 78% match. They share connections to MIT and OpenAI. Would you like to merge them?"
Available Tools
| Tool | Description |
|---|---|
| find_duplicate_entities | Scan for duplicates in a project |
| merge_entities_tool | Execute a merge directly (for high confidence) |
| review_pending_merges | See pending candidates awaiting approval |
| approve_merge | Approve a pending candidate |
| reject_merge | Reject a pending candidate |
| compare_entities_semantic | Deep comparison of two specific entities |
Tool Parameters
find_duplicate_entities
{
"project_id": "abc123",
"min_confidence": 0.7
}
merge_entities_tool
{
"project_id": "abc123",
"survivor_id": "node_to_keep",
"merged_id": "node_to_remove"
}
compare_entities_semantic
{
"project_id": "abc123",
"node_a_id": "first_entity_id",
"node_b_id": "second_entity_id"
}
Merge Behavior
When entities are merged:
- Survivor keeps its primary label
- Merged entity's label becomes an alias of survivor
- All aliases transfer to survivor
- All relationships redirect to survivor
- Properties merge (survivor wins on conflict)
- Source IDs combine for provenance tracking
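The merge rules above can be illustrated with a small in-memory model. This is a sketch under assumptions, not the behavior of merge_entities_tool itself: the Entity fields, the (source, relation, target) edge tuples, and the decisions to drop self-loops and duplicate edges after redirection are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    id: str
    label: str
    aliases: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)
    source_ids: set = field(default_factory=set)


def merge(survivor: Entity, merged: Entity, edges: list):
    """Return the merged survivor and the edge list redirected to it."""
    result = Entity(
        id=survivor.id,
        label=survivor.label,  # survivor keeps its primary label
        # merged label and all aliases move to the survivor
        aliases=(survivor.aliases | merged.aliases | {merged.label}) - {survivor.label},
        # later keys win in a dict merge, so survivor values override on conflict
        properties={**merged.properties, **survivor.properties},
        source_ids=survivor.source_ids | merged.source_ids,
    )
    redirected = []
    for src, rel, dst in edges:
        src = survivor.id if src == merged.id else src
        dst = survivor.id if dst == merged.id else dst
        if src != dst:  # assumption: drop self-loops created by the merge
            redirected.append((src, rel, dst))
    # assumption: collapse edges that became identical, keeping first-seen order
    return result, list(dict.fromkeys(redirected))
```

Note that a conflicting property value on the merged entity is overridden by the survivor's value, so choosing which node survives matters when the two disagree.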
Error Handling
| Issue | Response |
|---|---|
| No project selected | "Please select a Knowledge Graph project first" |
| Empty graph | "Your graph doesn't have any entities yet. Extract content first" |
| No duplicates found | "No potential duplicates found above the confidence threshold" |
| Entity not found | "Entity 'X' was not found. It may have been merged or deleted" |
Follow-Up Suggestions Format
After presenting duplicate scan results, offer interactive follow-ups:
### Explore Further
- "Merge Sam Altman and Samuel Altman" - Merge these entities
- "Compare Sam Altman and Samuel Altman" - See detailed similarity analysis
- "Reject the Sam Altman merge" - Keep them as separate entities
- "Show me all pending merges" - Review all candidates
Integration with KG Insights
After merging entities, the graph may reveal new insights:
- "After merging, [Entity] is now connected to 5 more entities"
- "The merge resolved an isolated topic - [Entity] now links to the main graph"
- "Consider running ask_about_graph with question_type: key_entities to see updated rankings"
Best Practices
- Be transparent - Always explain what was merged and why
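- Preserve information - Merged labels become aliases and source IDs are combined; only property values that conflict with the survivor's are overridden
- Ask when uncertain - Only auto-merge at or above 90% confidence; below that, ask the user first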
- Show evidence - Include signal breakdown for user decisions
- Suggest next steps - Offer to scan again or explore the updated graph