| name | Suggest Reviewers Helper |
| description | Git blame analysis helper for the suggest-reviewers command |
Suggest Reviewers Helper
This skill provides a Python helper script that analyzes git blame data for the /git:suggest-reviewers command. The script handles the complex task of identifying which lines were changed and who authored the original code.
When to Use This Skill
Use this skill when implementing the /git:suggest-reviewers command. The helper script should be invoked during Step 3 of the command implementation (analyzing git blame for changed lines).
DO NOT implement git blame analysis manually - always use the provided analyze_blame.py script.
Prerequisites
- Python 3.6 or higher
- Git repository with commit history
- Git CLI available in PATH
Helper Script: analyze_blame.py
The analyze_blame.py script automates the complex process of:
- Parsing git diff output to identify specific line ranges that were modified
- Running git blame on only the changed line ranges (not entire files)
- Extracting and aggregating author information with statistics
- Filtering out bot accounts automatically
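The first item above, turning diff output into line ranges, could look roughly like the sketch below. It is not the actual contents of analyze_blame.py. The function name changed_ranges is hypothetical, and the sketch assumes the diff was produced with zero context lines (git diff -U0) so that each hunk covers only modified lines.

```python
import re

# Unified-diff hunk header, e.g. "@@ -10,2 +12,3 @@ func main()".
# The "+start,count" half describes the new side of the file.
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@")


def changed_ranges(diff_text):
    """Return (start, end) line ranges on the new side of each hunk.

    Hypothetical helper: assumes the diff was generated with -U0.
    """
    ranges = []
    for line in diff_text.split("\n"):
        match = HUNK_RE.match(line)
        if not match:
            continue
        start = int(match.group(1))
        # A missing count in the header means exactly one line.
        count = int(match.group(2)) if match.group(2) is not None else 1
        if count == 0:
            continue  # pure deletion: nothing on the new side to blame
        ranges.append((start, start + count - 1))
    return ranges
```

Each returned pair maps directly onto a git blame -L start,end invocation.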
Usage
For uncommitted changes:
python3 ${CLAUDE_PLUGIN_ROOT}/skills/suggest-reviewers/analyze_blame.py \
--mode uncommitted \
--file path/to/file1.go \
--file path/to/file2.py \
--output json
For committed changes on a feature branch:
python3 ${CLAUDE_PLUGIN_ROOT}/skills/suggest-reviewers/analyze_blame.py \
--mode committed \
--base-branch main \
--file path/to/file1.go \
--file path/to/file2.py \
--output json
Parameters
- --mode: Required. Either uncommitted or committed.
  - uncommitted: Analyzes unstaged/staged changes against HEAD
  - committed: Analyzes committed changes against a base branch
- --base-branch: Required when mode is committed. The base branch to compare against (e.g., main, master)
- --file: Can be specified multiple times. Each file to analyze for blame information. Only changed files should be passed.
- --output: Output format. Default is json. Options:
  - json: Machine-readable JSON output
  - text: Human-readable text output
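The parameter rules could be expressed with argparse as in the sketch below. This is an assumption about how such a command line might be parsed, not a copy of analyze_blame.py. The names build_parser and parse_args are hypothetical, and argparse's own message for an invalid --mode differs from the wording shown under Error Handling.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented flags."""
    parser = argparse.ArgumentParser(prog="analyze_blame.py")
    parser.add_argument("--mode", required=True,
                        choices=["uncommitted", "committed"])
    parser.add_argument("--base-branch")
    # --file may be repeated; each occurrence appends to args.files.
    parser.add_argument("--file", action="append", dest="files")
    parser.add_argument("--output", choices=["json", "text"], default="json")
    return parser


def parse_args(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    # Cross-argument rules that argparse cannot express declaratively.
    if args.mode == "committed" and not args.base_branch:
        parser.error("--base-branch is required when mode is 'committed'.")
    if not args.files:
        parser.error("No files specified. Use --file option at least once.")
    return args
```

parser.error prints the message to stderr and exits with status 2, so callers see a non-zero exit code for every documented misuse.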
Output Format (JSON)
{
"Author Name": {
"line_count": 45,
"most_recent_date": "2024-10-15T14:23:10",
"files": ["file1.go", "file2.go"],
"email": "author@example.com"
},
"Another Author": {
"line_count": 23,
"most_recent_date": "2024-09-20T09:15:33",
"files": ["file3.py"],
"email": "another@example.com"
}
}
Output Fields
- line_count: Total number of modified lines authored by this person
- most_recent_date: ISO 8601 timestamp of their most recent contribution to the changed code
- files: Array of files where this author has contributions in the changed lines
- email: Author's email address from git commits
Bot Filtering
The script automatically filters out common bot accounts:
- GitHub bots (e.g., dependabot[bot], renovate[bot])
- CI bots (e.g., openshift-ci-robot, k8s-ci-robot)
- Generic bot patterns (any name containing [bot] or ending in -bot)
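A check along these lines would cover the patterns listed above. It is a minimal sketch, not the script's actual filter. The names KNOWN_BOTS, BOT_PATTERN, and is_bot are hypothetical, and the case-insensitive matching is an assumption.

```python
import re

# CI bot names given as examples in this section.
KNOWN_BOTS = {"openshift-ci-robot", "k8s-ci-robot"}
# Generic patterns: contains "[bot]" or ends in "-bot".
BOT_PATTERN = re.compile(r"\[bot\]|-bot$", re.IGNORECASE)


def is_bot(author_name):
    """Return True when the author name looks like a bot account."""
    name = author_name.strip()
    return name.lower() in KNOWN_BOTS or bool(BOT_PATTERN.search(name))
```

A project with custom bots would need to extend KNOWN_BOTS, since names that match neither the set nor the generic patterns pass through unfiltered.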
Implementation Steps
Step 1: Collect changed files
Before invoking the script, collect the list of changed files based on the scenario:
Uncommitted changes:
# Get staged and unstaged files, de-duplicated (a file can appear in both lists)
files=$( { git diff --name-only --diff-filter=d HEAD; git diff --name-only --diff-filter=d --cached; } | sort -u )
Committed changes:
# Get files changed since the branch diverged from the base branch
files=$(git diff --name-only --diff-filter=d "${base_branch}...HEAD")
Step 2: Invoke the script
Build the command with the appropriate mode and all changed files:
# Build the command as an array so each argument stays a single word
cmd=(python3 "${CLAUDE_PLUGIN_ROOT}/skills/suggest-reviewers/analyze_blame.py")
# Add mode
if [ "$has_uncommitted" = true ] || [ "$on_base_branch" = true ]; then
  cmd+=(--mode uncommitted)
else
  cmd+=(--mode committed --base-branch "$base_branch")
fi
# Add each file ($files is split on whitespace, so paths containing spaces are not supported)
for file in $files; do
  cmd+=(--file "$file")
done
# Add output format
cmd+=(--output json)
# Execute and capture JSON output
blame_data=$("${cmd[@]}")
Step 3: Parse the output
The JSON output can be parsed using Python, jq, or any JSON parser:
# Example using jq to get top contributor
echo "$blame_data" | jq -r 'to_entries | sort_by(-.value.line_count) | .[0].key'
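# Example using Python. The JSON is passed through the environment instead of
# being interpolated into the source, so quotes or backslashes cannot break it.
BLAME_DATA="$blame_data" python3 << 'EOF'
import json
import os
data = json.loads(os.environ["BLAME_DATA"])
# Sort by line count
sorted_authors = sorted(data.items(), key=lambda x: x[1]['line_count'], reverse=True)
for author, stats in sorted_authors:
    print(f"{author}: {stats['line_count']} lines, last modified {stats['most_recent_date']}")
EOF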
Step 4: Combine with OWNERS data
After getting blame data, merge it with OWNERS file information to produce the final ranked list of reviewers.
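This document does not specify the merge policy, so the sketch below shows only one possible ranking: more blamed lines first, with OWNERS membership breaking ties. The function name rank_reviewers is hypothetical. The sketch also assumes OWNERS entries have already been mapped to the same author names that appear in the blame output.

```python
def rank_reviewers(blame_data, owners):
    """Order candidates by blamed lines, preferring OWNERS members on ties.

    Hypothetical policy for illustration only:
      blame_data: dict in the script's JSON shape (name -> stats)
      owners:     iterable of names taken from OWNERS files
    """
    owner_set = set(owners)
    candidates = set(blame_data) | owner_set

    def sort_key(name):
        lines = blame_data.get(name, {}).get("line_count", 0)
        # False sorts before True, so owners win ties on line count.
        return (-lines, name not in owner_set, name)

    return sorted(candidates, key=sort_key)
```

Owners with no blamed lines still appear at the end of the list, so an OWNERS-only suggestion remains possible when blame data is sparse.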
Error Handling
No changed files
If no files are passed to the script:
Error: No files specified. Use --file option at least once.
Resolution: Ensure you've detected changed files correctly before invoking the script.
Invalid mode
If an invalid mode is specified:
Error: Invalid mode 'invalid'. Must be 'uncommitted' or 'committed'.
Resolution: Use either --mode uncommitted or --mode committed.
Missing base branch in committed mode
If --mode committed is used without --base-branch:
Error: --base-branch is required when mode is 'committed'.
Resolution: Provide the base branch: --base-branch main
File not in repository
If a specified file is not tracked by git:
Warning: File 'path/to/file' is not tracked by git, skipping.
Resolution: This is a warning and can be safely ignored. The script will skip untracked files.
No blame data found
If git blame returns no data for any files:
{}
Resolution: This can happen if:
- All changed files are newly created (no blame history)
- All changes are in binary files
- Git blame is unable to run
In this case, fall back to OWNERS-only suggestions.
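A caller might detect the empty result and choose the fallback as sketched below. The names load_blame_data and choose_strategy are hypothetical, and treating blank or malformed output the same as {} is an assumption made for this sketch.

```python
import json


def load_blame_data(raw_output):
    """Parse the script's JSON output; blank or invalid output becomes {}."""
    if not raw_output.strip():
        return {}
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        return {}
    # Anything other than a JSON object is treated as "no data".
    return data if isinstance(data, dict) else {}


def choose_strategy(blame_data):
    """Pick the suggestion strategy based on whether blame data exists."""
    return "blame+owners" if blame_data else "owners-only"
```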
Examples
Example 1: Analyze uncommitted changes
$ python3 analyze_blame.py --mode uncommitted --file src/main.go --file src/utils.go --output json
{
"Alice Developer": {
"line_count": 45,
"most_recent_date": "2024-10-15T14:23:10",
"files": ["src/main.go", "src/utils.go"],
"email": "alice@example.com"
},
"Bob Engineer": {
"line_count": 12,
"most_recent_date": "2024-09-20T09:15:33",
"files": ["src/main.go"],
"email": "bob@example.com"
}
}
Example 2: Analyze committed changes on feature branch
$ python3 analyze_blame.py --mode committed --base-branch main --file pkg/controller/manager.go --output json
{
"Charlie Contributor": {
"line_count": 78,
"most_recent_date": "2024-10-01T11:42:55",
"files": ["pkg/controller/manager.go"],
"email": "charlie@example.com"
}
}
Example 3: Text output format
$ python3 analyze_blame.py --mode uncommitted --file README.md --output text
Blame Analysis Results:
=======================
Alice Developer (alice@example.com)
Lines: 23
Most recent: 2024-10-15T14:23:10
Files: README.md
Bob Engineer (bob@example.com)
Lines: 5
Most recent: 2024-08-12T16:30:21
Files: README.md
Example 4: Multiple files with mixed results
$ python3 analyze_blame.py --mode committed --base-branch release-4.15 \
--file vendor/k8s.io/client-go/kubernetes/clientset.go \
--file pkg/controller/node.go \
--file docs/README.md \
--output json
{
"Diana Developer": {
"line_count": 156,
"most_recent_date": "2024-09-28T13:15:42",
"files": ["vendor/k8s.io/client-go/kubernetes/clientset.go", "pkg/controller/node.go"],
"email": "diana@example.com"
},
"Eve Technical Writer": {
"line_count": 34,
"most_recent_date": "2024-10-10T10:22:18",
"files": ["docs/README.md"],
"email": "eve@example.com"
}
}
Technical Details
How the script works
- Determine diff range: Based on mode, calculates what to compare:
  - uncommitted: Compares working directory against HEAD
  - committed: Compares HEAD against base branch
- Parse diff output: Runs git diff with unified format to identify:
  - Which files changed
  - Which line ranges were added/modified
  - Ignores deleted lines (can't blame what doesn't exist)
- Run git blame: For each file and line range:
  - Runs git blame -L start,end --line-porcelain file
  - Parses porcelain format to extract author, email, and timestamp
  - Aggregates data across all changed lines
- Filter and aggregate:
  - Removes bot accounts
  - Groups by author name
  - Counts total lines per author
  - Tracks most recent contribution date
  - Lists all files each author contributed to
- Output results: Formats as JSON or text based on --output parameter
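The blame-parsing and aggregation steps could be sketched as follows. This is not the script's actual code. The function name aggregate_porcelain is hypothetical, timestamps are rendered in UTC as an assumption, and bot filtering is left out. The sketch relies on the --line-porcelain format, in which each record carries author, author-mail, and author-time headers and ends with the source line prefixed by a tab.

```python
from datetime import datetime, timezone


def aggregate_porcelain(porcelain_text, path, stats=None):
    """Fold `git blame --line-porcelain` output for one file into stats.

    Hypothetical helper; returns a dict shaped like the JSON output above.
    """
    stats = {} if stats is None else stats
    record = {}
    for line in porcelain_text.split("\n"):
        if line.startswith("\t"):
            # The tab-prefixed source line closes one blame record.
            author = record.get("author")
            if author:
                when = datetime.fromtimestamp(record.get("time", 0),
                                              tz=timezone.utc)
                iso = when.strftime("%Y-%m-%dT%H:%M:%S")
                entry = stats.setdefault(author, {
                    "line_count": 0,
                    "most_recent_date": iso,
                    "files": [],
                    "email": record.get("email", ""),
                })
                entry["line_count"] += 1
                # Same-zone ISO strings compare correctly as text.
                entry["most_recent_date"] = max(entry["most_recent_date"], iso)
                if path not in entry["files"]:
                    entry["files"].append(path)
            record = {}
        elif line.startswith("author "):
            record["author"] = line[len("author "):]
        elif line.startswith("author-mail "):
            record["email"] = line[len("author-mail "):].strip("<>")
        elif line.startswith("author-time "):
            record["time"] = int(line[len("author-time "):])
    return stats
```

Passing the same stats dict across several calls accumulates results over multiple files and line ranges.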
Performance considerations
- Only blames changed line ranges, not entire files (much faster for small changes to large files)
- Processes files in parallel when possible
- Caches git commands where appropriate
- Skips binary files automatically
Limitations
- Does not handle file renames/moves (treats as delete + add)
- Bot filtering is based on common patterns; custom bots may not be filtered
- Requires git history; newly initialized repos may not have useful data
- Does not consider commit message content or PR review history
See Also
- Main command: /git:suggest-reviewers in plugins/git/commands/suggest-reviewers.md
- Git blame documentation: https://git-scm.com/docs/git-blame
- Git diff documentation: https://git-scm.com/docs/git-diff