| name | gh-code-search |
| description | Search GitHub for real-world code examples and implementation patterns. Use when user wants to find code examples on GitHub, search GitHub repositories, discover how others implement features, learn library usage patterns, or research architectural approaches. Fetches top results with smart ranking (stars, recency, language), extracts factual data (imports, syntax patterns, metrics), and returns clean markdown for analysis and pattern identification. |
| allowed-tools | Bash(gh auth:*), Bash(pnpm:*) |
GitHub Code Search
Fetch real-world code examples from GitHub through intelligent multi-search orchestration.
Prerequisites
Required:
- gh CLI installed and authenticated (
gh auth login --web) - Node.js + pnpm
Validation:
gh auth status # Should show authenticated
Usage
cd plugins/knowledge-work/skills/gh-code-search
pnpm install # First time only
pnpm search "your query here"
When to Use
Invoke this skill when user requests:
- Real-world examples ("how do people implement X?")
- Library usage patterns ("show me examples of workflow.dev usage")
- Implementation approaches ("claude code skills doing github search")
- Architectural patterns ("repository pattern in TypeScript")
- Best practices for specific frameworks/libraries
Examples requiring nuanced multi-search:
- "Find React hooks" → Need queries for
useState/useEffect(imports),function use(definitions),const use =(arrow functions) - "Express error handling middleware" → Queries for
(req, res, next) =>(signatures),app.use(registration),function(err,(error handlers) - "How do people use XState?" → Queries for
useMachine,createMachine,interpret(actual function names, not "state machine") - "GitHub Actions for TypeScript projects" → Queries for
.ymlworkflow files +actions/setup-node+tscorpnpm build(actual commands)
How It Works
Division of responsibilities:
Script Responsibilities (Tool Implementation)
- Authenticates with GitHub (gh CLI token)
- Executes single search query (Octokit API, pagination, text-match)
- Ranks by quality (stars, recency, code structure)
- Fetches top 10 files in parallel
- Extracts factual data (imports, syntax patterns, metrics)
- Returns clean markdown with code + metadata + GitHub URLs for all files
The script is a single-query tool. Claude orchestrates multiple invocations.
Claude's Orchestration Responsibilities
Claude executes a multi-phase workflow:
Query Strategy Generation: Craft 3-5 targeted search queries considering:
- Language context (TypeScript vs JavaScript)
- File type nuances (tsx vs ts for React, config files vs implementations)
- Specificity variants (broad → narrow)
- Related terminology expansion
Sequential Search Execution: Run tool multiple times, adapting queries based on intermediate results
Result Aggregation: Combine all code files, deduplicate by repo+path, preserve GitHub URLs
Pattern Analysis: Extract common imports, architectural styles, implementation patterns
Comprehensive Summary: Synthesize findings with trade-offs, recommendations, key code highlights, and GitHub URLs for ALL meaningful files
Claude's Orchestration Workflow
Step 1: Analyze Request & Generate Queries (BLOCKING)
Analyze user request to determine:
- Primary language/framework (TypeScript, Python, Go, etc.)
- Specific patterns sought (hooks, middleware, configs, actions)
- File type variations needed (tsx vs ts, yaml vs json)
Generate 3-5 targeted queries considering nuances:
Example 1: "Find React hooks"
- Query 1:
useState useEffect language:typescript extension:tsx(matches import statements, hook usage in components) - Query 2:
function use language:typescript extension:ts(matches hook definitions likefunction useFetch()) - Query 3:
const use = language:typescript(matches arrow function hooks likeconst useAuth = () =>)
Example 2: "Express error handling"
- Query 1:
(req, res, next) => language:javascript(matches middleware function signatures) - Query 2:
app.use express(matches Express middleware registration) - Query 3:
function(err, req, res, next) language:javascript(matches error handler signatures with 4 params)
Example 3: "XState state machines in React"
- Query 1:
useMachine language:typescript extension:tsx(matches React hook usage likeconst [state, send] = useMachine(machine)) - Query 2:
createMachine language:typescript(matches machine definitions and imports) - Query 3:
interpret xstate language:typescript(matches service creation likeconst service = interpret(machine))
Rationale for each query:
- Match actual code patterns (function names, import statements, signatures)
- NOT human-friendly search terms or documentation phrases
- Different file extensions capture different use cases (tsx for components, ts for utilities)
- Specific function/variable names find concrete implementations
- Signature patterns match common code structures (arrow functions, function declarations)
- Language/extension filters prevent noise from unrelated languages
CHECKPOINT: Verify 3-5 queries generated before proceeding
Step 2: Execute Searches Sequentially
For each query (in order):
cd plugins/knowledge-work/skills/gh-code-search
pnpm search "query text here"
After each search:
- Note result count - Too many? Too few?
- Check quality indicators - Stars, recency, code structure scores
- Identify patterns - Are results converging on specific approaches?
- Adapt next query if needed:
- Too many results → Narrow scope (add more filters)
- Too few results → Broaden (remove restrictive filters)
- Found strong pattern → Search for similar variations
Track which queries succeed - Note for summary which strategies were effective
Early stopping: If first 2-3 queries yield 30+ high-quality results, remaining queries optional
Step 3: Aggregate Results
Combine all code files from all searches:
- Collect files from each search output with their GitHub URLs
- Deduplicate by
repository.full_name + file_path- Keep highest-scored version if duplicates exist
- Note which queries found the same file (indicates strong relevance)
- Maintain diversity - Ensure multiple repositories represented (avoid 10 files from one repo)
- Track provenance - Remember which query found each file (useful for analysis)
- Preserve GitHub URLs - Maintain full GitHub URLs for every file to include in summary
Result: Unified list of 15-30 unique, high-quality code files with GitHub URLs
Step 4: Analyze Patterns
Extract factual patterns across all code files:
Import Analysis:
- List most common imports (group by library)
- Identify standard library vs third-party dependencies
- Note version-specific patterns (e.g., React 18 hooks)
Architectural Patterns:
- Functional vs class-based implementations
- Custom hooks vs inline logic
- Middleware chains vs single-handler
- Configuration patterns (env vars, config files, inline)
Code Structure:
- Function signatures (parameters, return types)
- Error handling approaches (try/catch, error middleware, Result types)
- Testing patterns (unit vs integration, mocking strategies)
- Documentation styles (JSDoc, inline comments, README)
Language Distribution:
- TypeScript vs JavaScript split
- Strict mode usage
- Type definition patterns
DO NOT editorialize - Extract facts, not opinions. Analysis comes in Step 5.
Step 5: Generate Comprehensive Summary
Output format (exact structure):
# GitHub Code Search: [User Query Topic]
## Search Strategy Executed
Ran [N] targeted queries:
1. `query text` - [brief rationale] → [X results]
2. `query text` - [brief rationale] → [Y results]
3. ...
**Total unique files analyzed:** [N]
---
## Pattern Analysis
### Common Imports
- `library-name` - Used in [N/total] files
- `another-lib` - Used in [M/total] files
### Architectural Styles
- **Functional Programming** - [N] files use pure functions, immutable patterns
- **Object-Oriented** - [M] files use classes, inheritance
- **Hybrid** - [K] files mix both approaches
### Implementation Patterns
- **Pattern 1 Name**: [Description with prevalence]
- **Pattern 2 Name**: [Description with prevalence]
---
## Approaches Found
### Approach 1: [Name]
**Repos:** [repo1], [repo2]
**Characteristics:**
- [Key trait 1]
- [Key trait 2]
**Example:** [repo/file_path:line_number](github_url)
```language
[relevant code snippet - NOT just first 40 lines]
Approach 2: [Name]
[Similar structure]
Trade-offs
| Approach | Pros | Cons | Best For |
|---|---|---|---|
| Approach 1 | [pros] | [cons] | [use case] |
| Approach 2 | [pros] | [cons] | [use case] |
Recommendations
For your use case ([infer from user's context]):
Primary recommendation: [Approach name]
- Why: [Rationale based on analysis]
- Implementation: [High-level steps]
- References: repo/file_path:line_number
Alternative: [Another approach]
- When to use: [Scenarios where this is better]
- References: repo/file_path:line_number
Key Code Sections
[Concept 1]
Source: repo/file_path:line_range
[Specific relevant code - imports, key function, not arbitrary truncation]
Why this matters: [Brief explanation]
[Concept 2]
Source: repo/file_path:line_range
[Specific relevant code]
Why this matters: [Brief explanation]
All GitHub Files Analyzed
IMPORTANT: Include GitHub URLs for ALL meaningful files found across all searches.
List every unique file analyzed (15-30 files), grouped by repository:
[repo-owner/repo-name] ([stars]⭐)
- file_path - [language] - [brief description of what this file demonstrates]
- file_path - [language] - [brief description]
[repo-owner/repo-name2] ([stars]⭐)
- file_path - [language] - [brief description]
Format: Direct GitHub blob URLs with line numbers where relevant (e.g., https://github.com/owner/repo/blob/main/path/file.ts#L10-L50)
**Summary characteristics:**
- **Factual** - Based on extracted data, not assumptions
- **Actionable** - Clear recommendations with reasoning
- **Contextualized** - References specific code locations with line numbers
- **Balanced** - Shows trade-offs, not just "best practice"
- **Comprehensive** - Covers patterns, approaches, trade-offs, recommendations
- **Accessible** - GitHub URLs for ALL meaningful files so users can explore full source code
---
## Error Handling
**Common issues:**
- **Auth errors:** Prompt user to run `gh auth login --web`
- **Rate limits:** Show remaining quota, reset time. If hit during multi-search, stop gracefully with partial results
- **Network failures:** Continue with partial results, note which queries failed
- **No results for query:** Note in summary, adjust subsequent queries to be broader
- **All queries return same files:** Note low diversity, recommend broader initial queries
**Graceful degradation:** Partial results are acceptable. Complete summary based on available data.
---
## Limitations
- GitHub API rate limit: 5,000 req/hr authenticated (multi-search uses more quota)
- Each tool invocation fetches top 10 results only
- Skips files >100KB
- Sequential execution takes longer than single query (10-30s per query)
- Provides factual data, not conclusions (Claude interprets patterns)
- Deduplication assumes exact path matches (renamed files treated as unique)
---
## Typical Execution Time
**Per query:** 10-30 seconds depending on:
- Number of results (100 max)
- File sizes
- Network latency
- API rate limits
**Full workflow (3-5 queries):** 30-150 seconds
**Optimization:** If first queries yield sufficient results, skip remaining queries
---
## Integration Notes
**Example: User asks "find Claude Code skills doing github search"**
```javascript
// 1. Claude analyzes: Skills use SKILL.md with frontmatter, likely in .claude/skills/
// 2. Claude generates queries:
// - "filename:SKILL.md github search" (matches SKILL.md files with "github search" text)
// - "octokit.rest.search language:typescript" (matches actual Octokit API usage)
// - "gh api language:typescript path:skills" (matches gh CLI usage in skills)
// 3. Claude executes sequentially:
cd plugins/knowledge-work/skills/gh-code-search
pnpm search "filename:SKILL.md github search"
// [analyze results]
pnpm search "octokit.rest.search language:typescript"
// [analyze results]
pnpm search "gh api language:typescript path:skills"
// 4. Claude aggregates, deduplicates, analyzes patterns
// 5. Claude generates comprehensive summary with trade-offs and recommendations
Testing
Validate workflow with diverse queries:
- Simple: "React hooks" (should generate 3+ queries, combine results)
- Complex: "GitHub Actions TypeScript project setup" (requires config + implementation queries)
- Ambiguous: "state management" (should generate queries for Redux, Zustand, XState, Context)
Check quality:
- Deduplication works (no repeated files in summary)
- Diverse repositories (not all from one repo)
- Summary includes trade-offs and recommendations
- Code snippets are relevant (not arbitrary truncations)