---
name: academic-search
description: Search academic paper repositories (arXiv, Semantic Scholar) for scholarly articles in physics, mathematics, computer science, quantitative biology, AI/ML, and related fields
---
# Academic Search Skill
This skill provides access to academic paper repositories, primarily arXiv, for searching scholarly articles. arXiv is a free distribution service and open-access archive for preprints in physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering, systems science, and economics.
## When to Use This Skill
Use this skill when you need to:
- Find cutting-edge research: Access preprints and recent papers before formal journal publication
- Search AI/ML papers: Find machine learning, deep learning, and artificial intelligence research
- Explore computational methods: Search for algorithms, theoretical frameworks, and mathematical foundations
- Research interdisciplinary topics: Find papers spanning computer science, biology, physics, and mathematics
- Gather literature reviews: Collect relevant papers for comprehensive topic overviews
- Track state-of-the-art: Find the latest advances in rapidly evolving fields
## Ideal Use Cases
| Scenario | Example Query |
|---|---|
| Understanding new architectures | "transformer attention mechanism" |
| Exploring applications | "large language models code generation" |
| Finding benchmarks | "image classification benchmark ImageNet" |
| Surveying methods | "reinforcement learning robotics" |
| Technical deep-dives | "backpropagation neural networks" |
## How to Use
The skill provides a Python script that searches arXiv and returns formatted results with titles and abstracts.
### Basic Usage
Note: Always use the absolute path from your skills directory.
If running from a virtual environment:

```bash
.venv/bin/python [YOUR_SKILLS_DIR]/academic-search/arxiv_search.py "your search query"
```

Or for system Python:

```bash
python3 [YOUR_SKILLS_DIR]/academic-search/arxiv_search.py "your search query"
```

Replace `[YOUR_SKILLS_DIR]` with the absolute skills directory path from your system prompt.
### Command-Line Arguments
| Argument | Required | Default | Description |
|---|---|---|---|
| `query` | Yes | - | The search query string |
| `--max-papers` | No | 10 | Maximum number of papers to retrieve |
| `--output-format` | No | text | Output format: `text`, `json`, or `markdown` |
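The script's source is not reproduced here, but the table above maps directly onto a standard argparse interface. The following is a minimal sketch of such a CLI; `build_parser` is an illustrative name, not necessarily the function used in arxiv_search.py:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical reconstruction of the CLI described in the table above;
    # the actual arxiv_search.py may define its arguments differently.
    parser = argparse.ArgumentParser(description="Search arXiv for scholarly papers")
    parser.add_argument("query", help="The search query string")
    parser.add_argument("--max-papers", type=int, default=10,
                        help="Maximum number of papers to retrieve")
    parser.add_argument("--output-format", choices=["text", "json", "markdown"],
                        default="text", help="Output format")
    return parser

if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.query, args.max_papers, args.output_format)
```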
### Examples
Search for transformer architecture papers:

```bash
python3 arxiv_search.py "attention is all you need transformer" --max-papers 5
```

Search for reinforcement learning papers:

```bash
python3 arxiv_search.py "deep reinforcement learning continuous control" --max-papers 10
```

Search for LLM papers with JSON output:

```bash
python3 arxiv_search.py "large language model reasoning" --output-format json
```

Search for a specific author or topic:

```bash
python3 arxiv_search.py "author:Hinton deep learning"
```

Search in specific arXiv categories:

```bash
python3 arxiv_search.py "cat:cs.LG neural network pruning"
```
### Step-by-Step Workflow
#### 1. Formulate Your Query

- Use specific, technical terms (e.g., "convolutional neural network image segmentation" rather than "AI for pictures")
- Include key authors if known: `author:Bengio`
- Specify arXiv categories for focused results: `cat:cs.CL` (Computation and Language)
- Combine terms for intersection: `"graph neural network" AND "molecular property"` (a command combining these filters is sketched below)
#### 2. Execute the Search

```bash
python3 [SKILLS_DIR]/academic-search/arxiv_search.py "your refined query" --max-papers 10
```
#### 3. Review Results
The output includes:
- Title: Full paper title
- Authors: List of paper authors
- Published: Publication date
- arXiv ID: Unique identifier (useful for citing)
- URL: Direct link to the paper
- Summary: Abstract text
#### 4. Iterate if Needed
- Too many irrelevant results? Add more specific terms or use category filters
- Too few results? Broaden the query or remove restrictive terms
- Looking for recent work? Results are sorted by relevance by default, so check publication dates rather than assuming the newest papers appear first
#### 5. Save and Synthesize

Save relevant findings to your research workspace for later synthesis:

```
research_workspace/
  papers/
    topic_findings.md
```
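One way to capture results directly into that layout is to redirect markdown output, assuming the workspace directories above; the paths and query are illustrative:

```bash
# Save markdown-formatted results into the research workspace
mkdir -p research_workspace/papers
python3 [SKILLS_DIR]/academic-search/arxiv_search.py "graph neural network molecular property" \
  --max-papers 10 --output-format markdown > research_workspace/papers/topic_findings.md
```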
## Output Formats
### Text Format (Default)

```
================================================================================
Title: Attention Is All You Need
Authors: Ashish Vaswani, Noam Shazeer, Niki Parmar, ...
Published: 2017-06-12
arXiv ID: 1706.03762
URL: https://arxiv.org/abs/1706.03762
--------------------------------------------------------------------------------
Summary: The dominant sequence transduction models are based on complex
recurrent or convolutional neural networks...
================================================================================
```
### JSON Format

```json
{
  "query": "transformer attention",
  "total_results": 5,
  "papers": [
    {
      "title": "Attention Is All You Need",
      "authors": ["Ashish Vaswani", "Noam Shazeer", ...],
      "published": "2017-06-12",
      "arxiv_id": "1706.03762",
      "url": "https://arxiv.org/abs/1706.03762",
      "summary": "The dominant sequence transduction models..."
    }
  ]
}
```
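The JSON format is convenient for downstream processing. A minimal Python sketch that invokes the script and collects arXiv IDs, assuming the schema shown above (the script path and query are illustrative):

```python
import json
import subprocess

# Run the search script with JSON output and extract arXiv IDs for citation tracking.
result = subprocess.run(
    ["python3", "arxiv_search.py", "large language model reasoning",
     "--max-papers", "5", "--output-format", "json"],
    capture_output=True, text=True, check=True,
)
data = json.loads(result.stdout)
arxiv_ids = [paper["arxiv_id"] for paper in data["papers"]]
print(f"{data['total_results']} results for query: {data['query']}")
print("arXiv IDs:", ", ".join(arxiv_ids))
```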
### Markdown Format

```markdown
## Attention Is All You Need

**Authors:** Ashish Vaswani, Noam Shazeer, ...
**Published:** 2017-06-12
**arXiv ID:** [1706.03762](https://arxiv.org/abs/1706.03762)

### Abstract

The dominant sequence transduction models are based on complex...
```
## arXiv Category Reference
Common categories for AI/ML research:
| Category | Description |
|---|---|
| `cs.LG` | Machine Learning |
| `cs.AI` | Artificial Intelligence |
| `cs.CL` | Computation and Language (NLP) |
| `cs.CV` | Computer Vision |
| `cs.NE` | Neural and Evolutionary Computing |
| `cs.RO` | Robotics |
| `stat.ML` | Machine Learning (Statistics) |
| `q-bio` | Quantitative Biology |
| `math.OC` | Optimization and Control |
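The arXiv API accepts Boolean operators and parentheses, so related categories can be searched together; whether arxiv_search.py passes the query through unchanged is an assumption here:

```bash
# Restrict a search to two ML categories at once (assumes the script forwards
# the query to the arXiv API, which supports AND/OR and parentheses)
python3 arxiv_search.py '(cat:cs.LG OR cat:stat.ML) AND "contrastive learning"' --max-papers 10
```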
## Best Practices
### Query Construction
- Be specific: "graph attention network node classification" is more targeted than "graph neural network"
- Use quotation marks for exact phrases: `"self-supervised learning"`
- Combine operators: `cat:cs.CV AND "object detection" AND 2023`
- Include variations: search for both "LLM" and "large language model"
### Research Workflow Integration
- Start broad, then narrow: Begin with general queries, refine based on initial results
- Track paper IDs: Save arXiv IDs for citing and revisiting
- Check references: Seminal papers often cite foundational work
- Note publication dates: Preprints may be superseded by updated versions
### Limitations to Consider
- Preprint status: Papers may not be peer-reviewed
- Version updates: Check for newer versions (v2, v3, etc.)
- Coverage gaps: Not all fields are well-represented on arXiv
- Rate limiting: Avoid excessive rapid queries
## Dependencies

This skill requires the `arxiv` Python package:

```bash
# Virtual environment (recommended)
.venv/bin/python -m pip install arxiv

# System-wide
python3 -m pip install arxiv
```
The script will detect if the package is missing and display installation instructions.
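For reference, a search like the ones above can be expressed with the `arxiv` package roughly as follows. This is a sketch of the package's API, not necessarily how arxiv_search.py is implemented:

```python
import arxiv

# Minimal sketch of an arXiv query using the `arxiv` package; the actual
# arxiv_search.py may structure its search and output differently.
client = arxiv.Client()
search = arxiv.Search(
    query='cat:cs.LG AND "neural network pruning"',
    max_results=10,
    sort_by=arxiv.SortCriterion.Relevance,
)
for paper in client.results(search):
    print(paper.title)
    print(", ".join(author.name for author in paper.authors))
    print(paper.published.date(), paper.get_short_id(), paper.entry_id)
    print(paper.summary[:200], "...")
    print("-" * 80)
```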
## Troubleshooting
"Error: arxiv package not installed"
Install the arxiv package as shown in Dependencies section.
### No results returned
- Try broader search terms
- Remove category restrictions
- Check for typos in technical terms
### Rate limiting errors

- Wait a few seconds between queries
- Reduce the `--max-papers` value
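If you adapt the script, the `arxiv` package's `Client` also supports client-side throttling; the values below are illustrative assumptions, not the script's current settings:

```python
import arxiv

# Throttle requests client-side: delay between API pages and retry on failures.
client = arxiv.Client(page_size=100, delay_seconds=3.0, num_retries=3)
```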
### Connection errors
- Check internet connectivity
- arXiv API may have temporary outages
## Integration with Research Workflow
This skill works well with the web-research skill for comprehensive research:
- Use academic-search for foundational/theoretical papers
- Use web-research for current implementations, tutorials, and practical guides
- Synthesize findings from both sources in your research report
## Notes
- arXiv is particularly strong for:
  - Computer Science (`cs.*`)
  - Physics (`physics.*`, `hep-*`, `cond-mat.*`)
  - Mathematics (`math.*`)
  - Quantitative Biology (`q-bio.*`)
  - Statistics (`stat.*`)
- Results are sorted by relevance by default
- The arXiv API is free and requires no authentication
- Consider checking cited papers for deeper understanding