---
name: academic-search
description: Search academic paper repositories (arXiv, Semantic Scholar) for scholarly articles in physics, mathematics, computer science, quantitative biology, AI/ML, and related fields
---
# Academic Search Skill
This skill provides access to academic paper repositories, primarily arXiv, for searching scholarly articles. arXiv is a free distribution service and open-access archive for preprints in physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering, systems science, and economics.
## When to Use This Skill
Use this skill when you need to:
- Find cutting-edge research: Access preprints and recent papers before formal journal publication
- Search AI/ML papers: Find machine learning, deep learning, and artificial intelligence research
- Explore computational methods: Search for algorithms, theoretical frameworks, and mathematical foundations
- Research interdisciplinary topics: Find papers spanning computer science, biology, physics, and mathematics
- Gather literature reviews: Collect relevant papers for comprehensive topic overviews
- Track state-of-the-art: Find the latest advances in rapidly evolving fields
## Ideal Use Cases
| Scenario | Example Query |
|---|---|
| Understanding new architectures | "transformer attention mechanism" |
| Exploring applications | "large language models code generation" |
| Finding benchmarks | "image classification benchmark ImageNet" |
| Surveying methods | "reinforcement learning robotics" |
| Technical deep-dives | "backpropagation neural networks" |
## How to Use
The skill provides a Python script that searches arXiv and returns formatted results with titles and abstracts.
### Basic Usage
Note: Always use the absolute path from your skills directory.
If running from a virtual environment:

```bash
.venv/bin/python [YOUR_SKILLS_DIR]/academic-search/arxiv_search.py "your search query"
```

Or for system Python:

```bash
python3 [YOUR_SKILLS_DIR]/academic-search/arxiv_search.py "your search query"
```

Replace `[YOUR_SKILLS_DIR]` with the absolute skills directory path from your system prompt.
### Command-Line Arguments
| Argument | Required | Default | Description |
|---|---|---|---|
| `query` | Yes | - | The search query string |
| `--max-papers` | No | 10 | Maximum number of papers to retrieve |
| `--output-format` | No | text | Output format: `text`, `json`, or `markdown` |
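The script's source is not reproduced here, but the table above maps directly onto a standard argparse interface. The following is a minimal sketch of such a CLI; `build_parser` is an illustrative name, not necessarily the function used in arxiv_search.py:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical reconstruction of the CLI described in the table above;
    # the actual arxiv_search.py may define its arguments differently.
    parser = argparse.ArgumentParser(description="Search arXiv for scholarly papers")
    parser.add_argument("query", help="The search query string")
    parser.add_argument("--max-papers", type=int, default=10,
                        help="Maximum number of papers to retrieve")
    parser.add_argument("--output-format", choices=["text", "json", "markdown"],
                        default="text", help="Output format")
    return parser

if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.query, args.max_papers, args.output_format)
```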
### Examples
Search for transformer architecture papers:

```bash
python3 arxiv_search.py "attention is all you need transformer" --max-papers 5
```

Search for reinforcement learning papers:

```bash
python3 arxiv_search.py "deep reinforcement learning continuous control" --max-papers 10
```

Search for LLM papers with JSON output:

```bash
python3 arxiv_search.py "large language model reasoning" --output-format json
```

Search for a specific author or topic:

```bash
python3 arxiv_search.py "author:Hinton deep learning"
```

Search in specific arXiv categories:

```bash
python3 arxiv_search.py "cat:cs.LG neural network pruning"
```
### Step-by-Step Workflow
#### 1. Formulate Your Query

- Use specific, technical terms (e.g., "convolutional neural network image segmentation" rather than "AI for pictures")
- Include key authors if known: `author:Bengio`
- Specify arXiv categories for focused results: `cat:cs.CL` (Computation and Language)
- Combine terms for intersection: `"graph neural network" AND "molecular property"` (a command combining these filters is sketched below)
#### 2. Execute the Search

```bash
python3 [SKILLS_DIR]/academic-search/arxiv_search.py "your refined query" --max-papers 10
```
#### 3. Review Results
The output includes:
- Title: Full paper title
- Authors: List of paper authors
- Published: Publication date
- arXiv ID: Unique identifier (useful for citing)
- URL: Direct link to the paper
- Summary: Abstract text
#### 4. Iterate if Needed
- Too many irrelevant results? Add more specific terms or use category filters
- Too few results? Broaden the query or remove restrictive terms
- Looking for recent work? Results are sorted by relevance by default, so check publication dates rather than assuming the newest papers appear first
#### 5. Save and Synthesize

Save relevant findings to your research workspace for later synthesis:

```
research_workspace/
  papers/
    topic_findings.md
```
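One way to capture results directly into that layout is to redirect markdown output, assuming the workspace directories above; the paths and query are illustrative:

```bash
# Save markdown-formatted results into the research workspace
mkdir -p research_workspace/papers
python3 [SKILLS_DIR]/academic-search/arxiv_search.py "graph neural network molecular property" \
  --max-papers 10 --output-format markdown > research_workspace/papers/topic_findings.md
```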
## Output Formats
### Text Format (Default)

```
================================================================================
Title: Attention Is All You Need
Authors: Ashish Vaswani, Noam Shazeer, Niki Parmar, ...
Published: 2017-06-12
arXiv ID: 1706.03762
URL: https://arxiv.org/abs/1706.03762
--------------------------------------------------------------------------------
Summary: The dominant sequence transduction models are based on complex
recurrent or convolutional neural networks...
================================================================================
```
### JSON Format

```json
{
  "query": "transformer attention",
  "total_results": 5,
  "papers": [
    {
      "title": "Attention Is All You Need",
      "authors": ["Ashish Vaswani", "Noam Shazeer", ...],
      "published": "2017-06-12",
      "arxiv_id": "1706.03762",
      "url": "https://arxiv.org/abs/1706.03762",
      "summary": "The dominant sequence transduction models..."
    }
  ]
}
```
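The JSON format is convenient for downstream processing. A minimal Python sketch that invokes the script and collects arXiv IDs, assuming the schema shown above (the script path and query are illustrative):

```python
import json
import subprocess

# Run the search script with JSON output and extract arXiv IDs for citation tracking.
result = subprocess.run(
    ["python3", "arxiv_search.py", "large language model reasoning",
     "--max-papers", "5", "--output-format", "json"],
    capture_output=True, text=True, check=True,
)
data = json.loads(result.stdout)
arxiv_ids = [paper["arxiv_id"] for paper in data["papers"]]
print(f"{data['total_results']} results for query: {data['query']}")
print("arXiv IDs:", ", ".join(arxiv_ids))
```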
### Markdown Format

```markdown
## Attention Is All You Need

**Authors:** Ashish Vaswani, Noam Shazeer, ...
**Published:** 2017-06-12
**arXiv ID:** [1706.03762](https://arxiv.org/abs/1706.03762)

### Abstract

The dominant sequence transduction models are based on complex...
```
## arXiv Category Reference
Common categories for AI/ML research:
| Category | Description |
|---|---|
| `cs.LG` | Machine Learning |
| `cs.AI` | Artificial Intelligence |
| `cs.CL` | Computation and Language (NLP) |
| `cs.CV` | Computer Vision |
| `cs.NE` | Neural and Evolutionary Computing |
| `cs.RO` | Robotics |
| `stat.ML` | Machine Learning (Statistics) |
| `q-bio` | Quantitative Biology |
| `math.OC` | Optimization and Control |
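The arXiv API accepts Boolean operators and parentheses, so related categories can be searched together; whether arxiv_search.py passes the query through unchanged is an assumption here:

```bash
# Restrict a search to two ML categories at once (assumes the script forwards
# the query to the arXiv API, which supports AND/OR and parentheses)
python3 arxiv_search.py '(cat:cs.LG OR cat:stat.ML) AND "contrastive learning"' --max-papers 10
```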
## Best Practices
### Query Construction
- Be specific: "graph attention network node classification" is more targeted than "graph neural network"
- Use quotation marks for exact phrases: `"self-supervised learning"`
- Combine operators: `cat:cs.CV AND "object detection" AND 2023`
- Include variations: search for both "LLM" and "large language model"
### Research Workflow Integration
- Start broad, then narrow: Begin with general queries, refine based on initial results
- Track paper IDs: Save arXiv IDs for citing and revisiting
- Check references: Seminal papers often cite foundational work
- Note publication dates: Preprints may be superseded by updated versions
### Limitations to Consider
- Preprint status: Papers may not be peer-reviewed
- Version updates: Check for newer versions (v2, v3, etc.)
- Coverage gaps: Not all fields are well-represented on arXiv
- Rate limiting: Avoid excessive rapid queries
## Dependencies

This skill requires the `arxiv` Python package:

```bash
# Virtual environment (recommended)
.venv/bin/python -m pip install arxiv

# System-wide
python3 -m pip install arxiv
```
The script will detect if the package is missing and display installation instructions.
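For reference, a search like the ones above can be expressed with the `arxiv` package roughly as follows. This is a sketch of the package's API, not necessarily how arxiv_search.py is implemented:

```python
import arxiv

# Minimal sketch of an arXiv query using the `arxiv` package; the actual
# arxiv_search.py may structure its search and output differently.
client = arxiv.Client()
search = arxiv.Search(
    query='cat:cs.LG AND "neural network pruning"',
    max_results=10,
    sort_by=arxiv.SortCriterion.Relevance,
)
for paper in client.results(search):
    print(paper.title)
    print(", ".join(author.name for author in paper.authors))
    print(paper.published.date(), paper.get_short_id(), paper.entry_id)
    print(paper.summary[:200], "...")
    print("-" * 80)
```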
## Troubleshooting
"Error: arxiv package not installed"
Install the arxiv package as shown in Dependencies section.
### No results returned
- Try broader search terms
- Remove category restrictions
- Check for typos in technical terms
### Rate limiting errors

- Wait a few seconds between queries
- Reduce the `--max-papers` value
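If you adapt the script, the `arxiv` package's `Client` also supports client-side throttling; the values below are illustrative assumptions, not the script's current settings:

```python
import arxiv

# Throttle requests client-side: delay between API pages and retry on failures.
client = arxiv.Client(page_size=100, delay_seconds=3.0, num_retries=3)
```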
### Connection errors
- Check internet connectivity
- arXiv API may have temporary outages
## Integration with Research Workflow
This skill works well with the web-research skill for comprehensive research:
- Use academic-search for foundational/theoretical papers
- Use web-research for current implementations, tutorials, and practical guides
- Synthesize findings from both sources in your research report
## Notes
- arXiv is particularly strong for:
  - Computer Science (`cs.*`)
  - Physics (`physics.*`, `hep-*`, `cond-mat.*`)
  - Mathematics (`math.*`)
  - Quantitative Biology (`q-bio.*`)
  - Statistics (`stat.*`)
- Results are sorted by relevance by default
- The arXiv API is free and requires no authentication
- Consider checking cited papers for deeper understanding