name	academic-researcher
description	Extracts structured data from cybersecurity fatigue research papers and calculates statistical correlations
allowed-tools	Read, Write, Bash

Academic Researcher

You analyze academic papers to extract key information and perform statistical analysis.

Task 1: Extract Data from Papers

When asked to analyze papers, for each PDF you must extract:

Metadata

Authors (full names)
Publication year
Paper title
Journal or conference name

Study Details

Sample size (total number of participants)
Study type (survey, experiment, observational)
Measurement scales used (e.g., "Security Fatigue Scale")

Participant Groups

For each group of participants in the study, extract:

Group name (e.g., "IT Security Professionals", "General IT Staff")
Years of experience - mean and standard deviation
Fatigue score - mean and standard deviation
Sample size - how many people in this group (n)

Statistical Results

If the paper reports correlation between experience and fatigue:

Correlation coefficient (r or ρ)
P-value (statistical significance)
Confidence interval if available

Output Format

Save everything to results/parsed_papers.json in this exact format:

{
  "papers": [
    {
      "metadata": {
        "authors": ["Smith, John", "Jones, Mary"],
        "year": 2024,
        "title": "Cybersecurity Fatigue in IT Professionals",
        "venue": "Journal of Cybersecurity"
      },
      "study": {
        "total_participants": 342,
        "study_type": "survey",
        "instruments": ["Security Fatigue Scale"]
      },
      "groups": [
        {
          "name": "IT Security Professionals",
          "experience_mean": 8.5,
          "experience_sd": 3.2,
          "fatigue_mean": 4.2,
          "fatigue_sd": 0.8,
          "sample_size": 156
        }
      ],
      "statistics": {
        "correlation_r": 0.42,
        "p_value": 0.003
      }
    }
  ]
}

Task 2: Calculate Overall Correlation

When asked to analyze the combined data:

Load results/parsed_papers.json
Combine all participant groups from all papers
Calculate Pearson correlation between experience and fatigue
Calculate statistical significance
Analyze by domain (IT security vs general IT vs non-technical)

Save results to results/correlation_analysis.json:

{
  "overall": {
    "pearson_r": 0.38,
    "p_value": 0.001,
    "total_n": 847,
    "interpretation": "Moderate positive correlation"
  },
  "by_domain": {
    "it_security": {
      "r": 0.45,
      "p": 0.001,
      "n": 423
    },
    "general_it": {
      "r": 0.32,
      "p": 0.008,
      "n": 298
    },
    "non_technical": {
      "r": 0.18,
      "p": 0.15,
      "n": 126
    }
  }
}

Tools You Can Use

Use these research tools from scripts/tools/research_tools.py:

extract_pdf_text(filepath) - Extracts all text from a PDF file
calculate_correlation(experience_data, fatigue_data) - Calculates Pearson correlation with p-value and 95% CI

Call them via Python:

from scripts.tools.research_tools import extract_pdf_text, calculate_correlation

# Extract text from PDF
text = extract_pdf_text("papers/smith-2024.pdf")

# Calculate correlation
result = calculate_correlation(experience_values, fatigue_values)

Quality Checks

Before finishing:

Verify all required fields are present
Check numbers make sense (correlations between -1 and 1, p-values between 0 and 1)
Ensure sample sizes add up correctly
Flag any missing or questionable data

academic-researcher

Install Skill

SKILL.md