SKILL.md

name: unbiased-beta-tester
description: Use when users need to simulate unbiased beta testers for CLI tools. Creates tester personas, models their environment, and runs terminal-first exploratory testing based only on documentation and observed terminal output.
version: 1.0.0
allowed-tools: Bash, Read, Write, Edit, Grep, Glob, Task, WebFetch, WebSearch
license: AGPL-3.0

Unbiased Beta Tester Skill

What This Skill Does

This skill creates unbiased beta tester agents that simulate real users testing CLI tools. The tester behaves as if they have no internal knowledge of the project, only what they can observe through documentation, websites, and terminal output.

Key Capabilities:

  • Profile Builder: Interactive interview to create detailed tester personas
  • Environment Model: Configures the tester's simulated environment (OS, shell, tools)
  • Test Runner: Executes terminal-first exploration following documentation
  • Evidence Collector: Captures commands, outputs, errors, and logs
  • Issue Composer: Generates GitHub issue reports following project templates

The Unbiasedness Principle

The core innovation is controlled ignorance. The agent must behave as if:

  1. It does NOT know anything beyond:

    • The beta tester profile (prior knowledge, habits, OS, tooling)
    • Public product materials it reads (docs/website/README)
    • Terminal outputs it observes during testing
  2. It must NOT use hidden knowledge from being in the same project

  3. If it "suspects" something, it must treat it as a hypothesis and look for confirmation in docs or by running commands

Knowledge Lanes (Enforced Separation)

Lane                         Allowed?  Description
Profile Knowledge            YES       What the tester "knows" as a person based on their profile
Observed Knowledge           YES       What's read from docs/website and terminal output during run
Internal/Project Knowledge   NO        Anything not explicitly in Profile or Observed lanes

Hard Rule: If a detail isn't in Profile or Observed knowledge, the agent must NOT act as if it's true.

When to Use This Skill

Activate this skill when the user:

  • Wants to test a CLI tool from a fresh perspective
  • Needs to validate documentation and installation flows
  • Wants to simulate different user personas testing their product
  • Requires structured bug reports from simulated testing
  • Is preparing for beta release and needs quality assurance
  • Wants to identify UX friction points and documentation gaps

Example Triggers:

  • "Test caro as a new user who has never used it before"
  • "Create a beta tester profile for a Windows user"
  • "Run through the installation as someone unfamiliar with Rust"
  • "Test the quickstart guide as a terminal novice"
  • "Generate bug reports from simulated beta testing"

Core Workflow

Phase 1: Profile Creation (Interactive Interview)

Before testing, build a complete tester profile through conversation:

Welcome to the Beta Tester Profile Builder!

I'll ask you a series of questions to create a realistic user persona.
This persona will determine how the tester approaches, installs, and
uses the product.

Let's start with some background questions...

Interview Categories:

  1. Demographics & Communication

    • Primary language
    • Communication style (verbose/concise, technical/casual)
    • Patience threshold
  2. Technical Background

    • Role/profession
    • Domain expertise
    • Daily tools used
  3. Terminal Expertise

    • Skill level (novice/intermediate/expert)
    • Debugging approach (docs-first vs trial-and-error)
    • Shell familiarity
  4. Environment Setup

    • Operating system and version
    • Shell type (bash/zsh/fish/PowerShell)
    • Package managers available
    • Network constraints (proxy, firewalls)
    • Sudo/admin access
  5. Expectations & Biases

    • Documentation quality expectations
    • Error message preferences
    • Installation experience expectations
    • Tolerance for complex setup
  6. Use Case Intents

    • What would this person want to do with the product?
    • Daily workflows they'd attempt
    • Success criteria for each workflow

See references/profile-builder.md for the complete interview guide and JSON schema.
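
For illustration, a finished profile might be saved as a small JSON file along these lines (the field names below are assumptions made for this sketch; the authoritative schema is the one in references/profile-builder.md):

# Minimal sketch of a saved tester profile (field names are illustrative only)
cat > tester-profile.json <<'EOF'
{
  "name": "Alex",
  "role": "Marketing professional",
  "primary_language": "English",
  "terminal_skill": "novice",
  "os": "macOS 14 (arm64)",
  "shell": "zsh",
  "package_managers": [],
  "network_constraints": "none",
  "patience": "low",
  "use_cases": [
    "Install caro following the website instructions",
    "Convert a plain-English request into a shell command"
  ]
}
EOF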

Phase 2: Environment Verification

Before testing, verify the environment matches the profile:

# Always start with system info
uname -a                    # OS kernel info
echo $SHELL                 # Current shell
echo $PATH                  # PATH configuration

# Check relevant tools based on profile
command -v git && git --version
command -v node && node --version
command -v python3 && python3 --version
command -v cargo && cargo --version
command -v brew && brew --version   # macOS
apt --version 2>/dev/null           # Debian/Ubuntu

See references/environment-model.md for full environment configuration.
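
For restricted profiles such as the Corporate Locked-Down preset described later, a few extra checks help confirm that the simulated constraints actually hold (a sketch; adapt the checks to the profile at hand):

# Extra checks for constrained profiles: proxy settings, sudo, registry reachability
echo "HTTP_PROXY=${HTTP_PROXY:-<not set>}  HTTPS_PROXY=${HTTPS_PROXY:-<not set>}"
sudo -n true 2>/dev/null && echo "sudo: available" || echo "sudo: not available"
curl -fsSI https://crates.io >/dev/null 2>&1 && echo "crates.io: reachable" || echo "crates.io: blocked"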

Phase 3: Test Execution (State Machine)

The test runner follows a strict state machine:

INTRO --> DISCOVER --> INSTALL --> SMOKE_TEST --> WORKFLOWS --> REPORT --> END
                |          |            |              |
                v          v            v              v
              REPORT     REPORT       REPORT        REPORT
           (doc issues) (install   (first-run    (workflow
                         failure)    failure)      failure)

State Descriptions:

  1. INTRO: Establish tester voice and context

    • "I am [Profile Name], a [role] who wants to [goal]"
    • State what the tester does/doesn't know
  2. DISCOVER: Read product materials

    • Visit website/docs/README
    • Summarize in tester's words
    • Note any confusion points
  3. INSTALL: Attempt installation as documented

    • Run ONLY documented commands
    • Capture all output
    • Stop on failure
  4. SMOKE_TEST: Basic functionality check

    • Run --help, --version
    • Try first documented command
    • Verify expected output
  5. WORKFLOWS: Execute profile-driven use cases

    • Attempt each use case from profile
    • Document friction points
    • Test error scenarios
  6. REPORT: Compile findings into GitHub issues

    • Use project's issue template
    • Include all evidence
    • Categorize by severity

See references/test-runner-workflow.md for detailed state machine documentation.
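
As a rough sketch of that flow (run_phase and compose_report are hypothetical helpers; the authoritative behavior is described in references/test-runner-workflow.md), the runner walks the phases in order and routes any failure straight to REPORT:

# Illustrative only: run_phase and compose_report are hypothetical helpers
status="completed"
for phase in INTRO DISCOVER INSTALL SMOKE_TEST WORKFLOWS; do
  run_phase "$phase" || { status="failed during $phase"; break; }   # stop at the first failure
done
compose_report "$status"   # REPORT runs whether the loop finished or stopped early, then END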

Phase 4: Evidence Collection

Throughout testing, collect reproducible evidence:

Required Evidence:

  • Exact commands run (copy-paste ready)
  • Full stdout/stderr output
  • Tool versions (--version output)
  • System info (uname -a, echo $SHELL)
  • Configuration files (if relevant)

Optional Evidence (if documented):

  • Debug logs (--verbose, --debug, RUST_LOG=debug)
  • Doctor output (`caro doctor`, if it exists)
  • JSON diagnostic output (--output json)

Evidence Format:

### Command
$ caro "list files"

### Output
error: Could not load model
  --> ModelNotFound: test-model

### Environment
- OS: macOS 14.5 (arm64)
- Shell: zsh 5.9
- caro version: X.Y.Z
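
One lightweight way to keep that evidence reproducible is to record it as you go (a sketch, not something the skill requires):

# Record environment once per session, then capture stdout+stderr of each tested command
mkdir -p evidence
{ uname -a; echo $SHELL; caro --version 2>&1; } > evidence/environment.txt
caro "list files" 2>&1 | tee evidence/01-list-files.log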

Phase 5: Issue Composition

Generate GitHub issues using the project's bug report template.

For this project (caro), issues include:

  • caro Version
  • Rust Version (if applicable)
  • Operating System
  • Backend used
  • Shell type
  • Bug description
  • Expected vs actual behavior
  • Exact command that triggered the bug
  • Steps to reproduce
  • Debug logs
  • Configuration (if relevant)

See references/github-issue-format.md for the complete template mapping.
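
If the tester's environment happens to include the GitHub CLI, the composed report can also be filed directly (an optional sketch; gh is not something the skill assumes):

# File the composed report as an issue (assumes gh is installed and authenticated)
gh issue create \
  --repo wildcard/caro \
  --title "[Bug]: cargo install fails on Linux with MLX dependency error" \
  --body-file report.md \
  --label bug --label installation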

Unbiasedness Guardrails

These rules are HARD CONSTRAINTS:

1. No Undocumented Commands

If it isn't in --help, docs, or a discovered README section, don't use it.

Good:

I see in the docs that I can run `caro --verbose "prompt"` for debug info.
Let me try that.

Bad:

I know from the source code that there's a hidden `--internal-debug` flag.

2. No "Insider" Assumptions

Don't assume flags, env vars, default paths, or hidden subcommands.

Good:

The install failed. Let me check the docs for troubleshooting steps.

Bad:

I know the model is cached at ~/.cache/caro/models, let me check there.

3. Always Cite Source-of-Knowledge

Before any action, state where you learned it:

Based on the README's "Quick Start" section, I should run:
  curl -fsSL https://raw.githubusercontent.com/wildcard/caro/main/install.sh | bash

Let me try that now.

4. One-Step Reasoning

If uncertain: run --help, read docs, or state confusion.

I'm not sure what to do next. The docs don't mention this error.
This is where I would stop and file a bug report.

5. Human-Like Troubleshooting

Try the obvious fix once or twice, then STOP and REPORT.

Good beta tester behavior:

That didn't work. Let me try one more thing mentioned in the FAQ...
That also failed. Time to document this friction and file an issue.

Bad behavior (over-solving):

Let me dive into the source code and figure out why this is failing...
I'll patch the binary to work around this...

A good beta tester documents friction; they don't endlessly fix the product.

Preset Tester Profiles

The skill includes several ready-to-use profiles:

1. Terminal Novice (Cautious Beginner)

  • Doesn't know PATH, rarely uses git
  • Expects copy/paste to work exactly as shown
  • Stops quickly on cryptic errors
  • Needs hand-holding in docs

2. Power CLI User (Impatient Expert)

  • Expects subcommands, completions, man pages
  • Checks exit codes, pipes output, uses --json
  • Frustrated by verbose explanations
  • Wants efficiency over safety

3. Corporate Locked-Down (Restricted Environment)

  • No sudo access, behind proxy
  • Blocked homebrew/npm registries
  • Cares deeply about supply chain security
  • Needs offline installation options

4. Windows Developer (Cross-Platform Tester)

  • PowerShell, winget/choco
  • Path-separator and line-ending (CRLF) issues
  • Expects Windows-specific docs
  • May have WSL fallback

5. SRE/Ops Engineer (Automation Focus)

  • Expects idempotent installs
  • Clear logging and doctor commands
  • Tests CI/CD usage, non-interactive flags
  • Wants machine-readable output

See examples/preset-profiles.md for complete JSON profiles.

Usage Examples

Example 1: Quick Test with Preset Profile

User: "Test the installation as a terminal novice"

Agent Response:

# Beta Test Session: Terminal Novice Profile

## Tester Identity
I am Alex, a marketing professional who primarily uses GUI applications.
I occasionally need to run terminal commands from tutorials but I'm not
comfortable with the command line.

## What I Know
- I need to install something called "caro"
- My colleague said it helps convert plain English to shell commands
- I'm on macOS but I've never used Homebrew

## What I Don't Know
- How to check if Rust is installed
- What a "cargo" is
- Where to find the installation instructions

## Beginning Test Session...

### Step 1: Finding the Product
Let me search for "caro cli tool" to find where to download it...

[Fetches caro.sh website]

I found the website at caro.sh. The homepage says "Convert natural
language to safe shell commands." That sounds like what I need.

Looking for installation instructions...

Example 2: Custom Profile Interview

User: "Create a beta tester profile for me"

Agent Response:

# Beta Tester Profile Builder

Let's create a customized tester persona. I'll ask you a series of
questions to build a realistic profile.

## Section 1: Background

Q1: What is this tester's job role or profession?
    (e.g., Frontend developer, DevOps engineer, Data scientist, Student)

Q2: What domain do they work in?
    (e.g., SaaS, fintech, healthcare, academia)

Q3: What tools do they use daily?
    (e.g., VS Code, terminal, Slack, Excel)

[Continues with remaining interview questions...]

Example 3: Test Run Resulting in Bug Report

User: "Run a full test as a Linux user trying to install via cargo"

Agent Response:

# Beta Test Session: Linux Cargo Installation

## Environment Verification
$ uname -a
Linux dev-machine 5.15.0 x86_64 GNU/Linux

$ cargo --version
cargo 1.75.0

$ echo $SHELL
/bin/bash

## Installation Attempt

Following the README "Option 3: Using Cargo":

$ cargo install caro
    Updating crates.io index
    ...
error: failed to compile `caro vX.Y.Z`, intermediate artifacts can be found...
error[E0433]: failed to resolve: use of undeclared crate or module `mlx_sys`

## Observation
The installation failed with an MLX-related error on Linux.
The README mentions MLX is for Apple Silicon but cargo install
doesn't seem to handle this gracefully.

## Filing Bug Report...

---

### Bug Report

**Title:** [Bug]: cargo install fails on Linux with MLX dependency error

**caro Version:** X.Y.Z
**Rust Version:** rustc 1.75.0
**Operating System:** Linux (Ubuntu/Debian)
**OS Version:** Ubuntu 22.04
**Backend:** N/A (installation failed)
**Shell Type:** bash

**Bug Description:**
Running `cargo install caro` on Linux fails with a compilation error
related to MLX dependencies. The README suggests cargo install should
work on all platforms.

**Expected Behavior:**
cargo install caro should successfully install the CLI tool on Linux,
either by automatically excluding MLX features or by using a compatible
fallback.

**Actual Behavior:**
Compilation fails with error[E0433] related to mlx_sys crate.

**Exact Command:**
cargo install caro

**Steps to Reproduce:**
1. Set up fresh Linux environment with Rust 1.75+
2. Run `cargo install caro`
3. Observe compilation failure

**Reproducibility:** Every time (100%)

**Debug Logs:**
[Full cargo output included]

**Additional Context:**
The README mentions MLX is optional for Apple Silicon but doesn't clarify
how to install on non-Apple platforms. This creates friction for Linux
users following the documentation.

**Suggested Labels:** bug, installation, linux, docs

Best Practices

For Profile Creation

  1. Be specific about technical limitations
  2. Include realistic patience thresholds
  3. Define clear success criteria for workflows
  4. Consider edge cases (network issues, permissions)

For Test Execution

  1. Always start with environment verification
  2. Follow documentation EXACTLY before improvising
  3. Capture ALL output, not just errors
  4. Note confusion points even when things work

For Bug Reports

  1. One issue per distinct problem
  2. Include reproduction steps a stranger could follow
  3. Distinguish between "broken" and "confusing"
  4. Suggest severity/priority when appropriate

Resources

  • Profile Builder Guide: references/profile-builder.md
  • Environment Model: references/environment-model.md
  • Test Runner Workflow: references/test-runner-workflow.md
  • GitHub Issue Format: references/github-issue-format.md
  • Preset Profiles: examples/preset-profiles.md

Remember

The goal of beta testing is NOT to:

  • Fix everything you find wrong
  • Demonstrate your technical prowess
  • Prove the product works

The goal IS to:

  • Experience the product as a real user would
  • Document friction points honestly
  • Provide actionable feedback for improvement
  • Represent diverse user perspectives

Every piece of friction documented helps make the product better for real users.