langwatch / browser-test

Install for your project team

Run this command in your project directory to install the skill for your entire team:

mkdir -p .claude/skills/browser-test && curl -L -o skill.zip "https://fastmcp.me/Skills/Download/4181" && unzip -o skill.zip -d .claude/skills/browser-test && rm skill.zip

New-Item -Path ".claude/skills/browser-test" -ItemType Directory -Force; Invoke-WebRequest -Uri "https://fastmcp.me/Skills/Download/4181" -OutFile "skill.zip"; Expand-Archive -Path "skill.zip" -DestinationPath ".claude/skills/browser-test" -Force; Remove-Item "skill.zip"

Project Skills

This skill is saved in .claude/skills/browser-test/. Once that directory is committed to git, all team members have access to it automatically.

Important: Please verify the skill by reviewing its instructions before using it.

Install skill for Codex

Run one of these commands to install the skill depending on your needs:

Project Local ($CWD/.codex/skills)

mkdir -p .codex/skills/browser-test && curl -L -o skill.zip "https://fastmcp.me/Skills/Download/4181" && unzip -o skill.zip -d .codex/skills/browser-test && rm skill.zip

New-Item -Path ".codex/skills/browser-test" -ItemType Directory -Force; Invoke-WebRequest -Uri "https://fastmcp.me/Skills/Download/4181" -OutFile "skill.zip"; Expand-Archive -Path "skill.zip" -DestinationPath ".codex/skills/browser-test" -Force; Remove-Item "skill.zip"

User Global (~/.codex/skills)

mkdir -p ~/.codex/skills/browser-test && curl -L -o skill.zip "https://fastmcp.me/Skills/Download/4181" && unzip -o skill.zip -d ~/.codex/skills/browser-test && rm skill.zip

New-Item -Path "$HOME/.codex/skills/browser-test" -ItemType Directory -Force; Invoke-WebRequest -Uri "https://fastmcp.me/Skills/Download/4181" -OutFile "skill.zip"; Expand-Archive -Path "skill.zip" -DestinationPath "$HOME/.codex/skills/browser-test" -Force; Remove-Item "skill.zip"

| Scope | Location | Suggested Use |
|-------|----------|---------------|
| REPO | `$CWD/.codex/skills` | Project directory. Teams can check in skills most relevant to a working folder here. |
| REPO | `$CWD/../.codex/skills` | A folder above CWD. Organizations can check in skills relevant to a shared area. |
| REPO | `$REPO_ROOT/.codex/skills` | Top-most root folder. Relevant to everyone using the repository. |
| USER | `$CODEX_HOME/skills` | Personal folder (`~/.codex/skills`). Curate skills that apply to any repository. |

Install skill for GitHub Copilot

Run one of these commands to install the skill depending on your needs:

Project (.github/skills)

mkdir -p .github/skills/browser-test && curl -L -o skill.zip "https://fastmcp.me/Skills/Download/4181" && unzip -o skill.zip -d .github/skills/browser-test && rm skill.zip

New-Item -Path ".github/skills/browser-test" -ItemType Directory -Force; Invoke-WebRequest -Uri "https://fastmcp.me/Skills/Download/4181" -OutFile "skill.zip"; Expand-Archive -Path "skill.zip" -DestinationPath ".github/skills/browser-test" -Force; Remove-Item "skill.zip"

Personal (~/.copilot/skills)

mkdir -p ~/.copilot/skills/browser-test && curl -L -o skill.zip "https://fastmcp.me/Skills/Download/4181" && unzip -o skill.zip -d ~/.copilot/skills/browser-test && rm skill.zip

New-Item -Path "$HOME/.copilot/skills/browser-test" -ItemType Directory -Force; Invoke-WebRequest -Uri "https://fastmcp.me/Skills/Download/4181" -OutFile "skill.zip"; Expand-Archive -Path "skill.zip" -DestinationPath "$HOME/.copilot/skills/browser-test" -Force; Remove-Item "skill.zip"

| Scope | Location | Suggested Use |
|-------|----------|---------------|
| Project | `.github/skills/` | Repository-specific skills. Checked into git for the whole team. |
| Personal | `~/.copilot/skills/` | Personal skills available across all your projects. |

Install skill for Google Antigravity

Run one of these commands to install the skill depending on your needs:

Workspace (.agent/skills)

mkdir -p .agent/skills/browser-test && curl -L -o skill.zip "https://fastmcp.me/Skills/Download/4181" && unzip -o skill.zip -d .agent/skills/browser-test && rm skill.zip

New-Item -Path ".agent/skills/browser-test" -ItemType Directory -Force; Invoke-WebRequest -Uri "https://fastmcp.me/Skills/Download/4181" -OutFile "skill.zip"; Expand-Archive -Path "skill.zip" -DestinationPath ".agent/skills/browser-test" -Force; Remove-Item "skill.zip"

Global (~/.gemini/antigravity/skills)

mkdir -p ~/.gemini/antigravity/skills/browser-test && curl -L -o skill.zip "https://fastmcp.me/Skills/Download/4181" && unzip -o skill.zip -d ~/.gemini/antigravity/skills/browser-test && rm skill.zip

New-Item -Path "$HOME/.gemini/antigravity/skills/browser-test" -ItemType Directory -Force; Invoke-WebRequest -Uri "https://fastmcp.me/Skills/Download/4181" -OutFile "skill.zip"; Expand-Archive -Path "skill.zip" -DestinationPath "$HOME/.gemini/antigravity/skills/browser-test" -Force; Remove-Item "skill.zip"

| Scope | Location | Suggested Use |
|-------|----------|---------------|
| Workspace | `.agent/skills/` | Workspace-specific skills for project workflows and conventions. |
| Global | `~/.gemini/antigravity/skills/` | Personal skills available across all workspaces. |

Validate a feature works by driving a real browser with Playwright MCP. No test files — just interactive verification.

Productivity Coding


Source: https://github.com/langwatch/langwatch/tree/main/.claude/skills/browser-test

Skill Content

---
name: browser-test
description: "Validate a feature works by driving a real browser with Playwright MCP. No test files — just interactive verification."
user-invocable: true
argument-hint: "[port] [feature description or feature-file-path]"
---

# Browser Test — Interactive Feature Validation

You are the **orchestrator**. You do NOT drive the browser yourself. You spawn a focused sub-agent to do the browser work, monitor its progress, and collect results.

## Step 1: Prepare

Parse `$ARGUMENTS` for:
- **Port** (optional): a number (e.g. `5570`) or `:<port>` format
- **Feature** (optional): a description of what to verify, or a path to a `specs/*.feature` file

If a feature file path is given, **read it now** and extract the scenarios into a concrete checklist. If a plain description is given, use it directly. If neither is provided, use the **default smoke test**: app loads, sign in works, dashboard renders after auth.
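The argument handling described above can be sketched in Python. This is only an illustration: `ParsedArgs`, `parse_arguments`, and `is_feature_file` are hypothetical names, and the sketch assumes the port, when present, is the first whitespace-separated token.

```python
import re
from typing import NamedTuple, Optional


class ParsedArgs(NamedTuple):
    port: Optional[int]
    feature: Optional[str]


def parse_arguments(raw: str) -> ParsedArgs:
    """Split an argument string into an optional port and an optional feature."""
    tokens = raw.split()
    port = None
    # Assumption: a leading token such as "5570" or ":5570" is the port.
    if tokens:
        match = re.fullmatch(r":?(\d{2,5})", tokens[0])
        if match:
            port = int(match.group(1))
            tokens = tokens[1:]
    # Whatever remains is the feature description or feature-file path.
    feature = " ".join(tokens) or None
    return ParsedArgs(port, feature)


def is_feature_file(feature: Optional[str]) -> bool:
    """Heuristic: a single token ending in .feature is a path, not a description."""
    return bool(feature) and " " not in feature and feature.endswith(".feature")
```

When both fields come back as `None`, the default smoke test applies.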

### Resolve the port

1. Explicit port in `$ARGUMENTS` → use it
2. Read `.dev-port` file in the repo root → source it for `APP_PORT`
3. **No port and no `.dev-port`?** → run `scripts/dev-up.sh` and then read the `.dev-port` it creates

```bash
# .dev-port format (written by dev-up.sh):
APP_PORT=5560
BASE_URL=http://localhost:5560
COMPOSE_PROJECT_NAME=langwatch-abcd1234
```
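As a rough sketch, the lookup order above might be implemented like this. The function names are hypothetical, and the parser assumes `.dev-port` contains plain `KEY=value` lines as in the format shown; starting the app with `scripts/dev-up.sh` is left to the caller.

```python
from pathlib import Path
from typing import Optional


def read_dev_port(repo_root: Path) -> Optional[int]:
    """Return APP_PORT from <repo_root>/.dev-port, or None if unavailable."""
    dev_port_file = repo_root / ".dev-port"
    if not dev_port_file.is_file():
        return None
    for line in dev_port_file.read_text().splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key == "APP_PORT" and value.isdigit():
            return int(value)
    return None


def resolve_port(explicit: Optional[int], repo_root: Path) -> Optional[int]:
    """An explicit port wins; otherwise fall back to .dev-port.

    None means neither source exists, so the app still has to be started
    before the file can be read again (not done in this sketch).
    """
    if explicit is not None:
        return explicit
    return read_dev_port(repo_root)
```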

### Resolve the feature

If a feature file was given, read it and turn each scenario into a numbered verification step. Example:

```
Feature file: specs/features/beta-pill.feature
Scenarios:
  1. Navigate to dashboard → verify purple "Beta" badge next to Suites in sidebar
  2. Hover over badge → verify popover appears with beta disclaimer text
  3. Press Tab to focus badge → verify same popover appears via keyboard
```
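A minimal sketch of the first pass over a feature file, assuming standard Gherkin `Scenario:` and `Scenario Outline:` headings. It only numbers the scenario titles; the Given/When/Then steps still have to be read to write each concrete verification step. The function name is hypothetical.

```python
import re
from typing import List

# Matches "Scenario: title" and "Scenario Outline: title" at any indentation.
SCENARIO_RE = re.compile(r"^\s*Scenario(?: Outline)?:\s*(.+?)\s*$")


def scenarios_to_checklist(feature_text: str) -> List[str]:
    """Return one numbered checklist entry per scenario title."""
    titles = []
    for line in feature_text.splitlines():
        match = SCENARIO_RE.match(line)
        if match:
            titles.append(match.group(1))
    return [f"{number}. {title}" for number, title in enumerate(titles, start=1)]
```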

### Create artifact directory

```
browser-tests/<feature-name>/<YYYY-MM-DD>/screenshots/
```

Derive `<feature-name>` from the first of these that is available: feature filename (without extension), slugified description, branch name suffix.
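The naming rule and directory layout could look like the sketch below. The helper names are hypothetical, and "branch name suffix" is assumed here to mean the part of the branch name after the last `/`.

```python
import re
from datetime import date
from pathlib import Path
from typing import Optional


def slugify(text: str) -> str:
    """Lowercase, replace runs of non-alphanumerics with '-', trim dashes."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def feature_name(feature_file: Optional[str], description: Optional[str], branch: str) -> str:
    """Pick the first available source: file name, description, branch."""
    if feature_file:
        return Path(feature_file).stem
    if description:
        return slugify(description)
    # Assumption: the suffix is whatever follows the last '/' in the branch.
    return slugify(branch.rsplit("/", 1)[-1])


def screenshots_dir(name: str, day: date) -> Path:
    """Build browser-tests/<feature-name>/<YYYY-MM-DD>/screenshots."""
    return Path("browser-tests") / name / day.isoformat() / "screenshots"
```

Calling `screenshots_dir(...).mkdir(parents=True, exist_ok=True)` then creates the directory.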

## Step 2: Determine data seeding needs

Before verification, decide what data the feature under test requires. Many features need pre-existing data to be meaningful (e.g., a suites page needs at least one suite with runs, a trace viewer needs traces, an evaluations dashboard requires completed evaluations).

1. **Analyze the verification steps** from Step 1. For each step, ask: "What data must already exist for this to be testable?"
2. **Build a seeding checklist** — the minimal set of entities needed. Examples:
   - Suites page → create one suite with a name and at least one scenario
   - Trace viewer → send at least one trace via the SDK or API
   - Evaluation results → trigger a batch run and wait for results
3. **Prefer seeding through the UI** — navigate to create forms, fill them in, submit. This exercises the same path a user would and is the most reliable approach in dev mode.
4. **Fall back to API/SDK only for bulk data** that would be impractical to create through the UI (e.g., 50 traces for a pagination test).
5. **Keep seeding MINIMAL** — only create what is strictly needed to verify the feature. Do not populate the app with extra data "just in case."

Include the seeding instructions in the sub-agent prompt (Step 3) so the sub-agent creates the data before verifying.

## Step 3: Spawn the browser agent

Use the **Agent tool** to spawn a sub-agent. Give it everything it needs in the prompt — port, verification steps, credentials, artifact path. The sub-agent has access to Playwright MCP tools and Bash.

**Critical:** The sub-agent prompt must include ALL of the following. Do not assume it knows anything — it starts with zero context:

```
You are a browser test agent. Your ONLY job is to drive a browser and verify features.

## Your mission
<paste the numbered verification steps here>

## Data seeding
Before verifying, create the minimal data the feature needs. Follow the checklist below.
Prefer seeding through the UI; use API/SDK only when the checklist explicitly calls for it:
<paste the seeding checklist from Step 2 here — e.g.:>
- Navigate to Suites → click "Create Suite" → fill name "Test Suite" → save
- Open the suite → add a scenario → run it once
- Wait for the run to complete before proceeding to verification

Only create what is listed above. Do not add extra data beyond what is needed.

## Connection
- App URL: http://localhost:<port>
- Browser: Chromium (headless) — use Playwright MCP tools
- Save screenshots to: <absolute artifact path>/screenshots/

## Auth (NextAuth credentials form, NOT Auth0)
- Navigate to the app → redirects to /auth/signin (Email + Password form)
- Email: browser-test@langwatch.ai
- Password: BrowserTest123!
- If "Register new account" is needed, register first with the same credentials
- Org name if onboarding: Browser Test Org
- After auth: dashboard shows "Hello, Browser" + "Browser Test Org" header

## How to interact
- Use browser_snapshot (accessibility tree) for finding elements — it's faster than screenshots
- Use browser_take_screenshot to capture evidence at each key step
- Use browser_wait_for with generous timeouts (60-120s for first page loads, dev mode is slow)
- Number screenshots sequentially: 01-sign-in.png, 02-dashboard.png, etc.

## Guardrails — READ THESE
- You have a maximum of 40 tool calls (seeding + verification). If you haven't finished, report what you verified and what's left.
- Do NOT debug app issues. If something doesn't work, screenshot it, mark it FAIL, and move on.
- Do NOT modify any files, fix any code, or investigate root causes.
- Do NOT go off-script. Only verify the steps listed above.
- If a step fails, take a screenshot, record FAIL, and continue to the next step.
- When done, return a markdown summary table: | # | Step | Result | Screenshot |
```
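Because the sub-agent starts with zero context, it may help to check that no template placeholder survives in the final prompt. The sketch below is one hypothetical way to do that; it assumes placeholders are lowercase text in angle brackets, as in the template above, and would also flag any legitimate `<...>` text in the pasted steps.

```python
import re
from typing import List


def unfilled_placeholders(prompt: str) -> List[str]:
    """Return every <placeholder> still present in the prompt text."""
    return re.findall(r"<[a-z][^<>\n]*>", prompt)


template = (
    "- App URL: http://localhost:<port>\n"
    "- Save screenshots to: <absolute artifact path>/screenshots/"
)
# Fill the placeholders with example values (both values are illustrative).
filled = template.replace("<port>", "5560").replace("<absolute artifact path>", "/tmp/bt")
```

An empty list from `unfilled_placeholders(filled)` suggests the prompt is ready to hand to the sub-agent.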

## Step 4: Collect results

When the sub-agent returns:
1. Parse its summary table
2. Write the report to `browser-tests/<feature-name>/<YYYY-MM-DD>/report.md`:

```markdown
# Browser Test: <feature-name>
**Date:** YYYY-MM-DD
**App:** http://localhost:<port>
**Browser:** Chromium (headless)
**Branch:** <current branch>
**PR:** #<number> (if known)

## Results

| # | Scenario | Result | Screenshot |
|---|----------|--------|------------|
| 1 | <name>   | PASS   | screenshots/01-xxx.png |

## Failures (if any)
- **Scenario 2:** Expected X but saw Y.

## Notes
<any observations>
```

3. If you started the app (no `.dev-port` existed before), tear it down: `scripts/dev-down.sh`
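Parsing the returned summary table can be sketched as follows. The function names are hypothetical, and the parser assumes a simple pipe table whose cells contain no literal `|` characters.

```python
from typing import Dict, List


def parse_summary_table(markdown: str) -> List[Dict[str, str]]:
    """Parse a pipe table into one dict per data row, keyed by header cell."""
    rows = []
    for line in markdown.splitlines():
        line = line.strip()
        if not (line.startswith("|") and line.endswith("|")):
            continue
        rows.append([cell.strip() for cell in line[1:-1].split("|")])
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    # Drop the |---|---| separator row: every cell is non-empty dashes/colons.
    body = [r for r in body if not all(c and set(c) <= set("-: ") for c in r)]
    return [dict(zip(header, r)) for r in body]


def failed_steps(rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return the rows whose Result column is FAIL (case-insensitive)."""
    return [r for r in rows if r.get("Result", "").upper() == "FAIL"]
```

The failed rows feed the "Failures" section of `report.md`.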

## Step 5: Upload screenshots and update the PR

Screenshots are uploaded to **img402.dev** (free, no auth) instead of being committed to git. This avoids binary bloat in the repo.

1. **Upload each screenshot** to img402.dev:
   ```bash
   curl -s -F "image=@browser-tests/<feature>/<date>/screenshots/01-xxx.png" https://img402.dev/api/free
   # Returns: {"url":"https://i.img402.dev/abc123.jpg", ...}
   ```
   Collect the returned URLs for each screenshot.

2. **Update the PR description** with the results table using img402 URLs so images render inline:

   Read the current PR body first (`gh pr view --json body`), then append a new section:
   ```markdown
   ## Browser Test: <feature-name>

   | # | Scenario | Result | Screenshot |
   |---|----------|--------|------------|
   | 1 | <name> | PASS | ![01](https://i.img402.dev/abc123.jpg) |
   ```

   Use `gh api repos/langwatch/langwatch/pulls/<number> -X PATCH -f body="..."` to update (not `gh pr edit`).

3. **Do NOT commit `browser-tests/`.** It is gitignored. Screenshots are ephemeral local artifacts; the img402 URLs in the PR body are what reviewers see, and they are subject to the retention limit noted in Step 6.
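The upload and PR-body steps above could be scripted roughly as below. This is a sketch under assumptions: `upload_screenshot` shells out to the same `curl` call shown earlier and relies on the `url` key from the example response, while `render_pr_section` and `append_section` are hypothetical helpers that only build text. The actual PATCH is still done with `gh api` as described.

```python
import json
import subprocess
from typing import Dict, List


def upload_screenshot(path: str) -> str:
    """Upload one image via curl and return the URL from the JSON response."""
    result = subprocess.run(
        ["curl", "-s", "-F", f"image=@{path}", "https://img402.dev/api/free"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)["url"]


def render_pr_section(feature: str, rows: List[Dict[str, str]], urls: Dict[str, str]) -> str:
    """Render the results table, embedding an image where a URL is known."""
    lines = [
        f"## Browser Test: {feature}",
        "",
        "| # | Scenario | Result | Screenshot |",
        "|---|----------|--------|------------|",
    ]
    for row in rows:
        number, shot = row["#"], row["Screenshot"]
        url = urls.get(shot)
        # Fall back to the bare file name if the upload produced no URL.
        image = f"![{number.zfill(2)}]({url})" if url else shot
        lines.append(f"| {number} | {row['Scenario']} | {row['Result']} | {image} |")
    return "\n".join(lines)


def append_section(current_body: str, section: str) -> str:
    """Append the section to the existing PR body without discarding it."""
    if not current_body.strip():
        return section + "\n"
    return current_body.rstrip() + "\n\n" + section + "\n"
```

Keeping the text-building functions separate from the network call makes them easy to check without uploading anything.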

## Step 6: Report

Return the summary to the user/orchestrator. Include:
- The results table
- Link to the PR where screenshots are now visible
- Note: the img402.dev free tier has 7-day retention; after that the screenshots expire and render as broken images in the PR body

## Rules

- **You are the orchestrator, not the browser driver.** Spawn a sub-agent for all browser work.
- **Never ask the user for anything.** Ports, credentials, features, browser choice — all resolved automatically.
- **Read `HOW_TO.md`** in this skill directory before your first run — it has gotchas about Chakra UI, dev mode slowness, and known issues. Include relevant warnings in the sub-agent prompt.
- **One sub-agent per run.** If it fails or times out, report the failure — don't retry.
- **Don't create test files.** This is interactive verification only.
