CreatmanCEO/webtest-orch

Other · Live in production

Token-efficient e2e orchestration skill for Claude Code: explore once, replay deterministically. Playwright + axe-core + run-diff. Tests stay in your repo. MIT.

★ 3 · ⑂ 0 · Python · Push 1mo ago · Listed 11d ago · MIT

accessibility · ai-coding-assistant · ai-tools · anthropic · axe-core · claude-code · claude-skill · developer-tools

Python 79.9%
Go Template 12.4%
JavaScript 5.2%
Shell 2.5%

1 Review

thejaycampbell · 9d ago

webtest-orch is an unusually well-documented and thoughtfully scoped tool for a young public-beta project. The core idea is strong: use Claude/Playwright MCP for exploratory test creation, then shift repeat verification into deterministic Playwright CLI runs so teams do not burn LLM context on the same browser flows over and over. That positioning is clear in the README, and the maintainer does a good job being honest about what the project is and is not: it is not a managed QA SaaS, not a “self-healing” marketing layer, and not a replacement for engineering judgment. That restraint makes the project more credible.

The repo has better engineering signals than I expected for something with only 3 stars and 13 commits. The structure is clean, with focused Python scripts for state detection, suite running, bug fingerprinting, report generation, console triage, visual diff handling, preflight checks, and server orchestration. The test layout mirrors those scripts, and CI is meaningful: Ruff, mypy, pytest with coverage, a Linux/macOS/Windows matrix, plus a Linux E2E smoke flow against a demo app. The npm package is published as 0.3.2-beta, and the package metadata, MIT license, contributing guide, code of conduct, bilingual README, examples, and reference docs all make the project feel adoptable rather than experimental-only.

The main improvement I would suggest is reducing the README’s cognitive load for first-time users. It contains a lot of market positioning, benchmark discussion, competitor comparison, citations, and architectural rationale before a new user has necessarily succeeded once. That material is valuable, but the project would likely convert more users if the README opened with a shorter “install, run, inspect report” path and moved more of the argument into reference/ or a design note. I would also like to see a small recorded demo, screenshots of the generated report, or a sample reports/ artifact linked from the README, because the output is central to trusting the workflow. Finally, since this is a Claude Code skill, a compatibility section covering expected Claude Code versions, MCP setup failure modes, and Windows shell edge cases would help users debug installation faster.

Overall, this is a practical, opinionated developer tool with a clear niche: teams already using Claude Code who want repeatable Playwright-based web testing without turning every regression check into another expensive agent session. The repo is young and community adoption is still minimal, but the implementation discipline, CI coverage, examples, and documentation depth are strong foundations.