📄 skill · Active · Promising · critical

BoxPwnr

by 0ca

A modular framework for benchmarking LLMs and agentic strategies on security challenges across HackTheBox, TryHackMe, PortSwigger Labs, Cybench, picoCTF and more.

Supported Platforms

🤖 Claude Code · 🧠 Codex

Stars

405

Skill Type

📋 Runbooks

Quality Score

87/200

License

AGPL-3.0

Forks

48

Last Updated

Jun 11, 2026

Discovered

Apr 22, 2026

Validation

Passed

github.com/0ca/BoxPwnr

Quality Breakdown

87/200

Content Signals

○ Gotchas/Edge Cases +40
✓ Progressive Disclosure +30
○ Trigger Description +20
○ Verification/Safety +20
✓ Code Examples +15
✓ Composability +15

Repo Health

✓ Recent Activity +15
✓ Scripts/Automation +10
○ Real Usage (Issues) +10
✓ Single Responsibility +10
✓ Config/Persistence +10
✓ Install Instructions +5

Multi-platform bonus: +5 pts if tool supports 2+ platforms. Score derived from 12 structural signals — not stars or popularity.
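
As a quick sanity check on the breakdown, the listed weights can be tallied to confirm that the 12 signals add up to the 200-point ceiling. This is a minimal sketch, not the site's scoring code: the dictionary and function names are illustrative, each listed value is assumed to be the maximum for its signal, and the multi-platform bonus is left out because the page does not say how it interacts with the ceiling. The page also does not explain how partial credit is awarded, so the sketch does not try to reproduce the 87.

```python
# Signal weights copied from the Quality Breakdown lists.
# Assumption: each value is the maximum points available for that signal.
CONTENT_SIGNALS = {
    "Gotchas/Edge Cases": 40,
    "Progressive Disclosure": 30,
    "Trigger Description": 20,
    "Verification/Safety": 20,
    "Code Examples": 15,
    "Composability": 15,
}

REPO_HEALTH = {
    "Recent Activity": 15,
    "Scripts/Automation": 10,
    "Real Usage (Issues)": 10,
    "Single Responsibility": 10,
    "Config/Persistence": 10,
    "Install Instructions": 5,
}


def max_points(*groups):
    """Sum the maximum points across any number of signal groups."""
    return sum(sum(group.values()) for group in groups)


signal_count = len(CONTENT_SIGNALS) + len(REPO_HEALTH)
ceiling = max_points(CONTENT_SIGNALS, REPO_HEALTH)

print(f"{signal_count} signals, {ceiling} points maximum")
```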

Trust & Verification

critical

Manual security review required. Use with extreme caution.

Active

Updated within the last 90 days. Actively maintained.

Unverified skill. Always review source code before installing any skill from an unknown author.

Risk Assessment

  • Contains multi-agent orchestration system (.claude/agents directory with implement-new-platform.md)
  • Autonomous LLM-based executor framework with multiple solver types (single_loop, claude_code, hacksynth, external)
  • SSH executor capability (src/boxpwnr/executors/ssh/) enabling remote command execution
  • Docker executor capability (src/boxpwnr/executors/docker/) for containerized command execution
  • PTY manager for interactive terminal session handling without explicit user approval gates
  • LLM orchestrator core (src/boxpwnr/core/orchestrator.py) coordinating autonomous agent behavior