📄 skill · Active · Promising · critical

BoxPwnr

Name: BoxPwnr
Author: 0ca

A modular framework for benchmarking LLMs and agentic strategies on security challenges across HackTheBox, TryHackMe, PortSwigger Labs, Cybench, picoCTF and more.

Supported Platforms

🤖 Claude Code, 🧠 Codex

Stars

405

Skill Type

📋 Runbooks

Quality Score

87/200

License

AGPL-3.0

Forks

Last Updated

Jul 24, 2026

Discovered

Apr 22, 2026

Validation

Passed

github.com/0ca/BoxPwnr

Quality Breakdown

87/200

Content Signals

○ Gotchas/Edge Cases: +40

✓ Progressive Disclosure: +30

○ Trigger Description: +20

○ Verification/Safety: +20

✓ Code Examples: +15

✓ Composability: +15

Repo Health

✓ Recent Activity: +15

✓ Scripts/Automation: +10

○ Real Usage (Issues): +10

✓ Single Responsibility: +10

✓ Config/Persistence: +10

✓ Install Instructions: +5

Multi-platform bonus: +5 pts if tool supports 2+ platforms. Score derived from 12 structural signals — not stars or popularity.
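As a rough check on the scale, the twelve point values listed under Content Signals and Repo Health can be summed to see where the 200-point ceiling comes from. The sketch below copies those values verbatim. The names `SIGNAL_POINTS` and `max_score` are illustrative and do not come from the site. The page does not say how the displayed 87 is derived from the individual signals, so only the maximum is computed here, and the +5 multi-platform bonus is left out.

```python
# Point values copied from the Content Signals and Repo Health lists.
# SIGNAL_POINTS and max_score are hypothetical names used for illustration.
SIGNAL_POINTS = {
    # Content Signals
    "Gotchas/Edge Cases": 40,
    "Progressive Disclosure": 30,
    "Trigger Description": 20,
    "Verification/Safety": 20,
    "Code Examples": 15,
    "Composability": 15,
    # Repo Health
    "Recent Activity": 15,
    "Scripts/Automation": 10,
    "Real Usage (Issues)": 10,
    "Single Responsibility": 10,
    "Config/Persistence": 10,
    "Install Instructions": 5,
}


def max_score(points: dict[str, int]) -> int:
    """Return the highest total reachable if every signal is satisfied."""
    return sum(points.values())


print(len(SIGNAL_POINTS), max_score(SIGNAL_POINTS))  # prints: 12 200
```

The total of 200 matches the denominator shown in the Quality Score.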

Trust & Verification

critical

Manual security review required. Use with extreme caution.

Active

Updated within the last 90 days. Actively maintained.
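The 90-day rule behind the Active badge can be expressed as a simple date comparison. This is a minimal sketch, not the site's actual logic: the function name `is_active`, the inclusive boundary, and the example value for `today` are all assumptions.

```python
from datetime import date, timedelta

# Window taken from the badge text; treating the boundary as inclusive is an assumption.
ACTIVE_WINDOW = timedelta(days=90)


def is_active(last_updated: date, today: date) -> bool:
    """Return True if last_updated falls within the 90 days up to and including today."""
    age = today - last_updated
    return timedelta(0) <= age <= ACTIVE_WINDOW


# "Last Updated" from the listing, checked against a hypothetical current date.
print(is_active(date(2026, 7, 24), date(2026, 8, 1)))  # prints: True
```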

Unverified skill. Always review source code before installing any skill from an unknown author.

Risk Assessment

Contains multi-agent orchestration system (.claude/agents directory with implement-new-platform.md)
Autonomous LLM-based executor framework with multiple solver types (single_loop, claude_code, hacksynth, external)
SSH executor capability (src/boxpwnr/executors/ssh/) enabling remote command execution
Docker executor capability (src/boxpwnr/executors/docker/) for containerized command execution
PTY manager for interactive terminal session handling without explicit user approval gates
LLM orchestrator core (src/boxpwnr/core/orchestrator.py) coordinating autonomous agent behavior
