Building the next generation of offensive cybersecurity.

HackerSec.ai unifies the Pentest AI-First methodology, the Yaga agent, and the HAS platform to deliver real pentests with autonomous artificial intelligence and specialized human validation.

[ Mission ]

Offensive cybersecurity must operate at the speed of modern attacks.

Adversaries already operate with AI. We built HackerSec.ai so that defense operates at the same speed, with the technical depth that only human specialists can ensure.

[ Who we are ]

Built on a decade of offensive research.

HackerSec.ai is a research and product initiative by HackerSec, a reference in offensive cybersecurity in the Brazilian market. It unites three fronts that have evolved in parallel in recent years.

HackerSec

The organization

  • Founded in 2011
  • 500+ companies served
  • Reference in offensive cybersecurity in Brazil
  • Pioneer in the AI-First methodology
  • National and international operations

Yaga

The agent

  • 1,000+ exploitation scenarios
  • 98% accuracy across 600 tests
  • 91.2% on exploitation chains
  • Compatible with 20+ backbone models
  • Documented in a technical paper

HAS

The platform

  • HackerSec's proprietary platform
  • Native integrations (Jira, Slack, MCP)
  • In production 24/7
  • Real-time tracking
  • One-click retest

[ The proprietary methodology ]

Pentest AI-First.

Real artificial intelligence accelerating the process. Human specialists deepening the attack. Four stages to deliver pentests closer to the behavior of real adversaries.

Stage 01

Scope definition

Attack surface and objectives defined. AI and pentesters operate where it matters.

Stage 02

AI runs the pentest

Reconnaissance, real exploitation, contextual analysis, and vulnerability identification.

Stage 03

Specialized validation

Each finding passes through technical criteria. Only confirmed vulnerabilities advance.

Stage 04

Human deepening

Pentesters explore attack chains, business logic, and complex scenarios.
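The four stages can be read as a simple pipeline. The sketch below is illustrative only: the `Finding` class, its fields, and the function names are hypothetical and are not taken from Yaga or HAS. It models the one rule the methodology states explicitly, that only confirmed findings advance from validation to human deepening.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """Hypothetical finding record; the field names are assumptions."""
    title: str
    in_scope: bool   # Stage 01: the target falls within the agreed scope
    confirmed: bool  # Stage 03: the finding passed specialized validation


def validate(findings: list[Finding]) -> list[Finding]:
    """Stage 03: keep only in-scope findings that were confirmed."""
    return [f for f in findings if f.in_scope and f.confirmed]


def run_pipeline(ai_findings: list[Finding]) -> list[Finding]:
    """Stages 02 to 04 in order: AI output, validation, human deepening queue."""
    confirmed = validate(ai_findings)
    # Stage 04 is manual work; this only returns what pentesters would receive.
    return confirmed


ai_findings = [
    Finding("SQL injection in /login", in_scope=True, confirmed=True),
    Finding("Reflected XSS (unreproducible)", in_scope=True, confirmed=False),
    Finding("Open port on third-party host", in_scope=False, confirmed=True),
]
queue = run_pipeline(ai_findings)
print([f.title for f in queue])
```

In this toy run, only the SQL injection reaches the human deepening queue: the XSS signal was never confirmed, and the third finding lies outside the defined scope.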

"The future of pentest is not full automation. It is artificial intelligence accelerating the process and specialists deepening the attack."

Andrew Martinez, CEO of HackerSec

[ The agent ]

Yaga

Autonomous pentest agent developed internally by HackerSec. Exploitation-first architecture across four layers: Intelligence Layer, Execution Layer, Chain Engine, and Memory System.

Powered by specialized agents covering more than 1,000 exploitation scenarios: injection, authentication, business logic, privilege escalation, lateral movement, and multi-vector chains. Toolchains with 140+ tools complement execution.

Compatible with more than 20 backbone models, including frontier models (GPT-5.5, GPT-5, Claude Opus 4.7/4.6), open-weight models (Qwen 72B, Llama 3.2), and custom models trained by HackerSec. Operates across web, APIs, networks, cloud, mobile, IoT, and AI/LLM systems, in white-box, gray-box, and black-box modes.

01  Reconnaissance and navigation. Maps the attack surface, navigates the application as a user, and enumerates endpoints, parameters, and authentication flows.

02  Architecture comprehension and analysis. Identifies technology stack, design patterns, and critical entry points. Extracts relevant inputs for testing (tokens, headers, payloads, and internal states).

03  Real exploitation. Executes exploits within the authorized scope, tests business logic, and chains vulnerabilities to validate operational impact.

04  Contextual refinement. Adapts execution to the environment, prioritizes what matters, and discards inconsistent signals before human validation.

[ Methodological positioning ]

Technical depth at machine speed.

Operational differences between the three approaches available on the market today for vulnerability identification.

Criterion                   Automated scan   Traditional pentest   HackerSec.ai
Surface coverage            Known vulns      Sample-based          Full
Execution time              Minutes          2 to 4 weeks          Hours
Real exploitation           Known signals    Yes                   Yes
Business logic              n/a              Yes                   Yes
Human validation            n/a              Yes                   Yes
False positives delivered   High             Low                   Zero
Scalability                 High             Limited               High

Comparison based on methodologies operated in production environments, under defined scope. Human validation performed by HackerSec specialist pentesters.

[ In production ]

Yaga, by the numbers.

98%
Detection accuracy across 600 OWASP Top 10 scenarios (Web, API, LLM, Mobile, and AD infrastructure).
91.2%
Accuracy on multi-vulnerability exploitation chains. 33.1 pp above the best publicly evaluated system in our comparison (Shannon, 58.1%).
2%
False positive rate of the agent in isolation. Subsequent human validation brings delivered findings to zero false positives.

[ Comparison with published autonomous systems ]

Yaga vs. publicly evaluated agents.

We evaluated Yaga across 600 scenarios from our proprietary benchmark, covering OWASP Top 10 Web, API, LLM, Mobile, and GOAD infrastructure (Active Directory).

Metric                                     PentAGI   Strix   Shannon   Yaga
Overall detection accuracy                 ~67%      ~70%    ~72%      98.0%
Exploitation chain (weighted average)      52.0%     55.4%   58.1%     91.2%
Flag capture (CTF, 120 scenarios)          70.0%     72.5%   74.2%     96.3%
Authentication bypass (85 scenarios)       ~60%      ~67%    ~65%      94.7%
False positive rate (agent in isolation)   ~16%      ~13%    ~14%      2.0%

Evaluation on HackerSec's proprietary benchmark (600 OWASP Top 10 scenarios in Web, API, LLM, Mobile, and GOAD infrastructure). Compared systems: PentAGI (github.com/vxcontrol/pentagi), Strix (github.com/usestrix/strix), Shannon (github.com/KeygraphHQ/shannon).

Metrics apply to the agent operating in isolation. In production, Yaga's findings undergo specialized human validation before any client delivery, eliminating the residual 2% of false positives.
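A percentage-point (pp) gap is the plain difference between two percentages. As a minimal sketch, the snippet below recomputes the gaps for the exploitation-chain row, using only the figures from the comparison table above. The function name `gap_in_pp` is hypothetical.

```python
# Exploitation-chain accuracy (weighted average, percent), copied from the table above.
chain_accuracy = {"PentAGI": 52.0, "Strix": 55.4, "Shannon": 58.1, "Yaga": 91.2}


def gap_in_pp(scores: dict[str, float], subject: str) -> dict[str, float]:
    """Percentage-point difference between `subject` and every other system."""
    base = scores[subject]
    return {
        name: round(base - value, 1)
        for name, value in scores.items()
        if name != subject
    }


gaps = gap_in_pp(chain_accuracy, "Yaga")
for name, gap in gaps.items():
    print(f"Yaga vs. {name}: +{gap} pp")
```

The same function applies to any other row of the table. For the rows reported with a leading "~", the resulting gap is only as precise as the approximate input.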

[ Benchmark on current models ]

How Yaga performs on frontier models.

We evaluated Yaga on the leading language models available today, across the three pentest modes (white-box, gray-box, and black-box), over the 600 offensive scenarios of our internal benchmark.

Backbone model    White-box   Gray-box   Black-box   Time (600 cases)   Specialty
GPT-5.5           97.8%       97.7%      97.6%       7h 35m             Code analysis, speed
Claude Opus 4.7   97.5%       98.1%      98.4%       8h 09m             Black-box inference, low FP
GPT-5             97.3%       97.3%      97.2%       8h 06m             Balanced general reasoning
Claude Opus 4.6   97.1%       97.7%      98.1%       8h 21m             Stable black-box inference
Qwen 72B          84.2%       82.8%      81.3%       16h 51m            Structured tasks, on-prem
Llama 3.2         21.1%       19.8%      18.3%       20h+               Low-complexity triage

Models evaluated under strictly identical conditions: same Yaga framework version (v2.4.1), same tool inventory, same prompt templates, and same targets.
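To make the rows easier to compare, the totals can be reduced to two derived figures per model: the unweighted mean accuracy across the three modes, and the average time per scenario. The sketch below uses only the numbers from the table above. The unweighted mean is an assumption made here for illustration, since the source does not say how the modes are weighted, and the names `results` and `summarize` are hypothetical.

```python
from datetime import timedelta

CASES = 600  # scenarios in the benchmark, per the table header

# (white-box %, gray-box %, black-box %, total time), copied from the table above.
# Llama 3.2 is omitted because its time is reported only as "20h+".
results = {
    "GPT-5.5": (97.8, 97.7, 97.6, timedelta(hours=7, minutes=35)),
    "Claude Opus 4.7": (97.5, 98.1, 98.4, timedelta(hours=8, minutes=9)),
    "GPT-5": (97.3, 97.3, 97.2, timedelta(hours=8, minutes=6)),
    "Claude Opus 4.6": (97.1, 97.7, 98.1, timedelta(hours=8, minutes=21)),
    "Qwen 72B": (84.2, 82.8, 81.3, timedelta(hours=16, minutes=51)),
}


def summarize(row: tuple[float, float, float, timedelta]) -> tuple[float, float]:
    """Return (unweighted mean accuracy in %, average seconds per scenario)."""
    white, gray, black, total = row
    mean_accuracy = round((white + gray + black) / 3, 1)
    seconds_per_case = round(total.total_seconds() / CASES, 1)
    return mean_accuracy, seconds_per_case


for model, row in results.items():
    mean_accuracy, seconds_per_case = summarize(row)
    print(f"{model}: {mean_accuracy}% mean accuracy, {seconds_per_case} s per scenario")
```

Read this way, the four frontier models all average under a minute per scenario, while Qwen 72B takes roughly twice as long.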

[ The platform ]

HAS. Where everything integrates.

HackerSec's proprietary platform, where Yaga operates and human pentesters validate. Clients track findings in real time, manage fixes, request retests, and integrate with the tools their team already uses.

Request in minutes

Define the scope and trigger the pentest directly on the platform.

Real-time tracking

Vulnerabilities appear as they are identified.

Complete findings

CVSS, exploitation evidence, and remediation recommendations.

One-click retest

The client fixes the issue, requests a review, and validates closure.

Workflow integrations

Native integrations with Jira, Slack, Teams, GitHub, and ServiceNow.

MCP connection

Client AI agents query security data directly.
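As a rough illustration of what a "complete finding" might carry, the sketch below models a finding with a CVSS score, evidence, and a remediation note, and serializes it to JSON as an integration or a querying agent might receive it. The `Finding` class and all of its field names are hypothetical and do not reflect the actual HAS schema. Only the severity bands are standard: they follow the CVSS v3.x qualitative severity rating scale.

```python
import json
from dataclasses import asdict, dataclass


def cvss_severity(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS base scores range from 0.0 to 10.0")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


@dataclass
class Finding:
    """Hypothetical finding record; field names are assumptions, not the HAS schema."""
    title: str
    cvss_score: float
    evidence: str
    remediation: str
    status: str = "open"  # assumed to change after a successful retest


finding = Finding(
    title="SQL injection in /login",
    cvss_score=9.8,
    evidence="Example request/response pair captured during exploitation",
    remediation="Use parameterized queries",
)
payload = {**asdict(finding), "severity": cvss_severity(finding.cvss_score)}
print(json.dumps(payload, indent=2))
```

Deriving the severity label from the score, rather than storing it separately, keeps the two from drifting apart when a score is revised.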

[ In continuous development ]

The next generation of offensive cybersecurity.

Real research, development, and operation. In production every day inside HackerSec.