[ Mission ]

Offensive cybersecurity must operate at the speed of modern attacks.

Adversaries already operate with AI. We built HackerSec.ai so defense operates at the same speed, with the technical depth that only human specialists can provide.

[ Who we are ]

Built on a decade of offensive research.

A research and product initiative by HackerSec, a leading name in offensive cybersecurity in the Brazilian market. HackerSec.ai unites three fronts that have evolved in parallel over the past several years.

HackerSec

The organization

Founded in 2011
500+ companies served
Leading name in offensive cybersecurity in Brazil
Pioneer in AI-First methodology
National and international operations

Yaga

The agent

1,000+ exploitation scenarios
98% accuracy across 600 tests
91.2% on exploitation chains
Compatible with 20+ backbone models
Documented in a technical paper

HAS

The platform

HackerSec's proprietary platform
Native integrations (Jira, Slack, MCP)
In production 24/7
Real-time tracking
One-click retest

[ The proprietary methodology ]

Pentest AI-First.

Real artificial intelligence accelerating the process. Human specialists deepening the attack. Four stages to deliver pentests closer to the behavior of real adversaries.

Stage 01

Scope definition

Attack surface and objectives defined. AI and pentesters operate where it matters.

→

Stage 02

AI runs the pentest

Reconnaissance, real exploitation, contextual analysis, and vulnerability identification.

→

Stage 03

Specialized validation

Each finding is checked against technical criteria. Only confirmed vulnerabilities advance.

→

Stage 04

Human deepening

Pentesters explore attack chains, business logic, and complex scenarios.

"The future of pentest is not full automation. It is artificial intelligence accelerating the process and specialists deepening the attack."
Andrew Martinez, CEO of HackerSec
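
The four stages can be read as a simple pipeline in which only confirmed findings reach the human specialists. The sketch below is illustrative only and is not HackerSec's implementation: the `Finding` class, every function name, the example hostnames, and the "reproducible" criterion are hypothetical stand-ins for the stages described above.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    title: str
    confirmed: bool = False  # set during Stage 03
    notes: list = field(default_factory=list)


def define_scope(targets):
    """Stage 01: keep non-empty, de-duplicated targets (hypothetical rule)."""
    return sorted({t.strip() for t in targets if t.strip()})


def ai_run(scope):
    """Stage 02: stand-in for the AI run; one candidate finding per target."""
    return [Finding(title=f"candidate issue on {target}") for target in scope]


def validate(findings, reproducible):
    """Stage 03: only findings that meet the criterion advance."""
    confirmed = []
    for finding in findings:
        if finding.title in reproducible:
            finding.confirmed = True
            confirmed.append(finding)
    return confirmed


def human_deepening(findings):
    """Stage 04: pentesters annotate confirmed findings for deeper work."""
    for finding in findings:
        finding.notes.append("pentester review: attack chains and business logic")
    return findings


# Hypothetical run: two targets in scope, one candidate survives validation.
scope = define_scope(["app.example.com", " api.example.com ", ""])
candidates = ai_run(scope)
confirmed = validate(candidates, reproducible={"candidate issue on api.example.com"})
report = human_deepening(confirmed)

for finding in report:
    print(finding.title, finding.notes)
```

The point of the ordering is that Stage 04 never spends specialist time on a finding that Stage 03 did not confirm.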

[ The agent ]

Yaga

Autonomous pentest agent developed internally by HackerSec. Exploitation-first architecture across four layers: Intelligence Layer, Execution Layer, Chain Engine, and Memory System.

Powered by specialized agents covering more than 1,000 exploitation scenarios: injection, authentication, business logic, privilege escalation, lateral movement, and multi-vector chains. Toolchains with more than 140 tools support execution.

Compatible with more than 20 backbone models, including frontier models (GPT-5.5, GPT-5, Claude Opus 4.7/4.6, Qwen 72B, Llama 3.2) and custom models trained by HackerSec. Operates against web applications, APIs, networks, cloud, mobile, IoT, and AI/LLM systems, in white-box, gray-box, and black-box modes.

01

Reconnaissance and navigation. Maps the attack surface, navigates the application as a user, and enumerates endpoints, parameters, and authentication flows.

02

Architecture comprehension and analysis. Identifies technology stack, design patterns, and critical entry points. Extracts relevant inputs for testing (tokens, headers, payloads, and internal states).

03

Real exploitation. Executes exploits within the authorized scope, tests business logic, and chains vulnerabilities to validate operational impact.

04

Contextual refinement. Adapts execution to the environment, prioritizes what matters, and discards inconsistent signals before human validation.

[ Methodological positioning ]

Technical depth at machine speed.

Operational differences among the three approaches to vulnerability identification available on the market today.

| Criterion | Automated scan | Traditional pentest | HackerSec.ai |
|---|---|---|---|
| Surface coverage | Known vulns | Sample-based | Full |
| Execution time | Minutes | 2 to 4 weeks | Hours |
| Real exploitation | Known signals | | Yes |
| Business logic | n/a | | Yes |
| Human validation | n/a | | Yes |
| False positives delivered | High | Low | Zero |
| Scalability | High | Limited | High |

Comparison based on methodologies operated in production environments, under defined scope. Human validation performed by HackerSec specialist pentesters.

[ In production ]

Yaga, by the numbers.

92.3%

Overall accuracy across 124 academic benchmark scenarios (black-box, gray-box, and white-box) with Claude Opus 4.8.

91.2%

Accuracy on multi-stage exploitation chains with Claude Opus 4.8: RCE, chained SSRF, privilege escalation, and lateral movement.

3.2%

False positive rate for the agent operating in isolation in black-box scenarios. Subsequent human validation brings delivered findings to zero false positives.

[ Comparison with published autonomous systems ]

Yaga vs. publicly evaluated agents.

We evaluated Yaga across 600 scenarios from our proprietary benchmark, covering OWASP Top 10 for Web, API, LLM, and Mobile, plus GOAD infrastructure (Active Directory).

| Metric | PentAGI | Strix | Shannon | Yaga |
|---|---|---|---|---|
| Overall detection accuracy | ~67% | ~70% | ~72% | 98.0% |
| Exploitation chain (weighted average) | 52.0% | 55.4% | 58.1% | 91.2% |
| Flag capture (CTF, 120 scenarios) | 70.0% | 72.5% | 74.2% | 96.3% |
| Authentication bypass (85 scenarios) | ~60% | ~67% | ~65% | 94.7% |
| False positive rate (agent in isolation) | ~16% | ~13% | ~14% | 2.0% |

Evaluation on HackerSec's proprietary benchmark (600 scenarios covering OWASP Top 10 for Web, API, LLM, and Mobile, plus GOAD infrastructure). Compared systems: PentAGI (github.com/vxcontrol/pentagi), Strix (github.com/usestrix/strix), Shannon (github.com/KeygraphHQ/shannon).

These metrics apply to the agent operating in isolation. In production, Yaga's findings undergo specialized human validation before any client delivery, which eliminates the residual 2% of false positives.
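
The relationship between the agent-in-isolation figure and the delivered figure can be illustrated with a small calculation. This is a sketch under stated assumptions: it treats the false positive rate as the share of reported findings that turn out not to be real, a definition this page does not spell out, and the counts are hypothetical values chosen only to reproduce the 2.0% figure.

```python
def false_positive_rate(reported: int, false_positives: int) -> float:
    """Share of reported findings that are not real, as a percentage.

    Assumed definition for illustration; the benchmark may define it differently.
    """
    if reported <= 0:
        raise ValueError("reported must be positive")
    if not 0 <= false_positives <= reported:
        raise ValueError("false_positives must be between 0 and reported")
    return 100.0 * false_positives / reported


# Hypothetical counts for the agent operating in isolation.
agent_reported = 500
agent_false_positives = 10
agent_rate = false_positive_rate(agent_reported, agent_false_positives)

# Human validation drops every unconfirmed finding before delivery.
delivered = agent_reported - agent_false_positives
delivered_rate = false_positive_rate(delivered, 0)

print(f"agent in isolation: {agent_rate:.1f}%")
print(f"after human validation: {delivered_rate:.1f}% across {delivered} findings")
```

Under these assumptions, validation shortens the delivered list but does not change what the agent found; it only removes the findings that could not be confirmed.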

[ Benchmark on current models ]

How Yaga performs on frontier models.

We evaluated Yaga on the leading frontier language models, across the three pentest modes (white-box, gray-box, and black-box), over the 124 scenarios of the academic benchmark, 40 of which require multi-stage exploitation chains.

| Backbone model | White-box | Gray-box | Black-box | Average time | Specialty |
|---|---|---|---|---|---|
| Claude Opus 4.8 | 96.1% | 93.8% | 87.4% | 18.3 min | Top in complex chains. Best calibration under uncertainty. |
| GPT-5.5 | 94.2% | 89.7% | 83.1% | 22.7 min | Strong on short chains. Loses traction on 5+ stages. |
| Claude Opus 4.7 | 91.8% | 86.1% | 80.2% | 23.5 min | Best in information-rich white-box contexts. |
| Claude Opus 4.6 | 91.3% | 87.5% | 79.8% | 24.1 min | Superior uncertainty calibration in gray-box. |
| Claude Sonnet 4.6 | 85.5% | 78.9% | 71.4% | 31.2 min | Cost-efficient. Degrades on long chains. |

Models evaluated under strictly identical conditions: same Yaga framework version, same tool inventory, same prompt templates, and same targets. Average time per benchmark scenario.
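
One way to read the results above is to look at how much accuracy each model loses when it goes from full visibility (white-box) to none (black-box). The sketch below copies the percentages from the table; the `visibility_gap` helper and the idea of ranking by this gap are our own illustration, not a metric from the benchmark itself.

```python
# Accuracy (%) per mode as (white-box, gray-box, black-box), copied from the table.
results = {
    "Claude Opus 4.8": (96.1, 93.8, 87.4),
    "GPT-5.5": (94.2, 89.7, 83.1),
    "Claude Opus 4.7": (91.8, 86.1, 80.2),
    "Claude Opus 4.6": (91.3, 87.5, 79.8),
    "Claude Sonnet 4.6": (85.5, 78.9, 71.4),
}


def visibility_gap(white_box: float, black_box: float) -> float:
    """Percentage points lost between white-box and black-box modes."""
    return round(white_box - black_box, 1)


gaps = {
    model: visibility_gap(white_box, black_box)
    for model, (white_box, _gray_box, black_box) in results.items()
}

# Smallest gap first: the model least affected by losing visibility.
for model, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{model}: {gap} points")
```

On these figures, Claude Opus 4.8 shows the smallest gap (8.7 points) and Claude Sonnet 4.6 the largest (14.1 points).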

[ The platform ]

HAS. Where everything integrates.

HackerSec's proprietary platform, where Yaga operates and human pentesters validate. Clients track progress in real time, manage fixes, request retests, and integrate with the tools their teams already use.

Request in minutes

Define the scope and trigger the pentest directly on the platform.

Real-time tracking

Vulnerabilities appear as they are identified.

Complete findings

CVSS, exploitation evidence, and remediation recommendations.

One-click retest

The client fixes the issue, requests a review, and validates closure.

Workflow integrations

Native integrations with Jira, Slack, Teams, GitHub, and ServiceNow.

MCP connection

Client AI agents query security data directly.

[ Collaboration ]

Built to grow with the ecosystem.

The HackerSec.ai initiative is open to collaboration with cybersecurity companies partnered with HackerSec. Partners operating in the same space get early access to new HAS platform modules, propose evaluation scenarios for Yaga, and evolve alongside the Pentest AI-First methodology.

Talk about partnership

Building the next generation of offensive cybersecurity.
