Security Testing Is a Logistics Problem
Practical insights for security teams filling vendor gaps with custom tooling and automation
I’ve talked to a lot of folks whose job involves some form of security testing (penetration testers, adversary simulation teams, AppSec engineers, and red teamers), and an interesting pattern emerged: they’re almost always building their own tools.
Not because they want to. Not many people are excited to maintain a pile of internal scripts and duct-taped workflows with no support plan or dedicated budget.
Sometimes they are dealing with niche use cases or oddball software that no vendor’s research team has ever touched. Sometimes their vendor does support it, but the detection is just incredibly slow. One security engineer mentioned that their attack surface management tool takes over ten days to register asset changes. Ten days. This delay isn’t acceptable when there’s an active exploit in the wild, and they need to know what’s exposed right now. So they built their own discovery workflows to fill this gap.
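To make that gap concrete, here’s a minimal sketch of the kind of stopgap discovery check a team in that position might throw together while waiting on the vendor. The host file, port list, and timeout below are hypothetical placeholders, not a description of anyone’s actual workflow.

```python
#!/usr/bin/env python3
"""Minimal stopgap exposure check: resolve hosts and probe a few ports.

A sketch only -- the input file, port list, and timeout are hypothetical
placeholders, not a description of any team's real workflow.
"""
import socket
import sys

PORTS = [22, 80, 443, 8080, 8443]   # ports worth knowing about right now
TIMEOUT = 2                          # seconds per connection attempt


def probe(host: str, port: int) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=TIMEOUT):
            return True
    except OSError:
        return False


def main(path: str) -> None:
    for line in open(path):
        host = line.strip()
        if not host:
            continue
        try:
            addr = socket.gethostbyname(host)   # does the asset even resolve?
        except socket.gaierror:
            continue
        open_ports = [p for p in PORTS if probe(addr, p)]
        if open_ports:
            print(f"{host} ({addr}): listening on {open_ports}")


if __name__ == "__main__":
    main(sys.argv[1] if len(sys.argv) > 1 else "hosts.txt")
```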
Other times, the detection is just unreliable. Vendor scanners are still software, and software has bugs: scans fail silently, never finish, or return clean results when the security team has hard proof that something’s vulnerable.
And yes, some people genuinely enjoy building their own stuff. One security engineer put it perfectly:
“It’s nice to build a legacy. It just feels good to deliver something that actually sticks around and has a lasting impact, not like a vulnerability report that gets fixed and forgotten.”
But even they will admit it’s not ideal. The first problem is focus: a lot of time and energy goes into things that aren’t really security work (infrastructure maintenance and DevOps workflows). It’s not the kind of work they’re interested in or best at, but they power through it for a while. The second problem is continuity. One or two key engineers move on, and things start to fall apart. The system wasn’t built for anyone else to step in; a small group of highly skilled and motivated people was holding everything together. They made it work, but when they leave, big chunks of coverage leave with them.
TL;DR
Before we dive into the details, here are five must‑have capabilities I heard over and over for any system that has a chance of fixing this problem:
- The system should let you drop in your own updates the minute you understand the bug or attack tactic; no feature request, and no vendor dependency.
- The system should come with solid, built-in coverage. Writing your own checks and tools should be the exception, not the default.
- The system should handle everything end-to-end: two-way integrations with asset inventories and vulnerability management systems, scalable and observable scan executions, reusable building blocks, and reliable infrastructure that can be allowlisted.
- The system should support rapid iteration and be as consistent as possible with the local development environment. If it works locally, it works in production.
- The system should make it easy to promote one-off logic and PoCs into scheduled, repeatable jobs with little to no rework.
In the sections that follow, we’ll unpack why each of these matters so you can adapt them to your checklist, whether you’re building something in‑house or evaluating a third‑party solution.
Most Vendor Scanners Weren’t Built for Custom Security Checks
Requirement 1: The system should let you drop in your own updates the minute you understand the bug or attack tactic; no feature request, and no vendor dependency.
A lot of scanners were designed to do one thing well: deliver broad coverage for common vulnerabilities, at scale, with stability. They were not built to support a team that's simulating real-world attackers and doing R&D-style security work: writing custom checks, adapting tactics to their infrastructure, targeting niche software, etc. And that’s not a knock, it just wasn’t the use case they were built for.
You start to realize this model isn’t built for your team when you're waiting on a vendor to ship coverage for a bug you already understand, or buying more tools just to fill in the gaps.
Switching vendors doesn’t fix it. You just swap one set of “80% coverage” for another. Sometimes you get a different 80%, but you still end up missing stuff.
A better setup looks very different. You take the vendor tool for what it is (broad, stable coverage) and then complement it. Your team writes a check in the morning, tests it locally against a slice of your attack surface, learns from the results, tweaks it, and rolls it out by day’s end.
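As a rough illustration of how small a same-day check can be, here’s a sketch in Python. The endpoint path and indicator string are invented for the example, not a real vulnerability:

```python
"""Sketch of a same-day custom check: probe a hypothetical debug endpoint.

The path, indicator string, and "vulnerable" condition are invented for
illustration -- the point is how little code a targeted check needs.
"""
import sys

import requests


def check(base_url: str) -> bool:
    """Return True if the target exposes the (hypothetical) debug endpoint."""
    try:
        resp = requests.get(f"{base_url}/debug/config", timeout=5)
    except requests.RequestException:
        return False
    # Flag on a strong indicator, not just a 200, to keep false positives down.
    return resp.status_code == 200 and "internal_api_key" in resp.text


if __name__ == "__main__":
    for target in sys.argv[1:]:
        print(target, "VULNERABLE" if check(target) else "ok")
```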
You Don’t Need to Build Every Security Tool Yourself
Requirement 2: The system should come with solid, built-in coverage. Writing your own checks and tools should be the exception, not the default.
Once teams realize commercial scanners aren’t built for their kind of work, they sometimes swing too far in the other direction. They start building everything from scratch: custom orchestrators, schedulers, asset pipelines, dashboards, and writing all their own vulnerability checks too.
Then they burn out. Because the actual detection logic, the part that matters, gets buried under tons of infrastructure work.
You don’t need to build your own scanner from scratch to catch the stuff others miss. Most vulnerabilities are already well covered by open-source and commercial tools. Your time is better spent identifying the handful of things only your team knows to look for (those one-off misconfigs, niche bugs, weird edge cases in your environment, etc.).
A good rule of thumb: aim for 80% coverage from commercial tools, 15% from trustworthy open-source scanners, and embrace that the last 5% will be custom checks that only your team would think to write.
Custom Security Checks Need a Proper Development Environment
Requirement 3: The system should handle everything end-to-end: two-way integrations with asset inventories and vulnerability management systems, scalable and observable scan executions, reusable building blocks, and reliable infrastructure that can be allowlisted.
A lot of security engineers can write or review a PoC. That’s the fun part. What slows things down is everything around it: figuring out where to pull fresh asset data from, how to run the check/tactic at scale, how to route the output so other teams can actually use it, and making sure the SOC doesn’t flag the whole thing as suspicious.
You need a system that takes care of all the overhead so you can focus on the tactic itself. That means clean integrations with your other systems, infrastructure that scales up or down as needed, vetted codified tactics and shared utilities, audit logs that show who ran what, when, and how, and predictable egress IP addresses that defenders can allowlist.
For example, something like this
execute_check --check ./path/to/local_script.py --input asm_web_servers
…should just work. The system should know how to translate the input tag asm_web_servers into a complete, up-to-date asset list via an integration with your asset inventory. It should run the script on secure, scalable, and approved infrastructure, push the results into a dashboard you can view and share, and log everything for traceability.
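Here’s a rough sketch of what a wrapper like that could do behind the scenes. The inventory URL, tag lookup, and sequential loop are stand-ins for the integrations, fan-out, dashboards, and audit logging a real platform would provide:

```python
"""Sketch of what an execute_check-style wrapper could do under the hood.

Everything here is hypothetical: the inventory URL, the tag lookup, and the
sequential loop are stand-ins for the integrations, fan-out, dashboards,
and audit logging a real platform would provide.
"""
import subprocess

import requests

INVENTORY_API = "https://inventory.internal/api/assets"   # placeholder URL


def resolve_tag(tag: str) -> list[str]:
    """Translate an input tag (e.g. 'asm_web_servers') into a fresh asset list."""
    resp = requests.get(INVENTORY_API, params={"tag": tag}, timeout=30)
    resp.raise_for_status()
    return [asset["hostname"] for asset in resp.json()]


def execute_check(script: str, input_tag: str) -> None:
    for target in resolve_tag(input_tag):
        # A real system would fan this out across scalable, allowlisted workers
        # and push results to a dashboard and audit log; printing stands in here.
        result = subprocess.run(
            ["python3", script, target], capture_output=True, text=True
        )
        print(target, result.returncode, result.stdout.strip())


if __name__ == "__main__":
    execute_check("./path/to/local_script.py", "asm_web_servers")
```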
Security Checks Should Run the Same in Dev and Prod
Requirement 4: The system should support rapid iteration and be as consistent as possible with the local development environment. If it works in dev, it works in prod.
Everything starts locally. A Python script here, a Nuclei template there, and a couple of open-source tools stitched together in a shell script. That’s how experimentation happens. But things get messy when production doesn’t look anything like this dev environment.
Maybe prod runs in some custom orchestrator with five layers of config and minimal docs. Or the entire thing needs to be rewritten to match a bespoke format. Or you need to go through a DevOps team just to get your scan running at scale.
In software engineering, this is a well-known consideration: dev/prod parity. The Twelve-Factor App methodology calls this out explicitly: “Keep development, staging, and production as similar as possible.”
A better system doesn't punish you for starting small. Dev and prod should look very similar, just with more horsepower. Your local Python script or Nuclei template shouldn't need to be rewritten just to run at scale. You shouldn't need a DevOps ticket to ship what worked in your shell.
That means production supports all the formats teams already use: Python scripts, shell scripts, Nuclei templates, open-source scanners, and in-house tools packaged as Docker images. They should run consistently in prod, just like they did in dev.
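One way to picture that is a format-agnostic check definition that dev and prod both understand. The CheckSpec fields and the dispatch below are illustrative assumptions, not an existing API:

```python
"""Sketch of a format-agnostic check definition shared by dev and prod.

The CheckSpec fields and the run() dispatch are illustrative assumptions,
not an existing API: the idea is that the artifact you tested locally
(script, template, or image) is exactly what production executes.
"""
from dataclasses import dataclass
import subprocess


@dataclass
class CheckSpec:
    name: str
    kind: str         # "python" | "shell" | "nuclei" | "docker"
    artifact: str     # path to a script/template, or a Docker image reference
    targets_tag: str  # resolved against the asset inventory at run time


def run(spec: CheckSpec, target: str) -> int:
    """Execute one check against one target; prod would fan this out at scale."""
    if spec.kind == "python":
        cmd = ["python3", spec.artifact, target]
    elif spec.kind == "shell":
        cmd = ["bash", spec.artifact, target]
    elif spec.kind == "nuclei":
        cmd = ["nuclei", "-t", spec.artifact, "-u", target]
    elif spec.kind == "docker":
        cmd = ["docker", "run", "--rm", spec.artifact, target]
    else:
        raise ValueError(f"unknown check kind: {spec.kind}")
    return subprocess.run(cmd).returncode


if __name__ == "__main__":
    spec = CheckSpec("debug-endpoint", "python", "./check.py", "asm_web_servers")
    run(spec, "https://app.example.com")
```

The artifact you tested locally is exactly what production executes; only the fan-out and the amount of horsepower change.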
Proof-of-Concepts Should Graduate to Repeatable Workflows
Requirement 5: The system should make it easy to promote one-off logic and PoCs into scheduled, repeatable jobs with little to no rework.
This is where the last point really pays off. If dev and prod work similarly, then “promotion” isn’t a handoff or a rewrite. It’s just the next step. A script that worked on one asset in dev should scale to thousands in prod, and become a reusable module or a scheduled job that runs continuously.
The right system is self-serve: you go from building to testing to scaling to operationalizing without friction. What starts as a PoC shouldn’t be more than a few clicks away from being a reusable asset that keeps delivering value.
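In practice, promotion can be as thin as attaching metadata (a schedule, a target tag, an owner) to the artifact you already tested. The ScheduledJob shape and register_job helper below are hypothetical:

```python
"""Sketch: promoting a proven check into a scheduled, repeatable job.

The ScheduledJob fields, register_job() helper, and cron-style schedule are
hypothetical; the point is that promotion is metadata, not a rewrite.
"""
from dataclasses import dataclass


@dataclass
class ScheduledJob:
    check_artifact: str   # the same script/template that ran in dev
    targets_tag: str      # resolved via the asset inventory at run time
    schedule: str         # cron-style expression evaluated by the platform
    owner: str            # who gets paged when results look off


def register_job(job: ScheduledJob) -> None:
    # A real platform would persist this and hand it to its scheduler;
    # printing stands in for that side effect here.
    print(f"registered {job.check_artifact} on '{job.schedule}' for {job.targets_tag}")


if __name__ == "__main__":
    register_job(
        ScheduledJob(
            check_artifact="./checks/debug_endpoint.py",
            targets_tag="asm_web_servers",
            schedule="0 */6 * * *",   # every six hours
            owner="offsec-team",
        )
    )
```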
What It Looks Like When It’s Working
The best setups I’ve seen are the ones where the offensive security team (red team, adversary simulation, penetration testers, etc.) owns this process directly. They already know what tactics work, so they just codify them in a way that helps their teammates, their future selves, and the vulnerability management team.
Here’s how it typically plays out:
- Anyone on the team can run a command like poc_test --poc cve-2025-53770.py, which takes a check (Python script, Nuclei template, etc.) as input.
- The poc_test utility runs the check across the asset inventory (or a slice of it) on approved infrastructure that scales horizontally, so results come back in minutes for review.
- The check is tweaked and rerun a few times until it works (fast iteration is everything at this stage).
- Once it proves useful and reliable, it’s committed to a centralized security tests repository. From there it automatically:
  - becomes part of scheduled scans used by the vulnerability management team
  - drops into the offensive security team’s ad-hoc toolkit for engagements
A PoC goes from idea to test to production with one command and one git commit. No waiting on vendors, no rewrites, no ops overhead.
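One hypothetical way to wire that last step up: each check in the repository carries a small metadata sidecar, and a sync step registers it with both the scheduler and the ad-hoc toolkit. The layout and conventions below are assumptions, not a prescribed structure:

```python
"""Sketch of how a centralized security tests repo could feed both destinations.

The directory layout, the JSON sidecar convention, and the two registration
steps are assumptions for illustration only.
"""
import json
from pathlib import Path

CHECKS_DIR = Path("security-tests/checks")   # hypothetical repo layout


def load_checks() -> list[dict]:
    """Each check is a script plus a small JSON sidecar carrying its metadata."""
    checks = []
    for meta_file in CHECKS_DIR.glob("*.json"):
        meta = json.loads(meta_file.read_text())
        meta["artifact"] = str(meta_file.with_suffix(".py"))
        checks.append(meta)
    return checks


def sync() -> None:
    for check in load_checks():
        # 1) register with the scheduler for the vulnerability management team
        print(f"schedule {check['artifact']} at {check.get('schedule', '@daily')}")
        # 2) expose in the offensive security team's ad-hoc toolkit
        print(f"toolkit  {check['artifact']} tagged {check.get('tags', [])}")


if __name__ == "__main__":
    sync()
```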
Conclusion: Fix the Logistics
For many teams, the bottleneck isn’t finding a vulnerability. It’s taking a one-off check/tactic and making it repeatable at scale. That’s a logistics problem. The teams that figure this out don’t switch scanners every year; instead, they focus on building processes that allow their best ideas to flow into production consistently. Maybe that’s where your next win is hiding, too.