
Web Application Penetration Testing: My Methodology

Uvin Vindula · July 14, 2025 · 12 min read

TL;DR

Web application penetration testing is not about running a scanner and dumping the output into a PDF. It is a structured, four-phase process — reconnaissance, automated scanning, manual testing, and reporting — that requires written authorization before you touch a single endpoint. I use Burp Suite, OWASP ZAP, and Nmap as my core toolkit, but the tools are secondary to the methodology. The scanner finds the low-hanging fruit. Manual testing finds the business logic flaws, chained vulnerabilities, and authentication bypasses that actually compromise an application. This article walks through exactly how I approach every web application security assessment, from scoping the engagement to delivering a report that gets vulnerabilities fixed.


Authorization First — Non-Negotiable

I am going to say this up front, clearly, and I will repeat it throughout this article: you do not touch a system without written authorization.

This is not a suggestion. This is not something you handle with a verbal agreement over a call. Before I run a single Nmap scan, before I open Burp Suite, before I even perform passive reconnaissance beyond what is publicly indexed — I have a signed document that specifies exactly what I am allowed to test, when I am allowed to test it, and what is explicitly out of scope.

The authorization document covers:

  • Scope definition — which domains, subdomains, IP ranges, and API endpoints are in scope
  • Testing window — the dates and times during which active testing is permitted
  • Exclusions — production databases that must not be modified, third-party integrations that are off-limits, specific endpoints that could trigger financial transactions
  • Emergency contacts — who I call if I accidentally cause a service disruption at 2 AM
  • Data handling — how I store, transmit, and eventually destroy any sensitive data I encounter during testing
  • Rules of engagement — whether denial-of-service testing is permitted, whether social engineering is in scope, whether physical access testing is included

I keep a template for this. Every client signs it before the engagement begins. If a client pushes back on the paperwork and says "just start testing, we trust you" — that is exactly when I push harder on getting the document signed. Trust is not a legal defense. The authorization protects the client, protects me, and establishes clear boundaries that make the entire assessment more effective.

I have walked away from engagements where the client refused to define scope clearly. A pentest without clear scope is not a pentest — it is a liability.


Phase 1 — Reconnaissance

Reconnaissance is where I spend the most time relative to its perceived value. Clients often want me to jump straight into "hacking." But the quality of the reconnaissance directly determines the quality of the findings. You cannot test what you do not know exists.

I split recon into passive and active stages.

Passive Reconnaissance

Passive recon does not send a single packet to the target. Everything comes from public sources, cached data, and third-party services.

Subdomain enumeration is the starting point. I use tools like subfinder and amass to pull subdomains from certificate transparency logs, DNS records, and public datasets. A client will tell me their application lives at app.example.com. Recon reveals staging.example.com, api-v2.example.com, admin.example.com, and legacy.example.com — all of which may be running older, unpatched versions of the application.
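
Much of what subfinder and amass do boils down to flattening and deduplicating hostnames from sources like certificate transparency logs. As a minimal sketch — assuming the JSON shape that crt.sh's `?output=json` endpoint returns, and using a hardcoded sample so it runs offline:

```python
import json

# Sample of the JSON shape returned by crt.sh (hardcoded so the sketch runs
# offline; a real run would fetch https://crt.sh/?q=%25.example.com&output=json).
sample = json.dumps([
    {"name_value": "app.example.com\nstaging.example.com"},
    {"name_value": "api-v2.example.com"},
    {"name_value": "app.example.com"},        # duplicates are common
    {"name_value": "*.legacy.example.com"},   # wildcard entries appear too
])

def extract_subdomains(ct_json: str, domain: str) -> list[str]:
    """Flatten certificate-transparency entries into a deduplicated host list."""
    hosts = set()
    for entry in json.loads(ct_json):
        for name in entry["name_value"].splitlines():
            name = name.strip().lstrip("*.")  # drop wildcard prefixes
            if name.endswith(domain):
                hosts.add(name)
    return sorted(hosts)

print(extract_subdomains(sample, "example.com"))
```

The dedicated tools add dozens of additional data sources, but the output of each is ultimately merged the same way before active verification.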

Technology fingerprinting tells me what the application is built with before I interact with it. HTTP response headers, JavaScript library signatures, meta tags, and error page formats all leak information. Knowing the target runs Next.js 14 on Vercel with a PostgreSQL backend changes my entire testing approach compared to a Laravel application on Apache.

OSINT gathering includes searching for exposed credentials in public repositories (using tools like truffleHog against the organization's GitHub), reviewing job postings that reveal internal technology stacks, and checking archive.org for historical versions of the application that may reveal deprecated endpoints still accessible in production.

Google dorking with targeted queries like site:example.com filetype:pdf, site:example.com inurl:admin, or site:example.com ext:env surfaces documents, admin panels, and configuration files that should not be publicly indexed.

Active Reconnaissance

Active recon involves direct interaction with the target — this is where the authorization document becomes essential.

Port scanning with Nmap reveals what services are running beyond the web application itself. A typical scan starts broad:

```bash
nmap -sV -sC -p- --open -oA recon/full-scan target.example.com
```

The -sV flag detects service versions. -sC runs default scripts for additional enumeration. -p- scans all 65,535 ports. I frequently find database ports exposed to the internet, debug interfaces left running on non-standard ports, and administrative services that were never meant to be public.

Directory and file enumeration with gobuster or ffuf maps the application's URL structure beyond what the sitemap and robots.txt reveal. I use wordlists tailored to the application's technology stack — a Next.js app gets a different wordlist than a WordPress site.

API endpoint discovery is critical for modern applications. I look for OpenAPI/Swagger documentation endpoints (/api-docs, /swagger.json, /openapi.yaml), GraphQL introspection queries, and undocumented API versions that may lack the security controls applied to the current version.
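
A GraphQL introspection probe is just a small POST body. A minimal sketch of constructing one — the endpoint URL in the comment is hypothetical:

```python
import json

# Minimal introspection probe: if the endpoint answers this with its type
# list, the full schema can be mapped query by query.
introspection_query = {
    "query": "query { __schema { queryType { name } types { name } } }"
}

payload = json.dumps(introspection_query)
print(payload)
# During an engagement this would be POSTed to the in-scope endpoint, e.g.:
#   curl -s -X POST https://target.example.com/graphql \
#        -H 'Content-Type: application/json' -d "$payload"
```

If introspection is enabled in production, the response enumerates every type and field, including ones the frontend never touches.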

Everything I find goes into a structured recon document. By the end of Phase 1, I have a complete map of the attack surface — every subdomain, every open port, every technology version, every API endpoint. This map drives every decision in the phases that follow.


Phase 2 — Automated Scanning

Automated scanning is valuable but limited. Scanners are excellent at finding known vulnerability patterns — missing security headers, outdated library versions, common misconfigurations, and signature-based vulnerabilities like reflected XSS in query parameters. They are terrible at finding business logic flaws, complex authentication bypasses, and chained vulnerabilities.

I use automated scanning as a baseline, never as a conclusion.

Burp Suite Professional

Burp Suite is the center of my web application testing workflow. I configure the browser to proxy through Burp, manually crawl the application to build a complete sitemap, and then run the active scanner against the discovered endpoints.

The key configuration decisions:

  • Scope control — I restrict the scanner to the authorized target. Burp will happily follow links to third-party domains and scan those too if you let it. That is both a waste of time and potentially illegal.
  • Scan speed — I throttle requests to avoid overwhelming the application. A production environment getting hammered by an unrestricted scanner can degrade performance for real users. I coordinate the testing window with the client and adjust scan intensity accordingly.
  • Authentication handling — I configure Burp's session handling rules to maintain authenticated sessions during scanning. Most interesting vulnerabilities are behind authentication. An unauthenticated scan of a SaaS application misses 90% of the attack surface.

Burp's scanner will flag injection points, cross-site scripting vectors, insecure cookie configurations, information disclosure in error messages, and dozens of other vulnerability classes. Each finding gets triaged — confirmed, false positive, or requires manual verification.

OWASP ZAP

I run OWASP ZAP as a second opinion. Different scanners have different detection engines, and I have found genuine vulnerabilities with ZAP that Burp missed, and vice versa. ZAP's active scan mode is particularly good at detecting certain classes of injection vulnerabilities, and its AJAX Spider handles JavaScript-heavy single-page applications better in some scenarios.

ZAP also generates a useful baseline report that I can compare against Burp's findings. Where both tools flag the same issue, I have higher confidence. Where only one flags it, I investigate manually.

Vulnerability Scanning

Beyond the web application scanners, I run targeted checks based on what Phase 1 revealed:

  • SSL/TLS configuration — testssl.sh or sslyze to check for weak cipher suites, expired certificates, and protocol downgrade vulnerabilities
  • Known CVE checks — if recon identified specific software versions, I check for published vulnerabilities and available exploits
  • Header analysis — verifying the presence and correct configuration of security headers (Content-Security-Policy, Strict-Transport-Security, X-Frame-Options, X-Content-Type-Options, Referrer-Policy, Permissions-Policy)
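
The header checks above are simple enough to script. A minimal offline sketch — the "weak CSP" heuristic here is deliberately crude, checking only for unsafe-inline:

```python
# Given response headers as a dict, report which of the security headers
# listed above are missing or weakly configured.
REQUIRED = [
    "Content-Security-Policy", "Strict-Transport-Security", "X-Frame-Options",
    "X-Content-Type-Options", "Referrer-Policy", "Permissions-Policy",
]

def audit_headers(headers: dict) -> list[str]:
    findings = []
    present = {k.lower(): v for k, v in headers.items()}  # names are case-insensitive
    for name in REQUIRED:
        value = present.get(name.lower())
        if value is None:
            findings.append(f"missing: {name}")
        elif name == "Content-Security-Policy" and "unsafe-inline" in value:
            findings.append("weak: CSP allows unsafe-inline")
    return findings

print(audit_headers({
    "Content-Security-Policy": "script-src 'self' 'unsafe-inline'",
    "X-Content-Type-Options": "nosniff",
}))
```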

The automated phase typically takes a few hours of active scanning time. I review every finding the scanners produce, eliminate false positives, and create a prioritized list of issues to investigate further in Phase 3.


Phase 3 — Manual Testing

This is where the real work happens. Automated scanners are pattern matchers. Manual testing is where I think like an attacker — but an attacker with authorization, clear scope, and a methodology.

I structure manual testing around the OWASP Testing Guide categories, adapted to the specific application.

Authentication Testing

Authentication is the front door. I test:

  • Credential handling — are passwords transmitted over HTTPS? Are they hashed with bcrypt or argon2, or something weaker? Does the application enforce password complexity requirements?
  • Brute force protection — does the login endpoint implement rate limiting or account lockout? I test this carefully, within the authorized scope, to verify that automated credential stuffing would be blocked.
  • Session management — are session tokens sufficiently random? Do they expire? Are they invalidated on logout? Can I fixate a session? Are cookies set with Secure, HttpOnly, and SameSite flags?
  • Multi-factor authentication bypass — if MFA is implemented, can I bypass it by directly accessing post-authentication endpoints? Can I manipulate the MFA verification flow to skip the second factor?
  • Password reset flow — can I enumerate valid email addresses through the reset form? Are reset tokens predictable? Do they expire? Can I intercept or redirect the reset link?
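
One quick signal for token randomness is per-character Shannon entropy across sampled tokens. A rough sketch — the tokens below are illustrative, not from any real application, and low entropy is only a red flag, not proof of predictability:

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Shannon entropy in bits per character of a single string."""
    counts = Counter(s)
    return -sum((c / len(s)) * math.log2(c / len(s)) for c in counts.values())

weak = "session000000001"     # counter-style token (illustrative)
strong = "f3a9c1e07b52d48a"   # hex output of a CSPRNG (illustrative)

print(round(shannon_entropy(weak), 2), round(shannon_entropy(strong), 2))
```

In practice I also compare many tokens side by side: shared prefixes, timestamps, or sequential segments are more telling than any single-token metric.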

Authorization Testing

Authorization failures are the most common critical vulnerability I find. The application checks whether you are logged in but not whether you should have access to what you are requesting.

  • Insecure Direct Object References (IDOR) — changing /api/users/123/profile to /api/users/124/profile and accessing another user's data. I test every endpoint that includes an identifier in the URL, request body, or query parameters.
  • Horizontal privilege escalation — accessing resources that belong to another user at the same privilege level. Can user A read user B's invoices? Can user A modify user B's settings?
  • Vertical privilege escalation — accessing administrative functionality as a regular user. Can I reach /admin/users by navigating directly? Does the API check roles on every request, or only on the frontend?
  • Function-level access control — are API endpoints protected independently of the UI? Removing a button from the frontend does not remove the endpoint.
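
The root cause behind most IDOR findings fits in a few lines. A minimal sketch of the flaw — the data and handlers are illustrative, not any real framework's API:

```python
# The vulnerable handler checks only that *a* user is logged in;
# the fixed one also checks ownership of the requested object.
INVOICES = {101: {"owner": "alice", "total": 420}, 102: {"owner": "bob", "total": 99}}

def get_invoice_vulnerable(session_user, invoice_id: int):
    if session_user is None:
        return 401, None                 # authentication only
    return 200, INVOICES[invoice_id]     # no ownership check

def get_invoice_fixed(session_user, invoice_id: int):
    if session_user is None:
        return 401, None
    invoice = INVOICES.get(invoice_id)
    if invoice is None or invoice["owner"] != session_user:
        return 404, None                 # authorization: verify ownership
    return 200, invoice

# alice requests bob's invoice -- the classic IDOR test case
print(get_invoice_vulnerable("alice", 102))
print(get_invoice_fixed("alice", 102))
```

Returning 404 rather than 403 in the fixed version also avoids confirming that the object exists.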

Injection Testing

I test for injection vulnerabilities that the automated scanners may have missed, particularly in complex input flows:

  • SQL injection — beyond simple parameter injection, I test stored procedures, JSON field queries, search functionality with complex filters, and multi-step forms where input from step 1 is used in a query at step 3
  • Cross-site scripting (XSS) — I test every input field, URL parameter, HTTP header, and file upload for reflected, stored, and DOM-based XSS. Modern frameworks like React and Next.js mitigate many XSS vectors through automatic escaping, but dangerouslySetInnerHTML, URL-based injection, and third-party component vulnerabilities remain common
  • Server-Side Request Forgery (SSRF) — any feature that fetches external URLs (link previews, webhook URLs, image imports, PDF generation) is an SSRF candidate. I test for access to internal services, cloud metadata endpoints (169.254.169.254), and internal network scanning
  • Command injection — less common in modern web applications but still present in applications that interact with the operating system for file processing, image manipulation, or PDF generation
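
The SQL injection case, and its standard remediation, can be shown end to end with an in-memory database. The schema and data here are illustrative:

```python
import sqlite3

# String-built query versus parameterized query, side by side.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT, secret TEXT)")
db.executemany("INSERT INTO users VALUES (?, ?)",
               [("alice", "s3cr3t"), ("bob", "hunter2")])

payload = "' OR '1'='1"   # classic injection payload in a search field

# Vulnerable: user input concatenated into the query returns every row
unsafe = db.execute(
    f"SELECT name FROM users WHERE name = '{payload}'").fetchall()

# Fixed: the driver binds the payload as data, so it matches nothing
safe = db.execute(
    "SELECT name FROM users WHERE name = ?", (payload,)).fetchall()

print(unsafe)
print(safe)
```

This is also why my remediation guidance names the specific ORM or driver: the fix is always "bind, don't build," expressed in the application's own stack.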

Business Logic Testing

This is what separates a penetration test from a vulnerability scan. Business logic flaws cannot be detected by automated tools because they require understanding what the application is supposed to do.

  • Race conditions — can I submit the same coupon code simultaneously from two sessions and get double the discount? Can I withdraw funds while a transfer is still processing?
  • Workflow bypasses — can I skip the payment step in a checkout flow by directly accessing the order confirmation endpoint? Can I approve my own request in an approval workflow?
  • Price manipulation — can I modify the price of an item in the request? Does the server trust the client-sent price, or does it recalculate from the database?
  • Rate limit testing — can I abuse functionality by sending requests faster than intended? Can I enumerate valid coupon codes, user IDs, or API keys through unrestricted endpoints?
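
The price-manipulation check above reduces to one question: does the server recompute the total? A minimal sketch — the catalog and order shapes are illustrative:

```python
# A server that trusts the client-sent total versus one that recomputes
# from its own catalog. Prices are in cents.
CATALOG = {"sku-1": 50_00, "sku-2": 120_00}

def checkout_vulnerable(order: dict) -> int:
    return order["client_total"]          # trusts attacker-controlled input

def checkout_fixed(order: dict) -> int:
    # Recompute from authoritative prices; ignore any client-sent total.
    return sum(CATALOG[item["sku"]] * item["qty"] for item in order["items"])

tampered = {
    "items": [{"sku": "sku-1", "qty": 1}, {"sku": "sku-2", "qty": 2}],
    "client_total": 1_00,                 # attacker rewrote the total
}
print(checkout_vulnerable(tampered))
print(checkout_fixed(tampered))
```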

API-Specific Testing

Modern web applications are API-driven. I test the API layer independently of the frontend:

  • GraphQL introspection — if the GraphQL endpoint allows introspection, I map the entire schema and test queries the frontend never makes
  • Mass assignment — can I add fields to a request body that the API accepts but the frontend never sends? Setting role: "admin" or isVerified: true in a registration request is a classic example
  • Excessive data exposure — does the API return more data than the frontend displays? User objects that include password hashes, internal IDs, or other users' data in a list response
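
The mass-assignment fix is an allowlist at the deserialization boundary. A minimal sketch — field names are illustrative:

```python
# Vulnerable handler copies every field from the request body onto the user
# record; the fixed handler only accepts allowlisted fields.
ALLOWED_FIELDS = {"email", "name", "password"}

def register_vulnerable(body: dict) -> dict:
    user = {"role": "user", "isVerified": False}
    user.update(body)                     # attacker-controlled keys win
    return user

def register_fixed(body: dict) -> dict:
    user = {"role": "user", "isVerified": False}
    user.update({k: v for k, v in body.items() if k in ALLOWED_FIELDS})
    return user

attack = {"email": "a@b.c", "name": "Eve", "role": "admin", "isVerified": True}
print(register_vulnerable(attack)["role"])
print(register_fixed(attack)["role"])
```

Most web frameworks and ORMs have an equivalent mechanism (explicit DTOs, serializer field lists, strong parameters); the testing question is whether it is actually applied on every write endpoint.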

Phase 4 — Reporting

A penetration test is only as valuable as its report. I have seen security assessments with critical findings that were never fixed because the report was unreadable — 200 pages of scanner output dumped into a Word document with no context, no prioritization, and no remediation guidance.

My reports follow a consistent structure:

Executive Summary

One to two pages written for non-technical stakeholders. No jargon. This section answers three questions: What did we test? What did we find? How serious is it? The executive summary includes an overall risk rating, the count of findings by severity, and the single most important recommendation.

Methodology

A brief description of the four-phase approach, the tools used, the testing window, and any limitations encountered (features that were out of scope, environments that were unavailable, authentication issues that limited coverage).

Findings

Each finding follows a consistent format:

  • Title — a clear, descriptive name for the vulnerability
  • Severity — Critical, High, Medium, Low, or Informational, based on CVSS scoring
  • Location — the specific URL, endpoint, parameter, or component where the vulnerability exists
  • Description — what the vulnerability is and why it matters, written so a developer who has never heard of this vulnerability class can understand it
  • Proof of Concept — step-by-step reproduction instructions with screenshots. I never report a vulnerability I cannot reproduce. The PoC is detailed enough that the development team can verify the issue independently
  • Impact — what an attacker could achieve by exploiting this vulnerability. Data theft? Account takeover? Financial loss? This is what motivates the fix
  • Remediation — specific, actionable guidance on how to fix the issue. Not "implement proper input validation" but "use parameterized queries with your ORM's built-in escaping for the search endpoint at /api/search, and apply the DOMPurify.sanitize() function to the user-generated content rendered in the comment component"
  • References — links to OWASP, CWE, or other authoritative resources for further reading
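
The severity labels above follow the CVSS v3.x qualitative rating scale, which maps base scores to labels in fixed bands. A small mapping — note I report CVSS's "None" band as "Informational" in my own reports:

```python
def cvss_severity(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS base scores range from 0.0 to 10.0")
    if score == 0.0:
        return "Informational"   # CVSS v3.x calls this band "None"
    if score <= 3.9:
        return "Low"
    if score <= 6.9:
        return "Medium"
    if score <= 8.9:
        return "High"
    return "Critical"

print(cvss_severity(9.8), cvss_severity(6.5), cvss_severity(0.0))
```

The score is a starting point, not the whole story: a Medium-scored IDOR on payment data may warrant High urgency in the remediation timeline.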

Risk Matrix

A visual summary showing all findings plotted by likelihood and impact. This gives stakeholders an immediate understanding of where the most urgent remediation effort should be directed.


Tools I Use

My core toolkit is deliberately focused. I would rather know three tools deeply than ten tools superficially.

Burp Suite Professional is my primary testing tool. The intercepting proxy, repeater, intruder, and scanner cover 80% of what I need for web application testing. Extensions like ActiveScan++, Autorize, and JSON Web Token Attacker extend its capabilities for specific vulnerability classes.

OWASP ZAP serves as my secondary scanner and is my recommendation for development teams who want to integrate security scanning into their CI/CD pipeline. It is open-source, actively maintained, and its API makes automation straightforward.

Nmap handles network-level reconnaissance. Service detection, port scanning, and NSE scripts for targeted checks against specific services.

ffuf and gobuster for directory and file enumeration. Fast, configurable, and effective with the right wordlists.

SQLMap for confirming and exploiting SQL injection vulnerabilities that I have already identified manually. I never point SQLMap at an application blindly — that is how you accidentally modify production data.

testssl.sh for comprehensive TLS/SSL configuration analysis.

subfinder and amass for subdomain enumeration during reconnaissance.

Postman and curl for manual API testing, constructing specific requests, and testing edge cases that Burp's repeater cannot handle as conveniently.

Every tool here is used within the authorized scope. These are not hacking tools — they are professional instruments for security assessment.


Common Findings from Real Assessments

After conducting numerous web application security assessments, patterns emerge. These are the vulnerabilities I find most frequently, ranked roughly by how often they appear:

  1. Broken access control (IDOR) — nearly every assessment. Developers check authentication but forget authorization. The frontend hides the button, but the API endpoint is wide open.
  2. Missing or misconfigured security headers — Content-Security-Policy is either absent or set to unsafe-inline unsafe-eval, which defeats its purpose entirely. Strict-Transport-Security is missing, allowing protocol downgrade attacks.
  3. Verbose error messages in production — stack traces, database query details, and internal file paths exposed in API error responses. This is free reconnaissance for an attacker.
  4. Insecure session management — session tokens that do not expire, cookies without the Secure flag on HTTPS applications, sessions that persist after password change.
  5. Exposed API documentation — Swagger UI accessible without authentication in production, revealing every endpoint, parameter, and data model.
  6. Outdated dependencies with known CVEs — npm packages or server software with published vulnerabilities that have patches available but not applied.
  7. SSRF through URL input features — any feature that accepts a URL and fetches it server-side (webhooks, link previews, file imports) is almost always exploitable for internal network access.
  8. Rate limiting gaps — login endpoints with rate limiting but password reset endpoints without it, or rate limiting applied per IP but not per account.
  9. Cross-site scripting in edge cases — the main input fields are sanitized, but file upload names, CSV export content, or PDF generation inputs are not.
  10. Business logic flaws — price manipulation, workflow bypasses, race conditions in financial operations. These are never found by scanners. They require understanding the application's purpose.
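
The per-account rate-limiting gap described above comes down to what the limiter is keyed on: limiting by IP alone still lets an attacker rotate IPs against a single account. A minimal sliding-window sketch — the limit and window values are illustrative:

```python
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Allow at most `limit` hits per `window` seconds, per key."""

    def __init__(self, limit: int, window: float):
        self.limit, self.window = limit, window
        self.hits = defaultdict(deque)

    def allow(self, key: str, now: float) -> bool:
        q = self.hits[key]
        while q and now - q[0] >= self.window:   # drop expired timestamps
            q.popleft()
        if len(q) >= self.limit:
            return False
        q.append(now)
        return True

limiter = SlidingWindowLimiter(limit=3, window=60.0)
# Keying on the account (not the IP) blocks the fourth attempt even if
# each request arrives from a different source address:
print([limiter.allow("acct:alice", t) for t in (0, 1, 2, 3)])
```

In production this state lives in something shared like Redis, and keys typically combine account and endpoint, but the keying decision is the part scanners never evaluate.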

Writing Reports That Get Fixed

The goal of a penetration test is not to find vulnerabilities. It is to get vulnerabilities fixed. A report with 50 findings and zero fixes is worse than useless — it creates a false sense of security because "we had a pentest done."

Here is how I write reports that lead to action:

Prioritize ruthlessly. Not every finding is equally urgent. I mark critical and high-severity findings with clear "fix this within 7 days" or "fix this within 30 days" timelines based on exploitability and impact. A theoretical vulnerability with no practical exploit path does not get the same urgency as an IDOR that exposes customer payment data.

Write for the developer, not the auditor. The person fixing the vulnerability is usually a developer who may not have security expertise. I include code-level remediation guidance in the same language and framework the application uses. If the app is built with Next.js and Prisma, the remediation shows the parameterized Prisma query, not a generic "use prepared statements" recommendation.

Include reproduction steps that actually work. Every PoC should be reproducible by the development team without my assistance. I include exact HTTP requests (sanitized of any sensitive data), screenshots of each step, and expected versus actual behavior.

Offer a retest. I include a retest in my engagement scope. After the client has fixed the reported issues, I verify the fixes. This closes the loop and ensures that the remediation actually addresses the vulnerability rather than introducing a new one.


Responsible Disclosure

Not every security issue I encounter is during a paid engagement. Sometimes I notice a vulnerability in a product I use, a service I interact with, or an open-source project I contribute to.

Responsible disclosure follows a strict process:

  1. Document the vulnerability with a clear proof of concept — enough to demonstrate the issue, not enough to enable exploitation at scale
  2. Contact the vendor through their published security contact (security@, a bug bounty program, or a security.txt file)
  3. Give reasonable time — 90 days is the industry standard for remediation before public disclosure
  4. Do not discuss publicly until the vendor has patched the issue and affected users have had time to update
  5. Never exploit beyond proof of concept — access the minimum data necessary to demonstrate the vulnerability, then stop

If a vendor has a bug bounty program, I follow their specific disclosure rules. If they do not respond within a reasonable timeframe, I escalate through CERT/CC or other coordination bodies rather than going public unilaterally.

The security community depends on responsible disclosure. Dropping a zero-day on Twitter does not make you a hacker — it makes you a risk to every user affected by that vulnerability.


How I Price Security Assessments

Transparency matters. Here is how I think about pricing for web application penetration testing, without revealing specific numbers that vary by engagement.

Factors that determine scope and pricing:

  • Application complexity — a five-page marketing site is fundamentally different from a SaaS platform with 200 API endpoints, role-based access control, payment processing, and third-party integrations
  • Authentication complexity — a single user role versus a multi-tenant application with admin, manager, user, and API key access levels
  • Testing depth — a baseline assessment (automated scanning plus targeted manual testing) versus a comprehensive engagement (full manual testing across all OWASP categories plus business logic review)
  • Environment — testing against a staging environment with synthetic data versus coordinated testing against production with real user data (which requires additional safeguards)
  • Deliverables — standard report versus report plus developer workshop plus retest
  • Timeline — standard delivery versus expedited results

What clients receive:

  • Detailed findings report in the format described above
  • Executive summary suitable for board-level presentation
  • Remediation guidance specific to their technology stack
  • Retest of fixed vulnerabilities within an agreed window
  • A debrief call to walk through findings with the development and security teams

I do not charge by the vulnerability. Finding more issues should never be a financial disincentive. The engagement is scoped by application complexity and testing depth, and the deliverable is a comprehensive assessment regardless of whether I find 5 issues or 50.

If you are interested in a security assessment for your web application, you can review my services and get in touch at /services.


Key Takeaways

  • Authorization is mandatory. Written scope, signed by both parties, before any testing begins. No exceptions.
  • Reconnaissance drives quality. The time invested in mapping the attack surface directly determines the value of the assessment. Never skip recon.
  • Scanners are a baseline, not a conclusion. Automated tools find the obvious issues. Manual testing finds the critical ones — business logic flaws, chained vulnerabilities, and access control failures.
  • The four phases are sequential and interdependent. Recon informs scanning targets. Scanning surfaces leads for manual testing. Manual testing produces the findings. Reporting translates findings into action.
  • Reports must drive remediation. A report that sits in a shared drive is worthless. Write for the developer who will fix the issue. Include code-level guidance. Offer a retest.
  • Responsible disclosure protects everyone. Never exploit beyond proof of concept. Never disclose before the vendor has remediation time. The goal is to make systems more secure, not to demonstrate what you can break.
  • Methodology over tools. The methodology is transferable. Tools change, frameworks evolve, and vulnerability classes shift. A structured approach to security testing remains effective regardless of the technology stack.

*I am Uvin Vindula (@IAMUVIN), a Web3 and AI engineer and ethical hacker based between Sri Lanka and the UK. I build production-grade applications and assess them for security vulnerabilities. I perform authorized penetration testing for web applications, APIs, and blockchain-based systems. Every finding leads to a fix, and every assessment follows the methodology described in this article. If your application needs a security assessment, explore my services or reach out at contact@uvin.lk.*

Uvin Vindula

Web3 and AI engineer based in Sri Lanka and the UK. Author of The Rise of Bitcoin. Director of Blockchain and Software Solutions at Terra Labz. Founder of uvin.lk — Sri Lanka's Bitcoin education platform with 10,000+ learners.