
Web Hacking Stages, Fuzzing for Files and Pages

⏳ 8 min read


    You enter the neon-lit alley of cyberspace, rain drizzling upon the top of your hood, neon signs reflecting off puddles beneath your boots. Servers hum like distant thunder, wires pulsing with electric life, shards of data darting through tunnels of fibre optics. In this world every packet is a potential threat, every line of source code a door you might silently open. You are both hunter and hunted, a cyberpunk pilgrim seeking the truth in the digital shadows.

    The air tastes of ozone and possibility, screens flicker with fragmented code, the glow from monitors casting long silhouettes. You feel the weight of knowledge and curiosity pressing against your skull, demanding you look deeper. This is the zone of web hacking, the acid-trip of packet injection, directory traversal, SQL injection, where your only weapon is your mind and your only guide is process. Strap in, because we’re going for a ride through the stages of web hacking, with special focus on fuzzing for files and pages.


    Web Hacking Stages

    To traverse this terrain effectively, you must understand the map. Below are the key stages, each a layer in the onion of exploitation.

    1. Reconnaissance
      Gather information: site architecture, technologies used, public endpoints, versions of software, subdomains. Use tools like nmap, whatweb, dirb, whois. You want the lay of the land, like a scout mapping the horizon before the charge.

    2. Scanning and Enumeration
      Once you have basic info, probe deeper: scan for open ports, enumerate directories, test server responses. Find where the vault doors are weak. Identify endpoints that reveal interesting behaviour, misconfigured servers, unintentionally exposed files.

    3. Vulnerability Analysis
      Assess what you found. Are there vulnerable plugins, outdated software, poor input validation, insecure configurations? Make a threat model: remote code execution, SQL injection, cross-site scripting, file inclusion, directory traversal.

    4. Exploitation
      Time to strike: craft payloads, exploit weaknesses, elevate privileges. Use exploitation frameworks like Metasploit, or write your own scripts. Be cautious: exploitation without permission is illegal.

    5. Post-Exploitation and Reporting
      After successful access, you map internal paths, exfiltrate data ethically in CTF or authorised penetration tests, clean up traces, document everything. Provide recommendations: patching, proper access controls, sanitisation.

    Each stage feeds into the next; you loop back, re-evaluate, and adapt tactics. Without discipline you’ll wander, get caught, or flop.
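
    The reconnaissance stage can be sketched in a handful of lines. Below is a minimal sketch: the domain and candidate list are illustrative, and you should only probe domains you are authorised to test. It checks which candidate subdomains actually resolve:

    ```python
    import socket

    def resolve_subdomains(domain, candidates):
        """Return the candidate subdomains that resolve, mapped to their IPs."""
        found = {}
        for sub in candidates:
            host = f"{sub}.{domain}"
            try:
                found[host] = socket.gethostbyname(host)
            except socket.gaierror:
                pass  # name does not resolve, move on
        return found

    # Usage: resolve_subdomains("example.com", ["www", "dev", "staging"])
    ```

    Resolution alone only tells you a host exists; follow up with tools like whatweb or nmap for detail.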


    Fuzzing for Files and Pages

    Fuzzing is the art of blind probing: throwing data at inputs to see what breaks. For files and pages this means discovering unlinked pages, hidden files, backup versions, and misconfigurations by systematically requesting resources and examining the responses.

    Why Files and Pages Fuzzing Matters

    • Hidden admin panels, old backups, and config files accidentally left world-readable often leak secrets.
    • Finding pages with different status codes (200, 403, 500) gives clues about structure.
    • Many breach incidents begin with discovering a forgotten .git folder or a backup.zip.

    Tools and Wordlists

    Use tools such as ffuf, dirbuster, gobuster, wfuzz. Obtain wordlists from repositories like SecLists. Curate your own lists for the domain under test.

    Practical Snippets

    The code examples below are organised for safety and learning; never use them against systems you are not authorised to test. Always have explicit permission.

    Bash Example with ffuf

    bash
    #!/bin/bash
    # Fuzz directories and pages against target domain
    TARGET="https://example.com"
    WORDLIST="/path/to/seclists/Discovery/Web-Content/common.txt"
    
    ffuf -u ${TARGET}/FUZZ -w ${WORDLIST} \
      -o ffuf_results.json \
      -mc 200,301,302,403,500 \
      -ac
    
    • -mc filters responses by the HTTP status codes of interest.
    • -ac enables auto-calibration, filtering out noisy false positives.

    Python Script for Backup File Discovery

    python
    #!/usr/bin/env python3
    import requests
    
    target = "https://example.com"
    candidates = ["backup.zip","site_backup.tar.gz","old","db_backup.sql","config.bak"]
    
    for filename in candidates:
        url = f"{target}/{filename}"
        try:
            resp = requests.get(url, timeout=5)
        except requests.RequestException as exc:
            print(f"Error: {url} ({exc})")
            continue
        if resp.status_code == 200:
            print(f"Found: {url} ({len(resp.content)} bytes)")
        elif resp.status_code in (401, 403):
            print(f"Possible protection: {url} ({resp.status_code})")
        else:
            print(f"Nope: {url} ({resp.status_code})")
    

    PowerShell for Windows-Hosted Pages

    powershell
    $target = "https://example.com"
    $list = Get-Content .\filelist.txt
    
    foreach ($file in $list) {
        $uri = "$target/$file"
        try {
            $resp = Invoke-WebRequest -Uri $uri -UseBasicParsing -TimeoutSec 5
            if ($resp.StatusCode -eq 200) {
                Write-Output "Accessible: $uri"
            }
        } catch {
            # Invoke-WebRequest throws on 4xx/5xx, so protected paths land here
            $code = $_.Exception.Response.StatusCode.value__
            if ($code -in 401, 403) {
                Write-Output "Protected: $uri ($code)"
            } else {
                Write-Output "Error: $uri"
            }
        }
    }
    

    Strategies and Best Practices

    • Start with broad scans, then narrow in; indiscriminate brute-forcing yields too much noise.
    • Respect robots.txt for reconnaissance, but know it may list sensitive paths.
    • Watch for differences in response size, headers, redirects; a 200 with zero content can be as revealing as one with huge content.
    • Use rate-limiting, random delays to avoid being blocked.
    • Log everything: path, status code, content length, time of response.
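
    These practices can be combined in one small probe loop. A minimal sketch, assuming the requests library is available; the target and paths are placeholders, the random sleep implements the rate-limiting advice, and the print line logs path, status, content length, and response time:

    ```python
    import random
    import time

    import requests

    def build_urls(target, paths):
        """Join a base target and candidate paths into probe URLs."""
        return [f"{target.rstrip('/')}/{p.lstrip('/')}" for p in paths]

    def probe(urls, min_delay=0.5, max_delay=2.0):
        """Request each URL, logging status, length and timing, with random delays."""
        for url in urls:
            try:
                resp = requests.get(url, timeout=5)
                print(f"{url} {resp.status_code} {len(resp.content)} "
                      f"{resp.elapsed.total_seconds():.2f}s")
            except requests.RequestException as exc:
                print(f"{url} error: {exc}")
            time.sleep(random.uniform(min_delay, max_delay))  # jitter to avoid blocks

    # Usage (only against hosts you are authorised to test):
    # probe(build_urls("https://example.com", ["admin", "backup.zip", "old"]))
    ```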

    Security and Ethical Considerations

    This work walks a fine line. Doing this without authorisation could be unlawful. Use test environments, CTFs, bug bounties with clear scope. If you discover sensitive data accidentally, report it appropriately. Treat web hacking as a craft, not vandalism.


    Web Hacking Stages and Fuzzing for Files and Pages: A Practical Guide

    Aim

    This guide teaches you how to carry out the core stages of web hacking and how to employ fuzzing techniques to find hidden files and pages. You will learn through hands-on examples how to map, scan, and fuzz web resources effectively.


    Learning Outcomes

    By the end of this guide you will be able to:

    • Identify the stages of web hacking: reconnaissance, mapping, scanning, fuzzing, exploitation and reporting
    • Use tools to discover hidden files and pages via fuzzing
    • Construct your own fuzzing scripts in Bash and Python to automate file and directory discovery
    • Analyse HTTP responses to pinpoint valid vs invalid paths

    Prerequisites

    • Familiarity with HTTP requests, status codes (200, 404, 403 etc.), and basic web server behaviour
    • Basic command-line skills (Linux or macOS terminal, or PowerShell on Windows)
    • Python 3 installed (with standard libraries such as requests)
    • Tools such as curl, wget or ffuf (or equivalent)

    Step-by-Step Guide

    1. Reconnaissance

    • Gather information about the target domain: DNS records, IP address, subdomains.
    • Use dig, nslookup, whois to identify DNS entries.
    bash
      dig example.com any
      whois example.com
    
    • Enumerate subdomains:
    bash
      ffuf -u https://example.com -w subdomains.txt -H "Host: FUZZ.example.com"
    

    2. Mapping the Web Application

    • Crawl the site to discover link structure; use tools like wget for basic crawling.
    bash
      wget --mirror --convert-links --adjust-extension --page-requisites --no-parent https://example.com
    
    • Use a site map generator or create your own list of known or likely paths (e.g. /admin, /uploads, /backup etc.).

    3. Scanning for Vulnerabilities

    • Scan for common weaknesses using automated tools (e.g. Nikto, OWASP ZAP).
    • Identify input vectors: pages with query parameters, file uploads, login forms.

    4. Fuzzing for Files and Pages

    4.1 Prepare a Wordlist
    • Use or customise wordlists like common.txt, directory-list-2.3-medium.txt.
    • Include file extensions such as .php, .html, .bak, .old, .zip.
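    Rather than hand-writing every word/extension combination, the list from 4.1 can be generated. A minimal helper sketch (the function name and default extensions are illustrative):

    ```python
    def expand_wordlist(words, extensions=(".php", ".html", ".bak", ".old", ".zip")):
        """Expand base words with common file extensions, keeping the bare word too."""
        candidates = []
        for word in words:
            candidates.append(word)
            candidates.extend(f"{word}{ext}" for ext in extensions)
        return candidates

    # e.g. expand_wordlist(["config"]) yields "config", "config.php", "config.html",
    # "config.bak", "config.old" and "config.zip"
    ```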
    4.2 Use Bash with curl or wget
    • Example Bash loop with curl to check paths:
    bash
      wordlist="wordlist.txt"
      target="https://example.com"
      # Read the wordlist line by line; quote variables to survive odd entries
      while IFS= read -r word; do
        url="$target/$word"
        status=$(curl -o /dev/null -s -w "%{http_code}" "$url")
        if [ "$status" -ne 404 ]; then
          echo "Found $url (status $status)"
        fi
      done < "$wordlist"
    
    4.3 Use a Python Script
    • Example Python script using requests:
    python
      import requests
    
      target = "https://example.com"
      wordlist = "wordlist.txt"
    
      with open(wordlist, 'r') as f:
          for line in f:
              path = line.strip()
              url = f"{target}/{path}"
              try:
                  resp = requests.get(url, timeout=5, allow_redirects=False)
              except requests.RequestException:
                  continue  # skip unreachable paths
              if resp.status_code not in (404, 403):
                  print(f"Found: {url} (Status: {resp.status_code})")
    
    4.4 Advanced Fuzzing with Tools
    • Use ffuf:
    bash
      ffuf -u https://example.com/FUZZ -w wordlist.txt -mc 200,301,302,403
    

    Options:
    - -u: URL with FUZZ placeholder
    - -w: path to wordlist
    - -mc: match HTTP status codes

    5. Analysing Results

    • Collect valid paths. Verify whether they allow sensitive access (backups, admin panels, configuration files).
    • Investigate each discovered resource manually: review page content, check for disclosure of secrets.
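
    If you saved ffuf output as JSON (via -o, as in the earlier example), a short script can group hits by status code for triage. This is a sketch that assumes ffuf's JSON report carries a top-level results array whose entries include url and status fields; verify against your ffuf version's output before relying on it:

    ```python
    import json
    from collections import defaultdict

    def group_by_status(report_path):
        """Group the URLs in a ffuf JSON report by their HTTP status code."""
        with open(report_path) as f:
            report = json.load(f)
        groups = defaultdict(list)
        for hit in report.get("results", []):
            groups[hit["status"]].append(hit["url"])
        return dict(groups)

    # Usage:
    # for status, urls in sorted(group_by_status("ffuf_results.json").items()):
    #     print(status, len(urls), "hits")
    ```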

    6. Exploitation

    • If valid files or pages are found, test whether they are exploitable – for instance, a backup file may contain credentials, source code or database connection strings.
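
    When a backup does come back, a quick pattern scan helps triage it for credential-like strings. A minimal sketch; the regexes below are illustrative, not exhaustive:

    ```python
    import re

    # Illustrative credential-like patterns; extend for the stack under test
    SECRET_PATTERNS = [
        re.compile(r"(?i)password\s*[:=]\s*\S+"),
        re.compile(r"(?i)api[_-]?key\s*[:=]\s*\S+"),
        re.compile(r"(?i)[a-z]+://[^\s:]+:[^\s@]+@"),  # user:pass embedded in a URL
    ]

    def scan_for_secrets(text):
        """Return (line number, line) pairs matching any credential-like pattern."""
        hits = []
        for lineno, line in enumerate(text.splitlines(), 1):
            if any(p.search(line) for p in SECRET_PATTERNS):
                hits.append((lineno, line.strip()))
        return hits
    ```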

    7. Reporting and Remediation Suggestions

    • Document all discovered files/pages, status codes, access methods.
    • Provide remediation guidance:
        • Remove unnecessary files
        • Disable directory listing
        • Implement proper access controls

    Actionable Insights

    • Always begin with small wordlists and refine them based on what you find – testing thousands of entries at once may lead to many false positives or wasted time.
    • Log output and use tools to colourise or filter status codes: valid codes (200, 301, 302) are of most interest, but sometimes 403 or 401 may also indicate presence of content.
    • Respect legal boundaries: perform these actions only on systems for which you have explicit permission.

    Practising these stages and techniques will sharpen your ability to discover hidden web resources and to assess their risks effectively.

    You emerge, sweat on your brow, mind buzzing with paths you hadn’t noticed before, claws in unseen directories. The glow of code still dances behind your eyes. These stages, this fuzzing, they are not endpoints but tools in your arsenal. Keep your curiosity sharp, your ethics solid, your scripts precise, and doors that once seemed closed will begin to whisper open.