Penetration Testing Reconnaissance

Reconnaissance is the critical first phase of any penetration test, where security professionals gather information about target systems, networks, and organizations. This comprehensive guide explores both passive and active reconnaissance techniques used by penetration testers to build a complete picture of their targets before launching security assessments.

Understanding Reconnaissance in Penetration Testing

Reconnaissance, often called the “information gathering” phase, is where penetration testers collect as much data as possible about a target. The quality of information gathered during this phase directly impacts the success of subsequent testing phases. Reconnaissance can be categorized into two main types:

  • Passive Reconnaissance: Gathering information without directly interacting with the target system
  • Active Reconnaissance: Directly engaging with the target to gather information

Why Reconnaissance Matters

Reconnaissance often takes a large share of an engagement. Figures of 40-60% of testing time are commonly cited, although the proportion varies with scope and target. This phase helps:

  • Identify attack surface: Discover all systems, services, and applications exposed
  • Find low-hanging fruit: Identify obvious vulnerabilities and misconfigurations
  • Plan attack strategies: Develop targeted approaches based on discovered technologies
  • Avoid detection: Better intelligence means fewer test attempts and reduced noise
  • Save time: Proper reconnaissance prevents wasted effort on dead ends

Passive Reconnaissance Techniques

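Passive reconnaissance involves collecting information without sending packets or requests directly to the target's systems, typically by querying third parties such as search engines, public DNS resolvers, and registries. This approach is stealthy and leaves few or no traces in target logs. A few commands in this section (zone transfer attempts, fetching pages from the target's own site) do touch target infrastructure, so treat them as low-noise active steps.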

1. OSINT (Open Source Intelligence)

OSINT leverages publicly available information to learn about targets.

Search Engine Intelligence

Search engines contain vast amounts of indexed information about organizations:

# Google Dorking Examples

## Find exposed configuration files
site:example.com filetype:conf | filetype:config | filetype:cfg

## Discover backup files
site:example.com filetype:bak | filetype:backup | filetype:old

## Find database dumps
site:example.com filetype:sql | filetype:db | filetype:dump

## Locate login pages
site:example.com inurl:login | inurl:admin | inurl:portal

## Find exposed directories
site:example.com intitle:"index of" | intitle:"directory listing"

## Discover employee information
site:linkedin.com "example.com" "security engineer"

## Find exposed documents
site:example.com filetype:pdf | filetype:doc | filetype:xls

Tools for Search Engine Intelligence:

  • Google Hacking Database (GHDB): Repository of advanced search queries
  • DorkSearch: Automated Google dorking tool
  • Pagodo: Automated GHDB query tool
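The dork patterns above differ only in the domain they are scoped to, so they can be generated rather than retyped. The following is a minimal sketch: the function name `build_dorks` is hypothetical, and the templates are simply the queries listed earlier. It only prints query strings for manual use and does not contact any search engine.

```shell
# build_dorks DOMAIN: print one search query per line, each restricted
# to DOMAIN with a site: operator. (Hypothetical helper for illustration.)
build_dorks() {
    domain="$1"
    while IFS= read -r template; do
        printf 'site:%s %s\n' "$domain" "$template"
    done <<'EOF'
filetype:conf | filetype:config | filetype:cfg
filetype:bak | filetype:backup | filetype:old
filetype:sql | filetype:db | filetype:dump
inurl:login | inurl:admin | inurl:portal
intitle:"index of" | intitle:"directory listing"
EOF
}
```

Calling `build_dorks example.com` prints five queries that can be pasted into a search engine one at a time.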

DNS Information Gathering

DNS records reveal significant information about infrastructure:

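## DNS Lookups with dig
## Note: many servers now return minimal answers to ANY queries (RFC 8482), so query specific record types as well
dig example.com ANY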
dig example.com MX
dig example.com TXT
dig example.com NS

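## Retrieve SPF and DMARC records
## SPF is published as a TXT record on the domain itself; DMARC lives at the _dmarc subdomain
dig +short example.com TXT | grep -i "v=spf1"
dig +short _dmarc.example.com TXT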

## Zone transfer attempts (usually blocked but worth trying; this queries the target's name server directly)
dig @ns1.example.com example.com AXFR

## Reverse DNS lookups
dig -x 192.168.1.1

DNS Reconnaissance Tools:

## DNSRecon - comprehensive DNS enumeration
dnsrecon -d example.com -a

## Fierce - DNS scanner
fierce --domain example.com

## DNSEnum - multiple query types
dnsenum example.com
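SPF records are worth a closer look because their `include:` and `ip4:` mechanisms often name third-party mail providers and owned address ranges. The sketch below pulls those mechanisms out of TXT record text. The function name `spf_mechanisms` is an assumption for illustration, and it expects input shaped like `dig +short` output (quoted strings, one record per line).

```shell
# spf_mechanisms: read TXT record strings on stdin and print the include:,
# ip4:, and ip6: mechanisms of any SPF record, one per line.
spf_mechanisms() {
    tr -d '"' | grep -i 'v=spf1' | tr ' ' '\n' | grep -Ei '^(include|ip4|ip6):'
}

# Usage (requires network access):
#   dig +short example.com TXT | spf_mechanisms
```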

2. WHOIS and Domain Registration

WHOIS databases provide registration details:

## Basic WHOIS lookup
whois example.com

## Historical WHOIS data (using online services)
## ViewDNS.info, DomainTools, WHOIS History

## Reverse WHOIS (find other domains by same registrant)
## Can use services like DomainTools or ViewDNS

Information from WHOIS:

  • Registrant contact information
  • Domain registration and expiry dates
  • Name servers
  • Hosting provider
  • Administrative and technical contacts
  • Organization details
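Raw WHOIS output is verbose, so it helps to reduce it to the handful of fields listed above. This is a rough sketch that assumes the field labels used by many gTLD registries (`Registrar:`, `Creation Date:`, `Registry Expiry Date:`, `Name Server:`). Other registries label fields differently, so the pattern would need adjusting. The function name `whois_summary` is hypothetical.

```shell
# whois_summary: read raw WHOIS text on stdin and print selected fields,
# trimmed of surrounding whitespace and de-duplicated.
whois_summary() {
    grep -Ei '^[[:space:]]*(Registrar|Creation Date|Registry Expiry Date|Name Server):' |
        sed -E 's/^[[:space:]]+//; s/[[:space:]]+$//' | sort -u
}

# Usage (requires network access):
#   whois example.com | whois_summary
```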

3. Social Media Intelligence

Social media platforms leak substantial organizational information:

## LinkedIn enumeration
## Manually or using tools like linkedin2username

## theHarvester - email and subdomain harvesting
## (available -b sources differ between theHarvester versions; list them with theHarvester -h)
theHarvester -d example.com -b linkedin,twitter,google

## Sherlock - username enumeration across platforms
sherlock target_username

## Social Mapper - correlate social media profiles
## python social_mapper.py -f companyemployees.txt

Social Media Reconnaissance Goals:

  • Employee names and positions
  • Organizational structure
  • Technology stack mentions
  • Project details and timelines
  • Email address formats
  • Upcoming events and conferences

4. Website Analysis

Analyzing target websites reveals technologies and potential vulnerabilities:

## Builtwith technology profiler (online)
## Visit builtwith.com/example.com

## Wappalyzer - identify web technologies
## Browser extension or CLI tool

## WhatWeb - web technology identifier
whatweb example.com

## wget for offline analysis
wget --mirror --convert-links --adjust-extension --page-requisites --no-parent https://example.com

## Examine JavaScript files for sensitive data
curl https://example.com/app.js | grep -E "api|key|token|secret|password"

Website Metadata:

## Extract email addresses from website
curl https://example.com | grep -Eo "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"

## Download and analyze robots.txt
curl https://example.com/robots.txt

## Check security headers
curl -I https://example.com

## SSL/TLS certificate information
echo | openssl s_client -connect example.com:443 2>/dev/null | openssl x509 -noout -text
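The header check above returns everything the server sends, and the interesting part is usually what is absent. The sketch below reads response headers and lists common security headers that are missing. The function name and the choice of five headers are assumptions for illustration, not a complete policy.

```shell
# missing_security_headers: read HTTP response headers on stdin and print
# the name of each expected security header that is absent.
missing_security_headers() {
    headers="$(tr -d '\r')"   # strip carriage returns from CRLF line endings
    for name in Strict-Transport-Security Content-Security-Policy \
                X-Content-Type-Options X-Frame-Options Referrer-Policy; do
        # Header names are case-insensitive, hence grep -i.
        printf '%s\n' "$headers" | grep -qi "^${name}:" || printf '%s\n' "$name"
    done
}

# Usage (requires network access):
#   curl -sI https://example.com | missing_security_headers
```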

5. Public Records and Leaked Data

Checking for exposed information in breaches and public databases:

## Have I Been Pwned - check for breached credentials
## API or web interface at haveibeenpwned.com

## Dehashed - search leaked databases
## dehashed.com

## IntelX - search paste sites and leaks
## intelx.io

## GitHub reconnaissance
## Search for organization repositories and commits
## Look for exposed API keys, credentials, internal URLs

GitHub Dorking:

## Search GitHub for sensitive information
## Example queries:
org:example password
org:example api_key
org:example extension:pem
org:example AWS_SECRET
org:example "PRIVATE KEY"
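Once an in-scope repository has been cloned, the same idea can be applied locally. This sketch greps a directory for two well-known credential formats: AWS access key IDs (which start with `AKIA`) and PEM private key headers. The function name `scan_secrets` is hypothetical and the pattern list is deliberately short. It prints only `file:line` locations so that the secrets themselves do not end up in logs.

```shell
# scan_secrets DIR: print file:line for each match of a known credential
# format under DIR, skipping binary files and the .git directory.
scan_secrets() {
    grep -rnIE --exclude-dir=.git \
        -e 'AKIA[0-9A-Z]{16}' \
        -e '-----BEGIN [A-Z ]*PRIVATE KEY-----' \
        "$1" | cut -d: -f1,2
}
```

A dedicated secret scanner covers far more formats; this only illustrates the approach.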

6. Archive and Historical Data

Web archives preserve historical versions of websites:

## Wayback Machine
## Visit web.archive.org/web/*/example.com

## Archive.today
## archive.is/example.com

## Using waybackurls tool
waybackurls example.com | grep -E "\.js|\.json|api|admin"

Historical data can reveal:

  • Deprecated but still-functional endpoints
  • Previously exposed sensitive information
  • Changes in infrastructure
  • Old admin panels
  • Forgotten subdomains
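Archive tools tend to return thousands of URLs that differ only in their query strings. One way to triage them, sketched below, is to strip query strings and fragments, keep paths that look like scripts, JSON files, or API and admin areas, and de-duplicate the result. The function name `triage_urls` and the filter pattern are assumptions for illustration.

```shell
# triage_urls: read URLs on stdin, drop query strings and fragments, keep
# .js/.json files and /api or /admin paths, and print each URL once.
triage_urls() {
    sed -E 's/[?#].*$//' | grep -Ei '(\.js|\.json)$|/(api|admin)(/|$)' | sort -u
}

# Usage (requires network access):
#   waybackurls example.com | triage_urls
```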

Active Reconnaissance Techniques

Active reconnaissance involves direct interaction with target systems. While more likely to be detected, it provides deeper technical insights.

1. Port Scanning

Identifying open ports and running services:

## Nmap - Network Mapper

## Quick scan of common ports
nmap -F example.com

## Comprehensive scan with service detection
nmap -sV -sC -p- example.com

## Aggressive scan (OS detection, version detection, script scanning)
nmap -A example.com

## Stealth SYN scan
nmap -sS example.com

## UDP scan (slower but important)
nmap -sU --top-ports 100 example.com

## Scan with timing template (faster but noisier)
nmap -T4 -p- example.com

## Save results in all formats
nmap -sV -sC -oA scan_results example.com
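Saved results are only useful if they can be fed into later steps. The `-oA` option writes a greppable `.gnmap` file alongside the other formats, and the sketch below turns that format into one line per open port. The function name `open_ports` is hypothetical. It assumes the usual greppable layout of tab-separated fields with a `Ports:` field of comma-separated `port/state/proto/owner/service/...` entries.

```shell
# open_ports: read nmap greppable output on stdin and print one
# "host:port/proto service" line for every port in the open state.
open_ports() {
    awk -F'\t' '/Ports: / {
        split($1, h, " ")                 # $1 looks like "Host: 192.0.2.10 (name)"
        sub(/^Ports: /, "", $2)
        n = split($2, entries, ", ")
        for (i = 1; i <= n; i++) {
            split(entries[i], f, "/")     # port/state/proto/owner/service/...
            if (f[2] == "open") print h[2] ":" f[1] "/" f[3] " " f[5]
        }
    }'
}

# Usage:
#   open_ports < scan_results.gnmap
```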

Nmap Scripting Engine (NSE):

## Vulnerability detection scripts
nmap --script vuln example.com

## HTTP enumeration scripts (quote wildcards so the shell does not expand them)
nmap --script "http-*" -p 80,443 example.com

## SMB enumeration
nmap --script "smb-*" -p 445 example.com

## Custom script categories
nmap --script="safe and discovery" example.com

Alternative Port Scanners:

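## Masscan - very fast port scanner
## (masscan takes IP addresses or ranges rather than hostnames; keep the rate within what the scope allows)
masscan -p1-65535 192.0.2.10 --rate=10000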

## RustScan - fast port scanner that feeds to nmap
rustscan -a example.com -- -sV -sC

## Unicornscan - distributed port scanner
unicornscan example.com:1-65535

2. Service Enumeration

Deep diving into discovered services:

## HTTP/HTTPS enumeration
## Nikto - web server scanner
nikto -h https://example.com

## FTP enumeration
nmap --script "ftp-*" -p 21 example.com
ftp example.com  # Anonymous login attempt

## SSH enumeration
ssh -v example.com
nmap --script "ssh-*" -p 22 example.com

## SMTP enumeration
nmap --script "smtp-*" -p 25 example.com
smtp-user-enum -M VRFY -U users.txt -t example.com

## SMB/NetBIOS enumeration
nbtscan 192.168.1.0/24
enum4linux -a example.com
smbclient -L \\\\example.com -N

## SNMP enumeration
snmpwalk -v 2c -c public example.com
onesixtyone -c community.txt example.com

## RDP enumeration
nmap --script "rdp-*" -p 3389 example.com

3. Web Application Enumeration

Comprehensive web application analysis:

## Directory and file enumeration
## Gobuster - fast directory brute-forcer
gobuster dir -u https://example.com -w /usr/share/wordlists/dirb/common.txt

## Feroxbuster - recursive directory scanner
feroxbuster -u https://example.com -w /usr/share/wordlists/dirb/common.txt

## ffuf - fast web fuzzer
ffuf -u https://example.com/FUZZ -w /usr/share/wordlists/dirb/common.txt

## Dirsearch - advanced directory scanner
dirsearch -u https://example.com -e php,html,js

## Subdomain enumeration
## Sublist3r
sublist3r -d example.com

## Amass - comprehensive subdomain enumeration
amass enum -d example.com

## Subfinder
subfinder -d example.com

## Assetfinder
assetfinder --subs-only example.com

## DNS bruteforcing
gobuster dns -d example.com -w /usr/share/wordlists/dns/subdomains-top1million-5000.txt

## Virtual host discovery
gobuster vhost -u https://example.com -w /usr/share/wordlists/virtual-hosts.txt

4. Network Mapping

Understanding network topology:

## Traceroute - map network path
traceroute example.com
## Windows: tracert example.com

## Ping sweep (host discovery only; nmap -sn also sends TCP probes, and uses ARP on a local segment)
nmap -sn 192.168.1.0/24

## ARP scanning (local network)
arp-scan --localnet
netdiscover -r 192.168.1.0/24

## Identify WAF/firewall
wafw00f https://example.com

## Network range identification
whois 8.8.8.8 | grep -E "CIDR|NetRange"
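WHOIS reports ranges in CIDR notation, and it is worth knowing exactly which addresses a block covers before scanning it. The following is a small sketch of that arithmetic for IPv4 using shell integer math. The function names are hypothetical, and it assumes well-formed input without leading zeros in the octets.

```shell
# ip_to_int A.B.C.D: print the address as a 32-bit integer.
ip_to_int() {
    IFS=. read -r a b c d <<EOF
$1
EOF
    echo $(( (a << 24) + (b << 16) + (c << 8) + d ))
}

# int_to_ip N: print the integer as a dotted-quad address.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# cidr_range A.B.C.D/PREFIX: print "first-address last-address count".
cidr_range() {
    base="${1%/*}"; prefix="${1#*/}"
    size=$(( 1 << (32 - prefix) ))
    start=$(( $(ip_to_int "$base") & ~(size - 1) ))   # clear the host bits
    echo "$(int_to_ip "$start") $(int_to_ip $(( start + size - 1 ))) $size"
}
```

For example, `cidr_range 192.168.1.77/26` prints `192.168.1.64 192.168.1.127 64`, which makes it easy to confirm that a range matches the agreed scope.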

5. Email and User Enumeration

Identifying valid user accounts and email addresses:

## Email harvesting with theHarvester
theHarvester -d example.com -b all

## User enumeration via SMTP VRFY
smtp-user-enum -M VRFY -U users.txt -t mail.example.com

## Username enumeration on web apps
## Using custom scripts or Burp Suite Intruder
## Look for different responses for valid vs invalid users

## Email format identification
## hunter.io API or similar services
## Pattern: [email protected]

## LinkedIn2Username - generate username lists
linkedin2username -c "Example Company"
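Once employee names are known, candidate addresses can be generated for each plausible format and then validated with the techniques above. This sketch assumes a two-part "First Last" name and five naming conventions that are common in practice. The function name `email_candidates` and the format list are illustrative assumptions, not the output of any particular tool.

```shell
# email_candidates "First Last" DOMAIN: print candidate addresses for one
# person using several common local-part conventions.
email_candidates() {
    first="$(printf '%s' "${1%% *}" | tr '[:upper:]' '[:lower:]')"
    last="$(printf '%s' "${1##* }" | tr '[:upper:]' '[:lower:]')"
    initial="$(printf '%s' "$first" | cut -c1)"
    for local_part in "$first.$last" "$initial$last" "$first$last" "$first" "${first}_$last"; do
        printf '%s@%s\n' "$local_part" "$2"
    done
}
```

Running `email_candidates "Jane Doe" example.com` yields `jane.doe@example.com`, `jdoe@example.com`, and three further variants.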

6. Vulnerability Scanning

Automated vulnerability identification:

## Nessus (commercial, with free Essentials version)
## Web interface for comprehensive vulnerability scanning

## OpenVAS (open source Nessus alternative)
## Web interface for vulnerability management

## Nikto - web server vulnerabilities
nikto -h https://example.com -Tuning 123456789

## Nuclei - fast vulnerability scanner
nuclei -u https://example.com -t /root/nuclei-templates/

## WPScan - WordPress vulnerability scanner
wpscan --url https://example.com --enumerate u,p,t

## SQLMap - SQL injection detection
sqlmap -u "https://example.com/page?id=1" --batch

## XSStrike - XSS detection
xsstrike -u "https://example.com/search?q=test"

Reconnaissance Methodology

A structured approach ensures comprehensive coverage:

Phase 1: Initial Passive Reconnaissance (No Target Contact)

  1. WHOIS and domain registration lookups
  2. Search engine intelligence (Google dorking)
  3. Social media profiling of organization and employees
  4. Public records and breach data searches
  5. Historical data via web archives
  6. Technology identification via passive tools

Phase 2: Active External Reconnaissance (Internet-Facing Systems)

  1. Subdomain enumeration and discovery
  2. Port scanning of discovered hosts
  3. Service identification and version detection
  4. Web application enumeration (directories, files, parameters)
  5. Email and user enumeration
  6. Vulnerability scanning of external assets

Phase 3: Active Internal Reconnaissance (If Scope Permits)

  1. Network mapping of internal ranges
  2. Host discovery via various protocols
  3. Service enumeration on internal systems
  4. Share and file system enumeration
  5. Active Directory enumeration (if applicable)
  6. Wireless network reconnaissance (if in scope)

Best Practices and Operational Security

⚠️ WARNING: Always ensure you have written authorization before conducting
reconnaissance, especially active techniques. Unauthorized reconnaissance
can violate computer fraud and hacking laws (CFAA in US, Computer Misuse
Act in UK, etc.).

Essential Rules:

  • Obtain explicit written permission and scope definition
  • Stay within defined scope boundaries
  • Document all activities with timestamps
  • Respect privacy and confidentiality
  • Follow coordinated disclosure policies for findings

Avoiding Detection

## Use proxies and VPNs
## Tor for anonymity (where appropriate and legal)

## Rate limiting with nmap
nmap -T2 --scan-delay 1s example.com

## Randomize scan order
nmap --randomize-hosts -iL targets.txt

## Fragmentation to evade IDS
nmap -f example.com

## Decoy scanning
nmap -D RND:10 example.com

## Source port manipulation
nmap --source-port 53 example.com

Documentation

Thorough documentation is critical:

## Create organized directory structure
mkdir -p recon/{passive,active}/{domains,hosts,services,vulns}

## Use tools with output options
nmap -oA recon/active/hosts/nmap_scan example.com

## Screenshot everything
## Use tools like Eyewitness for automated screenshots

## Maintain a chronological log
echo "[$(date)] Started port scan of example.com" >> recon/activity_log.txt

## Create relationship diagrams
## Document relationships between discovered assets
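Logging by hand is easy to forget in the middle of an engagement. One option, sketched here, is a wrapper that records the start time, end time, and exit status of every command it runs. The name `log_run` is hypothetical, and timestamps are written in UTC so that they can be compared against the client's logs regardless of local time zone.

```shell
# log_run LOGFILE COMMAND [ARGS...]: run COMMAND and append START and END
# entries (with UTC timestamps and the exit status) to LOGFILE.
log_run() {
    logfile="$1"; shift
    printf '[%s] START %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*" >> "$logfile"
    "$@"
    status=$?
    printf '[%s] END   %s (exit %d)\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*" "$status" >> "$logfile"
    return "$status"
}

# Usage:
#   log_run recon/activity_log.txt nmap -oA recon/active/hosts/nmap_scan example.com
```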

Advanced Reconnaissance Techniques

Automated Reconnaissance Frameworks

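## Recon-ng - reconnaissance framework
## The lines after recon-ng are typed at its interactive prompt (v5 syntax);
## modules must be installed from the marketplace before they can be loaded
recon-ng
workspaces create example_company
db insert domains example.com
marketplace install recon/domains-hosts/google_site_web
modules load recon/domains-hosts/google_site_web
run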

## SpiderFoot - OSINT automation
spiderfoot -s example.com

## Maltego - visual link analysis
## Commercial tool with community edition
## Excellent for mapping relationships

## OSINT Framework
## Web-based collection of OSINT tools
## Visit osintframework.com

## Amass + Subfinder + Assetfinder combination
amass enum -passive -d example.com -o amass.txt
subfinder -d example.com -o subfinder.txt  
assetfinder --subs-only example.com > assetfinder.txt
cat amass.txt subfinder.txt assetfinder.txt | sort -u > all_subdomains.txt
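A plain `sort -u` keeps wildcard entries, mixed-case duplicates, and any out-of-scope names that the tools happened to return. The sketch below normalizes the merged list and keeps only names at or below the target domain. The function name `merge_in_scope` is hypothetical, and the suffix check is a simple string comparison rather than full DNS name validation.

```shell
# merge_in_scope DOMAIN FILE...: merge host lists, lowercase them, strip
# "*." prefixes and trailing dots, and keep only DOMAIN and its subdomains.
merge_in_scope() {
    domain="$1"; shift
    cat "$@" | tr '[:upper:]' '[:lower:]' |
        sed -E 's/[[:space:]]+$//; s/^\*\.//; s/\.$//' |
        awk -v d="$domain" '$0 == d || (length($0) > length(d) && substr($0, length($0) - length(d)) == "." d)' |
        sort -u
}

# Usage:
#   merge_in_scope example.com amass.txt subfinder.txt assetfinder.txt > all_subdomains.txt
```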

Cloud and Container Reconnaissance

## S3 bucket enumeration
## Using bucket name patterns
aws s3 ls s3://example-backup --no-sign-request

## Cloud IP ranges
## Check cloud provider IP lists
## AWS: https://ip-ranges.amazonaws.com/ip-ranges.json
## Azure: Microsoft download center
## GCP: Google Cloud IP ranges

## Docker registry enumeration
curl https://registry.example.com/v2/_catalog

## Kubernetes service discovery
nmap -p 6443,8001,8080,10250,10255 example.com
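Bucket enumeration by name pattern amounts to combining an organization name with suffixes that administrators tend to use. As a sketch, the function below generates such candidates and drops any that break the basic S3 naming rules (3 to 63 characters, lowercase letters, digits, dots, and hyphens, starting and ending with a letter or digit). The name `bucket_candidates` and the suffix list are assumptions for illustration.

```shell
# bucket_candidates NAME: print candidate bucket names built from NAME and
# common suffixes, keeping only names that satisfy basic S3 naming rules.
bucket_candidates() {
    base="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
    for suffix in backup backups dev staging prod assets logs; do
        printf '%s-%s\n' "$base" "$suffix"
    done | grep -E '^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$'
}
```

Each candidate can then be checked with the `aws s3 ls` command shown above, provided the bucket falls within the authorized scope.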

Certificate Transparency Logs

Certificates reveal subdomains and services:

## Certificate transparency search
curl -s "https://crt.sh/?q=%25.example.com&output=json" | jq -r '.[].name_value' | sort -u

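## Using the Cert Spotter API (the certspotter CLI itself is a log monitor driven by a watchlist file)
curl -s "https://api.certspotter.com/v1/issuances?domain=example.com&include_subdomains=true&expand=dns_names" | jq -r '.[].dns_names[]' | sort -u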

## Censys search for certificates
## Via web interface or API

## SSLyze for SSL/TLS configuration
## (SSLyze 5.x runs its standard scan by default; older releases needed --regular)
sslyze example.com
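A single certificate often lists several hostnames in its Subject Alternative Name extension, which makes it a quick source of subdomains. The sketch below extracts those names from the text form printed by `openssl x509 -noout -text`. The function name `san_names` is hypothetical, and it assumes the usual OpenSSL layout in which the `DNS:` entries appear on the line after the extension heading.

```shell
# san_names: read `openssl x509 -noout -text` output on stdin and print the
# DNS names from the Subject Alternative Name extension, one per line.
san_names() {
    grep -A1 'X509v3 Subject Alternative Name' | tr ',' '\n' |
        sed -nE 's/^[[:space:]]*DNS:(.+)$/\1/p' | sort -u
}

# Usage (requires network access):
#   echo | openssl s_client -connect example.com:443 2>/dev/null |
#       openssl x509 -noout -text | san_names
```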

Reconnaissance Tools Arsenal

Essential Tools by Category

Passive Reconnaissance:

  • theHarvester: Email and subdomain harvesting
  • Recon-ng: Reconnaissance framework
  • Maltego: Visual intelligence
  • Shodan: Search engine for internet-connected devices
  • Censys: Internet-wide scanning data

Active Reconnaissance:

  • Nmap: Port scanning and service detection
  • Masscan: High-speed port scanning
  • Gobuster: Directory and DNS brute-forcing
  • Nikto: Web server scanning
  • Nuclei: Vulnerability scanning with templates

Subdomain Enumeration:

  • Amass: Comprehensive subdomain discovery
  • Subfinder: Fast passive subdomain enumeration
  • Sublist3r: Subdomain enumeration via search engines
  • Assetfinder: Find related domains and subdomains

Web Application:

  • Burp Suite: Web application testing platform
  • OWASP ZAP: Web application security scanner
  • Ffuf: Fast web fuzzer
  • Feroxbuster: Recursive directory scanner

Building Your Reconnaissance Toolkit

#!/bin/bash
## setup_recon_tools.sh - Install essential reconnaissance tools
## Assumes a Debian-based system with Kali package names; run as root
set -euo pipefail

## Update system
apt update && apt upgrade -y

## Install prerequisites
apt install -y python3 python3-pip git curl wget nmap

## Install Go (for Go-based tools); recent tool releases may require a newer Go version
wget https://go.dev/dl/go1.21.0.linux-amd64.tar.gz
tar -C /usr/local -xzf go1.21.0.linux-amd64.tar.gz
## go install places binaries in $HOME/go/bin by default, so add that to PATH as well
export PATH="$PATH:/usr/local/go/bin:$HOME/go/bin"

## Install subdomain enumeration tools
go install github.com/projectdiscovery/subfinder/v2/cmd/subfinder@latest
go install github.com/tomnomnom/assetfinder@latest
git clone https://github.com/OWASP/Amass
(cd Amass && go install ./...)

## Install web enumeration tools
go install github.com/ffuf/ffuf/v2@latest
## feroxbuster is written in Rust, so it cannot be installed with go install
apt install -y feroxbuster gobuster

## Install other essential tools
apt install -y theharvester nikto
## go install puts the nuclei binary on PATH; go build would leave it in the source tree
go install github.com/projectdiscovery/nuclei/v3/cmd/nuclei@latest

## Install Metasploit Framework
curl https://raw.githubusercontent.com/rapid7/metasploit-omnibus/master/config/templates/metasploit-framework-wrappers/msfupdate.erb > msfinstall
chmod 755 msfinstall
./msfinstall

echo "Reconnaissance toolkit installation complete!"

Common Reconnaissance Scenarios

Scenario 1: External Network Penetration Test

## Step 1: Passive information gathering
theHarvester -d example.com -b all -l 500 > harvest.txt
amass enum -passive -d example.com -o domains.txt

## Step 2: Subdomain verification
## httprobe prints full URLs (http://host, https://host), so keep a URL list
## for web tools and a bare-hostname list for nmap
httprobe < domains.txt > live_urls.txt
sed -E 's#^https?://##' live_urls.txt | sort -u > live_hosts.txt

## Step 3: Port scanning
nmap -iL live_hosts.txt -oA nmap_results -sV -sC -p-

## Step 4: Web enumeration
while read -r url; do
    ## Build a filesystem-safe output name from the URL
    name=$(printf '%s' "$url" | tr -c 'A-Za-z0-9.-' '_')
    gobuster dir -u "$url" -w /usr/share/wordlists/dirb/common.txt -o "gobuster_$name.txt"
done < live_urls.txt

## Step 5: Vulnerability scanning
nuclei -l live_urls.txt -t /root/nuclei-templates/ -o nuclei_results.txt

Scenario 2: Web Application Assessment

## Technology identification
whatweb https://webapp.example.com

## Subdomain discovery
amass enum -active -d example.com -o subdomains.txt

## Content discovery
feroxbuster -u https://webapp.example.com -w /usr/share/wordlists/dirb/big.txt -x php,html,js,txt -o ferox_results.txt

## Parameter discovery
arjun -u https://webapp.example.com/api/user

## JavaScript analysis
cat webapp.js | grep -E "api|endpoint|route" > endpoints.txt

## API endpoint discovery
ffuf -u https://webapp.example.com/api/FUZZ -w api-wordlist.txt -mc 200,301,302

Scenario 3: Internal Network Assessment

## Network discovery (use the in-scope range; sweeping an entire /8 covers about 16.7 million addresses)
nmap -sn 10.0.0.0/24 -oG hosts.txt

## Extract active hosts
grep "Status: Up" hosts.txt | cut -d' ' -f2 > active_hosts.txt

## Service enumeration
nmap -sV -sC -iL active_hosts.txt -oA services

## SMB enumeration
crackmapexec smb 10.0.0.0/24

## LDAP/AD enumeration
ldapsearch -x -H ldap://dc.example.local -b "DC=example,DC=local"

## Responder in analyze mode (-A): observes LLMNR/NBT-NS requests without sending poisoned responses
responder -I eth0 -A

Integrating Reconnaissance into the Kill Chain

Reconnaissance feeds into subsequent penetration testing phases:

  1. Weaponization: Use discovered technologies to craft exploits
  2. Delivery: Target specific services and users identified
  3. Exploitation: Focus on vulnerabilities found during scanning
  4. Installation: Plan persistence based on system architecture
  5. Command & Control: Understand egress filtering and monitoring
  6. Actions on Objectives: Know what data and systems exist

Conclusion

Reconnaissance is the foundation of successful penetration testing. By combining passive and active techniques, penetration testers build comprehensive intelligence profiles that guide all subsequent testing activities. Key takeaways:

  • Start passive: Minimize detection risk with OSINT and passive techniques
  • Progress to active: Conduct targeted active reconnaissance based on passive findings
  • Document everything: Maintain detailed logs and organize findings systematically
  • Stay legal: Always operate within authorized scope and legal boundaries
  • Automate wisely: Use tools and frameworks but verify results manually
  • Think like attackers: Consider what real adversaries would find valuable
  • Continuous process: Reconnaissance doesn’t end after the initial phase

Master these reconnaissance techniques, and you’ll consistently uncover attack vectors that others miss, leading to more comprehensive and valuable penetration testing assessments.

Remember: The time invested in thorough reconnaissance directly correlates with testing effectiveness. Never rush this critical phase.
