Module 2: External Attack Surface Discovery

Read: 10 min | Lab: 35 min | Total: 45 min

Overview

Your external attack surface is everything an adversary can see about your organization without authenticating -- domains, subdomains, login portals, exposed services, and the technologies behind them. In Module 1, we established that publicly discoverable information enabled every attack we reviewed. This module teaches you how to discover that information systematically, the same way an adversary would.

What Adversaries See

External reconnaissance requires no credentials and no special access. An adversary with a web browser and free tools can discover:

  • Domains and subdomains -- every domain your organization registers and every SSL/TLS certificate issued for it reveals infrastructure
  • Login portals -- VPN, webmail, remote desktop, and application login pages are visible to anyone who finds them
  • Technology stack -- HTTP headers, certificate details, and login page appearance reveal specific products and versions
  • Development and staging environments -- subdomains like dev., staging., or test. often have weaker security controls

Discovery Techniques

This module uses five complementary techniques, all browser-based:

Technique | What It Finds | Tools
Certificate transparency | Subdomains from SSL certificate issuance logs | crt.sh
Subdomain enumeration | Subdomains with IP addresses, hosting providers, infrastructure mapping | Subdomain Finder
DNS enumeration | DNS records, mail servers, name servers, network maps | DNSDumpster, SecurityTrails
Internet device search | Exposed services, banners, product identification | Shodan, Censys
Search engine dorking | Indexed login pages, documents, configuration files | Google advanced search operators

Naming Conventions That Reveal Infrastructure

Organizations often use predictable subdomain naming patterns. These are valuable reconnaissance targets because they indicate what is behind each service:

Pattern | What It Suggests
vpn., remote., access. | VPN or remote access portals
citrix., rdp., vdi. | Virtual desktop / remote desktop services
mail., owa., webmail. | Email services (login portals)
dev., staging., test. | Development/staging environments (often less secured)
plant1., substation-west., site- | Facility-specific infrastructure (reveals OT network presence)
api., portal., dashboard. | Application interfaces (potential data exposure)
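
The pattern table above can be applied to a saved subdomain list with a few lines of Python. This is a minimal sketch, not a complete classifier: the `PATTERNS` mapping and the `classify` function are hypothetical names introduced here, and matching on the left-most label prefix is a deliberately coarse assumption (for example, `dev` would also match `devices`).

```python
# Hypothetical mapping built from the naming-pattern table; extend as needed.
PATTERNS = {
    "VPN or remote access portal": ("vpn", "remote", "access"),
    "Virtual desktop / remote desktop": ("citrix", "rdp", "vdi"),
    "Email service": ("mail", "owa", "webmail"),
    "Development/staging environment": ("dev", "staging", "test"),
    "Facility-specific infrastructure": ("plant", "substation", "site-"),
    "Application interface": ("api", "portal", "dashboard"),
}


def classify(hostname: str) -> str:
    """Return the category suggested by the left-most DNS label."""
    label = hostname.lower().split(".")[0]
    for category, prefixes in PATTERNS.items():
        # str.startswith accepts a tuple and matches any of its prefixes.
        if label.startswith(prefixes):
            return category
    return "Unclassified"


if __name__ == "__main__":
    for host in ("vpn.example.com", "plant1.example.com", "www.example.com"):
        print(host, "->", classify(host))
```

Treat the output as a triage aid only: a name suggests what may sit behind a host, and each match still needs manual review.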

Lab: Build Your External Attack Surface Inventory

In this lab, you will discover your target organization's external attack surface using browser-based tools and AI-assisted analysis. Follow each step and document your findings -- the output becomes Artifact 2.

Step 1: Target Selection

Select your target organization for this exercise:

  • Your own organization (with authorization) -- provides the most operationally relevant results
  • NRECA (practice target) -- follow along using the worked examples included in each step

Identify all root domains you will investigate. Most organizations have more than one -- check the main website, email domain, and any known portals. The primary website domain often reveals the least infrastructure; member portals, internal service domains, and email domains are typically far richer targets.

Step 2: Domain and Subdomain Enumeration

This step uses two complementary approaches: certificate transparency (CT) logs reveal subdomains from SSL certificate issuance records, and subdomain enumeration tools aggregate results from multiple sources with IP address data. Together they provide broader domain coverage than either approach alone.

2a. Certificate transparency search

Certificate transparency logs record every SSL/TLS certificate issued by public certificate authorities. When an organization requests a certificate for a subdomain, that subdomain becomes permanently recorded in CT logs -- even if the subdomain is not linked from any public page.

crt.sh provides a JSON API that returns structured certificate data. Appending &output=json to any crt.sh query returns JSON instead of an HTML table -- this makes it easy to feed results directly to your AI client for analysis.

Query the JSON endpoint for each of your root domains:

https://crt.sh/?q=%25.[yourdomain.com]&output=json
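
In the URL above, `%25` is the URL-encoded form of the `%` wildcard character. If you investigate several root domains, a short Python sketch can build the query URLs for you to paste into the browser. The function name `crtsh_url` is a hypothetical helper introduced for illustration; the code only builds the string and does not contact crt.sh.

```python
from urllib.parse import urlencode


def crtsh_url(root_domain: str) -> str:
    """Build the crt.sh JSON query URL for all names under a root domain."""
    # urlencode percent-encodes the leading "%" wildcard as "%25".
    query = urlencode({"q": f"%.{root_domain}", "output": "json"})
    return f"https://crt.sh/?{query}"


if __name__ == "__main__":
    for domain in ("electric.coop", "cooperative.com"):
        print(crtsh_url(domain))
```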

Save the JSON response to a local file using your browser's Save Page As (Ctrl+S / Cmd+S). Name it something descriptive like yourdomain_crt.json. Then attach or paste the saved JSON file into your AI client with this prompt:

CT JSON Analysis Prompt
I am analyzing certificate transparency data from crt.sh for
[organization name]. The attached/pasted JSON contains certificate
records for %.[yourdomain.com].

Analyze this data and provide:
1. A deduplicated list of all unique subdomain and domain names
2. Naming patterns and conventions the organization uses
3. Subdomains suggesting admin interfaces, login portals,
   VPN/remote access, dev/staging environments, or API endpoints
4. Any wildcard certificates and what infrastructure they might
   cover
5. Certificate authorities used (reveals vendor relationships
   and hosting providers)
6. Anything unusual or potentially sensitive that a human reviewer
   would miss scrolling through hundreds of certificate rows

Output a clean categorized subdomain list organized by
function/risk category.

Timeout fallback: crt.sh JSON responses can be slow for domains with large result sets. If the endpoint times out or returns an error, use the standard HTML view at https://crt.sh/?q=%25.[yourdomain.com], select all results in the table (Ctrl+A), copy (Ctrl+C), and paste into your AI client instead.
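
If you want to check the AI client's deduplicated list against the raw data, the same deduplication can be sketched locally in Python. This sketch assumes the saved crt.sh JSON is an array of records with a `name_value` field that may hold several newline-separated names; verify that against your own saved file before relying on it. The function names are hypothetical.

```python
import json


def unique_names(crtsh_json: str) -> list[str]:
    """Return a sorted, deduplicated list of names from crt.sh JSON text."""
    names = set()
    for record in json.loads(crtsh_json):
        # Assumption: name_value holds one or more newline-separated names.
        for name in record.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name:
                names.add(name)
    return sorted(names)


def wildcards(names: list[str]) -> list[str]:
    """Return only the wildcard entries, such as *.example.com."""
    return [name for name in names if name.startswith("*.")]


if __name__ == "__main__":
    # Replace the file name with the JSON file you saved from crt.sh.
    with open("yourdomain_crt.json", encoding="utf-8") as handle:
        found = unique_names(handle.read())
    print(len(found), "unique names;", len(wildcards(found)), "wildcards")
```

A mismatch between this count and the AI-generated list is a prompt to recheck the AI output rather than proof that either side is wrong.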

2b. Subdomain enumeration

CT logs reveal certificate-level data but may miss subdomains covered by wildcard certificates. Subdomain enumeration tools aggregate results from multiple sources and provide IP address data that CT logs do not include.

subdomainfinder.c99.nl is a subdomain enumeration tool that exports JSON with subdomain, IP address, and Cloudflare status for each result:

{"subdomain":"portal.example.com","ip":"203.0.113.50","cloudflare":"false"}

Search each of your root domains and save the JSON results. Use additional DNS enumeration tools to supplement:

Tool | What to Do | Account Required
Subdomain Finder | Search each root domain -- export/save JSON results (includes IP addresses and Cloudflare status) | No
DNSDumpster | Enter root domain -- review DNS records, subdomains, and network map | No
SecurityTrails | Search domain -- review historical DNS records and subdomain history | Free tier
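
Because each exported record pairs a subdomain with an IP address, you can group hosts by shared IP before or alongside AI analysis. The sketch below assumes the export has been loaded as a list of objects shaped like the single record shown earlier in this step (`subdomain`, `ip`, `cloudflare`); the actual export layout may differ, and `cluster_by_ip` is a hypothetical helper name.

```python
from collections import defaultdict


def cluster_by_ip(records: list[dict]) -> dict[str, list[str]]:
    """Group subdomains by IP and keep only IPs shared by two or more hosts."""
    clusters = defaultdict(list)
    for record in records:
        ip = record.get("ip")
        if ip:  # skip records with no resolved address
            clusters[ip].append(record["subdomain"])
    return {ip: sorted(hosts) for ip, hosts in clusters.items() if len(hosts) > 1}


if __name__ == "__main__":
    sample = [
        {"subdomain": "portal.example.com", "ip": "203.0.113.50", "cloudflare": "false"},
        {"subdomain": "api.example.com", "ip": "203.0.113.50", "cloudflare": "false"},
        {"subdomain": "mail.example.com", "ip": "203.0.113.51", "cloudflare": "false"},
    ]
    print(cluster_by_ip(sample))
```

Hosts that share an address may sit behind the same load balancer or hosting account, which is worth noting in your inventory.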

2c. AI-assisted analysis of combined results

Attach or paste the saved JSON files from all enumerated domains into your AI client. The combined analysis across all root domains is where the most valuable patterns emerge:

Subdomain Enumeration Analysis Prompt
I am analyzing subdomain enumeration data for [organization name],
a [sector] organization. The attached/pasted JSON files contain
subdomain records for [domain1.com] and [domain2.com]. Each record
includes subdomain name, IP address, and Cloudflare status.

Analyze the combined data and provide:
1. All unique subdomains grouped by root domain
2. IP address clustering -- which subdomains share the same IP
   (suggests shared hosting or load balancer)
3. Authentication and identity infrastructure (SSO portals,
   federation servers, access management)
4. Remote access or VPN-related subdomains
5. API endpoints and what backend systems they suggest
6. Staging, test, or development environments
7. Blue/green or canary deployment patterns that reveal
   infrastructure architecture
8. Subdomains revealing sensitive business functions (benefits,
   payroll, financial, compliance, HR)
9. Which subdomains use cloud provider IPs vs on-premise IP
   ranges
10. Which root domain reveals the most infrastructure detail
    and why that matters for reconnaissance

Output a categorized inventory organized by: authentication/
identity, remote access, benefits/HR, financial/compliance,
APIs/backend, staging/test, content/public, and other.

Worked Example: NRECA Domain and Subdomain Enumeration

Certificate Transparency: electric.coop

Query the crt.sh JSON API for NRECA's primary domain:

https://crt.sh/?q=%25.electric.coop&output=json

Save the JSON response and attach it to your AI client using the CT JSON Analysis Prompt above.

Example AI Response (CT JSON Analysis for electric.coop)

Subdomain Enumeration: cooperative.com vs electric.coop

NRECA operates across at least three root domains: electric.coop (public website), cooperative.com (member services), and nreca.coop (email). Searching each with Subdomain Finder reveals a critical lesson:

  • Search cooperative.com: returns 100+ subdomains with rich infrastructure detail
  • Search electric.coop: returns only ~2 subdomains (www, careers)

The public-facing website domain (electric.coop) reveals almost nothing, while the member-services domain (cooperative.com) exposes far more of the organization's infrastructure. An adversary who only searches the primary website domain misses most of the attack surface -- and so will a defender who does the same.

Example AI Response (Subdomain Analysis for NRECA)

Validation Checkpoint

Compare the AI-generated inventory against what is identifiable from NRECA's public website and contact pages:

Domain | Purpose | Notes
electric.coop | Public website (WordPress) | News, advocacy, public-facing
cooperative.com | Member portal (SharePoint) | Login required -- attack surface
nreca.coop | Email domain | Confirmed format: firstname.lastname@nreca.coop
benefits.cooperative.com | Employee benefits portal | Login required -- sensitive
careers.electric.coop | Job postings | Reveals tech stack, org structure
resap.cooperative.com | Safety program portal | Member login
ecba.cooperative.com | Bar Association portal | Member login
remagazine.cooperative.com | RE Magazine | Content platform
nrecainternational.coop | International programs | Separate org site

Manual website review found 9 domains and subdomains. Subdomain enumeration of cooperative.com alone found 100+, including authentication infrastructure (Okta, PingFederate), benefits/PII systems, financial portals, and staging environments that do not appear anywhere on the public website. Enumeration routinely finds infrastructure that manual review misses.

Step 3: Remote Access Identification

Use internet device search engines to identify what services are exposed on your target's known domains or IP ranges. Look for:

  • Web servers -- identified by HTTP response headers (Server, X-Powered-By)
  • Mail servers -- SMTP, IMAP, Exchange endpoints
  • VPN login portals -- identified by product-specific login pages (FortiGate, Cisco ASA, Palo Alto GlobalProtect)
  • Load balancers and CDN endpoints -- may indicate cloud-hosted infrastructure

Manual verification required. This step must be performed using browser-based tools. AI clients cannot access Shodan, Censys, or live internet infrastructure -- they will generate plausible but fabricated results if asked to identify exposed services. Use AI for analyzing data you have already collected (Steps 2 and 4), not for discovering live services.

Tool | What to Do | Account Required
Shodan | Search by domain, hostname, or organization name -- review banners for product identification | Free tier
Censys | Search by domain or IP -- review certificates and service details | Free tier

Focus on identifying:

  • VPN login portals -- FortiGate, Cisco ASA/AnyConnect, Palo Alto GlobalProtect, SonicWall
  • Remote desktop services -- RDP, Citrix, VMware Horizon
  • Vendor remote support tools -- TeamViewer, AnyDesk, ConnectWise ScreenConnect
  • Web application login pages -- OWA, SharePoint, custom applications

Organizations with Shodan or Censys API access (paid tiers) can automate and scale these queries programmatically -- that is a natural next step beyond this workshop.

3a. Documenting discovered assets

The value of this step is not just finding exposed services -- it is recording them in a way that supports ongoing tracking and remediation. For each discovered asset, capture information across three categories:

Asset identification:

  • Hostname / FQDN
  • IP address
  • Port and protocol
  • Product vendor and name
  • Version or firmware (where identifiable from banners, headers, or login pages)
  • Certificate details: issuer, expiration date, Subject Alternative Name (SAN) entries

Ownership and responsibility:

  • Asset owner or responsible team
  • Business unit or facility
  • IT vs OT designation
  • Contact for patching or change management

Operational context:

  • Business function or purpose
  • Criticality to operations (a perimeter firewall protecting a substation vs a marketing blog)
  • Network zone: internet-facing, DMZ, dual-homed
  • Managed by internal staff or a third-party vendor/integrator

You will not be able to fill in every field from external discovery alone -- ownership and some operational context require internal knowledge. Record what you can from Shodan/Censys results and flag the gaps. The complete baseline template is provided in the Templates section.
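
One way to keep records consistent and make the gaps visible is to capture each asset in a fixed structure. The sketch below is an assumption about how the three categories could be modeled in Python; the class name `AssetRecord`, its field names, and the `gaps` method are hypothetical and are not the layout of the workshop's baseline template.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class AssetRecord:
    # Asset identification
    hostname: str
    ip_address: Optional[str] = None
    port_protocol: Optional[str] = None
    product: Optional[str] = None
    version: Optional[str] = None
    certificate: Optional[str] = None
    # Ownership and responsibility
    owner: Optional[str] = None
    business_unit: Optional[str] = None
    it_or_ot: Optional[str] = None
    patch_contact: Optional[str] = None
    # Operational context
    function: Optional[str] = None
    criticality: Optional[str] = None
    network_zone: Optional[str] = None
    managed_by: Optional[str] = None

    def gaps(self) -> list[str]:
        """List the fields still empty, i.e. those needing internal follow-up."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


if __name__ == "__main__":
    record = AssetRecord(
        hostname="vpn.example-coop.com",
        ip_address="203.0.113.50",
        port_protocol="443/TCP",
    )
    print("Fields to follow up internally:", record.gaps())
```

Fields that remain empty after external discovery are the ones to take to the asset owner or internal teams.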

3b. Connecting to asset management

Many organizations maintain IT asset inventories in platforms like ServiceNow, but OT assets are frequently absent from these systems. External discovery often reveals OT-adjacent assets -- VPN concentrators serving SCADA networks, historian web interfaces, vendor remote access portals -- that should be tracked alongside IT assets.

If your organization has an asset management system, cross-reference your external findings against it. Gaps between what you discovered externally and what appears in your inventory are themselves a finding -- they represent assets that exist on the internet but are not being tracked, patched, or managed through your standard processes.
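
The cross-reference itself is a set comparison. As a rough sketch, assuming you can export hostnames from your asset management system as a plain list, the following Python finds externally discovered hosts that the inventory does not contain. The function names are hypothetical, and real exports will likely need more normalization than the lowercase and trailing-dot handling shown.

```python
def normalize(hostname: str) -> str:
    """Lowercase a hostname and drop surrounding spaces and any trailing dot."""
    return hostname.strip().lower().rstrip(".")


def untracked_assets(discovered: list[str], inventory: list[str]) -> list[str]:
    """Return discovered hostnames that do not appear in the inventory."""
    known = {normalize(host) for host in inventory}
    return sorted({normalize(host) for host in discovered} - known)


if __name__ == "__main__":
    discovered = ["vpn.example.com", "Historian.example.com", "www.example.com"]
    inventory = ["www.example.com", "VPN.example.com."]
    print(untracked_assets(discovered, inventory))
```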

Worked Example: NRECA Remote Access Documentation

Here is how a single discovered service would be documented using the three-category framework above:

Example: FortiGate VPN Portal

Asset identification:

  • Hostname: vpn.example-coop.com
  • IP: 203.0.113.50
  • Port: 443/TCP
  • Product: Fortinet FortiGate, FortiOS 7.4.6
  • Cert: DigiCert, expires 2026-09-15, SAN includes vpn.example-coop.com

Ownership and responsibility:

  • Owner: IT Security (confirm)
  • Facility: Main office
  • Designation: IT perimeter

Operational context:

  • Function: SSL-VPN remote access for staff and vendors
  • Criticality: High -- primary remote access path to internal network
  • Zone: Internet-facing
  • Management: Internal IT staff

This FortiGate finding connects directly to Module 4, where you will correlate the product and version against vulnerability databases.

Step 4: Consolidation and Google Dorking

At this point you have three sets of findings: certificate transparency data, subdomain enumeration with IP mapping, and manually verified exposed services with product identification. This step uses Google dorking to find resources that the other techniques miss, then consolidates everything into a single baseline.

4a. Google dorking

Search engine dorking uses Google's advanced search operators to find pages and files that the organization may not intend to be publicly accessible. Use your AI client to generate targeted dork queries based on your discoveries so far:

Google Dork Query Generation Prompt
I am performing external attack surface discovery on [organization
name], a [sector] organization. I have identified these root
domains: [domain1.com, domain2.com, domain3.com].

Based on these domains, generate Google dork queries to find:
1. Login pages and authentication portals
2. Exposed documents (PDF, DOCX, XLSX) that may contain
   sensitive information
3. Directory listings that reveal file structures
4. Error pages or debug output that reveals technology stack
5. Configuration files, backup files, or admin panels
6. Pages referencing VPN, remote access, or SCADA/OT systems

Format each query so I can paste it directly into Google.

Run the generated queries in Google and review the first 2-3 pages of results. Add any new findings to your inventory. Google dorking frequently surfaces resources that subdomain enumeration and Shodan/Censys miss -- particularly indexed documents, forgotten login pages, and configuration files.
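
If you prefer a repeatable starting set over AI-generated queries, a few common dorks can be templated per domain. This is a minimal sketch: the `DORK_TEMPLATES` mapping and `build_dorks` function are hypothetical, the templates are illustrative rather than exhaustive, and they rely only on the `site:`, `inurl:`, `intitle:`, and `filetype:` search operators.

```python
# Illustrative templates only; adjust keywords and file types to your target.
DORK_TEMPLATES = {
    "login pages": 'site:{domain} (inurl:login OR inurl:signin OR intitle:"sign in")',
    "exposed documents": "site:{domain} (filetype:pdf OR filetype:docx OR filetype:xlsx)",
    "directory listings": 'site:{domain} intitle:"index of"',
    "config and backup files": "site:{domain} (filetype:bak OR filetype:conf OR filetype:sql)",
    "remote access references": 'site:{domain} ("vpn" OR "remote access" OR "scada")',
}


def build_dorks(domains: list[str]) -> list[tuple[str, str]]:
    """Return (purpose, query) pairs for every domain and template."""
    return [
        (purpose, template.format(domain=domain))
        for domain in domains
        for purpose, template in DORK_TEMPLATES.items()
    ]


if __name__ == "__main__":
    for purpose, query in build_dorks(["electric.coop", "cooperative.com"]):
        print(f"{purpose}: {query}")
```

Each printed query can be pasted directly into Google, the same way as the AI-generated ones.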

4b. Consolidated analysis

Feed your combined findings from all steps into your AI client to produce the final attack surface inventory:

Attack Surface Consolidation Prompt
I have completed external attack surface discovery for
[organization name]. Here is a summary of all findings:

CERTIFICATE TRANSPARENCY (crt.sh):
[paste or summarize CT findings]

SUBDOMAIN ENUMERATION:
[paste or summarize subdomain/IP findings]

EXPOSED SERVICES (Shodan/Censys):
[paste or summarize service discovery findings]

GOOGLE DORKING:
[paste or summarize any additional findings]

Consolidate all findings into a single attack surface baseline:
1. A categorized inventory of all discovered assets organized by:
   remote access, authentication/identity, email, web applications,
   benefits/HR/financial, APIs, staging/test, content/public
2. For each category, note the highest-risk assets and why
3. Gaps in my discovery -- what should I investigate further
4. A prioritized list of assets to carry forward to vulnerability
   correlation (Module 4) and monitoring (Module 5)
5. A summary table suitable for use as a baseline document

Review and correct the AI-generated baseline. This consolidated inventory becomes your Artifact 2.
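
When reviewing the AI-generated baseline, it helps to know which hosts were confirmed by more than one technique. The sketch below merges hostname lists from each source and records where each host was seen. The function name `consolidate` and the source labels are assumptions for illustration; you would feed in your own lists.

```python
from collections import defaultdict


def consolidate(sources: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each hostname to the sorted list of sources that reported it."""
    seen = defaultdict(set)
    for source, hosts in sources.items():
        for host in hosts:
            seen[host.strip().lower()].add(source)
    return {host: sorted(found_in) for host, found_in in sorted(seen.items())}


if __name__ == "__main__":
    findings = {
        "crt.sh": ["vpn.example.com", "www.example.com"],
        "subdomain finder": ["VPN.example.com", "api.example.com"],
        "shodan": ["vpn.example.com"],
    }
    for host, found_in in consolidate(findings).items():
        print(host, "<-", ", ".join(found_in))
```

Hosts reported by a single source are reasonable candidates for a second look, since they may be stale records or easy-to-miss assets.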

Worked Example: NRECA Consolidation and Google Dorking

Using the Google Dork Query Generation Prompt above with NRECA's three root domains (electric.coop, cooperative.com, nreca.coop):

Example AI Response (Google Dork Queries for NRECA)

Output

Artifact 2: External attack surface inventory. A categorized list of your target organization's domains, subdomains, remote access services, and identified technologies. This inventory feeds directly into Module 3 (identifying which personnel have access to these services), Module 4 (correlating identified products against vulnerability databases), and Module 5 (establishing a monitoring baseline to detect changes).

Record your findings in the Baseline Document Template (download Excel), Section 1: External Attack Surface.
