Module 2: External Attack Surface Discovery

Read: 10 min | Lab: 35 min | Total: 45 min

Overview

Your external attack surface is everything an adversary can see about your organization without authenticating -- domains, subdomains, login portals, exposed services, and the technologies behind them. In Module 1, we established that publicly discoverable information enabled every attack we reviewed. This module teaches you how to discover that information systematically, the same way an adversary would.

What Adversaries See

External reconnaissance requires no credentials and no special access. An adversary with a web browser and free tools can discover:

Domains and subdomains -- every domain your organization registers and every SSL/TLS certificate issued for its hostnames reveals infrastructure
Login portals -- VPN, webmail, remote desktop, and application login pages are visible to anyone who finds them
Technology stack -- HTTP headers, certificate details, and login page appearance reveal specific products and versions
Development and staging environments -- subdomains like dev., staging., or test. often have weaker security controls

Discovery Techniques

This module uses five complementary techniques, all browser-based:

Technique	What It Finds	Tools
Certificate transparency	Subdomains from SSL certificate issuance logs	crt.sh
Subdomain enumeration	Subdomains with IP addresses, hosting providers, infrastructure mapping	Subdomain Finder
DNS enumeration	DNS records, mail servers, name servers, network maps	DNSDumpster, SecurityTrails
Internet device search	Exposed services, banners, product identification	Shodan, Censys
Search engine dorking	Indexed login pages, documents, configuration files	Google advanced search operators

Naming Conventions That Reveal Infrastructure

Organizations often use predictable subdomain naming patterns. These are valuable reconnaissance targets because they indicate what is behind each service:

Pattern	What It Suggests
vpn., remote., access.	VPN or remote access portals
citrix., rdp., vdi.	Virtual desktop / remote desktop services
mail., owa., webmail.	Email services (login portals)
dev., staging., test.	Development/staging environments (often less secured)
plant1., substation-west., site-	Facility-specific infrastructure (reveals OT network presence)
api., portal., dashboard.	Application interfaces (potential data exposure)
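
The pattern table above can be applied mechanically to a list of hostnames. The following is a minimal sketch, not part of the workshop tooling: the category names and the `classify` helper are illustrative assumptions, and prefix matching is deliberately loose (for example, `developer.` would also match `dev`), so treat the output as a triage hint to review by hand.

```python
# Hypothetical mapping built from the pattern table above; extend it for your environment.
PATTERNS = {
    "remote access": ("vpn", "remote", "access"),
    "virtual desktop": ("citrix", "rdp", "vdi"),
    "email": ("mail", "owa", "webmail"),
    "dev/staging": ("dev", "staging", "test"),
    "facility/OT": ("plant", "substation", "site-"),
    "application": ("api", "portal", "dashboard"),
}


def classify(hostname: str) -> str:
    """Return the first category whose prefix starts the leftmost DNS label."""
    label = hostname.lower().split(".")[0]
    for category, prefixes in PATTERNS.items():
        # str.startswith accepts a tuple and matches any of its entries.
        if label.startswith(prefixes):
            return category
    return "unclassified"


for host in ["vpn.example.com", "plant1.example.com", "www.example.com"]:
    print(host, "->", classify(host))
```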

Lab: Build Your External Attack Surface Inventory

In this lab, you will discover your target organization's external attack surface using browser-based tools and AI-assisted analysis. Follow each step and document your findings -- the output becomes Artifact 2.

Step 1: Target Selection

Select your target organization for this exercise:

Your own organization (with authorization) -- provides the most operationally relevant results
NRECA (practice target) -- follow along using the worked examples included in each step

Identify all root domains you will investigate. Most organizations have more than one -- check the main website, email domain, and any known portals. The primary website domain often reveals the least infrastructure; member portals, internal service domains, and email domains are typically far richer targets.

Step 2: Domain and Subdomain Enumeration

This step uses two complementary approaches: certificate transparency (CT) logs reveal subdomains from SSL certificate issuance records, and subdomain enumeration tools aggregate results from multiple sources with IP address data. Together they give far more complete domain coverage than either approach alone.

2a. Certificate transparency search

Certificate transparency logs record every SSL/TLS certificate issued by public certificate authorities. When an organization requests a certificate for a subdomain, that subdomain becomes permanently recorded in CT logs -- even if the subdomain is not linked from any public page.

crt.sh provides a JSON API that returns structured certificate data. Appending &output=json to any crt.sh query returns JSON instead of an HTML table -- this makes it easy to feed results directly to your AI client for analysis.

Query the JSON endpoint for each of your root domains:

https://crt.sh/?q=%25.[yourdomain.com]&output=json
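
If you have several root domains, a small helper can build the query URL for each one so you can open them in your browser. This is a sketch only: the function name `crtsh_json_url` and the example domains are assumptions for illustration. It shows why the URL contains `%25`: that is the percent-encoded form of `%`, the wildcard crt.sh uses in searches.

```python
from urllib.parse import urlencode


def crtsh_json_url(root_domain: str) -> str:
    """Build the crt.sh wildcard query URL for one root domain."""
    # "%" is the search wildcard; urlencode percent-encodes it as %25.
    query = urlencode({"q": f"%.{root_domain}", "output": "json"})
    return f"https://crt.sh/?{query}"


# Placeholder domains; substitute your own authorized targets.
for domain in ["example.com", "example.org"]:
    print(crtsh_json_url(domain))
```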

Save the JSON response to a local file using your browser's Save Page As (Ctrl+S / Cmd+S). Name it something descriptive like yourdomain_crt.json. Then attach or paste the saved JSON file into your AI client with this prompt:

CT JSON Analysis Prompt

I am analyzing certificate transparency data from crt.sh for
[organization name]. The attached/pasted JSON contains certificate
records for %.[yourdomain.com].

Analyze this data and provide:
1. A deduplicated list of all unique subdomain and domain names
2. Naming patterns and conventions the organization uses
3. Subdomains suggesting admin interfaces, login portals,
   VPN/remote access, dev/staging environments, or API endpoints
4. Any wildcard certificates and what infrastructure they might
   cover
5. Certificate authorities used (reveals vendor relationships
   and hosting providers)
6. Anything unusual or potentially sensitive that a human reviewer
   would miss scrolling through hundreds of certificate rows

Output a clean categorized subdomain list organized by
function/risk category.

Timeout fallback: crt.sh JSON responses can be slow for domains with large result sets. If the endpoint times out or returns an error, use the standard HTML view at https://crt.sh/?q=%25.[yourdomain.com], select all results in the table (Ctrl+A), copy (Ctrl+C), and paste into your AI client instead.
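
Before handing the file to your AI client, you may want a quick local pass that deduplicates the names and separates out wildcard entries. The sketch below assumes each record in the crt.sh JSON array carries a `name_value` field that may hold several names separated by newlines; check that against your saved file, since the response format is not guaranteed. The `unique_names` helper and the sample data are illustrative.

```python
import json


def unique_names(crtsh_json: str) -> tuple[list[str], list[str]]:
    """Return (hostnames, wildcard entries) found in a crt.sh JSON response."""
    names = set()
    for record in json.loads(crtsh_json):
        # Assumption: name_value holds one or more names, newline-separated.
        for name in record.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name:
                names.add(name)
    wildcards = sorted(n for n in names if n.startswith("*."))
    hosts = sorted(n for n in names if not n.startswith("*."))
    return hosts, wildcards


# Made-up sample standing in for the contents of yourdomain_crt.json.
sample = json.dumps([
    {"name_value": "portal.example.com\n*.example.com"},
    {"name_value": "Portal.example.com"},
    {"name_value": "vpn.example.com"},
])
hosts, wildcards = unique_names(sample)
print("hosts:", hosts)
print("wildcards:", wildcards)
```

A wildcard entry in the second list is a signal that CT data undercounts the hostnames in use, which is the gap the next sub-step addresses.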

2b. Subdomain enumeration

CT logs reveal certificate-level data but may miss subdomains covered by wildcard certificates. Subdomain enumeration tools aggregate results from multiple sources and provide IP address data that CT logs do not include.

subdomainfinder.c99.nl is a subdomain enumeration tool that exports JSON with subdomain, IP address, and Cloudflare status for each result:

{"subdomain":"portal.example.com","ip":"203.0.113.50","cloudflare":"false"}

Search each of your root domains and save the JSON results. Use additional DNS enumeration tools to supplement:

Tool	What to Do	Account Required
Subdomain Finder	Search each root domain -- export/save JSON results (includes IP addresses and Cloudflare status)	No
DNSDumpster	Enter root domain -- review DNS records, subdomains, and network map	No
SecurityTrails	Search domain -- review historical DNS records and subdomain history	Free tier

2c. AI-assisted analysis of combined results

Attach or paste the saved JSON files from all enumerated domains into your AI client. The combined analysis across all root domains is where the most valuable patterns emerge:

Subdomain Enumeration Analysis Prompt

I am analyzing subdomain enumeration data for [organization name],
a [sector] organization. The attached/pasted JSON files contain
subdomain records for [domain1.com] and [domain2.com]. Each record
includes subdomain name, IP address, and Cloudflare status.

Analyze the combined data and provide:
1. All unique subdomains grouped by root domain
2. IP address clustering -- which subdomains share the same IP
   (suggests shared hosting or load balancer)
3. Authentication and identity infrastructure (SSO portals,
   federation servers, access management)
4. Remote access or VPN-related subdomains
5. API endpoints and what backend systems they suggest
6. Staging, test, or development environments
7. Blue/green or canary deployment patterns that reveal
   infrastructure architecture
8. Subdomains revealing sensitive business functions (benefits,
   payroll, financial, compliance, HR)
9. Which subdomains use cloud provider IPs vs on-premise IP
   ranges
10. Which root domain reveals the most infrastructure detail
    and why that matters for reconnaissance

Output a categorized inventory organized by: authentication/
identity, remote access, benefits/HR, financial/compliance,
APIs/backend, staging/test, content/public, and other.
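
Item 2 of the prompt (IP address clustering) is easy to cross-check locally, which helps you validate what the AI client reports. This is a minimal sketch assuming the export is a list of records shaped like the Subdomain Finder example shown earlier (`subdomain`, `ip`, `cloudflare`); the actual export layout may differ, and `cluster_by_ip` is a hypothetical helper name.

```python
from collections import defaultdict


def cluster_by_ip(records: list[dict]) -> dict[str, list[str]]:
    """Group subdomains by resolved IP, largest cluster first."""
    clusters = defaultdict(list)
    for rec in records:
        ip = rec.get("ip") or "unresolved"
        clusters[ip].append(rec["subdomain"])
    # Big clusters usually indicate shared hosting or a load balancer.
    ordered = sorted(clusters.items(), key=lambda kv: len(kv[1]), reverse=True)
    return dict(ordered)


# Made-up records using documentation IP ranges.
records = [
    {"subdomain": "portal.example.com", "ip": "203.0.113.50", "cloudflare": "false"},
    {"subdomain": "benefits.example.com", "ip": "203.0.113.50", "cloudflare": "false"},
    {"subdomain": "blog.example.com", "ip": "198.51.100.7", "cloudflare": "true"},
]
for ip, hosts in cluster_by_ip(records).items():
    print(ip, len(hosts), hosts)
```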

Worked Example: NRECA Domain and Subdomain Enumeration

Certificate Transparency: electric.coop

Query the crt.sh JSON API for NRECA's primary domain:

https://crt.sh/?q=%25.electric.coop&output=json

Save the JSON response and attach it to your AI client using the CT JSON Analysis Prompt above.

Example AI Response (CT JSON Analysis for electric.coop)

Certificate Transparency Analysis: electric.coop

Unique Domains/Subdomains (deduplicated):

Public-Facing / Content:

electric.coop -- primary public website
www.electric.coop -- www redirect
remagazine.cooperative.com -- RE Magazine content platform
careers.electric.coop -- job postings and recruitment
nrecainternational.coop -- international programs site

Member/Employee Portals (Login Required):

cooperative.com -- member portal (SharePoint-based)
benefits.cooperative.com -- employee benefits enrollment
resap.cooperative.com -- Rural Electric Safety Achievement Program
ecba.cooperative.com -- Electric Cooperative Bar Association

Email / Communication:

nreca.coop -- email domain (confirmed format: firstname.lastname@nreca.coop)

Naming Patterns Identified:

Program-specific subdomains under cooperative.com (benefits, resap, ecba) -- suggests each internal program gets its own subdomain
Public content uses electric.coop; internal/member services use cooperative.com -- clear split between public and authenticated surfaces
Email on a separate domain (nreca.coop) from web properties

Wildcard Certificates:

*.cooperative.com -- covers all subdomains under cooperative.com. This means additional subdomains may exist that are not individually logged in CT. Consider DNS enumeration tools to find subdomains covered by this wildcard.
*.electric.coop -- same pattern; additional subdomains may be hidden behind the wildcard

Certificate Authorities Used:

DigiCert -- used for primary domains, suggests enterprise certificate management
Let's Encrypt -- used for some subdomains, may indicate cloud-hosted or developer-managed services
Sectigo -- used for specific portals

Notable Observations:

The wildcard certificates for *.cooperative.com and *.electric.coop mean the CT logs undercount the actual subdomain footprint. DNS enumeration (DNSDumpster, SecurityTrails) should be used to find subdomains covered by these wildcards.
The split between DigiCert (enterprise) and Let's Encrypt (free/automated) may indicate some services are managed outside the primary IT team -- worth investigating for shadow IT.
Benefits portal (benefits.cooperative.com) is a high-value target -- benefits systems typically contain SSNs, banking details, and salary data.

Subdomain Enumeration: cooperative.com vs electric.coop

NRECA operates across at least three root domains: electric.coop (public website), cooperative.com (member services), and nreca.coop (email). Searching each with Subdomain Finder reveals a critical lesson:

Search cooperative.com: returns 100+ subdomains with rich infrastructure detail
Search electric.coop: returns only ~2 subdomains (www, careers)

The public-facing website domain (electric.coop) reveals almost nothing, while the member-services domain (cooperative.com) exposes a large share of the organization's internal infrastructure. An adversary who only searches the primary website domain misses most of the attack surface -- and so will a defender who does the same.

Example AI Response (Subdomain Analysis for NRECA)

Subdomain Enumeration Analysis: NRECA

Domain Contrast: electric.coop returned only 2 subdomains (www.electric.coop, careers.electric.coop). cooperative.com returned 100+ subdomains exposing authentication infrastructure, benefits systems, financial portals, APIs, and staging environments. The member-facing domain is far richer for reconnaissance than the public website.

Authentication / Identity:

okta.cooperative.com -- Okta SSO portal. Reveals the organization uses Okta as its identity provider. High-value target: compromising SSO provides access to all federated applications.
pingfed.cooperative.com -- PingFederate identity federation server. Indicates a dual-IdP architecture (Okta + Ping) or migration between platforms.
pingaccess.cooperative.com -- PingAccess API security gateway. Controls access to web applications and APIs behind the Ping identity stack.

Benefits / HR (High-Value PII Targets):

benefits.cooperative.com -- primary benefits portal
beneficiaries.cooperative.com -- beneficiary designation (life insurance, retirement accounts)
cobra.cooperative.com -- COBRA benefits continuation
divorce.cooperative.com, marriage.cooperative.com, newchild.cooperative.com -- life event processing portals (contain highly sensitive personal data)
w2salary.cooperative.com -- W-2 and salary data (tax documents, compensation)
retirement.cooperative.com -- retirement/pension management

This cluster represents a massive PII exposure surface. Benefits systems typically contain SSNs, banking details, salary data, and family information. Each subdomain is a potential credential stuffing target.

Financial / Compliance:

form990.cooperative.com -- IRS Form 990 (nonprofit tax filings, reveals financial data)
financialpower.cooperative.com -- financial services portal
invoicepreferences.cooperative.com -- invoice management (potential for business email compromise targeting)

APIs / Backend:

Multiple *api.* subdomains revealing backend service architecture
API endpoints suggest microservices architecture with dedicated backends for member services, content delivery, and internal tooling

Staging / Test (Often Less Secured):

stage.cooperative.com, stagesearch.cooperative.com -- staging environment
test.community.cooperative.com -- test instance of community platform
pae-ext-test.cooperative.com, pfe-ext-test.cooperative.com -- external test instances of PingAccess and PingFederate (identity infrastructure test environments are especially sensitive)

Staging and test environments frequently have weaker authentication, default credentials, or disabled security controls. Identity infrastructure test instances (pae-ext-test, pfe-ext-test) are high-priority findings.

Infrastructure Architecture:

Blue/green deployment: Extensive blue.* subdomain pattern (blue.benefits, blue.retirement, etc.) reveals blue/green deployment architecture. This tells an adversary how the organization manages releases and where redundant infrastructure exists.
IP clustering: The majority of cooperative.com subdomains resolve to 74.127.88.162, suggesting centralized hosting or a load balancer fronting most services. A small number of subdomains resolve to different IPs, indicating cloud-hosted or externally managed services.

Content / Public:

electric.coop -- public website, WordPress
careers.electric.coop -- job postings
remagazine.cooperative.com -- RE Magazine
nrecainternational.coop -- international programs

Key Takeaway: Searching only electric.coop would have found 2 subdomains. Searching cooperative.com revealed authentication infrastructure, benefits/PII systems, financial portals, staging environments, and deployment architecture. Always enumerate all known root domains.

Validation Checkpoint

Compare the AI-generated inventory against what is identifiable from NRECA's public website and contact pages:

Domain	Purpose	Notes
electric.coop	Public website (WordPress)	News, advocacy, public-facing
cooperative.com	Member portal (SharePoint)	Login required -- attack surface
nreca.coop	Email domain	Confirmed format: firstname.lastname@nreca.coop
benefits.cooperative.com	Employee benefits portal	Login required -- sensitive
careers.electric.coop	Job postings	Reveals tech stack, org structure
resap.cooperative.com	Safety program portal	Member login
ecba.cooperative.com	Bar Association portal	Member login
remagazine.cooperative.com	RE Magazine	Content platform
nrecainternational.coop	International programs	Separate org site

Manual website review found 9 domains. Subdomain enumeration of cooperative.com alone found 100+, including authentication infrastructure (Okta, PingFederate), benefits/PII systems, financial portals, and staging environments that do not appear anywhere on the public website. In practice, enumeration routinely surfaces infrastructure that manual review misses.

Step 3: Remote Access Identification

Use internet device search engines to identify what services are exposed on your target's known domains or IP ranges. Look for:

Web servers -- identified by HTTP response headers (Server, X-Powered-By)
Mail servers -- SMTP, IMAP, Exchange endpoints
VPN login portals -- identified by product-specific login pages (FortiGate, Cisco ASA, Palo Alto GlobalProtect)
Load balancers and CDN endpoints -- may indicate cloud-hosted infrastructure
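
When you copy a Server or X-Powered-By value out of a Shodan or Censys result, splitting it into product and version makes it easier to record consistently. The sketch below only parses text you have already collected; it does not contact any host. The helper name `parse_server_header` and the sample header values are assumptions for illustration, and real headers are often customized or stripped, so expect gaps.

```python
import re
from typing import Optional


def parse_server_header(value: str) -> list[tuple[str, Optional[str]]]:
    """Split a Server or X-Powered-By value into (product, version) pairs."""
    # Drop parenthesized comments such as "(Ubuntu)" before tokenizing.
    cleaned = re.sub(r"\([^)]*\)", " ", value)
    pairs = []
    for token in cleaned.split():
        product, _, version = token.partition("/")
        pairs.append((product, version or None))
    return pairs


# Made-up example values, not taken from any real host.
for header in ["Apache/2.4.58 (Ubuntu)", "Microsoft-IIS/10.0", "nginx"]:
    print(header, "->", parse_server_header(header))
```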

Manual verification required. This step must be performed using browser-based tools. AI clients cannot access Shodan, Censys, or live internet infrastructure -- they will generate plausible but fabricated results if asked to identify exposed services. Use AI for analyzing data you have already collected (Steps 2 and 4), not for discovering live services.

Tool	What to Do	Account Required
Shodan	Search by domain, hostname, or organization name -- review banners for product identification	Free tier
Censys	Search by domain or IP -- review certificates and service details	Free tier

Focus on identifying:

VPN login portals -- FortiGate, Cisco ASA/AnyConnect, Palo Alto GlobalProtect, SonicWall
Remote desktop services -- RDP, Citrix, VMware Horizon
Vendor remote support tools -- TeamViewer, AnyDesk, ConnectWise ScreenConnect
Web application login pages -- OWA, SharePoint, custom applications

Organizations with Shodan or Censys API access (paid tiers) can automate and scale these queries programmatically -- that is a natural next step beyond this workshop.

3a. Documenting discovered assets

The value of this step is not just finding exposed services -- it is recording them in a way that supports ongoing tracking and remediation. For each discovered asset, capture information across three categories:

Asset identification:

Hostname / FQDN
IP address
Port and protocol
Product vendor and name
Version or firmware (where identifiable from banners, headers, or login pages)
Certificate details: issuer, expiration date, Subject Alternative Name (SAN) entries

Ownership and responsibility:

Asset owner or responsible team
Business unit or facility
IT vs OT designation
Contact for patching or change management

Operational context:

Business function or purpose
Criticality to operations (a perimeter firewall protecting a substation vs a marketing blog)
Network zone: internet-facing, DMZ, dual-homed
Managed by internal staff or a third-party vendor/integrator

You will not be able to fill in every field from external discovery alone -- ownership and some operational context require internal knowledge. Record what you can from Shodan/Censys results and flag the gaps. The complete baseline template is provided in the Templates section.
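
One way to make the three categories and the "flag the gaps" step concrete is a small record type whose unknown fields can be listed automatically. This is a sketch under assumptions: the field names are shorthand for the items listed above, not the column names of the workshop's baseline template, and the sample values are placeholders.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class AssetRecord:
    # Asset identification
    hostname: str
    ip_address: Optional[str] = None
    port_protocol: Optional[str] = None
    product: Optional[str] = None
    version: Optional[str] = None
    certificate: Optional[str] = None
    # Ownership and responsibility
    owner: Optional[str] = None
    business_unit: Optional[str] = None
    it_or_ot: Optional[str] = None
    patch_contact: Optional[str] = None
    # Operational context
    function: Optional[str] = None
    criticality: Optional[str] = None
    network_zone: Optional[str] = None
    managed_by: Optional[str] = None

    def gaps(self) -> list[str]:
        """Names of fields still unknown after external discovery."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


# Placeholder values; only what external discovery could plausibly supply.
asset = AssetRecord(
    hostname="vpn.example-coop.com",
    ip_address="203.0.113.50",
    port_protocol="443/TCP",
    product="Fortinet FortiGate",
    network_zone="internet-facing",
)
print("Needs internal follow-up:", asset.gaps())
```

The gap list is the part that requires internal knowledge, so it doubles as a to-do list for the asset owner conversation.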

3b. Connecting to asset management

Many organizations maintain IT asset inventories in platforms like ServiceNow, but OT assets are frequently absent from these systems. External discovery often reveals OT-adjacent assets -- VPN concentrators serving SCADA networks, historian web interfaces, vendor remote access portals -- that should be tracked alongside IT assets.

If your organization has an asset management system, cross-reference your external findings against it. Gaps between what you discovered externally and what appears in your inventory are themselves a finding -- they represent assets that exist on the internet but are not being tracked, patched, or managed through your standard processes.
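
The cross-reference itself is a set difference once both lists are normalized. The sketch below assumes you can export hostnames from your asset management system into a plain list; the export mechanism, the `untracked_assets` helper, and the sample hostnames are all hypothetical.

```python
from typing import Iterable


def _normalize(hostname: str) -> str:
    """Lowercase and drop whitespace and any trailing DNS root dot."""
    return hostname.strip().lower().rstrip(".")


def untracked_assets(discovered: Iterable[str], inventory: Iterable[str]) -> list[str]:
    """Hostnames found externally that are absent from the internal inventory."""
    known = {_normalize(h) for h in inventory}
    found = {_normalize(h) for h in discovered}
    return sorted(found - known)


# Placeholder data: external findings vs. a hypothetical inventory export.
external = ["VPN.example.com", "historian.example.com.", "www.example.com"]
inventory = ["vpn.example.com", "www.example.com"]
print(untracked_assets(external, inventory))
```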

Worked Example: NRECA Remote Access Documentation

Here is how a single discovered service would be documented using the three-category framework above. The hostname, IP address, and other values are illustrative placeholders:

Example: FortiGate VPN Portal

Hostname: vpn.example-coop.com
IP: 203.0.113.50
Port: 443/TCP
Product: Fortinet FortiGate, FortiOS 7.4.6
Cert: DigiCert, expires 2026-09-15, SAN includes vpn.example-coop.com
Owner: IT Security (confirm)
Facility: Main office
Designation: IT perimeter
Function: SSL-VPN remote access for staff and vendors
Criticality: High -- primary remote access path to internal network
Zone: Internet-facing
Management: Internal IT staff

This FortiGate finding connects directly to Module 4, where you will correlate the product and version against vulnerability databases.

Step 4: Consolidation and Google Dorking

At this point you have three sets of findings: certificate transparency data, subdomain enumeration with IP mapping, and manually verified exposed services with product identification. This step uses Google dorking to find resources that the other techniques miss, then consolidates everything into a single baseline.

4a. Google dorking

Search engine dorking uses Google's advanced search operators to find pages and files that the organization may not intend to be publicly accessible. Use your AI client to generate targeted dork queries based on your discoveries so far:

Google Dork Query Generation Prompt

I am performing external attack surface discovery on [organization
name], a [sector] organization. I have identified these root
domains: [domain1.com, domain2.com, domain3.com].

Based on these domains, generate Google dork queries to find:
1. Login pages and authentication portals
2. Exposed documents (PDF, DOCX, XLSX) that may contain
   sensitive information
3. Directory listings that reveal file structures
4. Error pages or debug output that reveals technology stack
5. Configuration files, backup files, or admin panels
6. Pages referencing VPN, remote access, or SCADA/OT systems

Format each query so I can paste it directly into Google.

Run the generated queries in Google and review the first 2-3 pages of results. Add any new findings to your inventory. Google dorking frequently surfaces resources that subdomain enumeration and Shodan/Censys miss -- particularly indexed documents, forgotten login pages, and configuration files.
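
If you want a repeatable baseline set of queries alongside the AI-generated ones, a few standard operators (`site:`, `inurl:`, `intitle:`, `filetype:`) cover most of the categories in the prompt. The following is a sketch only: the `dork_queries` helper and the specific keyword choices are assumptions, and Google's handling of operators changes over time, so adjust anything that returns no results.

```python
def dork_queries(domain: str) -> dict[str, str]:
    """Return a small baseline set of Google dork queries for one root domain."""
    site = f"site:{domain}"
    return {
        "login pages": f'{site} (inurl:login OR inurl:signin OR intitle:"sign in")',
        "documents": f"{site} (filetype:pdf OR filetype:docx OR filetype:xlsx)",
        "directory listings": f'{site} intitle:"index of"',
        "config or backup files": f"{site} (filetype:bak OR filetype:conf OR filetype:sql)",
        "remote access references": f'{site} ("vpn" OR "remote access" OR "scada")',
        "non-www hosts": f"{site} -site:www.{domain}",
    }


# Placeholder domain; run only against targets you are authorized to assess.
for purpose, query in dork_queries("example.com").items():
    print(f"{purpose}: {query}")
```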

4b. Consolidated analysis

Feed your combined findings from all steps into your AI client to produce the final attack surface inventory:

Attack Surface Consolidation Prompt

I have completed external attack surface discovery for
[organization name]. Here is a summary of all findings:

CERTIFICATE TRANSPARENCY (crt.sh):
[paste or summarize CT findings]

SUBDOMAIN ENUMERATION:
[paste or summarize subdomain/IP findings]

EXPOSED SERVICES (Shodan/Censys):
[paste or summarize service discovery findings]

GOOGLE DORKING:
[paste or summarize any additional findings]

Consolidate all findings into a single attack surface baseline:
1. A categorized inventory of all discovered assets organized by:
   remote access, authentication/identity, email, web applications,
   benefits/HR/financial, APIs, staging/test, content/public
2. For each category, note the highest-risk assets and why
3. Gaps in my discovery -- what should I investigate further
4. A prioritized list of assets to carry forward to vulnerability
   correlation (Module 4) and monitoring (Module 5)
5. A summary table suitable for use as a baseline document

Review and correct the AI-generated baseline. This consolidated inventory becomes your Artifact 2.
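
When reviewing the AI-generated baseline, it helps to know which techniques reported each hostname, since an asset seen by only one source is the easiest to lose during consolidation. This is a minimal sketch; the `consolidate` helper, the source labels, and the sample hostnames are illustrative assumptions.

```python
from collections import defaultdict


def consolidate(findings: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each hostname to the set of discovery sources that reported it."""
    inventory = defaultdict(set)
    for source, hostnames in findings.items():
        for host in hostnames:
            inventory[host.strip().lower()].add(source)
    return dict(inventory)


# Placeholder findings keyed by the technique that produced them.
findings = {
    "ct": ["portal.example.com", "VPN.example.com"],
    "enumeration": ["vpn.example.com", "stage.example.com"],
    "dorking": ["portal.example.com"],
}
inventory = consolidate(findings)
for host in sorted(inventory):
    print(host, sorted(inventory[host]))
```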

Worked Example: NRECA Consolidation and Google Dorking

Using the Google Dork Query Generation Prompt above with NRECA's three root domains (electric.coop, cooperative.com, nreca.coop):

Example AI Response (Google Dork Queries for NRECA)

Output

Artifact 2: External attack surface inventory. A categorized list of your target organization's domains, subdomains, remote access services, and identified technologies. This inventory feeds directly into Module 3 (identifying which personnel have access to these services), Module 4 (correlating identified products against vulnerability databases), and Module 5 (establishing a monitoring baseline to detect changes).

Record your findings in the Baseline Document Template (download Excel), Section 1: External Attack Surface.
