Module 5: Monitoring & Alerting Setup
Overview
Modules 2 through 4 built a point-in-time picture of your organization's external exposure -- attack surface, personnel risk, and vulnerability correlation. That picture starts going stale the moment you close your browser. This module establishes ongoing monitoring so you are alerted when things change rather than discovering problems during the next quarterly review.
The goal is not comprehensive IT monitoring -- it is targeted monitoring of the specific external signals that indicate risk to OT operations and remote access infrastructure. Specifically, this program watches for changes to: internet-facing remote access systems and VPN portals; edge devices with OT network connectivity such as firewalls, VPN concentrators, and remote access gateways; credentials belonging to Tier 1 and Tier 2 personnel with OT or IT/OT interface access (Module 3); and new vulnerabilities in the products identified in Modules 2 and 4 that sit on the boundary between IT and OT networks. Everything configured in this module -- every alert, every scheduled check, every baseline entry -- should connect back to one of these categories. If a monitoring activity does not inform your understanding of risk to operational systems, it belongs in a different program.
Push + Pull: Two Monitoring Models
Sustainable monitoring combines two approaches:
- Push-based (alerts come to you) -- Services that notify you when something new appears: a new breach containing your domain, a new CVE for a product you use, a mention of your organization in a security context. These require initial setup but then run continuously with no recurring effort.
- Pull-based (you go check periodically) -- Scheduled re-runs of the same queries from Modules 2-4: subdomain enumeration, Shodan/Censys checks, breach database lookups, vulnerability correlation. These catch changes that push-based alerts miss and provide a structured baseline comparison.
Neither approach is sufficient alone. Push alerts have gaps in coverage, and pull checks only find changes at the frequency you run them. Together they provide layered coverage.
Where Free Tools Hit Their Ceiling
The free-tool approach in this workshop replicates roughly 60-70% of what commercial Attack Surface Management (ASM) platforms provide. Products in this category -- Censys ASM, Shodan Enterprise Monitor, GreyNoise, Recorded Future, Digital Shadows/ReliaQuest, and others -- automate continuous asset discovery, automatic CVE correlation against discovered assets, breach feed aggregation, and personnel exposure monitoring. What you are doing manually across Modules 2-5, these platforms do continuously and at scale.
The tradeoff is cost versus analyst time. For small teams monitoring a handful of domains, the manual approach works well. When alert triage and pull-based checks are consuming more than 2-3 hours per week, a paid platform likely pays for itself in analyst time alone -- and provides coverage that manual checks cannot match. Treat this workshop as the foundation: you will know exactly what to evaluate and what questions to ask when that threshold arrives.
Pull-Based Checks
Pull-based monitoring means re-running the same discovery and correlation queries on a schedule. The value is in comparison to baseline -- not the raw results, but what has changed since your last check:
- Subdomain changes (Module 2) -- New subdomains appearing in certificate transparency logs or DNS enumeration. A new subdomain may indicate a new service, a development/staging environment, or shadow IT
- Exposure changes (Module 2) -- New ports or services appearing on Shodan/Censys for your IP ranges. Services that were internal-only may have been accidentally exposed
- New breach appearances (Module 3) -- Personnel email addresses appearing in newly disclosed breaches. Breaches are disclosed continuously; a weekly HIBP check catches new exposure
- New CVEs (Module 4) -- New vulnerabilities disclosed for products in your asset inventory. Check NVD, CISA KEV, and vendor PSIRTs for your specific products and versions
- Personnel changes (Module 3) -- Leadership changes, new hires in key roles, departures. Role changes affect your tier assignments and monitoring priorities
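Every one of these checks reduces to the same operation: diff the current observation against the stored baseline. A minimal sketch of that diff logic (the function name and example values are illustrative, not from the workshop templates):

```python
def diff_against_baseline(baseline, current):
    """Compare a current observation set (subdomains, open ports,
    breached emails, CVE IDs) against the stored baseline."""
    baseline, current = set(baseline), set(current)
    return {
        "new": current - baseline,        # appeared since last check: investigate
        "removed": baseline - current,    # disappeared: confirm decommissioning
        "unchanged": baseline & current,  # known state: not a finding
    }

# Example: weekly subdomain check against the Module 2 baseline
baseline = {"www.example.coop", "vpn.example.coop"}
current = {"www.example.coop", "vpn.example.coop", "vpn2.example.coop"}
changes = diff_against_baseline(baseline, current)
# changes["new"] is {"vpn2.example.coop"} -- a possible new remote access service
```

The same function works for any of the five check types; only the input sets change.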
Push-Based Alert Sources
Push-based services run continuously once configured. These complement your pull-based schedule by alerting you to changes between scheduled checks:
| Source | What It Monitors |
|---|---|
| Google Alerts | Web mentions of your organization combined with security keywords (breach, hack, vulnerability, SCADA) |
| Shodan Alerts | Changes to internet-exposed services on your IP ranges (free tier: up to 16 IPs) |
| crt.sh / Cert Spotter | New SSL/TLS certificates issued for your domains, indicating new subdomains or shadow IT |
| CISA ICS Advisories + KEV | New ICS advisories, KEV additions, and vendor-specific security alerts (email and RSS) |
| Vendor PSIRTs | New vulnerabilities for specific products in your asset inventory (see Module 4 vendor PSIRT table) |
| HaveIBeenPwned | New data breaches containing email addresses from your domain (requires domain verification) |
Configure any of these services that are not already active for your organization. The lab section below focuses on pull-based checks because those require structured practice to establish the baseline and schedule that make ongoing monitoring repeatable.
Alert Aggregation and Signal-to-Noise
Without aggregation, monitoring alerts scatter across email, Slack, RSS feeds, and browser bookmarks -- making it easy to miss critical notifications. Establish a single collection point:
- Dedicated email folder -- Filter all monitoring alerts (Google Alerts, CISA, vendor PSIRTs, HIBP) into a single folder. Simple, works with any email system
- Slack or Teams channel -- A dedicated channel for monitoring alerts, with email-to-channel integration for services that only support email delivery
- RSS reader -- For CISA advisories and vendor feeds that publish RSS. Aggregates multiple feeds into a single reading interface
Start narrow, expand as you learn. Begin with a focused set of alerts covering your highest-priority assets and personnel. Overly broad alerting generates noise that trains you to ignore notifications -- the opposite of what you want. After a few weeks of triage, you will learn which alerts generate actionable findings and which are noise. Expand coverage gradually.
When an alert arrives, run it through three questions:
- Does this involve an asset or person in my baseline? If not, it may be informational but is not directly actionable against your monitored environment.
- Does this indicate a change from baseline? New asset, changed configuration, removed service, new breach appearance -- changes are what you are looking for. An alert confirming known state is not a finding.
- Is this actionable within my team's current capacity? A real finding that no one can act on is a backlog item, not an emergency. Be honest about capacity.
Based on those answers, assign each alert a disposition:
- Act now -- P0/P1 finding: escalate immediately per your escalation path
- Weekly review queue -- Notable change that is not urgent. Batch these for your weekly pull check
- Parking lot -- Interesting but low priority. Review monthly during your baseline update
- Tune or disable -- After 4+ weeks of consistently non-actionable results from a specific alert, adjust the query or remove it. An alert that never produces useful findings is consuming attention without providing value
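One plausible way to encode the three questions and the dispositions above as a decision function -- a sketch only, with the four-week threshold taken from the list above and all names illustrative:

```python
def triage_alert(in_baseline, changed, actionable_now, weeks_non_actionable=0):
    """Map the three triage questions to a disposition. This is one
    plausible encoding of the rules above, not a prescribed standard."""
    if weeks_non_actionable >= 4:
        return "tune-or-disable"   # consistently non-actionable alert
    if not in_baseline:
        return "parking-lot"       # informational; outside monitored scope
    if not changed:
        return "no-finding"        # confirms known state; not a finding
    if actionable_now:
        return "act-now"           # P0/P1: escalate per escalation path
    return "weekly-review"         # notable change; batch for weekly check
```

Even if you never automate triage, walking an ambiguous alert through this function is a quick consistency check on your own reasoning.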
One additional discipline: any item classified as "Needs Investigation" in your baseline (Step 2) should not sit indefinitely. If it remains unresolved for more than one full monitoring cycle -- weekly or monthly, depending on when it was added -- force a human decision: accept the risk, investigate further, or remediate. Unresolved items that linger become invisible, which is worse than a deliberate risk acceptance.
Sustainability and Operational Fatigue
Alert fatigue -- too many notifications -- is a recognized problem. Less discussed is operational fatigue: the cumulative burden of running a monitoring program week after week, month after month. Manual programs fail not because the first month was bad, but because by month six, the person responsible has changed roles, competing priorities have crowded out the weekly checks, execution becomes inconsistent, and nothing is documented well enough for a handoff.
Design for this from day one. Identify your minimum viable monitoring -- the 2-3 activities that, if nothing else runs, keep the program alive. This might be: process your push alerts daily, run the weekly KEV check, and update the baseline monthly. Everything beyond that is valuable but optional. Document procedures clearly enough that someone else can execute them without training. Share ownership so the program does not depend on a single person. Use realistic time estimates -- if your weekly checks consistently take 45 minutes instead of 30, update the schedule rather than skipping steps. And accept that some coverage gaps are acceptable. A monitoring program that runs consistently at 70% coverage is far more valuable than one designed for 100% that quietly stops after three months.
Lab: Configure Your Monitoring Infrastructure
In this lab, you will build an OT-focused pull-based monitoring schedule tailored to your findings from Modules 2-4, then consolidate those findings into a classified baseline document. The outputs become Artifacts 6 and 7.
Step 1: Design Your Pull-Based Monitoring Schedule
Your pull-based schedule should target the remote access infrastructure, edge devices, and OT-adjacent personnel identified in Modules 2-4. This is not a general IT monitoring checklist -- generic changes like marketing website updates or unrelated subdomain additions are out of scope. The focus is on the systems and credentials that, if compromised, create a path to operational technology: VPN portals, firewall management interfaces, Tier 1 and Tier 2 personnel credentials, and the vulnerability status of edge devices sitting on the IT/OT boundary.
Use your AI client to generate a customized monitoring checklist based on your specific findings:
Create a pull-based OSINT monitoring schedule for [organization
name], a [sector] organization focused on monitoring remote
access and OT-adjacent infrastructure. Include weekly checks
(under 30 minutes) and monthly checks (up to 2 hours).
From our prior modules:
- Remote access and OT-adjacent services discovered: [list
from Module 2 -- VPN portals, admin interfaces, remote
desktop, firewall management pages]
- Edge device vendors with internet-exposed assets: [list
from Module 4, e.g., Fortinet, Cisco]
- Tier 1 and Tier 2 personnel (OT/IT access): [count and
roles from Module 3]
- Email domain for breach monitoring: [domain from Module 3]
- Primary domains and subdomains to watch: [from Module 2]
For each check, specify:
1. The tool and exact query to run
2. What to compare against (the baseline from today)
3. What constitutes an OT-relevant finding vs. expected noise
4. Estimated time
Prioritize checks covering remote access infrastructure and
edge device vulnerability changes above general subdomain
or personnel checks.
Worked Example: NRECA Pull-Based Monitoring Schedule
Example AI Response (NRECA Monitoring Schedule)
NRECA Pull-Based OSINT Monitoring Schedule
Weekly Checks (estimated 25 minutes):
Checks are ordered by OT relevance: remote access exposure and edge device vulnerabilities first, credential risk second, infrastructure changes third.
- Shodan/Censys Quick Check (5 min)
- Tool: Shodan
- Query: Check known VPN portal and admin interface IPs from Module 2 baseline for port, service, or version changes
- Compare to: Module 2 baseline service inventory (remote access entries)
- OT-relevant finding: Any change to a VPN login page, new port on a firewall management IP, or version change on an edge device. A version downgrade or unexpected service on a known remote access IP warrants immediate investigation
- Expected noise: Minor HTTP header changes, certificate renewals on the same service
- CISA KEV Review (5 min)
- Tool: CISA KEV catalog
- Query: Review additions from the past 7 days. Check each new entry against your Module 4 asset inventory
- Compare to: Module 4 vulnerability correlation table
- OT-relevant finding: Any new KEV entry matching a product/vendor in your asset inventory -- this is a P0 finding if the asset is internet-exposed
- Expected noise: KEV additions for products not in your environment
- Certificate Transparency Check (5 min)
- Tool: crt.sh
- Query: %.electric.coop and %.cooperative.com
- Compare to: Module 2 baseline subdomain list
- OT-relevant finding: Any new subdomain not in your baseline -- especially subdomains suggesting new remote access services (vpn2., ras., remote.), new OT-adjacent applications, or shadow IT
- Expected noise: Certificate renewals for existing subdomains (same name, new cert serial)
- Breach Database Check (5 min)
- Tool: HaveIBeenPwned
- Query: Domain search for nreca.coop (if verified) or spot-check Tier 1 personnel emails
- Compare to: Module 3 breach findings baseline
- OT-relevant finding: Any Tier 1 or Tier 2 personnel appearing in a new breach, especially if passwords or hashes were exposed. These are the credentials that could unlock remote access to OT-adjacent systems
- Expected noise: Previously known breaches appearing in HIBP updates (no new data, just re-processed)
- Alert Backlog Processing (5 min)
- Tool: Your alert aggregation point (email folder, Slack channel)
- Action: Review and triage any push alerts that were flagged during daily checks but deferred for weekly review
- OT-relevant finding: Patterns across multiple alerts (e.g., several mentions of your organization in security forums, multiple new subdomains in one week)
Monthly Checks (estimated 90 minutes):
- Full Subdomain Re-Enumeration (20 min)
- Tools: crt.sh, DNSDumpster, SecurityTrails
- Query: Complete subdomain enumeration for all domains from Module 2
- Compare to: Module 2 baseline domain map
- Action: Update baseline. Investigate any new subdomains. Remove any that have been decommissioned
- Shodan/Censys Exposure Review (20 min)
- Tools: Shodan, Censys
- Query: Re-run Module 2 queries for organization IP ranges and domain names
- Compare to: Module 2 baseline service inventory
- Action: Identify new services, changed ports, or version changes. Flag any new administrative interfaces exposed to the internet. Pay particular attention to any changes in VPN login pages, admin interfaces, or management ports -- these directly affect OT network boundary security
- Vulnerability Correlation Update (20 min)
- Tools: NVD, CISA KEV, vendor PSIRTs, ICS Advisory Project
- Query: Re-run Module 4 CPE queries for all products in your asset inventory
- Compare to: Module 4 vulnerability correlation table
- Action: Update correlation table with new CVEs. Re-assess priority ratings. Check if previously patched items have follow-on CVEs (like the FortiGate CVE-2026-24858 follow-on)
- Personnel Inventory Review (15 min)
- Tools: Organization website, professional profiles
- Query: Review leadership and key personnel pages for changes
- Compare to: Module 3 personnel inventory
- Action: Update inventory for new hires, departures, and role changes. New Tier 1 personnel should be added to breach monitoring immediately
- Baseline Document Update (15 min)
- Action: Consolidate all weekly and monthly findings into the baseline document (Artifact 7). Reclassify items as needed. Archive resolved items
Customize the AI-generated schedule for your environment. Your actual weekly time may vary -- the key is having a consistent, repeatable process rather than a specific time target. Organizations with larger attack surfaces or more exposed services may need to allocate more time for monthly checks.
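Of the weekly checks above, the CISA KEV review is the easiest to script. A sketch using the KEV JSON feed -- the feed URL and the vulnerabilities/cveID/vendorProject/dateAdded fields follow CISA's published schema, and the vendor set is a placeholder for your Module 4 inventory:

```python
import json
import urllib.request
from datetime import date, timedelta

KEV_URL = ("https://www.cisa.gov/sites/default/files/feeds/"
           "known_exploited_vulnerabilities.json")

def recent_kev_matches(kev, vendors, days=7, today=None):
    """Return KEV entries added within `days` days whose vendorProject
    appears in the Module 4 asset inventory vendor list."""
    today = today or date.today()
    cutoff = today - timedelta(days=days)
    return [
        v for v in kev["vulnerabilities"]
        if v["vendorProject"] in vendors
        and date.fromisoformat(v["dateAdded"]) >= cutoff
    ]

# Live usage (requires network access):
# with urllib.request.urlopen(KEV_URL) as resp:
#     kev = json.load(resp)
# hits = recent_kev_matches(kev, {"Fortinet", "Cisco"})
# Any hit on an internet-exposed asset is a P0 finding per the schedule above.
```

Filtering by vendor rather than exact product errs toward false positives, which matches the schedule's intent: a KEV addition for a vendor in your inventory is worth a minute of human review.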
Step 2: Create Your Baseline Snapshot
Your pull-based monitoring is only useful if you have a baseline to compare against. Consolidate your outputs from Modules 2, 3, and 4 into a single baseline document that serves as the reference point for all future monitoring.
For each item in the baseline, assign a classification:
| Classification | Meaning | Action |
|---|---|---|
| Known-Good | Expected and properly configured. No security concern identified | Document and monitor for changes |
| Accepted Risk | Known exposure that cannot be remediated immediately (e.g., business-required internet-facing service) | Document the business justification. Monitor more frequently. Review acceptance quarterly |
| Needs Remediation | Security issue with a clear fix (patch available, misconfiguration, unnecessary exposure) | Add to remediation queue with priority from Module 4 framework. Track to resolution |
| Needs Investigation | Unknown or unexpected item requiring further analysis before classification | Investigate promptly. Items should not remain in this category for more than one monitoring cycle |
OT-Priority Sorting
When classifying baseline entries, work through them in order of operational impact. Remote access portals, VPN endpoints, and administrative interfaces should be classified first -- these are the assets most likely to provide a direct path into OT networks. Tier 1 and Tier 2 personnel (those with OT system access or credentials) come next, followed by vulnerability correlation entries for edge devices and OT-adjacent systems. All remaining findings -- corporate subdomains, general personnel, and lower-priority vulnerabilities -- are classified last. This ordering ensures that the entries with the most direct OT impact receive attention even when time runs short.
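That ordering can be made mechanical with a sort key. A sketch assuming each baseline entry carries a category label -- the labels below are illustrative stand-ins, not fields from the workshop templates:

```python
# Lower value = classify first. Labels are illustrative stand-ins for
# the asset groups named above.
OT_PRIORITY = {
    "remote-access": 0,       # VPN portals, admin/management interfaces
    "tier1-2-personnel": 1,   # personnel with OT system or credential access
    "edge-device-vuln": 2,    # vulnerability entries for edge devices
}

def classification_order(entries):
    """Sort baseline entries so the highest OT-impact items come first;
    categories without an assigned priority sort last."""
    return sorted(entries, key=lambda e: OT_PRIORITY.get(e["category"], 99))

entries = [
    {"id": "corp-www", "category": "corporate-subdomain"},
    {"id": "vpn-portal", "category": "remote-access"},
    {"id": "scada-eng-lead", "category": "tier1-2-personnel"},
]
ordered = classification_order(entries)
# ordered[0]["id"] is "vpn-portal" -- remote access assets come first
```

Because the sort is stable, everything outside the named categories keeps its original order at the end of the queue.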
Baseline Structure
Your baseline document should consolidate the key outputs from each preceding module:
- From Module 2 (Attack Surface): All discovered domains, subdomains, and exposed services with their current status. Remote access portals, VPN login pages, and administrative interfaces should be listed first within this section. Include the asset documentation fields (hostname, IP, port, product, criticality, zone)
- From Module 3 (Personnel): Tier 1 and Tier 2 personnel with email addresses, breach status, and monitoring priority. Team email addresses and email format pattern
- From Module 4 (Vulnerabilities): Vulnerability correlation table with CVE details, KEV status, priority ratings, and remediation status
- From this module: List of configured push alerts and pull-based schedule, so the next person running the monitoring process knows what is already set up
The baseline is a living document. Every monitoring cycle should produce a comparison against the baseline and then update it. When a new subdomain appears, it starts as "Needs Investigation," gets analyzed, and then moves to one of the other three categories. When a vulnerability is patched, its status in the baseline changes to "Known-Good" with a note about the patch date. The baseline reflects your current understanding, not a historical snapshot. The remote access and edge device entries are the highest-priority section of this document -- if nothing else is reviewed at each monitoring cycle, those entries should be.
Step 3: Sustainability Assessment
You have now designed a monitoring schedule and created a baseline. Before leaving this module, spend five minutes making explicit decisions about how this program survives beyond today. Answer these four questions -- use your AI client to think through the tradeoffs, but the answers are yours to make:
- Single point of failure: Is this program documented well enough that a teammate who was not here today could run it? If not, what is missing -- tool access, login credentials, the schedule itself, or context about why certain alerts were configured?
- Minimum viable monitoring: Of all the push alerts and pull checks in your monitoring program, which 2-3 would you keep if your available time dropped by 75%? These are your non-negotiables -- the checks that keep the program alive even when everything else falls off. (Include both push alerts and pull checks in your consideration.)
- Escalation path: When monitoring surfaces a P0 or P1 finding at 10pm on a Friday, who gets the call? Is that documented anywhere other than your own memory?
- Program review cadence: When will you revisit the alert configurations themselves? Alert configs go stale too -- organizational changes, new domains, personnel turnover, and vendor product changes all require updates to what you are monitoring.
Use your AI client to help identify your minimum viable monitoring set:
Given this monitoring program covering [describe your setup --
domains monitored, alert services configured, pull-based schedule,
number of personnel tracked], I have limited time and need to
identify my minimum viable monitoring set.
If I could only run 3 checks per week, which 3 would give me the
highest detection value for an electric cooperative / [sector]
organization, and why?
For each, specify:
1. The tool and the specific query to run
2. What finding would require immediate action (escalate now)
vs. adding to a weekly review queue
3. How long the check takes and what it would miss if skipped
Document your answers to these four questions alongside your monitoring artifacts. The goal is not to plan for failure -- it is to make scope decisions deliberately rather than discovering them when the program quietly stops running.
Output
Artifact 5: Push-based alert configuration. The alert sources documented in the read section -- Google Alerts, CISA ICS Advisory subscriptions, vendor PSIRT feeds, and breach notification services -- configured to deliver ongoing notifications without recurring effort. These push-based alerts form the daily triage queue that your Module 6 runbook will operationalize.
Artifact 6: Pull-based monitoring schedule. A structured weekly and monthly checklist specifying which tools to use, which queries to run, and what to compare against your baseline. This schedule ensures monitoring happens consistently rather than only when someone remembers. Combined with the push-based alert sources documented in the read section, this provides layered monitoring coverage. Use the Monitoring Checklist Template (download Word).
Artifact 7: Consolidated baseline document. A single document consolidating your Module 2 attack surface inventory, Module 3 personnel exposure inventory, and Module 4 vulnerability correlation table, with each item classified as known-good, accepted risk, needs remediation, or needs investigation. This baseline is the reference point for all ongoing monitoring -- every future check asks "what has changed since the baseline?" Use the Baseline Document Template (download Excel).
Looking ahead: The manual program you built here is a foundation, not the end state. Once you know what you are looking for and what normal looks like, automation becomes feasible. The same AI clients you have been using throughout this workshop can generate scripts that automate your pull-based checks -- Python scripts for crt.sh API queries, CISA KEV JSON feed parsing, HIBP API lookups (for those with API access), and Shodan API queries -- turning your manual schedule into a repeatable, scheduled process.
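As one concrete starting point, a sketch of a scripted crt.sh pull check -- the JSON endpoint and the name_value field reflect crt.sh's documented output format, and the domain shown is a placeholder:

```python
import json
import urllib.parse
import urllib.request

def parse_crtsh(records):
    """Extract unique hostnames from crt.sh JSON records. A single
    certificate's name_value may hold several newline-separated names."""
    names = set()
    for rec in records:
        names.update(n.strip().lower() for n in rec["name_value"].split("\n"))
    return names

def fetch_subdomains(domain):
    """Query the crt.sh certificate transparency search for %.domain
    (requires network access)."""
    url = "https://crt.sh/?q=" + urllib.parse.quote(f"%.{domain}") + "&output=json"
    with urllib.request.urlopen(url) as resp:
        return parse_crtsh(json.load(resp))

# Weekly check: anything in the current set but not in the baseline is a
# candidate new subdomain for the triage queue.
# new = fetch_subdomains("example.coop") - baseline_subdomains
```

Scheduled via cron or a CI job, a script like this turns the five-minute manual certificate check into a diff that only demands attention when something actually changed.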