Module 5: Monitoring & Alerting Setup

Read: 10 min | Lab: 40 min | Total: 50 min

Overview

Modules 2 through 4 built a point-in-time picture of your organization's external exposure -- attack surface, personnel risk, and vulnerability correlation. That picture starts going stale the moment you close your browser. This module establishes ongoing monitoring so you are alerted when things change rather than discovering problems during the next quarterly review.

The goal is not comprehensive IT monitoring -- it is targeted monitoring of the specific external signals that indicate risk to OT operations and remote access infrastructure. Specifically, this program watches for changes to: internet-facing remote access systems and VPN portals; edge devices with OT network connectivity such as firewalls, VPN concentrators, and remote access gateways; credentials belonging to Tier 1 and Tier 2 personnel with OT or IT/OT interface access (Module 3); and new vulnerabilities in the products identified in Modules 2 and 4 that sit on the boundary between IT and OT networks. Everything configured in this module -- every alert, every scheduled check, every baseline entry -- should connect back to one of these categories. If a monitoring activity does not inform your understanding of risk to operational systems, it belongs in a different program.

Push + Pull: Two Monitoring Models

Sustainable monitoring combines two approaches:

  • Push-based (alerts come to you) -- Services that notify you when something new appears: a new breach containing your domain, a new CVE for a product you use, a mention of your organization in a security context. These require initial setup but then run continuously with no recurring effort.
  • Pull-based (you go check periodically) -- Scheduled re-runs of the same queries from Modules 2-4: subdomain enumeration, Shodan/Censys checks, breach database lookups, vulnerability correlation. These catch changes that push-based alerts miss and provide a structured baseline comparison.

Neither approach is sufficient alone. Push alerts have gaps in coverage, and pull checks only find changes at the frequency you run them. Together they provide layered coverage.

Where Free Tools Hit Their Ceiling

The free-tool approach in this workshop replicates roughly 60-70% of what commercial Attack Surface Management (ASM) platforms provide. Products in this category -- Censys ASM, Shodan Enterprise Monitor, GreyNoise, Recorded Future, Digital Shadows/ReliaQuest, and others -- automate continuous asset discovery, automatic CVE correlation against discovered assets, breach feed aggregation, and personnel exposure monitoring. What you are doing manually across Modules 2-5, these platforms do continuously and at scale.

The tradeoff is cost versus analyst time. For small teams monitoring a handful of domains, the manual approach works well. When alert triage and pull-based checks are consuming more than 2-3 hours per week, a paid platform likely pays for itself in analyst time alone -- and provides coverage that manual checks cannot match. Treat this workshop as the foundation: you will know exactly what to evaluate and what questions to ask when that threshold arrives.

Pull-Based Checks

Pull-based monitoring means re-running the same discovery and correlation queries on a schedule. The value is in comparison to baseline -- not the raw results, but what has changed since your last check:

  • Subdomain changes (Module 2) -- New subdomains appearing in certificate transparency logs or DNS enumeration. A new subdomain may indicate a new service, a development/staging environment, or shadow IT
  • Exposure changes (Module 2) -- New ports or services appearing on Shodan/Censys for your IP ranges. Services that were internal-only may have been accidentally exposed
  • New breach appearances (Module 3) -- Personnel email addresses appearing in newly disclosed breaches. Breaches are disclosed continuously; a weekly HIBP check catches new exposure
  • New CVEs (Module 4) -- New vulnerabilities disclosed for products in your asset inventory. Check NVD, CISA KEV, and vendor PSIRTs for your specific products and versions
  • Personnel changes (Module 3) -- Leadership changes, new hires in key roles, departures. Role changes affect your tier assignments and monitoring priorities
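The "comparison to baseline" step behind all of these checks is a simple set difference between what you recorded last time and what a fresh run returns. A minimal Python sketch (the domain names and the helper name are illustrative, not from the workshop materials):

```python
def diff_against_baseline(baseline: set[str], current: set[str]) -> dict[str, set[str]]:
    """Compare a fresh enumeration run against the stored baseline.

    The deltas are the findings; unchanged entries are confirmation
    of known state, not signal.
    """
    return {
        "new": current - baseline,       # e.g., a staging subdomain that just appeared
        "removed": baseline - current,   # a service that vanished (decommissioned, or moved?)
    }

# Example: last week's subdomain baseline vs. this week's enumeration results
baseline = {"vpn.example.coop", "mail.example.coop", "www.example.coop"}
current = {"vpn.example.coop", "mail.example.coop", "www.example.coop", "staging.example.coop"}

delta = diff_against_baseline(baseline, current)
print(delta["new"])      # {'staging.example.coop'} -> enters the baseline as "Needs Investigation"
print(delta["removed"])  # set()
```

The same diff pattern applies to exposed ports, breach appearances, and CVE lists; only the item type changes.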

Push-Based Alert Sources

Push-based services run continuously once configured. These complement your pull-based schedule by alerting you to changes between scheduled checks:

  • Google Alerts -- Web mentions of your organization combined with security keywords (breach, hack, vulnerability, SCADA)
  • Shodan Alerts -- Changes to internet-exposed services on your IP ranges (free tier: up to 16 IPs)
  • crt.sh / Cert Spotter -- New SSL/TLS certificates issued for your domains, indicating new subdomains or shadow IT
  • CISA ICS Advisories + KEV -- New ICS advisories, KEV additions, and vendor-specific security alerts (email and RSS)
  • Vendor PSIRTs -- New vulnerabilities for specific products in your asset inventory (see Module 4 vendor PSIRT table)
  • HaveIBeenPwned -- New data breaches containing email addresses from your domain (requires domain verification)

Configure whichever of these services are not already active for your organization. The lab section below focuses on pull-based checks because those require structured practice to establish the baseline and schedule that make ongoing monitoring repeatable.

Alert Aggregation and Signal-to-Noise

Without aggregation, monitoring alerts scatter across email, Slack, RSS feeds, and browser bookmarks -- making it easy to miss critical notifications. Establish a single collection point:

  • Dedicated email folder -- Filter all monitoring alerts (Google Alerts, CISA, vendor PSIRTs, HIBP) into a single folder. Simple, works with any email system
  • Slack or Teams channel -- A dedicated channel for monitoring alerts, with email-to-channel integration for services that only support email delivery
  • RSS reader -- For CISA advisories and vendor feeds that publish RSS. Aggregates multiple feeds into a single reading interface

Start narrow, expand as you learn. Begin with a focused set of alerts covering your highest-priority assets and personnel. Overly broad alerting generates noise that trains you to ignore notifications -- the opposite of what you want. After a few weeks of triage, you will learn which alerts generate actionable findings and which are noise. Expand coverage gradually.

When an alert arrives, run it through three questions:

  1. Does this involve an asset or person in my baseline? If not, it may be informational but is not directly actionable against your monitored environment.
  2. Does this indicate a change from baseline? New asset, changed configuration, removed service, new breach appearance -- changes are what you are looking for. An alert confirming known state is not a finding.
  3. Is this actionable within my team's current capacity? A real finding that no one can act on is a backlog item, not an emergency. Be honest about capacity.

Based on those answers, assign each alert a disposition:

  • Act now -- P0/P1 finding: escalate immediately per your escalation path
  • Weekly review queue -- Notable change that is not urgent. Batch these for your weekly pull check
  • Parking lot -- Interesting but low priority. Review monthly during your baseline update
  • Tune or disable -- After 4+ weeks of consistently non-actionable results from a specific alert, adjust the query or remove it. An alert that never produces useful findings is consuming attention without providing value
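The three triage questions and the first three dispositions map cleanly onto a decision function. A minimal sketch (the function name and flag names are illustrative; "tune or disable" is omitted because it depends on multi-week history rather than a single alert):

```python
def triage(in_baseline: bool, changed: bool, actionable_now: bool, priority: str) -> str:
    """Map the three triage questions to a disposition.

    priority is the Module 4 rating, e.g. "P0", "P1", "P2".
    """
    if not in_baseline:
        return "informational"            # not in scope; note and move on
    if not changed:
        return "no finding"               # confirms known state, not a finding
    if priority in ("P0", "P1") and actionable_now:
        return "act now"                  # escalate per your escalation path
    if actionable_now:
        return "weekly review queue"      # notable but not urgent; batch it
    return "parking lot"                  # real, but beyond current capacity

print(triage(in_baseline=True, changed=True, actionable_now=True, priority="P0"))   # act now
print(triage(in_baseline=True, changed=False, actionable_now=True, priority="P2"))  # no finding
```

Encoding the logic this way is less about automation and more about forcing the questions to be answered in order, every time.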

One additional discipline: any item classified as "Needs Investigation" in your baseline (Step 2) should not sit indefinitely. If it remains unresolved for more than one full monitoring cycle -- weekly or monthly, depending on when it was added -- force a human decision: accept the risk, investigate further, or remediate. Unresolved items that linger become invisible, which is worse than a deliberate risk acceptance.
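That aging rule is easy to enforce mechanically if your baseline entries carry a date-added field. A sketch, assuming hypothetical name/status/added fields on each entry:

```python
from datetime import date, timedelta

def stale_investigations(items: list[dict], today: date, cycle_days: int = 7) -> list[str]:
    """Flag 'Needs Investigation' entries older than one monitoring cycle.

    items are baseline entries with hypothetical 'name', 'status', and
    'added' fields; cycle_days matches your weekly or monthly cadence.
    """
    cutoff = today - timedelta(days=cycle_days)
    return [
        it["name"]
        for it in items
        if it["status"] == "Needs Investigation" and it["added"] <= cutoff
    ]

entries = [
    {"name": "staging.example.coop", "status": "Needs Investigation", "added": date(2024, 1, 1)},
    {"name": "vpn.example.coop", "status": "Known-Good", "added": date(2023, 6, 1)},
]
print(stale_investigations(entries, today=date(2024, 1, 15)))  # ['staging.example.coop']
```

Anything this returns is overdue for a human decision: accept, investigate further, or remediate.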

Sustainability and Operational Fatigue

Alert fatigue -- too many notifications -- is a recognized problem. Less discussed is operational fatigue: the cumulative burden of running a monitoring program week after week, month after month. Manual programs fail not because the first month was bad, but because by month six, the person responsible has changed roles, competing priorities have crowded out the weekly checks, execution becomes inconsistent, and nothing is documented well enough for a handoff.

Design for this from day one. Identify your minimum viable monitoring -- the 2-3 activities that, if nothing else runs, keep the program alive. This might be: process your push alerts daily, run the weekly KEV check, and update the baseline monthly. Everything beyond that is valuable but optional. Document procedures clearly enough that someone else can execute them without training. Share ownership so the program does not depend on a single person. Use realistic time estimates -- if your weekly checks consistently take 45 minutes instead of 30, update the schedule rather than skipping steps. And accept that some coverage gaps are acceptable. A monitoring program that runs consistently at 70% coverage is far more valuable than one designed for 100% that quietly stops after three months.

Lab: Configure Your Monitoring Infrastructure

In this lab, you will build an OT-focused pull-based monitoring schedule tailored to your findings from Modules 2-4, then consolidate those findings into a classified baseline document. The outputs become Artifacts 6 and 7.

Step 1: Design Your Pull-Based Monitoring Schedule

Your pull-based schedule should target the remote access infrastructure, edge devices, and OT-adjacent personnel identified in Modules 2-4. This is not a general IT monitoring checklist -- generic changes like marketing website updates or unrelated subdomain additions are out of scope. The focus is on the systems and credentials that, if compromised, create a path to operational technology: VPN portals, firewall management interfaces, Tier 1 and Tier 2 personnel credentials, and the vulnerability status of edge devices sitting on the IT/OT boundary.

Use your AI client to generate a customized monitoring checklist based on your specific findings:

Pull-Based Monitoring Schedule Prompt
Create a pull-based OSINT monitoring schedule for [organization
name], a [sector] organization focused on monitoring remote
access and OT-adjacent infrastructure. Include weekly checks
(under 30 minutes) and monthly checks (up to 2 hours).

From our prior modules:
- Remote access and OT-adjacent services discovered: [list
  from Module 2 -- VPN portals, admin interfaces, remote
  desktop, firewall management pages]
- Edge device vendors with internet-exposed assets: [list
  from Module 4, e.g., Fortinet, Cisco]
- Tier 1 and Tier 2 personnel (OT/IT access): [count and
  roles from Module 3]
- Email domain for breach monitoring: [domain from Module 3]
- Primary domains and subdomains to watch: [from Module 2]

For each check, specify:
1. The tool and exact query to run
2. What to compare against (the baseline from today)
3. What constitutes an OT-relevant finding vs. expected noise
4. Estimated time

Prioritize checks covering remote access infrastructure and
edge device vulnerability changes above general subdomain
or personnel checks.
Worked Example: NRECA Pull-Based Monitoring Schedule
Example AI Response (NRECA Monitoring Schedule)

Customize the AI-generated schedule for your environment. Your actual weekly time may vary -- the key is having a consistent, repeatable process rather than a specific time target. Organizations with larger attack surfaces or more exposed services may need to allocate more time for monthly checks.

Step 2: Create Your Baseline Snapshot

Your pull-based monitoring is only useful if you have a baseline to compare against. Consolidate your outputs from Modules 2, 3, and 4 into a single baseline document that serves as the reference point for all future monitoring.

For each item in the baseline, assign a classification:

  • Known-Good -- Expected and properly configured; no security concern identified. Action: document and monitor for changes
  • Accepted Risk -- Known exposure that cannot be remediated immediately (e.g., a business-required internet-facing service). Action: document the business justification, monitor more frequently, and review the acceptance quarterly
  • Needs Remediation -- Security issue with a clear fix (patch available, misconfiguration, unnecessary exposure). Action: add to the remediation queue with priority from the Module 4 framework and track to resolution
  • Needs Investigation -- Unknown or unexpected item requiring further analysis before classification. Action: investigate promptly; items should not remain in this category for more than one monitoring cycle

OT-Priority Sorting

When classifying baseline entries, work through them in order of operational impact. Remote access portals, VPN endpoints, and administrative interfaces should be classified first -- these are the assets most likely to provide a direct path into OT networks. Tier 1 and Tier 2 personnel (those with OT system access or credentials) come next, followed by vulnerability correlation entries for edge devices and OT-adjacent systems. All remaining findings -- corporate subdomains, general personnel, and lower-priority vulnerabilities -- are classified last. This ordering ensures that the entries with the most direct OT impact receive attention even when time runs short.
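If your baseline lives in a spreadsheet or script, this ordering can be applied mechanically before a classification session. A sketch with hypothetical category labels chosen to mirror the priority order above:

```python
# Hypothetical category labels; the ordering follows the OT-priority guidance above.
OT_PRIORITY = {
    "remote_access": 0,            # VPN portals, admin interfaces: classify first
    "tier1_tier2_personnel": 1,    # personnel with OT system access or credentials
    "edge_device_vuln": 2,         # vulnerability correlation entries for edge devices
    "other": 3,                    # corporate subdomains, general personnel, low-priority vulns
}

def sort_for_classification(entries: list[dict]) -> list[dict]:
    """Order baseline entries so the highest OT-impact items are classified first."""
    return sorted(entries, key=lambda e: OT_PRIORITY.get(e["category"], 3))

work_queue = sort_for_classification([
    {"name": "www.example.coop", "category": "other"},
    {"name": "vpn.example.coop", "category": "remote_access"},
    {"name": "FortiGate CVE correlation entry", "category": "edge_device_vuln"},
])
print([e["name"] for e in work_queue])
```

Unknown categories fall to the bottom, which is the safe default: they still get reviewed, just after the known high-impact items.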

Baseline Structure

Your baseline document should consolidate the key outputs from each preceding module:

  • From Module 2 (Attack Surface): All discovered domains, subdomains, and exposed services with their current status. Remote access portals, VPN login pages, and administrative interfaces should be listed first within this section. Include the asset documentation fields (hostname, IP, port, product, criticality, zone)
  • From Module 3 (Personnel): Tier 1 and Tier 2 personnel with email addresses, breach status, and monitoring priority. Team email addresses and email format pattern
  • From Module 4 (Vulnerabilities): Vulnerability correlation table with CVE details, KEV status, priority ratings, and remediation status
  • From this module: List of configured push alerts and pull-based schedule, so the next person running the monitoring process knows what is already set up

The baseline is a living document. Every monitoring cycle should produce a comparison against the baseline and then update it. When a new subdomain appears, it starts as "Needs Investigation," gets analyzed, and then moves to one of the other three categories. When a vulnerability is patched, its status in the baseline changes to "Known-Good" with a note about the patch date. The baseline reflects your current understanding, not a historical snapshot. The remote access and edge device entries are the highest-priority section of this document -- if nothing else is reviewed at each monitoring cycle, those entries should be.
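That update cycle, where new items enter as "Needs Investigation" and existing classifications persist, can be sketched as a merge of the latest scan into the baseline. The function name and the treatment of disappeared assets are illustrative choices, not prescribed by the workshop:

```python
def merge_scan(baseline: dict[str, str], observed: set[str]) -> dict[str, str]:
    """Fold a fresh scan into the baseline (asset -> classification).

    New assets enter as 'Needs Investigation'; existing classifications
    persist. Assets that disappeared are flagged for review rather than
    silently dropped, since removal is itself a change from baseline.
    """
    updated = {asset: baseline.get(asset, "Needs Investigation") for asset in observed}
    for asset, status in baseline.items():
        if asset not in observed:
            updated[asset] = "Needs Investigation (no longer observed)"
    return updated

baseline = {"vpn.example.coop": "Accepted Risk", "old.example.coop": "Known-Good"}
merged = merge_scan(baseline, {"vpn.example.coop", "staging.example.coop"})
print(merged["staging.example.coop"])  # Needs Investigation
```

After each cycle, the output of this merge becomes the new baseline, so the document always reflects current understanding.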

Step 3: Sustainability Assessment

You have now designed a monitoring schedule and created a baseline. Before leaving this module, spend five minutes making explicit decisions about how this program survives beyond today. Answer these four questions -- use your AI client to think through the tradeoffs, but the answers are yours to make:

  1. Single point of failure: Is this program documented well enough that a teammate who was not here today could run it? If not, what is missing -- tool access, login credentials, the schedule itself, or context about why certain alerts were configured?
  2. Minimum viable monitoring: Of all the push alerts and pull checks in your monitoring program, which 2-3 would you keep if your available time dropped by 75%? These are your non-negotiables -- the checks that keep the program alive even when everything else falls off. (Include both push alerts and pull checks in your consideration.)
  3. Escalation path: When monitoring surfaces a P0 or P1 finding at 10pm on a Friday, who gets the call? Is that documented anywhere other than your own memory?
  4. Program review cadence: When will you revisit the alert configurations themselves? Alert configs go stale too -- organizational changes, new domains, personnel turnover, and vendor product changes all require updates to what you are monitoring.

Use your AI client to help identify your minimum viable monitoring set:

Minimum Viable Monitoring Prompt
Given this monitoring program covering [describe your setup --
domains monitored, alert services configured, pull-based schedule,
number of personnel tracked], I have limited time and need to
identify my minimum viable monitoring set.

If I could only run 3 checks per week, which 3 would give me the
highest detection value for an electric cooperative / [sector]
organization, and why?

For each, specify:
1. The tool and the specific query to run
2. What finding would require immediate action (escalate now)
   vs. adding to a weekly review queue
3. How long the check takes and what it would miss if skipped

Document your answers to these four questions alongside your monitoring artifacts. The goal is not to plan for failure -- it is to make scope decisions deliberately rather than discovering them when the program quietly stops running.

Output

Artifact 5: Push-based alert configuration. The alert sources documented in the read section -- Google Alerts, CISA ICS Advisory subscriptions, vendor PSIRT feeds, and breach notification services -- configured to deliver ongoing notifications without recurring effort. These push-based alerts form the daily triage queue that your Module 6 runbook will operationalize.

Artifact 6: Pull-based monitoring schedule. A structured weekly and monthly checklist specifying which tools to use, which queries to run, and what to compare against your baseline. This schedule ensures monitoring happens consistently rather than only when someone remembers. Combined with the push-based alert sources documented in the read section, this provides layered monitoring coverage. Use the Monitoring Checklist Template (download Word).

Artifact 7: Consolidated baseline document. A single document consolidating your Module 2 attack surface inventory, Module 3 personnel exposure inventory, and Module 4 vulnerability correlation table, with each item classified as known-good, accepted risk, needs remediation, or needs investigation. This baseline is the reference point for all ongoing monitoring -- every future check asks "what has changed since the baseline?" Use the Baseline Document Template (download Excel).

Looking ahead: The manual program you built here is a foundation, not the end state. Once you know what you are looking for and what normal looks like, automation becomes feasible. The same AI clients you have been using throughout this workshop can generate scripts that automate your pull-based checks -- Python scripts for crt.sh API queries, CISA KEV JSON feed parsing, HIBP API lookups (for those with API access), and Shodan API queries -- turning your manual schedule into a repeatable, scheduled process.
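As one concrete example of that automation, the weekly KEV check reduces to downloading CISA's published JSON feed and filtering it against your Module 4 vendor list. The feed URL and field names below follow CISA's published schema; the vendor list is illustrative:

```python
import json
import urllib.request

KEV_URL = "https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json"

def kev_matches(kev_data: dict, vendors: set[str]) -> list[dict]:
    """Filter the KEV catalog down to vendors in your asset inventory.

    The top-level 'vulnerabilities' list and the per-entry 'vendorProject'
    and 'cveID' fields follow CISA's published KEV JSON schema.
    """
    return [
        v for v in kev_data.get("vulnerabilities", [])
        if v.get("vendorProject", "").lower() in vendors
    ]

def fetch_kev() -> dict:
    """Download the current KEV catalog (network call)."""
    with urllib.request.urlopen(KEV_URL, timeout=30) as resp:
        return json.load(resp)

# Offline demonstration with sample entries shaped like the KEV schema:
sample = {"vulnerabilities": [
    {"cveID": "CVE-2018-13379", "vendorProject": "Fortinet", "product": "FortiOS"},
    {"cveID": "CVE-2021-44228", "vendorProject": "Apache", "product": "Log4j"},
]}
hits = kev_matches(sample, vendors={"fortinet", "cisco"})
print([h["cveID"] for h in hits])  # ['CVE-2018-13379']
```

Run on a schedule and diffed against last week's matches, this turns the KEV check into a five-minute review of only what is new for your vendors.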
