IODA Help
Screencasts
We have created screencasts that walk you through using IODA's Dashboard
and Explorer. The screencasts cover the most important user interface
elements and IODA's data sources, and show how to use both the Dashboard
and the Explorer effectively.
Datasources
BGP
-
Data is obtained by processing all updates from all Route Views and
RIPE RIS collectors.
-
Every 5 minutes, we calculate the number of full-feed peers that
observe each prefix. A peer is considered full-feed if it has more than 400k IPv4 prefixes and/or more than 10k IPv6 prefixes, which suggests that it shares its entire routing table.
-
A prefix is visible if more than 50% of the full-feed peers observe it.
We aggregate prefix visibility statistics by country, region and ASN.
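The full-feed and visibility rules above can be sketched as follows. This is a minimal illustration, not IODA's actual implementation: the function names and the input shape (a mapping from each full-feed peer to the set of prefixes it observes) are assumptions made for the example, and "and/or" is read as a logical or.

```python
from collections import Counter
from typing import Dict, Set

IPV4_FULL_FEED_MIN = 400_000  # "more than 400k IPv4 prefixes"
IPV6_FULL_FEED_MIN = 10_000   # "more than 10k IPv6 prefixes"


def is_full_feed(ipv4_prefix_count: int, ipv6_prefix_count: int) -> bool:
    """Hypothetical helper: apply the full-feed rule to one peer's table sizes."""
    return (ipv4_prefix_count > IPV4_FULL_FEED_MIN
            or ipv6_prefix_count > IPV6_FULL_FEED_MIN)


def visible_prefixes(full_feed_tables: Dict[str, Set[str]]) -> Set[str]:
    """Return the prefixes observed by more than 50% of the full-feed peers.

    full_feed_tables maps a peer identifier to the set of prefixes it observes
    (an assumed input shape).
    """
    peer_count = len(full_feed_tables)
    seen_by = Counter()
    for prefixes in full_feed_tables.values():
        seen_by.update(prefixes)
    # Strictly more than half of the peers must observe the prefix.
    return {prefix for prefix, n in seen_by.items() if 2 * n > peer_count}
```

With four full-feed peers, a prefix seen by exactly two of them is not counted as visible, because 50% is not more than 50%.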
Active Probing
-
We use a custom implementation of the Trinocular technique.
-
We probe ~4.2M /24 blocks at least once every 10 minutes (as opposed to the 11 minutes used in the Trinocular paper).
-
Currently the alerts IODA shows use data from a team of
20 probers located at SDSC. (Alerts based on data from our
distributed probers that run on the Ark platform are coming soon.)
-
The Trinocular measurement and inference technique labels each /24 block as up,
down, or unknown.
We then aggregate the /24 blocks into country, region and ASN
statistics.
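The aggregation step can be sketched as counting the /24 blocks labeled up in each group. This is a minimal sketch under stated assumptions: the function name, the string labels, and the block-to-group mapping (for example, /24 block to country code) are hypothetical and stand in for whatever geolocation and ASN lookups are actually used.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def count_up_blocks(
    block_states: Iterable[Tuple[str, str]],
    block_to_group: Dict[str, str],
) -> Counter:
    """Count /24 blocks labeled "up" per group (country, region, or ASN).

    block_states yields (block, state) pairs, where state is one of
    "up", "down", or "unknown". Blocks with no known group are skipped.
    """
    up_counts = Counter()
    for block, state in block_states:
        group = block_to_group.get(block)
        if state == "up" and group is not None:
            up_counts[group] += 1
    return up_counts
```

Running the same function with a different mapping (block to region, or block to ASN) gives the other aggregation levels.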
Network Telescope
-
We analyze traffic data from both the UCSD and Merit Network Telescopes.
(Currently IODA uses only data from the UCSD Telescope for generating alerts.)
-
We apply anti-spoofing heuristics and noise reduction filters to the
raw traffic.
-
For each packet that passes the filters, we perform geolocation (using the Netacuity IP geolocation DB) and ASN lookups on the source IP address,
and then compute the number of unique source IPs per minute, aggregated by country, region, and ASN.
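The per-minute counting step can be sketched as below. This is a minimal sketch, not the production pipeline: it assumes packets have already passed the anti-spoofing and noise filters, and the `lookup` callable is a hypothetical stand-in for the geolocation or ASN lookup on the source IP address.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Optional, Tuple


def unique_sources_per_minute(
    packets: Iterable[Tuple[float, str]],
    lookup: Callable[[str], Optional[str]],
) -> Dict[Tuple[int, str], int]:
    """Count unique source IPs per (minute, group).

    packets yields (unix_timestamp, source_ip) pairs. lookup maps a source IP
    to a group key (country, region, or ASN) or None if unknown.
    """
    sources = defaultdict(set)
    for timestamp, src_ip in packets:
        group = lookup(src_ip)
        if group is None:
            continue
        minute_start = int(timestamp) // 60 * 60  # floor to the minute
        sources[(minute_start, group)].add(src_ip)
    return {key: len(ips) for key, ips in sources.items()}
```

Because each bucket is a set, a source that sends many packets within the same minute is counted once.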
Outage Detection
-
For each data source (BGP, Active Probing, and Darknet, i.e., the
Network Telescope), we currently monitor for three types of outages:
country-level, region-level and AS-level.
-
Detection is performed by comparing the current value for
each datasource/aggregation (e.g. the number of /24 networks visible
on BGP and geolocated to Italy) to a
historical value that is computed by finding the
median of a sliding window of recent values (the length of
the window varies between data sources and is listed below).
-
If the current value is lower than a given fraction of the
historical value, an alert is generated. Each data source is
configured with two history-fraction thresholds: one that
triggers a warning alert, and one that triggers a
critical alert. The warning and critical thresholds for each
data source are listed below. These values are experimental and are
based on empirical observations of the signal-to-noise ratio for each data source.
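The comparison described above can be sketched as follows. This is a minimal sketch under stated assumptions: the function name and return values are hypothetical, the thresholds are passed as fractions (0.8 for 80%), and the caller is assumed to supply only the values that fall inside the sliding window.

```python
from statistics import median
from typing import Optional, Sequence


def classify(
    current: float,
    history_window: Sequence[float],
    warning_fraction: float,
    critical_fraction: float,
) -> Optional[str]:
    """Return "critical", "warning", or None for the current value.

    The historical value is the median of the recent values in the window.
    An alert fires when the current value is strictly lower than the given
    fraction of that historical value.
    """
    if not history_window:
        return None  # assumption: no history means no alert
    history = median(history_window)
    if current < critical_fraction * history:
        return "critical"
    if current < warning_fraction * history:
        return "warning"
    return None
```

For example, with a window of `[90, 100, 110]` the historical value is 100, so with fractions 0.8 and 0.5 a current value of 70 yields a warning and a current value of 40 yields a critical alert.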
Detection Criteria
BGP
-
Metric: # /24 blocks (visible to more than 50% of full-feed peers)
-
History Sliding Window Length: 24 hours
-
Thresholds:
-
Warning: 99%
-
Critical: 50%
Active Probing
-
Metric: # /24 blocks up
-
History Sliding Window Length: 7 days
-
Thresholds:
-
Warning: 80%
-
Critical: 50%
Darknet
-
Metric: # unique source IP addresses
-
History Sliding Window Length: 7 days
-
Thresholds:
-
Warning: 25%
-
Critical: 10%
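For reference, the criteria above can be written down as a small lookup table. This is an illustrative encoding only: the dictionary layout, key names, and the `alert_floor` helper are assumptions made for the example, while the window lengths and percentages are the ones listed above.

```python
from datetime import timedelta

# Window lengths and thresholds as listed in the Detection Criteria section.
DETECTION_CRITERIA = {
    "bgp": {"window": timedelta(hours=24), "warning": 0.99, "critical": 0.50},
    "active_probing": {"window": timedelta(days=7), "warning": 0.80, "critical": 0.50},
    "darknet": {"window": timedelta(days=7), "warning": 0.25, "critical": 0.10},
}


def alert_floor(source: str, level: str, history: float) -> float:
    """Hypothetical helper: the value below which an alert of `level` fires."""
    return DETECTION_CRITERIA[source][level] * history
```

For instance, with a Darknet historical value of 1000 unique source IPs, a warning fires once the current value drops below 250.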
Outage Severity Scores
Alert Area
To quantify the severity of an outage, we use a concept we call
Alert Area, which takes into account both the magnitude and the
duration of the outage. The alert area is computed by
multiplying the relative drop (i.e. ((history - current) / history) * 100)
by the length of the outage (in minutes). All alert tables in IODA show
Alert Area values as per-datasource outage severity scores.
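The Alert Area formula can be sketched directly from the definition above. The function name and the error raised for a non-positive history value are assumptions made for the example.

```python
def alert_area(history: float, current: float, duration_minutes: float) -> float:
    """Relative drop (in percent) multiplied by the outage length in minutes."""
    if history <= 0:
        # Assumption: the relative drop is undefined without a positive history.
        raise ValueError("history must be positive")
    relative_drop = (history - current) / history * 100
    return relative_drop * duration_minutes
```

As a worked example, a drop from a historical value of 200 to a current value of 100 is a 50% relative drop, so over 30 minutes the alert area is 50 * 30 = 1500.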
Overall Score
While we perform outage detection on a per-datasource basis, we use multiple
datasources to gain confidence about an outage. To do this, we compute
an Overall Score by multiplying the
Alert Area scores for each data source that triggered an alert.
We multiply Alert Area scores together (rather than summing them) to
give weight to outages that have been detected through multiple datasources.
Caveat: While the "Overall Score" values given in the alert tables
reflect a multiplication of the total alert area for each data source,
the "Overall Score" value shown in the "Outage Severity Levels"
visualizations is instead the
sum of the overall score values for each minute in the time window.
That is, an overall score is computed for each minute by multiplying
together the alert areas for that minute, and then these per-minute
overall scores are summed to give the total shown when hovering over a
country or region.
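The difference between the two "Overall Score" values can be sketched as follows. This is a minimal sketch: the function names and input shapes are hypothetical, and returning 0 when no data source triggered an alert is an assumption, since the text only defines the score for data sources that did.

```python
from math import prod
from typing import Sequence


def overall_score(alert_areas: Sequence[float]) -> float:
    """Alert tables: product of the total alert areas, one per alerting source."""
    if not alert_areas:
        return 0.0  # assumption: no alerting source means no score
    return float(prod(alert_areas))


def summed_per_minute_score(per_minute_areas: Sequence[Sequence[float]]) -> float:
    """Severity visualizations: sum of the per-minute overall scores.

    per_minute_areas holds, for each minute in the time window, the alert
    areas of the data sources alerting in that minute.
    """
    return float(sum(overall_score(areas) for areas in per_minute_areas))
```

With alert areas of 50 and 20 in one minute, 50 alone in the next, and no alert in a third, the summed value is 50 * 20 + 50 + 0 = 1050, which generally differs from the product of the per-source totals shown in the alert tables.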