See Also: Host Checks, Passive Host State Translation
If you've ever worked in tech support, you've undoubtedly had users tell you "the Internet is down". As a techie, you're pretty sure that no one pulled the power cord from the Internet. Something must be going wrong somewhere between the user's chair and the Internet.
Assuming it's a technical problem, you begin to search for the cause. Perhaps the user's computer is turned off, maybe their network cable is unplugged, or perhaps your organization's core router just took a dive. Whatever the problem might be, one thing is certain - the Internet isn't down. It just happens to be unreachable for that user.
Nagios Core is able to determine whether the hosts you're monitoring are in a DOWN or UNREACHABLE state. These are very different (although related) states and can help you quickly determine the root cause of network problems. Here's how the reachability logic works to distinguish between these two states.
Take a look at the simple network diagram below. For this example, let's assume you're monitoring all the hosts (servers, routers, switches, etc.) that are pictured. Nagios Core is installed and running on the Nagios Core host.
In order for Nagios Core to be able to distinguish between DOWN and UNREACHABLE states for the hosts that are being monitored, you'll need to tell Nagios Core how those hosts are connected to each other - from the standpoint of the Nagios Core daemon. To do this, trace the path that a data packet would take from the Nagios Core daemon to each individual host. Each switch, router, and server the packet encounters or passes through is considered a "hop" and will require that you define a parent/child host relationship in Nagios Core. Here's what the host parent/child relationships look like from the viewpoint of Nagios Core:
Now that you know what the parent/child relationships look like for hosts that are being monitored, how do you configure Nagios Core to reflect them? The parents directive in your host definitions allows you to do this. Here's what the (abbreviated) host definitions with parent/child relationships would look like for this example:
define host {
    host_name   Nagios   ; <-- The local host has no parent - it is the topmost host
}

define host {
    host_name   Switch1
    parents     Nagios
}

define host {
    host_name   Web
    parents     Switch1
}

define host {
    host_name   FTP
    parents     Switch1
}

define host {
    host_name   Router1
    parents     Switch1
}

define host {
    host_name   Switch2
    parents     Router1
}

define host {
    host_name   Wkstn1
    parents     Switch2
}

define host {
    host_name   HPLJ2605
    parents     Switch2
}

define host {
    host_name   Router2
    parents     Router1
}

define host {
    host_name   somewebsite.com
    parents     Router2
}
Now that you've configured Nagios Core with the proper parent/child relationships for your hosts, let's see what happens when problems arise. Assume that two hosts - Web and Router1 - go offline.
When hosts change state (e.g. from UP to DOWN), the host reachability logic in Nagios Core kicks in. The reachability logic will initiate parallel checks of the parents and children of whatever hosts change state. This allows Nagios Core to quickly determine the current status of your network infrastructure when changes occur.
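To make the "parents and children" idea concrete, here is a minimal Python sketch that, given the parent map from the example definitions, lists the hosts that would be rechecked when one host changes state. It is an illustration only, not Nagios Core's actual code; the names PARENTS and hosts_to_recheck are hypothetical, and it assumes each host has at most one parent, as in this example.

```python
# Child -> parent map mirroring the example host definitions.
# Assumption for this sketch: every host has at most one parent.
PARENTS = {
    "Nagios": None,
    "Switch1": "Nagios",
    "Web": "Switch1",
    "FTP": "Switch1",
    "Router1": "Switch1",
    "Switch2": "Router1",
    "Wkstn1": "Switch2",
    "HPLJ2605": "Switch2",
    "Router2": "Router1",
    "somewebsite.com": "Router2",
}


def hosts_to_recheck(changed_host, parents=PARENTS):
    """Return the immediate parent and children of a host, sorted by name."""
    related = set()
    parent = parents.get(changed_host)
    if parent is not None:
        related.add(parent)
    # Children are the hosts that name changed_host as their parent.
    related.update(child for child, p in parents.items() if p == changed_host)
    return sorted(related)


print(hosts_to_recheck("Router1"))  # ['Router2', 'Switch1', 'Switch2']
```

In this model, a state change on Router1 leads to checks of Switch1 (its parent) and of Switch2 and Router2 (its children).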
In this example, Nagios Core will determine that Web and Router1 are both in DOWN states because the "path" to those hosts is not being blocked.
Nagios Core will determine that all the hosts "beneath" Router1 are in an UNREACHABLE state because it can't reach them. Router1 is DOWN and is blocking the path to those other hosts. Those hosts might be running fine, or they might be offline - Nagios Core doesn't know because it can't reach them. Hence Nagios Core considers them to be UNREACHABLE instead of DOWN.
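The DOWN versus UNREACHABLE decision in this example can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the Nagios Core implementation: each host has a single parent, and failed_checks is a hypothetical set of hosts whose checks did not succeed from the Nagios Core host. A host with a failed check is DOWN if nothing on its path is failing, and UNREACHABLE if some ancestor is failing too.

```python
# Child -> parent map mirroring the example host definitions.
# Assumption for this sketch: every host has at most one parent.
PARENTS = {
    "Nagios": None,
    "Switch1": "Nagios",
    "Web": "Switch1",
    "FTP": "Switch1",
    "Router1": "Switch1",
    "Switch2": "Router1",
    "Wkstn1": "Switch2",
    "HPLJ2605": "Switch2",
    "Router2": "Router1",
    "somewebsite.com": "Router2",
}


def classify(host, failed_checks, parents=PARENTS):
    """Return 'UP', 'DOWN' or 'UNREACHABLE' for one host."""
    if host not in failed_checks:
        return "UP"
    # Walk up the path toward the monitoring host.
    ancestor = parents.get(host)
    while ancestor is not None:
        if ancestor in failed_checks:
            # Something closer to the monitoring host is blocking the path.
            return "UNREACHABLE"
        ancestor = parents.get(ancestor)
    # The path is clear, so the host itself is the problem.
    return "DOWN"


# Web and Router1 go offline, so checks of everything behind Router1 fail too.
failed = {"Web", "Router1", "Switch2", "Wkstn1",
          "HPLJ2605", "Router2", "somewebsite.com"}

for name in PARENTS:
    print(name, classify(name, failed))
```

With this input, Web and Router1 come out as DOWN, the five hosts behind Router1 come out as UNREACHABLE, and the remaining hosts stay UP.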
By default, Nagios Core will notify contacts about both DOWN and UNREACHABLE host states. As an admin/tech, you might not want to get notifications about hosts that are UNREACHABLE. You know your network structure, and if Nagios Core notifies you that your router/firewall is down, you know that everything behind it is unreachable.
If you want to spare yourself from a flood of UNREACHABLE notifications during network outages, you can exclude the unreachable (u) option from the notification_options directive in your host definitions and/or the host_notification_options directive in your contact definitions.
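The effect of leaving out the u option can be illustrated with a small Python sketch. It models only the option letters d (down), u (unreachable) and r (recovery) and assumes a notification goes out only when both the host's notification_options and the contact's host_notification_options include the letter for the new state. The function names are hypothetical, and the real notification logic applies further filters (time periods, downtime, and so on) that are ignored here.

```python
# Map a host state to the option letter that enables its notification.
STATE_OPTION = {"DOWN": "d", "UNREACHABLE": "u", "UP": "r"}


def parse_options(value):
    """Split a comma-separated option string such as 'd,u,r' into a set."""
    return {opt.strip() for opt in value.split(",") if opt.strip()}


def would_notify(state, host_options, contact_options):
    """True if both the host and the contact allow notifications for state."""
    letter = STATE_OPTION[state]
    return (letter in parse_options(host_options)
            and letter in parse_options(contact_options))


# Host definition with 'u' removed: UNREACHABLE is suppressed for all contacts.
print(would_notify("UNREACHABLE", "d,r", "d,u,r"))  # False
# DOWN notifications still go out.
print(would_notify("DOWN", "d,r", "d,u,r"))         # True
# Removing 'u' only on the contact suppresses it for that contact alone.
print(would_notify("UNREACHABLE", "d,u,r", "d,r"))  # False
```

Dropping u from the host definition silences UNREACHABLE notifications for everyone, while dropping it from a contact definition silences them only for that contact.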