What's New with the University of Oregon RouteViews Project

Advanced Network Technology Center
University of Oregon

BGPd on eqix.routeviews.org crashed for unknown reasons. Down from about
11:50 to about 17:30 UTC.

After considering our options, we've decided to decommission route-views3
and manage all the Oregon and multi-hop sessions on either route-views,
route-views2, or route-views6. All non-redundant sessions have been moved
to one of the other collectors.

route-views3 is back up (with temporary hardware) and appears to be stable.

route-views3 is down with a hardware malfunction, and will be offline
for repairs for an unknown period of time. If you are currently peering 
with route-views3 and are receiving alarms, please feel free to admin 
the sessions down until the problem has been resolved.

route-views2 was down from approximately 16:26 22-Oct-06 until 20:13 23-Oct-06 
UTC due to an apparent hardware problem.

route-views.isc had an interface problem which caused collections to
be lost from approximately 15:00 to 20:00 UTC.

route-views.eqix had problems of unknown origin for several hours. Some
or all peer sessions were disrupted for part of this time. 

The PAIX collector had interface problems, causing numerous peer resets
throughout the day. A new interface was swapped in and, after a brief outage,
connectivity appears to be restored. Problems were finally resolved prior
to 23:00 UTC.

A fiber cut on campus caused outages of all the Oregon collectors
(route-views, route-views2, and route-views3) between 03:30 and
05:00 UTC.

route-views2 down again briefly around 15:00 UTC. Between 16:30 and
16:45 the Zebra BGPD was replaced with a patched version.

route-views2 developed a problem with the Zebra process and was down from
around 17:20 to 04:44 UTC the following day.

route-views was down from approximately 20:30 to 21:40, then had to be reloaded
again at 00:22 UTC.

route-views.isc was down from approximately 15:30 to 00:30 UTC.

Ongoing issues with the BGP process on route-views.linx caused periodic loss 
of collections from approximately 27 Jan 06 to 10 Feb 06. We currently believe 
that the problem has now been fixed.

route-views.eqix was replaced with a new machine today. Resumption of
service took longer than expected, and a large part of the day passed 
without any data being collected. 

bgpd on route-views.linx died and had to be manually restarted, resulting
in a loss of data from approximately 13:12 to 15:20 UTC.

Memory problems on route-views caused a loss of connectivity and data
collection problems between 07:00 and 16:00 UTC.

A switch failure caused loss of connectivity to the majority of peers from
approximately 22:15 to 22:45 UTC today. This will show up as a loss of data
in the archive files for today.

route-views2 ran out of disk space. Approximately 2 hours of collection
was lost.

20050328 through 20050330
route-views.eqix continues to experience hardware problems, and was down for
about two days. We are still waiting for new hardware.

route-views.oregon-ix stopped accepting TCP connections and had to be rebooted.
Cause of the problem is unknown. 

route-views.eqix was down for approximately a day and a half. This machine
is experiencing ongoing problems and a replacement is being ordered.

An upstream router crash affected several BGP peers on the boxes at the University.

route-views.eqix hung and had to be rebooted.  Cause unknown.

route-views.isc: a NAP switch outage caused a few hours of peer loss.

route-views.eqix Ethernet PHY wedged.  Rebooted.

route-views.isc (PAIX) NAP connection affected by UPS failure.

route-views3 routing daemon restarted.  It became very confused after adding
a specific hold-timer to the BGP groups.

route-views.isc (PAIX) NAP interface accumulating errors that are affecting
BGP sessions.  Began about 12:45 UTC.  Thought to be an interface problem,
box was rebooted.  Later found that S&D is having switch problems.

route-views2 rebooted at 19:00.  The new kernel was suspected as the cause of
lingering TCP connections.  That was not the case, but the reboot did clear
their effects in Zebra.

route-views.linx rebooted at 01:00 to load new kernel.

route-views2 dropping sessions during RIB dumps after upgrade.  Addition of
memory, fixing RAID microcode, and code changes seem to have fixed the problem.

route-views.wide lost power at ~10:25 UTC

route-views2 and route-views6 upgraded.

route-views.isc lost power.

route-views.linx rebooted to pick up new software.

route-views.wide rebooted to pick up new software.

route-views.wide now has a connection to the WIDE/NSPIXP2 IPv6 switch and
is ready for IPv6 peering.

route-views.wide.routeviews.org lost power early this morning (PDT).  Cause
not yet known.

We had additional problems with the archive.routeviews.org hardware.  Nearly
all is restored now.

We suffered a RAID failure on archive.routeviews.org, the machine that also
collects RIBs from the Cisco route-views.routeviews.org.  It successfully
rebuilt on the 12th and failed again the morning of the 13th.

It has been rehomed to a new machine with a more reliable RAID controller.

We regret that Cisco RIBs that would have been collected during these periods
are lost.

04 February 2005 help@routeviews.org.