Internet2 IP Backbone Capacity Augment Practice
(January 13, 2015)
Internet2 continuously monitors its backbone traffic levels. In addition to the traffic graphs made available on the Internet2 NOC website, Internet2 staff receive weekly backbone traffic reports that provide a snapshot of the prior week’s activity.[1] These reports serve as triggers for staff discussion of potential areas for capacity augment.
Internet2 provides three services across its infrastructure: research and education (R&E) IP traffic, commercial peer traffic to TR-CPS routers, and Layer 2 circuits. The goal of the headroom policy is to accommodate bursts up to reasonable end-host interface sizes[2] on both the R&E IP infrastructure and the Layer 2 circuits. In addition, across all three services, ports and circuits should be able to routinely absorb the traffic bursts that result when circuits and services fail over.
- Backbone circuits are flagged for discussion when the weekly 95th-percentile measurement reaches 30% utilization. When a circuit regularly sustains 40% utilization, staff initiate a backbone augment.[3]
- The same thresholds should be used for peering ports, where it is possible to negotiate that level with peers.
- Queues should be monitored weekly for activity, with multiple queuing events treated as equivalent to a 30% measurement.
- It is recommended that 50% headroom be maintained on the connections between the regionals and Internet2, as well as between campuses and regionals, to ensure sufficient headroom end to end.
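As an illustration only, the thresholds in the list above can be sketched as a small classifier for one circuit-week. The function name, the labels, and the reading of "multiple queuing events" as two or more per week are assumptions made for this sketch and are not part of the practice itself. The "augment-candidate" label marks a single week at or above 40%; whether a circuit "regularly sustains" that level remains a staff judgment.

```python
# Thresholds taken from the guidelines above.
FLAG_UTILIZATION = 0.30
AUGMENT_UTILIZATION = 0.40
# Assumption: "multiple queuing events" is read as two or more per week.
MIN_QUEUING_EVENTS = 2


def classify_week(p95_bps, capacity_bps, queuing_events=0):
    """Return 'ok', 'discuss', or 'augment-candidate' for one circuit-week.

    p95_bps        -- weekly 95th-percentile traffic in bits per second
    capacity_bps   -- circuit capacity in bits per second
    queuing_events -- number of queuing events observed during the week
    """
    if capacity_bps <= 0:
        raise ValueError("capacity_bps must be positive")
    utilization = p95_bps / capacity_bps
    if utilization >= AUGMENT_UTILIZATION:
        # One week at this level is only a candidate; staff decide
        # whether the level is "regularly sustained".
        return "augment-candidate"
    if utilization >= FLAG_UTILIZATION or queuing_events >= MIN_QUEUING_EVENTS:
        return "discuss"
    return "ok"
```

For example, on a hypothetical 10 Gbps circuit, a weekly 95th-percentile figure of 4.8 Gbps is 48% utilization, so `classify_week(4.8e9, 10e9)` returns `"augment-candidate"`.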
The guidelines above comprise a working understanding of the needs of the Internet2 community and may be adjusted with input from Internet2 staff in conjunction with advisory groups. Internet2 views maintaining adequate backbone headroom as a basic responsibility.
[1] The weekly snapshots specifically report on the prior period’s 95th-percentile measurement. This industry-standard calculation is the maximum observed bandwidth once the top 5 percent of observed 30-second averages during that period are discarded. Over a week (168 hours), roughly 8.4 hours of data are therefore ignored; over a 24-hour period, 72 minutes are ignored. For example, on the weekly report, a 95th-percentile number of 4.8 Gbps implies that there were roughly 8.4 hours of 30-second averages at or above 4.8 Gbps.
[2] 40 Gbps as of 4/4/2014
[3] For the purposes of this document, there is no rigid definition of “regularly sustains” other than staff interpretation of traffic graphs, combined with their understanding of the R&E landscape at the time. This takes into account any special events, conferences, or one-time demonstrations that may be artificially inflating traffic levels in a given period.
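The 95th-percentile calculation described in note [1] can be sketched as follows. This is a minimal illustration, not the reporting tool's actual implementation: the function names are hypothetical, and the rounding convention (discarding exactly the integer part of 5% of the samples) is an assumption, since implementations differ on how they handle the boundary sample.

```python
def percentile_95(samples):
    """Largest sample left after discarding the top 5% of samples.

    samples -- 30-second average bandwidth values for the period
    """
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Integer arithmetic avoids floating-point rounding at the boundary.
    discard = len(ordered) * 5 // 100
    return ordered[len(ordered) - 1 - discard]


def discarded_seconds(period_seconds, interval_seconds=30):
    """Seconds of data ignored by the calculation over a period."""
    sample_count = period_seconds // interval_seconds
    return (sample_count * 5 // 100) * interval_seconds
```

A week contains 20,160 thirty-second samples, of which 1,008 are discarded, so `discarded_seconds(7 * 24 * 3600)` gives 30,240 seconds (8.4 hours). A 24-hour period gives 4,320 seconds (72 minutes), matching the figures in note [1].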