BGP update profiles and the implications for secure BGP update validation processing
Geoff Huston
PAM2007, 5 April 2007

Why?
Secure BGP proposals all rely on some form of validation of BGP update messages
Validation typically involves cryptographic validation, and may refer to further validation via a number resource PKI
This validation may take considerable resources to complete
This implies that the overheads of securing BGP updates in terms of validity of payload may contribute to:
  Slower BGP processing
  Slower propagation of BGP updates
  Slower BGP convergence following withdrawal
  Greater route instability
  Potential implications for the stability of the forwarding plane
One of the approaches to securing BGP is to attach some form of credential to a BGP update message. These credentials would normally be placed into the message by the originator of the BGP message, and allow remote parties who receive the update message to validate that the subject of the update message, namely the originating AS, actually generated the original update message, and that the originating AS has the authority to speak about the address prefix. Validation of cryptographic information may entail significant processing overheads, and the concern is that measures to secure BGP may have a serious negative impact on the performance of BGP.

What is the question here?
Validation information has some time span
Validation outcomes can be assumed to be valid for a period of hours
Should BGP-related validation outcomes be locally cached?
What size and cache lifetime would yield high hit rates for BGP update validation processing?
One way to mitigate this processing overhead is to observe that validation outcomes may be considered “valid” for a period of some hours. The question is then: if validation outcomes were to be cached, what cache parameters (cache size and retention time) would be effective in mitigating the BGP validation workload?

Method
Use a BGP update log from a single eBGP peering session with AS 4637 over a 14 day period (10 September 2006 – 23 September 2006)
Examine time and space distributions of BGP updates that have similar properties in terms of validation tasks

Update Statistics for the session
Day  Prefix    Duplicates:     Duplicates:         Duplicates:        Duplicates:
     Updates   Prefix          Prefix + Origin AS  Prefix + AS Path   Prefix + Comp-Path
  1   72,934    60,105 (82%)   54,924 (75%)        34,822 (48%)       35,312 (48%)
  2   79,361    71,714 (90%)   67,942 (86%)        49,290 (62%)       50,974 (64%)
  3  104,764    93,708 (89%)   87,835 (84%)        65,510 (63%)       66,789 (64%)
  4  107,576    94,127 (87%)   87,275 (81%)        64,335 (60%)       66,487 (62%)
  5  139,483   110,994 (80%)   99,171 (71%)        68,096 (49%)       69,886 (50%)
  6  100,444    92,944 (92%)   88,765 (88%)        70,759 (70%)       72,108 (72%)
  7   75,519    71,935 (95%)   69,383 (92%)        56,743 (75%)       58,212 (77%)
  8   64,010    60,642 (95%)   57,767 (90%)        49,151 (77%)       49,807 (78%)
  9   94,944    89,777 (95%)   86,517 (91%)        71,118 (75%)       72,087 (76%)
 10   81,576    78,245 (96%)   75,529 (93%)        63,607 (78%)       64,696 (79%)
 11   95,062    91,144 (96%)   87,486 (92%)        72,678 (76%)       74,226 (78%)
 12  108,987   103,463 (95%)   99,662 (91%)        80,720 (74%)       82,290 (76%)
 13   91,732    87,998 (96%)   85,030 (93%)        72,660 (79%)       74,116 (81%)
 14   78,407    76,174 (97%)   74,035 (94%)        64,994 (83%)       65,509 (84%)
These are the raw figures from the 14 day capture period. The first column is the number of prefixes that were updated by BGP (including duplicates). The second column is the number and percentage of these prefixes that were updated at least twice (i.e. were duplicated during the day). The third is the number of prefix plus origin AS pairs that were duplicated in the day. The fourth column uses prefix plus the AS path. The fifth column uses prefix plus a compressed AS path (i.e. removing repeated ASs from the AS path).
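The four duplicate columns differ only in the key used to decide whether two updates are "the same". The following is a minimal sketch of that counting, not the tooling used for the talk. It assumes one day of announcements given as (prefix, AS path) tuples with the origin AS last, and it assumes an update counts as a duplicate when its key has already been seen earlier in the same day. The function names and the input layout are hypothetical.

```python
from collections import Counter


def compress_path(as_path):
    """Remove consecutive repeats (AS prepending) from an AS path."""
    compressed = []
    for asn in as_path:
        if not compressed or compressed[-1] != asn:
            compressed.append(asn)
    return tuple(compressed)


# One key function per duplicate column of the table (names are illustrative).
KEYS = {
    "prefix": lambda prefix, path: prefix,
    "prefix+origin": lambda prefix, path: (prefix, path[-1]),
    "prefix+path": lambda prefix, path: (prefix, tuple(path)),
    "prefix+comp-path": lambda prefix, path: (prefix, compress_path(path)),
}


def duplicate_counts(updates):
    """Count updates whose key was already seen earlier in the same day.

    updates: iterable of (prefix, as_path) announcements, origin AS last.
    Returns (total_updates, {key_name: duplicate_count}).
    Assumption: "duplicate" means "key seen before"; the talk may count differently.
    """
    seen = {name: set() for name in KEYS}
    dups = Counter()
    total = 0
    for prefix, as_path in updates:
        total += 1
        for name, make_key in KEYS.items():
            key = make_key(prefix, as_path)
            if key in seen[name]:
                dups[name] += 1
            else:
                seen[name].add(key)
    return total, {name: dups[name] for name in KEYS}
```

Withdrawals carry no AS path, so under these assumptions they would need to be filtered out or counted only in the prefix column.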

CDF by Prefix and Originating AS
The CDF on the left indicates that the busiest 1% of all prefixes were responsible for 20% of the total number of BGP prefix updates over the 14 day period, and the top 10% of prefixes were responsible for 55% of all prefix updates. The CDF on the right indicates that the busiest 1% of all origin ASs occurred in 41% of all prefix updates, and the top 10% occurred in 80% of all prefix updates. These figures show a highly skewed distribution with strong heavy tail characteristics. They also indicate a strong likelihood of cacheability.
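Figures such as "the busiest 1% of prefixes account for 20% of updates" are single points read off such a CDF. As a rough sketch, one point can be computed as below. The function name is hypothetical and the input is assumed to be one key (a prefix or an origin AS) per observed update.

```python
import math
from collections import Counter


def share_of_busiest(keys, fraction):
    """Fraction of all updates attributable to the busiest `fraction` of keys.

    keys: iterable with one entry (prefix or origin AS) per observed update.
    fraction: e.g. 0.01 for the busiest 1% of distinct keys.
    """
    ranked = sorted(Counter(keys).values(), reverse=True)
    if not ranked:
        return 0.0
    # Always include at least one key, even when fraction * len(ranked) < 1.
    top_n = max(1, math.ceil(len(ranked) * fraction))
    return sum(ranked[:top_n]) / sum(ranked)
```

Sweeping `fraction` from 0 to 1 traces out the whole curve.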

Time Distribution
This figure indicates the time distribution of duplicated update information for the various ‘types’ of duplication, looking at prefixes, prefix plus origin, prefix plus path and prefix plus compressed path. The figures indicate that the overall majority of duplicates occur within one hour of the original update, and the recurrence rate drops off in the 30 – 40 hour range.

Time Spread
This figure shows the same plot as the previous figure, but uses a relative percentage of all duplicates. The shaded area indicates that within 36 hours more than 80% of all duplicate types will have occurred.
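The time spread can be approximated from a timestamped log by recording, for each duplicate, how long ago the same key was last seen. This is only a sketch. It assumes events are sorted by time and it measures the gap to the most recent earlier occurrence, which may not be exactly how the figure defines "the original update". The function names are hypothetical.

```python
def recurrence_gaps(events):
    """Seconds between each duplicate and the previous occurrence of its key.

    events: iterable of (timestamp_seconds, key), sorted by timestamp.
    The key may be a prefix, (prefix, origin), (prefix, path), and so on.
    """
    last_seen = {}
    gaps = []
    for timestamp, key in events:
        if key in last_seen:
            gaps.append(timestamp - last_seen[key])
        last_seen[key] = timestamp
    return gaps


def fraction_within(gaps, window_seconds):
    """Share of duplicates that recur within the given window."""
    if not gaps:
        return 0.0
    return sum(1 for gap in gaps if gap <= window_seconds) / len(gaps)
```

Calling `fraction_within(gaps, 36 * 3600)` gives the quantity that the shaded area describes, under the assumptions above.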

Space Distribution
Use a variable size cache simulator
Assume 36 hour cache lifetime
Want to know the hit rate of validation queries against cache size
The next figures look at determining the most effective cache size for this data. The cache simulator simulates all cache sizes simultaneously, using a 36 hour cache retention interval and an LRU cache replacement scheme.
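A much simplified version of such a simulator is sketched below. Unlike the simulator described here it models one cache size at a time, so it would have to be run once per candidate size. It also assumes that an entry expires 36 hours after it was validated and that a hit does not extend that lifetime. The class name and interface are hypothetical.

```python
from collections import OrderedDict


class ValidationCache:
    """LRU cache of validation outcomes with a fixed retention time."""

    def __init__(self, max_entries, lifetime_seconds=36 * 3600):
        self.max_entries = max_entries
        self.lifetime = lifetime_seconds
        self._entries = OrderedDict()  # key -> time the outcome was validated
        self.hits = 0
        self.misses = 0

    def lookup(self, key, now):
        """Return True on a cache hit; on a miss, (re)validate and store the key."""
        validated_at = self._entries.get(key)
        if validated_at is not None and now - validated_at <= self.lifetime:
            self._entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return True
        # Miss or expired entry: a full validation would happen here.
        self.misses += 1
        self._entries[key] = now
        self._entries.move_to_end(key)
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict the least recently used
        return False

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

Feeding every update's key and timestamp through `lookup` and reading `hit_rate()` at the end of each day gives one point on a hit rate versus cache size curve.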

Prefix Similarity
This is a simulation of a prefix cache, looking at the cache hit rates for each of the 14 days, using a 36 hour cache retention time and an LRU cache replacement algorithm. For 13 of the 14 days in the measurement period a cache size of 10,000 prefixes would have a cache hit rate of between 78% and 86%.

Prefix + Origin Similarity
This is similar to the previous figure, but in this case the cache key is “origination”, which is the combination of the prefix and the origin AS. Here a cache of 10,000 entries would have an average daily hit rate of some 72%.

Prefix + Path Similarity
This is the same as the previous figures, this time using a cache key of prefix plus AS path. Here a cache of 10,000 entries would have an average hit rate of some 38%.

Observations
A large majority of BGP updates explore diverse paths for the same origination
True origination instability occurs relatively infrequently (1:4) ?
Validation workloads can be reduced by considering origination (prefix plus origin) and the path vector as separable validation tasks
Further processing reduction can be achieved by treating an AS path vector as a sequence of AS paired adjacencies
As the text points out, a significant proportion of BGP updates appear to be BGP performing path exploration. Origin instability, where a prefix is announced from multiple origin ASs, is less common. Using the prefix plus AS path as the ‘unit’ of BGP security validation appears to be a relatively suboptimal choice in terms of validation workload, even with caching of validation outcomes. The workload can be reduced significantly by considering origination (prefix plus origin AS) and AS paths as distinct validation tasks. The next question is to what extent path validation outcomes can be cached, where the path and not the prefix is validated and the validation outcome cached.

AS Path Similarity
This figure indicates that a cache of 10,000 AS paths with a retention period of 36 hours would have an average daily hit rate of some 77%.

AS Pair Similarity
AS path validation can be further decomposed into a sequence of AS pair validation tasks. This figure shows the AS pair validation cache hit rate for various cache sizes. A cache of 100 AS pairs would have an average hit rate of some 83%.
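The decomposition of an AS path into AS pairs can be sketched as follows. It assumes that prepended (repeated) ASs are collapsed first and that pairs are kept in path order, so that (A, B) and (B, A) are distinct cache keys. Both choices are assumptions rather than something the talk specifies, and the function name is hypothetical.

```python
from itertools import groupby


def as_pairs(as_path):
    """Split an AS path into its sequence of adjacent AS pairs.

    as_path: sequence of AS numbers, nearest AS first and origin AS last.
    Consecutive repeats (prepending) are collapsed before pairing.
    """
    compressed = [asn for asn, _ in groupby(as_path)]
    return list(zip(compressed, compressed[1:]))
```

A path of n distinct ASs yields n - 1 pairs, and each pair then becomes one cache key.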

Observations
Validation caching appears to be a useful approach to addressing some of the potential overheads of validation of BGP updates
Separating origination from path processing, using a 36 hour validation cache, can achieve an 80% validation hit rate using a cache of 10,000 Prefix + AS originations and a cache of 1,000 AS pairs
Caching represents a very effective way to reduce the validation processing overhead associated with secure BGP measures. The data taken in this measurement period indicates that a cache of 10,000 originations can achieve an 80% validation hit rate, reducing the validation processing load by this amount. A cache of 1,000 AS pairs appears capable of performing at a hit rate of 90%. Part of the high AS pair cache hit rate is due to the observation that in an AS path vector the AS pairs on the left are ‘closer’ to the BGP speaker, and near AS pair adjacencies will occur in almost every update, while AS pair adjacencies to the right of the AS path vector will be less frequent. The other factor here is the highly skewed distribution of origin ASs in updates where, as previously noted, 41% of all updates come from the busiest 1% (some 230) of origin ASs. This points to a more general observation about network stability and BGP: in general most of the network is extremely stable, and a very small subset of the network exhibits disproportionately high levels of routing instability.
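Putting the two caches together, validation of one update might look like the sketch below. This illustrates the separation of origination from path processing and is not a description of any secure BGP proposal. The caches are plain dictionaries with no size limit, both positive and negative outcomes are cached, the AS path is assumed to be non-empty, and `check_origination` and `check_pair` are hypothetical stand-ins for the expensive cryptographic checks.

```python
from itertools import groupby

LIFETIME = 36 * 3600  # retention time in seconds, as assumed in the talk


def cached_check(cache, key, now, check):
    """Return (outcome, was_hit); call check(key) only on a miss or expiry."""
    entry = cache.get(key)
    if entry is not None and now - entry[0] <= LIFETIME:
        return entry[1], True
    outcome = check(key)
    cache[key] = (now, outcome)
    return outcome, False


def validate_update(prefix, as_path, now, origin_cache, pair_cache,
                    check_origination, check_pair):
    """Validate origination and each AS pair separately.

    Returns (ok, expensive_checks), where expensive_checks counts how many
    times a check function had to be called because of a cache miss.
    """
    path = [asn for asn, _ in groupby(as_path)]  # collapse prepending
    ok, hit = cached_check(origin_cache, (prefix, path[-1]), now, check_origination)
    expensive = 0 if hit else 1
    for pair in zip(path, path[1:]):
        pair_ok, hit = cached_check(pair_cache, pair, now, check_pair)
        expensive += 0 if hit else 1
        ok = ok and pair_ok
    return ok, expensive
```

During path exploration the origination key stays the same and most AS pairs repeat, so under these assumptions the count of expensive checks falls quickly after the first few updates for a prefix.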

What do we want from secure BGP?
Validation that the received BGP Update has been processed by the ASs in the AS Path, in the same order as the AS Path, and reflects a valid prefix, valid origination and valid propagation along the AS Path?
or
Validation that the received Update reflects a valid prefix and valid origination, and that the AS Path represents a plausible sequence of validated AS peerings?
By decomposing the BGP update validation task of validating a prefix and an associated AS path into separate origination validation and AS path validation, then further decomposing AS path validation into a sequence of AS pair validation tasks, and caching validation outcomes, the validation workload can be reduced considerably. The validation outcomes, however, are subtly different. Validation of the complete update, where each AS adds crypto information that covers what it sends as an update to its peer, creates a cumulative outcome: a BGP update receiver can be assured not only that the prefix is a valid one and that the AS path in the update is a valid AS path in terms of the actual network topology, but also that the update itself has traversed the same sequence of ASs as represented in the AS path. The validation of origination and AS pairs can still provide assurance that the prefix is a valid one and that the prefix holder has authorized the origination of the prefix by the origin AS. However, AS pair validation can only establish that the AS path as represented in the update is a plausible path, and the update itself may not have been propagated along this AS path.

Thank You