CS-495 Advanced Networking
J. Scott Miller, Spring 2005
Against Internet Intrusions (paper)
Plan of Attack
Introduction
Data Collection
Data Analysis
–Data as over-generalized
–Response to data as flimsy
–Projections as too simplistic
–Final analysis shoddy
Introduction
As we’ve already heard, this paper has a lot of data!
Unfortunately, the analysis provided does not match the depth of information collected
–Few meaningful conclusions are drawn
–Analysis is very simplistic and preliminary
–Future work is suggested once
Data Collection
Firewall logs from across the world
–1600 different locations
–Collected over four months
Sounds good, but…
–No information given regarding the subnets these firewalls protect, such as size and composition (this becomes important later)
–Logs lack IP header and connection information
Data as Over-Generalized
Data is placed into two very large groups
–Worms
–Non-worms
But the behavior of each intruder is specialized
–Code Red I exhibits strong day-of-the-month characteristics whereas Code Red II does not
–Global characteristics inferred are therefore very dependent on the worm in question
–Same for non-worms
Data as Over-Generalized (cont.)
What does this mean?
–Analysis of persistence is biased toward the worms considered
  Code Red I is memory resident while II is not
–Periodicity is skewed by varied behavior
  While it’s neat to see traffic spike during the Code Red I spread phase, it is not necessarily telling
One more thing…
–Not clear if the firewalls catch intra-subnet traffic, which is important for some worms
Response to Data
Analysis of top sources
–Focus limited to non-worm sources
–Authors find a very Zipf-like distribution
Response to Data (cont.)
So the authors suggest…
–“… blacklisting worst offenders would be an effective mechanism defending against non-port 80 intrusions.”
Unfortunately, this is ineffective because of the long-tail distribution
–A few nodes are making a large number of attacks
–Many are making a small number of attacks, and not all IPs in that group can be banned
–No information is given on how many intrusions would still remain
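The long-tail objection can be made concrete with a minimal sketch. It assumes an idealized Zipf law in which the source ranked i contributes a share proportional to 1/i^s; the function name, the source counts, and the exponent are hypothetical and are not taken from the paper.

```python
def remaining_fraction(num_sources: int, blacklisted: int, exponent: float = 1.0) -> float:
    """Fraction of intrusions left after blocking the top `blacklisted` sources.

    Assumption (not from the paper): the source ranked i contributes
    weight 1 / i**exponent, i.e. an idealized Zipf distribution.
    """
    if not 0 <= blacklisted <= num_sources:
        raise ValueError("blacklisted must be between 0 and num_sources")
    # weights[0] is the worst offender, weights[-1] the least active source.
    weights = [1.0 / rank ** exponent for rank in range(1, num_sources + 1)]
    # Everything past the blacklisted prefix still gets through.
    return sum(weights[blacklisted:]) / sum(weights)


if __name__ == "__main__":
    # Hypothetical population of 100,000 sources.
    for k in (0, 10, 100, 1000):
        print(k, round(remaining_fraction(100_000, k), 3))
```

Under these assumed parameters (100,000 sources, exponent 1), blocking the 100 worst offenders still leaves more than half of the intrusions, which is the figure the paper would need to report to support the blacklisting claim.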
Projections
The limited data set is extrapolated to give an idea of the number of intrusions Internet-wide
–Calculated by taking the average intrusions per IP and multiplying that by the IP space
–“We assume uniformity, but do not test for it. That is, we assume that since our set of provider networks are reasonably well distributed … our perspective reflects what is seen over the general internet.”
Sound naïve?
Projections (cont.)
It is!
–Simply stating that you did not test for uniformity does not make it ok to ignore it!
A number of other factors are ignored in this assumption:
–Intra-subnet traffic missed by the router
–Traffic behind a NAT
–Unassigned IP addresses
Without regard to these factors, 25 billion scans a day is arbitrary
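The extrapolation being criticized is a single multiplication, which is why its result follows its assumptions so closely. The sketch below is illustrative only: the function and parameter names are hypothetical, and `routable_fraction` is an assumption added here to show how one ignored factor (unassigned address space) scales the estimate.

```python
IPV4_ADDRESS_SPACE = 2 ** 32  # every IPv4 address, assigned or not


def project_internet_wide(observed_intrusions: float,
                          monitored_ips: int,
                          routable_fraction: float = 1.0) -> float:
    """Scale an observed intrusion count up to the whole address space.

    Mirrors the "average intrusions per IP times the IP space" calculation.
    `routable_fraction` is a hypothetical correction for addresses that
    cannot receive traffic; 1.0 reproduces the uniformity assumption.
    """
    if monitored_ips <= 0:
        raise ValueError("monitored_ips must be positive")
    if not 0 < routable_fraction <= 1:
        raise ValueError("routable_fraction must be in (0, 1]")
    per_ip = observed_intrusions / monitored_ips
    return per_ip * IPV4_ADDRESS_SPACE * routable_fraction
```

Halving `routable_fraction` halves the projection, so any headline number produced this way is only as reliable as the untested uniformity assumption behind it.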
Final Analysis
Finally, the authors look at how many subnets are adequate to determine “worst offenders” (top) and target ports (bottom)
The data still appears erratic. Is it possible that this data does not fit that model?
The paper only mentions that the data should be “relatively stable”
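One way to test "relatively stable" rather than assert it is to measure how much of the full-data worst-offender list a random subset of subnets recovers. This is a minimal sketch of that idea, not the authors' method; the data layout (one dict of source-to-count per subnet) and both function names are assumptions.

```python
import random
from collections import Counter


def top_sources(subnet_logs, k):
    """Return the set of the k sources with the most intrusions across the given subnets."""
    totals = Counter()
    for log in subnet_logs:
        totals.update(log)  # adds each source's count to the running total
    return {source for source, _ in totals.most_common(k)}


def top_k_overlap(subnet_logs, sample_size, k, rng):
    """Share of the full-data top-k list that a random sample of subnets recovers."""
    full = top_sources(subnet_logs, k)
    if not full:
        raise ValueError("no sources in the logs")
    sample = rng.sample(subnet_logs, sample_size)
    return len(full & top_sources(sample, k)) / len(full)


if __name__ == "__main__":
    # Hypothetical logs: each dict maps a source to its intrusion count in one subnet.
    logs = [{"a": 5, "b": 1}, {"a": 3, "c": 2}, {"b": 4, "c": 1}]
    print(top_k_overlap(logs, sample_size=2, k=2, rng=random.Random(0)))
```

Repeating the measurement over many random samples at each sample size would give a spread as well as a mean, which is what is needed to judge whether an erratic curve reflects noise or a poor model fit.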
Moving on to My Opponent…