The Monitoring and Early Detection of Internet Worms
Cliff C. Zou, Weibo Gong, Don Towsley, and Lixin Gao
IEEE/ACM Trans. Networking, Oct. 2005
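Virus / Worm / Trojan Horse
- Virus: lives parasitically inside an existing file. A piece of computer code that "attaches itself to a program or file", spreads from computer to computer, and infects computers as it travels. System vulnerability (no user action required).
- Worm: installs itself on a computer as a new file. A worm usually spreads without any user action, and it distributes complete copies of itself (possibly modified) across the network. System vulnerability (no user action required).
- Trojan Horse: a computer program that appears useful but actually causes damage. Backdoor program. Deceives the user through disguise (user action required).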
Outline
- Worm propagation models
- Worm monitoring system
- Kalman filter estimation
- Code Red simulation
- Blaster-like worm simulation
Summary of worm models
Scan mode:
- Uniform-scan (random, the default): Code Red
- Imperfect uniform-scan: Slammer
- Sequential-scan: Blaster
- Subnet-scan: Code Red II
Worm propagation models:
- Simple epidemic model
- Discrete-time version
- Exponential model (for the slow start phase)
- AR discrete-time model
- Transformed linear model
Notations
Worm propagation model
Propagation models
- Simple epidemic model (*)
- Discrete-time version (*)
- Exponential model (slow start phase: N − I_t ≈ N)
- AR discrete-time model
- Transformed linear model
* D.J. Daley and J. Gani, Epidemic Modeling: An Introduction. Cambridge, U.K.: Cambridge Univ. Press.
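As a rough illustration of how the listed models relate, the sketch below iterates a discrete-time epidemic model next to its slow-start approximation. It assumes the commonly used forms I_t = (1 + α) I_{t-1} − β I_{t-1}^2 with α = βN for the discrete-time epidemic model, I_t = (1 + α) I_{t-1} for the AR model, and ln I_t growing linearly in t for the transformed linear model. These formulas, the function names, and the parameter values are assumptions made for illustration; the slide's own equations did not survive extraction.

```python
import math

def epidemic_discrete(n_hosts, i0, alpha, steps):
    """Iterate I_t = (1 + alpha) * I_{t-1} - beta * I_{t-1}**2, with beta = alpha / N (assumed form)."""
    beta = alpha / n_hosts
    series = [float(i0)]
    for _ in range(steps):
        prev = series[-1]
        series.append((1.0 + alpha) * prev - beta * prev * prev)
    return series

def ar_exponential(i0, alpha, steps):
    """Iterate the AR model I_t = (1 + alpha) * I_{t-1}, reasonable while N - I_t is close to N."""
    series = [float(i0)]
    for _ in range(steps):
        series.append((1.0 + alpha) * series[-1])
    return series

def log_transform(series):
    """Transformed linear view: ln I_t is roughly linear in t during the slow start phase."""
    return [math.log(x) for x in series]

if __name__ == "__main__":
    N, I0, ALPHA = 360_000, 10, 0.2   # illustrative values only
    epi = epidemic_discrete(N, I0, ALPHA, 100)
    ar = ar_exponential(I0, ALPHA, 100)
    for t in (5, 20, 40, 100):
        # The two series agree early on and diverge once I_t is no longer small relative to N.
        print(t, round(epi[t]), round(ar[t]))
```

The early rows should be nearly identical, which is why the simpler AR and transformed linear models can stand in for the epidemic model during the slow start phase.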
Generic worm monitoring system
Components
- Ingress scan monitor: listens to the global traffic in the Internet. Scan traffic is the incoming traffic to unused local IP addresses.
- Egress scan monitor: monitors the outgoing traffic from a network to infer the scan behavior of a potential worm (scan rate, scan distribution).
- Data mixer: reduces the traffic needed to send observation data to the Malware Warning Center (MWC).
The data that the MWC obtains
- The number of scans monitored in a monitoring interval from discrete time (t − 1) to t, denoted by Z_t.
- The cumulative number of infected hosts observed by discrete time t, denoted by C_t.
- A worm's scan distribution.
- A worm's average scan rate η.
Correction of biased observation C_t (1/2)
For a uniform-scan worm, each worm scan has a small probability p of being observed by the monitoring system, so an infected host sends out many scans before one of them is observed. As a result, C_t is not proportional to I_t.
In a monitoring interval Δ, a worm sends out ηΔ scans on average, so the monitoring system observes at least one scan from an infected host within the interval with probability 1 − (1 − p)^{ηΔ}.
Correction of biased observation C_t (2/2)
- Condition on C_{t-1}: the hosts that can newly appear in C_t are the unobserved infected hosts.
- Remove the conditioning on C_{t-1}.
- Replace E[C_t] by C_t.
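One way to read the correction steps above is the following sketch. Let q = 1 − (1 − p)^{ηΔ} be the probability that an infected host is observed at least once in an interval. If each of the I_{t-1} − C_{t-1} unobserved infected hosts is observed with probability q, then E[C_t | C_{t-1}] = C_{t-1} + (I_{t-1} − C_{t-1}) q, and solving for I_{t-1} after replacing the expectation by the observation gives (C_t − (1 − q) C_{t-1}) / q. This recursion, the function names, and the numbers below are assumptions for illustration, since the slide's equations are missing.

```python
def observe_probability(p, eta, delta):
    """Probability that at least one of the eta * delta scans sent in one interval is observed."""
    return 1.0 - (1.0 - p) ** (eta * delta)

def expected_observed(infected, q):
    """Expected cumulative observed count: C_t = C_{t-1} + (I_{t-1} - C_{t-1}) * q (assumed recursion)."""
    c = [0.0]
    for t in range(1, len(infected)):
        c.append(c[-1] + (infected[t - 1] - c[-1]) * q)
    return c

def corrected_estimate(c, q):
    """Estimate I_{t-1} from two consecutive observations: (C_t - (1 - q) * C_{t-1}) / q."""
    return [(c[t] - (1.0 - q) * c[t - 1]) / q for t in range(1, len(c))]

if __name__ == "__main__":
    infected = [10.0 * 1.2 ** t for t in range(30)]   # hypothetical early-phase I_t
    p = 2 ** 17 / 2 ** 32                              # monitor covering 2^17 of the 2^32 addresses
    q = observe_probability(p, eta=358, delta=1.0)     # eta is an illustrative scan rate per interval
    c = expected_observed(infected, q)
    est = corrected_estimate(c, q)
    # C_t lags far behind I_t, while the corrected estimate tracks it.
    print(round(c[-1]), round(est[-1]), round(infected[-2]))
```

With real data C_t is a noisy count rather than its expectation, and dividing by a small q amplifies that noise, which fits the observation that a smaller monitored space gives a noisier estimate.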
Estimated I_t (2^17 IP space)
Estimated I_t (2^14 IP space): noisier
Kalman filter estimation (simple epidemic model)
The system state and the system description use the following quantities:
- y_1, y_2, …, y_t are the measurement data (e.g., Z_t or I_t)
- υ_t is the noise
- α and β are derived from I_t
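To make the estimation step concrete, here is a minimal sketch of a Kalman filter with a constant state, which reduces to recursive least squares. It assumes the state is (1 + α, −β) and the measurement equation is y_t = (1 + α) y_{t-1} − β y_{t-1}^2 + υ_t. That state layout, the noise variance `r`, the initial covariance `p0`, and the function name are assumptions, since the slide's own equations are not recoverable.

```python
def kalman_epidemic(y, r=1.0, p0=1e6):
    """Estimate (alpha, beta) from measurements y_0..y_T.

    Assumed model: y_t = a * y_{t-1} + b * y_{t-1}**2 + noise,
    with constant state (a, b) = (1 + alpha, -beta).
    Returns one (alpha, beta) estimate per processed measurement.
    """
    a, b = 1.0, 0.0                    # prior: no growth, no saturation
    p11, p12, p22 = p0, 0.0, p0        # symmetric 2x2 error covariance
    history = []
    for t in range(1, len(y)):
        h1, h2 = y[t - 1], y[t - 1] ** 2          # measurement row H_t
        g1 = p11 * h1 + p12 * h2                  # g = P * H^T
        g2 = p12 * h1 + p22 * h2
        s = h1 * g1 + h2 * g2 + r                 # innovation variance
        k1, k2 = g1 / s, g2 / s                   # Kalman gain
        innovation = y[t] - (a * h1 + b * h2)
        a += k1 * innovation
        b += k2 * innovation
        p11 -= k1 * g1                            # P = P - K * H * P
        p12 -= k1 * g2
        p22 -= k2 * g2
        history.append((a - 1.0, -b))
    return history

if __name__ == "__main__":
    N, ALPHA = 10_000, 0.2                        # illustrative values only
    beta = ALPHA / N
    y = [10.0]
    for _ in range(30):                           # noise-free synthetic measurements
        y.append((1 + ALPHA) * y[-1] - beta * y[-1] ** 2)
    alpha_hat, beta_hat = kalman_epidemic(y)[-1]
    print(alpha_hat, alpha_hat / beta_hat)        # estimates of alpha and of N = alpha / beta
```

The ratio of the two estimates gives an estimate of the vulnerable population size N, under the assumption α = βN.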
How to detect a worm?
- For each TCP or UDP port, the MWC has an alarm threshold for the monitored illegitimate scan traffic Z_t.
- If the monitored scan traffic stays above the alarm threshold for several consecutive monitoring intervals, the Kalman filter is activated.
- The MWC begins to record C_t and calculates the average worm scan rate η from the reports of the egress scan monitors.
- The Kalman filter can use either C_t or Z_t to estimate all the parameters of a worm.
- The three discrete-time models are used to detect the worm.
- Once the estimated value of α stabilizes and oscillates slightly around a positive constant, the presence of a worm has been detected.
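The two decision points in this procedure (activating the filter, then declaring a worm) could be sketched as below. The function names, the default of three consecutive intervals, the window of five estimates, and the 10% tolerance are hypothetical choices for illustration; the slides do not give concrete values.

```python
def alarm_raised(z, threshold, consecutive=3):
    """True once scan counts z exceed the threshold for `consecutive` intervals in a row."""
    run = 0
    for value in z:
        run = run + 1 if value > threshold else 0
        if run >= consecutive:
            return True
    return False

def alpha_stabilized(alpha_estimates, window=5, rel_tol=0.1):
    """True if the last `window` estimates are positive and stay within rel_tol of their mean."""
    if len(alpha_estimates) < window:
        return False
    recent = alpha_estimates[-window:]
    mean = sum(recent) / window
    if mean <= 0:
        return False
    return all(a > 0 and abs(a - mean) <= rel_tol * mean for a in recent)

if __name__ == "__main__":
    scans = [3, 4, 9, 11, 14, 18]                 # hypothetical Z_t values
    print(alarm_raised(scans, threshold=8))
    print(alpha_stabilized([0.5, -0.2, 0.21, 0.2, 0.19, 0.2, 0.2]))
```

Requiring several consecutive intervals above the threshold keeps a single noisy spike from activating the filter, and requiring a positive, stable α separates sustained exponential growth from background noise.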
Code Red simulation
- Uniform-scan: can be accurately modeled by the simple epidemic model.
- The alarm threshold for Z_t is set to twice the mean value of the background noise (*).
* D. Goldsmith. Incidents Maillist: Possible Codered Connection Attempts. [Online]. Available:
Code Red propagation and its variability
Kalman filter estimation of Code Red infection rate α (1/3): epidemic model
Kalman filter estimation of Code Red infection rate α (2/3): AR exponential model
Kalman filter estimation of Code Red infection rate α (3/3): transformed linear model (figure annotation: 0.3% infected)
Long-term Kalman filter estimation (in the fast spread phase)
Estimate of the vulnerable population size N of Code Red (in the fast spread phase)
Blaster-like worm simulation
- Sequential-scan: can still be accurately modeled by the simple epidemic model.
- 16-block monitor: monitors 16 "/16" networks.
- 1024-block monitor: monitors 1024 "/22" networks.
- Both monitors cover the same amount of IP space: 16 × 2^16 = 1024 × 2^10.
- For sequential-scan worms, the monitored blocks should therefore be distributed as widely as possible across the IP space.
(Figure: monitored IP space within the whole IP space, with labels A, B, C.)
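The arithmetic behind the two monitor layouts, and the reason spreading matters for a sequential scanner, can be checked with a short sketch. It assumes the monitored blocks are evenly spaced over the IPv4 space and that an infected host scans `scan_len` consecutive addresses from a uniformly random start. Both assumptions, and the function names, are simplifications introduced here rather than details from the slides.

```python
import ipaddress

IPV4_SPACE = 2 ** 32

def monitored_addresses(block_count, prefix_len):
    """Total IPv4 addresses covered by `block_count` blocks of size /prefix_len."""
    block = ipaddress.ip_network(f"10.0.0.0/{prefix_len}")  # placeholder block; only its size matters
    return block_count * block.num_addresses

def hit_probability(block_count, prefix_len, scan_len):
    """Chance that one sequential scan of scan_len addresses touches any monitored block.

    Assumes evenly spaced blocks and a uniformly random start address. A scan overlaps
    a given block when it starts inside a window of scan_len + block_size - 1 addresses.
    """
    block_size = 2 ** (32 - prefix_len)
    window = scan_len + block_size - 1
    return min(1.0, block_count * window / IPV4_SPACE)

if __name__ == "__main__":
    print(monitored_addresses(16, 16), monitored_addresses(1024, 22))
    scan_len = 2 ** 20                            # illustrative scan length
    print(hit_probability(16, 16, scan_len), hit_probability(1024, 22, scan_len))
```

Both layouts monitor the same number of addresses, yet under these assumptions the 1024-block layout is far more likely to see any given sequential scan, because a scan only reveals itself when it walks into a monitored block.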
Worm propagation comparison between Code Red and Blaster-like worm
Blaster-like worm (I_t)
Blaster-like worm (Z_t)
Blaster-like worm (Z_t after applying a low-pass filter)
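The slides do not say which low-pass filter is applied to Z_t. As a placeholder, the sketch below uses a causal moving average, one of the simplest low-pass filters; the choice of filter, the window length, and the function name are assumptions.

```python
def moving_average(z, window=5):
    """Causal moving-average low-pass filter: mean of the last `window` samples (fewer at the start)."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running = 0.0
    for t, value in enumerate(z):
        running += value
        if t >= window:
            running -= z[t - window]   # drop the sample that just left the window
        smoothed.append(running / min(t + 1, window))
    return smoothed

if __name__ == "__main__":
    noisy = [0, 10] * 5                # hypothetical Z_t with strong interval-to-interval noise
    print(moving_average(noisy, window=2))
```

A longer window removes more of the interval-to-interval noise at the cost of delaying the response to a genuine rise in scan traffic.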
Kalman filter estimation of α for the Blaster-like worm: transformed linear model (figure annotation: 1.3% infected)