Measurement and Classification of Humans and Bots in Internet Chat
By Steven Gianvecchio, Mengjun Xie, Zhenyu Wu, and Haining Wang, College of William and Mary

Presentation transcript:

Slide 1: Measurement and Classification of Humans and Bots in Internet Chat
By Steven Gianvecchio, Mengjun Xie, Zhenyu Wu, and Haining Wang, College of William and Mary

USENIX Security 2008 Measurement and Classification of Humans and Bots in Internet Chat

Slide 2: Outline
• Background
• Measurement
• Classification System
• Experimental Evaluation
• Conclusion

Slide 4: Bots
• Bots: programs that automate human tasks
  - web bots automate browsing the web
  - chat bots automate online chat
  - can be harmful and/or helpful

Slide 5: Chat Bots vs. BotNets
• BotNets: networks of compromised machines
  - some use chat systems (IRC) for command and control (C&C); others use P2P, HTTP, etc.
  - abuse various systems
• Chat Bots: automated chat programs
  - some are helpful, e.g., chat loggers
  - can abuse chat systems and their users

Slide 6: The Chat Bot Problem
• The Problem: chat bots abuse chat services (e.g., AOL, Yahoo!, MSN)
  - send spam
  - spread malicious software
  - mount phishing attacks
• Our focus is on the Yahoo! chat system

Slide 7: A Typical Chat
Alice12 entered the room.
Alice12: Hi room.
Bob34: hi alice
Susie88: any guys want to let a cute girl move in with them! hehe
Alice12: What’s up?
Bob34: not much
Susie88: can you guys see me on my web-cam?? (its in my profile)

Slide 8: Yahoo! Chat
• Yahoo! chat is a large commercial chat service
  - over 3,000 chat rooms
[Diagram labels: AUTH, CHAT, IM, …]

Slide 9: Yahoo! Chat
• Yahoo! chat system
  - client connects to a server
  - servers relay messages to/from clients

Slide 11: Measurement
• August-November 2007: we collect data
• August 2007: Yahoo! adds CAPTCHA
  - must pass to join a chat room
  - protocol update prevents some 3rd-party clients from accessing chat
• October 2007: bots are back
  - some bots return before 3rd-party clients

Slide 12: Measurement
• September and October 2007
  - very few chat bots
• August and November 2007
  - many chat bots
• 1,440 hours of chat logs
  - 147 chat logs
  - 21 chat rooms

Slide 13: Measurement
• To create our dataset, we read and label the chat users as human, bot, or ambiguous
• In total, we recognized 14 different types of chat bots
  - different triggering mechanisms
  - different text generation techniques

Slide 14: Triggering Mechanisms
• Timer-Based
  - periodic timers, e.g., 40 seconds
  - random timers, e.g., seconds
• Response-Based
  - responds to other users
Example:
Sam77: Bob12, you’re just full of questions, aren’t you?
Sam77: Bob12, lots of evidence for evolution can be found here
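
The two timer-based triggers can be illustrated with a minimal Python sketch. The 40-second period comes from the slide; the function names and the 30 to 90 second bounds used in the usage note are hypothetical, chosen only for illustration, and drawing the random wait from a uniform distribution is an assumption of this sketch.

```python
import random


def periodic_delays(period_s, n):
    """Inter-message delays for a bot that posts on a fixed timer."""
    return [period_s] * n


def random_delays(low_s, high_s, n, seed=0):
    """Inter-message delays for a bot that draws each wait uniformly
    from [low_s, high_s]. The bounds are hypothetical parameters."""
    rng = random.Random(seed)
    return [rng.uniform(low_s, high_s) for _ in range(n)]
```

For example, `periodic_delays(40, 3)` gives three identical 40-second gaps, while `random_delays(30, 90, 3)` gives three different gaps inside the chosen bounds. A periodic bot concentrates all of its delays on one value, which is what makes its timing easy to tell apart from a human's.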

Slide 15: Text Generation
• Character Padding
  Fiona88: anyone boredjn wanna chat?uklcss
• Synonym Phrases
  Marjorie99: Hi Babes! Marjorie Here! Inspect My Site
  Marjorie99: Mmmm Folks! Im Marjorie! View My Webpage
• Odd Line or Word Spacing
• Message Replay

Slide 16: Types of Chat Bots
• Periodic Bots: send messages based on periodic timers
• Random Bots: send messages based on random timers
• Responder Bots: respond to messages of other users
• Replay Bots: replay messages of other users

Slide 17
• Humans
  - inter-message delay: evidence of heavy tail
  - message size: well fit by Exponential (λ=0.034)
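
As a quick reading aid for the exponential fit, the sketch below relates the rate λ to the mean of the distribution. It assumes the standard maximum-likelihood estimate (λ = 1 / sample mean); the slide does not say how the fit was obtained or which unit message size is measured in, and the function names are hypothetical.

```python
def fit_exponential_rate(sizes):
    """Maximum-likelihood rate of an exponential distribution:
    lambda = 1 / sample mean. Assumes positive message sizes."""
    if not sizes:
        raise ValueError("need at least one message size")
    mean = sum(sizes) / len(sizes)
    if mean <= 0:
        raise ValueError("mean message size must be positive")
    return 1.0 / mean


def exponential_mean(rate):
    """Mean of an exponential distribution with the given rate."""
    return 1.0 / rate
```

Under this reading, λ=0.034 corresponds to a mean human message size of about 29.4, in whatever unit the measurement used.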

Slide 18
• Periodic Bots
  - inter-message delay: several clusters with high probabilities
  - message size: messages built from templates approximate a normal distribution

Slide 19
• Random Bots
  - inter-message delay: Equilikely distribution at 40, 64, and 88; Uniform distribution
  - message size: messages selected from a small database

Slide 20
• Responder Bots
  - inter-message delay: human-like timing
  - message size: multiple templates of different lengths

Slide 21
• Replay Bots
  - inter-message delay: cluster with high probabilities (replay bots are periodic)
  - message size: human-like size, well fit by Exponential (λ=0.028)

Slide 23: Classification System
• Entropy Classifier
  - detects abnormal behavior
  - based on message sizes and inter-message delays
  - accurate but slow
• Machine Learning Classifier
  - detects “learned” patterns
  - based on message content
  - fast but must be trained

Slide 24: Entropy Classifier
• Observation: chat bots are less complex than humans and thus lower in entropy
  - exploits the low entropy of chat bots
• Corrected Conditional Entropy Test (CCE)
  - estimates higher-order entropy
• Entropy Test (EN)
  - estimates first-order entropy
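
The two tests can be sketched in Python as follows. This is a sketch under stated assumptions, not the authors' implementation: it assumes the values (message sizes or inter-message delays) are first discretized into equal-width bins, and it uses one common formulation of corrected conditional entropy, in which the conditional entropy of length-m patterns is penalized by the fraction of patterns seen only once times the first-order entropy, and the minimum over m is reported. The function names, the bin count, and `max_m` are hypothetical.

```python
import math
from collections import Counter


def bin_values(values, num_bins):
    """Map raw values to symbols 0..num_bins-1 using equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / num_bins
    return [min(int((v - lo) / width), num_bins - 1) for v in values]


def pattern_counts(symbols, m):
    """Count every length-m window of the symbol sequence."""
    return Counter(tuple(symbols[i:i + m]) for i in range(len(symbols) - m + 1))


def entropy(symbols, m=1):
    """Shannon entropy in bits of length-m patterns; m=1 is first-order (EN)."""
    counts = pattern_counts(symbols, m)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def corrected_conditional_entropy(symbols, max_m=5):
    """Minimum over m of CE(m) + perc(m) * EN(1), with CE(1) = EN(1)."""
    en1 = entropy(symbols, 1)
    best = en1
    for m in range(2, max_m + 1):
        if len(symbols) < m:
            break
        counts = pattern_counts(symbols, m)
        total = sum(counts.values())
        # Fraction of windows whose pattern occurs exactly once.
        unique_frac = sum(1 for c in counts.values() if c == 1) / total
        # Clamp small negative estimates caused by finite-sample edge effects.
        ce = max(0.0, entropy(symbols, m) - entropy(symbols, m - 1))
        best = min(best, ce + unique_frac * en1)
    return best
```

A strictly alternating sequence such as `[0, 1] * 20` has a first-order entropy of 1 bit but a corrected conditional entropy near 0, because each symbol is fully predictable from the previous one. That gap is the reason a higher-order test can catch regular behavior that a first-order test misses.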

Slide 25: Machine Learning Classifier
• Observation: chat spam, like email spam, is a text classification problem
  - exploits message content of chat bots
• CRM114
  - a powerful text classification system
  - several built-in classifiers: HMM, KNN/Hyperspace, OSB, SVM, Winnow, etc.
  - we use OSB
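
To give a feel for what OSB (orthogonal sparse bigrams) looks at, here is a simplified Python sketch of the general idea: each token is paired with each of the next few tokens, and the number of skipped tokens is kept as part of the feature. It is an illustration only and does not reproduce CRM114's actual tokenization, hashing, or weighting; the function name, the whitespace tokenizer, and the window of 5 are assumptions of this sketch.

```python
def osb_features(text, window=5):
    """Return (left_token, skipped_count, right_token) triples for every
    token pair that fits inside a sliding window of `window` tokens."""
    tokens = text.split()
    features = []
    for i, left in enumerate(tokens):
        for gap in range(1, window):
            j = i + gap
            if j >= len(tokens):
                break
            # gap - 1 tokens sit between the two paired tokens.
            features.append((left, gap - 1, tokens[j]))
    return features
```

For the bot message "View My Webpage", this yields the pairs (View, My), (View, skip one, Webpage), and (My, Webpage). Features of this kind can still match when a bot swaps or inserts a word between two otherwise fixed words.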

Slide 26
• Hybrid Classification System
  - entropy classifier builds and maintains the bot corpus
  - machine learning classifier uses the bot and human corpora
[Diagram: INPUT, ENTROPY CLASSIFIER, MACHINE LEARNING CLASSIFIER, BOT CORPUS, HUMAN CORPUS, CLASSIFY AS CHAT BOT, CLASSIFY AS HUMAN]
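
The division of labor between the two classifiers can be sketched as follows. This is a toy illustration under loud assumptions: the control flow is one plausible reading of the diagram, the token-frequency scorer is a stand-in for a real text classifier (it is not CRM114 or OSB), and every name here is hypothetical.

```python
from collections import Counter


def token_counts(corpus):
    """Token frequencies over a list of messages."""
    return Counter(tok for msg in corpus for tok in msg.lower().split())


def toy_ml_classify(messages, bot_corpus, human_corpus):
    """Stand-in text classifier: label 'bot' when the messages' tokens are
    relatively more frequent in the bot corpus than in the human corpus."""
    bot_counts = token_counts(bot_corpus)
    human_counts = token_counts(human_corpus)
    bot_n = sum(bot_counts.values()) or 1
    human_n = sum(human_counts.values()) or 1
    score = 0.0
    for msg in messages:
        for tok in msg.lower().split():
            score += bot_counts[tok] / bot_n - human_counts[tok] / human_n
    return "bot" if score > 0 else "human"


def hybrid_step(messages, entropy_flags_bot, bot_corpus, human_corpus):
    """If the entropy test flags the user, add the messages to the bot
    corpus; then label the user with the content-based classifier."""
    if entropy_flags_bot(messages):
        bot_corpus.extend(messages)
    return toy_ml_classify(messages, bot_corpus, human_corpus)
```

In this sketch, once the entropy test has flagged one user posting "view my webpage", a later user posting similar text is labeled a bot by content alone, without waiting for enough messages to run the entropy test again.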

Slide 28: Experimental Evaluation
• Types of Chat Bots
  - Periodic Bots
  - Random Bots
  - Responder Bots
  - Replay Bots
• Classifiers
  - entropy classifier: 100 messages
  - machine learning classifier: 25 messages

Slide 29: Experimental Evaluation
• Classification Tests
  - Ent: entropy classifier
  - SupML: fully-supervised ML classifier, trained on AUG BOTS
  - SupMLre: fully-supervised ML classifier, retrained on NOV BOTS
  - EntML: entropy-trained ML

Slide 30

           AUG BOTS (TP)                  NOV BOTS (TP)                  human (FP)
test       periodic  random   respond    periodic  random    replay
EN(imd)    121/121   68/68    1/30       51/51     109/109   40/40      7/1713
CCE(imd)   121/121   49/68    4/30       51/51     109/109   40/40      11/1713
EN(ms)     92/121    7/68     8/30       46/51     34/109    0/40       7/1713
CCE(ms)    77/121    8/68     30/30      51/51     6/109     0/40       11/1713
OVERALL    121/121   68/68    30/30      51/51     109/109   40/40      17/1713

• Entropy Classifier
  - EN: entropy
  - CCE: corrected conditional entropy
  - (imd): inter-message delay
  - (ms): message size

Slide 31 (same results table as slide 30)
• EN(imd) and CCE(imd)
  - problems against responder bots
  - detect most other chat bots

Slide 32 (same results table as slide 30)
• EN(ms) and CCE(ms)
  - problems against random and replay bots
  - detect most other chat bots

Slide 33 (same results table as slide 30)
• OVERALL
  - detects all chat bots
  - false positive rate is ~0.01
  - 100 messages
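
The quoted false positive rate follows directly from the table cells. The short Python sketch below converts a "hits/total" cell into a rate; the helper name is hypothetical, and the cell values are the ones shown in the entropy classifier table.

```python
def rate(cell):
    """Convert a 'hits/total' table cell such as '17/1713' into a fraction."""
    hits, total = cell.split("/")
    return int(hits) / int(total)


# OVERALL row: humans wrongly flagged as bots.
overall_fp = rate("17/1713")

# EN(imd) row: responder bots detected.
en_imd_responder_tp = rate("1/30")
```

Here `overall_fp` is about 0.0099, which matches the ~0.01 on the slide, and `en_imd_responder_tp` is about 0.033, which is the weakness against responder bots noted for the inter-message delay tests.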

Slide 34

           AUG BOTS (TP)                  NOV BOTS (TP)                  human (FP)
test       periodic  random   respond    periodic  random    replay
Ent        121/121   68/68    30/30      51/51     109/109   40/40      17/1713
SupML      121/121   68/68    30/30      14/51     104/109   1/40       0/1713
SupMLre    121/121   68/68    30/30      51/51     109/109   40/40      0/1713
EntML      121/121   68/68    30/30      51/51     109/109   40/40      1/1713

• Entropy and Machine Learning Classifiers
  - Ent: entropy classifier (from last slide)
  - SupML: fully-supervised machine learning
  - SupMLre: SupML retrained
  - EntML: entropy-trained machine learning

Slide 35 (same results table as slide 34)
• Ent
  - OVERALL results from the entropy classifier table

Slide 36 (same results table as slide 34)
• SupML
  - has problems against November bots
  - needs to be retrained for new bots
• SupMLre
  - detects all bots

Slide 37 (same results table as slide 34)
• EntML
  - false positive rate: 1/1713 (Ent is ~0.01)
  - 25 messages

Slide 39: Conclusion
• Measurements
  - overall, chat bots are less complex than humans
  - some chat bots are more human-like
• Classification System
  - exploits benefits of both classifiers
  - quickly classifies known chat bots
  - accurately classifies unknown chat bots

Slide 40: Conclusion (cont.)
• Future Work
  - investigate more advanced chat bots
  - explore applications of entropy on other forms of bots (e.g., web bots)
  - explore other applications of entropy (e.g., detecting covert timing channels)

Slide 41: Questions? Thank You!