Digital Forensics Dr. Bhavani Thuraisingham The University of Texas at Dallas Evidence Correlation November 2011

Papers to discuss
- Forensic feature extraction and cross-drive analysis
- A correlation method for establishing provenance of timestamps in digital evidence

Abstract of Paper 1
This paper introduces Forensic Feature Extraction (FFE) and Cross-Drive Analysis (CDA), two new approaches for analyzing large data sets of disk images and other forensic data. FFE uses a variety of lexigraphic techniques for extracting information from bulk data; CDA uses statistical techniques for correlating this information within a single disk image and across multiple disk images. An architecture for these techniques is presented that consists of five discrete steps: imaging, feature extraction, first-order cross-drive analysis, cross-drive correlation, and report generation. CDA was used to analyze 750 images of drives acquired on the secondary market; it automatically identified drives containing a high concentration of confidential financial records as well as clusters of drives that came from the same organization. FFE and CDA are promising techniques for prioritizing work and automatically identifying members of social networks under investigation. The authors believe they are likely to have other uses as well.

Outline
- Introduction
- Forensic feature extraction
- Single-drive analysis
- Cross-drive analysis
- Implementation
- Directions

Introduction: Why?
- Improper prioritization. In these days of cheap storage and fast computers, the critical resource to be optimized is the attention of the examiner or analyst. Today work is not prioritized based on the information that the drive contains.
- Lost opportunities for data correlation. Because each drive is examined independently, there is no opportunity to automatically "connect the dots" on a large case involving multiple storage devices.
- Improper emphasis on document recovery. Because today's forensic tools are based on document recovery, they have taught examiners, analysts, and customers to be primarily concerned with obtaining documents.

Feature Extraction
- An email address extractor, which can recognize RFC822-style email addresses.
- An email Message-ID extractor.
- An email Subject: extractor.
- A date extractor, which can extract date and time stamps in a variety of formats.
- A cookie extractor, which can identify cookies from the Set-Cookie: header in web page cache files.
- A US Social Security Number extractor, which identifies the patterns ###-##-#### and ######### when preceded by the letters SSN and an optional colon.
- A credit card number extractor.
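To make the idea concrete, here is a minimal Python sketch of two such extractors (email addresses and SSNs) operating on bulk data. The patterns, function names, and sample input are illustrative assumptions, not the implementation from the paper.

```python
import re

# Hypothetical minimal sketch of two of the extractors described above.
# RFC822-style email addresses (simplified).
EMAIL_RE = re.compile(rb"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# US SSNs: ###-##-#### or #########, preceded by "SSN" and an optional colon.
SSN_RE = re.compile(rb"SSN:?\s*(\d{3}-\d{2}-\d{4}|\d{9})", re.IGNORECASE)

def extract_features(bulk_data: bytes):
    """Scan a block of raw bytes (e.g. strings output from a drive image)
    and yield (feature_type, feature_value) pairs."""
    for match in EMAIL_RE.finditer(bulk_data):
        yield ("email", match.group(0).decode("ascii", "replace"))
    for match in SSN_RE.finditer(bulk_data):
        yield ("ssn", match.group(1).decode("ascii", "replace"))

if __name__ == "__main__":
    sample = b"Contact alice@example.org ... SSN: 123-45-6789 ..."
    for kind, value in extract_features(sample):
        print(kind, value)
```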

Single Drive Analysis
- Extracted features can be used to speed initial analysis and answer specific questions about a drive image.
- The authors have successfully used extracted features for drive image attribution and to build a tool that scans disks to report the likely existence of information that should have been destroyed under the Fair and Accurate Credit Transactions Act.
- Drive attribution: an analyst might encounter a hard drive and wish to determine to whom that drive previously belonged. For example, the drive might have been purchased on eBay and the analyst might be attempting to return it to its previous owner.
- A powerful technique for making this determination is to create a histogram of the email addresses on the drive (as returned by the address feature extractor), as sketched below.
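A minimal sketch of the histogram idea, assuming the email addresses have already been pulled out of the drive image (for example by an extractor like the one above). The example data and threshold of ten entries are hypothetical; the most frequent addresses typically belong to the drive's primary user.

```python
from collections import Counter

def address_histogram(extracted_emails):
    """Count how often each email address appears among the features
    extracted from one drive image; the dominant addresses usually
    point to the drive's primary user."""
    counts = Counter(addr.lower() for addr in extracted_emails)
    return counts.most_common(10)

# Hypothetical extracted data:
emails = (["owner@example.org"] * 120
          + ["list@vendor.example"] * 15
          + ["friend@example.net"] * 3)
for addr, n in address_histogram(emails):
    print(f"{n:5d}  {addr}")
```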

Cross-Drive Analysis (CDA)
- Cross-drive analysis is the term the authors coined to describe forensic analysis of a data set that spans multiple drives.
- The fundamental theory of cross-drive analysis is that data gleaned from multiple drives can improve the forensic analysis of the drive in question, both when the other drives are related to that drive and when they are not.
- There are two forms of CDA: first-order, in which the results of a feature extractor are compared across multiple drives, an O(n) operation; and second-order, in which the results are correlated, an O(n^2) operation. A sketch of both follows.
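The following is a hedged sketch of the two forms, assuming each drive's extracted features have already been reduced to a set of strings; the scoring functions and sample data are illustrative, not the paper's algorithms.

```python
from itertools import combinations

def first_order(drive_features, scorer):
    """First-order CDA (O(n)): score each drive independently,
    e.g. by how many sensitive features its feature set contains."""
    return {drive: scorer(feats) for drive, feats in drive_features.items()}

def second_order(drive_features):
    """Second-order CDA (O(n^2)): for every pair of drives, count the
    features they have in common; a large overlap suggests the drives
    are related (same owner or organization)."""
    scores = {}
    for (d1, f1), (d2, f2) in combinations(drive_features.items(), 2):
        scores[(d1, d2)] = len(set(f1) & set(f2))
    return scores

# Hypothetical data: drive id -> set of extracted features
drives = {
    "drive_A": {"alice@example.org", "bob@example.org"},
    "drive_B": {"alice@example.org", "carol@example.net"},
    "drive_C": {"dave@example.com"},
}
print(first_order(drives, scorer=len))
print(second_order(drives))
```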

Implementation
1. Disks collected are imaged into a single AFF file. (AFF is the Advanced Forensic Format, a file format for disk images that contains all of the data accession information, such as the drive's manufacturer and serial number, as well as the disk contents.)
2. The afxml program is used to extract drive metadata from the AFF file and build an entry in the SQL database.
3. Strings are extracted with an AFF-aware program in three passes: one for 8-bit characters, one for 16-bit characters in LSB format, and one for 16-bit characters in MSB format.
4. Feature extractors run over the string files and write their results to feature files.
5. Extracted features from newly ingested drives are run against a watch list; hits are reported to the human operator.
6. The feature files are read by indexers, which build indexes of the identified features in the SQL server. (A sketch of steps 4-6 appears below.)
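A minimal sketch of the indexing and watch-list steps (4-6), assuming one extracted feature per line in each feature file. The SQLite schema, table names, and file layout are assumptions for illustration; the paper's actual database design is not reproduced here.

```python
import sqlite3

# Hypothetical schema: one row per (drive, feature type, feature value).
conn = sqlite3.connect("cda.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS features (
        drive_id      TEXT,
        feature_type  TEXT,
        feature_value TEXT
    )""")
conn.execute("CREATE INDEX IF NOT EXISTS idx_value ON features(feature_value)")

def index_feature_file(drive_id, feature_type, path):
    """Load one feature file (one extracted feature per line) into the index."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        rows = [(drive_id, feature_type, line.strip())
                for line in fh if line.strip()]
    conn.executemany("INSERT INTO features VALUES (?, ?, ?)", rows)
    conn.commit()

def check_watch_list(drive_id, watch_list):
    """Report any feature from the newly ingested drive that appears
    on a watch list (step 5)."""
    placeholders = ",".join("?" for _ in watch_list)
    cur = conn.execute(
        f"SELECT feature_type, feature_value FROM features "
        f"WHERE drive_id = ? AND feature_value IN ({placeholders})",
        [drive_id, *watch_list])
    return cur.fetchall()
```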

Implementation
7. A multi-drive correlation is run to see if the newly accessioned drive contains features in common with any drives that are on a drive watch list (see the query sketch below).
8. A user interface allows multiple analysts to simultaneously interact with the database, to schedule new correlations to be run in batch mode, or to view individual sectors or recovered files from the drive images that are stored on the file server.
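Continuing the hypothetical schema above, a multi-drive correlation (step 7) can be expressed as a self-join: find every other drive that shares feature values with a watch-listed drive. This is a sketch under the assumed schema, not the paper's query.

```python
import sqlite3

QUERY = """
    SELECT other.drive_id, COUNT(DISTINCT other.feature_value) AS shared
    FROM features AS watched
    JOIN features AS other
      ON other.feature_value = watched.feature_value
     AND other.drive_id     <> watched.drive_id
    WHERE watched.drive_id = ?
    GROUP BY other.drive_id
    ORDER BY shared DESC
"""

def correlate_against(watched_drive_id, db_path="cda.db"):
    """Rank all other drives by how many feature values they share
    with the watch-listed drive."""
    with sqlite3.connect(db_path) as conn:
        return conn.execute(QUERY, (watched_drive_id,)).fetchall()
```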

Directions
- Improve feature extraction
- Improve the algorithms
- Develop end-to-end systems

Abstract of Paper 2
Establishing the time at which a particular event happened is a fundamental concern when relating cause and effect in any forensic investigation. Reliance on computer-generated timestamps for correlating events is complicated by uncertainty as to clock skew and drift, environmental factors such as location and local time zone offsets, as well as human factors such as clock tampering. Establishing that a particular computer's temporal behavior was consistent during its operation remains a challenge. The contributions of this paper are both a description of assumptions commonly made regarding the behavior of clocks in computers, and empirical results demonstrating that real-world behavior diverges from the idealized or assumed behavior. The authors present an approach for inferring the temporal behavior of a particular computer over a range of time by correlating commonly available local machine timestamps with another source of timestamps. They show that a general characterization of the passage of time may be inferred from an analysis of commonly available browser records.

Outline
- Introduction
- Factors to consider
- Drifting clocks
- Identifying computer timescales by correlation with corroborating sources
- Directions

Introduction
- Timestamps are increasingly used to relate events which happen in the digital realm to each other and to events which happen in the physical realm, helping to establish cause and effect.
- A difficulty with timestamps is how to interpret and relate the timestamps generated by separate computer clocks when they are not synchronized.
- Current approaches to inferring the real-world interpretation of timestamps assume idealized models of computer clock behavior.
- There is uncertainty about the behavior of the suspect computer's clock before seizure.
- The authors explore two themes related to this uncertainty:
  - Investigate whether it is reasonable to assume uniform behavior of computer clocks over time, and test these assumptions by attempting to characterize how computer clocks behave in the wild.
  - Investigate the feasibility of automatically identifying the local time on a computer by correlating timestamps embedded in digital evidence with corroborative time sources.

Factors
- Computer timekeeping
- Real-time synchronization
- Factors affecting timekeeping accuracy
  - Clock configuration
  - Tampering
  - Synchronization protocol
  - Misinterpretation
- Usage of timestamps in forensics

Drifting Clocks
- Enumerate the main factors influencing the temporal behavior of a computer's clock, and then attempt to experimentally validate whether one can make informed assumptions about such behavior.
- The authors do this by empirically studying the temporal behavior of a network of computers found in the wild.
- The subject of the case study is a network of machines in active use by a small business: a Windows 2000 domain consisting of one Windows 2000 server and a mix of Windows XP and 2000 workstations.
- The goal is to observe the temporal behavior. To do so, the authors constructed a simple service that logs both the system time of the host computer and the civil time for the location (a sketch of such a logger appears below).
- The program samples both sources of time and logs the results to a file. The logging program was deployed on all workstations and the server.
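The following is a minimal sketch of such a logging service. It assumes an NTP server stands in for the paper's "civil time" source (the authors' actual reference mechanism is not reproduced here), and the sampling interval, file format, and library choice (the third-party ntplib package) are all assumptions for illustration.

```python
import time
import ntplib  # third-party: pip install ntplib; used here as the reference clock

LOG_PATH = "clock_log.csv"
SAMPLE_INTERVAL = 3600  # seconds; the paper's sampling interval is not reproduced

def sample_once(client):
    """Record the host's system clock next to a reference clock so that
    skew and drift can be reconstructed later."""
    local = time.time()
    reference = client.request("pool.ntp.org", version=3).tx_time
    with open(LOG_PATH, "a", encoding="ascii") as fh:
        fh.write(f"{local:.3f},{reference:.3f},{local - reference:+.3f}\n")

if __name__ == "__main__":
    client = ntplib.NTPClient()
    while True:
        sample_once(client)
        time.sleep(SAMPLE_INTERVAL)
```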

Correlation
- An automated approach which correlates time-stamped events found on a suspect computer with time-stamped events from a more reliable, corroborating source.
- Web browser records are increasingly employed as evidence in investigations, and are a rich source of time-stamped data.
- The techniques implemented are the click-stream correlation algorithm and the non-cached correlation algorithm (a simplified offset-estimation sketch follows).
- The authors compare the results of both algorithms.
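To illustrate the underlying idea (not the paper's click-stream or non-cached algorithms), here is a minimal sketch that estimates a machine's clock offset from pairs of local browser-record timestamps and corroborated timestamps, such as the Date: header a web server sent with a cached response. The sample pairs are hypothetical.

```python
from statistics import median

def estimate_clock_offset(records):
    """Given (local_timestamp, corroborated_timestamp) pairs, e.g. the time
    a browser cache entry was written versus the server's Date: header,
    estimate how far the suspect machine's clock was from the source."""
    offsets = [local - corroborated for local, corroborated in records]
    return median(offsets)

# Hypothetical data: local cache times vs server Date headers (Unix seconds)
pairs = [(1_320_000_100, 1_320_000_040),
         (1_320_050_200, 1_320_050_138),
         (1_320_090_300, 1_320_090_241)]
print(f"Estimated local clock offset: {estimate_clock_offset(pairs):+.0f} s")
```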

Directions
- Need to determine whether the conditions and assumptions of the experiments are realistic
- What are the most appropriate correlation algorithms?
- Need to integrate with clock synchronization algorithms