An Automated Timeline Reconstruction Approach for Digital Forensic Investigations Written by Christopher Hargreaves and Jonathan Patterson Presented by Jason McKenzie November 8th, 2013

Introduction  Reconstruction: a process in which an event or series of events is carefully examined in order to find out or show exactly what happened (Merriam-Webster)  Provenance: the origin or source of something  Low-level PC event: a file modification, a registry key update  High-level PC event: connection of a USB device, such as a USB stick  Goal: construct a software prototype in Python that automatically reconstructs a timeline of events, using low-level events to infer high-level events and their provenance

Background  Reconstruction is an essential aspect of digital forensics  A key challenge in digital forensics is the large volume of information that needs to be analyzed  The population owns an increasing number of digital devices  Tools exist that automate the extraction phase of a digital investigation and are useful for examining events that have occurred  There is a demand for explaining the sequence of digital events, and a tool that automatically reconstructs the events and produces a timeline is needed

Related Work  Related work comprises solutions that incorporate some form of (non-automatic) timeline generation  Timelines based on file system times  Use metadata from file systems to create a timeline  Modified, Accessed, and Created (MAC) times  The Sleuth Kit generates a timeline from file activity  EnCase creates a graphical “Timeline” view  The times at which file contents are merely examined are not captured in metadata, which is a limitation
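The MAC-time expansion that tools like The Sleuth Kit perform can be sketched in a few lines of Python; the file paths and timestamps below are invented purely for illustration:

```python
from datetime import datetime

# Sample file metadata: (path, modified, accessed, created) — invented values
files = [
    ("C:/Users/alice/report.docx",
     datetime(2013, 5, 1, 9, 30), datetime(2013, 5, 2, 14, 0),
     datetime(2013, 4, 30, 8, 15)),
    ("C:/Windows/setupapi.log",
     datetime(2013, 5, 1, 9, 31), datetime(2013, 5, 1, 9, 31),
     datetime(2012, 1, 10, 12, 0)),
]

def mac_timeline(files):
    """Expand each file's M/A/C times into one timeline row per time, sorted
    chronologically — the essence of a file-activity timeline."""
    rows = []
    for path, m, a, c in files:
        rows += [(m, "modified", path), (a, "accessed", path), (c, "created", path)]
    return sorted(rows)

for when, what, path in mac_timeline(files):
    print(when.isoformat(), what, path)
```

Note the limitation the slide mentions: nothing in this metadata records when a file's contents were merely viewed without the access time being updated.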

Related Work (continued)  Timelines including times from inside files  Cyber Forensic Time Lab (CFTL)  Extracts system times from FAT and NTFS hard drives and some file types  Has incomplete source information for extracted events  Log2timeline  Has several enhancements and options that, when combined, can produce a timeline  Carbone and Bean addressed the need for a rich, event-filled timeline in their 2011 paper “Generating computer forensic super-timelines under Linux”  The key to creating an event-filled timeline is to capture more event times

Related Work (continued)  Visualizations  EnCase  Visual Timeline  Zeitline  Imports file system times from other programs through the use of Import Filters  Atomic events: events directly imported from the system  Complex events: comprised of atomic and other complex events  Allows filtering, searching, and combining atomic events into complex events  Aftertime  Performs enhanced timeline generation  Visualizes results as a histogram

Related Work (continued)  Summary  Importance of recovering times from inside files as well as using file system metadata  Two key challenges:  Too many events to analyze effectively  Difficult to visualize what is going on in the timeline due to the number of events  Highlighting patterns of activity to indicate areas of interest, and maintaining records of the source of extracted data, are both important

Methodology  As noted previously, the large volume of events creates a problem for analysis and an inability to visualize the timeline  To counteract this, an approach is being researched to automate the process of combining “low-level” events into “high-level” events  Automating the conversion of low-level to high-level events would produce a summary of activity that helps direct the investigation  To facilitate this, a software prototype was constructed

Methodology (continued)  Should existing frameworks be extended to accommodate a timeline reconstruction system?  It would take extensive work to build upon an existing framework such as log2timeline  Better to implement a new framework than to adjust existing data structures or work around legacy languages  Python 3 was chosen for this project due to the readability of its code

Design  Overall design  Python Digital Forensic Timeline (PyDFT)  Supports low-level event extraction and high-level event reconstruction  Also supports case management, conversion between different date and time formats, and basic GUIs

Design (continued)  Generation of low-level events  Overview  Low-level events are file system times and times extracted from within files  Analysis is performed on a mounted file system, NOT a disk image  The recommended approach is to mount the disk image in read-only mode using Linux or Mac OS X  Extraction of file system times  Master File Table ($MFT)  Accessed directly on Linux or Mac OS X using the NTFS driver from Tuxera  Created, modified, accessed, and entry-modified times from the Standard Information Attribute are used to build four events for each file
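The Standard Information Attribute stores each of its four timestamps as an NTFS FILETIME: a count of 100-nanosecond intervals since 1601-01-01 UTC. A minimal conversion helper (the sample raw value is invented):

```python
from datetime import datetime, timedelta

FILETIME_EPOCH = datetime(1601, 1, 1)

def filetime_to_datetime(filetime: int) -> datetime:
    """Convert an NTFS FILETIME (100-ns intervals since 1601-01-01 UTC)
    to a Python datetime. Integer-divide by 10 to get microseconds."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)

# Hypothetical raw value as it might be read out of an $MFT record
ft = 130171600000000000
print(filetime_to_datetime(ft))
```

Applying this to the four Standard Information Attribute timestamps of each $MFT record yields the four low-level events per file described above.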

Design (continued)  Generation of low-level events (continued)  Times from inside files  The Extraction Manager calls GetTimesFromInsideFiles() for each file in the mounted file system and checks for a matching time extractor  If one is found, it extracts information using the file pointer, file name, and file path  Any time information extracted is added to the low-level timeline  Time extractors cover browsing history from Chrome, Firefox, and Internet Explorer; Skype; Windows Live Mail; etc.
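As a sketch of what one such time extractor does, the snippet below pulls visit times from a Chrome-style History database. Chrome records `last_visit_time` as microseconds since 1601-01-01; the schema here is simplified (the real History file also has a separate `visits` table), and the demo row is invented:

```python
import sqlite3
from datetime import datetime, timedelta

def webkit_to_datetime(us: int) -> datetime:
    """Chrome stores visit times as microseconds since 1601-01-01 (UTC)."""
    return datetime(1601, 1, 1) + timedelta(microseconds=us)

def extract_chrome_history(con):
    """Yield (time, url, title) tuples from a Chrome History database.
    Simplified schema — a real extractor would handle more columns."""
    cur = con.execute("SELECT url, title, last_visit_time FROM urls")
    for url, title, t in cur:
        yield webkit_to_datetime(t), url, title

# Demo with an in-memory stand-in for a real History file
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE urls (url TEXT, title TEXT, last_visit_time INTEGER)")
con.execute("INSERT INTO urls VALUES ('https://example.com', 'Example', "
            "13030000000000000)")
events = list(extract_chrome_history(con))
```

Each yielded tuple would then be handed to the Extraction Manager and added to the low-level timeline.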

Design (continued)  Generation of low-level events (continued)  Parsers and bridges  Parsers: process raw data structures and recover data in a usable form  Bridges: take information from parsers and map it to a low-level event object  This design approach makes it easier to accommodate new parsers and makes parser code easier to reuse
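The parser/bridge split can be illustrated with a toy example. The "raw" format below is made up purely for illustration (a real parser would decode a binary structure); only the division of labour matters: the parser knows the artefact's format, the bridge knows the timeline's event shape.

```python
from datetime import datetime

def parse_run_record(raw: bytes) -> dict:
    """Parser: recover usable fields from a raw structure. The pipe-delimited
    format here is invented — a real parser decodes a binary layout."""
    name, run_time = raw.decode().split("|")
    return {"executable": name, "last_run": datetime.fromisoformat(run_time)}

def run_record_bridge(parsed: dict) -> dict:
    """Bridge: map the parser's output onto the common low-level event shape,
    so the parser can be reused outside the timeline system."""
    return {"date_time_min": parsed["last_run"],
            "date_time_max": parsed["last_run"],
            "type": "program_executed",
            "evidence": parsed["executable"]}

event = run_record_bridge(parse_run_record(b"CALC.EXE|2013-11-08T11:28:30"))
```

Adding support for a new artefact type then means writing one new parser and one thin bridge, without touching the rest of the pipeline.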

Design (continued)  Generation of low-level events (continued)  Traceability  If an extractor returns a low-level event, it also points to the raw data that produced the event  Different types of provenance depending on the event  Low-level event format  Different events have different provenance and different fields  id, date_time_min, date_time_max, evidence, provenance, etc.
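A minimal sketch of a low-level event object with a provenance pointer back to the raw data, using the field names listed on the slide; the `Provenance` fields and sample values are assumptions for illustration, not PyDFT's actual classes:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Provenance:
    """Pointer back to the raw data that produced an event (illustrative)."""
    source_file: str
    raw_offset: int
    raw_length: int

@dataclass
class LowLevelEvent:
    # Field names follow the slide: id, date_time_min, date_time_max, ...
    id: int
    date_time_min: datetime  # min/max bound the uncertainty of the event time
    date_time_max: datetime
    evidence: str
    provenance: Provenance

ev = LowLevelEvent(1,
                   datetime(2013, 11, 8, 11, 28, 30),
                   datetime(2013, 11, 8, 11, 28, 30),
                   "file created: report.docx",
                   Provenance("$MFT", raw_offset=0x4A000, raw_length=1024))
```

Because every event carries its provenance, any high-level conclusion can be traced back to bytes on the disk image.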

Design (continued)  Generation of low-level events (continued)  Backing store for the low-level timeline  A back-end store is required because events are held as instances of Python classes  SQLite was chosen as the backing store, allowing multiple advanced queries  Summary  The Extraction Manager extracts low-level events, which are converted to a standard format and added to the timeline  The timeline is stored in SQLite  Fields include date/time, provenance, and information about the raw data
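A minimal sketch of an SQLite backing store for low-level events; the schema and sample rows are assumptions based on the fields listed above, not PyDFT's actual schema:

```python
import sqlite3

con = sqlite3.connect(":memory:")  # on-disk in a real case
con.execute("""CREATE TABLE low_level_events (
    id INTEGER PRIMARY KEY,
    date_time_min TEXT, date_time_max TEXT,
    type TEXT, evidence TEXT, provenance TEXT)""")

# Invented sample events in ISO-8601 text form (sorts chronologically)
events = [
    (None, "2013-11-08T11:28:30", "2013-11-08T11:28:30",
     "url_visited", "google search: how to hack wifi", "Chrome History"),
    (None, "2013-11-08T11:29:02", "2013-11-08T11:29:02",
     "file_created", "downloaded archive", "$MFT record"),
]
con.executemany("INSERT INTO low_level_events VALUES (?,?,?,?,?,?)", events)

# Advanced queries come for free, e.g. everything in a two-minute window:
hits = con.execute("""SELECT evidence FROM low_level_events
    WHERE date_time_min BETWEEN '2013-11-08T11:28:00'
                            AND '2013-11-08T11:30:00'""").fetchall()
```

Storing ISO-8601 timestamps as text keeps range queries simple because lexicographic and chronological order coincide.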

Design (continued)  Reconstruction of high-level events  Overview  Predetermined rules in plug-in scripts automatically convert low-level events to high-level events  Basic event matching using test events  Querying SQLite directly requires knowledge of SQL  By creating a test event with all the conditions of the desired low-level event, it is possible to add events to the high-level timeline without extensive knowledge of SQL queries  A comparison match (not an exact match) is made between test events and low-level events  Matching field values can produce SQL searches for those fields, which then create high-level events
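The test-event idea can be sketched as follows: any field left unset acts as a wildcard, and the populated fields are turned into a SQL WHERE clause automatically, so an analyzer author never writes SQL by hand. Table layout and values are invented:

```python
import sqlite3

def match_test_event(con, test_event: dict):
    """Build and run a query from a test event's populated fields.
    Fields set to None are wildcards and are dropped from the WHERE clause."""
    fields = {k: v for k, v in test_event.items() if v is not None}
    where = " AND ".join(f"{k} = ?" for k in fields)
    sql = f"SELECT * FROM low_level_events WHERE {where}"
    return con.execute(sql, list(fields.values())).fetchall()

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE low_level_events (type TEXT, path TEXT, evidence TEXT)")
con.execute("INSERT INTO low_level_events VALUES "
            "('file_created', 'C:/pagefile.sys', 'system activity'),"
            "('url_visited', 'Chrome History', 'google search')")

# Test event: only `type` is constrained; the other fields match anything
rows = match_test_event(con, {"type": "url_visited", "path": None,
                              "evidence": None})
```

Field names here come from trusted analyzer scripts, not user input, which is why interpolating them into the WHERE clause is acceptable in this sketch.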

Design (continued)  Reconstruction of high-level events (continued)  Matching multiple artefacts  “Test events” serve as triggers, and any matches are used to construct a hypothesis of a high-level event  A low-level timeline is created in memory for a specific period determined by the analyzer  The analyzer searches for all low-level events occurring in this period  Matches that are found are considered supporting artefacts  Expected events that are not found are considered contradictory artefacts  One or more high-level events are created based upon these artefacts
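A sketch of this temporal-proximity matching, assuming a simple in-memory timeline of (time, type, detail) tuples; the event type names are invented for illustration:

```python
from datetime import datetime, timedelta

def analyze(timeline, trigger_type, expected_types,
            window=timedelta(minutes=2)):
    """For each trigger event, look for the expected low-level events within
    the window. Found ones become supporting artefacts; missing ones become
    contradictory artefacts. (A sketch of the idea, not PyDFT's code.)"""
    hypotheses = []
    for t, etype, detail in timeline:
        if etype != trigger_type:
            continue
        nearby = {e for s, e, _ in timeline if abs(s - t) <= window}
        supporting = [e for e in expected_types if e in nearby]
        contradictory = [e for e in expected_types if e not in nearby]
        hypotheses.append({"time": t, "supporting": supporting,
                           "contradictory": contradictory})
    return hypotheses

timeline = [
    (datetime(2013, 11, 8, 11, 28, 0), "setupapi_usb_entry", "VID/PID logged"),
    (datetime(2013, 11, 8, 11, 28, 5), "registry_usbstor_key", "device key"),
]
hyp = analyze(timeline, "setupapi_usb_entry",
              ["registry_usbstor_key", "mountpoints2_key"])
```

Here the missing MountPoints2 key would be reported as a contradictory artefact, weakening (but not discarding) the USB-connection hypothesis.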

Design (continued)  Reconstruction of high-level events (continued)  High-level event format  Similar to the low-level event format  Includes files, trigger_evidence_artefact, supporting_evidence_artefact, and contradictory_evidence_artefact  High-level timeline output  Not stored in SQLite  Exported to XML, with individual HTML reports for each high-level event
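The XML export of a single high-level event might look like the following sketch, using the field names from the slide (element layout and values are invented):

```python
import xml.etree.ElementTree as ET

# Field names follow the slide's high-level event format; values are invented
event = {
    "description": "Connection of a USB device",
    "date_time": "2013-11-08T11:28:00",
    "trigger_evidence_artefact": "setupapi.log entry",
    "supporting_evidence_artefact": "USBSTOR registry key",
    "contradictory_evidence_artefact": "",
}

# Build <high_level_timeline><event>…</event></high_level_timeline>
root = ET.Element("high_level_timeline")
ev = ET.SubElement(root, "event")
for name, value in event.items():
    ET.SubElement(ev, name).text = value
xml_out = ET.tostring(root, encoding="unicode")
```

An HTML report per event could be rendered from the same dictionary with any templating approach.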

Design (continued)  Reconstruction of high-level events (continued)  Summary  The timeline is searched using “test events” that have similarities to the desired low-level events  One or more matches lead to one or more high-level events  Since low-level event information is preserved, each high-level event can still point to the raw data that generated its low-level events  Produces two timelines  Low-level event timeline (not very readable)  High-level event timeline (human readable)

Results  Examples of high-level events constructed  Google searches  11:28:30 Google search for ‘how to hack wifi’  USB device connection  “Setup API entry for USB found (VID:07AB PID:FCF6 Serial:07A80207B128BE08)”

Results (continued)  Visualization  Since there are usually not a large number of high-level events, it is possible to use a third-party program like Timeflow to display them graphically  The example high-level timeline summarizes 2,894 underlying low-level events (which are, of course, not displayed individually)

Results (continued)  Performance  Measurements based on an Intel Core 2 Duo and 4–8 GB of RAM  1 million events, ~2 min per analyzer, 22 analyzers = 44 minutes to process 1 million events  Comparable to other indexing or searching forensic tools (“start the search and walk away”)  No plans to optimize performance

Evaluation  The results reinforce that matching “test events” against low-level events, an approach termed “temporal proximity pattern matching”, is effective at creating high-level events automatically  More analyzers and time extractors need to be developed to further demonstrate the feasibility of temporal proximity pattern matching  Low-level extractors still need to be implemented for some parts of the disk, such as the Recycle Bin  It needs to be determined whether keeping high-level provenance information is required, since the associated low-level provenance is preserved

Evaluation (continued)  Although performance is within limits compared to other forensic tools, a bottleneck exists because each analyzer searches through the timeline linearly for patterns  More analyzers mean a greater bottleneck  Needs optimization for multi-core processors  Optimizing SQLite secondary indexing could improve performance  A way of verifying that the target PC’s clock is correct needs to be implemented  The prototype needs more robust testing

Future work  Creation of more low-level event extractors  Creation of more analyzers  Formalizing low-level event information  Inputting data from other tools  Testing of framework against real world data  Adding complexity to analysis scripts, such as Bayesian networks  Development of more robust visual data tools for timelining

Conclusions  Illustrates the possibility of using pattern matching to automatically reconstruct high-level, human-understandable events, which in turn yields a readable visualization of the timeline  Preserves the provenance of low-level events  Not intended to replace a full forensic analysis by an experienced, trained analyst