Published by Sharon Wiggins. Modified over 9 years ago.
1
An Automated Timeline Reconstruction Approach for Digital Forensic Investigations
Written by Christopher Hargreaves and Jonathan Patterson
Presented by Jason McKenzie, November 8th, 2013
2
Introduction
  Reconstruction: a process in which an event or series of events is carefully examined in order to find out or show exactly what happened (Merriam-Webster)
  Provenance: the origin or source of something
  Low-level PC event: a file modification or a registry key update
  High-level PC event: the connection of a USB device, such as a USB stick
  Goal: construct a software prototype in Python that automatically reconstructs a timeline of events, using low-level events to infer high-level events and their provenance
3
Background
  Reconstruction is an essential aspect of digital forensics
  A key challenge in digital forensics is the large volume of information that needs to be analyzed
  People own an increasing number of digital devices
  Existing tools automate the extraction stage of a digital investigation and are useful for examining events that have occurred
  There is demand for explaining the sequence of digital events, so a tool that automatically reconstructs the events and produces a timeline is needed
4
Related Work
  Related work consists of solutions that incorporate some form of (non-automatic) timeline generation
  Timelines based on file system times
    Use metadata from file systems to create a timeline
    Modified, Accessed, and Created (MAC) times
    The Sleuth Kit generates a timeline from file activity
    EnCase provides a graphical “Timeline” view
    Limitation: times at which the contents of files are examined are not captured in metadata
5
Related Work (continued)
  Timelines including times from inside files
    Cyber Forensic Time Lab (CFTL)
      Extracts system times from FAT and NTFS hard drives and some file types
      Has incomplete source information for extracted events
    log2timeline
      Has several enhancements and options that, when combined, could produce a timeline
    Carbone and Bean addressed the need for a rich, event-filled timeline in their 2011 paper “Generating computer forensic super-timelines under Linux”
    The key to creating an event-filled timeline is to capture more event times
6
Related Work (continued)
  Visualizations
    EnCase visual timeline
    Zeitline
      Imports file system times from other programs through the use of Import Filters
      Atomic events: events directly imported from the system
      Complex events: comprised of atomic events and other complex events
      Allows filtering, searching, and combination of atomic events into complex events
    Aftertime
      Performs enhanced timeline generation
      Visualizes results as a histogram
7
Related Work (continued)
  Summary
    Recovering times from inside files, in addition to using file system metadata, is important
    Two key challenges:
      Too many events to analyze effectively
      The number of events makes it difficult to visualize what is going on in the timeline
    Highlighting patterns of activity to indicate areas of interest, and maintaining a record of the source of extracted data, are both important
8
Methodology
  As noted previously, the large volume of events makes analysis difficult and prevents effective visualization of the timeline
  To counteract this, the research investigates an approach that automatically combines “low-level” events into “high-level” events
  Automating the conversion of low-level to high-level events produces a summary of activity that helps direct the investigation
  To test this, a software prototype was constructed
9
Methodology (continued)
  Should existing frameworks be expanded to accommodate a timeline reconstruction system?
    Building upon an existing framework, such as log2timeline, would take extensive work
    A new framework avoids having to adjust data structures or work around legacy languages
  Python 3 was chosen for this project because of the readability of its code
10
Design
  Overall design
    Python Digital Forensic Timeline (PyDFT)
    Supports low-level event extraction and high-level event reconstruction
    Also supports case management, conversion between different date and time formats, and basic GUIs
11
Design (continued)
  Generation of low-level events
    Overview
      Low-level events are file system times and times extracted from within files
      Analysis is performed on a mounted file system, NOT on a disk image
      The recommended approach is to mount the disk image in read-only mode using Linux or Mac OS X
    Extraction of file system times
      Master File Table ($MFT)
      Accessed directly on Linux or Mac OS X using the NTFS driver from Tuxera
      The created, modified, accessed, and entry modified times from the Standard Information Attribute are used to build four events for each file
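As a rough illustration of how four events per file might be built from the Standard Information Attribute, the sketch below converts NTFS timestamps (Windows FILETIME values, which count 100-nanosecond intervals since 1601-01-01 UTC) into Python datetimes. The function names and the dictionary layout are assumptions made for this example and are not taken from PyDFT.

```python
from datetime import datetime, timedelta, timezone

# Windows FILETIME counts 100-nanosecond intervals since 1601-01-01 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime: int) -> datetime:
    """Convert a raw FILETIME integer to a timezone-aware datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def events_for_mft_record(path, created, modified, accessed, entry_modified):
    """Build one low-level event per Standard Information timestamp.

    The dictionary keys are hypothetical, chosen only for illustration.
    """
    stamps = {
        "Created": created,
        "Modified": modified,
        "Accessed": accessed,
        "Entry Modified": entry_modified,
    }
    return [
        {
            "type": f"File {label}",
            "path": path,
            "date_time": filetime_to_datetime(value),
            "provenance": "$MFT Standard Information Attribute",
        }
        for label, value in stamps.items()
    ]
```

Reading the raw values out of $MFT is left out here; the sketch only covers the step from four timestamps to four events.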
12
Design (continued)
  Generation of low-level events (continued)
    Times from inside files
      The Extraction Manager calls GetTimesFromInsideFiles() for every file in the mounted file system, and each file is checked against the available time extractors
      If an extractor is found, it extracts information using the file pointer, file name, and file path
      Any time information extracted is added to the low-level timeline
      Time extractors cover browsing history from Chrome, Firefox, and Internet Explorer, as well as Skype, Windows Live Mail, etc.
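The dispatch step described on this slide could look roughly like the sketch below: walk the mounted file system, ask each registered extractor whether it handles a file, and pass it the file pointer, name, and path. The extractor interface (`can_handle`, `extract`) and the toy `HistoryFileExtractor` are hypothetical; only the function name echoes GetTimesFromInsideFiles() from the slide.

```python
import os


class HistoryFileExtractor:
    """Hypothetical extractor that claims any file named 'History'."""

    def can_handle(self, name, path):
        return name == "History"

    def extract(self, fp, name, path):
        # A real extractor would parse timestamps; this one only records size.
        size = len(fp.read())
        return [{"type": "History file seen", "path": path, "size": size}]


def get_times_from_inside_files(mount_point, extractors):
    """Walk a mounted file system and collect events from matching extractors."""
    timeline = []
    for folder, _dirs, files in os.walk(mount_point):
        for name in files:
            path = os.path.join(folder, name)
            for extractor in extractors:
                if extractor.can_handle(name, path):
                    # Open read-only in binary mode; evidence is never modified.
                    with open(path, "rb") as fp:
                        timeline.extend(extractor.extract(fp, name, path))
    return timeline
```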
13
Design (continued)
  Generation of low-level events (continued)
    Parsers and bridges
      Parsers: process raw data structures and recover data in a usable form
      Bridges: take information from parsers and map it to a low-level event object
      This design makes it easier to accommodate new parsers and to reuse the code in the parsers
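A minimal sketch of the parser/bridge split follows. The parser knows only about a raw format (here a made-up "seconds, tab, URL" text format invented for this example), while the bridge knows only about the low-level event layout, so either side can be replaced or reused independently. All names are assumptions.

```python
from datetime import datetime, timezone


def parse_visits(raw: bytes):
    """Parser: recover records from a toy 'unix_seconds<TAB>url' format."""
    for line in raw.decode("utf-8").splitlines():
        if not line.strip():
            continue
        seconds, url = line.split("\t", 1)
        yield {"visit_time": int(seconds), "url": url}


def bridge_visit(record, source_path):
    """Bridge: map one parsed record onto a low-level event dictionary."""
    when = datetime.fromtimestamp(record["visit_time"], tz=timezone.utc)
    return {
        "date_time_min": when,
        "date_time_max": when,
        "type": "URL visited",
        "evidence": record["url"],
        "provenance": source_path,
    }
```

Because the parser returns plain records with no knowledge of the timeline, it could be reused by other tools, which is the reuse benefit the slide mentions.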
14
Design (continued)
  Generation of low-level events (continued)
    Traceability
      When an extractor returns a low-level event, the event also points to the raw data that produced it
      The type of provenance recorded depends on the event
    Low-level event format
      Different events have different provenance and therefore different fields
      Fields include id, date_time_min, date_time_max, evidence, provenance, etc.
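One way to picture the low-level event format is as a small Python dataclass. The fields `id`, `date_time_min`, `date_time_max`, `evidence`, and `provenance` come from the slide; the `type` field and the use of a free-form dictionary for provenance are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class LowLevelEvent:
    id: int
    date_time_min: datetime  # earliest time the event could have occurred
    date_time_max: datetime  # latest time the event could have occurred
    type: str                # hypothetical field, e.g. "File Created"
    evidence: str            # human-readable description of what was found
    # Provenance varies by event type, so it is kept as a free-form mapping
    # (for example source file, record number, byte offset).
    provenance: dict = field(default_factory=dict)
```

A file system event might carry an MFT record number in `provenance`, while a browser history event might carry a database path and row identifier.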
15
Design (continued)
  Generation of low-level events (continued)
    Backing store for the low-level timeline
      A backing store is required due to the use of Python classes
      SQLite was chosen as the backing store and allows multiple advanced queries
    Summary
      The Extraction Manager extracts low-level events, which are converted to a standard format and added to the timeline
      The timeline is stored in SQLite
      Stored fields include date/time, provenance, and information about the raw data
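The following sketch shows how a low-level timeline might be kept in SQLite using Python's standard `sqlite3` module. The table name, column set, and the choice to store times as ISO 8601 text (which sorts chronologically as plain strings) are assumptions for illustration, not the prototype's actual schema.

```python
import sqlite3


def open_timeline(db_path=":memory:"):
    """Open (or create) a timeline database with a hypothetical schema."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS low_level_events (
               id INTEGER PRIMARY KEY,
               date_time_min TEXT NOT NULL,
               date_time_max TEXT NOT NULL,
               type TEXT,
               evidence TEXT,
               provenance TEXT)"""
    )
    return conn


def add_event(conn, date_time_min, date_time_max, type_, evidence, provenance):
    """Insert one event and return its row id."""
    cur = conn.execute(
        "INSERT INTO low_level_events "
        "(date_time_min, date_time_max, type, evidence, provenance) "
        "VALUES (?, ?, ?, ?, ?)",
        (date_time_min, date_time_max, type_, evidence, provenance),
    )
    conn.commit()
    return cur.lastrowid


def events_between(conn, start, end):
    """Return events that fall entirely inside [start, end], oldest first."""
    return conn.execute(
        "SELECT id, date_time_min, type, evidence FROM low_level_events "
        "WHERE date_time_min >= ? AND date_time_max <= ? "
        "ORDER BY date_time_min",
        (start, end),
    ).fetchall()
```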
16
Design (continued)
  Reconstruction of high-level events
    Overview
      Predetermined rules, implemented as plug-in scripts, automatically convert low-level events to high-level events
    Basic event matching using test events
      Querying SQLite directly requires knowledge of SQL
      By creating a test event that carries all the conditions of the desired low-level event, events can be added to the high-level timeline without extensive knowledge of SQL queries
      Test events are compared with low-level events using a comparison match, not an exact match
      The field values to match are turned into SQL searches on those fields, and the results are used to create high-level events
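To make the idea of a test event concrete, the sketch below turns the populated fields of a test event into a parameterised SQL query, so the person writing a rule never has to write SQL. Using `LIKE` with wildcards as the "comparison match", the allow-list of field names, and the `low_level_events` table are all assumptions for this example.

```python
import sqlite3

# Only these (hypothetical) columns may appear in a test event.
ALLOWED_FIELDS = {"type", "evidence", "provenance"}


def build_query(test_event):
    """Turn the populated fields of a test event into SQL plus parameters."""
    clauses, params = [], []
    for name, wanted in test_event.items():
        if name not in ALLOWED_FIELDS:
            raise ValueError(f"unknown field: {name}")
        # Substring comparison rather than an exact match.
        clauses.append(f"{name} LIKE ?")
        params.append(f"%{wanted}%")
    where = " AND ".join(clauses) if clauses else "1"
    sql = (
        "SELECT id, type, evidence FROM low_level_events "
        f"WHERE {where} ORDER BY id"
    )
    return sql, params


def find_matches(conn, test_event):
    """Run the generated query against an open timeline database."""
    sql, params = build_query(test_event)
    return conn.execute(sql, params).fetchall()
```

A rule author would then write something like `find_matches(conn, {"type": "URL", "evidence": "google"})` and leave the SQL generation to the framework.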
17
Design (continued)
  Reconstruction of high-level events (continued)
    Matching multiple artefacts
      “Test events” serve as triggers, and any matches are used to construct a hypothesis of a high-level event
      A low-level timeline is created in memory for a specific period determined by the analyzer
      The analyzer searches for all expected low-level events occurring in this period
      Expected events that are found are considered supporting artefacts
      Expected events that are not found are considered contradictory artefacts
      One or more high-level events are created based upon these artefacts
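The trigger-and-window logic described on this slide can be sketched as follows. Each trigger match opens a time window; every other expected test event is then looked for inside that window and recorded as supporting if present or contradictory if absent. The event dictionaries, the substring matching rule, and the function names are assumptions for illustration rather than the prototype's analyzer code.

```python
from datetime import datetime, timedelta


def matches(event, test_event):
    """Comparison (substring) match of every populated test-event field."""
    return all(
        str(wanted).lower() in str(event.get(name, "")).lower()
        for name, wanted in test_event.items()
    )


def analyse(timeline, trigger, expected, window, description):
    """Create one high-level event hypothesis per trigger match."""
    high_level_events = []
    for hit in timeline:
        if not matches(hit, trigger):
            continue
        # In-memory slice of the low-level timeline around the trigger.
        nearby = [
            e for e in timeline
            if e is not hit and abs(e["date_time"] - hit["date_time"]) <= window
        ]
        supporting, contradictory = [], []
        for test_event in expected:
            found = [e for e in nearby if matches(e, test_event)]
            if found:
                supporting.extend(found)
            else:
                contradictory.append(test_event)  # expected but absent
        high_level_events.append({
            "description": description,
            "date_time": hit["date_time"],
            "trigger_evidence_artefact": hit,
            "supporting_evidence_artefact": supporting,
            "contradictory_evidence_artefact": contradictory,
        })
    return high_level_events
```

Keeping the unmatched test events alongside the matched ones lets an investigator see how strong each hypothesis is instead of receiving a bare yes or no.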
18
Design (continued)
  Reconstruction of high-level events (continued)
    High-level event format
      Similar to the low-level event format
      Includes files, trigger_evidence_artefact, supporting_evidence_artefact, and contradictory_evidence_artefact
    High-level timeline output
      Not stored in SQLite
      Exported to XML, with an individual HTML report for each high-level event
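As an illustration of exporting a high-level timeline to XML, the sketch below uses Python's standard `xml.etree.ElementTree`. The element names reuse the field names from the slide, but the overall XML layout and the `description` and `date_time_min` keys are assumptions; the prototype's real export schema is not given in the slides.

```python
import xml.etree.ElementTree as ET

ARTEFACT_FIELDS = (
    "trigger_evidence_artefact",
    "supporting_evidence_artefact",
    "contradictory_evidence_artefact",
)


def high_level_event_to_xml(event):
    """Serialise one high-level event dictionary to an XML element."""
    node = ET.Element("high_level_event")
    ET.SubElement(node, "description").text = event["description"]
    ET.SubElement(node, "date_time_min").text = event["date_time_min"]
    for field_name in ARTEFACT_FIELDS:
        # One child element per artefact reference.
        for artefact in event.get(field_name, []):
            ET.SubElement(node, field_name).text = str(artefact)
    return node


def export_timeline(events):
    """Return the whole high-level timeline as an XML string."""
    root = ET.Element("high_level_timeline")
    for event in events:
        root.append(high_level_event_to_xml(event))
    return ET.tostring(root, encoding="unicode")
```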
19
Design (continued)
  Reconstruction of high-level events (continued)
    Summary
      The timeline is searched using “test events” that resemble the desired low-level events
      One or more matches lead to one or more high-level events
      Because low-level event information is preserved, a high-level event can still point to the raw data that generated each low-level event
      Two timelines are produced
        Low-level event timeline (not very readable)
        High-level event timeline (human readable)
20
Results
  Examples of high-level events constructed
    Google searches
      11:28:30 Google search for ‘how to hack wifi’
    USB device connection
      “Setup API entry for USB found (VID:07AB PID:FCF6 Serial:07A80207B128BE08)”
21
Results (continued)
  Visualization
    Because there is usually not a large number of high-level events, a third-party program such as Timeflow can be used to display them graphically
    The high-level timeline shown on this slide covers a period in which 2894 low-level events occurred (these are not displayed)
22
Results (continued)
  Performance
    Measurements are based on Intel Core 2 Duo 2.28-28 GHz machines with 4-8 GB of RAM
    For 1 million events at ~2 min per analyzer, 22 analyzers take about 44 minutes
    Comparable to other indexing or searching forensic tools (“start search and walk away”)
    No plans to optimize performance
23
Evaluation
  The results reinforce that matching “test events” against low-level events, an approach described as “temporal proximity pattern matching”, is effective at creating high-level events automatically
  More analyzers and time extractors need to be developed to further demonstrate the feasibility of “temporal proximity pattern matching”
  Low-level extractors are still needed for some parts of the disk, such as the Recycle Bin
  It remains to be determined whether high-level provenance needs to be kept, since the associated low-level provenance is preserved
24
Evaluation (continued)
  Although performance is within limits compared to other forensic tools, a bottleneck exists because each analyzer searches through the timeline linearly for patterns
    More analyzers mean a greater bottleneck
    Optimization for multi-core processors is needed
    Optimizing SQLite secondary indexing could improve performance
  A way of verifying that the target PC’s clock is correct needs to be implemented
  The prototype needs more robust testing
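As a hedged illustration of the secondary-indexing point, the sketch below adds an index on a timestamp column and asks SQLite, via `EXPLAIN QUERY PLAN`, how it would run a time-window query. The table and index names are hypothetical, and whether indexing helps the actual prototype depends on its schema and query patterns, which the slides do not describe.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's query plan for a statement as one text string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable plan detail.
    return " ".join(str(row[-1]) for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE low_level_events ("
    "id INTEGER PRIMARY KEY, date_time_min TEXT, type TEXT, evidence TEXT)"
)
# Secondary index so time-window lookups need not scan every row.
conn.execute(
    "CREATE INDEX idx_events_time ON low_level_events (date_time_min)"
)

window_sql = (
    "SELECT * FROM low_level_events WHERE date_time_min BETWEEN ? AND ?"
)
plan_text = query_plan(
    conn, window_sql, ("2013-11-08T11:00:00", "2013-11-08T12:00:00")
)
print(plan_text)
```

If the printed plan names the index, each analyzer's time-window search can seek into the index instead of reading the whole table.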
25
Future work
  Creation of more low-level event extractors
  Creation of more analyzers
  Formalizing low-level event information
  Inputting data from other tools
  Testing of the framework against real-world data
  Adding complexity to analysis scripts, such as Bayesian networks
  Development of more robust visual data tools for timelining
26
Conclusions
  Illustrates that pattern matching can automatically reconstruct high-level, human-understandable events, which in turn yield a readable visualization of the timeline
  Preserves the provenance of low-level events
  Is not intended to replace a full forensic analysis by an experienced, trained analyst