Assembly 2005, Helsinki, July 2005
Crinkler - compressing Windows 4k intros to EXE files
Aske Simon Christensen, Rune L. H. Stubbe

Presentation transcript:


Overview
– Background
– Compression method
– Function import
– Header layout
– Demo
– Future plans

Why another one?
– Most common method: CAB dropping
– Dropping is a mess
– We want EXE files!
[Diagram: an EXE file passes through an EXE optimizer, a CAB compressor and a BAT inserter to produce a BAT file]

How is Crinkler different?
The normal build process:
[Diagram: C/C++ files and ASM files go through the compiler and assembler to object / library files, then through the linker and a cruncher to the EXE file]

How is Crinkler different?
The Crinkler way:
[Diagram: C/C++ files and ASM files go through the compiler and assembler to object / library files, then through Crinkler to the EXE file]

Why another one?
Control over code and data placement
– Choose base address
– Optimize order for best compression
– Separate code and data
– Put in extra code
  – Import code
  – Code transformations

Compression method
Context modelling
+ Much better compression ratio than LZX
+ Well suited for small amounts of data
+ Small decompression code (< 250 bytes)
+ Pays off even with the extra header
- Extremely slow
- Very memory-hungry

Data compression basics
– Take advantage of self-similarity
– Find patterns and eliminate them
– Dictionary compression
– Statistical compression

Dictionary compression
– LZ77: Refer repetitions back to original
– Reasonable compression ratio
– Fast compression
– Very fast decompression
Example: MISSISSIPPI, split as MISS | ISSI | PPI
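
The idea of referring repetitions back to earlier data can be illustrated with a minimal greedy LZ77-style parser. This is only a sketch: the function names, the token format and the `min_match` threshold are assumptions made for illustration and are not taken from the slides or from any particular compressor.

```python
def lz77_encode(data: str, min_match: int = 3):
    """Greedy LZ77-style parse: emit literals or (distance, length) back-references."""
    tokens = []
    pos = 0
    while pos < len(data):
        best_len, best_dist = 0, 0
        # Try every earlier start position and keep the longest match.
        for start in range(pos):
            length = 0
            # A match may run past `pos` (overlapping copy), as in classic LZ77.
            while (pos + length < len(data)
                   and data[start + length] == data[pos + length]):
                length += 1
            if length > best_len:
                best_len, best_dist = length, pos - start
        if best_len >= min_match:
            tokens.append((best_dist, best_len))
            pos += best_len
        else:
            tokens.append(data[pos])
            pos += 1
    return tokens


def lz77_decode(tokens):
    """Rebuild the text by copying `length` symbols from `distance` back."""
    out = []
    for tok in tokens:
        if isinstance(tok, tuple):
            dist, length = tok
            for _ in range(length):
                out.append(out[-dist])
        else:
            out.append(tok)
    return "".join(out)
```

With these assumptions, `lz77_encode("MISSISSIPPI")` yields the literals M, I, S, S, then the back-reference `(3, 4)` for ISSI, then the literals P, P, I, which mirrors the MISS | ISSI | PPI split in the example.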

Statistical compression
– Estimate probability distribution of each symbol based on earlier data
– PPM:
– Problem: local
Example string: MISSISSIPPI
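
To make "estimate the probability of each symbol based on earlier data" concrete, here is a small sketch of an adaptive order-k model that reports the ideal code length of a string. It is not PPM itself: it uses a single fixed context length with add-one (Laplace) smoothing, and it assumes the alphabet is known in advance. The function name and parameters are illustrative.

```python
import math
from collections import defaultdict


def adaptive_code_length(data: str, order: int = 1) -> float:
    """Ideal code length in bits when each symbol is predicted from the
    `order` symbols before it, using counts gathered so far."""
    alphabet_size = len(set(data))  # assumption: alphabet known to both sides
    counts = defaultdict(lambda: defaultdict(int))
    total_bits = 0.0
    for i, sym in enumerate(data):
        ctx = data[max(0, i - order):i]
        seen = counts[ctx]
        # Add-one smoothing so unseen symbols never get probability zero.
        p = (seen[sym] + 1) / (sum(seen.values()) + alphabet_size)
        total_bits += -math.log2(p)
        seen[sym] += 1  # the model adapts after coding each symbol
    return total_bits
```

On strongly patterned input such as `"AB" * 8`, the order-1 model needs far fewer bits than the order-0 model, because the previous symbol determines the next one.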

Context modelling
– Generalization of PPM
– Look at combinations of recent symbols
– A bit mask describes a model
– Problem: Many masks to choose from
Example string: MISSISSIPPI
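
One way to read "a bit mask describes a model" is that each bit of the mask says whether a given earlier byte takes part in the context. The sketch below assumes an 8-bit mask where bit 0 stands for the most recent byte; that bit convention and the zero padding at the start of the data are assumptions, not details from the slides.

```python
def select_context(history: bytes, mask: int) -> bytes:
    """Return the earlier bytes selected by `mask`.

    Bit 0 of the mask selects the byte at distance 1 (most recent),
    bit 1 the byte at distance 2, and so on up to distance 8.
    Positions before the start of the data are padded with 0.
    """
    picked = bytearray()
    for distance in range(1, 9):
        if mask & (1 << (distance - 1)):
            if distance <= len(history):
                picked.append(history[-distance])
            else:
                picked.append(0)
    return bytes(picked)
```

Under this convention the masks `0b1`, `0b11` and `0b111` select the last one, two and three bytes, which are the contiguous contexts PPM uses, while a mask such as `0b101` skips a byte and gives a context PPM cannot express.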

Implementation
– Estimation for each single bit
– Context is current byte + selection of last 8
– Estimate the best collection of masks
– Estimate the best weights of the masks
– Keep track of contexts in a hash table
– Ignore hash collisions
– Find hash table size with few collisions
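
The points above can be combined into a small estimator that predicts one bit at a time, keeps counters in a fixed-size hash table and ignores collisions. This is a sketch under several assumptions: the mixing rule (a weighted sum of counts), the use of CRC-32 as the hash and the table size are illustrative choices, not Crinkler's actual method.

```python
import math
import zlib


def estimate_compressed_bits(data: bytes, models, table_bits: int = 16) -> float:
    """Ideal coded size in bits under a simple mixed bit-wise context model.

    `models` is a list of (mask, weight) pairs; bit 0 of a mask selects
    the previous byte, bit 1 the byte before that, and so on.
    """
    size = 1 << table_bits
    counts = [[0, 0] for _ in range(size)]  # per slot: [zeros seen, ones seen]
    total_bits = 0.0
    for pos, byte in enumerate(data):
        partial = 1  # bits of the current byte seen so far, with a leading 1
        for shift in range(7, -1, -1):
            bit = (byte >> shift) & 1
            slots = []
            n = [0.0, 0.0]
            for mask, weight in models:
                # Context = mask id + partial current byte + selected earlier bytes.
                key = bytearray([mask, partial])
                for distance in range(1, 9):
                    if mask & (1 << (distance - 1)):
                        key.append(data[pos - distance] if distance <= pos else 0)
                # Collisions are ignored: different contexts may share a slot.
                slot = zlib.crc32(bytes(key)) % size
                slots.append(slot)
                n[0] += weight * counts[slot][0]
                n[1] += weight * counts[slot][1]
            # Probability of the bit that actually occurred, lightly smoothed.
            p_bit = (n[bit] + 0.5) / (n[0] + n[1] + 1.0)
            total_bits -= math.log2(p_bit)
            for slot in slots:
                counts[slot][bit] += 1
            partial = (partial << 1) | bit
    return total_bits
```

Calling it with different mask and weight sets on the same data gives a way to compare candidate models, which is the kind of search the "estimate the best collection of masks" step implies.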

Function import
Import by name: Name of each function
– The import table is a big part of an EXE file
Import by ordinal: Number instead of name
– Much smaller but quite incompatible
Import by hash: Hash code of each function
– Small and compatible
– Not supported directly
Import by hashed ordinal range
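
Import by hash can be sketched as follows: the executable stores only a fixed-size hash per function, and at startup the loader code walks the DLL's exported names until it finds one with the same hash. The rotate-and-xor hash below is a hypothetical stand-in, and the export list is a plain Python list rather than a real PE export table.

```python
def name_hash(name: str) -> int:
    """Hypothetical 32-bit hash of a function name (rotate left 5, then xor)."""
    h = 0
    for ch in name.encode("ascii"):
        h = (((h << 5) | (h >> 27)) & 0xFFFFFFFF) ^ ch
    return h


def resolve_by_hash(export_names, wanted_hash):
    """Return the first exported name whose hash matches, or None."""
    for name in export_names:
        if name_hash(name) == wanted_hash:
            return name
    return None


# Stand-in for the names a DLL exports.
exports = ["ExitProcess", "CreateFileA", "LoadLibraryA", "GetProcAddress"]
stored = name_hash("ExitProcess")  # 4 bytes stored instead of a 12-byte string
found = resolve_by_hash(exports, stored)
```

The saving comes from storing a few bytes per function instead of its full name, while still working across DLL versions in which ordinals differ.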

Header optimization
[Diagram, built up over several slides: header layout showing DOS header, section header, PE offset, DOS stub, PE header and data directories]
– Initial layout: 544 bytes!
– With fields marked "Ignored": 196 bytes!
– With fields marked "Hash code": 124 bytes + 18 hash codes!

Demo

Future plans
– Windows 2000 compatibility
– Even better compression
– Section reordering
– Transformations
– More feedback
– 64k specialized version

Thank you
Questions? Comments? Suggestions?