PRESENTED BY: ILYA NELKENBAUM KEREN ARMON SUPERVISOR: MR. YOSSI KANIZO 09/03/2011 Cuckoo the Kicking Bird 1.

Presentation transcript:

PRESENTED BY: ILYA NELKENBAUM KEREN ARMON SUPERVISOR: MR. YOSSI KANIZO 09/03/2011 Cuckoo the Kicking Bird 1

Motivation Modern networking systems:  Increasing traffic rates.  Packet processing at the switching level is essential, and in some cases crucial. Memory access time becomes more critical.  Fast memory is very expensive and limited in size. All this requires faster and more efficient data structures. 2

Motivation (2) Applications can be found in wire-speed communication, high-speed packet processing, large data centers, etc. Hash-based data structures are extremely useful for this type of problem.  In particular, hash tables. Traditional data structures are not efficient enough. 3

Cuckoo Hashing 4 A new approach for handling collisions.

Cuckoo Hashing YZ X Insert X H_1(X)=1 H_2(X)=4 5

Cuckoo Hashing XZ Y Insert Y H_1(Y)=1 H_2(Y)=7 6

Cuckoo Hashing XZY 7

XZY Find(Y) H_1(Y)=1 H_2(Y)=7 Found! 8

Cuckoo Hashing – Description Basic scheme: each element gets d possible locations. To insert x, check all d locations for x. If one is empty, insert x there. If all are full, x kicks out an old element y. Then y moves to one of its other locations. If all of those are full, y kicks out z, and so on, until an empty slot is found. 9
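
The insertion scheme described above can be sketched in a few lines. This is a minimal illustration in Python, not the project's C# code. The class name CuckooTable, the salted use of Python's built-in hash, the random choice of which element to kick, and the bucket height of 1 are all assumptions made for the example. Elements that still have no slot after the kick limit is reached go to an overflow list that stands in for the CAM.

```python
import random


class CuckooTable:
    """Toy cuckoo hash table (hypothetical sketch): d hash functions, one slot per bucket."""

    def __init__(self, m, d, max_kicks, seed=0):
        self.m = m                    # number of buckets
        self.d = d                    # number of hash functions
        self.max_kicks = max_kicks    # kick limit (b in the slides)
        self.slots = [None] * m
        self.cam = []                 # overflow store standing in for the CAM
        self.rng = random.Random(seed)
        # Assumption: one random salt per hash function.
        self.salts = [self.rng.getrandbits(32) for _ in range(d)]

    def locations(self, x):
        """Return the d candidate bucket indices of x."""
        return [hash((salt, x)) % self.m for salt in self.salts]

    def insert(self, x):
        """Insert x and return the number of kicks that were needed."""
        kicks = 0
        while True:
            locs = self.locations(x)
            for loc in locs:
                if self.slots[loc] is None:
                    self.slots[loc] = x
                    return kicks
            if kicks >= self.max_kicks:
                self.cam.append(x)    # give up: the homeless element goes to the CAM
                return kicks
            # All candidates are full: x takes a slot and the old occupant is re-inserted.
            victim_loc = self.rng.choice(locs)
            x, self.slots[victim_loc] = self.slots[victim_loc], x
            kicks += 1

    def find(self, x):
        """Look in the d candidate buckets, then in the CAM."""
        return any(self.slots[loc] == x for loc in self.locations(x)) or x in self.cam


table = CuckooTable(m=100, d=2, max_kicks=20)
for element in range(50):
    table.insert(element)
```

A lookup touches at most d buckets plus the CAM, which is why the slides measure cost in memory accesses and kicks.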

Hash Basics Hash memory includes: Basic hash parameters:  m – number of buckets.  h – bucket height.  D – number of memory segments.  n – number of elements.  d – number of hash functions.  b – maximum number of kicks. 10
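
As an illustration, the six parameters can be grouped into one immutable configuration object. This is a hypothetical Python sketch (the project itself was written in C#). The class name HashParams and the derived capacity and load properties are assumptions; in particular, it assumes m counts all buckets in the table, so that capacity is m times h.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HashParams:
    """Hypothetical container for the hash parameters listed on the slide."""
    m: int  # number of buckets
    h: int  # bucket height (elements per bucket)
    D: int  # number of memory segments
    n: int  # number of elements
    d: int  # number of hash functions
    b: int  # maximum number of kicks

    def __post_init__(self):
        for name in ("m", "h", "D", "n", "d"):
            if getattr(self, name) < 1:
                raise ValueError(f"{name} must be at least 1")
        if self.b < 0:
            raise ValueError("b must be non-negative")

    @property
    def capacity(self):
        # Assumption: m is the total bucket count, so the table holds m * h elements.
        return self.m * self.h

    @property
    def load(self):
        # Fraction of the table that n elements would occupy.
        return self.n / self.capacity


# Example: a table twice the size of the number of elements.
params = HashParams(m=1000, h=1, D=2, n=500, d=2, b=10)
```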

Objectives Cuckoo’s:  Reduce the number of memory accesses.  The number of accesses translates to the number of kicks.  Better memory utilization.  According to mathematical analysis, for a table twice the size of the number of elements, there will be zero elements in the CAM. Project:  Test the performance of a parallel Cuckoo implementation against a sequential one, in terms of memory accesses, in several system configurations. 11

Implementation Platform OOP language: C# (using Microsoft Visual Studio)  OOP  Generic data structures (Queue).  Garbage collector  GUI  Unfamiliar language. Version control system:  Using the lab facilities (SVN). 12

Class Diagram 13

Memory Structures Hash Table:  Memory segment  API to memory Segments:  For each segment: operation queue CAM:  Content Addressable Memory 14

Cuckoo Logic Implements an abstract Cuckoo scheme. Is the parent class of:  Naive Cuckoo  NaiveParallel Cuckoo  Parallel Cuckoo Contains properties:  CAM  Operation queue (filled by simulations)  Hash set – a collection of randomized hash functions  Statistics Methods:  doQueue (virtual).  Get Statistics methods 15

Simulation Flow 16 Input of hash table parameters (including Cuckoo constants). Generating operations and inserting them into the hash table operations queue. Executing the operations with the selected Cuckoo Logic. Extracting flow data and processing it according to simulation type.

Naive Cuckoo Logic 17 Implements naive execution of operations:  Get first operation.  Execute sequentially for each hash function of element.  When finished, get next operation. Methods:  Enqueue  doQueue  addElement Was implemented first.
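
The sequential behaviour described above can be sketched as follows. This is a simplified Python illustration, not the project's C# doQueue: the function name naive_do_queue and the toy hash functions are assumptions, each probe is counted as one memory access, and kicking is left out so that only the probing order is shown.

```python
from collections import deque


def naive_do_queue(operations, slots, hash_functions):
    """Run insert operations one at a time, probing candidate locations in order."""
    accesses = 0
    failed = []
    while operations:
        x = operations.popleft()          # get the next operation
        for hash_fn in hash_functions:    # try the hash functions sequentially
            accesses += 1                 # each probe costs one memory access
            loc = hash_fn(x) % len(slots)
            if slots[loc] is None:
                slots[loc] = x
                break
        else:
            failed.append(x)              # every candidate bucket was full
    return accesses, failed


slots = [None] * 8
# Toy hash functions, chosen only to force collisions in the example.
hash_functions = [lambda x: x, lambda x: x // 8 + 3]
operations = deque([1, 9, 17])
accesses, failed = naive_do_queue(operations, slots, hash_functions)
```

Here the first element is placed on its first probe, while the other two collide in bucket 1 and need a second probe each, for five memory accesses in total.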

Naive Parallel Cuckoo Logic 18 Implements parallel examination of the different hash functions for each element:  Get first operation.  Request execution of all hash functions simultaneously.  Save the first success and drop all others.  When finished, get next operation. Methods:  Enqueue  doQueue  addElement Was implemented second.
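
A minimal sketch of one such parallel step is shown below, in Python rather than the project's C#. It assumes that each hash function addresses its own memory segment (D = d), so that all d buckets can be read in the same cycle. The function name parallel_probe_insert and the toy hash functions are hypothetical.

```python
def parallel_probe_insert(segments, hash_functions, x):
    """Probe one bucket in every segment during the same cycle.

    Returns (placed, cycles, memory_accesses).
    """
    # All d reads are issued together: one cycle, d memory accesses.
    probes = [(seg_id, fn(x) % len(segments[seg_id]))
              for seg_id, fn in enumerate(hash_functions)]
    hits = [(seg_id, idx) for seg_id, idx in probes
            if segments[seg_id][idx] is None]
    if hits:
        seg_id, idx = hits[0]   # save the first success, drop all others
        segments[seg_id][idx] = x
    return bool(hits), 1, len(probes)


segments = [[None] * 4 for _ in range(2)]
# Toy hash functions, chosen only to force collisions in the example.
hash_functions = [lambda x: x, lambda x: x + 1]
results = [parallel_probe_insert(segments, hash_functions, x) for x in (0, 4, 8)]
```

Compared with sequential probing, every attempt costs d memory accesses but only one cycle, which is the trade-off this logic is meant to expose.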

Parallel Cuckoo Logic 19 Implements parallel execution of different operations:  Consider all segments as a pipelined system.  Each cycle (one memory access), all segments execute an operation.  In case of failure, the operation is transferred to the next segment.  In case of success, a new operation is pulled from the operations queue. Methods:  AddNewOper  CheckResult  DoQueue Was implemented last.
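
The pipelined schedule can be illustrated with a small cycle-by-cycle simulation. This is a Python sketch under stated assumptions, not the project's C# implementation: the function name run_pipeline and the max_hops limit are hypothetical, a new operation starts at whichever segment is idle, kicking is omitted, and an element that fails max_hops times is sent to a list standing in for the CAM.

```python
from collections import deque


def run_pipeline(elements, num_segments, buckets_per_segment, max_hops):
    """Simulate segments working as pipeline stages, one probe per segment per cycle."""
    segments = [[None] * buckets_per_segment for _ in range(num_segments)]
    pending = deque(elements)
    stage = [None] * num_segments      # (element, failed probes so far) at each segment
    cam = []
    cycles = 0
    while pending or any(op is not None for op in stage):
        # Idle segments pull a new operation from the operations queue.
        for s in range(num_segments):
            if stage[s] is None and pending:
                stage[s] = (pending.popleft(), 0)
        cycles += 1                    # one cycle = one memory access per busy segment
        next_stage = [None] * num_segments
        for s, op in enumerate(stage):
            if op is None:
                continue
            x, hops = op
            idx = hash((s, x)) % buckets_per_segment
            if segments[s][idx] is None:
                segments[s][idx] = x   # success: this segment is free next cycle
            elif hops + 1 >= max_hops:
                cam.append(x)          # assumption: give up after max_hops failures
            else:
                # Failure: transfer the operation to the next segment.
                next_stage[(s + 1) % num_segments] = (x, hops + 1)
        stage = next_stage
    return segments, cam, cycles


segments, cam, cycles = run_pipeline(range(200), num_segments=4,
                                     buckets_per_segment=100, max_hops=4)
```

Because every failed operation moves from segment s to segment s + 1, no two operations ever arrive at the same segment in the same cycle, so no arbitration is needed in this sketch.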

API 20 Console API GUI API

Simulation Classes Main Class Simulations  Two main simulation classes:  Fast_simulation – samples the stats data after each operation  Regular_Simulation – samples the stats after all operations were executed.  GUI uses only fast simulation Constants  Define all constants – m, n, d, D, b, h.  Can be modified. 21
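
The difference between the two simulation classes comes down to when the statistics are sampled. A hedged Python sketch (the project used C#) is given below. The function run_simulation, its parameters, and the toy table built on a set are all hypothetical names for illustration.

```python
def run_simulation(elements, insert, read_stats, sample_each_operation):
    """Insert all elements and return the list of sampled statistics.

    sample_each_operation=True mimics Fast_simulation (sample after every
    operation); False mimics Regular_Simulation (sample once at the end).
    """
    samples = []
    for x in elements:
        insert(x)
        if sample_each_operation:
            samples.append(read_stats())
    if not sample_each_operation:
        samples.append(read_stats())
    return samples


def make_toy_table():
    """Stand-in for a hash table: stores elements and reports how many it holds."""
    stored = set()
    return stored.add, lambda: len(stored)


insert, read_stats = make_toy_table()
fast_samples = run_simulation(range(5), insert, read_stats, True)

insert, read_stats = make_toy_table()
regular_samples = run_simulation(range(5), insert, read_stats, False)
```

Sampling after every operation yields a full curve (for example, CAM load against the number of inserted elements), which is presumably why the GUI relies on the fast simulation only.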

CAM Load by number of elements 22 In this type of simulation the insertion scenario runs until the number of inserted elements equals the number of buckets (m = 1000). Panels: D = d = 1; D = d = 2; D = d = 3

CAM Load by number of kicks 23 In this type of simulation we sweep the limit of the number of kicks allowed and each time insert 1000 elements into the hash table.

Memory Access by number of elements 24 In this type of simulation an insertion scenario is executed according to the given parameters. The number of memory accesses is shown as a function of the number of inserted elements. Panels: D = d = 1; D = d = 2; D = d = 10

Number of kicks by number of elements 25 This type of simulation executes the insertion scenario according to the given parameters; the result is the number of kicks made as a function of the number of inserted elements.

Achievements 26

Future Development 27 One of the main goals of the project was to provide modular code for implementing and running different Cuckoo Logics over the same hash table. Thanks to this modularity, the following features can be added in the future:  Additional Cuckoo Logic approaches.  Additional hash table operations (find and delete).  Mixed-operation scenarios.

Gantt Ramp up on problem, algorithm, terminology and theory. Designing the project, Get to know C#. Getting green light from supervisor. Implementation of ‘naive’ Cuckoo. Running simulations and analyzing results. Iterative improvements. Implementation of naive parallel Cuckoo. Running simulations and analyzing results. Iterative improvements. Implementation of parallel Cuckoo. Creating a GUI for configurations of simulations. Creating a GUI to review results. Project summary. 28 EXP X 14 REAL …

Thank You Yossi ! 29

30 API