INFOMR Project Dafne van Kuppevelt ● Vikram Doshi ● Seçkin Savaşçı Development Review.


Heat Kernel Signature
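The slide names the Heat Kernel Signature without defining it. For context (this is the standard definition from Sun, Ovsjanikov, and Guibas, not text from the slides), the HKS is built from exactly the Laplacian eigenvalues and eigenvectors that the rest of this review is about computing:

```latex
% Heat Kernel Signature of vertex x at diffusion time t, where
% (\lambda_i, \phi_i) are eigenpairs of the mesh Laplacian L = D - A.
\mathrm{HKS}(x, t) = \sum_{i \ge 0} e^{-\lambda_i t}\, \phi_i(x)^2
```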

Before Development
 C#
- faster development
- the .NET framework's facilities
- our experience with C# and similar languages (Java)
 x86 architecture
 Visual Studio IDE
 Google Code for project hosting
- easy to start
- complete package for project management
- supports git, svn & mercurial

Initial Parsing: Getting Laplacian Matrices from OFF Files
L = D - A (degree matrix minus adjacency matrix)
 Storage problem: a dense 45000 × 45000 matrix at 32 bits per entry is about 7.54 GB; storing only the upper triangle of the adjacency matrix at 1 bit per entry needs 45000 × 44999 / 2 bits ≈ 120 MB, roughly a 64× compression
 Trade-off: the compressed representation adds computational overhead
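The pipeline on this slide (OFF file → degrees and adjacency → L = D - A, with the adjacency upper triangle packed at 1 bit per vertex pair) can be sketched as follows. This is our illustrative reconstruction in C++, not the project's actual C# parser, and all names are hypothetical:

```cpp
// Sketch: parse a minimal OFF mesh and build the pieces of L = D - A,
// storing A's upper triangle as a packed bitset (~1 bit per pair).
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

struct SparseLaplacian {
    int n = 0;
    std::vector<int> degree;     // diagonal of D
    std::vector<uint8_t> upper;  // packed bits of A's upper triangle

    // Map a pair (i, j) with i < j to a linear index into the upper triangle.
    static size_t pairIndex(int i, int j, int n) {
        return (size_t)i * n - (size_t)i * (i + 1) / 2 + (j - i - 1);
    }
    void setEdge(int i, int j) {
        if (i == j) return;
        if (i > j) std::swap(i, j);
        size_t k = pairIndex(i, j, n);
        if (upper[k / 8] & (1u << (k % 8))) return;  // edge already recorded
        upper[k / 8] |= (1u << (k % 8));
        ++degree[i];
        ++degree[j];
    }
};

SparseLaplacian parseOff(std::istream& in) {
    std::string magic;
    int nv, nf, ne;
    in >> magic >> nv >> nf >> ne;  // "OFF" header, then counts
    SparseLaplacian L;
    L.n = nv;
    L.degree.assign(nv, 0);
    L.upper.assign((size_t)nv * (nv - 1) / 2 / 8 + 1, 0);
    for (int v = 0; v < nv; ++v) {  // skip vertex coordinates
        double x, y, z;
        in >> x >> y >> z;
    }
    for (int f = 0; f < nf; ++f) {  // every face contributes its boundary edges
        int k;
        in >> k;
        std::vector<int> idx(k);
        for (int& v : idx) in >> v;
        for (int a = 0; a < k; ++a)
            L.setEdge(idx[a], idx[(a + 1) % k]);
    }
    return L;
}
```

For a tetrahedron (the complete graph on 4 vertices) every vertex ends up with degree 3, and the packed upper triangle needs only 6 bits.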

Initial Parsing: Storage Problem (continued)

Human-readable matrix file:
- usable across different language implementations
- storage-inefficient (string conversion, whitespace, line feeds)
- needs our own parsing method
- parsing time (integers as text): ~2 hours; the larger files cannot be parsed

Serialization to files:
- language-specific storage
- storage-efficient
- deserialize and use!
- parsing time (bitwise): ~2 mins; all files can be parsed
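The gap between the two columns comes down to bytes: text stores every digit plus separators, while bitwise serialization stores the raw 4-byte values. A small illustrative sketch (ours, not the project's C# serializer):

```cpp
// Compare a text rendering of an integer matrix with its raw-byte
// serialization, and show that the raw bytes round-trip losslessly.
#include <cassert>
#include <cstdint>
#include <cstring>
#include <sstream>
#include <string>
#include <vector>

// Human-readable form: digits plus a separator per value.
std::string asText(const std::vector<int32_t>& v) {
    std::ostringstream out;
    for (int32_t x : v) out << x << ' ';
    return out.str();
}

// Bitwise form: exactly sizeof(int32_t) = 4 bytes per value.
std::string asBinary(const std::vector<int32_t>& v) {
    return std::string(reinterpret_cast<const char*>(v.data()),
                       v.size() * sizeof(int32_t));
}

// "Deserialize and use!": copy the raw bytes straight back.
std::vector<int32_t> fromBinary(const std::string& s) {
    std::vector<int32_t> v(s.size() / sizeof(int32_t));
    std::memcpy(v.data(), s.data(), s.size());
    return v;
}
```

A six-digit value costs 7 bytes as text ("123456 ") but always 4 bytes in binary, and the binary form needs no parsing on the way back in.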

Initial Parsing: Getting Eigenvalues & Eigenvectors from Laplacian Matrices
 We tried to implement our own eigendecomposer  FAILED
 We then searched for a suitable library. Our needs were:
- a structure for storing sparse symmetric matrices
- an eigendecomposition method specialized for sparse symmetric matrices
 partially FAILED
 We switched to trial & error to find a good library; our goals were now speed and low memory usage.
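To illustrate why writing your own eigendecomposer is hard: even the simplest iterative method, power iteration, recovers only the single dominant eigenpair, while this application needs many eigenpairs of a large sparse symmetric matrix, which is what the specialized library routines the slide looks for provide. A minimal sketch (ours, not the failed in-house decomposer), valid for positive semi-definite matrices such as graph Laplacians:

```cpp
// Power iteration: repeatedly apply A and renormalize; for a symmetric
// positive semi-definite A the norm of A*v converges to the largest eigenvalue.
#include <cassert>
#include <cmath>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

double powerIteration(const Matrix& A, int iters = 200) {
    size_t n = A.size();
    std::vector<double> v(n);
    for (size_t i = 0; i < n; ++i) v[i] = (double)(i + 1);  // non-degenerate start
    double lambda = 0.0;
    for (int it = 0; it < iters; ++it) {
        std::vector<double> w(n, 0.0);
        for (size_t i = 0; i < n; ++i)          // w = A * v
            for (size_t j = 0; j < n; ++j)
                w[i] += A[i][j] * v[j];
        double norm = 0.0;
        for (double x : w) norm += x * x;
        norm = std::sqrt(norm);
        for (size_t i = 0; i < n; ++i) v[i] = w[i] / norm;
        lambda = norm;  // ||A v|| -> largest eigenvalue for PSD A, unit v
    }
    return lambda;
}
```

The Laplacian of a single edge, {{1, -1}, {-1, 1}}, has eigenvalues 0 and 2, and the sketch converges to 2.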

Initial Parsing: Library Comparison

Experiment results with the 2000-vertex model:

Library       Spec                          Status   Time       Memory
Math.NET      C#, x86, seq., LAPACK         xGPL     20+ mins   Average
DotNumerics   C#, x86, seq., LAPACK port    xGPL     2 mins     Average
CenterSpace   C#, x86, seq., LAPACK         $        ? mins     Good
Eigen         C++, x86, seq., UNIQUE        xGPL     2 hours    Good

Chosen: DotNumerics
 structure for storing symmetric matrices
 eigendecomposer for symmetric matrices
 the ~40 largest models are out of project scope due to time and memory problems

Initial Parsing: Time and Memory Problems
 The largest model would take more than 250 hours to parse
 The largest model needs ~16 GB of memory, because DotNumerics stores values in double precision
 Overview of running on the current architecture (x86):
- 2 GB of dedicated memory per process
- /LARGEADDRESSAWARE raises this to 3 GB
- we cannot make injections because of the LAPACK calls
- it is impossible to reach beyond a 3 GB address space on x86 (theoretically 4 GB)
Curious Cat?
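The slide's memory arithmetic can be checked directly. This is our own back-of-the-envelope, using the 45000-vertex figure from the earlier slides: a dense double-precision matrix for the largest model needs about 16 GB, far beyond the 4 GB that any 32-bit address space can cover.

```cpp
#include <cstdint>

// A 32-bit (x86) process can address at most 2^32 bytes = 4 GB.
constexpr uint64_t kAddressable32 = 1ULL << 32;

// Dense 45000 x 45000 matrix of 8-byte doubles (DotNumerics stores doubles):
// 45000^2 * 8 = 16,200,000,000 bytes, i.e. the slide's ~16 GB.
constexpr uint64_t kDenseBytes = 45000ULL * 45000ULL * 8ULL;
```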

Initial Parsing: Prototype

                           From           To
Arch                       x86            x64
Virtual address space      3 GB           8 TB
Virtual memory             4 GB           20 GB
Language                   C#             C++
Library                    DotNumerics    Armadillo (xGPL, tweaked for x64)
Base library               LAPACK         Intel MKL Parallel ($129)
Other library              -              Boost
Parallel, CPUs used        no, 1          yes, 2
Optimized for Intel CPU    no             yes
Other memory improvements  -              memory-mapped file

Result: eigendecomposition time for the 2000-vertex model? Guess?
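The "memory-mapped file" row means backing the matrix with a file so the OS pages data in and out on demand instead of holding everything in RAM. A generic POSIX sketch of the idea (the prototype targeted Windows/x64, so this is an illustration rather than the project's code, and the file path in the usage below is hypothetical):

```cpp
// Map a file as an array of doubles: the OS pages the data lazily,
// trading address-space pressure for disk I/O (the slowdown the
// results slide attributes to memory mapping).
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

double* mapMatrixFile(const char* path, size_t count) {
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0) return nullptr;
    if (ftruncate(fd, count * sizeof(double)) != 0) {  // size the backing file
        close(fd);
        return nullptr;
    }
    void* p = mmap(nullptr, count * sizeof(double),
                   PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    return p == MAP_FAILED ? nullptr : static_cast<double*>(p);
}
```

Writes through the returned pointer land in the file, which is why a machine with only 3 GB of RAM can address a much larger matrix, at the cost of disk-speed access.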

Initial Parsing: Prototype Results

Model vertex count     Time       Memory
?                      seconds    superb
Largest one (45000)    2 days *   very bad, because of memory mapping

* Due to my 3 GB of main memory and the memory-mapped files, disk I/O times become significant: on average 10 times slower than the best access.

We didn't switch development to this prototype, because:
 parsing the largest model is still infeasible
 not all of us have x64 architecture