The Storage
B. Ramamurthy
© B. Ramamurthy



Topics for discussion
On-chip memory
On-board memory
System memory
Off-system/online storage/secondary memory
File system abstraction
Offline/tertiary memory
RAID: Redundant Array of Inexpensive Disks
NAS: Network Attached Storage
SAN: Storage Area Network
DB and DBMS: database and database management systems
Distributed file system
Google File System
Hadoop file system

Data and Computation Continuum
Compute intensive. Ex: computation of digits of pi
Data intensive. Ex: analyzing web logs

More dimensions
[Figure: applications plotted by data scale (K, M, G, T, P) against compute scale (MFLOPS, GFLOPS, TFLOPS, PFLOPS). Applications shown: Payroll, Digital Signal Processing, Weblog Mining, Business Analytics, Realtime Systems, Massively Multiplayer Online Game (MMOG).]
Other variables: communication bandwidth, ?

Solution: Processing granularity (ordered from data size: small to data size: large)
Pipelined: Instruction level
Concurrent: Thread level
Service: Object level
Indexed: File level
Mega: Block level
Virtual: System level

On-chip memory
Registers
Cache
Buffers (instruction pipeline)
Characteristics: volatile

On-board memory
Cache
– Instruction cache
– Data cache
– Translation lookaside buffers (TLB)
Characteristics: content addressable, set-associative organization
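The set-associative organization named above can be sketched roughly as follows. This is a minimal toy model, not a description of any particular hardware: the class name, the parameters (4 sets, 2 ways, 16-byte blocks) and the LRU replacement policy are all assumptions chosen for illustration.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Toy N-way set-associative cache with LRU replacement (illustrative only)."""

    def __init__(self, num_sets=4, ways=2, block_size=16):
        self.num_sets = num_sets
        self.ways = ways
        self.block_size = block_size
        # One ordered dict per set: tag -> True, least recently used first.
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, address):
        """Return True on a hit, False on a miss (the block is then loaded)."""
        block = address // self.block_size
        index = block % self.num_sets      # selects the set
        tag = block // self.num_sets       # compared against every way in the set
        lines = self.sets[index]
        if tag in lines:
            lines.move_to_end(tag)         # mark as most recently used
            return True
        if len(lines) >= self.ways:
            lines.popitem(last=False)      # evict the least recently used line
        lines[tag] = True
        return False


cache = SetAssociativeCache(num_sets=4, ways=2, block_size=16)
trace = [0, 4, 64, 0, 128, 64]             # hypothetical byte addresses
results = [cache.access(a) for a in trace]
print(results)
```

All six addresses map to set 0, so the trace shows both the hit on a block that is already resident and the eviction that happens once a third block competes for the two ways of that set.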

System memory
RAM: random access memory: main memory
– Read and write possible
– Volatile
ROM: read-only memory: boot programs for operating systems
Flash memory: erasable/writable non-volatile memory
SDRAM: synchronous dynamic RAM
Others: EAROM

Off-system storage (earlier lectures covered these)
Off-system/online storage/secondary memory
File system abstraction
Offline/tertiary memory
RAID: Redundant Array of Inexpensive Disks
NAS: Network Attached Storage
SAN: Storage Area Network

Database and Database Management System
Data source
Transactional
Database server
Relational DB or similar foundation
Tables, rows, result set, SQL
ODBC: Open Database Connectivity
Very successful business model: Oracle, DB2, MySQL, and others
Persistence models: EJB, DAO, ADO (I am not going to expand the abbreviations)
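The terms tables, rows, result set and SQL can be made concrete with a short sketch. It uses Python's built-in sqlite3 module as a stand-in for a database server; the employee table, its columns and the sample rows are hypothetical and are not taken from the lecture.

```python
import sqlite3

# In-memory database, so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# A table is a named set of rows that share the same columns.
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee (name, salary) VALUES (?, ?)",
    [("Ann", 5200.0), ("Bob", 4100.0), ("Cho", 6300.0)],
)
conn.commit()

# A SQL query returns a result set: the rows that satisfy the condition.
cursor = conn.execute(
    "SELECT name, salary FROM employee WHERE salary > ? ORDER BY salary DESC",
    (5000.0,),
)
rows = cursor.fetchall()
conn.close()

for name, salary in rows:
    print(name, salary)
```

The `?` placeholders keep the data separate from the SQL text, which is the usual way to pass values through a connectivity layer such as ODBC as well.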

Distributed file system (DFS)
A dedicated server manages the files for a compute environment
For example, nickelback.cse.buffalo.edu is your file server, and that is why we did not want you to run your user applications on this machine.
DFS addresses various transparencies: location transparency, sharing, performance, etc.
Examples: NFS, NFS+, AFS (Andrew FS)… (you will study these in the Distributed Systems course)

On to Google File System
The Internet introduced a new challenge in the form of web logs and web crawler data: large scale, “peta scale”
But observe that this type of data has a uniquely different characteristic from your transactional or “order” data on amazon.com: it is “write once”. So is HIPAA-protected healthcare and patient information.
Google exploited this characteristic in its Google File System: S. Ghemawat

Hadoop File System (HFS)
Hadoop file system is a reverse-engineered version of the GFS: this is my first opinion on HFS
HFS is a distributed file system for large-scale data
Data throughput is more important than latency
Batch computing rather than interactive time-shared computing

MapReduce
[Figure: input words (Cat, Bat, Dog, other words; size: TByte) flow through the split, map, combine and reduce stages to produce the output files part0, part1 and part2.]
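The split, map, combine and reduce stages can be sketched as a single-process word count. This is only an illustration of the data flow, not Hadoop's or Google's API: the function names, the tiny input list and the partitioning rule are assumptions, and a real job would run the stages in parallel across many machines.

```python
from collections import Counter, defaultdict


def split_input(words, num_splits):
    """Split: divide the input into roughly equal chunks."""
    return [words[i::num_splits] for i in range(num_splits)]


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk."""
    return [(word.lower(), 1) for word in chunk]


def combine(pairs):
    """Combine: pre-aggregate the counts locally, per chunk."""
    local = Counter()
    for word, count in pairs:
        local[word] += count
    return local


def reduce_phase(combined, num_parts):
    """Reduce: merge local counts; each word lands in exactly one partition."""
    parts = [defaultdict(int) for _ in range(num_parts)]
    for local in combined:
        for word, count in local.items():
            # Deterministic partitioner (hypothetical choice): sum of character codes.
            part = sum(map(ord, word)) % num_parts
            parts[part][word] += count
    return [dict(p) for p in parts]


words = ["Cat", "Bat", "Dog", "Cat", "Dog", "Cat", "Emu"]
chunks = split_input(words, num_splits=2)
combined = [combine(map_phase(chunk)) for chunk in chunks]
parts = reduce_phase(combined, num_parts=3)   # stands in for part0, part1, part2

# Merge the partitions only to inspect the overall result.
totals = {}
for p in parts:
    totals.update(p)
print(totals)
```

Because the partitioner sends every occurrence of a word to the same partition, each output part holds complete counts for its words and the parts can be written independently.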