Areas For Review L3 Review of SM Software, 28 Oct 2008 1 The Charge From Jim’s email with instructions for the review: “The time limit for this review.

Slides:



Advertisements
Similar presentations
Presented by Fault Tolerance and Dynamic Process Control Working Group Richard L Graham.
Advertisements

Segmentation and Paging Considerations
The Zebra Striped Network File System Presentation by Joseph Thompson.
REDUNDANT ARRAY OF INEXPENSIVE DISCS RAID. What is RAID ? RAID is an acronym for Redundant Array of Independent Drives (or Disks), also known as Redundant.
CS533 Concepts of Operating Systems Class 3 Monitors.
1: Operating Systems Overview
OPERATING SYSTEM OVERVIEW
1 Process Description and Control Chapter 3. 2 Process Management—Fundamental task of an OS The OS is responsible for: Allocation of resources to processes.
Memory Management 2010.
V0.01 © 2009 Research In Motion Limited Introduction to Java Application Development for the BlackBerry Smartphone Trainer name Date.
S A B D C T = 0 S gets message from above and sends messages to A, C and D S.
Using Two Queues. Using Multiple Queues Suspended Processes Processor is faster than I/O so all processes could be waiting for I/O Processor is faster.
What are Exception and Interrupts? MIPS terminology Exception: any unexpected change in the internal control flow – Invoking an operating system service.
Backup & Recovery 1.
Guidelines for Examination Candidates Raymond Hickey English Linguistics University of Duisburg and Essen (August 2015)
General Overview How to Order from aslegal.com -General Overview -Ordering Products - Browsing our online products - Using our Product Search - Order by.
- The Event Intelligence Platform Smarter Events for Exhibitors, Organizers & Attendees Making the most out of Zerista Company Confidential – Do Not Reproduce.
Algorithms Describing what you know. Contents What are they and were do we find them? Why show the algorithm? What formalisms are used for presenting.
1 Fault Tolerance in the Nonstop Cyclone System By Scott Chan Robert Jardine Presented by Phuc Nguyen.
Dr. Dinesh Kumar Assistant Professor Department of ENT, GMC Amritsar.
Operating System Review September 10, 2012Introduction to Computer Security ©2004 Matt Bishop Slide #1-1.
Nachos Phase 1 Code -Hints and Comments
Discussions for oneM2M Semantics Standardization Group Name: WG5 Source: InterDigital Communications Meeting Date: Agenda Item: WI-0005 ASN/MN-CSE.
Making a great Project 2 OCR 1994/2360. Analysis This is the key to getting it right. Too many candidates skip through this section. It’s worth 20% of.
2006 Chapter-1 L3: "Embedded Systems - Architecture, Programming and Design", Raj Kamal, Publs.: McGraw-Hill, Inc. 1 Hardware Elements in the Embedded.
Statistics Monitor of SPMSII Warrior Team Pu Su Heng Tan Kening Zhang.
Storage Manager Overview L3 Review of SM Software, 28 Oct Storage Manager Functions Event data Filter Farm StorageManager DQM data Event data DQM.
*** CONFIDENTIAL *** © Toshiba Corporation 2008 Confidential Creating Report Templates.
Event Management & ITIL V3
Software Development Process.  You should already know that any computer system is made up of hardware and software.  The term hardware is fairly easy.
26-Oct-15CSE 542: Operating Systems1 File system trace papers The Design and Implementation of a Log- Structured File System. M. Rosenblum, and J.K. Ousterhout.
Recent Software Issues L3 Review of SM Software, 28 Oct Recent Software Issues Occasional runs had large numbers of single-event files. INIT message.
Intermediate 2 Software Development Process. Software You should already know that any computer system is made up of hardware and software. The term hardware.
Database Management Systems, R. Ramakrishnan and J. Gehrke 1 External Sorting Chapter 13.
Computer Studies (AL) Memory Management Virtual Memory I.
Log-Structured File Systems
RELIABILITY ENGINEERING 28 March 2013 William W. McMillan.
EPICS Release 3.15 Bob Dalesio May 19, Features for 3.15 Support for large arrays - done for rsrv in 3.14 Channel access priorities - planned to.
Silberschatz, Galvin and Gagne  Operating System Concepts UNIT II Operating System Services.
SIP PUBLISH draft-ietf-simple-publish-01 Aki Niemi
The Theoretical Design
FIT5174 Parallel & Distributed Systems Dr. Ronald Pose Lecture FIT5174 Distributed & Parallel Systems Lecture 5 Message Passing and MPI.
EPICS Release 3.15 Bob Dalesio May 19, Features for 3.15 Support for large arrays Channel access priorities Portable server replacement of rsrv.
Biology 22 Molecular Laboratory Report 1. Bacterial Transformation 2. Plasmid Isolation 3. RFLP Analysis.
Embedded Computer - Definition When a microcomputer is part of a larger product, it is said to be an embedded computer. The embedded computer retrieves.
Cache Perf. CSE 471 Autumn 021 Cache Performance CPI contributed by cache = CPI c = miss rate * number of cycles to handle the miss Another important metric.
 PROCESS MANAGEMENT  A process is a program in execution: (A program is passive, a process active.)  A process has resources (CPU time, files) and.
ITEC 370 Lecture 20 Testing. Review Questions? Project update on F Test plan –Sections –How / when to use it.
St. Mary’s Catholic School, Mayville Mrs. Kaiser, Technology Teacher.
Embedded Real-Time Systems Processing interrupts Lecturer Department University.
DATA LINK CONTROL. DATA LINK LAYER RESPONSIBILTIES  FRAMING  ERROR CONTROL  FLOW CONTROL.
Findings – January  Respondents  Access to the practice  Repeat prescription service  Test results  Practice staff  Overall satisfaction 
Difference between External and Internal Server Monitoring.
Thoughts on the LMAP protocol(s) LMAP Interim meeting, Dublin, 15 th September 2014 Philip Eardley Al Morton Jason Weil 1.
Emulating Volunteer Computing Scheduling Policies Dr. David P. Anderson University of California, Berkeley May 20, 2011.
Virtual Memory.
CMSC 611: Advanced Computer Architecture
nZDC: A compiler technique for near-Zero silent Data Corruption
EEC 688/788 Secure and Dependable Computing
Microsoft® Office Word 2007 Training
Evolution in Memory Management Techniques
Documenting Your PLC System
EEC 688/788 Secure and Dependable Computing
Log-Structured File Systems
1: click on the check box in front of the document(s) you would like to share (eg : Case Analysis/Transmittal Letter/Notification letter) . A green bar.
Log-Structured File Systems
EEC 688/788 Secure and Dependable Computing
Log-Structured File Systems
CSE 153 Design of Operating Systems Winter 2019
Presentation transcript:

Areas For Review L3 Review of SM Software, 28 Oct The Charge From Jim’s with instructions for the review: “The time limit for this review greatly restricts the scope. Harry and I discussed looking at a few aspects: bigger structural problems (has the problem been decomposed well? is the design reasonable?) looking for bad memory use areas of code that might cause performance problems areas of code that might have maintenance or stability problems With regards to the last two items on the above list, Harry and Kurt will point out specific areas that could be problematic. Harry also mentioned that one useful type of comment would be to call out areas that need more detailed review.” The following slides list some areas that are candidates for your consideration.

Areas For Review L3 Review of SM Software, 28 Oct Possible Areas for Review Handling of exceptions in threads and subsequent communication of problems to other parts of the application and to operators. Minimizing the performance impact of web-based monitoring while still providing up-to-date statistics. Ways to checkpoint or monitor the various threads and resources. Ways to enable debugging operations and/or output on-the-fly with minimal performance impact. Modifications to the management of the threads to avoid duplicate methods to fetch data or exposure of object internals. Places in the code where A) objects are copied, B) references are used, or C) shared pointers are used, and a different choice would lead to better performance or better memory usage or more reliable operation.

Areas For Review L3 Review of SM Software, 28 Oct Fault tolerance – existing issue list Please see the tables in section 4 of

Areas For Review L3 Review of SM Software, 28 Oct Fault tolerance in fragment collection Some of the changes suggested in the fault tolerance list have recenctly been added to the fragment collection (FragmentCollector.cc). This includes checking for duplicate fragments, fragment numbers out of range, and missing fragments. We also now handle fragments that arrive out of order. With these changes, the handling of fragments and subsequent release of I2O buffers should be much more reliable. However, a similar issue exists in StorageManager.cc with regard to sending the “discard” message to the resource broker. In normal running, this message tells the RB that the SM has received all of the fragments for an event, and the RB can free up the buffer for that event. Since the receipt of fragments in StorageManager.cc is de-coupled from the processing of fragments in FragmentCollector.cc, it seems like we should avoid coupling between the release of the I2O buffers and the sending of the discard message. As such, it seems like we need to duplicate the fragment tracking code in StorageManager.cc Of course, a more reasonable idea is to create a templated class that contains the functionality to track fragments and “clean up” when full events have been received (or it gives up waiting for an incomplete event). This class could be used both in FragmentCollector.cc and StorageManager.cc. Thoughts?