CS525 Advanced Distributed Systems Spring 2017


CS525 Advanced Distributed Systems, Spring 2017
Indranil Gupta (Indy)
Wrap-Up
January 17 – May 2, 2017
All Slides © IG

Agenda
- Wrap-up of the discussion started at the beginning of the course
- Articles


Can you name some examples of Operating Systems?
Linux, WinXP, Vista, Unix, FreeBSD, Mac OSX, 2K, Aegis, Scout, Hydra, Mach, SPIN, OS/2, Express, Flux, Hope, Spring, AntaresOS, EOS, LOS, SQOS, LittleOS, TINOS, PalmOS, WinCE, TinyOS, …


What is an Operating System?
- User interface to hardware (device driver)
- Provides abstractions (processes, file system)
- Resource manager (scheduler)
- Means of communication (networking)
- …

Can you name some examples of Distributed Systems?

Distributed Systems Examples (From Beginning of Semester)
- Client-server (e.g., NFS)
- The Internet
- The Web
- A sensor network
- DNS
- BitTorrent (peer to peer overlay)
- Datacenters
- Hadoop

Do these look familiar?
- Basic: HBase, HDFS, Cassandra, Paxos, MapReduce
- P2P: Gnutella, BitTorrent, Napster, Chord, Pastry, Grid
- ML: TensorFlow, SystemML
- Stream: Storm, StreamScore, S-Store, ReStream, BrTDB
- Companies: SocialHash, Kraken, Tao, ComDB2, Nitro
- Transactions: Warp, Isotope, SNOW, NOVA, YourSQL, SRoute
- Graphs: Leopard, Weaver
- Other: TRSpark, Slicer, Diamond, DQBarge, FairRide, Hug, SNC-Meister, Step, Firmament
These are some of the systems you’ve seen during CS525 this semester.
100% of the papers were new in SP17 compared to SP16!

What is a Distributed System?

The definition we started with
A distributed system is a collection of entities, each of which is autonomous, programmable, asynchronous and failure-prone, and which communicate through an unreliable communication medium.
- Entity = a process on a device (PC, PDA, mote)
- Communication medium = wired or wireless network
- Our interest in distributed systems involves algorithmics, design and implementation, maintenance, and study
Notes: This is an informal definition, for us programmers of distributed systems. Designers and programmers are interested in algorithmics and maintenance. OK: peer to peer systems. Contradiction: a computer without ROM or disk drives that needs to boot over the network. Is a collection of these computers a distributed system?
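As a rough illustration of the definition above, the following minimal Python sketch models an unreliable communication medium and an entity that copes with it by retrying. The names `UnreliableChannel`, `send_with_retries`, and `loss_rate` are hypothetical and chosen only for this example; they do not come from the course slides.

```python
import random


class UnreliableChannel:
    """Hypothetical lossy medium: drops each message with probability loss_rate."""

    def __init__(self, loss_rate, seed=0):
        self.loss_rate = loss_rate
        self.rng = random.Random(seed)  # seeded so runs are repeatable

    def send(self, message):
        # Return the message if it was delivered, or None if it was dropped.
        if self.rng.random() < self.loss_rate:
            return None
        return message


def send_with_retries(channel, message, max_attempts):
    """Resend until delivery succeeds.

    Returns the number of attempts used, or None if every attempt was lost.
    """
    for attempt in range(1, max_attempts + 1):
        if channel.send(message) is not None:
            return attempt
    return None
```

A sender cannot tell a lost message from a slow or failed receiver, which is why a bounded retry count (rather than waiting forever) is used in this sketch.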

A range of interesting problems for Distributed System designers
- P2P systems [Gnutella, Kazaa, BitTorrent]
- Cloud Infrastructures [AWS, Azure, GCE]
- Cloud Storage [Key-value stores, NoSQL, BigTable]
- Cloud Programming [MapReduce, Pig, Hive, Storm, Pregel]
- Coordination [Paxos]
- Routing [Sensor Networks, Internet]

A range of challenges
- Failures: no longer the exception, but rather a norm
- Scalability: 1000s of machines, Terabytes of data
- Asynchrony: clock skew and clock drift
- Security: of data, users, computations, etc.
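To make the asynchrony item concrete: clock skew is the difference between the readings of two clocks at one instant, and clock drift is the difference between their rates. The sketch below is a small worked example under one stated assumption, namely that each clock drifts by at most `drift_rate` seconds per second relative to real time, so two clocks can move apart at up to twice that rate. The function names are hypothetical.

```python
def max_skew(drift_rate, elapsed_seconds):
    """Worst-case skew between two clocks after elapsed_seconds without syncing.

    Assumes each clock drifts by at most drift_rate (seconds per second),
    possibly in opposite directions, hence the factor of 2.
    """
    return 2 * drift_rate * elapsed_seconds


def resync_interval(max_allowed_skew, drift_rate):
    """Longest time between synchronizations that keeps skew within the bound."""
    return max_allowed_skew / (2 * drift_rate)
```

For example, under this assumption two clocks with a drift rate of one microsecond per second would need to be synchronized about every 500 seconds to stay within one millisecond of each other.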

Some of the Topics We’ve Covered
(Timeline: Past to Present)
- Clouds and their predecessors (e.g., Grids and timesharing)
- Overlays and DHTs
- Sensor motes and TinyOS
- Basics – Lamport timestamps, Consensus, Snapshots, Failure detectors
- Epidemics
- Paxos & Consensus
- MapReduce
- A Touch of Sensor Nets
- Key-value stores and Storage
- Stream Processing
- Spark
- Measurement
- Quality or Quantity
- Fairness
- Transactions and Consistency

Some of the Topics We’ve Covered (2)
(Timeline: Present to Future)
- Erasure Coding
- Latency
- Graph Processing
- Cluster Scheduling
- Automation and SLAs
- Distributed Machine Learning
- Memory, NVM
- Lots of Industrial Systems
- H. G. Wells, G. Hardin, Levin-Redell

CS 525 and Distributed Systems
Areas: Peer to peer systems, Cloud Computing, D.S. Theory, Sensor Networks

Interesting: Area Overlaps
Course Projects! 20 Total Projects = Research + Entrepreneurial + Mixed
Epidemics, NNTP, Gossip-based ad-hoc routing


Course Projects (Research + Entrepreneurial)
- Network Aware partitioning strategies for Distributed Graph Processing
- Graph Processing System with Bounded Staleness and Efficient Message Passing
- Dynamic PowerGraph
- Big Graph Neo4j - Bgn4j
- Dynamic Scaling in Flink
- G-Heron: Towards GPU-enabled stream processing
- Optimizing Dynamic Resource Allocation for Stream Processing Systems
- Sensitivity: Peer-Aware Mobile Offloading
- Wireless Surround Sound System
- MLsched: Spark + efficient machine learning executions
- TensorFlow+DSM
- Mesos, Swarm, and Kubernetes: Comparison
- Driver Fault Tolerance for Apache Spark Batch Processing
- PB&J: Distributed Machine Learning Over Encrypted Data
- KeyChain: A Discretionary, Attribute-based Identity System
- GNuggies: Decentralized Services using Untrusted Nodes
- Improving Scalability of Bitcoin
- A Decentralized Marketplace for Transactions
- Open Source Social Hashing Framework
- Performance Testing as a Service (PTaaS)
Categories: Graphs, Stream Processing, Mobile/Sensors, Machine Learning, Open-source, Trusting the Untrusted, Other


CS 525 Ongoing Projects
Areas: Peer to peer systems, Cloud Computing, D.S. Theory, Sensor Networks
Projects explore overlaps across multiple areas (groups shown)

Leftover Work
Final Project Report Submissions: 11:59 pm, Sunday May 7th, 2017 (email softcopy to indy@illinois.edu and lmlesli2@illinois.edu, turn hardcopy in to 3112 SC).
- At most 12 pages (+ any extra pages for refs), at least 12 pt font
- + 1 page for Business Plan
- 30% of course grade
- Final extension, hard deadline (should contain hard and comprehensive data)
We will work on all projects after the semester, in order to submit them to conferences/workshops!
Past CS525 projects (since Fall 2003) have produced a total of more than 10 journal papers, 20 conference papers, and 10 workshop papers, and multiple best paper award winners (ICAC15, IC2E16, BigMine12)!

Winners
Final Project: three Best Projects will be announced on the website soon after May 10th.
After the semester, all projects are eligible for continued work.

Office Hours for this week
- Today: only from now – 4 pm (in my office)
- Thursday: 3 pm – 4 pm (in my office)

Presentations
I hope you liked the selection of papers.
Special mention presentations: lots of good ones! (difficult to pick “best ones”)
General comments to all for future presentations:
- Use figures and animations: a picture is worth a thousand words
- Keep an eye on the clock
- Defer questions to the end or offline if necessary
- Plan for > 1 minute per slide


Reviews
- Tough work, but the only way to ensure you remember the main ideas in a paper and your initial thoughts on reading it
- Please preserve your reviews!
- I hope you enjoyed writing them.
- If your complaint is about the large number of papers you had to review… you’re right

Articles

Articles for this Class
- H. G. Wells, “World Brain”
- G. Hardin, “The tragedy of the commons”
- Levin and Redell, “How (and how not to) write a good SOSP paper”
(Note: Nobelist 1981 chemistry)

H. G. Wells
H. G. Wells, “World Brain” (1938)
Encyclopedias in those days were written “for gentlemen by gentlemen”.
H. G. Wells envisioned a University that is world-wide, and a base of knowledge that is global.
He sought a “Permanent World Encyclopedia”:
- That can be read by anyone anywhere
- That can be updated by anyone and from anywhere
- That will be an archive of humanity and its actions
- That will be an extension of humanity’s memory
And he wrote this before the Internet was invented!
Has this been realized?
(article taken from book “World Brain,” book published circa 1938)

G. Hardin
G. Hardin, “The tragedy of the commons” (1968)
Adam Smith in 1776 in “The Wealth of Nations” popularized the “invisible hand,” the idea that an individual who “intends only his own gain,” is, as it were, “led by an invisible hand to promote…the public interest”.
- Basis for stock markets and much of today’s economics!
However, if there is a commons (think: open pastures, stock market, Internet, p2p, clouds, national parks, etc.), then the tragedy is that everything will be depleted so much that nothing will stay common anymore.
- Example of free pastures for farmers with herds of sheep: “Each man is locked into a system that compels him to increase his herd without limit -- in a world that is limited.”
- Hardin concludes: “It is our considered professional judgment that this dilemma has no technical solution.”
This essay helped motivate work on game theory and selfish models.
- The tragedy of the commons is very visible in p2p systems (freeloading). Does it also show up in Wikipedia?
- The argument says there are no technical solutions, which means you need to incentivize (or de-incentivize) humans to solve the problem.
- Oil spills, other environmental disasters (oceans and wild lands are “commons”)
- Do clouds like AWS suffer from this? They’re so cheap they’re practically free.
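The pasture example can be sketched as a toy simulation. This is not Hardin's model: it assumes, purely for illustration, that the value of each animal falls linearly as the shared pasture fills up, and that each herder adds an animal whenever doing so raises their own payoff. All function names and parameters below are hypothetical.

```python
def payoff(own_herd, total_herd, capacity):
    # Assumed toy model: the value of each animal falls linearly as the
    # shared pasture fills, reaching zero when total_herd == capacity.
    # Integer arithmetic (payoff scaled by capacity) keeps comparisons exact.
    return own_herd * max(0, capacity - total_herd)


def simulate_selfish_herders(num_herders, capacity, max_rounds=1000):
    """Each herder keeps adding animals while it raises their own payoff."""
    herds = [0] * num_herders
    for _ in range(max_rounds):
        changed = False
        for i in range(num_herders):
            total = sum(herds)
            # The herder ignores the cost this imposes on everyone else.
            if payoff(herds[i] + 1, total + 1, capacity) > payoff(herds[i], total, capacity):
                herds[i] += 1
                changed = True
        if not changed:
            break
    return herds


def best_shared_total(capacity):
    """Total herd size that maximizes the combined payoff of all herders."""
    return max(range(capacity + 1), key=lambda total: total * (capacity - total))
```

In this toy model with 4 herders and a capacity of 100, the selfish process stops at 20 animals each (80 in total), while the combined payoff is highest at a total of 50. Each herder acts rationally, yet the group ends up worse off, which is the pattern the freeloading question above is asking about.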

Levin-Redell
Levin and Redell, “How (and how not to) write a good SOSP paper” (PC co-chairs of 1983 ACM SOSP symposium)
- Original idea applied to a real problem
- Comprehensive and mature evaluation
- Chronological and logical presentation
(Note: Christensen – Business School, Harvard)

Questions?

All the Best for Your Project! Have a good summer.
CS525 Course Evaluations
- Main purpose: to evaluate how useful this course was to you (and to get your feedback that will help improve future versions of the course)
- I won’t see these evaluations until after you see your grades
- Use pencil only (your own, or box of pencils being passed around; please return after use)
- I need a volunteer to:
  - Collect forms and mail them (via campus mail only!) to ICES
  - Return pencils to Donna Coleman (2106 SC)
  - Return uncollected reviews to me