CS426: Building Decentralized Systems Mahesh Balakrishnan
what this class is about: building decentralized (distributed) systems
distributed systems in the wild
[Figure: system stacks spanning data centers and the edge. Labels: HW, OS, VMM, application; Distributed Storage; Big Data Computation; DC Network / SDN / Wireless; Specialized HW (FPGA/GPU/…); HW, OS, device, server; PL; Spec. HW (arduino/…)]
course format
do a course project (90% of your grade):
-Phase 1: Build a simple, in-memory single-node service.
-Phase 2: Build a durable single-node service.
-Phase 3: Build a failure-atomic single-node service.
-Phase 4: Build a scalable service.
-Phase 5: Build a highly available service.
-Phase 6: Build a transactional service.
class participation (10% of your grade)
no exams!
building distributed systems
expect to:
-write a lot of code in C (with some C++).
-malloc and free memory.
-encounter and debug core dumps.
-use concurrency control mechanisms like locks.
-issue I/O calls to disk / network.
minimal hand-holding. we will not debug your code for you!
building distributed systems
systems: abstractions, guarantees, principles, mechanisms, protocols…
what's an abstraction?
-process, file, pipe, block address space, virtual memory, socket…
what's a guarantee?
-ACID (atomicity, consistency, isolation, durability); strong consistency…
abstractions hide complexity (of managing HW...)
building distributed systems
why are systems distributed?
-durability: the system should not lose data when faults occur
-availability: the system should not be unavailable to clients when faults occur
-locality: lower latency
-scalability: higher throughput
-security / privacy
-efficiency
distributed systems are complex: nodes and links can fail, messages can be delayed or lost.
we need abstractions to hide this complexity.
what is the abstraction?
-a graph with nodes and edges
-API: add nodes, add edges between nodes, find shortest path, etc.
two questions: is the abstraction useful? can it be implemented efficiently?
what guarantees do we provide?
-strong consistency, durability, failure atomicity, scalability, availability, transactional isolation.
what mechanisms will we use?
-RPC, partitioning, replication protocols such as Paxos, distributed transactions…
what this class is about…
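To make the graph abstraction concrete, here is a minimal in-memory sketch in C of the three operations named above. It is not the course's reference design: the names (`graph_t`, `graph_add_node`, `graph_add_edge`, `graph_shortest_path`), the fixed `MAX_NODES` capacity, integer node ids, undirected edges, and "shortest path" meaning hop count found by breadth-first search are all assumptions made for illustration.

```c
#include <stdlib.h>

#define MAX_NODES 1024 /* assumption: small fixed capacity keeps the sketch simple */

/* One entry in a node's adjacency list. */
typedef struct edge {
    int to;
    struct edge *next;
} edge_t;

/* Hypothetical in-memory graph: node ids are integers in [0, MAX_NODES). */
typedef struct graph {
    int present[MAX_NODES];  /* 1 if the node has been added */
    edge_t *adj[MAX_NODES];  /* adjacency list heads */
} graph_t;

graph_t *graph_create(void) {
    graph_t *g = malloc(sizeof *g);
    if (g == NULL) return NULL;
    for (int i = 0; i < MAX_NODES; i++) {
        g->present[i] = 0;
        g->adj[i] = NULL;
    }
    return g;
}

static int has_node(const graph_t *g, int id) {
    return id >= 0 && id < MAX_NODES && g->present[id];
}

/* Returns 0 on success, -1 if the id is out of range or already present. */
int graph_add_node(graph_t *g, int id) {
    if (id < 0 || id >= MAX_NODES || g->present[id]) return -1;
    g->present[id] = 1;
    return 0;
}

static int push_edge(graph_t *g, int from, int to) {
    edge_t *e = malloc(sizeof *e);
    if (e == NULL) return -1;
    e->to = to;
    e->next = g->adj[from];
    g->adj[from] = e;
    return 0;
}

/* Adds an undirected edge (assumption). Returns 0 on success, -1 on error. */
int graph_add_edge(graph_t *g, int a, int b) {
    if (!has_node(g, a) || !has_node(g, b) || a == b) return -1;
    if (push_edge(g, a, b) != 0) return -1;
    if (push_edge(g, b, a) != 0) {
        /* Undo the first half so the edge is added fully or not at all. */
        edge_t *e = g->adj[a];
        g->adj[a] = e->next;
        free(e);
        return -1;
    }
    return 0;
}

/* Breadth-first search: returns the number of hops from src to dst,
 * or -1 if either node is missing or no path exists. */
int graph_shortest_path(const graph_t *g, int src, int dst) {
    int dist[MAX_NODES];
    int queue[MAX_NODES]; /* each node is enqueued at most once */
    int head = 0, tail = 0;

    if (!has_node(g, src) || !has_node(g, dst)) return -1;
    for (int i = 0; i < MAX_NODES; i++) dist[i] = -1;

    dist[src] = 0;
    queue[tail++] = src;
    while (head < tail) {
        int u = queue[head++];
        if (u == dst) return dist[u];
        for (const edge_t *e = g->adj[u]; e != NULL; e = e->next) {
            if (dist[e->to] < 0) {
                dist[e->to] = dist[u] + 1;
                queue[tail++] = e->to;
            }
        }
    }
    return -1;
}

void graph_destroy(graph_t *g) {
    if (g == NULL) return;
    for (int i = 0; i < MAX_NODES; i++) {
        edge_t *e = g->adj[i];
        while (e != NULL) {
            edge_t *next = e->next;
            free(e);
            e = next;
        }
    }
    free(g);
}
```

A caller would create a graph with `graph_create()`, add nodes and edges, query `graph_shortest_path()`, and release everything with `graph_destroy()`. Everything here lives in one process's memory, so it offers none of the guarantees listed above (durability, availability, and so on); those are what the mechanisms on this slide are for.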
why is this important?
distributed systems are not a technology fad.
-client-server systems in the 80s
-early Internet services in the 90s (AltaVista, Inktomi, …)
-peer-to-peer systems in the 00s (Napster, Kazaa, Gnutella…)
-mature Internet services in the 00s (Facebook, Google, Amazon, Netflix, LinkedIn, Yelp…)
-mobile services in the 10s
-cloud computing in the 10s (AWS, Azure…)
-IoT in the 20s?
more administrative stuff
-first lab will be released this Friday (1/22)
-it will be due in two weeks from Friday (2/12)
-website:
-office hours will be posted on website
-TA: Josh Lockerman
-undergrad TA: Soham Sankaran