Introduction Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems January 17, 2008.

Slides:



Advertisements
Similar presentations
Welcome to Middleware Joseph Amrithraj
Advertisements

Distributed Processing, Client/Server and Clusters
© 2013 A. Haeberlen, Z. Ives Welcome to CIS 455 / 555 – Internet and Web Systems Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and.
Distributed components
Technical Architectures
CSCD 433/533 Advanced Computer Networks Lecture 1 Course Overview Fall 2011.
OCT 1 Master of Information System Management Organizational Communications and Distributed Object Technologies Review For Midterm.
CS603 Advanced Topics in Distributed Systems MWF 13:30-14:30 RHPH 162 Professor Chris Clifton.
OCT1 Principles From Chapter One of “Distributed Systems Concepts and Design”
Computer Science 162 Section 1 CS162 Teaching Staff.
EEC-681/781 Distributed Computing Systems Lecture 3 Wenbing Zhao Department of Electrical and Computer Engineering Cleveland State University
The Architecture of Transaction Processing Systems
McGraw-Hill/Irwin Copyright © 2007 by The McGraw-Hill Companies, Inc. All rights reserved. Chapter 17 Client-Server Processing, Parallel Database Processing,
ECS152BXin Liu 1 ECS 152B Computer Networks Fall 2003 Prof. Xin Liu
CSCD 330 Network Programming Winter 2012 Lecture 1 - Course Details.
Client-Server Processing and Distributed Databases
Quick Tour of the Web Technologies: The BIG picture LECTURE A bird’s eye view of the different web technologies that we shall explore and study.
Introduction. Readings r Van Steen and Tanenbaum: 5.1 r Coulouris: 10.3.
Computer Network Fundamentals CNT4007C
Computer Networks CEN 5501C Spring, 2008 Ye Xia (Pronounced as “Yeh Siah”)
5.1 Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED.
Networks – Network Architecture Network architecture is specification of design principles (including data formats and procedures) for creating a network.
Lector: Aliyev H.U. Lecture №14: Telecommun ication network software design for data bases and servers. TASHKENT UNIVERSITY OF INFORMATION TECHNOLOGIES.
20-753: Fundamentals of Web Programming 1 Lecture 1: Introduction Fundamentals of Web Programming Lecture 1: Introduction.
1 ECE 156 Computer Network Architecture Professor Krish Chakrabarty Department of Electrical and Computer Engineering Fall 2006.
Distributed Systems: Concepts and Design Chapter 1 Pages
Syllabus. Instructor Dr. Hanan Lutfiyya Middlesex College 418 Ext Office Hours: Tuesday from 12:05-1:05 and Thursday from 11:05-1:05.
CS4273: Distributed System Technologies and Programming Lecture 13: Review.
CSCD 330 Network Programming Fall/Winter/Spring 2014 Lecture 1 - Course Details.
Architectures of distributed systems Fundamental Models
1 Introduction to Middleware. 2 Outline What is middleware? Purpose and origin Why use it? What Middleware does? Technical details Middleware services.
Welcome to CIS 455 / 555 – Internet and Web Systems Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems January 13, 2010.
DISTRIBUTED COMPUTING Introduction Dr. Yingwu Zhu.
Welcome! CSI 4118: Computer Networks and Protocols (3,0,0) Professor: Dr. Robert L. Probert Office: SITE 5098 Phone: x6709
Hwajung Lee.  Interprocess Communication (IPC) is at the heart of distributed computing.  Processes and Threads  Process is the execution of a program.
IST 210: Organization of Data
Syllabus. Instructor Dr. Hanan Lutfiyya Middlesex College 418 Ext Office Hours: Wednesday 5-6; Thursdays 4-6 or by appointment.
From Coulouris, Dollimore, Kindberg and Blair Distributed Systems: Concepts and Design Edition 5, © Addison-Wesley 2012 Design of Parallel and Distributed.
 Course Overview Distributed Systems IT332. Course Description  The course introduces the main principles underlying distributed systems: processes,
Unit 9: Distributing Computing & Networking Kaplan University 1.
Stanford GSB High Tech Club Tech 101 – Session 1 Introduction to Software, Distributed Architectures, and ASPs Presented by Shawn Carolan Former Manager.
Distributed systems [Fall 2015] G Lec 1: Course Introduction.
© Chinese University, CSE Dept. Distributed Systems / Distributed Systems Topic 1: Characterization of Distributed & Mobile Systems Dr. Michael R.
Introduction.  Administration  Simple DBMS  CMPT 454 Topics John Edgar2.
CSCD 330 Network Programming Winter 2015 Lecture 1 - Course Details.
1 CDA 4527 Computer Communication Networking (not “analysis”) Prof. Cliff Zou School of Electrical Engineering and Computer Science University of Central.
INTERNET AND PROTOCOLS For more notes and topics visit: eITnotes.com.
ECE 374: Computer Networks & Internet Introduction Spring 2015 Prof. Michael Zink.
CMPT 401 Distributed Systems Concepts And Design.
Computer Network Architecture Lecture 6: OSI Model Layers Examples 1 20/12/2012.
ECE 374: Computer Networks & Internet Introduction Spring 2012 Prof. Michael Zink.
Computer Networks CNT5106C
1 Welcome to COE 431: Computer Networks Instructor: Wissam F. Fawaz Office 103, Bassil Bldg. Required.
CSCD 433/533 Advanced Computer Networks Lecture 1 Course Overview Spring 2016.
Distributed Systems 0. Overview Simon Razniewski Faculty of Computer Science Free University of Bozen-Bolzano A.Y. 2014/2015.
CHAPTER 3 Architectures for Distributed Systems
CSCD 433/533 Advanced Computer Networks
EECE 310 Software Engineering
CSCD 330 Network Programming Spring
Lecture 1: Multi-tier Architecture Overview
Distributed Systems Bina Ramamurthy 11/30/2018 B.Ramamurthy.
Distributed Systems Bina Ramamurthy 12/2/2018 B.Ramamurthy.
CSCD 330 Network Programming Spring
Architectures of distributed systems Fundamental Models
CSCD 433/533 Advanced Computer Networks
CSCD 330 Network Programming Spring
Distributed Systems Bina Ramamurthy 4/22/2019 B.Ramamurthy.
DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S
Architectures of distributed systems Fundamental Models
Presentation transcript:

Introduction Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems January 17, 2008

2 What this Course Is About  The focus is NOT on building Web applications in PHP, or servlets, or ASP…  It’s about how to build services like Google, Akamai, iTunes, …  What are the principles behind them?  Distributed systems concepts, with emphasis on scalability and interoperability  Data representation fundamentals, with emphasis on XML  Information retrieval concepts, including ranking and indexing  It’s a course that involves building software, evaluating it, and programming in teams

3 How Does this Relate to Other CIS Courses? CIS 505: focuses on distributed systems with an emphasis on concurrency CIS 330/550  Data representation and management  Building a DBMS-backed, servlet-based web site (e.g., mashup)  455/555 focuses on data with respect to interoperability CIS 350: focuses on software development and engineering CIS 573: software engineering & mashups Assumptions:  You know something about threads and synchronization primitives; and you’re at least vaguely familiar with database ideas (or can quickly learn them)

4 Some Things We’ll Look at  What are the principles behind building systems that work on the Internet?  How do these relate to many of today’s hot technologies?  Web servers, DHTML, Servlets, JSP, …  XML  Web services  Peer-to-peer  Application servers  Content distribution networks  Web search  Mash-ups  …

5 Staff  Instructor: Zack Ives,  Office: 576 Levine North  Office hours T 3:30-4:30 (and by arrangement)  TA: Mengmeng Liu,  Office: 575 Levine North  Office hours TBA  Discussion group:  

6 Textbooks  Distributed Systems: Principles and Paradigms, 2 nd ed, Tanenbaum and van Steen  Frequent supplementary handouts  Excerpts from several books  Many recent research papers

7 Prerequisites, Workload, etc. Necessary skills:  The ability to code in Java – there is a substantial implementation project  The ability to work as a team with a classmate  A willingness to “push the envelope”  Knowledge of threads & sync: CSE 380 – Operating Systems – or equivalent  Suggested CIS 330 / 550 – Databases – or equivalent Workload:  Several programming/debugging-based homework assignments  A substantial term project with experimental evaluation and a report  Midterm and final exam  WARNING: this course should be considered 1.5 CU (and we’re in the process of making that happen)

8 A Disclaimer…  This is a “bleeding edge” course!  Goal 1: give you a look under the covers of today’s hottest topics – in lectures and in projects  Goal 2: give you a level of comfort in managing large, complex software development with others’ code  Part of this means doing a substantial implementation project  As in the real world: learning APIs, dealing with inadequate tools  Most of you will find this a struggle!  We will be using some immature technology  Not everything has been tested and validated ahead of time  We’ll do the best we can to smooth over the bugs  We hope it will be a fun course, though… … And an interesting one!

9 A Bit of Context for the Course

10 What Exactly Is the Web?  The Web consists of HTTP servers that publish HTML, XML, and a few other content types  These are hyperlinked via URLs (a subset of URIs)  Plus there are a huge number of web clients  The web is built on a number of Internet protocols:  DNS, TCP, IP  The Internet has many other protocols  SMTP, IMAP, POP, AIM, FTP, …  Streaming media, music swapping protocols, …  Web services, custom applications

11 The Internet is Built in Layers  Link layer (802.11x, 802.3, …)  IP layer – point-to-point and multicast  Transport, session layers:  TCP, UDP -- session-based vs. sessionless; reliable vs. unreliable  UDP is used as the core of many multimedia protocols (e.g., Real or WMP streaming protocols)  TCP is used as the basis of most of our session-oriented protocols:  Telnet, SSH, FTP  HTTP  Other protocols are built over HTTP (e.g., XML-RPC)  IM, P2P protocols, …  Middleware and application layers  Sometimes we interpose extra layers that are invisible  e.g., Akamai

12 What Is an Internet System?  Not just a web server or web application…  An application built over the Internet, whose functionality is distributed across more than one machine  Typically, at least in a client-server or server-to-server fashion, but may have many more participants  Typically, data and/or code must be exchanged in distributed fashion for the functioning of the application  Often, the data must be partitioned, replicated, translated, etc.  Often, the code is written in multiple different environments, languages, etc.  Often, there are concerns about handling failures, firewalls, …

13 Why Are Internet System Topics Interesting?  Understanding what’s underneath today’s web  How does it work?  What are its shortcomings?  What are its strengths?  Understanding distributed algorithms  Using the right approach when designing new protocols and web systems  Being able to anticipate what’s actually possible in the future

14 Example: Web Search Index Servers Crawlers Search Interface Servers queries HTML forms; results query results Web Pages pages keywords + locations client Uses a model of document/word similarity to rank matches

15 Example: Information Integration XML sources Mediator System queries results in “mediated schema” client Relational sources HTML sources XQuery + XPath over XML SQL ODBC results HTTP POST HTML Maps all data into a single format and virtual schema

16 Example: Problem Partitioning client Breaks computation into many parts and distributes them to the clients Data Aggregation New sub- problems Computed subresults

17 Example: P2P File Sharing client request data Processes name-based requests for data; each node can make requests, forward requests, return data

18 What are the Hard Problems?  Disclaimer: most of the hard problems AREN’T solved (or solvable) – and there often isn’t any single BEST solution Much of systems design is about finding the right compromise for each specific problem  We can divide them into:  Scalability  Availability / reliability  Consistency  Interoperability  Location and resource discovery

19 Scalability  How do we support a large number of clients or requests?  Distribute work!  Challenges:  Coordination – takes significant overhead  Load balancing – avoid having bottlenecks  Parts of the solution:  Client-server, multi-tier, P2P architectures  Data partitioning, replication, remote procedure calls, …

20 Availability/Reliability  How do we ensure the system is “up” when we want it to be?  Replication and redundancy  Security measures against attacks  Ability to undo/redo  Challenges:  Keeping things consistent  Performance vs. security  Acknowledgments  Parts of the solution:  Data partitioning, replication, …  Logging, transactions, …  Redundant hardware, multiple sites, …

21 Consistency  Replication and distribution make it difficult to keep a unified, consistent view of the world – how do we combat this?  Locking, concurrency control, and invalidation schemes  Clock synchronization  Challenges:  Locking has huge performance overhead  Network partitions, disconnected operation  Parts of the solution:  Optimistic concurrency control, 2-phase locking  Conflict resolvers

22 Interoperability  How do we coordinate the efforts of components that have different data formats and/or source languages, and are on different machines?  Standardization!  Challenges:  Everything has a different semantics!  Parts of the solution:  Standard data formats: XML, XML schemas  “Schema mediation” and data translation  Remote procedure calls: CORBA, XML-RPC, …

23 Location & Resource Discovery  How do you find what you’re looking for?  Naming  Declarative queries over standard schemas  Advertisements  Challenges:  Naming has implicit semantics  What do you do when you don’t know what to call something?  Parts of the solution:  Directory systems – DNS, LDAP, etc.  Resource discovery and advertising protocols  Standardized schemas

24 Our First Focus: Single Machines, aka Servers  How do you handle large numbers of concurrent users?  Processes  Threads  Events  Hybrids (e.g., thread pools)  Staged architectures

25 Next Time…  We’ll look under the covers of an HTTP server  Key ideas in building scalable systems  Principles of HTTP and web servers  Management of concurrent sessions  To read:  Lampson and Saltzer paper  Tanenbaum Ch. 3.1  For next week: “HTTP Made Really Easy” and Rexford Ch. 4  If necessary: Review Tanenbaum “Modern OS,” Ch. 2.3 or a similar OS book on interprocess communication