Performance Issues of Web Services CSCI 8710 November 29-30, 2006 Kraemer.


Web Services  Services available via the Internet that complete tasks or conduct transactions.  Self-contained, modular applications that can be described, published, and invoked over the Internet.  Can be automatically invoked by application programs.

Web Services  May be invoked at one site or may combine results of several services executed at different sites.

Performance concerns differ from standard C/S  May involve both web service processing and network delays  May be accessed by wide variety of devices -- desktop computers, PDAs, mobile phones, other servers  Access via wireless communication networks: dynamic connectivity, low bandwidth, high latency

Performance concerns differ from standard C/S  Unpredictable nature of requests  Highly bursty  Varies with geographical location of clients, day of week, time of day  Highly variable size of requested objects  “Robot” access  Autonomous software agents that can consume significant amounts of system resources

Types of servers providing Web Services  Web servers  Transaction servers  Proxy servers  Cache servers  Wireless gateway servers  Mirror servers

Common problems  Insufficient bandwidth at peak times  Overloaded servers  Uneven server loads  Delivery of dynamic content  Shortage of connections between application servers and database servers  Failure of third-party servers  Delivery of multi-media content

Example: Bill Paying Service  Portal offers bill paying service  Customers can pay variety of bills through the service  Uses services provided by others:  Debit authorization (100 tps capability)  Electronic funds transfer  Customer authentication

Example: Bill Paying Service

 Portal B is bill paying service  Treat overall web service as ‘system’  Treat component services as ‘devices’  What is the capacity of B, given that the debit authorization service can support 100 tps and that each payment transaction requires 2 visits to the debit authorization service?  Forced Flow Law: X_i = V_i * X_0  100 = 2 * X_0  X_0 = 50 tps
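The capacity calculation above can be sketched in a few lines. This is a minimal illustration, assuming the Forced Flow Law X_i = V_i * X_0; the function name and the dictionary layout are hypothetical, and only the debit authorization figures (100 tps, 2 visits) come from the slide.

```python
from typing import Dict, Tuple

def system_capacity(devices: Dict[str, Tuple[float, float]]) -> float:
    """Return the largest system throughput X_0 that no device limit forbids.

    devices maps a service name to (max throughput in tps, visits per transaction).
    From X_i = V_i * X_0, each device caps X_0 at X_i / V_i.
    """
    return min(capacity / visits for capacity, visits in devices.values())

# Only the debit authorization service has known figures in the example.
component_services = {"debit_authorization": (100.0, 2.0)}
x0 = system_capacity(component_services)
print(f"Portal capacity: {x0:.0f} tps")
```

If capacities and visit counts for the other component services were known, adding them to the dictionary would show which service limits the portal.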

Web server elements

HTML and XML  Most documents on the Web written using HTML “markup language”  Most consist of text and inline images  Can also include other multimedia objects  Generates multiple requests: for document and for each inline image -- single click by user may generate series of requests  XML uses tags and attributes to define/delimit data  Application must interpret meaning of the tags

Hardware and Operating System  Hardware view: performance a function of:  Number and speed of processors  Amount of main memory  Bandwidth and storage capacity of disk subsystem  Bandwidth of the NIC  OS considerations:  Performance, scalability, reliability, robustness

Content  Performance affected by:  Content size  Content structure  Hyperlinks  Popularity of content

Perception of Performance  User view:  Fast response time; no connections refused  Management view:  High throughput; high availability  Need to have quantitative measurements that describe behavior of Web service

Metrics  Two most important:  Response time -- seconds  Throughput -- http_ops/sec, also bits/sec

Other metrics  Hit  any connection to a web site, including in-line requests and errors  difficult to compare across sites  Visit  Series of page requests by a user at a single site  Inter-request times < timeout_value  Session  Series of consecutive and related requests made during a single visit  Inter-request times < timeout_value
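The timeout rule used in the visit and session definitions can be sketched as follows. This is a minimal illustration, assuming request timestamps in seconds for one user at one site; the function name, the sample timestamps, and the 30-minute timeout are hypothetical and do not come from the slides.

```python
from typing import List

def split_by_timeout(timestamps: List[float], timeout: float) -> List[List[float]]:
    """Group request times into visits: a gap of timeout or more starts a new visit."""
    visits: List[List[float]] = []
    for t in sorted(timestamps):
        if visits and t - visits[-1][-1] < timeout:
            visits[-1].append(t)   # inter-request time below timeout: same visit
        else:
            visits.append([t])     # first request, or gap too long: new visit
    return visits

requests = [0.0, 12.0, 95.0, 4000.0, 4030.0]   # hypothetical request times (sec)
visits = split_by_timeout(requests, timeout=1800.0)
print(len(visits), "visits:", visits)
```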

Other metrics  User-perceived response time  Set of geographically distributed agents poll the WS  Error rate  Increase indicates degrading performance  Examples:  Overflow of pending connection queue  For streaming services:  Jitter  Startup latency

Most common measurements of Web service performance  End-to-end response time  Site response time  Throughput (req/sec)  Throughput (Mbps)  Errors/sec  Visitors/day  Unique visitors/day

Example - Travel Agency  Monitor for 30 minutes:  9000 HTTP requests  Three types of objects delivered:  HTML pages (30%, avg. size 11,200 bytes)  Images (65%, avg. size 17,200 bytes)  Video clips (5%, avg. size 439,000 bytes)  What is the throughput?  9000 requests/1800 sec = 5 req/sec  What is the throughput in Kbps?

Throughput in Kbps?  X_r = (total_req * class% * avg. size * 8 bits/byte)/time, with 1 Kb = 1024 bits  X_html = (9000 * 0.30 * 11,200 * 8)/(1800 * 1024) = 131.25 Kbps  X_image = (9000 * 0.65 * 17,200 * 8)/(1800 * 1024) = 436.72 Kbps  X_video = (9000 * 0.05 * 439,000 * 8)/(1800 * 1024) = 857.42 Kbps  X_0 = 131.25 + 436.72 + 857.42  X_0 = 1425.39 Kbps  To support the Web traffic, the network connection should be at least a T1 line (1.544 Mbit/s).
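The per-class throughput arithmetic can be checked with a short script. This is a sketch under the slide's figures, assuming 1 Kb = 1024 bits (the convention that reproduces the slide's results); the variable and function names are illustrative.

```python
MONITOR_SECONDS = 30 * 60      # 30-minute measurement window
TOTAL_REQUESTS = 9000

# class name -> (fraction of requests, average object size in bytes)
classes = {
    "html": (0.30, 11_200),
    "image": (0.65, 17_200),
    "video": (0.05, 439_000),
}

def class_throughput_bps(fraction: float, avg_bytes: int) -> float:
    """Bits per second delivered for one object class."""
    return TOTAL_REQUESTS * fraction * avg_bytes * 8 / MONITOR_SECONDS

per_class_kbps = {
    name: class_throughput_bps(frac, size) / 1024
    for name, (frac, size) in classes.items()
}
total_kbps = sum(per_class_kbps.values())

for name, kbps in per_class_kbps.items():
    print(f"X_{name} = {kbps:.2f} Kbps")
print(f"X_0 = {total_kbps:.2f} Kbps")
```

Although video clips are only 5% of the requests, they account for more than half of the bits delivered, which is why request rate alone is a poor guide to link sizing.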

QoS indicators for Web Services  Response time  Availability  Percentage of time a service is ‘live’ (serving customer requests)  Reliability  Probability that WS will perform in satisfactory manner for a given period of time under specified operating and load conditions  Predictability  Cost

Input data needed to monitor QoS  Traffic  Performance  Usage patterns  Knowledge of average and peak load

Where are the delays?

 Four categories:  DNS lookup phase  TCP connection set-up phase  Server execution time  Network time

DNS lookup phase  Browser converts server name in URL into an IP address to establish the TCP connection  If server name can’t be resolved by local cache, send query to higher-level DNS server  For leading e-commerce sites, avg. lookup times are between 0.01 and 0.11 sec. Fastest sites achieve 0.001 sec.

Anatomy of a Web Transaction

Anatomy of a Web transaction  Browser  Network  Server

Anatomy of a Web Transaction: the Browser  User clicks on hyperlink; requests document  Client (browser) checks local cache for document;  in case of hit:  returns document; user response time R’ Browser,hit*  In case of miss  Browser asks DNS to map server hostname to IP address  Client opens a TCP connection to the server defined by the URL of the link  Client sends an HTTP request to the server  Browser formats and displays document and renders images  Returned document is stored in browser cache  User response time: R’ Browser,miss*

Anatomy of a Web Transaction: the Network  Imposes delays in delivering info from client to server (R’ N1 ) and from server to client (R’ N2 ).  Delays a function of components on path between them:  Modems, routers, comm links, bridges, relays  R’ Network  = total time HTTP request spends in the network  = R’ N1 + R’ N2

Anatomy of a Web transaction: the Server  request arrives from client  server parses the request according to the HTTP protocol  server executes requested method (GET, HEAD, etc.)  if GET  server looks up file in its document tree by using the file system; file may be in cache or on disk  server reads contents of file from disk or cache and writes it to network port  when file send is complete, close the connection (if non-persistent HTTP)  R’ server = time spent in execution of HTTP request  includes service time and waiting time at the server

Anatomy of a Web transaction  If document not found in client’s cache:  response time is sum of residence time at all resources  R_miss = R’ Browser,miss + R’ Network + R’ Server  If a hit  R_hit = R’ Browser,hit  Typically:  R_hit << R_miss  Average response time, R, over N_T requests, where p_c is the fraction of requests served from the cache:  R = p_c * R_hit + (1 - p_c) * R_miss

Example  User wants to analyze impact of local cache size of browser on Web response time perceived by user  20% of requests serviced by local cache with R=400 msec  R for remotely serviced requests = 3 sec  Previous expts. indicate that 3x cache size results in hit rate of 45%  R_orig=0.20 * 0.4 + 0.80 * 3.0 = 2.48 sec  R_new = 0.45 * 0.4 + 0.55 * 3.0 = 1.83 sec
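The cache-size example is a direct application of R = p_c * R_hit + (1 - p_c) * R_miss. The sketch below reproduces it; the function name is illustrative, and the inputs are the slide's figures.

```python
def average_response_time(hit_ratio: float, r_hit: float, r_miss: float) -> float:
    """Mean response time (sec) as a hit-ratio-weighted mix of hit and miss times."""
    return hit_ratio * r_hit + (1.0 - hit_ratio) * r_miss

R_HIT = 0.4    # sec, request served from the browser cache
R_MISS = 3.0   # sec, request served remotely

r_orig = average_response_time(0.20, R_HIT, R_MISS)
r_new = average_response_time(0.45, R_HIT, R_MISS)

print(f"R_orig = {r_orig:.2f} sec")
print(f"R_new  = {r_new:.2f} sec")
print(f"Reduction: {100 * (r_orig - r_new) / r_orig:.1f}%")
```

Tripling the cache cuts the average response time by about a quarter, because every extra hit replaces a 3-second remote fetch with a 0.4-second local one.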

Bottlenecks  bottleneck = the component that limits system performance  Need to identify the bottleneck to improve performance

Example  home user  takes too long to download medium-size page (avg. size 20KB)  considering upgrading to processor w/2X faster CPU  How will this affect response time?

Example, continued  Assume:  R’ network = 7.5 sec  R’ server = 3.6 sec  R’ Browser,miss = 0.3 sec  R = R’ network + R’ server + R’ Browser,miss  R = 7.5 sec + 3.6 sec + 0.3 sec = 11.4 sec  A 2X faster CPU halves only the browser time (0.3 -> 0.15 sec):  R_new = 7.5 + 3.6 + 0.15 = 11.25 sec  not much difference … CPU not the bottleneck
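A small what-if helper makes the bottleneck argument concrete. This is a sketch using the residence times assumed on the slide; the helper names are hypothetical, and the speedup is applied to one component at a time while the others stay fixed.

```python
from typing import Dict

residence = {"network": 7.5, "server": 3.6, "browser": 0.3}   # seconds

def response_time(times: Dict[str, float]) -> float:
    """Total response time is the sum of residence times at all resources."""
    return sum(times.values())

def with_speedup(times: Dict[str, float], component: str, factor: float) -> Dict[str, float]:
    """Copy of times with one component made factor times faster."""
    faster = dict(times)
    faster[component] = times[component] / factor
    return faster

baseline = response_time(residence)
after_cpu_upgrade = response_time(with_speedup(residence, "browser", 2.0))
bottleneck = max(residence, key=residence.get)

print(f"Baseline:          {baseline:.2f} sec")
print(f"2X faster CPU:     {after_cpu_upgrade:.2f} sec")
print(f"Largest component: {bottleneck}")
```

The CPU upgrade removes 0.15 sec out of 11.4 sec, while the network accounts for roughly two thirds of the total, so that is where an upgrade would pay off.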

Example  Pharma co. plans intranet for training and display of images of molecules  training sessions have 100 people  assume 80% active at any one time  Each user performs avg. of 100 ops/hour  Each op requests avg. of 5 images  Avg. size of requested image is 25600 bytes  What is minimum bandwidth of network connection to image server?

Example, continued  100 users * 0.80 active * 100 ops/hour * 5 images/op * 25600 bytes/image * 8 bits/byte * 1 hr/3600 sec  (100 * 0.80 * 100 * 5 * 25600 * 8)/3600 = 2.28 Mbps
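The same unit chain can be written as a function so that each factor is named. This is a minimal sketch with the slide's inputs, assuming 1 Mbps = 10^6 bits/sec (the convention that yields 2.28); the function and parameter names are illustrative.

```python
def required_bandwidth_bps(users: int, active_fraction: float, ops_per_hour: float,
                           objects_per_op: float, bytes_per_object: float) -> float:
    """Average bits per second needed to serve the described workload."""
    bits_per_hour = (users * active_fraction * ops_per_hour
                     * objects_per_op * bytes_per_object * 8)
    return bits_per_hour / 3600.0

bw_bps = required_bandwidth_bps(users=100, active_fraction=0.80, ops_per_hour=100,
                                objects_per_op=5, bytes_per_object=25_600)
print(f"Minimum bandwidth: {bw_bps / 1e6:.2f} Mbps")
```

This is an average; a link sized exactly to it leaves no headroom for bursts.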

Web Infrastructure

Web infrastructure  Three major delay sources:  “last mile”  Link between end user and phone company switch, or DSL or cable connection to service provider  ISPs  Recently, more bandwidth added  Improvements via caching, load balancing, more servers  ‘backbone’ of network  Collection of interconnected network providers  Connect to each other to exchange traffic (peering)  Public peering: at major interconnection points (NAPs, Network Access Points; MAEs, Metropolitan Area Exchanges)  Delays may occur at peering points

Basic Components  Servers  Browsers  Firewalls  protect data, programs, and computers on private network from the uncontrolled activities of untrusted users and software on other computers  Screens network traffic going through it, using  Software, network hardware, computers  Potential performance bottleneck

Proxy, Cache, Mirror  Techniques for improving web performance and security  Try to reduce  access time to web documents  Network bandwidth required for doc xfers  Demand on servers w/ very popular docs

Proxy server  Special type of web server that acts as an agent: server to the client, client to the server  Accepts requests from clients, forwards them to web servers  Receives responses from remote servers, forwards them back to the client  Originally designed to provide web access for users on private networks who had to go through a firewall

Proxy server  Can be configured to cache relayed responses  Benefits:  Improves access speed by bringing data closer to consumer  Cuts down on network traffic  Reduces server load  Increases availability in the web  Problems:  Ensuring that cached docs are up-to-date  What’s worth caching? For how long?

Proxy server

Caching  Used in the Web:  Client-side, at the browser  In the network, a caching proxy  Evaluating caching effectiveness:  Hit ratio = requests_satisfied/total_requests  Byte hit ratio = hit ratio weighted by doc size  Data transferred = bytes xferred/time

Example  Manager wants to install caching proxy server on corporate intranet w/ > 2000 users  Use for 6 months -> then evaluate  Consider two cases:  Cache holds small documents, avg. size 4800 bytes, hit ratio 60%  Cache holds medium documents, avg. size 32500 bytes, hit ratio 20%  Monitor for one hour, observe 28800 requests

Cache efficiency  Saved_BW =  (num_req * hit_ratio * avg_size)/time  Saved_BW_small =  (28800 * 0.60 * 4800 * 8)/3600 sec = 184Kbps  Saved_BW_med =  (28800 * 0.20 * 32500*8)/3600 = 416 Kbps  Holding larger documents can save more BW
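The two caching policies can be compared with the Saved_BW formula above. This is a sketch with the slide's measurements, assuming 1 Kbps = 1000 bits/sec (the convention that reproduces 184 and 416); the names are illustrative.

```python
REQUESTS = 28_800        # requests observed
WINDOW_SECONDS = 3600    # one-hour monitoring period

def saved_bandwidth_kbps(hit_ratio: float, avg_bytes: float) -> float:
    """Bandwidth (Kbps) no longer fetched from origin servers thanks to cache hits."""
    return REQUESTS * hit_ratio * avg_bytes * 8 / WINDOW_SECONDS / 1000.0

saved_small = saved_bandwidth_kbps(hit_ratio=0.60, avg_bytes=4_800)
saved_medium = saved_bandwidth_kbps(hit_ratio=0.20, avg_bytes=32_500)

print(f"Small documents:  {saved_small:.0f} Kbps saved")
print(f"Medium documents: {saved_medium:.0f} Kbps saved")
```

The medium-document policy saves more bandwidth despite a hit ratio one third as large, which is the difference between hit ratio and byte hit ratio.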

Mirroring  Replicating site content at other servers  Requires:  Regular updates  DNS to direct browsers to secondary sites when primary is busy  Goals:  Increase availability  Balance server load  Thus increasing quality of service

Example  Manufacturing co., employee portal, too slow for European users  Idea: install mirror site in Paris  What are the bandwidth savings ?

Example: Mirror site in Paris  Current avg. BW is 35 Mbps  40% of load from Europe  42% of traffic could be served from caching  Cacheable amount: 35 * 0.42 = 14.7Mbps  Estimate cache hit ratio at 38%  Saved_BW = 14.7 Mbps * 0.38 = 5.6 Mbps  40% of traffic from Europe, so:  5.6 * 0.40 = 2.24 Mbps could be served from cache in Paris  6.4% savings on current BW usage at server  improvement in perceived response time for European users
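The mirror-site estimate is a product of three fractions applied to the current bandwidth. The sketch below reproduces it; the function name is illustrative. Without the slide's intermediate rounding of 5.586 to 5.6 Mbps, the result is about 2.23 Mbps rather than 2.24, which still rounds to 6.4% of the current load.

```python
def mirror_savings_mbps(current_bw_mbps: float, cacheable_fraction: float,
                        hit_ratio: float, regional_fraction: float) -> float:
    """Traffic (Mbps) a regional mirror/cache could serve instead of the origin."""
    return current_bw_mbps * cacheable_fraction * hit_ratio * regional_fraction

CURRENT_BW = 35.0   # Mbps at the origin server

saved = mirror_savings_mbps(CURRENT_BW, cacheable_fraction=0.42,
                            hit_ratio=0.38, regional_fraction=0.40)
share = saved / CURRENT_BW

print(f"Served from Paris: {saved:.2f} Mbps")
print(f"Savings at origin: {100 * share:.1f}%")
```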

Content Delivery Networks (CDN)  cache or replicate content as needed to meet demands from clients over the Web  coordinated caching systems implemented through proprietary networks and data centers  employ a DNS-redirecting mechanism  tries to assign best location from which to serve the requested content

Content Delivery Networks (CDN)  DNS-redirecting mechanism:  client requests URL; browser generates a DNS request for the IP address corresponding to the domain name in the URL  CDN controls the DNS service for this domain name  CDN answers the DNS request with the IP address of a selected server rather than the IP address of the original server  uses a routing function to select “best” server:  client location, id of requested content, load of CDN network and servers, proximity of CDN servers to client are all considered  CDN should provide:  scalability, high availability, manageability, performance

The WAP Infrastructure  WAP = Wireless Application Protocol  architecture + set of protocols for wireless devices to access Web services at regular Web sites  wireless device communicates with WAP gateway, over wireless network  WAP gateway communicates with servers

The WAP Infrastructure

 Docs for wireless devices written in form of XML known as WML (wireless markup language)  can also use WMLscript  WML docs  structured as set of “cards”, units of user interaction  deck = set of cards  users navigate between cards

The WAP Infrastructure  WML decks + WMLScripts  stored in regular web servers on internet  retrieved by WAP gateway via HTTP  Web server response is binary encoded by WAP gateway and sent to wireless device via lightweight protocols  designed to minimize BW requirements

WAP protocol stack

Server Architectures  Web Server  Application Server  Transaction and Database Server  Streaming Server  Multi-tier Architecture

Web Server  listens for HTTP requests  establishes requested connection  sends requested file  returns to listening mode  can handle more than one request at a time  fork a copy of the HTTP process for each request  multi-threaded HTTP program  pool of running processes

Dynamic content  can use client-side or server-side programs  can improve performance by pushing to client-side

Application Server  software that handles all application operations between browser-based customers and back-end databases  receive client request  execute business logic, interacting with transaction and/or DB servers  can be implemented in many ways:  CGI scripts, FastCGIs, server-applications, server-side scripts

Transaction and Database Server  Transaction Processing (TP) monitor provides:  an application programming interface  a set of program development tools  a system to monitor and control execution of transaction programs  DB server:  executes and monitors transaction processing applications

Streaming Server  Initially, audio and video were “download and play” technologies  Streaming media begins to play “almost” immediately  client request arrives  server retrieves video and audio data and begins to deliver them over the network  video and audio are compressed (MPEG, MP3)  typically have control part and data part

Example  Company plans to offer MM online training  Employee retrieves lecture of video, audio, slides; 30 minute duration  What is the number of streaming servers needed to serve the lecture presentation during busiest period of the day: 4-5 pm

Example  400 employees at peak  One MM server can stream presentations to 150 viewers simultaneously  What is the average number of simultaneous viewers during peak period?  Use Little’s Law: N = λ * R  λ = req/time = 400 viewers/60 min  R = 30 min  N = 30 * 400/60 = 200  Need two MM servers (200 viewers / 150 per server, rounded up)
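The server count follows from Little’s Law, N = λ * R, plus a round-up. This is a minimal sketch with the slide's figures; the function names are illustrative.

```python
import math

def concurrent_viewers(arrivals: int, window_min: float, duration_min: float) -> float:
    """Little's Law: N = lambda * R, with lambda = arrivals / window."""
    return arrivals * duration_min / window_min

def servers_needed(viewers: float, capacity_per_server: int) -> int:
    """Smallest whole number of servers that covers the average concurrency."""
    return math.ceil(viewers / capacity_per_server)

n_viewers = concurrent_viewers(arrivals=400, window_min=60, duration_min=30)
n_servers = servers_needed(n_viewers, capacity_per_server=150)

print(f"Average simultaneous viewers: {n_viewers:.0f}")
print(f"Streaming servers needed:     {n_servers}")
```

Little’s Law gives the average over the hour; if arrivals cluster at the start of the peak period, instantaneous concurrency can exceed 200.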

Multi-tier Architecture  web-based apps usually in 3-tier architecture:  presentation layer  user interface (browser & HTML, XML, etc.)  application layer  business logic  collection of rules to implement application logic  may also contain Java applets, ActiveX controls, etc.  data service layer  persistent data

Multi-tier Architecture

Example  application layer designed to support 400 simultaneous processes  app process:  receives client request  executes app logic, interacting with DB server  Monitoring shows:  app process executes for 150 msec between DB requests  DB server handles 550 req/sec  400 app processes running during peak period

What if??  the application servers are replaced by new servers with 2X speed  Each application server characterized by Z, “think time” – time between receiving a reply from the DB server and submitting a new DB request  DB layer, characterized by throughput, X, in req/sec  R = N/X - Z

What if...?  DB response time:  R = 400/550 - 0.15 = 0.577 sec = 577 msec  after CPU upgrade, app processing time should be 75 msec  DB response time now:  R_new = 400/550 - 0.075 = 0.652 sec = 652 msec  Improvement in app layer may not lead to improvement overall: with DB throughput unchanged, the time saved in the app layer reappears as waiting at the DB server
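The what-if calculation applies the Interactive Response Time Law, R = N/X - Z. The sketch below uses the 550 req/sec figure that appears in the calculation and assumes, as the slide does, that DB throughput stays the same after the upgrade; the function name is illustrative.

```python
def db_response_time(n_processes: int, throughput_rps: float, think_time_s: float) -> float:
    """Interactive Response Time Law: R = N / X - Z (seconds)."""
    return n_processes / throughput_rps - think_time_s

N_PROCESSES = 400       # app processes at peak
DB_THROUGHPUT = 550.0   # req/sec handled by the DB server

r_before = db_response_time(N_PROCESSES, DB_THROUGHPUT, think_time_s=0.150)
r_after = db_response_time(N_PROCESSES, DB_THROUGHPUT, think_time_s=0.075)

print(f"DB response time before upgrade: {1000 * r_before:.0f} msec")
print(f"DB response time after upgrade:  {1000 * r_after:.0f} msec")
print(f"Cycle time (R + Z) before/after: {r_before + 0.150:.3f} / {r_after + 0.075:.3f} sec")
```

The cycle time R + Z is N/X in both cases, so the 75 msec saved in the app layer is exactly offset by 75 msec more at the DB server.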

Dynamic Load Balancing  heavy traffic load adversely impacting performance  add more servers  buy bigger (faster) servers  need to do cost-performance analysis

Dynamic Load Balancing  web cluster:  multiple web servers  single location addressed by one URL and a single virtual IP address  incoming requests routed among servers in user-transparent way  switch acts as dispatcher, mapping virtual IP address to actual address

Web cluster

Networks  Bandwidth  measures the rate at which data can be sent through the network  usually expressed in bps  Latency  time needed for a bit (or small packet) to travel across the network

Bandwidth for different types of networks

Planning  Streaming service offers training videos  training session -> 15 min video at 300 Kbps  What impact if videos go to 25 min?  Service supports 35 simultaneous sessions  Average BW needed (now)  35 * 300 Kbps = 10.5 Mbps  Average number simult. sessions (now)  N = 35  Little’s Law: N = λ * R  35 = λ * 15  λ = 35/15 sessions/min .. assume this remains the same  N_new = λ * 25 = 35/15 * 25 = 58.33  Average BW needed (new)  58.33 * 300 Kbps = 17.5 Mbps
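This planning step uses Little’s Law in both directions: first to recover the arrival rate from the current load, then to project the load for longer videos. The sketch below follows the slide's assumption that the arrival rate stays the same; the names are illustrative, and 1 Mbps is taken as 1000 Kbps.

```python
STREAM_RATE_KBPS = 300.0

def arrival_rate(sessions: float, duration_min: float) -> float:
    """Little's Law solved for lambda: lambda = N / R (sessions per minute)."""
    return sessions / duration_min

def bandwidth_mbps(sessions: float) -> float:
    """Average bandwidth for a given number of simultaneous streams."""
    return sessions * STREAM_RATE_KBPS / 1000.0

lam = arrival_rate(sessions=35, duration_min=15)   # current workload
n_new = lam * 25                                    # same lambda, 25-minute videos

print(f"Arrival rate: {lam:.3f} sessions/min")
print(f"Sessions now/new:  35 / {n_new:.2f}")
print(f"Bandwidth now/new: {bandwidth_mbps(35):.1f} / {bandwidth_mbps(n_new):.1f} Mbps")
```

The projected 58 simultaneous sessions exceed the 35 the service supports today, so the longer videos need more session capacity as well as more bandwidth.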

Example  training videos, avg. size 950 MB  100 students, 80% active at one time  Each user requests 2 clips/hour  BW needed to support:  (0.80 * 100) users * 2 clips/hour * (8 * 950) Mbits/clip / 3600 sec/hour  = 337.8 Mbps  Need a 622 Mbps ATM network to support
