Scalability and planning for growth (WUCM1)


1 Scalability and planning for growth 1WUCM1

2 Content management issues
Structural
– Naming (e.g. file, URL) policy
– File and directory naming needs:
  – a scheme (invent, design or borrow one)
  – an easy means to implement the scheme
  – a way to check whether the scheme is being adhered to
  – a way to fix breaches of the scheme
– Names are difficult to 'fix' at a later stage
– Poor design will cause maintenance grief
Content
– Update policy: when? by whom?

3 Content update policy
Without control, a large web system will quickly spawn:
– inconsistencies (between A and B)
– errors (A is wrong)
– inaccessible data (A cannot be reached)
– etc.
Update strategies:
– update on demand
– regular update schedule
– hybrid (on-demand with regular clean-up)
Consider a content management tool

4 Possible server organisation

5 Apache configuration issues 1
Apache directives with performance implications:
– KeepAlive on|off
  Keeps the connection open so that a client can send several requests over it (Apache 1.1 took a number here; later versions take on|off)
– MaxKeepAliveRequests number
  Max number of requests allowed on one persistent connection, which avoids hogging
– KeepAliveTimeout seconds
  Max time to wait for the next request on a persistent connection
– HostnameLookups on|off|double
  ‘on’ puts the hostname in the log instead of the IP address, at the cost of a DNS lookup
– MaxClients number
  Limits the number of requests handled at once by the server
– MaxRequestsPerChild number
  Each child process of Apache handles this many requests and then dies (to tidy up memory leaks)
– ThreadsPerChild number
  Only relevant on Win32 in Apache 1.3 (threaded MPMs in Apache 2 also use it). Default 50; may need increasing for many simultaneous hits
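As a rough illustration of the directives on this slide, an httpd.conf fragment might look like the sketch below. The values are assumptions chosen for illustration, not recommendations from the slides, and the directive names are those of Apache 1.3/2.2 (Apache 2.4 renames MaxClients to MaxRequestWorkers and MaxRequestsPerChild to MaxConnectionsPerChild).

```apache
# Illustrative values only; tune against measured load.

# Allow persistent connections, but cap how long one client can hold a slot.
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 5

# Log IP addresses rather than resolving a hostname for each client.
HostnameLookups Off

# Upper bound on simultaneous requests; size it to the available memory.
MaxClients 150

# Recycle each child after this many requests to contain memory leaks.
MaxRequestsPerChild 10000
```

A short KeepAliveTimeout is usually the first thing to try on a busy server, because an idle persistent connection still occupies one of the MaxClients slots.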

6 Apache configuration issues 2
Other Apache directives:
– UseCanonicalName on|off|dns
  Controls how Apache builds self-referential URLs; ‘dns’ adds a reverse DNS lookup
– FollowSymLinks
  An Option. Without it (or with SymLinksIfOwnerMatch), Apache wastes time checking each component of the path for symbolic links; enabling it is faster but can be a security risk
– Logging of all kinds slows Apache down
– .htaccess files add overhead (looked for and read on each request)
– Large configuration files also slow Apache, so thinning them is a good idea
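One way these points could translate into configuration is sketched below. The directory path is a hypothetical placeholder, and whether FollowSymLinks is acceptable depends on who can create files under that tree.

```apache
# Hypothetical document root; substitute the real path.
<Directory "/var/www/html">
    # Follow symlinks without the extra per-component checks.
    Options FollowSymLinks
    # Do not look for .htaccess files anywhere in this tree.
    AllowOverride None
</Directory>

# Build self-referential URLs from the host name the client supplied,
# so no DNS lookup is needed.
UseCanonicalName Off
```

With AllowOverride None, any settings that used to live in .htaccess files have to move into the main configuration, which is read once at start-up rather than on every request.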

7 General server configuration issues
CGI programs influence the performance of the website:
– Consider FastCGI or mod_perl to speed matters up
– Writing efficient code is always important
Other tricks:
– Force popular files to be memory resident (the operating system may do that for you)
– Force secure transfers to have more bandwidth
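The slides do not say how to keep popular files memory resident. One possible mechanism, offered here as an assumption rather than the author's method, is the MMapFile directive of mod_file_cache (mod_mmap_static in Apache 1.3). The file paths are hypothetical.

```apache
# Requires mod_file_cache to be loaded (mod_mmap_static in Apache 1.3).
# Hypothetical paths to frequently requested static files.
MMapFile /var/www/html/index.html
MMapFile /var/www/html/images/logo.png
```

The files are mapped when the server starts, so a changed file is only picked up after a restart. On many systems the operating system's own file cache already gives most of this benefit.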

8 Proxy server performance issues
An Apache proxy can:
– Cache for speed
– Filter for security or decency
Apache's proxy functionality is encapsulated in mod_proxy
To turn forward proxying in mod_proxy on or off, use the directive:
– ProxyRequests on|off

9 Proxy customisation
To block particular sites from your clients:
ProxyBlock www.badsite.com baddomain.co.uk badword
This will block the specific site, the domain, and any site whose name contains ‘badword’
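Put together, a filtering forward proxy might be configured roughly as follows. This is a sketch in Apache 2.4 syntax; the client address range is an assumption, and restricting access in this way is an addition to what the slides show, included because an unrestricted forward proxy can be used by anyone who can reach it.

```apache
# Requires mod_proxy and mod_proxy_http to be loaded.
ProxyRequests On

# Refuse requests to this site, this domain, and any site name containing "badword".
ProxyBlock www.badsite.com baddomain.co.uk badword

# Assumption: legitimate clients are on 192.168.0.0/16; all others are refused.
<Proxy "*">
    Require ip 192.168.0.0/16
</Proxy>
```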

10 Hiding servers with a proxy
Suppose there are two extra servers, parallel to the www.tech.port.ac.uk server
Add the ProxyPass directive to the main www.tech.port.ac.uk server configuration file:
ProxyPass /users/ http://users.tech.port.ac.uk/
ProxyPass /secure/ http://secure.tech.port.ac.uk/
This makes users.tech.port.ac.uk and secure.tech.port.ac.uk appear as directories on the main server, e.g. www.tech.port.ac.uk/users/
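A slightly fuller sketch of the same set-up is given below. The ProxyPassReverse lines and ProxyRequests Off are additions not shown on this slide: the former rewrites redirect headers sent by the hidden servers so that clients keep seeing the main server's name, and the latter makes sure the machine does not also act as a forward proxy.

```apache
# Requires mod_proxy and mod_proxy_http to be loaded.
# Reverse proxying only; do not act as a forward proxy.
ProxyRequests Off

# Map each path on the main server to a hidden server.
ProxyPass        /users/  http://users.tech.port.ac.uk/
ProxyPassReverse /users/  http://users.tech.port.ac.uk/

ProxyPass        /secure/ http://secure.tech.port.ac.uk/
ProxyPassReverse /secure/ http://secure.tech.port.ac.uk/
```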

11 Still not enough performance?
Two further possibilities to boost performance:
– Replace the server hardware with a more powerful machine
– Add more servers and distribute the load of client requests amongst them

12 Benefits of multiple servers
Server machines can be cheaper and easily replaceable
Individual servers can fall over without the website becoming unavailable
Increase capacity by adding another server and synchronising the data
No need to alter or reconfigure any of the existing servers

13 Clustering 1
Cannot just add extra servers
– Each would need a different IP address
The set of servers needs to be established as a cluster so that:
– To external clients it appears as one big fast server with one domain name
– Clients are not aware that the load is being shared by a cluster of servers
– Content on the multiple servers is synchronised

14 Clustering 2
Two basic ways of approaching clustering:
1. DNS load sharing
2. Web server clustering

15 DNS load sharing
The most common approach is round-robin DNS distribution
It works by specifying multiple IP addresses for the same host name (using BIND syntax):
www.tech.port.ac.uk 60 IN A 148.197.203.1
www.tech.port.ac.uk 60 IN A 148.197.203.2
www.tech.port.ac.uk 60 IN A 148.197.203.3

16 DNS load sharing
[Figure; source: O’Reilly Books]

17 Round-Robin DNS sharing 1
Each DNS request for www.tech.port.ac.uk returns the next IP address in sequence
Set a short time-to-live (TTL), such as the 60 seconds in the records for this host
A lower TTL would:
– Improve web server load sharing
– But increase the load on the DNS server
The attraction of round-robin DNS is its simplicity

18 Round-Robin DNS sharing 2
Not true load balancing, only load sharing
The round-robin takes no account of:
– which servers are loaded
– which are free
– which are actually up and running
Round-robin DNS makes keeping state for a user more difficult
– A user may get a different server from last time

19 Hardware load balancing
Needs a specialist device to redirect requests
For example:
– LocalDirector and DistributedDirector were products from Cisco (http://www.cisco.com)
– These rewrite IP headers to redirect a connection to a local server

20 Clustering with Apache 1
Apache provides a way to cluster servers using the features of mod_rewrite and mod_proxy together
This avoids the DNS caching problems and the cost of hardware solutions
Needs a machine acting as a proxy server, handling requests to several back-end servers on which the website is actually loaded

21 Clustering with Apache 2
E.g. the proxy takes the master name www.tech.port.ac.uk and the back-end servers might be www1 to www6
Wainwright (1999) sets out a method of setting up Apache in two parts:
– Use mod_rewrite to randomly select a back-end server for the client request
– Use mod_proxy’s ProxyPassReverse directive to disguise the URL of the back-end server
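The two parts can be sketched with mod_rewrite's randomised map type (rnd:). This is an illustration of the general technique, not a reproduction of Wainwright's configuration. The map file path, the key name "pool" and the full back-end host names are assumptions, and only three of the six back ends are listed to keep the example short.

```apache
# Requires mod_rewrite, mod_proxy and mod_proxy_http to be loaded.
# RewriteMap must appear in the server or virtual host configuration.
#
# Assumed map file /etc/apache2/backends.txt containing the single line:
#   pool  www1.tech.port.ac.uk|www2.tech.port.ac.uk|www3.tech.port.ac.uk

RewriteEngine On
RewriteMap backends "rnd:/etc/apache2/backends.txt"

# Part 1: proxy ([P]) each request to a randomly chosen member of the pool.
RewriteRule "^/(.*)$" "http://${backends:pool}/$1" [P,L]

# Part 2: rewrite redirect headers from each back end so that clients
# only ever see the master name.
ProxyPassReverse "/" "http://www1.tech.port.ac.uk/"
ProxyPassReverse "/" "http://www2.tech.port.ac.uk/"
ProxyPassReverse "/" "http://www3.tech.port.ac.uk/"
```

Random selection spreads requests evenly on average but, like round-robin DNS, takes no account of which back ends are busy or down.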

22 Summary
Configuration issues for scalability and performance
Proxy servers – filter and cache
DNS (round-robin) clustering
Hardware clustering
Proxy-based clustering

