CSE 190: Internet E-Commerce Lecture 14: Operations.


1 CSE 190: Internet E-Commerce Lecture 14: Operations

2 Operations
Everything it takes to keep a web site up and running, 24x7
  – Deployment process
  – Monitoring (SNMP)
  – Build system
  – Link rot
  – Maintenance window
  – Load testing
  – Browser compliance
  – Log rotation
  – Database backups
  – Disk failure
  – Router failure
  – Robots
  – Staffing
  – Data centers
Expense of running a high-availability site is comparable to running a physical storefront

3 Deployment Process
Proceeds in three phases
  – Development
      Within corporation, not accessible outside
  – Stage
      Within internet environment
      UAT (user acceptance testing) run here
      Only operations staff may access
  – Live
      Accessible to outside world

4 Monitoring
SNMP (Simple Network Management Protocol)
  – Used to monitor both hardware and software
  – Provides: counters, values, triggers, statistics
  – Remote control of services
  – Information stored in MIB (Management Information Base)
  – RMON sometimes used as alternative to SNMPv2
Software
  – HP OpenView

5 Maintenance Window
Installation
  – Standard: J2EE standard web service descriptor (XML file with tarball of files)
  – InstallShield
  – Custom installation scripts
Upgrades
  – Defined time on Friday or weekend to upgrade site, posted on web site
  – Process:
      Front page linked to ‘Site down’
      Load balancer redirected if appropriate
      Application stops accepting new clients (Pause)
      Application terminates all active sessions
      Application upgraded
      Sanity checks performed
      Servers rebooted
      Load balancer restored
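The upgrade process on this slide is an ordered checklist, and one way to script it is as a list of named steps that halts at the first failure, so the load balancer is never restored onto a site that failed its sanity checks. The sketch below is only an illustration under that assumption: `run_upgrade` and the placeholder step callables are hypothetical names, not part of the lecture or of any deployment tool.

```python
from typing import Callable, List, Optional, Tuple

# A step is a human-readable name plus a callable that returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_upgrade(steps: List[Step]) -> Tuple[List[str], Optional[str]]:
    """Run steps in order; stop at the first one that reports failure.

    Returns (names of completed steps, name of the failed step or None).
    """
    completed: List[str] = []
    for name, action in steps:
        if not action():
            return completed, name
        completed.append(name)
    return completed, None


def example_steps(sanity_ok: bool) -> List[Step]:
    """Hypothetical placeholders; real steps would call site-specific scripts."""
    return [
        ("link front page to 'Site down'", lambda: True),
        ("redirect load balancer", lambda: True),
        ("pause new clients", lambda: True),
        ("terminate active sessions", lambda: True),
        ("upgrade application", lambda: True),
        ("sanity checks", lambda: sanity_ok),
        ("reboot servers", lambda: True),
        ("restore load balancer", lambda: True),
    ]
```

Calling `run_upgrade(example_steps(sanity_ok=False))` stops at the sanity checks and leaves the site behind the 'Site down' page for operations staff to investigate.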

6 Link Rot
Link rot: the continual process by which links become invalid over time
Tracked with custom tools
Best practice: pages have permanent URLs
Referer field (HTTP request header):
  – Tracking this in logs shows who’s linking to what URL on your site
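As a rough illustration of the "custom tools" mentioned here, the following sketch scans access-log lines and reports which external pages link to URLs that now return 404. It assumes the logs are in the common "combined" format, where the referrer is the first quoted field after the status code and response size; the regular expression, function names, and sample lines are illustrative assumptions, not taken from the lecture.

```python
import re
from collections import Counter

# Matches: "METHOD /path PROTOCOL" status bytes "referer"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)"'
)


def inbound_links(lines):
    """Count (referer, path, status) triples, skipping requests with no referer."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # not a combined-format request line
        referer = match.group("referer")
        if referer in ("", "-"):
            continue  # direct visit or referer withheld
        counts[(referer, match.group("path"), match.group("status"))] += 1
    return counts


def broken_inbound_links(lines):
    """Map (referer, path) to hit count for requests that ended in 404."""
    return {
        (referer, path): hits
        for (referer, path, status), hits in inbound_links(lines).items()
        if status == "404"
    }


# Made-up sample lines for demonstration only.
SAMPLE = [
    '10.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /catalog/old.html HTTP/1.0" '
    '404 512 "http://partner.example/links.html" "Mozilla/4.08"',
    '10.0.0.2 - - [10/Oct/2000:13:56:01 -0700] "GET /index.html HTTP/1.0" '
    '200 2326 "http://partner.example/links.html" "Mozilla/4.08"',
    '10.0.0.3 - - [10/Oct/2000:13:57:12 -0700] "GET /index.html HTTP/1.0" '
    '200 2326 "-" "Mozilla/4.08"',
]
```

Running `broken_inbound_links(SAMPLE)` flags the partner page that still points at the removed catalog URL, which is the kind of link a permanent-URL policy is meant to prevent.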

7 Load Testing
Network load (60% bandwidth max)
  – Average page size (~20-30k)
CPU load: occurs at at least three levels
  – HTTP level
  – Application level
  – DB query level
  – Metrics: maximum number of simultaneous users, latency vs. users
Memory usage (256 M – 1 G per machine)
Disk I/O load
  – 1 Gb per machine typical
Tools
  – Mercury Interactive: WinRunner
  – Segue: SilkTest
  – Rational: SiteLoad
  – Microsoft: WCAT
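The network-load figures on this slide translate into a simple capacity estimate: cap utilization at 60% of the link and divide by the average page size. The sketch below works that arithmetic through. The 45 Mbit/s link speed and the reading of "~20-30k" as roughly 25,000 bytes are assumptions chosen for illustration, and the estimate ignores protocol overhead, images fetched separately, and caching.

```python
def max_pages_per_second(link_bits_per_second, avg_page_bytes, max_utilization=0.60):
    """Estimate sustainable page deliveries per second on a single link.

    max_utilization defaults to the 60% bandwidth ceiling from the slide.
    """
    if avg_page_bytes <= 0:
        raise ValueError("avg_page_bytes must be positive")
    if not 0 < max_utilization <= 1:
        raise ValueError("max_utilization must be in (0, 1]")
    usable_bits_per_second = link_bits_per_second * max_utilization
    return usable_bits_per_second / (avg_page_bytes * 8)


# Assumed example: a 45 Mbit/s link serving 25,000-byte pages.
estimate = max_pages_per_second(45_000_000, 25_000)
print(f"about {estimate:.0f} pages per second at 60% utilization")
```

An estimate like this gives a target request rate for the load-testing tools to drive, against which the latency vs. users curve can then be measured.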

8 Browser Compatibility
Cost of testing proportional to the number of platforms you’re compatible with
The same product isn’t the same on different operating systems
  – E.g. IE4.5 isn’t the same on Mac vs. Windows
Incompatible DOMs between MS, Netscape, Mozilla
Browser archive
  – http://browsers.evolt.org/

9 Robots
Robots: automatically traverse web pages to retrieve documents, link structure, data
Used for:
  – Indexing
  – HTML validation
  – Link validation
  – Mirroring
Problems:
  – Too much rapid access from single IP
  – May be indexing dynamic, obsolete data
Robot exclusion file:

    # /robots.txt file for mysite.com
    User-agent: webcrawler
    Disallow:

    User-agent: lycra
    Disallow: /

    User-agent: *
    Disallow: /jsp
    Disallow: /logs
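To see how a well-behaved robot would interpret the exclusion file on this slide, the sketch below feeds the same rules to Python's standard-library `urllib.robotparser`. The host name `mysite.com` comes from the slide's comment line; the specific paths queried and the `otherbot` agent name are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules from the slide, one directive per line.
ROBOTS_TXT = """\
# /robots.txt file for mysite.com
User-agent: webcrawler
Disallow:

User-agent: lycra
Disallow: /

User-agent: *
Disallow: /jsp
Disallow: /logs
"""


def build_parser(text):
    """Parse robots.txt text directly instead of fetching it over HTTP."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# An empty Disallow means webcrawler may fetch everything.
print(parser.can_fetch("webcrawler", "http://mysite.com/jsp/cart.jsp"))
# "Disallow: /" shuts lycra out of the whole site.
print(parser.can_fetch("lycra", "http://mysite.com/index.html"))
# Any other robot falls under "*": blocked from /jsp and /logs only.
print(parser.can_fetch("otherbot", "http://mysite.com/jsp/cart.jsp"))
print(parser.can_fetch("otherbot", "http://mysite.com/index.html"))
```

The exclusion file is advisory: it only helps with robots that choose to honor it, so rapid access from a single IP still has to be handled operationally.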

10 Failure Models
Mean Time To Failure (MTTF) = average amount of time the system is up
Mean Time Between Failures (MTBF) = average amount of time between failures
Mean Time To Repair (MTTR) = average amount of time the system is down after it fails
  – Active repair time (diagnostics and repair)
Mean Down Time (MDT) = average amount of time the system is down after it fails
  – Active repair time + preventive maintenance + logistics time (time spent waiting for personnel, etc.)
Intrinsic availability = MTTF / (MTTF + MTTR)
Operational availability = MTBF / (MTBF + MDT)
[Figure: hardware failure rate over time (burn in, useful life, wear out) and software failure rate over time (integration & test, useful life, obsolete)]
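The two availability ratios on this slide are easy to evaluate once the mean times are known. The sketch below computes both and converts an availability into expected downtime per year; the hour values in the example and the 8,760-hour year are assumptions for illustration, not measurements from the lecture.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760; ignores leap years


def intrinsic_availability(mttf, mttr):
    """MTTF / (MTTF + MTTR): counts only active repair time as downtime."""
    return mttf / (mttf + mttr)


def operational_availability(mtbf, mdt):
    """MTBF / (MTBF + MDT): downtime also includes maintenance and logistics."""
    return mtbf / (mtbf + mdt)


def expected_downtime_hours_per_year(availability):
    """Hours per year the system is expected to be unavailable."""
    return (1.0 - availability) * HOURS_PER_YEAR


# Assumed example: fails on average after 999 hours up, 1 hour of active repair,
# but 4 hours of total downtime once waiting for personnel is included.
a_intrinsic = intrinsic_availability(mttf=999.0, mttr=1.0)
a_operational = operational_availability(mtbf=999.0, mdt=4.0)
print(f"intrinsic:   {a_intrinsic:.4f}")
print(f"operational: {a_operational:.4f}")
print(f"downtime/yr: {expected_downtime_hours_per_year(a_operational):.1f} h")
```

Because MDT includes logistics time on top of active repair, operational availability comes out lower than intrinsic availability for the same failure pattern, which is why staffing and paging matter as much as hardware.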

11 When things go wrong
Network operations
  – Software recovers from common failures
  – Network staff paged by email if server not available (via SNMP)
  – Usually rotating assignment
Application developers may be called in if restarting servers, etc. fails completely, and only if it doesn’t look like a network problem.

12 Data Centers
Data centers: host your machines on their own premises
  – Also called “colocation”
Features
  – Security: controlled entrance, exit
  – Weather: maintained temperature, humidity
  – Power: backup power, available circuits
  – Bandwidth: OC-192 connections
  – Monitoring: 24/7 staff, may reboot misbehaving machines
Machines typically arranged in “cages”; 1U, 2U machines
Server blades
Examples
  – NTT / Verio
  – Exodus / Global Crossing

