Published by Claribel Tate. Modified over 9 years ago.
1
Visualization of the Webpage Popularity for Ping Wales
Visualization of the Popularity of the Web Access for Ping Wales
Xiaochuan Huang (George)
Supervised by Dr Markus Roggenbach
Department of Computer Science, University of Wales Swansea
Nov. 2005 @ Gregynog
2
Overview
1. A Regular Website Report
2. Specification
3. Technologies Involved
4. A First Approach
3
1. A Regular Website Report
What the project is about
- Our customer, Ping Media Ltd
- The website, Ping Wales
- What they need
- The technical infrastructure
4
1. A Regular Website Report
What the project is about
Introducing similar tools
- Log file analyzers
- AWStats and Analog 6.0
- Graphic statistics generated by AWStats and Analog
5
1. A Regular Website Report
6
1. A Regular Website Report
What the project is about
- Our customer, Ping Media Ltd
- The website, Ping Wales
- What they need
- The technical infrastructure
Introducing similar tools
- Log file analyzers
- AWStats and Analog 6.0
- Graphic statistics generated by AWStats and Analog
Why this application is necessary
- The customer's needs
- The shortcomings of existing applications
- An extensible project
7
2. Specification
Components
- The filter/parser
- The analyzer
- Two databases
- Visualization
Going through the processes
- Take the daily log file -> parse with DB1 -> output the filtered result -> write the result into DB2
- Given a specified duration -> access DB2 -> generate the records -> output a visualized report
8
3. Technologies Involved
The Apache log files
- Introduction
9
3. Technologies Involved
The Apache log files
- Introduction
- Format

    "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" combined

    220.244.224.104 - - [12/Jan/2005:00:12:38 +0000] "GET /hardware/toshiba-small-80gb-hdd.html HTTP/1.0" 200 11020 "http://www.pingwales.co.uk/business/apple-keynote.html" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20041204 Epiphany/1.4.4"
10
3. Technologies Involved
The Apache log files
- Introduction
- Format

    "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" combined

    220.244.224.104 - - [12/Jan/2005:00:12:38 +0000] "GET /hardware/toshiba-small-80gb-hdd.html HTTP/1.0" 200 11020 "http://www.pingwales.co.uk/business/apple-keynote.html" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20041204 Epiphany/1.4.4"

Log string analysis:
- %h (220.244.224.104): the IP address of the client
- %l (-): the RFC 1413 identity of the client; not available in this hit
- %u (-): the userid of the person making the request; not available in this hit
- %t ([12/Jan/2005:00:12:38 +0000]): the time of the request
- \"%r\" ("GET /hardware/toshiba-small-80gb-hdd.html HTTP/1.0"): the request method, the requested page, and the client protocol
- %>s (200): the status code
- %b (11020): the size of the object returned to the client
- \"%{Referer}i\": the page that the client reports having been referred from
- \"%{User-agent}i\": the identifying information that the client browser reports
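The field-by-field breakdown above can be sketched in Ruby as a single regular expression with one named group per format directive. This is only an illustration, not the project's parser: the names COMBINED_LOG, SAMPLE_LINE and parse_hit are hypothetical, and the pattern assumes that quoted fields contain no escaped double quotes.

```ruby
require "time"

# One named group per directive of the "combined" format.
# Assumption: quoted fields never contain an escaped double quote.
COMBINED_LOG = /
  \A(?<host>\S+)\s          # %h  client IP address
  (?<ident>\S+)\s           # %l  RFC 1413 identity
  (?<user>\S+)\s            # %u  userid
  \[(?<time>[^\]]+)\]\s     # %t  request time
  "(?<request>[^"]*)"\s     # %r  request line, quoted
  (?<status>\d{3})\s        # %>s status code
  (?<bytes>\d+|-)\s         # %b  response size
  "(?<referer>[^"]*)"\s     # Referer header
  "(?<agent>[^"]*)"\z       # User-agent header
/x

# The sample hit from the slide, on one line.
SAMPLE_LINE = '220.244.224.104 - - [12/Jan/2005:00:12:38 +0000] ' \
              '"GET /hardware/toshiba-small-80gb-hdd.html HTTP/1.0" 200 11020 ' \
              '"http://www.pingwales.co.uk/business/apple-keynote.html" ' \
              '"Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20041204 Epiphany/1.4.4"'

# Hypothetical helper: returns a Hash of fields, or nil if the line does not match.
def parse_hit(line)
  m = COMBINED_LOG.match(line.chomp)
  return nil unless m

  method, path, protocol = m[:request].split(" ", 3)
  {
    host:     m[:host],
    time:     Time.strptime(m[:time], "%d/%b/%Y:%H:%M:%S %z"),
    method:   method,
    path:     path,
    protocol: protocol,
    status:   m[:status].to_i,
    bytes:    m[:bytes] == "-" ? 0 : m[:bytes].to_i,
    referer:  m[:referer],
    agent:    m[:agent]
  }
end

hit = parse_hit(SAMPLE_LINE)
puts "#{hit[:host]} requested #{hit[:path]} (status #{hit[:status]})"
```

Returning nil for lines that do not match lets a caller skip malformed hits instead of aborting on the first bad line of a daily log file.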
11
3. Technologies Involved
The Apache log files
Programming language – Ruby
- An interpreted scripting language for quick and easy object-oriented programming

    % ruby
    puts "Hello, world!"
    ^D
    Hello, world!

    % cd sample
    % ruby eval.rb
    ruby> a = "Hello, world!"
       "Hello, world!"
    ruby> puts a
    Hello, world!
       nil
    ruby> ^D
    %
12
3. Technologies Involved
The Apache log files
Programming language – Ruby
Database access
- MySQL
- The two databases
- Access DB with Ruby
13
4. A First Approach
Parsing/Filtering
    load the daily log file
    while not end of file
        read the next hit (one hit per line)
        for the hit: getIP(%h), getTime(%t), getReq(\"%r\"), getSt(%>s)
        if even(first(getSt())) then
            go through the articles database looking for getIP()
            if there is a match, write the hit to database 2
        go to the next hit
Analyzing
    specify StartingTime and EndTime; build an array/stack: myArray
    read through the records from database 2
    for each hit within the specified time
        if getIP() is in myArray, then counter += 1
        otherwise, write this hit to myArray and initialize its counter
    sort myArray according to the counter of each element
    write out the top N results to a file, for visualizing
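The two stages above can be sketched in Ruby with in-memory collections standing in for the two databases. This is a rough illustration under several assumptions, not the project's code: Hit, filter_hits, top_n, ARTICLE_PATHS and SAMPLE_HITS are hypothetical names, and the sample hits are made up. The sketch also departs from the slide in two places, both named here: it keeps only 2xx statuses rather than testing for an even first digit (4xx codes start with an even digit as well), and it looks up and counts by requested page by default, whereas the slide's pseudocode keys on getIP(); the key parameter lets either be used.

```ruby
require "set"

# One parsed hit; stands in for a row of database 2.
Hit = Struct.new(:ip, :time, :path, :status)

# Stand-in for the articles database (database 1). Assumption: lookup by path.
ARTICLE_PATHS = Set[
  "/hardware/toshiba-small-80gb-hdd.html",
  "/business/apple-keynote.html"
]

# Made-up sample data using documentation-range IP addresses.
SAMPLE_HITS = [
  Hit.new("192.0.2.1", Time.utc(2005, 1, 12, 0, 12, 38), "/hardware/toshiba-small-80gb-hdd.html", 200),
  Hit.new("192.0.2.2", Time.utc(2005, 1, 12, 1, 0, 0),   "/hardware/toshiba-small-80gb-hdd.html", 200),
  Hit.new("192.0.2.1", Time.utc(2005, 1, 12, 2, 0, 0),   "/business/apple-keynote.html", 200),
  Hit.new("192.0.2.3", Time.utc(2005, 1, 12, 3, 0, 0),   "/missing.html", 404),
  Hit.new("192.0.2.3", Time.utc(2005, 1, 13, 3, 0, 0),   "/business/apple-keynote.html", 200)
]

# Parsing/Filtering stage: keep successful requests for known articles.
def filter_hits(hits, article_paths)
  hits.select do |h|
    (200..299).cover?(h.status) && article_paths.include?(h.path)
  end
end

# Analyzing stage: count hits per key between from and to (inclusive),
# then return the n most frequent as [key, count] pairs.
def top_n(hits, from, to, n, key: :path)
  counts = Hash.new(0)
  hits.each do |h|
    next unless h.time >= from && h.time <= to
    counts[h.public_send(key)] += 1
  end
  # Highest count first; ties broken by key so the order is stable.
  counts.sort_by { |k, c| [-c, k] }.first(n)
end

stored = filter_hits(SAMPLE_HITS, ARTICLE_PATHS)
day_start = Time.utc(2005, 1, 12, 0, 0, 0)
day_end   = Time.utc(2005, 1, 12, 23, 59, 59)
top_n(stored, day_start, day_end, 2).each do |path, count|
  puts "#{count}  #{path}"
end
```

A Hash with a default of 0 replaces the slide's "is it already in myArray" check: looking up an unseen key yields 0, so the first hit and later hits are handled by the same increment.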
14
Water flow model
- Take the daily log file -> parse with DB1 -> output the filtered result -> write the result into DB2
- Given a specified duration -> access DB2 -> generate the records -> output a visualized report
[Diagram labels: Daily Log File, Filter, Database 1, Database 2, Visualization Tool, Graphic Report, Analyzer, Period entry, Records]
15
Summary
- What I have done so far
- What I am planning to do next
16
End… Hey, wake up, there he ends!! LOL
George, 21/11/2005 @ Gregynog