Thanks Google! A LOVE / HATE relationship
Stuart Lewis, University of Wales Aberystwyth (UWA)

Contents
  LOVE
    – Why we love Google
  HATE
    – Problems
    – Updates to recent versions of DSpace
  LOVE
    – Google sitemaps
    – Google Analytics
  Questions and answers

LOVE
Google drives a lot of traffic to our repositories:
  – 63.93% of traffic to http://cadair.aber.ac.uk in the last three months

HATE
Load
  – Google et al like to index our sites
  – Homepage (http://cadair.aber.ac.uk/) crawled 124 times in 3 months
  – 18,000+ hits, ~3.7% of the total hits

HATE
Not always nice
  – [11/Sep/2006:00:48: ] "GET /dspace/browse-title?top=2160%2F253"
  – [11/Sep/2006:00:48: ] "GET /dspace/browse-title?bottom=2160%2F267"
  – [11/Sep/2006:00:48: ] "GET /dspace/browse-title?top=2160%2F268"
  – [11/Sep/2006:00:48: ] "GET /dspace/browse-title?top=2160%2F268"
  – [11/Sep/2006:00:48: ] "GET /dspace/browse-title?bottom=2160%2F267"
Browse pages create a lot of DB calls!

HATE
Recent changes to DSpace
  – Addition of a robots.txt file (1.4.1)
  – Restricts access to 'expensive' browse pages
      Except browse-title, to ensure crawlers can find every item

    User-agent: *
    Disallow: /browse-author
    Disallow: /items-by-author
    Disallow: /browse-date
    Disallow: /browse-subject
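
To see what these rules mean for a crawler, the sketch below feeds the same five lines to Python's standard `urllib.robotparser` and asks which browse paths may be fetched. The helper name `allowed_paths` is made up for illustration, and the paths are assumed to sit at the site root as written in the rules; a real deployment under a prefix such as `/dspace` would need matching rules.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the slide; nothing else is assumed about the real file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /browse-author
Disallow: /items-by-author
Disallow: /browse-date
Disallow: /browse-subject
"""


def allowed_paths(paths, user_agent="*"):
    """Map each path to True if a crawler may fetch it under ROBOTS_TXT.

    Hypothetical helper, written for this example only.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return {path: parser.can_fetch(user_agent, path) for path in paths}


if __name__ == "__main__":
    checks = allowed_paths(["/browse-title", "/browse-author", "/browse-date"])
    for path, ok in checks.items():
        print(path, "allowed" if ok else "blocked")
```

Only browse-title stays open, which matches the intent on the slide: crawlers can still reach every item, but not through the expensive author, date and subject listings.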

HATE
Recent changes to DSpace
  – If-Modified-Since support (1.4.1)
      Crawlers can ask whether content has been modified since a given date; the server only has to present the item if it has
      E.g.
        – If-Modified-Since: Mon, 20 Nov 13:43:31 GMT
      'Cheap' return code if it has not changed
        – 304 (Not Modified)
      Also of use to browsers / proxy servers
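
The decision the server makes can be sketched in a few lines of Python. This is not the DSpace code, only an illustration of the general HTTP rule: compare the item's last-modified time with the date in the request header, and answer 304 when nothing has changed. The function name `conditional_status` and the year 2006 in the usage example are assumptions made for the sketch.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_status(last_modified, if_modified_since=None):
    """Return 304 if the item is unchanged since the header date, else 200.

    last_modified must be a timezone-aware datetime.
    Hypothetical helper, not part of DSpace.
    """
    if not if_modified_since:
        return 200  # no conditional header: send the full item
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable date: ignore the header and send the item
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds first.
    unchanged = last_modified.replace(microsecond=0) <= since
    return 304 if unchanged else 200


if __name__ == "__main__":
    # Illustrative timestamp; the year is an assumption.
    item_time = datetime(2006, 11, 20, 13, 43, 31, tzinfo=timezone.utc)
    print(conditional_status(item_time, "Mon, 20 Nov 2006 13:43:31 GMT"))
    print(conditional_status(item_time, "Sun, 19 Nov 2006 13:43:31 GMT"))
```

The 304 branch is the cheap one: no item page has to be rendered and no body is sent.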

HATE / LOVE
Recent changes to DSpace
  – Google Sitemaps support (?)
      A gzipped XML file (50,000 URLs per file)
      In the following format:
        T02:00:20Z
      Google then only has to index what has changed
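
As a rough illustration of such a file, the sketch below builds a gzipped sitemap with one `<url>` entry per item, each carrying a `<loc>` and a `<lastmod>` timestamp. It assumes the public sitemaps.org 0.9 schema, which may differ from what DSpace actually emits; the function name `build_sitemap`, the handle URL and the full date in the usage example are invented for the example.

```python
import gzip
from xml.sax.saxutils import escape

# Assumption: the sitemaps.org 0.9 namespace; DSpace may use another schema.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file limit quoted on the slide


def build_sitemap(entries):
    """Return gzipped sitemap XML for an iterable of (url, lastmod) pairs.

    lastmod is a W3C datetime string such as "2006-11-20T02:00:20Z".
    Hypothetical helper, written for this example only.
    """
    entries = list(entries)
    if len(entries) > MAX_URLS_PER_FILE:
        raise ValueError("too many URLs for one sitemap file")
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        f'<urlset xmlns="{SITEMAP_NS}">',
    ]
    for url, lastmod in entries:
        lines.append("  <url>")
        lines.append(f"    <loc>{escape(url)}</loc>")
        lines.append(f"    <lastmod>{escape(lastmod)}</lastmod>")
        lines.append("  </url>")
    lines.append("</urlset>")
    return gzip.compress("\n".join(lines).encode("utf-8"))


if __name__ == "__main__":
    # Illustrative item URL and date, not taken from the repository.
    blob = build_sitemap(
        [("http://cadair.aber.ac.uk/dspace/handle/2160/253", "2006-11-20T02:00:20Z")]
    )
    print(gzip.decompress(blob).decode("utf-8"))
```

The `<lastmod>` value is what lets a crawler skip items that have not changed since its last visit.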

LOVE
DSpace statistics
  – Quite basic, but reliable and work well
  – Hard to see trends over time
  – No visualisations
  – No referrer statistics

LOVE
Google Analytics tool
  – Small bit of JavaScript in the header:
      <script src=" type="text/javascript">
      _uacct = "UA ";    (your site id)
      urchinTracker();
  – Doesn't get called by crawlers

LOVE
Google Analytics tool
  – Can create filters
  – Lots of superfluous functionality
      Designed to link in with AdWords™ and Ecommerce systems
  – A lot of stats, hard to find what you want
  – Some examples…

LOVE
Google Analytics tool
  – Very nice
  – Quite big: hard to find your way around!
  – Not integrated

The end!
Stuart Lewis
Any questions?