Published by Egbert Stewart. Modified over 9 years ago.
1
A Web Services Search Engine
CS 8803 [AIA] - Spring 2008
Roland Krystian Alberciak, Piotr Kozikowski, Sudnya Padalikar, Tushar Sugandhi
2
Outline
Project Overview
Searching Web-services
  o Tools / APIs
  o How to figure out what information to show
Results: Working prototype
  o Locate, classify, rank, and present web-services
System Integration
  o Diversity! Languages (no joke): Python, Ruby on Rails, PHP, C#, Java, Perl. Databases: MySQL, MSSQL
3
Project Overview
Step 1 - There are web-services available on the web
Step 2 - (Challenges) Web services are harder to find than web pages because:
  o Registering a service takes effort
  o Directories are disconnected
  o No clustering available
  o No ranking available
Step 3 - Profit
  o Should be beneficial for web developers
  o Should be beneficial for us
4
What is out there?
Swoogle - "10,000 ontologies" (more concerned with the "semantic web" and "metadata" than with web services)
ProgrammableWeb - 726 (only APIs)
"Yellow pages" - 5000 web-services
XMethods - 500 web-services
UDDI - Discontinued, but it was a useful way for many web services to advertise themselves
5
Survey of the Market
6
We found solutions for Step 2!
Step 1. Have web-services available on the web
Step 2. (Solutions) Crawler, database, web application, a bunch of clustering algorithms, and lots of "glue"
Step 3. Our proposed solution - Web Slogger!
  - for us: content-based advertising
  - for users: an easy way to search for web-services
7
System Architecture
8
Crawling Yahoo!
Why not Google? Restricted extraction: could not extract many results
What about Alexa? Couldn't afford it! :-)
What did we crawl for? .wsdl and .asmx files
How is Webslogger different from the Yellow Pages project (last year's class project)? Multiple language support
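As a rough illustration of the "what did we crawl for" step, the sketch below keeps only search-result URLs whose path ends in .wsdl or .asmx. It is not the project's crawler: the function names and the example URLs are hypothetical, and the list of URLs is assumed to have been obtained already from a search engine.

```python
from urllib.parse import urlparse

# File extensions named on the slide.
SERVICE_EXTENSIONS = (".wsdl", ".asmx")


def is_service_url(url):
    """Return True if the URL path ends in .wsdl or .asmx (case-insensitive)."""
    path = urlparse(url).path.lower()
    return path.endswith(SERVICE_EXTENSIONS)


def filter_service_urls(urls):
    """Keep service-description URLs, dropping exact duplicates, order preserved."""
    seen = set()
    kept = []
    for url in urls:
        if is_service_url(url) and url not in seen:
            seen.add(url)
            kept.append(url)
    return kept
```

Matching on the parsed path rather than the raw string means a URL such as Service.asmx?WSDL still qualifies, because the query string is ignored.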
9
Categorization and Clustering
Glossaries
  o Hierarchical categorization (27 categories)
  o List of keywords for each category (2800 keywords)
Web Service Partitioning by Importance
  o Some sections of a web service are more important than others, e.g. Service Name / Operation Name is more important than message type name.
Affinity Vector
  o Weight assigned to each term in a web service based on its mapping to the glossary
  o Determines which web service belongs to which category
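One possible reading of the affinity vector is sketched below: each term found in a service is matched against the category glossaries, and a hit counts more when it comes from a more important section. The section weights, the two-category glossary, and all function names are assumptions made for illustration; the slides do not give the actual weights or keyword lists.

```python
import re
from collections import Counter

# Hypothetical weights: names count more than message type names.
SECTION_WEIGHTS = {"service_name": 3.0, "operation_name": 2.0, "message_type": 1.0}

# Tiny illustrative glossary (the project used 27 categories, 2800 keywords).
GLOSSARY = {
    "weather": {"weather", "forecast", "temperature"},
    "finance": {"stock", "quote", "currency"},
}


def tokenize(text):
    """Split identifiers such as 'GetStockQuote' into lowercase terms."""
    return [t.lower() for t in re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", text)]


def affinity_vector(sections):
    """Map {section: text} to {category: weighted count of glossary hits}."""
    scores = Counter()
    for section, text in sections.items():
        weight = SECTION_WEIGHTS.get(section, 1.0)
        for term in tokenize(text):
            for category, keywords in GLOSSARY.items():
                if term in keywords:
                    scores[category] += weight
    return dict(scores)


def categorize(sections):
    """Return the category with the highest affinity, or None if nothing matched."""
    vector = affinity_vector(sections)
    return max(vector, key=vector.get) if vector else None
```

For a service named WeatherForecastService with an operation GetTemperature and a message type StockQuoteRequest, the name and operation hits outweigh the two message-type hits, so the service lands in the weather category.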
10
Ranking Insight
Fundamental difference: Web page ranking is based on inlinks and outlinks. Web service ranking should be based on objects and web methods.
Recall: Our results are extracts from search engines.
Therefore: We don't know how many pages link to a particular WSDL file. Search engine algorithms [e.g. PageRank] have this data and can assert the 'popularity' and 'credibility' of hubs which locate sources.
Resolution: We must find alternate ways to rank content.
11
Ranking Options
1. Community Level: Collaborative ranking
  o Users can leave comments and Likert-scale ratings
  o Rank good users / bad users in the community: experts
2. User Level: Usage-statistic ranking
  o How long you view a WSDL
  o Whether you go back to look at it again [since it is like an API...]
  o Inquire about which WSDL files users used to achieve a goal
12
Ranking Options (contd.)
3. Use page ranking provided by Google / Yahoo
4. File Level: Quality of file: "Do You Care if Your WSDL is W3C Compliant?"
  o Good format, thoroughness. Heuristics on model files.
5. Generate referral chain from WSDL
  o Understand the citation network in order to determine valuable web services
  o Web services often use methods / objects from other web services. Use this linking to rank web services.
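Option 5 could be sketched roughly as follows, under the assumption that "referral" means a WSDL 1.1 import or an XML Schema import pointing at another crawled document. The code counts how many other documents reference each one and ranks by that count. The function names are hypothetical, locations are compared as literal strings (relative locations are not resolved), and the slides do not say this is how the project ranked anything.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Qualified tag names for WSDL 1.1 and XML Schema import elements.
WSDL_IMPORT = "{http://schemas.xmlsoap.org/wsdl/}import"
XSD_IMPORT = "{http://www.w3.org/2001/XMLSchema}import"


def referenced_locations(wsdl_text):
    """Return the locations a WSDL document imports (WSDL and XSD imports)."""
    root = ET.fromstring(wsdl_text)
    refs = []
    for element in root.iter(WSDL_IMPORT):
        if element.get("location"):
            refs.append(element.get("location"))
    for element in root.iter(XSD_IMPORT):
        if element.get("schemaLocation"):
            refs.append(element.get("schemaLocation"))
    return refs


def rank_by_citations(documents):
    """Sort {url: wsdl_text} keys by inbound reference count, then by URL."""
    inbound = Counter({url: 0 for url in documents})
    for url, text in documents.items():
        for ref in set(referenced_locations(text)):
            if ref in documents and ref != url:
                inbound[ref] += 1
    return sorted(documents, key=lambda u: (-inbound[u], u))
```

Counting inbound references is only the simplest use of the citation network; a link-analysis scheme over the same graph would be a natural next step.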
13
... element="xsd1:SubscriptionHeader"/>...
14
www.wbslogger.com
15
Future work
Develop our own crawler
Further improve clustering (there is always room for that!)
Figure out an innovative (&& effective) way for ranking
Location-based clustering
16
Questions?