SEMANTIC ANNOTATION OF WEB SERVICES WITH LEXICON-BASED ALIGNMENT
Deniz Cantürk, Pınar Şenkul
METU Computer Engineering Dept., 06531, Ankara, TURKEY

SERVICE ANNOTATION WITH LEXICON-BASED ALIGNMENT

Service Ontology Construction

The ontology of a given web service, the service ontology, is constructed from the service name and the input and output parameter names of the web service. The following steps are performed while constructing the service ontology:

Tokenization: Parameter names are split into tokens at the ’ ’ character, at capital letters, or at digits, whichever applies to the name.

Stop Word Removal: Stop words such as “get”, “set”, “is”, “by”, and “list” are removed from the token list. The stop word list was compiled from common practice in the literature.

Synset Generation: The synset (synonym set) of each token is generated. The synsets are then merged into a single set.

Service Ontology: Synset tokens are embedded into an RDF schema in such a way that each token is positioned as a root-level element.

Figure 1: Top level architecture of the Domain-Specific Service Discovery Layers

Obtaining the Senses of the Keywords

To generate a synset, synonyms of the service keywords (extracted from the input/output parameter names and the service name) are added to the word’s set. Synonyms of each word are obtained from WordNet [3]. Each word is looked up in WordNet, which provides a separate definition for each sense of the word.

A New Alignment Technique: Lexicon-Based Alignment

The new technique introduces a new synset concept, a new evaluation technique, and a new matching technique. The synset concept uses level-sense synsets in matrix format, and the new evaluation and matching techniques are built on top of this representation.

Level-Sense Synsets

A level-sense synset is a 2D matrix that holds the synonyms of the word for each sense in one dimension and the derivation hierarchies of the senses in the other dimension.
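A minimal sketch of how such a matrix could be assembled, assuming each sense is already available as a derivation (hypernym) chain ordered from the leaf synset upward. The function name `build_level_sense_synset` and the list-of-lists layout are illustrative assumptions, not the authors' implementation. The two chains are the first two WordNet senses of car, matching the entries listed in Table 1.

```python
from typing import List, Optional

# Derivation chains for two senses of "car", leaf synset first.
# Index 0 is the leaf level, index 1 is Level 1, and so on.
CAR_SENSE_1 = [
    "car, auto, automobile, machine, motorcar",
    "motor vehicle, automotive vehicle",
    "self-propelled vehicle",
    "wheeled vehicle",
    "vehicle",
    "conveyance, transport",
    "instrumentality, instrumentation",
    "artifact, artefact",
    "whole, unit",
    "object, physical object",
    "physical entity",
    "entity",
]
CAR_SENSE_2 = [
    "car, railcar, railway car, railroad car",
    "wheeled vehicle",
    "vehicle",
    "conveyance, transport",
    "instrumentality, instrumentation",
    "artifact, artefact",
    "whole, unit",
    "object, physical object",
    "physical entity",
    "entity",
]


def build_level_sense_synset(chains: List[List[str]]) -> List[List[Optional[str]]]:
    """Return a matrix indexed as matrix[level][sense].

    Hypothetical helper: shorter chains are padded with None so that
    every row has one cell per sense.
    """
    depth = max(len(chain) for chain in chains)
    return [
        [chain[level] if level < len(chain) else None for chain in chains]
        for level in range(depth)
    ]


matrix = build_level_sense_synset([CAR_SENSE_1, CAR_SENSE_2])
print(len(matrix))   # number of levels, leaf level included
print(matrix[6][0])  # Level 6 of sense 1
print(matrix[4][1])  # Level 4 of sense 2
```

Rows correspond to levels and columns to senses, so one column read from top to bottom is the derivation hierarchy of a single sense.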
An example level-sense synset for the keyword car is given in Table 1.

EVALUATION

Evaluation is performed over the service description GetCarEngines(string knownCategoryValues, string category). The lexicons of the web service, extracted from the service description, are identified as follows: {“car”, “engines”, “known”, “category”, “values”}. Semantic annotation is performed over these lexicons. The car ontology is used as the domain ontology. Its hierarchy is as follows: {car, (part, (wheel, tire, alternator, air-bag, seat), operation, (rental, sales, race, driving), automobile, passenger, way, (road, tunnel, bridge))}.

The normalization algorithm is given in Algorithm 1. Normalization is applied only on the immediate parent. Using Algorithm 1 and the hierarchy of the domain ontology, normalization is performed for the keyword set above; the normalized values are given in Table 3.

CONCLUSION

In this work, a new ontology alignment method is presented. The method is used for the semantic annotation of web services that are related to a given domain under the domain ontology. The proposed technique is used within a domain-specific web service discovery system, namely DSWSD-S [1], [2].

REFERENCES

[1] D. Canturk and P. Senkul, “Service acquisition and validation in a distributed service discovery system consisting of domain-specific sub-systems,” in Proc. of ICEIS, Madeira, Portugal, June 2010, pp. 93–99.
[2] D. Canturk and P. Senkul, “Using semantic information for distributed web service discovery,” International Journal of Web Science, in press.
[3] WordNet, “An electronic lexical database,” December

Figure 2: Sense Comparison Tree (a) Case 1, (b) Case 2

Matching Degree Normalization

After the matching degree calculation, a normalization step is applied to further refine the similarity values. This normalization step is based on the hierarchy in the domain ontology.
The basic idea of the refinement is as follows: the matching between a service and an ontology term should be parallel to the matching between the service and the term's parent concept. Accordingly, if the parent concept’s matching degree is higher than that of the child, the similarity between the service and the ontology gets stronger. (Algorithm 1)

ABSTRACT

As the number of available web services published in registries and on web sites increases, web service discovery becomes a challenging task. One solution to the problem is to use a distributed web service search system composed of domain-specific sub service discoverers. Using an ontology is the most common practice for specifying domain knowledge. However, an important problem at this point is the lack of semantic annotation for currently available web services. For this reason, there is a strong need for a mechanism for the semantic annotation of unannotated services. In this work, we propose a web service semantic annotation method that uses lexicon-based alignment. Lexicon-based alignment considers the different senses of the words, and hence it can find the association between the service and the ontology more accurately than previous alignment techniques.

Keywords: semantic annotation, ontology alignment, web service discovery, distributed architecture, domain-specific web services.

DOMAIN-SPECIFIC WEB SERVICE DISCOVERY SYSTEM

The proposed service annotation technique is used within a domain-specific web service discovery system, namely Domain-Specific Web Service Discoverer with Semantics (DSWSD-S), which is presented in [1], [2]. DSWSD-S uses domain-specific web service discovery sub-systems, which are built around given ontologies. The main idea behind the system is to group the web services around a set of ontologies, each of which is handled by a separate domain-specific web service discovery node.
The architecture of DSWSD-S consists of two layers: the domain-specific crawler layer and the domain-specific service discovery layer.

[Figure 1 diagram labels: Search Clients; Specialized Web Service Discovery System; Domain Specific Discovery Layer; Ontology; Domain Specific Crawler Layer; UDDI; ebXML; Unregistered Web Services]

Table 1: Level-Sense Synset of the Word Car

Level | Sense 1 | Sense 2 | Sense 3 | Sense 4 | Sense 5
Leaf level | car, auto, automobile, machine, motorcar | car, railcar, railway car, railroad car | cable car, car | car, gondola | car, elevator car
Level 1 | motor vehicle, automotive vehicle | wheeled vehicle | compartment
Level 2 | self-propelled vehicle | vehicle | room
Level 3 | wheeled vehicle | conveyance, transport | area
Level 4 | vehicle | instrumentality, instrumentation | structure, construction
Level 5 | conveyance, transport | artifact, artefact
Level 6 | instrumentality, instrumentation | whole, unit
Level 7 | artifact, artefact | object, physical object
Level 8 | whole, unit | physical entity
Level 9 | object, physical object | entity
Level 10 | physical entity
Level 11 | entity

Matching over Level-Sense Synsets

Matching over the generated level-sense synsets of two words is performed by comparing each sense of the first word to each sense of the second word, one by one. If the first word has m senses and the second word has n senses, a total of m x n comparisons are carried out. (Table 2)

Table 2: Sense Comparison Results

Device (rows) / Car (columns) | Sense 1 | Sense 2 | Sense 3 | Sense 4 | Sense 5
Sense 1 | ("instrumentality, instrumentation", 6, 1) | ("instrumentality, instrumentation", 4, 1) | ("artifact, artefact", 5, 2)
Sense 2 | ("entity", 11, 5) | ("entity", 9, 5)
Sense 3 | ("entity", 11, 11) | ("entity", 9, 11)
Sense 4 | ("artifact, artefact", 7, 3) | ("artifact, artefact", 5, 3)
Sense 5 | ("artifact, artefact", 7, 4) | ("artifact, artefact", 5, 4)

Matching Degree Calculation

Matching degree calculation is based on the positions of the elements in the sense comparison tree (Figure 2).
If the terms being compared are close to each other in the tree, they have a higher similarity value; if they are far from each other, they have a lower similarity value. (Figure 3)

Figure 3: Matching Degree Formulas

Table 3: Normalized Similarities of Parameters

The proposed approach is compared with two widely used similarity methods: a substring-distance-based method and an edit-distance-based method. (Table 4)

Table 4: Similarity Results for Car Ontology
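The m x n sense comparison described under Matching over Level-Sense Synsets can be sketched as follows. This is a hedged illustration rather than the authors' code. It assumes that each sense is a derivation chain ordered from the leaf upward, and that a comparison result is the first shared term together with its level in each chain, which is how the tuples in Table 2 read. The chain used for device and the names `compare_senses` and `compare_words` are assumptions made for the example; the car chains are shortened to the levels listed in Table 1.

```python
from typing import Dict, List, Optional, Tuple

Chain = List[str]                       # one sense, leaf level first
Match = Optional[Tuple[str, int, int]]  # (shared term, level in a, level in b)


def compare_senses(a: Chain, b: Chain) -> Match:
    """Return the lowest term of chain a that also occurs in chain b."""
    levels_b: Dict[str, int] = {}
    for level, term in enumerate(b):
        levels_b.setdefault(term, level)  # keep the lowest level of each term
    for level_a, term in enumerate(a):
        if term in levels_b:
            return (term, level_a, levels_b[term])
    return None  # the two senses share no term


def compare_words(senses_a: List[Chain], senses_b: List[Chain]) -> List[List[Match]]:
    """Compare every sense of the first word with every sense of the second."""
    return [[compare_senses(a, b) for b in senses_b] for a in senses_a]


car_sense_1 = [
    "car, auto, automobile, machine, motorcar",
    "motor vehicle, automotive vehicle",
    "self-propelled vehicle",
    "wheeled vehicle",
    "vehicle",
    "conveyance, transport",
    "instrumentality, instrumentation",
    "artifact, artefact",
]
car_sense_2 = [
    "car, railcar, railway car, railroad car",
    "wheeled vehicle",
    "vehicle",
    "conveyance, transport",
    "instrumentality, instrumentation",
    "artifact, artefact",
]
# Assumed chain for one sense of "device" (not given in the text).
device_sense_1 = [
    "device",
    "instrumentality, instrumentation",
    "artifact, artefact",
]

results = compare_words([car_sense_1, car_sense_2], [device_sense_1])
print(results[0][0])  # ('instrumentality, instrumentation', 6, 1)
print(results[1][0])  # ('instrumentality, instrumentation', 4, 1)
```

With m senses for the first word and n for the second, `compare_words` returns an m-by-n grid of tuples. The matching degree formulas (Figure 3) would then be applied to the levels stored in each tuple.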