A Geographic Knowledge Base for Semantic Web Applications
Marcirio Silveira Chaves, Mário J. Silva, Bruno Martins
20th Brazilian Symposium on Databases - SBBD 2005, Uberlândia - MG
Linguateca
Motivation/Context
GKB - Geographic Knowledge Base
–Geographic
–Network
Information exported as ontologies
Geographic-aware Semantic Web applications
GREASE - Geographic Reasoning for Search Engines
Presentation Structure
–Conceptual Design of GKB
–Knowledge Integration
–Using Geographic Knowledge in GKB
–GKB as an Ontology
–Statistics of the Ontologies Created
–Applications using GKB
–Final Remarks
Information Sources used by GKB
Geo-Administrative and Geo-Physical Domain
–Administrative
–Postal
–Gazetteers
–Wikipedia
Network Domain
–FCCN
Web domains
Web sites
Architecture of GKB
[Figure: GKB architecture diagram]
Feature concept in GKB
A feature is a meaningful object in the selected domain of discourse [ISO19109].
Examples: countries, cities and localities
Conceptual Design of GKB
[Figure: GKB meta-model]
Knowledge Integration in GKB
GKB hierarchy from different information sources
Algorithm:
–It searches for the lowest common feature types in both hierarchies
–If such types exist, it identifies the common instances between the hierarchies
–Once the common instances are identified, it goes up each hierarchy and searches for the lowest common ancestor
–It checks the distance (in number of partOf relationships) between the common instances of the feature types and their ancestors. The ancestor with the smallest distance to the common instances is merged, through a partOf relationship, with the ancestor in the other hierarchy.
The existing relationships in both hierarchies are maintained.
Knowledge Integration in GKB
GKB hierarchy from different information sources
H1
–NUT2: Norte
–NUT3: Grande Porto, Tâmega
–MUNICIPALITY: Matosinhos, Vila Nova de Gaia, Penafiel
H2
–DISTRITO: Porto
–Matosinhos, Vila Nova de Gaia, Penafiel
Knowledge Integration in GKB
Merged Hierarchy
[Figure: merged hierarchy containing Norte, Grande Porto, Tâmega, Penafiel, Matosinhos and Vila Nova de Gaia]
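The integration steps can be sketched in Python as follows. This is a minimal illustration, not the GKB implementation: hierarchies are assumed to be dictionaries mapping each feature to the set of features it is partOf, instances are matched by name only (the feature-type check is omitted), and the assignment of municipalities to NUT3 regions follows the actual administrative geography rather than anything spelled out on the slides. The function names are hypothetical.

```python
from collections import deque

def ancestors(part_of, node):
    """Map each ancestor of node to its distance in partOf steps (breadth-first)."""
    dist = {}
    queue = deque([(node, 0)])
    while queue:
        current, d = queue.popleft()
        for parent in part_of.get(current, ()):
            if parent not in dist:
                dist[parent] = d + 1
                queue.append((parent, d + 1))
    return dist

def lowest_common_ancestor(part_of, instances):
    """Closest ancestor shared by all instances, with its largest distance to them."""
    tables = [ancestors(part_of, i) for i in instances]
    shared = set(tables[0]).intersection(*tables[1:])
    if not shared:
        return None
    depth = lambda a: max(t[a] for t in tables)
    best = min(shared, key=lambda a: (depth(a), a))
    return best, depth(best)

def merge_hierarchies(h1, h2):
    """Union both child -> parents maps and link the two lowest common ancestors."""
    merged = {}
    for hierarchy in (h1, h2):
        for child, parents in hierarchy.items():
            merged.setdefault(child, set()).update(parents)
    common = sorted(set(h1) & set(h2))  # instances present in both hierarchies
    if not common:
        return merged
    lca1 = lowest_common_ancestor(h1, common)
    lca2 = lowest_common_ancestor(h2, common)
    if lca1 is None or lca2 is None or lca1[0] == lca2[0]:
        return merged
    # The ancestor closer to the common instances becomes partOf the other one.
    # On a tie the h1 ancestor is treated as the closer one (an arbitrary choice;
    # the slides do not cover that case).
    (lower, _), (upper, _) = sorted((lca1, lca2), key=lambda pair: pair[1])
    merged.setdefault(lower, set()).add(upper)
    return merged

# H1: NUT2 > NUT3 > municipality (parent assignment is an assumption)
h1 = {
    "Grande Porto": {"Norte"},
    "Tâmega": {"Norte"},
    "Matosinhos": {"Grande Porto"},
    "Vila Nova de Gaia": {"Grande Porto"},
    "Penafiel": {"Tâmega"},
}
# H2: distrito > municipality
h2 = {
    "Matosinhos": {"Porto"},
    "Vila Nova de Gaia": {"Porto"},
    "Penafiel": {"Porto"},
}
merged = merge_hierarchies(h1, h2)
```

Under this reading, the distrito Porto (one partOf step from the common municipalities) is linked as partOf Norte (two steps away), and all original relationships are kept.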
Using Geographic Knowledge in GKB
Geographic scopes
–Lisboa (municipality)
Rules
New relationships and knowledge
Description Logics (DLs)
Geo domain
–Names composed of multiple words are represented in several alternative forms
Network domain
–Site names in URLs are decomposed into their corresponding domain parts
Using Geographic Knowledge in GKB
ABox in DLs for the:
–municipality of Santiago do Cacém
geoFeatureName(270,“santiagodocacem”).
geoFeatureName(270,“santiagocacem”).
geoFeatureName(270,“santiago-do-cacem”).
geoFeatureName(270,“santiago-cacem”).
geoFeatureType(270,“CON”).
–web site:
netSiteSubDomain(33684,“www”).
netSitePrefix(33684,“cm”).
netSiteDomainToken(33684,“santiago-do-cacem”).
netSiteTLD(33684,“pt”).
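The name variants and the site decomposition recorded in these facts could be produced along the following lines. This is only a sketch under stated assumptions: the stopword list, the function names, and the host name `www.cm-santiago-do-cacem.pt` (reassembled here from the four site facts) are illustrative and do not come from the slides.

```python
import unicodedata

# Assumption: Portuguese connectives that may be dropped from a place name.
STOPWORDS = {"de", "do", "da", "dos", "das"}

def strip_accents(text):
    """Remove combining marks after canonical decomposition (é -> e)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if unicodedata.category(c) != "Mn")

def name_variants(name):
    """Alternative spellings: with or without connectives, joined by '' or '-'."""
    words = strip_accents(name).lower().split()
    content = [w for w in words if w not in STOPWORDS]
    return {sep.join(ws) for ws in (words, content) for sep in ("", "-")}

def decompose_site(host):
    """Split a host name into sub-domain, prefix, domain token and TLD."""
    labels = host.lower().split(".")
    subdomain, domain, tld = ".".join(labels[:-2]), labels[-2], labels[-1]
    if "-" in domain:
        prefix, _, token = domain.partition("-")
    else:
        prefix, token = "", domain  # no prefix to separate
    return {
        "netSiteSubDomain": subdomain,
        "netSitePrefix": prefix,
        "netSiteDomainToken": token,
        "netSiteTLD": tld,
    }

variants = name_variants("Santiago do Cacém")
site = decompose_site("www.cm-santiago-do-cacem.pt")
```

With these inputs, `variants` contains the four spellings listed for feature 270, and `site` carries the same four values as the facts for site 33684.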
Using Geographic Knowledge in GKB
Terminology Description (TBox in DLs)
–Municipalities
hasScope(idN,idG) ← netSiteDomainToken(idN,X) ∧ (netSitePrefix(idN,“cm”) ∨ netSitePrefix(idN,“mun”)) ∧ geoFeatureType(idG,“CON”) ∧ geoFeatureName(idG,X).
Using Geographic Knowledge in GKB
Ex.:
hasScope(idN,idG) ← netSiteDomainToken(idN,X) ∧ (netSitePrefix(idN,“cm”) ∨ netSitePrefix(idN,“mun”)) ∧ geoFeatureType(idG,“CON”) ∧ geoFeatureName(idG,X).
netSiteDomainToken(33684, “santiago-do-cacem”).
netSitePrefix(33684, “cm”).
geoFeatureType(270, “CON”).
geoFeatureName(270, “santiago-do-cacem”).
New knowledge:
hasScope(33684, 270).
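The derivation in this example can be reproduced with a small Python sketch. It assumes facts are stored as (predicate, identifier, value) triples and reads the two netSitePrefix conditions as alternatives; the function name and the data layout are hypothetical, and the code is not how GKB itself evaluates rules.

```python
MUNICIPALITY_PREFIXES = {"cm", "mun"}  # prefixes named in the rule

def derive_has_scope(facts):
    """Return the (site id, feature id) pairs implied by the municipality rule."""
    tokens = {(i, v) for p, i, v in facts if p == "netSiteDomainToken"}
    prefixed_sites = {
        i for p, i, v in facts
        if p == "netSitePrefix" and v in MUNICIPALITY_PREFIXES
    }
    municipalities = {
        i for p, i, v in facts if p == "geoFeatureType" and v == "CON"
    }
    # Index municipality names so each domain token is looked up directly.
    features_by_name = {}
    for p, i, v in facts:
        if p == "geoFeatureName" and i in municipalities:
            features_by_name.setdefault(v, set()).add(i)
    return {
        (site, feature)
        for site, token in tokens
        if site in prefixed_sites
        for feature in features_by_name.get(token, ())
    }

facts = {
    ("netSiteDomainToken", 33684, "santiago-do-cacem"),
    ("netSitePrefix", 33684, "cm"),
    ("geoFeatureType", 270, "CON"),
    ("geoFeatureName", 270, "santiago-do-cacem"),
}
scopes = derive_has_scope(facts)
```

For the four facts above, `scopes` holds the single pair (33684, 270), matching the new knowledge hasScope(33684, 270).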
Using Geographic Knowledge in GKB
Scopes assigned by GKB rules to sites in Portugal
Site type | # of sites | # of matches
distritos | 33 | 17 (52%)
municipalities | | (90%)
freguesias | | (41%)
basic schools | | (6%)
training centers | 152 | 55 (36%)
high schools | | (26%)
Scopes are extended to the web pages under each site of the matching subdomains.
GKB as an Ontology
[Figure: ontology excerpt from Geo-Net-PT01 showing the entry 238, Porto]
Statistics of the Ontologies Created
Statistic | Portugal | World
# of features | 418,065 | 12,293
# of relationships | 419,867 | 12,258
# of part-of relationships | 418,340 (99.83%) | 12,245 (99.89%)
# of equivalence relationships | 395 (0.09%) | 2,501 (20.40%)
# of adjacency relationships | 1,132 (0.27%) | 13 (0.10%)
Avg. broader features per feature | |
Avg. narrower features per feature | |
Avg. equivalent features per feature with equivalent | |
Avg. adjacent features per feature with adjacent | |
# of features without ancestors | 3 (0.00%) | 1 (0.00%)
# of features without descendants | 374,349 (89.54%) | 12,045 (97.98%)
# of features without equivalent | 417,867 (99.95%) | 11,819 (96.14%)
# of features without adjacent | 417,739 (99.92%) | 12,291 (99.99%)
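To make the meaning of these counts concrete, the sketch below computes a few of them for a toy part-of graph. It assumes relationships are given as (narrower, broader) pairs and that the average number of broader features is taken over all features; both the function name and these definitions are assumptions, since the slides do not state how the figures were calculated.

```python
def ontology_stats(features, part_of):
    """Count part-of statistics for (narrower, broader) relationship pairs."""
    broader = {f: set() for f in features}
    narrower = {f: set() for f in features}
    for child, parent in part_of:
        broader[child].add(parent)
        narrower[parent].add(child)
    total = len(features)
    return {
        "features": total,
        "part_of": len(part_of),
        "avg_broader": sum(len(b) for b in broader.values()) / total,
        "without_ancestors": sum(1 for b in broader.values() if not b),
        "without_descendants": sum(1 for n in narrower.values() if not n),
    }

# Toy data: a small slice of a NUT2 > NUT3 > municipality hierarchy.
features = ["Norte", "Grande Porto", "Tâmega",
            "Matosinhos", "Vila Nova de Gaia", "Penafiel"]
part_of = [
    ("Grande Porto", "Norte"),
    ("Tâmega", "Norte"),
    ("Matosinhos", "Grande Porto"),
    ("Vila Nova de Gaia", "Grande Porto"),
    ("Penafiel", "Tâmega"),
]
stats = ontology_stats(features, part_of)
```

In the toy graph only Norte has no ancestors, and the three municipalities have no descendants, which mirrors the pattern in the table where most features are leaves.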
Applications using GKB
–NERC tool for recognizing geographical references in text
–Classification tool for assigning documents to a corresponding geographical scope
–Information retrieval interface for geographical queries
Final Remarks
A domain-independent model for storing geographic and network knowledge
Sharing of the collected knowledge as formal ontologies
Geo-Net-PT01: the first public geographic ontology of Portugal
Future work
–Augmenting the knowledge in GKB with geographic entities extracted from the texts of the Portuguese Web