A Generalized Flow-Based Method for Analysis of Implicit Relationships on Wikipedia.

Slides:



Advertisements
Similar presentations
Abstract Shortest distance query is a fundamental operation in large-scale networks. Many existing methods in the literature take a landmark embedding.
Advertisements

Optimizing Cloud Resources for Delivering IPTV Services Through Virtualization.
Energy-Optimum Throughput and Carrier Sensing Rate in CSMA-Based Wireless Networks.
Annotating Search Results from Web Databases. Abstract An increasing number of databases have become web accessible through HTML form-based search interfaces.
Abstract Load balancing in the cloud computing environment has an important impact on the performance. Good load balancing makes cloud computing more.
A Secure Protocol for Spontaneous Wireless Ad Hoc Networks Creation.
Back-Pressure-Based Packet-by-Packet Adaptive Routing in Communication Networks.
Personalized QoS-Aware Web Service Recommendation and Visualization.
Discovering Emerging Topics in Social Streams via Link Anomaly Detection.
IP-Geolocation Mapping for Moderately Connected Internet Regions.
Crowdsourcing Predictors of Behavioral Outcomes. Abstract Generating models from large data sets—and deter¬mining which subsets of data to mine—is becoming.
Secure Encounter-based Mobile Social Networks: Requirements, Designs, and Tradeoffs.
Cross-Domain Privacy-Preserving Cooperative Firewall Optimization.
A Survey of Mobile Cloud Computing Application Models
Fast Nearest Neighbor Search with Keywords. Abstract Conventional spatial queries, such as range search and nearest neighbor retrieval, involve only conditions.
Understanding the External Links of Video Sharing Sites: Measurement and Analysis.
Security Evaluation of Pattern Classifiers under Attack.
A Framework for Mining Signatures from Event Sequences and Its Applications in Healthcare Data.
BestPeer++: A Peer-to-Peer Based Large-Scale Data Processing Platform.
Improving Network I/O Virtualization for Cloud Computing.
Mobile Relay Configuration in Data-Intensive Wireless Sensor Networks.
m-Privacy for Collaborative Data Publishing
Tweet Analysis for Real-Time Event Detection and Earthquake Reporting System Development.
A Fast Clustering-Based Feature Subset Selection Algorithm for High- Dimensional Data.
Optimal Client-Server Assignment for Internet Distributed Systems.
Protecting Sensitive Labels in Social Network Data Anonymization.
Identity-Based Secure Distributed Data Storage Schemes.
Enabling Dynamic Data and Indirect Mutual Trust for Cloud Computing Storage Systems.
LARS*: An Efficient and Scalable Location-Aware Recommender System.
Cooperative Caching for Efficient Data Access in Disruption Tolerant Networks.
Anonymization of Centralized and Distributed Social Networks by Sequential Clustering.
Accuracy-Constrained Privacy-Preserving Access Control Mechanism for Relational Data.
Identity-Based Distributed Provable Data Possession in Multi-Cloud Storage.
Content Sharing over Smartphone-Based Delay- Tolerant Networks.
A System for Denial-of- Service Attack Detection Based on Multivariate Correlation Analysis.
Modeling the Pairwise Key Predistribution Scheme in the Presence of Unreliable Links.
Privacy Preserving Delegated Access Control in Public Clouds.
Scalable Distributed Service Integrity Attestation for Software-as-a-Service Clouds.
Anomaly Detection via Online Over-Sampling Principal Component Analysis.
A Method for Mining Infrequent Causal Associations and Its Application in Finding Adverse Drug Reaction Signal Pairs.
Keyword Query Routing.
Document Clustering for Forensic Analysis: An Approach for Improving Computer Inspection.
A Highly Scalable Key Pre- Distribution Scheme for Wireless Sensor Networks.
Bandwidth Distributed Denial of Service: Attacks and Defenses.
Facilitating Document Annotation using Content and Querying Value.
Privacy Preserving Back- Propagation Neural Network Learning Made Practical with Cloud Computing.
Clustering Sentence-Level Text Using a Novel Fuzzy Relational Clustering Algorithm.
Two tales of privacy in online social networks. Abstract Privacy is one of the friction points that emerges when communications get mediated in Online.
Participatory Privacy: Enabling Privacy in Participatory Sensing
Preventing Private Information Inference Attacks on Social Networks.
Scalable Keyword Search on Large RDF Data. Abstract Keyword search is a useful tool for exploring large RDF datasets. Existing techniques either rely.
Abstract We propose two novel energy-aware routing algorithms for wireless ad hoc networks, called reliable minimum energy cost routing (RMECR) and reliable.
DCIM: Distributed Cache Invalidation Method for Maintaining Cache Consistency in Wireless Mobile Networks.
Supporting Privacy Protection in Personalized Web Search.
Twitsper: Tweeting Privately. Abstract Although online social networks provide some form of privacy controls to protect a user's shared content from other.
m-Privacy for Collaborative Data Publishing
A Scalable Two-Phase Top-Down Specialization Approach for Data Anonymization Using MapReduce on Cloud.
Multiparty Access Control for Online Social Networks : Model and Mechanisms.
A New Algorithm for Inferring User Search Goals with Feedback Sessions.
Data Mining with Big Data. Abstract Big Data concerns large-volume, complex, growing data sets with multiple, autonomous sources. With the fast development.
Securing Broker-Less Publish/Subscribe Systems Using Identity-Based Encryption.
Dealing With Concept Drifts in Process Mining. Abstract Although most business processes change over time, contemporary process mining techniques tend.
Privacy-Enhanced Web Service Composition. Abstract Data as a Service (DaaS) builds on service-oriented technologies to enable fast access to data resources.
Privacy-Preserving and Content-Protecting Location Based Queries.
Mona: Secure Multi-Owner Data Sharing for Dynamic Groups in the Cloud.
Whole Test Suite Generation. Abstract Not all bugs lead to program crashes, and not always is there a formal specification to check the correctness of.
Distributed Processing of Probabilistic Top-k Queries in Wireless Sensor Networks.
Facilitating Document Annotation Using Content and Querying Value.
Dynamic Query Forms for Database Queries. Abstract Modern scientific databases and web databases maintain large and heterogeneous data. These real-world.
Spatial Approximate String Search. Abstract This work deals with the approximate string search in large spatial databases. Specifically, we investigate.
Presentation transcript:

A Generalized Flow-Based Method for Analysis of Implicit Relationships on Wikipedia

Abstract We focus on measuring relationships between pairs of objects in Wikipedia whose pages can be regarded as individual objects. Two kinds of relationships between two objects exist: in Wikipedia, an explicit relationship is represented by a single link between the two pages for the objects, and an implicit relationship is represented by a link structure containing the two pages. Some of the previously proposed methods for measuring relationships are cohesion-based methods, which underestimate objects having high degrees, although such objects could be important in constituting relationships in Wikipedia. The other methods are inadequate for measuring implicit relationships because they use only one or two of the following three important factors: distance, connectivity, and cocitation. We propose a new method using a generalized maximum flow which reflects all the three factors and does not underestimate objects having high degree.

Abstract con… We confirm through experiments that our method can measure the strength of a relationship more appropriately than these previously proposed methods do. Another remarkable aspect of our method is mining elucidatory objects, that is, objects constituting a relationship. We explain that mining elucidatory objects would open a novel way to deeply understand a relationship.

Existing system Searching webpages containing a keyword has grown in this decade, while knowledge search has recently been researched to obtain knowledge of a single object and relationships between multiple objects, such as humans, places or events. Searching knowledge of objects using Wikipedia is one of the hottest topics in the field of knowledge search. In Wikipedia, the knowledge of an object is gathered in a single page updated constantly by a number of volunteers. Wikipedia also covers objects in a number of categories, such as people, science, geography, politic, and history. Therefore, searching Wikipedia is usually a better choice for a user to obtain knowledge of a single object than typical search engines.

Architecture Diagram

System Specification HARDWARE REQUIREMENTS Processor : Intel Pentium IV Ram : 512 MB Hard Disk : 80 GB HDD SOFTWARE REQUIREMENTS Operating System : Windows XP / Windows 7 FrontEnd : Java BackEnd : MySQL 5

C ONCLUSION We have proposed a new method of measuring the strength of a relationship between two objects on Wikipedia. By using a generalized maximum flow, the three representa¬tive concepts, distance, connectivity, and cocitation, can be reflected in our method. Furthermore, our method does not underestimate objects having high degrees.