Towards Maximum Independent Sets on Massive Graphs

Slides:



Advertisements
Similar presentations
Lower Bounds for Local Search by Quantum Arguments Scott Aaronson.
Advertisements

Copyright 2011, Data Mining Research Laboratory Fast Sparse Matrix-Vector Multiplication on GPUs: Implications for Graph Mining Xintian Yang, Srinivasan.
Bounded Conjunctive Queries Yang Cao 1,2, Wenfei Fan 1,2, Tianyu Wo 2, Wenyuan Yu 3 1 University of Edinburgh, 2 Beihang University, 3 Facebook Inc.
ICDE 2014 LinkSCAN*: Overlapping Community Detection Using the Link-Space Transformation Sungsu Lim †, Seungwoo Ryu ‡, Sejeong Kwon§, Kyomin Jung ¶, and.
Twig 2 Stack: Bottom-up Processing of Generalized-Tree-Pattern Queries over XML Documents Songting Chen, Hua-Gang Li *, Junichi Tatemura Wang-Pin Hsiung,
Minimizing Seed Set for Viral Marketing Cheng Long & Raymond Chi-Wing Wong Presented by: Cheng Long 20-August-2011.
ABSTRACT We consider the problem of computing information theoretic functions such as entropy on a data stream, using sublinear space. Our first result.
Efficient Autoscaling in the Cloud using Predictive Models for Workload Forecasting Roy, N., A. Dubey, and A. Gokhale 4th IEEE International Conference.
Department of Computer Science, University of Maryland, College Park, USA TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.:
Privacy-MaxEnt: Integrating Background Knowledge in Privacy Quantification Wenliang (Kevin) Du, Zhouxuan Teng, and Zutao Zhu. Department of Electrical.
OCFS: Optimal Orthogonal Centroid Feature Selection for Text Categorization Jun Yan, Ning Liu, Benyu Zhang, Shuicheng Yan, Zheng Chen, and Weiguo Fan et.
Presented By HaeJoon Lee Yanyan Shen, Beng Chin Ooi, Bogdan Marius Tudor National University of Singapore Wei Lu Renmin University Cang Chen Zhejiang University.
RESOURCES, TRADE-OFFS, AND LIMITATIONS Group 5 8/27/2014.
Towards Robust Indexing for Ranked Queries Dong Xin, Chen Chen, Jiawei Han Department of Computer Science University of Illinois at Urbana-Champaign VLDB.
1 Wenguang WangRichard B. Bunt Department of Computer Science University of Saskatchewan November 14, 2000 Simulating DB2 Buffer Pool Management.
Mining High Utility Itemset in Big Data
Analysis of algorithms Analysis of algorithms is the branch of computer science that studies the performance of algorithms, especially their run time.
A User-Lever Concurrency Manager Hongsheng Lu & Kai Xiao.
Efficient Common Items Extraction from Multiple Sorted Lists Wei Lu, Cuitian Rong, Jinchuan Chen, Xiaoyong Du, Gabriel Fung, Xiaofang Zhou Renmin University.
1 A New Calculational Law for Combinatorial Optimization Problems Akimasa Morihata PhD student of IPL (Takeichi/Hu Lab.), the University of Tokyo IFIP.
Multivariate Dyadic Regression Trees for Sparse Learning Problems Xi Chen Machine Learning Department Carnegie Mellon University (joint work with Han Liu)
Maximum Independent Set on Massive Graphs Supervisor Prof. Lu Special thanks to Hua.
Exploit of Online Social Networks with Community-Based Graph Semi-Supervised Learning Mingzhen Mo and Irwin King Department of Computer Science and Engineering.
A User Experience-based Cloud Service Redeployment Mechanism KANG Yu Yu Kang, Yangfan Zhou, Zibin Zheng, and Michael R. Lyu {ykang,yfzhou,
Shanjiang Tang, Bu-Sung Lee, Bingsheng He, Haikun Liu School of Computer Engineering Nanyang Technological University Long-Term Resource Fairness Towards.
Mining Top-K Large Structural Patterns in a Massive Network Feida Zhu 1, Qiang Qu 2, David Lo 1, Xifeng Yan 3, Jiawei Han 4, and Philip S. Yu 5 1 Singapore.
Two Dimension Measures: A New Algorithimic Method for Solving NP-Hard Problems Yang Liu.
Equivalence Between Priority Queues and Sorting in External Memory
Optimal Relay Placement for Indoor Sensor Networks Cuiyao Xue †, Yanmin Zhu †, Lei Ni †, Minglu Li †, Bo Li ‡ † Shanghai Jiao Tong University ‡ HK University.
Space for things we might want to put at the bottom of each slide. Part 6: Open Problems 1 Marianne Winslett 1,3, Xiaokui Xiao 2, Yin Yang 3, Zhenjie Zhang.
Object-to-Object Relationship Based Access Control: Model and Multi-Cloud Demonstration Tahmina Ahmed, Farhan Patwa and Ravi Sandhu Department of Computer.
BAHIR DAR UNIVERSITY Institute of technology Faculty of Computing Department of information technology Msc program Distributed Database Article Review.
Product Catalog Template -
A Collaborative Quality Ranking Framework for Cloud Components
Advanced Algorithms Analysis and Design
Yu Wang1, Gao Cong2, Guojie Song1, Kunqing Xie1
Nanyang Technological University
Towards Maximum Independent Sets on Massive Graphs
Heuristic & Approximation
Data Driven Resource Allocation for Distributed Learning
Near-Optimal Spectrum Allocation for Cognitive Radios: A Frequency-Time Auction Perspective Xinyu Wang Department of Electronic Engineering Shanghai.
Greedy & Heuristic algorithms in Influence Maximization
Optimizing Parallel Algorithms for All Pairs Similarity Search
Alexander Shekhovtsov and Václav Hlaváč
Abolfazl Asudeh Azade Nazi Nan Zhang Gautam DaS
Minimum Dominating Set Approximation in Graphs of Bounded Arboricity
FORA: Simple and Effective Approximate Single­-Source Personalized PageRank Sibo Wang, Renchi Yang, Xiaokui Xiao, Zhewei Wei, Yin Yang School of Information.
European Symposium on Algorithms – ESA
Kijung Shin1 Mohammad Hammoud1
ProbeSim: Scalable Single-Source and Top-k SimRank Computations on Dynamic Graphs Yu Liu , Bolong Zheng, Xiaodong He, Zhewei Wei, Xiaokui Xiao, Kai.
Distributed Representations of Subgraphs
Differential Privacy in Practice
Joining Massive High-Dimensional Datasets
Data Integration with Dependent Sources
The Importance of Communities for Learning to Influence
Bin Fu Department of Computer Science
Objective of This Course
Probably Approximately
CASE − Cognitive Agents for Social Environments
Memory Management Algorithms Huan Liu, Damon Mosk-Aoyama
Cost-effective Outbreak Detection in Networks
On the Designing of Popular Packages
NP-Completeness Reference: Computers and Intractability: A Guide to the Theory of NP-Completeness by Garey and Johnson, W.H. Freeman and Company, 1979.
A Dynamic System Analysis of Simultaneous Recurrent Neural Network
Minwise Hashing and Efficient Search
Mingzhen Mo and Irwin King
Clustering.
Submodular Maximization with Cardinality Constraints
PRSim: Sublinear Time SimRank Computation on Large Power-Law Graphs.
PRSim: Sublinear Time SimRank Computation on Large Power-Law Graphs
Presentation transcript:

Towards Maximum Independent Sets on Massive Graphs Yu Liu1, Jiaheng Lu2, Hua Yang1, Xiaokui Xiao3, Zhewei Wei1 1 DEKE, MOE and School of Information, Renmin University of China 2 Department of Computer Science, University of Helsinki, Finland 3 School of Computer Engineering, Nanyang Technological University, Singapore Contact: zhewei@ruc.edu.cn Implementation Details Motivation and Background Hardness of computing maximum independent set: NP-hard and APX-hard. Existing internal and external memory algorithms either don’t scale well or have no theoretical guarantee. It’s hopeful to use massive graph properties to make computing a near-optimal independent set much easier in practice. Experiments and Conclusion On Synthetic Graphs by ACL Model[Aiello et al STOC00]: Computation Models and Problem Statement Real-world Datasets: Power Law Random Graph Model -ACL Model[Aiello et al STOC00] (Semi-)External Memory Model Our Goals: -Memory budget: c|V|<=M << |G| -Low CPU time and I/O complexity -Find near-optimal independent set -Have non-trivial theoretical bounds Disk Memory Sampling-based algorithm Observations Our greedy algorithm is simple and effective. One-K-Swap and Two-K-Swap improve independent set size to near-optimal, with limited memory cost and acceptable time cost. Though Greedy[Halldórsson et al STOC94] also gives near-optimal results for most power law graphs, it cannot scale well on large graphs. Our algorithms outperform previous external algorithms, both in theory and in practice. Semi-external Memory Greedy Algorithm Dataset Greedy[94] Zeh[02] + One-K-Swap + Two-K-Swap Semi-Greedy S-Greedy OPT Astroph 17110 15275 16625 16814 15019 16054 16572 =19106 DBLP 260992 242521 260715 260961 260886 261003 261007 =261008 Youtube 880873 823821 879078 880455 878459 880642 880835 =880882 Patent 2073214 1711789 2018537 2047497 2032599 2070806 2078989 <2320446 Blog 2116668 1855824 2094057 2109767 2096910 2117377 2121133 <2216727 Citeseerx 5750799 5307498 5719705 5737953 5731026 5747431 5749396 <5765563 Uniprot 6948528 6938348 6947851 6948149 6942879 6947397 6948048 <6949108 Facebook N/A 18893989 57269875 57986375 58226290 58232256 58232269 <58232354 Twitter 36072163 46978395 48059663 48121173 48742356 48742573 <52335929 ClueWeb12 499444213 703485927 725810643 606465512 723673169 729594728 <816673210 One-K-Swap Algorithm Independent Set Size by Various Algorithms Time and Memory Cost by Various Algorithms Conclusions We develop three semi-external algorithms to find near-optimal independent set on massive graphs, all satisfying -Low memory cost -Low time and I/O complexity -Near-optimal in theory and in practice -Easy to implement We give non-trivial theoretical guarantees for our proposed algorithms, which proves to be near-optimal. Experiments show that our algorithms have better performance and bounds than existing external algorithms. Two-K-Swap Algorithm