Advanced Software Engineering PROJECT. 1. MapReduce Join (2 students): focused on performance analysis of different implementations of join processors.


Advanced Software Engineering PROJECT

1. MapReduce Join (2 students)
- Focus on performance analysis of different implementations of join processors in MapReduce.
  - Homogenization: tag each record with information about its data source in the map phase, then do the join in the reduce phase.
  - Map-Reduce-Merge: a new primitive called merge is added to process the join separately.
  - Other implementations: the map-reduce execution plan for joins generated by Hive.
- Generate 10+ figures/tables for comparisons.
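The homogenization approach can be sketched in a few lines of plain Python. This is a minimal single-process simulation, not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `reduce_side_join`), the source tags "L" and "R", and the toy tables are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product


def map_phase(records, source_tag, key_index=0):
    """Homogenize: emit (join_key, (source_tag, record)) for every input record."""
    for record in records:
        yield record[key_index], (source_tag, record)


def shuffle(mapped_pairs):
    """Group tagged records by join key, as the framework's shuffle step would."""
    groups = defaultdict(list)
    for key, tagged in mapped_pairs:
        groups[key].append(tagged)
    return groups


def reduce_phase(key, tagged_records, left_tag, right_tag):
    """Split one key's records by source tag, then emit their cross product."""
    left = [rec for tag, rec in tagged_records if tag == left_tag]
    right = [rec for tag, rec in tagged_records if tag == right_tag]
    for l_rec, r_rec in product(left, right):
        # Drop the join key (assumed to be field 0) from both sides of the output.
        yield key, l_rec[1:], r_rec[1:]


def reduce_side_join(left_records, right_records):
    """Inner equi-join of two tables on their first field."""
    mapped = list(map_phase(left_records, "L")) + list(map_phase(right_records, "R"))
    joined = []
    for key, tagged in sorted(shuffle(mapped).items()):
        joined.extend(reduce_phase(key, tagged, "L", "R"))
    return joined


# Hypothetical toy tables: (user_id, name) and (user_id, item).
users = [(1, "alice"), (2, "bob"), (3, "carol")]
orders = [(1, "book"), (1, "pen"), (3, "lamp"), (4, "desk")]
result = reduce_side_join(users, orders)
```

Keys that appear in only one table (2 and 4 here) produce no output, so the sketch behaves as an inner join. All records for a key pass through one reducer call, which is why skewed keys are a natural thing to measure in the performance comparison.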

2. Social Network Structure Analysis (3-4 students)
- Learn existing classification and clustering algorithms
- Use both Google+ and Twitter social circle data
- Build a distributed computing platform on M/R or Spark
- Make use of Mahout/MLlib tools for data analysis, to discover the unique characteristics of each social network
- Generate 10+ figures/tables for comparisons.
- Bonus: compare M/R and Spark
- Never use off-the-shelf software!!!

3. Distributed Learning-to-Rank Systems (3-4 students)
- Learn existing Pointwise, Pairwise, and Listwise learning-to-rank algorithms
- Use Microsoft Learning to Rank Datasets
- Build a distributed computing platform on either M/R, Storm, or Spark
- Implement at least 3 algorithms
- Generate 10+ figures/tables for comparisons.
- Bonus: compare M/R and Spark
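To make the three families concrete, the sketch below computes one representative loss of each kind for a single query. The choices are assumptions, not prescribed by the project: squared error stands in for pointwise, a RankNet-style logistic loss for pairwise, and a ListNet-style top-one cross entropy for listwise. The function names and toy scores are hypothetical.

```python
import math


def pointwise_loss(scores, labels):
    """Mean squared error between each document's score and its relevance label."""
    return sum((s - y) ** 2 for s, y in zip(scores, labels)) / len(scores)


def pairwise_loss(scores, labels):
    """Logistic loss averaged over pairs (i, j) where document i is more relevant than j."""
    losses = [
        math.log1p(math.exp(-(scores[i] - scores[j])))
        for i in range(len(scores))
        for j in range(len(scores))
        if labels[i] > labels[j]
    ]
    return sum(losses) / len(losses) if losses else 0.0


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def listwise_loss(scores, labels):
    """Cross entropy between the top-one distributions induced by labels and by scores."""
    target = softmax(labels)
    predicted = softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


# One query with three documents; higher label means more relevant.
labels = [2.0, 1.0, 0.0]
good_scores = [2.0, 1.0, 0.0]  # ranks the documents in the right order
bad_scores = [0.0, 1.0, 2.0]   # ranks them in reverse
```

The pointwise loss looks at each document in isolation, the pairwise loss only at score differences within a pair, and the listwise loss at the whole ranked list at once. That difference is what the comparison figures would be expected to expose.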

Mechanism
- Working in groups: 2 or 3-4 students, with clear roles
- Send me by this Friday (Dec 19): team leader, team members, topic
- Deadline: 16 Jan 2015!
- Deliverable: project report in Chinese
  - Introduction (motivation, WHY?)
  - Your proposal (HOW?)
  - Performance Evaluation
  - Conclusion
- Presentation

Suggested Arrangement
- Week 1: Define your roles and start literature research
- Weeks 2 and 3: Propose solutions
- Weeks 4 and 5: Implement and obtain results
- Finally, spend a few days writing your report

Attention!!
- Not only an ENGINEERING project
- Train your research thinking
- What have others done? What are the research gaps?
- How to improve?
- Performance? Accuracy, throughput, latency, etc. Compare to existing approaches
- Make use of open-source frameworks
- What is YOUR CONTRIBUTION?

- IEEE Xplore:

Social Network Analysis Advanced Software Engineering

Key Players
- How to identify key/central nodes in a network

Cohesion
- How to characterize a network’s structure

Example
- Facebook: 5.8 million users (2009), average 5.73 degrees, max 12 degrees
- Twitter: 5.2 billion relationships, average 4.67 degrees
  - 50% of users are only 4 steps away
  - Almost everyone is fewer than 5 steps away
  - For any 1,500 random users, steps
- Erdős number: collaborative distance through paper co-authoring
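The "degrees" quoted above are shortest-path lengths between users. A minimal sketch of how the average and maximum can be measured with breadth-first search follows; it assumes an unweighted, undirected graph stored as adjacency lists, and the names `bfs_distances`, `separation_stats`, and the toy graph are hypothetical.

```python
from collections import deque


def bfs_distances(graph, source):
    """Return the hop count from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def separation_stats(graph):
    """Return (average, maximum) shortest-path length over all reachable ordered pairs."""
    total, count, longest = 0, 0, 0
    for source in graph:
        for target, d in bfs_distances(graph, source).items():
            if target != source:
                total += d
                count += 1
                longest = max(longest, d)
    average = total / count if count else 0.0
    return average, longest


# Toy chain of four users: a - b - c - d.
toy = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
average, maximum = separation_stats(toy)
```

Running one BFS per node costs O(V * (V + E)), which is not feasible at the scale of the networks above. Sampling a set of random source users, or distributing the searches on M/R or Spark, are the usual ways around that.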

Experiment: Forwarding Letters in the US

Example: Social Evolution data set by MIT Media Lab
- 80 undergraduates with smart devices, moving around the campus.
- Collects phone usage and student locations from October 2008 to June
- Phone usage:
  - 3.15 million records of Bluetooth scans
  - 3.63 million scans of WLAN access points
  - 61,100 call records
  - 47,700 logged SMS events
- Students provide offline, self-reported answers about their health habits, diet and exercise, weight changes, and political opinions during the presidential election campaign.

Contact graph: only links with more than 2,000 contacts between two students are shown. Bigger nodes indicate a higher betweenness centrality value for the corresponding participant. Thicker edges indicate a higher contact frequency between the connected nodes.
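A graph like this one can be reproduced in two steps: keep only the pairs above the contact threshold, then compute betweenness centrality on what remains. The sketch below uses Brandes' algorithm for unweighted, undirected graphs. The helper names, the `contact_counts` layout (one entry per student pair), and the toy counts are assumptions for illustration, not the data set's real format.

```python
from collections import deque


def build_contact_graph(contact_counts, min_contacts=2000):
    """Keep an undirected edge only where the pair has more than min_contacts contacts."""
    graph = {}
    for (u, v), count in contact_counts.items():
        graph.setdefault(u, [])
        graph.setdefault(v, [])
        if count > min_contacts:
            graph[u].append(v)
            graph[v].append(u)
    return graph


def betweenness_centrality(graph):
    """Unnormalized betweenness for an unweighted, undirected adjacency-list graph."""
    centrality = {node: 0.0 for node in graph}
    for source in graph:
        order = []                                   # nodes in BFS visiting order
        predecessors = {node: [] for node in graph}
        sigma = {node: 0 for node in graph}          # shortest-path counts from source
        sigma[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbor in graph[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
                if dist[neighbor] == dist[node] + 1:
                    sigma[neighbor] += sigma[node]
                    predecessors[neighbor].append(node)
        # Accumulate pair dependencies from the farthest nodes back to the source.
        delta = {node: 0.0 for node in graph}
        for node in reversed(order):
            for pred in predecessors[node]:
                delta[pred] += sigma[pred] / sigma[node] * (1 + delta[node])
            if node != source:
                centrality[node] += delta[node]
    # Every undirected pair was counted once from each endpoint.
    return {node: value / 2 for node, value in centrality.items()}


# Hypothetical contact counts between four students.
contact_counts = {
    ("s1", "s2"): 2500,
    ("s2", "s3"): 3100,
    ("s1", "s3"): 150,   # below the threshold, so this link is dropped
    ("s3", "s4"): 4000,
}
graph = build_contact_graph(contact_counts)
scores = betweenness_centrality(graph)
```

After filtering, the toy graph is the chain s1 - s2 - s3 - s4, so the two middle students carry all the shortest paths between the others and would be drawn as the bigger nodes.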