Published by Lynn Lane. Modified over 9 years ago.
1
Advanced Software Engineering PROJECT November 2015
2
1. Spark on YARN and Mesos
- YARN/Mesos + Spark + Spark Streaming
- Mesos: fine-grained and coarse-grained modes
- Application: retrieve Wikipedia posts as a stream
- Demonstrate how YARN and Mesos help with resource scheduling
- Visualization through Ganglia
- Performance analysis and comparisons
- Max. 2 students
3
2. Sparkathon
- Build super-smart weather apps that harness the power of IBM Bluemix™ and Apache Spark™.
- http://sparkathon.devpost.com/
- Worldwide competition; submissions due by January 20, 2016 (5:00pm Eastern Time)
- Max. 4 students
- Make the shortlist and the project gets full marks!
4
3. US Road Network Analysis
- https://snap.stanford.edu/data/#road
- GraphX + MLlib + Spark Streaming + Tableau
- Think about what you can do ^^
- Community detection: J. Leskovec, K. Lang, A. Dasgupta, M. Mahoney. Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters. Internet Mathematics 6(1):29-123, 2009.
- Max. 3 students
5
4. Images on Internet
- http://www.sogou.com/labs/dl/p2.html
- Can you predict seasonal trends? News trends? Political trends? Or human behaviors?
- Data visualization (Tableau)
- Max. 3 students
6
5. Your Proposal!
- Your turn to choose one
- Technically sound
- Application oriented
- Talk with me next lecture!
7
Arrangement
- Nov. 23: let me know your choice in class
- Week 1-2: define your roles in the group, start literature research, and decide what to do
- Week 2-4: propose solutions (architectures, frameworks, open-source tools, steps)
- Week 5-8: implementation, performance analysis, and obtaining results
- Week 9: wrap up and spend a few days writing your report
- Jan. 16: project report to ase_bit@yahoo.com
- Jan. 18: project presentation
8
About Result Figures/Tables
- Each image:
  - Clear x-y labeling
  - At least 2 lines (one yours, one comparison)
  - Each line has at least 5 connected points
- Each table:
  - Clear x-y labeling
  - At least 2 algorithms (one yours, one comparison)
  - Each row/column has at least 5 fields
9
Report Outline
- Introduction: background, why this topic, brief explanation of what others have done
- Related Work: 10+ references (after 2012); describe what others have done and their advantages/disadvantages
- System Architecture
- Detailed Design: component design, algorithm design
- Simulation Results: figures/tables, discussions
- Conclusions and Future Work
- Written in English: +5!
10
Attention!
- Not just an engineering project
- Aims to train your research capability
  - What have others done? Are there existing problems? How can they be improved? What about performance?
- Performance analysis (10+ figures)
  - Yours: accuracy, throughput, delay, system bound
  - Comparisons: other algorithms, other systems
- OPEN SOURCE is the key!
- Grading is based on YOUR contributions!
11
IEEE Xplore: http://ieeexplore.ieee.org/
12
http://dl.acm.org
13
Social Network Analysis Advanced Software Engineering
14
Key Players: how to identify key/central nodes in a network
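As a rough illustration of what "key/central" can mean, the sketch below computes two common measures, degree centrality and closeness centrality, on a tiny undirected graph. The edge list and all function names are hypothetical and chosen only for this example; the slides do not say which measures or tools the course uses.

```python
from collections import deque

# Hypothetical toy network (undirected); not taken from the slides.
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("D", "E")]


def build_adjacency(edges):
    """Turn an undirected edge list into a node -> set-of-neighbours map."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


def bfs_distances(adj, source):
    """Hop counts from source to every reachable node (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness_centrality(adj):
    """(Number of reachable nodes) / (sum of hop counts to them)."""
    result = {}
    for node in adj:
        dist = bfs_distances(adj, node)
        total = sum(dist.values())
        result[node] = (len(dist) - 1) / total if total > 0 else 0.0
    return result


adj = build_adjacency(EDGES)
degree = degree_centrality(adj)
closeness = closeness_centrality(adj)
most_central = max(degree, key=degree.get)
```

In this toy graph, node A has the highest score on both measures, so it would be the "key player" under either definition. On real data the two rankings can disagree, which is one reason to report more than one measure.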
22
Cohesion: how to characterize a network's structure
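Two simple ways to put a number on cohesion are graph density (how many of the possible links exist) and the clustering coefficient (how often a node's neighbours are linked to each other). The sketch below is a minimal example under assumptions: the graph is a made-up four-node network, and these two measures are picked here for illustration, not because the slides name them.

```python
from itertools import combinations

# Hypothetical toy network: a triangle A-B-C with one extra node D attached to C.
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]


def build_adjacency(edges):
    """Turn an undirected edge list into a node -> set-of-neighbours map."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def density(adj):
    """Existing edges divided by the number of possible edges, n(n-1)/2."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0


def local_clustering(adj, node):
    """Share of a node's neighbour pairs that are themselves connected."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, node) for node in adj) / len(adj)


adj = build_adjacency(EDGES)
graph_density = density(adj)
avg_clustering = average_clustering(adj)
```

Here 4 of the 6 possible edges exist, so the density is 2/3. Node C has three neighbours but only one connected pair among them, so its local clustering is 1/3.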
28
Example
- Facebook: 5.8 million users (2009), average 5.73 degrees of separation, maximum 12 degrees
- Twitter: 5.2 billion relationships, average 4.67 degrees
- 50% of users are only 4 steps away
- Almost everyone is fewer than 5 steps away
- For any 1,500 random users, 3.435 steps
- Erdős number: collaborative distance through paper co-authoring
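The "degrees" and "steps" above are shortest-path lengths in the network, and an Erdős number is the same quantity measured from one fixed author. A minimal sketch, assuming an unweighted graph and breadth-first search; the co-authorship data and the names in it are placeholders invented for this example.

```python
from collections import deque

# Hypothetical co-authorship graph; every name except "Erdos" is a placeholder.
COAUTHORS = {
    "Erdos": ["Alice", "Bob"],
    "Alice": ["Erdos", "Carol"],
    "Bob": ["Erdos"],
    "Carol": ["Alice", "Dave"],
    "Dave": ["Carol"],
}


def separation_from(graph, source):
    """Hop counts from source to every reachable node (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_separation(graph):
    """Mean shortest-path length over all ordered pairs of connected nodes."""
    total = 0
    pairs = 0
    for source in graph:
        dist = separation_from(graph, source)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs if pairs else 0.0


# The "Erdos number" of each author is the hop count from the "Erdos" node.
erdos_number = separation_from(COAUTHORS, "Erdos")
avg_steps = average_separation(COAUTHORS)
```

Running one breadth-first search per node is fine for a toy graph but does not scale to networks with millions of users. Figures such as "for any 1,500 random users" suggest that large studies estimate the average from a random sample of nodes rather than from all pairs.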
29
Experiment: Forwarding Letters in US
30
Example: Social Evolution data set by MIT Media Lab
- 80 undergraduates with smart devices, moving around the campus
- Collects phone usage and student locations from October 2008 to June 2009
- Phone usage: 3.15 million records of Bluetooth scans, 3.63 million scans of WLAN access points, 61,100 call records, and 47,700 logged SMS events
- Students provide offline, self-reported answers about their health habits, diet and exercise, weight changes, and political opinions during the presidential election campaign
31
Contact graph: only links with more than 2,000 contacts between two students are shown. Bigger nodes indicate a higher betweenness centrality value for the corresponding participant. Thicker edges indicate a higher contact frequency between the connected nodes.
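The two steps behind such a figure, dropping weak links and then scoring nodes by betweenness, can be sketched as follows. This is only an illustration under assumptions: the contact counts and student IDs are invented, the 2,000 threshold is the one quoted above, and betweenness is computed with Brandes' algorithm on the unweighted filtered graph, which may differ from how the original figure was produced.

```python
from collections import deque

# Hypothetical contact counts per student pair; not from the MIT data set.
CONTACTS = {
    ("s1", "s2"): 5200,
    ("s2", "s3"): 3100,
    ("s3", "s4"): 2400,
    ("s1", "s3"): 900,   # below the threshold, will be dropped
    ("s4", "s5"): 150,   # below the threshold, will be dropped
}
THRESHOLD = 2000


def filter_contacts(contacts, threshold):
    """Keep only pairs with more than `threshold` contacts, as an adjacency map."""
    adj = {}
    for (u, v), count in contacts.items():
        if count > threshold:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    return adj


def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph (unnormalized)."""
    score = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack = []                          # nodes in order of discovery
        preds = {v: [] for v in adj}        # predecessors on shortest paths
        sigma = dict.fromkeys(adj, 0)       # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)     # dependency of s on each node
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints, so halve the totals.
    return {v: c / 2 for v, c in score.items()}


adj = filter_contacts(CONTACTS, THRESHOLD)
bc = betweenness(adj)
```

After filtering, the toy graph is the chain s1, s2, s3, s4, and s5 disappears because its only link is too weak. The two middle students each sit on two shortest paths, so they would be drawn as the bigger nodes; the surviving contact counts could then set the edge thickness.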