Published by Joel McCormick. Modified over 9 years ago.
1
Social Curation of Large Multimedia Collections on a Microsoft Azure Cloud
Dazhi Chong, Samuel Coppage, Xiangyi Gu, Harris Wu, Kurt Maly, Mohammad Zubair
maly@cs.odu.edu
Old Dominion University, Department of Computer Science
DH 2012, Hamburg, 19 July 2012
2
Outline
- Faceted Classification System and scalability issues
- Implementation and deployment on a cloud
- Evaluation and user studies
- Conclusions
3
Faceted Classification System and scalability issues
- Web-based application
- Allows users to collaboratively organize multimedia collections into a faceted classification
- Social application that must handle:
  - Many users
  - Varying network traffic levels
- Traditional on-premises deployment can't handle:
  - Increasing number of users
  - Numerous evolving classification schemas
  - Large document collections
4
Faceted Classification System and scalability issues
5
Faceted Classification System and scalability issues
- New features require even more resources:
  - Personal classification schemas
  - History feature: evolution of the classification over time
- Decision: move to a cloud platform
6
The click-and-drag classification screen
7
Global and personal (or local) schemas
8
Faceted Classification System and scalability issues
9
Microsoft Windows Azure vs. Amazon Elastic Compute
- Microsoft Windows Azure Cloud
  - Platform as a Service (PaaS) cloud
  - Hides the management and operational side from users
  - Lets developers focus on development and solving business problems
- Amazon Elastic Compute Cloud
  - Infrastructure as a Service (IaaS) cloud
  - Allows users to deploy new technologies and adopt new capabilities
10
Microsoft Windows Azure vs. Amazon Elastic Compute
- Both offer reliability and scalability
- Windows Azure is more suitable for applications with variable load or a short or unpredictable lifetime
- Azure was chosen because it offers the most managed environment
- The choice of either platform comes down to the best fit for the company, developers and users
11
Implementation and deployment on Azure cloud platform
- First step: conversion of Joomla 1.6.3 to work with Azure SQL
- Second step: conversion of the Faceted Classification System packages to Azure SQL (from MSQL)
- Third step: full configuration of the system
- Last step: configuration of the whole project and deployment
12
Implementation and deployment on Azure cloud platform
13
Implementation and deployment on Azure cloud platform
14
Design of the cloud-based web application
- Final design of the current deployment
- The web role can run 20 instances by default (more if needed)
- Azure manages load balancing (round-robin algorithm; performance and failover in beta) and seamlessly redirects users
- All data is now stored on Azure SQL
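The round-robin balancing named on this slide can be illustrated with a minimal sketch. This is not Azure's implementation; the `RoundRobinBalancer` class and the instance names are hypothetical and only show how requests rotate across web role instances.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical round-robin dispatcher; not Azure's actual load balancer."""

    def __init__(self, instances):
        instances = list(instances)
        if not instances:
            raise ValueError("at least one instance is required")
        # cycle() yields the instances in order, forever.
        self._rotation = cycle(instances)

    def route(self, request_id):
        # Each request goes to the next instance in a fixed rotation.
        return next(self._rotation), request_id


# Instance names are made up for illustration.
balancer = RoundRobinBalancer(["web_0", "web_1", "web_2"])
assignments = [balancer.route(i)[0] for i in range(5)]
print(assignments)
```

With three instances, the fourth request wraps around to the first instance again, so load is spread evenly regardless of which user sent the request.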
15
Design of the cloud-based web application
16
Advantages and disadvantages of deployment on the cloud platform
- Advantages
  - High availability, reliability and scalability
- Disadvantages
  - Azure SQL is a new product
  - Lacks features of the full MSSQL DB
  - No profiler
  - Import and export are rudimentary
17
Advantages and disadvantages of deployment on the cloud platform
- Biggest drawback: performance of the Microsoft SQL Driver for PHP
  - Measured query statements: no unusual delays
  - Fetching results with sqlsrv_fetch_array() and sqlsrv_fetch_object(): delays of up to 20 seconds in rendering web pages
- Deployment of a web application should consider all benefits and drawbacks
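The slide separates the cost of executing a statement from the cost of fetching its results. A minimal sketch of that kind of measurement follows. It uses Python's built-in sqlite3 module as a stand-in, not the Microsoft SQL Driver for PHP, and the `time_query_and_fetch` helper and the `items` table are hypothetical. Note that SQLite produces rows lazily, so the split between the two phases is only approximate here.

```python
import sqlite3
import time


def time_query_and_fetch(conn, sql):
    """Return (execute_seconds, fetch_seconds, row_count) for one query."""
    start = time.perf_counter()
    cursor = conn.execute(sql)      # statement execution only
    executed = time.perf_counter()
    rows = cursor.fetchall()        # result fetching, timed separately
    fetched = time.perf_counter()
    return executed - start, fetched - executed, len(rows)


# In-memory stand-in database with made-up data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO items (title) VALUES (?)",
    [(f"item {i}",) for i in range(1000)],
)

exec_s, fetch_s, row_count = time_query_and_fetch(conn, "SELECT id, title FROM items")
print(f"execute: {exec_s:.6f}s, fetch: {fetch_s:.6f}s, rows: {row_count}")
```

Timing the two phases separately is what makes it possible to attribute a slow page to the fetch calls instead of the query itself.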
18
Evaluation
- User studies with classes on information technologies (Spring and Fall 2011)
- Students had to develop personal facet schemas
- Personal schemas were merged into a global schema
19
Initial page with only a few facets
20
Page without and with user facets
21
Item detail screen without and with facets and tags
22
Merging of Personal Facets
Global:
- Good facet/category definition
- Useful for most users
- Optimized
- Wide coverage
Personal:
- Personal use
- May contain non-facet schemas
- Personal wording for facet/category/tag
- Narrow coverage
Approach: evaluate all the personal schemas, find the most widely used facets/categories/items, use similarity of concepts, and enrich/reconstruct the global schema.
23
Sample algorithm component (columns: Popularity, Description)
- New facet: (1) it does not exist in the global schema; (2) it is used in more than half of the personal schemas.
- New category: (1) it or a 'similar' category does not exist in the old global facet; (2) the personal facet containing the new global category is similar to the old global facet; (3) more than half of the users who have the ('similar') global facet have the new category under it.
"Similar": two entities are either WordNet-similar or structure-similar.
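The "New facet" rule on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: the `find_new_facets` function, the representation of a schema as a dict from facet name to categories, and the case-insensitive name matching are all assumptions, and the data is made up.

```python
from collections import Counter


def find_new_facets(global_schema, personal_schemas):
    """Return facets that are absent from the global schema and used in
    more than half of the personal schemas.

    Schemas are modelled as dicts mapping facet name -> list of categories;
    this representation and the case-insensitive matching are assumptions.
    """
    known = {name.lower() for name in global_schema}
    usage = Counter()
    for schema in personal_schemas:
        # Count each facet at most once per personal schema.
        for name in {facet.lower() for facet in schema}:
            usage[name] += 1
    threshold = len(personal_schemas) / 2
    return sorted(
        name for name, count in usage.items()
        if name not in known and count > threshold
    )


# Made-up data for illustration only.
global_schema = {"Event": ["Wreck"], "Location": ["Alabama"]}
personal_schemas = [
    {"Year": ["before 2000"], "Event": ["Crash"]},
    {"Year": ["after 2000"], "Quality": ["Good"]},
    {"Location": ["VA"]},
]
new_facets = find_new_facets(global_schema, personal_schemas)
print(new_facets)
```

Here "Year" appears in two of three personal schemas and is missing from the global schema, so it qualifies; "Quality" appears only once and does not. The "New category" rule would additionally need a similarity test between facets, which this sketch leaves out.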
24
Example-1:
Global schema (old):
- Event: Group action, Competition, Wreck
- Location: Alabama, Virginia
- Source: Newspaper, Internet
Personal schema:
- Space: VA, New York, Alabama
- Quality: Good, Bad
- Time: 1998, 2006, 2010
- Position: NY, Virginia
- Event: Activity, Crash, Happening
- Tom: OK, Not Ok
- Jason: Favorite, Dislike
- Year: before 2000, after 2000
25
Example-2:
New global schema:
- Event: Group action, Competition, Wreck
- Source: Newspaper, Internet
- Year: Before 2000, After 2000
- Location: Alabama, Virginia, New York
Similarity: S(year, time) = 0.5528, S(crash, wreck) = 1, S(New York, NY) = 1, S(Virginia, VA) = 1
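The similarity scores in this example suggest how a personal term could be matched to an existing global term. The sketch below is an illustration under stated assumptions: the scores are copied from the slide, but the lookup table, the `best_match` helper, and the 0.5 cutoff are hypothetical, since the slides do not state a threshold or how WordNet similarity is computed.

```python
# Similarity scores copied from the slide; unlisted pairs default to 0.0.
SIMILARITY = {
    ("year", "time"): 0.5528,
    ("crash", "wreck"): 1.0,
    ("new york", "ny"): 1.0,
    ("virginia", "va"): 1.0,
}

THRESHOLD = 0.5  # assumed cutoff; the slides do not state one


def similarity(a, b):
    """Symmetric, case-insensitive lookup in the score table."""
    a, b = a.lower(), b.lower()
    if a == b:
        return 1.0
    return SIMILARITY.get((a, b), SIMILARITY.get((b, a), 0.0))


def best_match(term, candidates, threshold=THRESHOLD):
    """Return the candidate most similar to term, or None if below threshold."""
    scored = [(similarity(term, c), c) for c in candidates]
    score, candidate = max(scored, default=(0.0, None))
    return candidate if score >= threshold else None


global_categories = ["Group action", "Competition", "Wreck"]
matches = {t: best_match(t, global_categories) for t in ["Crash", "Happening"]}
print(matches)
```

Under these assumptions "Crash" maps onto the existing category "Wreck", while "Happening" has no sufficiently similar global category and would be a candidate for the popularity rules instead.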
26
Conclusions
- A cloud can solve the scalability issues of:
  - compute-intensive features such as schema merging and history (schema evolution)
  - many simultaneous users
- Porting a complex application to the cloud is a daunting task, not one for the uninitiated