Analyzing OpenCourseWare usage by means of social tagging Julià Minguillón 1,2, Jordi Conesa 2 1 UOC UNESCO Chair in e-Learning 2 Universitat Oberta de.

Slides:



Advertisements
Similar presentations
Ontology Assessment – Proposed Framework and Methodology.
Advertisements

Language Technologies Reality and Promise in AKT Yorick Wilks and Fabio Ciravegna Department of Computer Science, University of Sheffield.
GMD German National Research Center for Information Technology Darmstadt University of Technology Perspectives and Priorities for Digital Libraries Research.
Introducing Intute: Social Sciences Your Guide to the Best of the Web.
Google Chrome & Search C Chapter 18. Objectives 1.Use Google Chrome to navigate the Word Wide Web. 2.Manage bookmarks for web pages. 3.Perform basic keyword.
The Web of data with meaning... By Michael Griffiths.
EDEN 2007 Naples, Italy LIFELONG LEARNING TEACHERS’ NEEDS IN VIRTUAL LEARNING ENVIRONMENTS Josep Maria Boneu 1, Maria Galofré 2, Julià Minguillón 2 1 Centre.
Yahoo! Research Bradley Horowitz VP Product Strategy Yahoo, Inc. August 2006.
Crosslingual Retrieval in an eLearning Environment Cristina Vertan, Kiril Simov, Petya Osenova, Lothar Lemnitzer, Alex Killing, Diane Evans, Paola Monachesi.
Physical Activity and Health OCW University of Murcia (Spain)
Adaptive Book: A Platform for teaching, learning and student modeling Ananda Gunawardena School of Computer Science Carnegie Mellon University.
Towards Semantic Web Mining Bettina Berndt Andreas Hotho Gerd Stumme.
ReQuest (Validating Semantic Searches) Norman Piedade de Noronha 16 th July, 2004.
Searching and Researching the World Wide: Emphasis on Christian Websites Developed from the book: Searching and Researching on the Internet and World Wide.
CS580: Building Web Based Information Systems Roger Alexander & Adele Howe The purpose of the course is to teach theory and practice underlying the construction.
Manchester, 2007 Adaptive learning paths for improving Lifelong Learning experiences Ana Elena Guerrero Roldán, Julià Minguillón Universitat Oberta de.
Domain Modelling the upper levels of the eframework Yvonne Howard Hilary Dexter David Millard Learning Societies LabDistributed Learning, University of.
Semantic Web Technologies Lecture # 2 Faculty of Computer Science, IBA.
Some facets of knowledge management in mathematics Wolfram Sperber (Zentralblatt Math) Patrick Ion (Math Reviews) Facets of Knowledge Organization A tribute.
Intute and Organic.Edunet Jackie Wickham ALLCU, Oxford, July 2008.
Web 2.0: Concepts and Applications 4 Organizing Information.
Learning Object Repositories: a learner centered perspective Julià Minguillón Universitat Oberta de Catalunya EdReNe – 4 th Strategic Seminar, March 24th.
INFOBALT, October 22, 2004, Vinius IST4Balt project information dissemination using web-based knowledge systems Zigmas Bigelis EU projects consultant Asociation.
1 © Netskills Quality Internet Training, University of Newcastle Metadata Explained © Netskills, Quality Internet Training.
How Do We Educate…
Stand on Top of the Shoulder of Giant , Year of Search!
Publishing and Interacting with Linked Data Roberto Garcia, Josep Maria Brunetti, Antonio López-Muzás, Juan Manuel Gimeno, Rosa Gil WIMS’11 Conference,
Free Access to E-Books and 'OER's at the Open University of Israel Yachin Epstein Center for Integrating Technology in Distance.
No Title, yet Hyunwoo Kim SNU IDB Lab. September 11, 2008.
Social What? Social Bookmarking! Liane Haslauer Greater Manchester Professional Development Center
WHAT IS A SEARCH ENGINE. Widescreen Presentation Proteus, Keeper of Knowledge. Proteus is synonymous with change and success.
Creating Collaborative Partnerships
Objective Understand concepts used to web-based digital media. Course Weight : 5%
Use of Hierarchical Keywords for Easy Data Management on HUBzero HUBbub Conference 2013 September 6 th, 2013 Gaurav Nanda, Jonathan Tan, Peter Auyeung,
What to Know: 9 Essential Things to Know About Web Searching Janet Eke Graduate School of Library and Information Science University of Illinois at Champaign-Urbana.
29-30 October, 2006, Estonia 1 IST4Balt Information analysis using social bookmarking and other tools IST4Balt Information analysis using social bookmarking.
In Search of GREAT Science Web Sites Lisa Bachman.
Knowledge Management: The On-To-Knowledge Project Hans Akkermans Free University Amsterdam VUA.
N NESSTAR: A Semantic Web Application for Statistical Data and Metadata Pasqualino “Titto” Assini Nesstar Ltd - UK.
SMX Madrid 2008 Uncovering the Algorithm A Peek Inside How Google Evaluates and Ranks Pages.
Masoud Makrehchi, PAMI, UW Learning Object Metadata Masoud Makrehchi PAMI University of Waterloo August 2004.
Student Edition: Gale Info Trac Database Lesson Grades 9-12 High School Student Edition: Gale Info Trac Database Lesson Grades 9-12 High School Anita Cellucci.
OWL Representing Information Using the Web Ontology Language.
Automatic Video Tagging using Content Redundancy Stefan Siersdorfer 1, Jose San Pedro 2, Mark Sanderson 2 1 L3S Research Center, Germany 2 University of.
Of 33 lecture 1: introduction. of 33 the semantic web vision today’s web (1) web content – for human consumption (no structural information) people search.
Adaptive Book: Teaching and Learning Environment for Programming Education Ananda Gunawardena & Victor Adamchik.
Semantic web Bootstrapping & Annotation Hassan Sayyadi Semantic web research laboratory Computer department Sharif university of.
From institutional repositories to personal collections of learning resources Julià Minguillón 1,2, Jordi Conesa 1 1 Computer Science, Multimedia and Telecommunication.
Authoring of Adaptive Hypermedia Dr. Alexandra Cristea
GoRelations: an Intuitive Query System for DBPedia Lushan Han and Tim Finin 15 November 2011
5/29/2001Y. D. Wu & M. Liu1 Content Management for Digital Library May 29, 2001.
Strategy implementation seminar for Open Universities belonging to the MORIL consortium MORIL under the OLCOS perspective Julià Minguillón Universitat.
1 Increasing the Impact of Open Education Stephen Carson, External Relations Director, MIT OpenCourseWare 22 June 2012.
Personalized Ontology for Web Search Personalization S. Sendhilkumar, T.V. Geetha Anna University, Chennai India 1st ACM Bangalore annual Compute conference,
Data mining in web applications
Search Engine Optimization
AFFORDABLE WEBSITE DESIGN SERVICES.  The different areas web designing services includes web graphic design, user interface designing, authoring and.
Search Engine Optimization
Using Social Bookmarking in the Classroom
Over 1,000 books, journals, videos and reference material
1.01- Understand Internet search tools and methods.
Objective % Explain concepts used to create websites.
EnTag Enhanced Tagging for Discovery Koraljka Golub, Jim Moon,
Data Management: Documentation & Metadata
Clustering Semantically Enhanced Web Search Results
CS 620 Class Presentation Using WordNet to Improve User Modelling in a Web Document Recommender System Using WordNet to Improve User Modelling in a Web.
Web Mining Department of Computer Science and Engg.
17th APAN Meetings & Joint Techs Workshop
WEB DESIGNING THROUGH HTML
Presentation transcript:

Analyzing OpenCourseWare usage by means of social tagging Julià Minguillón 1,2, Jordi Conesa 2 1 UOC UNESCO Chair in e-Learning 2 Universitat Oberta de Catalunya Barcelona, Spain CAL 2011, April, Manchester, UK

Table of contents Open Educational Resources Social tagging Exploratory analysis Preliminary results Discussion Conclusions Future research CAL 2011, April, Manchester, UK

Open Educational Resources Everybody is creating and publishing OERs Available through repositories, websites, … – MERLOT, … – OpenCourseWare But… – Poor metadata descriptions, flat structure – Wide degree of granularity – Who is (re)using them? CAL 2011, April, Manchester, UK

Searching at OCW Limited possibilities: – Filter by keyword – Language – Source (OCW site) No shared taxonomies between OCW sites Goal: improve users’ support when browsing and searching for OERs CAL 2011, April, Manchester, UK

Internet OERs OCW CAL 2011, April, Manchester, UK delicious

Social tagging Web 2.0 service (a.k.a. social bookmarking) Organize, describe and share URLs Triplets: URL, user, {tags} Delicious: – Largest service of its kind – Sustainability: Yahoo wants to close/sell delicious Simple poll: how many of you use delicious? CAL 2011, April, Manchester, UK

What / How do users tag? Top level web sites Specific links (i.e. courses) Web site:mit, ocw Content:math, course, pdf Usage:phd, ds106, toread If delicious is used for content organization, then tags become summarized descriptions CAL 2011, April, Manchester, UK

Basic idea What if we obtain “all” tags from “all” users sharing “all” the URLs for a given concept? Tags could reveal a hidden semantic structure used for organizing and describing content What does “OCW” mean for OCW users? CAL 2011, April, Manchester, UK

Methodology Retrieve “all” triplets for a given search term (opencourseware + ocw) Do some data cleansing Select the most important triplets Build a binary data set (N x T): – URL X tagged by user Y with tags T 1,…,T T Perform some exploratory data analysis Interpret results CAL 2011, April, Manchester, UK

Retrieving data from delicious scripting + wget + HTML parsing Problem: IP throttling Slow, “risky”, tricky process Search term: “opencourseware+OR+ocw” Before data cleansing and selection: – 2479 URLs – 50 pages of users’ tags per URL – 3 GB of information CAL 2011, April, Manchester, UK

Data cleansing Plurals, capital letters Misspelling: course, coruse Multilingualism: curso, curs, kurs, … Equivalent tags: math = mathematics Abbreviations: cs = compsci = computer science Compound terms: web = web20 Spammers URLs with no tags CAL 2011, April, Manchester, UK

Triplet selection Remove those users tagging less than M resources (spammers, occasional users) Remove those URLs tagged by less than N users (spammers, broken URLs) Keep only a fraction of tags (P) which are the most used (Pareto-like) M=5-10, N=5-10, P=80%-90% CAL 2011, April, Manchester, UK

Resulting data sets triplets (N): Before: 1456 URLs, users, tags After: 1297 URLs, users, 434 tags (T) Course subset: course URLs at MIT OCW – triplets – 351 URLs, 5689 users, 362 tags (120 if P=80%) Data set – subset (Non-course): – 946 URLs, users, 412 tags (116 if P=80%) CAL 2011, April, Manchester, UK

Most used tags Non-course data set – education – resource – free – learning – elearning – video – teaching – science – lessonplan – web20 – reference – technology – online – course – lecture Course data set – mit – education – opencourseware / ocw – course – programming – video – learning – lecture – math – science – python – algorithm – free – reference – resource CAL 2011, April, Manchester, UK

Exploratory data analysis Goal: analyze how tags are used R psych package Factor analysis: – F factors (eigenvalue ≥ 1, F << T) – Maximum likelihood – Varimax rotation – Only factors with weight ≥ 0.1 are considered (0.3) Problems: instability, too many parameters CAL 2011, April, Manchester, UK

Preliminary results 20.1% of variance explained (20 factors, 16.7%) Interesting factors (out of 50): – ML11: programming, algorithm, tutorial, howto – ML17: algebra, linear, math, linearalgebra – ML21: scheme, lisp, sicp, programming, book – ML29: psychology, neuroscience, brain, cognitive Using P=90% (361 tags) also generates: – ML31: politics, society – ML39: design, usability, ui, hci, interface CAL 2011, April, Manchester, UK

Discussion Tags related to web sites are too general… …but both data sets share ≈ 50% of tags General (i.e. most used) tags are not relevant for factoring purposes (mit, ocw, course, …) Factors group tags semantically related, mostly describing the domain knowledge P=80% seems to produce more stable results, but interesting factors are missed (i.e. HCI) CAL 2011, April, Manchester, UK

Conclusions OCW could take advantage of “advanced” users tagging its courses Non “tech” domains are underrepresented Data cleansing is critical Proposal: describe domain (keywords), type of resource (book, course) and format (pdf, video) Some parameters need fine-tuning More experimentation is needed! CAL 2011, April, Manchester, UK

Future research Automate and improve data cleansing – Use of Wordnet / DBpedia / ontologies Non-Negative Matrix Factorization + clustering Build recommendation systems: – Tags / Related resources Build reputation schemes: – Most expert users Pilot at CAL 2011, April, Manchester, UK

Thank you! Julià Minguillón CAL 2011, April, Manchester, UK