Saras Shareable Rich Media Learning Object Repositories and Management for e-Learning Chitra Dorai IBM T.J. Watson Research Center New York

Presentation transcript:

Saras Shareable Rich Media Learning Object Repositories and Management for e-Learning Chitra Dorai IBM T.J. Watson Research Center New York (Saras(wati), a Sanskrit word for flow of knowledge/Goddess of Learning)

Overview of e-Learning Content Management Research
– E-learning media semantic analysis for metadata generation
– SCORM and MPEG-7 conformant asset metadata model
– Search and browse client interfaces
– Multimodal narrative structure analysis for partitioning of instructional media
– Manage learning assets of various types
– Middleware for shareable learning object repositories
– Metadata model creation from XML schema
[Figure: system architecture. Labels: Text, Images; Audio, Video; Course catalogs, Student Assessments; Learning Authoring Tool; Learning Management System; LO ingest; E-Learning Media Analyzer; Metadata; SCORM / MPEG-7 Data Model; Content Manager; Asset Repository; Search & Browse Client]
[Figure: narrative structure hierarchy for video. Labels: Discussion Sections (DD): dialog, interviews, ...; Narration sections: On-screen narration with Direct Narration (DN) and Assistive Narration (AN), Voice Over with Uninterrupted Voice Over (UV) and Interrupted Voice Over (IV); Linkage Sections (LF): raw footage, text, ...]

Project Goals
– Develop SCORM support technologies
– Enable generic content repositories (CMv8 and DB2) to support standards-compliant e-learning and transform into shareable and interoperable learning object repositories
– Analyze instructional media for automated SCORM/MPEG-7 compliant metadata generation

E-Learning and Standards
– The Department of Defense (DoD) established the Advanced Distributed Learning (ADL) initiative in [...].
– ADL develops strategy for using learning and information technologies to modernize education and training on the Web, and to promote e-learning standardization.
– SCORM (Sharable Content Object Reference Model): ADL reference model for shareable learning content objects that enable interoperability, accessibility and reusability of Web-based learning content.
– Content Aggregation Model: LO Metadata, Content Packaging
– SCORM is built on many e-Learning standardization efforts: AICC, IMS, IEEE LOM (became a standard in 06/02), ARIADNE.

SCORM LOM Overview
– Nine learning object metadata categories from the IEEE LOM specification: General, Lifecycle, Meta-metadata, Technical, Educational, Rights, Relation, Annotation, and Classification
– IMS's XML binding specification for metadata representation
– Describe three content model components: Asset, Sharable Content Object (SCO), Content Aggregation
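The category list above can be sketched as a small XML record builder. This is only an illustration under stated assumptions: the lowercase tag spellings, the sub-element names (title, format), the helper build_lom_record, and the omission of XML namespaces are choices made for this sketch, not the normative IMS binding or the project's own data model.

```python
import xml.etree.ElementTree as ET

# The nine IEEE LOM categories named on the slide, as lowercase tag names.
# Tag spellings are illustrative assumptions, not the normative binding.
LOM_CATEGORIES = [
    "general", "lifecycle", "metametadata", "technical", "educational",
    "rights", "relation", "annotation", "classification",
]


def build_lom_record(fields):
    """Build a <lom> element with one child per supplied category.

    `fields` maps a category name to a dict of simple text sub-elements.
    Children are emitted in LOM category order; unknown categories are
    rejected so typos do not slip through silently.
    """
    unknown = set(fields) - set(LOM_CATEGORIES)
    if unknown:
        raise ValueError(f"not a LOM category: {sorted(unknown)}")
    root = ET.Element("lom")
    for category in LOM_CATEGORIES:
        if category not in fields:
            continue
        node = ET.SubElement(root, category)
        for name, text in fields[category].items():
            ET.SubElement(node, name).text = str(text)
    return root


# Hypothetical example values for one video asset.
record = build_lom_record({
    "technical": {"format": "video/mpeg"},
    "general": {"title": "Example training video segment"},
})
xml_text = ET.tostring(record, encoding="unicode")
```

A real SCORM package would add the required namespaces and vocabulary wrappers; the sketch only shows how the nine categories partition a record.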

Enabling Content Repositories for e-Learning
Objective: Develop middleware tools to enable content management products (IBM CM v8) and databases (DB2) for standards-based e-Learning archival and for supporting SCORM-compliant learning object metadata.
– Creation of a SCORM-compliant learning object metadata model on a repository
– Automated storage of learning objects and their metadata in the content repository
– Search and retrieval of learning objects based on their metadata

E-Learning Content Management with Content Manager

Meta-data Generation Pages

Automated Instructional Media Analysis
Objectives:
– Develop technologies for standards-based e-learning content tagging, supporting shareable and searchable learning object repositories with rich media.
– Rich instructional media analysis for automated extraction of learning objects and their metadata from media for content-based search and browse

Problem with the State of the Art
The user seeks semantic similarity, but the [multimedia] database can only provide similarity based on data processing.
Existing content annotation/management systems cannot ensure reliable content location and access:
– Fall far short of the expectations of users: the semantic gap
– Generic, low-level annotations that deal only with characterizing perceived content, not its meaning
– Lack of structure in content organization for non-linear navigation

Our Approach to Media Semantics Analysis
New Research Approach: Computational Media Aesthetics is the algorithmic study of visual and aural elements in media and associated analysis of the principles that underlie their manipulation in the creative art of clarifying and interpreting some event for an audience.
– The best semantic grid for media interpretation is that within which its creators work: derive meaning from the production grammar and aesthetic conventions used
– Create tools for understanding high-level semantic constructs in a domain by interpreting the data with its maker's eye, exploiting media production methods for their perceptual and interpretive guidance.
Focus Areas:
– Motion picture analysis for affect and story essence using film grammar (recognized with best paper awards)
– E-learning: multimodal algorithms to parse and structure audiovisual content in media for content distillation & nonlinear browsing
– Multigranular media narrative segmentation to generate & annotate reusable assets
Example 1 - Multimodal analysis for extracting hierarchy of narrative structures in education/training video
[Figure: Content Repository, Media Semantic Analyzer, Metadata, and the narrative structure hierarchy with classes DD, DN, AN, UV, IV, LF]
Example 2 - Titanic Movie Analysis for Tempo: tempo ebb and flow and associated story elements and events automatically deconstructed
[Figure: Tempo in Titanic]

Example: Narrative Structure Based Segmentation of Education and Training Videos
Problem Statement: Automatically structure instructional media through high-level semantics-based video partitioning and content tagging for effective segment search, access, and browse services in e-learning content management systems.
Joint work with Dinh Q. Phung and Svetha Venkatesh, Curtin University of Technology, W. Australia

Narrative Structures Hierarchy
– Narration sections
  – On-screen Narration: Direct Narration, Assistive Narration
  – Voice Over: Un-interrupted VO, Interrupted VO
– Discussion sections: dialog, interviews, …
– Linkage sections: raw footage, text, …

Narrative Structures Hierarchy: Discussion Sections
– Capture dialog, interviews, meeting sections.

Narrative Structures Hierarchy: On-Screen Narration
– Clear view of a narrator speaking in the scene.
– Dominated by the narrator's face and captured in a close-up.
– Interrupted presence of the narrator.

Narrative Structures Hierarchy: Voice Overs
The audio track is dominated by the voice of the narrator, but the narrator does not appear (no faces).
– Un-interrupted VO: smooth and continuous
– Interrupted VO: interrupted

Narrative Structures Hierarchy: Linkage Sections
– Raw footage, superimposed text, and others.

Visual Processing
– S = {f_1, f_2, …, f_N}: sequence of frames from shots in a video for face detection
– Detect faces in frames using CMU's face detector software
– Feature 1: How many faces? The number of frames containing faces as a proportion of the total frames in a shot
– Feature 2: Avg. face areas. If there is a face, how big is the face?
– Two frame sequences from a shot are used: a uniformly sampled sequence and a key frame sequence
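The two face features can be sketched as follows, assuming a face detector has already produced bounding boxes for each frame. The box format (x, y, w, h), the function name face_features, and the choice to average box areas in raw pixels over all detected faces are assumptions of this sketch; the slides do not state how face area is normalized.

```python
def face_features(frames):
    """Compute the two per-shot face features from detector output.

    `frames` is a list with one entry per frame of the shot; each entry is
    a list of face boxes (x, y, w, h) in pixels, empty if no face was found.
    Returns (face_ratio, avg_face_area):
      face_ratio    = frames containing at least one face / total frames
      avg_face_area = mean box area over all detected faces (0.0 if none)
    """
    if not frames:
        return 0.0, 0.0
    frames_with_faces = sum(1 for boxes in frames if boxes)
    areas = [w * h for boxes in frames for (_x, _y, w, h) in boxes]
    face_ratio = frames_with_faces / len(frames)
    avg_face_area = sum(areas) / len(areas) if areas else 0.0
    return face_ratio, avg_face_area


# Hypothetical detector output for a four-frame sample of one shot.
sampled = [[(0, 0, 10, 10)], [], [(0, 0, 20, 10), (5, 5, 10, 10)], []]
ratio, area = face_features(sampled)
```

The same function could be applied to both frame sequences mentioned on the slide (uniformly sampled frames and key frames), giving two pairs of features per shot.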

Audio Processing
– Classify shot audio into voice (V), no-voice (N), or a mixture of the two (M)
– Characterize dominance of speech in audio tracks of shots: cluster audio clips into two classes and assume the larger cluster is the one with speech-dominated clips
– N = total # of audio clips within a shot; Nv = # of clips classified as voice-dominated; Va = voice activity = Nv/N
– Is the voice consistently delivered? New voice connectivity feature: number of contiguous speech-dominant clips normalized by the shot length
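A minimal sketch of the two audio features, assuming each shot is already split into clips flagged as voice-dominated or not. Voice activity follows the slide's definition Va = Nv/N. The slide does not spell out how contiguous clips are counted, so the connectivity function below assumes the longest unbroken run of voice-dominated clips divided by the number of clips in the shot; that reading, and both function names, are assumptions.

```python
def voice_activity(clips):
    """Va = Nv / N: fraction of clips in the shot that are voice-dominated."""
    if not clips:
        return 0.0
    return sum(1 for is_voice in clips if is_voice) / len(clips)


def voice_connectivity(clips):
    """Longest run of consecutive voice-dominated clips, divided by N.

    Assumed interpretation of the slide's "number of contiguous
    speech-dominant clips normalized by the shot length".
    """
    if not clips:
        return 0.0
    longest = current = 0
    for is_voice in clips:
        current = current + 1 if is_voice else 0
        longest = max(longest, current)
    return longest / len(clips)


# Hypothetical per-clip flags for one shot (True = voice-dominated).
shot = [True, True, False, True, True, True, False, False]
va = voice_activity(shot)        # 5 of 8 clips
vc = voice_connectivity(shot)    # longest run is 3 of 8 clips
```

Under this reading, an uninterrupted voice over would score high on both features, while an interrupted one could keep a high Va but a lower connectivity.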

Classification
– Decision trees as machine learning classifiers for final labeling of narrative structures
– C4.5 algorithm to train and test decision trees
– First learn all six classes at the first children level and test accuracy of labeling
– Propose a two-level decision tree for improved performance

Experimental Results
– Average classification accuracy is high: 91.6%
[Figure: confusion matrix for six classes]

Exp. Results (cont.)
– Results are very good for classes DD, DN, AN and UV, but poor for classes IV and LF
– VO with the presence of many faces (meetings, parties, ...) accounts for most of the misclassification
– Solution: group IV, LF and UV into a group G and study it separately
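The grouping idea can be sketched as a two-stage labeler: a first classifier chooses among DD, DN, AN and the merged group G, and a second classifier resolves G into UV, IV or LF. The class TwoLevelLabeler and the two toy rule functions are hypothetical stand-ins for trained C4.5 trees; their feature names and thresholds are invented for illustration and are not values from the experiments.

```python
GROUP_G = {"UV", "IV", "LF"}  # classes the slides merge into group G


def coarse_label(label):
    """Map a six-class label to its first-level label (DD, DN, AN, or G)."""
    return "G" if label in GROUP_G else label


class TwoLevelLabeler:
    """Route a shot through a first-level and, if needed, a second-level classifier."""

    def __init__(self, first_level, second_level):
        self.first_level = first_level    # features -> DD, DN, AN, or G
        self.second_level = second_level  # features -> UV, IV, or LF

    def predict(self, features):
        coarse = self.first_level(features)
        if coarse == "G":
            return self.second_level(features)
        return coarse


# Toy rules with made-up thresholds, standing in for trained decision trees.
def toy_first_level(f):
    if f["face_ratio"] < 0.2:
        return "G"
    return "DN" if f["avg_face_area"] > 5000 else "DD"


def toy_second_level(f):
    if f["voice_activity"] < 0.3:
        return "LF"
    return "UV" if f["voice_connectivity"] > 0.6 else "IV"


labeler = TwoLevelLabeler(toy_first_level, toy_second_level)
example = {"face_ratio": 0.05, "avg_face_area": 0.0,
           "voice_activity": 0.9, "voice_connectivity": 0.8}
predicted = labeler.predict(example)
```

When training such a hierarchy, coarse_label would relabel the training set for the first stage, and only samples whose true class is in G would be used to train the second stage.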

Exp. Results (cont.) G 97.6%

Exp. Results (cont.)
– Over-fitting is the problem identified in G, due to UV instances outnumbering IV and LF
– To solve the problem to a certain extent, reduce the number of UV instances so that the numbers of instances of IV, UV and LF are approximately the same, and train with C4.5
[Figure: confusion matrix for classes a = UV, b = IV, c = LF; 84.3%]
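The class-balancing step can be sketched as random undersampling. This is a simplification: the slides only describe reducing the UV class, whereas the hypothetical helper below cuts every class down to the size of the smallest one, which gives the same result when UV is the only oversized class. The fixed seed is an assumption made so the sketch is repeatable.

```python
import random
from collections import Counter, defaultdict


def undersample(samples, seed=0):
    """Return a class-balanced subset of (features, label) pairs.

    Every class is randomly reduced to the size of the smallest class.
    """
    if not samples:
        return []
    by_label = defaultdict(list)
    for sample in samples:
        by_label[sample[1]].append(sample)
    target = min(len(group) for group in by_label.values())
    rng = random.Random(seed)
    balanced = []
    for label in sorted(by_label):
        balanced.extend(rng.sample(by_label[label], target))
    return balanced


# Hypothetical training set in which UV heavily outnumbers IV and LF.
training = ([({"id": i}, "UV") for i in range(20)]
            + [({"id": i}, "IV") for i in range(3)]
            + [({"id": i}, "LF") for i in range(4)])
balanced = undersample(training)
counts = Counter(label for _features, label in balanced)
```

Undersampling discards data, so it trades some information about the majority class for a tree that is less biased toward it.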

Conclusion
– Novel narrative structure based analysis for segmentation of education and training videos
– Hierarchical DT-classification system achieves an overall accuracy of 84.7%
– Focus on higher level semantics such as segmentation of topics
– Work is underway:
  – Map media objects to LOs
  – Algorithms for support of both SCORM and MPEG-7 compliant XML metadata

Acknowledgements
Team:
– Geetika Tewari (IBM TJW, currently at Harvard U)
– Norman Haas (IBM TJW)
– Austin Schilling (IBM SWG)