2004.11.10- SLIDE 1IS 257 - Fall 2004 Data-Driven Digital Library Applications -- The UC Berkeley Environmental Digital Library University of California,

Presentation transcript:

SLIDE 1IS Fall 2004 Data-Driven Digital Library Applications -- The UC Berkeley Environmental Digital Library University of California, Berkeley School of Information Management and Systems SIMS 257: Database Management

SLIDE 2IS Fall 2004 Lecture Outline Final Project Review –ORDBMS Feature –JDBC Access to DBMS Data-Driven Digital Library Applications –Berkeley’s Environmental Digital Library

SLIDE 4IS Fall 2004 Final Project Requirements See WWW site: – Report on personal/group database including: –Database description and purpose –Data Dictionary –Relationships Diagram –Sample queries and results (Web or Access tools) –Sample forms (Web or Access tools) –Sample reports (Web or Access tools) –Application Screens (Web or Access tools)

SLIDE 5IS Fall 2004 Final Presentations and Reports Specifications for final report are on the Web Site under assignments. Reports due on December 15. Presentations on December 15, 9:00-12:00

SLIDE 6IS Fall 2004 Lecture Outline Final Project Review –ORDBMS Feature –JDBC Access to DBMS Data-Driven Digital Library Applications –Berkeley’s Environmental Digital Library

SLIDE 7IS Fall 2004 Object Relational Data Model Class, instance, attribute, method, and integrity constraints OID per instance Encapsulation Multiple inheritance hierarchy of classes Class references via OID object references Set-Valued attributes Abstract Data Types

SLIDE 8IS Fall 2004 PostgreSQL All of the usual SQL commands for creation, searching and modifying classes (tables) are available. With some additions… Inheritance Non-Atomic Values User defined functions and operators

SLIDE 9IS Fall 2004 Inheritance
CREATE TABLE cities (
    name text,
    population float,
    altitude int -- (in ft)
);
CREATE TABLE capitals (
    state char(2)
) INHERITS (cities);

SLIDE 10IS Fall 2004 Non-Atomic Values - Arrays Postgres allows attributes of an instance to be defined as fixed-length or variable-length multi-dimensional arrays. Arrays of any base type or user-defined type can be created. To illustrate their use, we first create a class with arrays of base types.
CREATE TABLE SAL_EMP (
    name text,
    pay_by_quarter int4[],
    schedule text[][]
);

SLIDE 11IS Fall 2004 PostgreSQL Extensibility
Postgres is extensible because its operation is catalog-driven.
–RDBMS store information about databases, tables, columns, etc., in what are commonly known as system catalogs. (Some systems call this the data dictionary.)
One key difference between Postgres and standard RDBMS is that Postgres stores much more information in its catalogs:
–not only information about tables and columns, but also information about its types, functions, access methods, etc.
These classes can be modified by the user, and since Postgres bases its internal operation on these classes, this means that Postgres can be extended by users.
–By comparison, conventional database systems can only be extended by changing hardcoded procedures within the DBMS or by loading modules specially written by the DBMS vendor.

SLIDE 12IS Fall 2004 User Defined Functions CREATE FUNCTION allows a Postgres user to register a function with a database. Subsequently, this user is considered the owner of the function.
CREATE FUNCTION name ( [ ftype [,...] ] )
    RETURNS rtype
    AS {SQLdefinition}
    LANGUAGE 'langname'
    [ WITH ( attribute [,...] ) ]
CREATE FUNCTION name ( [ ftype [,...] ] )
    RETURNS rtype
    AS obj_file, link_symbol
    LANGUAGE 'C'
    [ WITH ( attribute [,...] ) ]

SLIDE 13IS Fall 2004 External Functions This example creates a C function by calling a routine from a user-created shared library. This particular routine calculates a check digit and returns TRUE if the check digit in the function parameters is correct. It is intended for use in a CHECK constraint.
CREATE FUNCTION ean_checkdigit(bpchar, bpchar)
    RETURNS bool
    AS '/usr1/proj/bray/sql/funcs.so'
    LANGUAGE 'c';
CREATE TABLE product (
    id char(8) PRIMARY KEY,
    eanprefix char(8) CHECK (eanprefix ~ '[0-9]{2} [0-9]{5}')
        REFERENCES brandname(ean_prefix),
    eancode char(6) CHECK (eancode ~ '[0-9]{6}'),
    CONSTRAINT ean CHECK (ean_checkdigit(eanprefix, eancode))
);
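
The check-digit routine itself is not shown on the slide. As a rough illustration only, the Java sketch below shows what such a validity test might compute, assuming that the prefix and code together form a 13-digit EAN number checked with the standard alternating 1/3 weights. The class name `EanCheck` and the method `eanCheckDigit` are hypothetical; this is not the C routine stored in funcs.so.

```java
// Hypothetical sketch: validates an EAN-13 number that is split into a prefix
// and a code, mirroring the ean_checkdigit(eanprefix, eancode) call on the slide.
// Assumption: prefix + code hold 13 digits in total; non-digit characters are ignored.
final class EanCheck {

    static boolean eanCheckDigit(String eanPrefix, String eanCode) {
        // Keep only the digits, so a separator inside the prefix is skipped.
        String digits = (eanPrefix + eanCode).replaceAll("[^0-9]", "");
        if (digits.length() != 13) {
            return false;
        }
        int sum = 0;
        for (int i = 0; i < 12; i++) {
            int d = digits.charAt(i) - '0';
            // EAN-13 weighting: 1 for the 1st, 3rd, ... digit and 3 for the 2nd, 4th, ...
            sum += (i % 2 == 0) ? d : 3 * d;
        }
        int expected = (10 - (sum % 10)) % 10;
        // The 13th digit must equal the computed check digit.
        return expected == digits.charAt(12) - '0';
    }

    public static void main(String[] args) {
        System.out.println(eanCheckDigit("40 06381", "333931"));
        System.out.println(eanCheckDigit("40 06381", "333932"));
    }
}
```

In the table definition above, a function like this would run once per inserted or updated row, after the two simpler pattern checks on eanprefix and eancode.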

SLIDE 14IS Fall 2004 Creating new Types CREATE TYPE allows the user to register a new user data type with Postgres for use in the current database. The user who defines a type becomes its owner. typename is the name of the new type and must be unique within the types defined for this database.
CREATE TYPE typename (
    INPUT = input_function,
    OUTPUT = output_function,
    INTERNALLENGTH = { internallength | VARIABLE }
    [, EXTERNALLENGTH = { externallength | VARIABLE } ]
    [, DEFAULT = "default" ]
    [, ELEMENT = element ]
    [, DELIMITER = delimiter ]
    [, SEND = send_function ]
    [, RECEIVE = receive_function ]
    [, PASSEDBYVALUE ]
)

SLIDE 15IS Fall 2004 Rules System
CREATE RULE name AS ON event TO object
    [ WHERE condition ]
    DO [ INSTEAD ] [ action | NOTHING ]
Rules can be triggered by any event (select, update, delete, etc.)

SLIDE 16IS Fall 2004 Views as Rules Views in Postgres are implemented using the rule system. In fact there is absolutely no difference between
CREATE VIEW myview AS SELECT * FROM mytab;
and the two commands
CREATE TABLE myview (same attribute list as for mytab);
CREATE RULE "_RETmyview" AS ON SELECT TO myview DO INSTEAD SELECT * FROM mytab;

SLIDE 17IS Fall 2004 GiST Approach A generalized search tree. Must be: Extensible in terms of queries General (B+-tree, R-tree, etc.) Easy to extend Efficient (match specialized trees) Highly concurrent, recoverable, etc.
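
To make the idea of one tree template covering many index types more concrete, here is a minimal Java sketch of the kind of key interface a generalized search tree relies on. The method names follow the consistent, union, and penalty operations described in the GiST literature. The `GistKey` interface and the one-dimensional `Interval` key are illustrative assumptions rather than PostgreSQL's actual C API, and the tree structure itself is left out.

```java
// Illustrative sketch only: the key-level operations a GiST-style tree needs.
interface GistKey<K, Q> {
    boolean consistent(Q query);   // could the subtree under this key contain matches?
    K union(K other);              // smallest key that covers both keys
    double penalty(K added);       // cost of placing 'added' under this key
}

// A one-dimensional closed interval used as a key (hypothetical example type).
final class Interval implements GistKey<Interval, Interval> {
    final double lo;
    final double hi;

    Interval(double lo, double hi) {
        if (lo > hi) {
            throw new IllegalArgumentException("lo must not exceed hi");
        }
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    public boolean consistent(Interval query) {
        // Two closed intervals overlap when each one starts before the other ends.
        return lo <= query.hi && query.lo <= hi;
    }

    @Override
    public Interval union(Interval other) {
        return new Interval(Math.min(lo, other.lo), Math.max(hi, other.hi));
    }

    @Override
    public double penalty(Interval added) {
        // Penalty is how much this key must grow in length to cover 'added'.
        Interval grown = union(added);
        return (grown.hi - grown.lo) - (hi - lo);
    }
}
```

Swapping `Interval` for a bounding-rectangle key with the same three operations would give R-tree-like behavior from the same tree code, which is the sense in which the approach is meant to be general and easy to extend.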

SLIDE 18IS Fall 2004 Java and JDBC Java is probably the high-level language used in most software development today. One of the earliest “enterprise” additions to Java was JDBC. JDBC is an API that provides mid-level access to DBMS from Java applications. Intended to be an open cross-platform standard for database access in Java. Similar in intent to Microsoft’s ODBC.

SLIDE 19IS Fall 2004 JDBC Provides a standard set of interfaces for any DBMS with a JDBC driver – using SQL to specify the database operations. [Diagram labels: Application; ResultSet, Statement, PreparedStatement, CallableStatement, Connection; DriverManager; Oracle Driver, ODBC Driver, Postgres Driver; Oracle DB, ODBC DB, Postgres DB]

SLIDE 20IS Fall 2004 JDBC Simple Java Implementation
import java.sql.*;
import oracle.jdbc.*;

public class JDBCSample {
    public static void main(java.lang.String[] args) {
        try {
            // this is where the driver is loaded
            //Class.forName("jdbc.oracle.thin");
            DriverManager.registerDriver(new OracleDriver());
        } catch (SQLException e) {
            System.out.println("Unable to load driver Class");
            return;
        }

SLIDE 21IS Fall 2004 JDBC Simple Java Impl.
        try {
            // All DB access is within the try/catch block...
            // make a connection to ORACLE on Dream
            // (the JDBC URL argument that getConnection requires is missing from this transcript)
            Connection con = DriverManager.getConnection("mylogin", "myoraclePW");
            // Do an SQL statement...
            Statement stmt = con.createStatement();
            ResultSet rs = stmt.executeQuery("SELECT NAME FROM DIVECUST");

SLIDE 22IS Fall 2004 JDBC Simple Java Impl.
            // show the Results...
            while (rs.next()) {
                System.out.println(rs.getString("NAME"));
            }
            // Release the database resources...
            rs.close();
            stmt.close();
            con.close();
        } catch (SQLException se) {
            // inform user of errors...
            System.out.println("SQL Exception: " + se.getMessage());
            se.printStackTrace(System.out);
        }
    } // end of main (closing braces are not shown on the slide)
}

SLIDE 23IS Fall 2004 Lecture Outline Final Project Review –ORDBMS Feature –JDBC Access to DBMS Data-Driven Digital Library Applications –Berkeley’s Environmental Digital Library

SLIDE 24IS Fall 2004 Berkeley DL Project Object Relational Database Applications –The Berkeley Digital Library Project Slides from RRL and Robert Wilensky, EECS –Use of DBMS in DL project

SLIDE 25IS Fall 2004 Overview What is a Digital Library? Overview of Ongoing Research on Information Access in Digital Libraries

SLIDE 26IS Fall 2004 Digital Libraries Are Like Traditional Libraries... Involve large repositories of information (storage, preservation, and access) Provide information organization and retrieval facilities (categorization, indexing) Provide access for communities of users (communities may be as large as the general public or small as the employees of a particular organization)

SLIDE 27IS Fall 2004 Traditional Library System [Diagram labels: Originators, Libraries, Users]

SLIDE 28IS Fall 2004 But Digital Libraries Are Different From Libraries... Not a physical location with local copies; objects held closer to originators Decoupling of storage, organization, access Enhanced Authoring (origination, annotation, support for work groups) Subscription, pay-per-view supported in addition to “free” browsing. Integration into user tasks.

SLIDE 29IS Fall 2004 A Digital Library Infrastructure Model [Diagram labels: Originators, Repositories, Users, Index Services, Network]

SLIDE 30IS Fall 2004 UC Berkeley Digital Library Project Focus: Work-centered digital information services Testbed: Digital Library for the California Environment Research: Technical agenda supporting user-oriented access to large distributed collections of diverse data types. Part of the NSF/NASA/DARPA Digital Library Initiative (Phases 1 and 2)

SLIDE 31IS Fall 2004 UCB Digital Library Project: Research Organizations UC Berkeley EECS, SIMS, CED, IS&T UCOP/CDL Xerox PARC’s Document Image Decoding group and Work Practices group Hewlett-Packard NEC SUN Microsystems IBM Almaden Microsoft Ricoh California Research Philips Research

SLIDE 32IS Fall 2004 Testbed: An Environmental Digital Library Collection: Diverse material relevant to California’s key habitats. Users: A consortium of state agencies, development corporations, private corporations, regional government alliances, educational institutions, and libraries. Potential: Impact on state-wide environmental system (CERES)

SLIDE 33IS Fall 2004 The Environmental Library - Users/Contributors California Resources Agency, California Environment Resources Evaluation System (CERES) California Department of Water Resources The California Department of Fish & Game SANDAG UC Water Resources Center Archives New Partners: CDL and SDSC

SLIDE 34IS Fall 2004 The Environmental Library - Contents Environmental technical reports, bulletins, etc. County general plans Aerial and ground photography USGS topographic maps Land use and other special purpose maps Sensor data “Derived” information Collection data bases for the classification and distribution of the California biota (e.g., SMASCH) Supporting 3-D, economic, traffic, etc. models Videos collected by the California Resources Agency

SLIDE 35IS Fall 2004 The Environmental Library - Contents As of late 2002, the collection represents over one terabyte of data, including over 183,000 digital images, about 300,000 pages of environmental documents, and over 2 million records in geographical and botanical databases.

SLIDE 36IS Fall 2004 Botanical Data: The CalFlora Database contains taxonomical and distribution information for more than 8000 native California plants. The Occurrence Database includes over 600,000 records of California plant sightings from many federal, state, and private sources. The botanical databases are linked to the CalPhotos collection of California plants, and are also linked to external collections of data, maps, and photos.

SLIDE 37IS Fall 2004 Geographical Data: Much of the geographical data in the collection has been used to develop our web-based GIS Viewer. The Street Finder uses 500,000 Tiger records of S.F. Bay Area streets along with the 70,000 records from the USGS GNIS database. California Dams is a database of information about the 1395 dams under state jurisdiction. An additional 11 GB of geographical data represents maps and imagery that have been processed for inclusion as layers in our GIS Viewer. This includes Digital Ortho Quads and DRG maps for the S.F. Bay Area.

SLIDE 38IS Fall 2004 Documents: Most of the 300,000 pages of digital documents are environmental reports and plans that were provided by California state agencies. This collection includes documents, maps, articles, and reports on the California environment including Environmental Impact Reports (EIRs), educational pamphlets, water usage bulletins, and county plans. Documents in this collection come from the California Department of Water Resources (DWR), California Department of Fish and Game (DFG), San Diego Association of Governments (SANDAG), and many other agencies. Among the most frequently accessed documents are County General Plans for every California county and a survey of 125 Sacramento Delta fish species.

SLIDE 39IS Fall 2004 Testbed Success Stories LUPIN: CERES’ Land Use Planning Information Network –California County General Plans and other environmental documents. –Enter at Resources Agency Server, documents stored at and retrieved from UCB DLIB server. California flood relief efforts –High demand for some data sets only available on our server (created by document recognition). CalFlora: Creation and interoperation of repositories pertaining to plant biology. Cloning of services at Cal State Library, FBI

SLIDE 40IS Fall 2004 Research Highlights Documents –Multivalent Document prototype Page images, structured documents, GIS data, photographs Intelligent Access to Content –Document recognition –Vision-based Image Retrieval: stuff, thing, scene retrieval –Natural Language Processing: categorizing the web, Cheshire II, TileBar Interfaces

SLIDE 41IS Fall 2004 Multivalent Documents MVD Model –radically distributed, open, extensible –“behaviors” and “layers” behaviors conform to a protocol suite inter-operation via “IDEG” Applied to “enlivening legacy documents” –various nice behaviors, e.g., lenses

SLIDE 42IS Fall 2004 Document Presentation Problem: Digital libraries must deliver digital documents -- but in what form? Different forms have advantages for particular purposes –Retrieval –Reuse –Content Analysis –Storage and archiving Combining forms (Multivalent documents)

SLIDE 43IS Fall 2004 Spectrum of Digital Document Representations Adapted from Fox, E.A., et al. “Users, User Interfaces and Objects: Envision, an Electronic Library”, JASIS 44(8), 1993

SLIDE 44IS Fall 2004 Document Representation: Multivalent Documents Primary user interface/document model for UCB Digital Library (Wilensky & Phelps) Goal: An approach to new document representations and their authoring. Supports active, distributed, composable transformations of multimedia documents. Enables sophisticated annotations, intelligent result handling, user-modifiable interface, composite documents.

SLIDE 45IS Fall 2004 Multivalent Documents [Figure: layers of a multivalent document. Labels: Cheshire Layer, OCR Layer, OCR Mapping Layer, GIS Layer, Table Layer, Scanned Page Image, Network Protocols & Resources] Valence: 2: The relative capacity to unite, react, or interact (as with antigens or a biological substrate). Webster’s 7th Collegiate Dictionary

SLIDE 48IS Fall 2004 MVD availability The MVD Browser is now available as open source on SourceForge – See also: –

SLIDE 49IS Fall 2004 GIS in the MVD Framework Layers are georeferenced data sets. Behaviors are –display semi-transparently –pan –zoom –issue query –display context –“spatial hyperlinks” –annotations Written in Java

SLIDE 50IS Fall 2004 GIS Viewer: Features Annotation and saving –points, rectangles (w. labels and links), vectors –saving of annotations as separate layer Integration with address, street finding, gazetteer services Application to image viewing: tilePix Castanet client

SLIDE 54IS Fall 2004 GIS Viewer Example

SLIDE 55IS Fall 2004 Geographic Information: Plans and Ideas More annotations, flexible saving Support for large vector data sets Interoperability –On-the-fly conversion of formats generation of “catalogs” –Via OGDI/GLTP –Experimenting with various CERES servers

SLIDE 56IS Fall 2004 Documents: Information from scanned documents Built document recognizers for some important documents, e.g. “Bulletin 17”, “TR-9”. Recognized document structure, with an order of magnitude better OCR. Automatically generated a 1395-item relational database of dams. Enabled access via forms, map interfaces. Enabled interoperation with image DB.

SLIDE 60IS Fall 2004 Document Recognition: Ongoing Work Document recognizers: for about a dozen document types. Development and integration of mathematical OCR and recognition. Eventually produce a document recognizer generator, i.e., make it easier to write recognizers.

SLIDE 61IS Fall 2004 Vision-Based Image Retrieval Stuff-based queries: “blobs” –Basic blobs: colors, sizes, variable number demonstrated utility for interesting queries –“Blob world”: Above plus texture, applied to retrieving similar images successful learning scene classifier Thing-finding: Successfully deployed detectors adding body plans (adding shape, geometry and kinematic constraints)

SLIDE 62IS Fall 2004 Image Retrieval Research Finding “Stuff” vs “Things” BlobWorld Other Vision Research

SLIDE 63IS Fall 2004 (Old “stuff”-based image retrieval: Query)

SLIDE 64IS Fall 2004 (Old “stuff”-based image retrieval: Result)

SLIDE 65IS Fall 2004 Blobworld: use regions for retrieval We want to find general objects  Represent images based on coherent regions

SLIDE 68IS Fall 2004 (“Thing”-based image retrieval using “body plans”: Result)

SLIDE 69IS Fall 2004 Natural Language Processing Developed automatic categorization/disambiguation method to point where topic assignment (but not disambiguation) appears feasible. Ran controlled experiment: –Took Yahoo as ground truth. –Chose 9 overlapping categories; took 1000 web pages from Yahoo as input. –Result: 84% precision; 48% recall (using top 5 of 1073 categories) Automatic Topic Assignment
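
For readers unfamiliar with the two measures reported above, the following Java sketch shows how precision and recall are computed for a single page, given the set of categories a classifier assigned and the set taken as ground truth. The category names are made-up toy data, not data from the experiment, and `TopicEval` is a hypothetical helper class.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: set-based precision and recall for topic assignment.
final class TopicEval {

    // Precision: fraction of the assigned categories that are correct.
    static double precision(Set<String> assigned, Set<String> truth) {
        if (assigned.isEmpty()) {
            return 0.0;
        }
        return (double) overlap(assigned, truth) / assigned.size();
    }

    // Recall: fraction of the correct categories that were assigned.
    static double recall(Set<String> assigned, Set<String> truth) {
        if (truth.isEmpty()) {
            return 0.0;
        }
        return (double) overlap(assigned, truth) / truth.size();
    }

    // Number of categories present in both sets.
    private static int overlap(Set<String> a, Set<String> b) {
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        return common.size();
    }

    public static void main(String[] args) {
        // Toy data only: four assigned categories, three true ones, two in common.
        Set<String> assigned = new HashSet<>(Arrays.asList("botany", "maps", "dams", "fish"));
        Set<String> truth = new HashSet<>(Arrays.asList("botany", "maps", "water"));
        System.out.println("precision = " + precision(assigned, truth));
        System.out.println("recall = " + recall(assigned, truth));
    }
}
```

Averaging these per-page values over a test collection is one common way to arrive at collection-level figures; the slide does not say which averaging method the experiment used.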

SLIDE 70IS Fall 2004 Further Information Berkeley DL web site