Content and Systems, Week 3


1 Content and Systems Week 3

2 Today’s goals: Obtaining, describing, and indexing content –XML –Metadata. Preparing for the installation of DSpace –Computers available –User names and passwords: will come from Mr. Nadi this week, once he knows the team configurations –Access: I believe you all have access to Mendel 290. Please confirm.

3 The Digital Library Content Essential elements for a digital library –Users –Content –Services

4 Content - requirements Store –Organize –Describe Find Deliver

5 Describing the content. How to describe content –Metadata: a machine-readable description of anything. What description –Machine-readable description requires standard descriptive elements. Dublin Core (http://dublincore.org/) –International standard –“a standard for cross-domain information resource description.” –15 descriptive elements. Other metadata schemes –IEEE-LOM

6 Metadata What does metadata look like? Metadata is data about data –Information about a resource, encoded in the resource or associated with the resource. The language of metadata: XML –eXtensible Markup Language

7 XML: XML is a markup language. XML describes features. There is no standard set of XML tags; each application defines its own. Use XML to create a resource type. Separately develop software to interact with the data described by the XML codes. Source: tutorial at w3schools.com

8 XML rules: Easy rules, but very strict –The first line is the XML declaration, which gives the version and character set used –The rest is user-defined tags –Every tag has an opening and a closing
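To see how strict these rules are, here is a minimal sketch using Python's standard xml.etree.ElementTree parser. The note and to tags and the declaration shown are illustrative assumptions, not taken from the slides; the character set follows the ISO 8859 reference on a later slide.

```python
import xml.etree.ElementTree as ET

# Well-formed: declaration first, and every opening tag has a matching close.
good = b'<?xml version="1.0" encoding="ISO-8859-1"?><note><to>Class</to></note>'
# Not well-formed: <to> is never closed before </note>.
bad = b'<?xml version="1.0" encoding="ISO-8859-1"?><note><to>Class</note>'


def is_well_formed(data: bytes) -> bool:
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(good))
print(is_well_formed(bad))
```

The parser rejects the second document outright rather than guessing at a repair, which is the "very strict" part.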

9 Element naming XML elements must follow these naming rules: –Names can contain letters, numbers, and other characters –Names must not start with a number or punctuation character –Names must not start with the letters xml (or XML or Xml..) –Names cannot contain spaces
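The naming rules can be approximated with a short check. This is a simplified sketch that assumes ASCII-only names (the full XML specification also allows many non-ASCII letters, and colons for namespaces); the function name is hypothetical.

```python
import re

# Simplified assumption: start with a letter or underscore, then letters,
# digits, periods, hyphens, or underscores. No spaces anywhere.
_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_valid_element_name(name: str) -> bool:
    """Apply the slide's naming rules to a candidate element name."""
    if name.lower().startswith("xml"):
        return False  # names starting with xml, XML, Xml, ... are reserved
    return _NAME.fullmatch(name) is not None


for candidate in ["ingredient", "ingredient-name", "2cups", "egg white", "xmlData"]:
    print(candidate, is_valid_element_name(candidate))
```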

10 Elements and attributes Use elements to describe data Use attributes to present information that is not part of the data –For example, the file type or some other information that would be useful in processing the data, but is not part of the data.
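A small sketch of the distinction, using Python's standard library. The photo element and its type attribute are made-up names for illustration: the file name is the data, while the file type is processing information carried as an attribute.

```python
import xml.etree.ElementTree as ET

# The element content is the data itself (hypothetical file name).
photo = ET.Element("photo", attrib={"type": "gif"})
photo.text = "cookies.gif"

# Serialize to text to see the element and attribute side by side.
xml_text = ET.tostring(photo, encoding="unicode")
print(xml_text)

# Software reading the file can consult the attribute before touching the data.
print(photo.get("type"))
```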

11 Repeating elements (in a DTD content model): Naming an element on its own means it appears exactly once. Name+ means it appears one or more times. Name* means it appears zero or more times. Name? means it appears zero or one time.
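The four cases can be sketched as a count check on an element's children. This is only an illustration of the occurrence indicators, not a DTD validator: it ignores element order, and the recipe, title, ingredient, and note names are assumptions.

```python
import xml.etree.ElementTree as ET

# (minimum, maximum) occurrences per indicator; None means no upper limit.
BOUNDS = {"": (1, 1), "+": (1, None), "*": (0, None), "?": (0, 1)}


def occurrences_ok(parent: ET.Element, spec: str) -> bool:
    """Check one spec such as 'title', 'ingredient+', or 'note?' against parent."""
    name = spec.rstrip("+*?")
    indicator = spec[len(name):]
    low, high = BOUNDS[indicator]
    count = len(parent.findall(name))  # direct children with that tag
    return count >= low and (high is None or count <= high)


recipe = ET.fromstring(
    "<recipe><title>Meringue cookies</title>"
    "<ingredient>3 egg whites</ingredient>"
    "<ingredient>1 cup sugar</ingredient></recipe>"
)

for spec in ["title", "ingredient+", "ingredient?", "note*", "note+"]:
    print(spec, occurrences_ok(recipe, spec))
```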

12 Parts of an XML document
Elements
–The components of an XML document
–Some contain other parts, some are empty
–Ex: in HTML, “br” or “table”; in XML, “ingredient”
Attributes
–Information about elements, not data
–Ex: in HTML, “src=”; in XML, “scale=”
Entities
–Special characters or strings with pre-assigned meaning
–Ex: in HTML, &nbsp; for non-breaking space
PCDATA
–Parsed character data: text that will be parsed and interpreted by the reader. Tags and entities will be expanded and used in presentation.
CDATA
–Character data: text that will not be parsed and interpreted. It will be displayed exactly as provided.
The HTML examples are familiar; the XML examples are made up and depend on the specific XML scheme used.
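The difference between parsed text, CDATA, and entities can be seen with a quick experiment in Python's standard parser. The note element is a made-up name. One point worth knowing: XML itself predefines only five entities (lt, gt, amp, apos, quot), so the HTML entity &nbsp; is not available unless a DTD declares it.

```python
import xml.etree.ElementTree as ET

# Parsed character data: the entity &lt; is expanded to "<".
parsed_text = ET.fromstring("<note>Bake at &lt; 350 degrees</note>").text
print(parsed_text)

# CDATA section: markup-like characters are kept exactly as written.
literal_text = ET.fromstring("<note><![CDATA[Use <b> & </b> as-is]]></note>").text
print(literal_text)

# An HTML-only entity with no declaration is rejected by the XML parser.
try:
    ET.fromstring("<note>egg&nbsp;whites</note>")
    nbsp_accepted = True
except ET.ParseError:
    nbsp_accepted = False
print(nbsp_accepted)
```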

13 Using XML - an example Define the fields of a recipe collection: ISO 8859 is a character set. See http://www.bbsinc.com/iso8859.html

14 Processing the XML data: How do we know what to do with the information in an XML file? –Document Type Definition (DTD) –Put it in the same file as the data (immediate reference), or put in a reference to an external description –The DTD provides the definition of the legitimate content for each element

15 Document Type Definition <!DOCTYPE recipe [ ]> Repeat 0 or more times
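The DTD body from this slide is not preserved in the transcript. As a stand-in, here is a hypothetical internal DTD for a recipe; the element names title, ingredient, and directions are assumptions, and ingredient* illustrates "repeat 0 or more times". Note that Python's standard xml.etree.ElementTree reads the DOCTYPE but does not validate against it; enforcing the DTD would need a validating parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical recipe document with an internal DTD (same file as the data).
DOCUMENT = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE recipe [
  <!ELEMENT recipe (title, ingredient*, directions)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT ingredient (#PCDATA)>
  <!ELEMENT directions (#PCDATA)>
]>
<recipe>
  <title>Meringue cookies</title>
  <ingredient>3 egg whites</ingredient>
  <ingredient>1 cup sugar</ingredient>
  <directions>Beat the egg whites until stiff.</directions>
</recipe>
"""

# The parser accepts the DOCTYPE and returns the root element.
recipe = ET.fromstring(DOCUMENT)
title = recipe.findtext("title")
ingredients = [item.text for item in recipe.findall("ingredient")]
print(title, ingredients)
```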

16 Recipe entry with an external reference to the DTD (element text only): Meringue cookies. Ingredients: 3 egg whites; 1 cup sugar; 1 teaspoon vanilla; 2 cups mini chocolate chips. Directions: Beat the egg whites until stiff. Stir in sugar, then vanilla. Gently fold in chocolate chips. Place in warm oven at 200 degrees for an hour. Alternatively, place in an oven at 350 degrees. Turn oven off and leave overnight. Not the way that I want to see a recipe in a magazine! What could we do with a large collection of such entries? How would we get the information entered into a collection?

17 XML exercise Design an XML schema for an application of your choice. Keep it simple. Examples -- address book, TV program listing, DVD collection, …

18 Another example: A paper with content encoded with XML: http://tecfaseed.unige.ch/staf18/modules/ePBL/uploads/proj3/paper81.xml First few lines: Standards E-learning and their possible support for a rich pedagogic approach in a 'Integrated Learning' context; Rodolophe Borer; http://tecfa.unige.ch/perso/staf/borer/ The DTD, "ePBLpaper11.dtd", is shown on the next slide.

19 %foreign-dtd; Source: http://tecfa.unige.ch/staf/staf-j/vuilleum/staf18/p6/

20 Vocabulary Given the need for processing, do you want free text or restricted entries? Free text gives more flexibility for the person making the entry Controlled vocabulary helps with –Consistent processing –Comparison between entries Controlled vocabulary limits –Options for what is said

21 Vocabulary example Recipe example –What text should be controlled? –What should be free text? Ingredients –Ingredient-amount –Ingredient-name –Should we revise how we coded ingredient amount? Directions
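One way to picture the trade-off: treat the ingredient unit as a controlled vocabulary and leave the ingredient name as free text. The sketch below is an assumption about how the recipe fields might be split (amount, unit, name) and which units are allowed; none of this comes from the slides.

```python
# Hypothetical controlled vocabulary for the unit field.
ALLOWED_UNITS = {"cup", "teaspoon", "tablespoon", "whole"}


def validate_ingredient(amount: str, unit: str, name: str) -> list:
    """Return a list of problems; an empty list means the entry is acceptable."""
    problems = []
    try:
        if float(amount) <= 0:
            problems.append("amount must be positive")
    except ValueError:
        problems.append("amount must be a number")
    if unit not in ALLOWED_UNITS:
        problems.append("unit is not in the controlled vocabulary")
    if not name.strip():
        problems.append("name must not be empty")  # free text, but required
    return problems


print(validate_ingredient("1", "cup", "sugar"))
print(validate_ingredient("1", "cups", "sugar"))
print(validate_ingredient("some", "cup", "sugar"))
```

With the unit controlled, "1 cup" and "2 cup" entries can be compared or scaled by software; the cost is that "cups" or "a pinch" is rejected until the vocabulary is extended.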

22 Dublin Core Standard set of metadata fields for entries in digital libraries: –Title, creator, subject, description, publisher, contributor, date, type, format, identifier, source, language, relation, coverage, rights

23 Dublin Core elements (see: http://dublincore.org/documents/dces/). C = controlled vocabulary recommended.
–Title
–Creator: entity primarily responsible for making the content of the resource
–Subject (C)
–Description
–Publisher: entity making the resource available
–Contributor: contributor to the content of the resource
–Date: YYYY-MM-DD
–Type (C): ex: collection, dataset, event, image
–Format (C): what is needed to display or operate the resource
–Identifier: unambiguous ID
–Source
–Language: standards RFC 3066, ISO 639
–Relation: reference to a related resource
–Coverage (C): space, time, jurisdiction
–Rights: rights management information
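To connect Dublin Core back to XML, here is a minimal sketch that writes a few of the 15 elements as an XML record. The namespace URI is the standard one for the Dublin Core element set; the record wrapper element, the function name, and all field values are hypothetical.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)  # serialize with the usual dc: prefix

DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher", "contributor",
    "date", "type", "format", "identifier", "source", "language", "relation",
    "coverage", "rights",
}


def build_record(fields: dict) -> ET.Element:
    """Build a <record> whose children are Dublin Core elements."""
    record = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


# Illustrative values only.
record = build_record({
    "title": "Meringue cookies",
    "type": "Text",
    "date": "2008-01-15",
    "language": "en",
})
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```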

24 A DSpace example CITIDEL: http://citidel.villanova.edu

25 IEEE - LOM Example of a specialized metadata scheme –Learning Object Metadata Specifically for collections of educational materials Includes all of Dublin Core See http://projects.ischool.washington.edu/sasutton/IEEE1484.html

26 Computing systems: Linux machines. Introduction to Unix: http://www.csc.villanova.edu/~lab/unix/ DSpace: http://www.dspace.org/ –Documentation, including installation: http://www.dspace.org/index.php?option=com_content&task=view&id=151&Itemid=116 Najib Nadi, our system administrator, is setting up the machines. He will send a message to the class by the middle of the week with details of machine location and login. Remember: you have the option to use your own machine, but it must meet the criteria described last week.

27 This session: Defined metadata and its role in digital libraries. Introduced XML as a language for describing a collection of content. Described the computing resources and how to get ready for the first DL setup.

