Module 4: Information Structure
Course: Digital Information Preservation
Outline
Document Collection Schema
Documentation Schema
Digital Object Model
Note: the reading source is Chapter 6 of [GHM].
Introduction
A preservation method must be able to represent whatever authors want to express (within the limits of what language can express). This requires a data model that is simple yet able to model all practical information structures.
Terminologies
A blob (binary large object) is a closely associated agglomeration of data that is managed as a single entity. The term ‘blob’ signals that neither the blob’s meaning nor its internal structure is relevant to the discussion at hand.
A bit-string is a linearized form of a blob suitable for sending over a simple communication channel.
A file or dataset is a blob’s storage form, in which the content might be represented in noncontiguous segments on a magnetic or optical storage volume or on a set of storage volumes. The storage layout is chosen and managed by a file system in order to provide better reliability, economy, performance, and flexibility than a simple, contiguous layout is likely to provide. A file system usually includes a programming interface that delivers any file in bit-string format.
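The three terms can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the byte values and the `.blob` suffix are arbitrary choices, not taken from the reading. The blob is stored as a file, and the file system's programming interface then returns it as a single bit-string, whatever physical layout was used on disk.

```python
import os
import tempfile

# A blob: an opaque chunk of data whose meaning and structure are ignored here.
# The byte values are arbitrary and chosen only for illustration.
blob = bytes([0x89, 0x50, 0x4E, 0x47]) + b"arbitrary payload"

# Store the blob as a file; the file system decides the physical layout,
# which may be noncontiguous on the storage volume.
fd, path = tempfile.mkstemp(suffix=".blob")
try:
    with os.fdopen(fd, "wb") as f:
        f.write(blob)

    # The file system interface returns the content as one linear bit-string.
    with open(path, "rb") as f:
        bit_string = f.read()
finally:
    os.remove(path)

# The linearized form read back is identical to the original blob.
round_trip_ok = bit_string == blob
```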
Syntax Specifications
String Syntax Definitions with Regular Expressions
Regular expressions: a context-independent syntax for patterns that select specific strings from a set of character strings.
Automata theory and formal language theory: models of computation and ways to describe and classify formal languages.
BNF (Backus-Naur Form): a formal language for defining the grammar of a context-free language.
The Extended Backus-Naur Form (EBNF) adds the syntax of regular expressions to the BNF notation in order to allow very compact specifications.
Each rule of a BNF grammar has the form: symbol ::= expression.
symbol: written with an initial capital letter if it is defined by a regular expression, and with an initial lower-case letter otherwise.
expression: the right-hand side of the rule, whose syntax matches strings of one or more characters.
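As a rough illustration of how an EBNF-style rule corresponds to a regular expression, the sketch below uses Python's standard `re` module. The three rules (`Letter`, `Digit`, `Identifier`) and the candidate strings are hypothetical examples, not rules taken from the reading.

```python
import re

# Hypothetical EBNF-style rules (invented for this example):
#   Letter     ::= [A-Za-z]
#   Digit      ::= [0-9]
#   Identifier ::= Letter (Letter | Digit)*
LETTER = r"[A-Za-z]"
DIGIT = r"[0-9]"
IDENTIFIER = re.compile(rf"{LETTER}(?:{LETTER}|{DIGIT})*")


def is_identifier(text: str) -> bool:
    """Return True if the whole string matches the Identifier rule."""
    return IDENTIFIER.fullmatch(text) is not None


# The pattern selects specific strings from a set of character strings.
candidates = ["blob1", "1blob", "bitString", "file-system"]
selected = [s for s in candidates if is_identifier(s)]
```

Here `selected` keeps only the strings that start with a letter and contain nothing but letters and digits.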
Syntax Specifications
Abstract Syntax Notation One (ASN.1) is used to express the syntax of objects and messages.
The Basic Encoding Rules (BER) enable abstract data value specifications to be represented in concrete form as an array of bytes.
ASN.1 DER (Distinguished Encoding Rules) and BER-encoded data is largely platform-independent, which helps make the byte-stream representation of a standards definitions document that uses it easy to transport between computers on open networks.
ASN.1 is widely used to describe security protocols, interfaces, and service definitions, such as the X.500 Directory and X.400 Messaging systems, which include extensive security models.
One example of a language specification aiding the specification of security mechanisms arose during the specification of the Secure Electronic Transaction (SET) standard.
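To make the idea of a concrete byte-array encoding tangible, here is a deliberately simplified, hand-written sketch of DER tag-length-value encoding in Python. It covers only non-negative INTEGER, UTF8String, and SEQUENCE, and the `Customer`-like record it encodes is a hypothetical example. A real system would use a complete ASN.1 library rather than this sketch.

```python
def der_length(n: int) -> bytes:
    """Encode a content length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def der_integer(value: int) -> bytes:
    """Encode a non-negative INTEGER (tag 0x02) in minimal two's complement."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    size = value.bit_length() // 8 + 1  # leaves room for the zero sign bit
    body = value.to_bytes(size, "big", signed=True)
    return b"\x02" + der_length(len(body)) + body


def der_utf8_string(text: str) -> bytes:
    """Encode a UTF8String (tag 0x0C)."""
    body = text.encode("utf-8")
    return b"\x0c" + der_length(len(body)) + body


def der_sequence(*elements: bytes) -> bytes:
    """Encode a SEQUENCE (tag 0x30) of already-encoded elements."""
    body = b"".join(elements)
    return b"\x30" + der_length(len(body)) + body


# Hypothetical abstract value: SEQUENCE { name UTF8String, id INTEGER }
encoded = der_sequence(der_utf8_string("Ada"), der_integer(5))
encoded_hex = encoded.hex()
```

The resulting bytes do not depend on the machine that produced them, which is the platform independence the slide refers to.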
Syntax Specifications
XML Schema
Schemas define the record structure for any data type of interest, and are particularly prominent in the use of XML to package information of various data types.
Typically a schema is expressed as a set of properties, each with an associated type. For instance, an informational schema description for a customer database would be something like:
(1) Name: string of up to 80 characters;
(2) Customer ID: number of up to 10 digits;
(3) Orders: a list of Order records.
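The customer schema above can be sketched as typed records with validation. This uses Python dataclasses rather than XML Schema itself, purely to show the "properties with associated types" idea. The fields of `Order` are assumptions, since the slide does not define the Order record.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Order:
    # Hypothetical fields; the source does not specify the Order record.
    order_id: int
    description: str


@dataclass
class Customer:
    name: str                                         # (1) up to 80 characters
    customer_id: int                                  # (2) up to 10 digits
    orders: List[Order] = field(default_factory=list)  # (3) list of Order records

    def __post_init__(self) -> None:
        # Enforce the constraints stated in the informal schema description.
        if len(self.name) > 80:
            raise ValueError("Name must be at most 80 characters")
        if not 0 <= self.customer_id < 10**10:
            raise ValueError("Customer ID must have at most 10 digits")
        if not all(isinstance(o, Order) for o in self.orders):
            raise ValueError("Orders must contain only Order records")


customer = Customer("Ada Lovelace", 1234567890, [Order(1, "archival scan")])
```

A record that violates a constraint, such as an 11-digit customer ID, is rejected with a `ValueError` when it is constructed.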
Collections
Traditionally, a collection is a property of a library.
It is identified by a document that tabulates or otherwise identifies the collection members, for example a traditional library catalog.
In a digital library, almost every document defines a collection: the set of documents and other objects that it references.
Digital Object Schema
Payload content and metadata: blobs provided by information providers (authors, editors, artists, etc.)
Relationships and relations
Names and identifiers, references, pointers, and links
Value sets
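One possible way to picture these schema elements is as a small data structure. The sketch below is an assumption-laden illustration, not the schema defined in the reading: the class names, field names, and example identifiers are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass(frozen=True)
class Relation:
    name: str    # relation name, e.g. "cites" (hypothetical)
    target: str  # identifier of the related object


@dataclass
class DigitalObject:
    identifier: str                                       # name of this object
    payload: List[bytes]                                  # content blobs from providers
    metadata: Dict[str, str] = field(default_factory=dict)
    relations: List[Relation] = field(default_factory=list)

    def referenced(self) -> Set[str]:
        """Identifiers of all objects this object links to."""
        return {r.target for r in self.relations}


# Hypothetical identifiers and values, used only for illustration.
report = DigitalObject(
    identifier="doc:report-1",
    payload=[b"report text"],
    metadata={"author": "A. Author"},
    relations=[
        Relation("cites", "doc:paper-7"),
        Relation("includes", "img:figure-2"),
    ],
)
targets = report.referenced()
```

In this sketch the payload stays an opaque list of blobs, while names, relations, and metadata carry everything the object model needs to know about it.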
Metadata
Metadata are structured data about other data.
In digital collections, metadata fulfill a variety of tasks, including:
identifying items uniquely worldwide;
describing collection items (e.g., author, creation date), including their contexts;
supporting retrieval and identification;
grouping items into collections within a repository;
recording authenticity evidence, including historical audit trails;
helping protect item integrity against improper change and unintentional corruption;
recording access permissions and other digital rights information;
facilitating information interchange between autonomous repositories; and
recording technical parameters describing items’ representations.
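A few of these tasks (identification, description, technical parameters, and integrity protection) can be sketched with a minimal metadata record. The field names and the choice of SHA-256 as the integrity check are assumptions made for this example, not a prescribed metadata standard.

```python
import hashlib
from datetime import datetime, timezone


def describe(identifier: str, payload: bytes, author: str) -> dict:
    """Build a minimal metadata record for a payload (hypothetical fields)."""
    return {
        "identifier": identifier,                        # identifies the item
        "author": author,                                # describes the item
        "recorded": datetime.now(timezone.utc).isoformat(),
        "size_bytes": len(payload),                      # technical parameter
        "sha256": hashlib.sha256(payload).hexdigest(),   # integrity evidence
    }


def is_intact(payload: bytes, metadata: dict) -> bool:
    """Check the payload against the digest recorded in its metadata."""
    return hashlib.sha256(payload).hexdigest() == metadata["sha256"]


payload = b"digitized manuscript page"
record = describe("doc:manuscript-1", payload, "A. Author")
unchanged = is_intact(payload, record)
corrupted = is_intact(payload + b"!", record)
```

A recorded digest only detects change. Protecting against deliberate tampering would additionally require securing the metadata record itself.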