Published by Leon Roberts. Modified over 9 years ago.
1
Data Management
David Nathan & Peter Austin & Robert Munro
2
This section
1. Data management
2. Properties of data
3. Relational data model
4. XML
5. Example
3
Workflows - description vs documentation
Description:
- something inscribed
- something happened
- you applied knowledge, made decisions
- NOT OF INTEREST!
- representations, lists, summaries, analyses
- cleaned up, selected, analysed
- archived, presented, published
Documentation:
- something happened
- recording
- you applied knowledge, techniques
- FOCUS OF INTEREST!
- representations, eg transcription, annotation
- made decisions, applied linguistic knowledge
- archived &... ??
- recapitulates
4
Choosing values/priorities
- Standards & compliance
- Adeptness with tools
- Modelling of phenomena, architecture of data
- Dissemination/publishing
- Preserving
- Ethics, responsibility, protocol
- Range, comprehensiveness
- Intellectual rigour
Which are priorities? Which are dispensable?
5
Data should be:
- explicit
- consistent
- robust
- meaningful
- conventional
- adaptable, convertible, machine readable etc
- useful!
6
“Portability” Bird and Simons 2003: language documentation data needs to have integrity, flexibility, longevity
7
“Portability”
- complete
- explicit
- documented
- preservable
- transferable
- accessible
- adaptable
- not technology-specific
(also appropriate, accurate, useful etc!!)
8
Data management
the way that data is structured is also information, which may be complex
properly structured data allows:
- usage, including manipulation, conversion, derivation
- preservation
- machine readability
9
Data management systems
a data management system is a system you design for storing data and metadata:
- information about content and structures
- relationship between units of information
it is not necessarily tied to any particular software, or even a computer
10
Naive management using filenames
a (too) simple management system: information about a recording is captured in the filenames:
- 1st_int_john_5Aug.wav
- market_conv_mj.wav
- ….
what does ‘int’ mean?
what information about the recording is missing?
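The filename problem can be made concrete with a small sketch. It assumes, purely for illustration, that ‘int’ stands for "interview"; the class name `RecordingMetadata` and all of its field names are hypothetical and not part of the original slides. The idea is to state each piece of information in a named field, so that gaps become visible instead of hidden inside an abbreviation.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class RecordingMetadata:
    """Explicit metadata for one recording (all field names are hypothetical)."""
    filename: str
    event_type: str                  # spelled out rather than abbreviated
    speaker: Optional[str] = None
    date: Optional[str] = None       # intended format: ISO "YYYY-MM-DD"
    location: Optional[str] = None


record = RecordingMetadata(
    filename="1st_int_john_5Aug.wav",
    event_type="interview",  # assumption: "int" means "interview"
    speaker="john",
    date=None,               # "5Aug" gives no year, so no full date is known
)

# Fields left as None are exactly the information the filename does not carry.
missing = [name for name, value in asdict(record).items() if value is None]
print(missing)
```

Listing the empty fields is one way of answering the slide's second question for a given file.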
11
Data modelling
[Diagram, outermost to innermost: World/universe; Domain; Relevant entities, properties, relationships]
We also need formal ways to represent these
12
Data modelling
data modelling is the process of designing your data management system:
- what information do you need to record?
- what are the units of information?
- what are their properties (attributes)?
- what are the relationships between the units of information?
- how is the information etc likely to change in the future?
- how can all this be represented?
13
Data management
two well-known formats for structured data:
- relational database
- eXtensible Markup Language (XML)
these are methods, not software or hardware
any other system for well-structured data could be OK, but generally it has a smaller community of users, so fewer tools and less support ... so errors are more likely
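As a rough illustration of the XML option, the sketch below encodes two recordings as XML and reads them back with Python's standard `xml.etree.ElementTree` module. The element and attribute names (`recordings`, `recording`, `file`, `type`, `speaker`) are hypothetical, and the expansions "interview" and "conversation" are assumptions about what the filename abbreviations mean.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the structure itself records which value is which.
xml_text = """
<recordings>
  <recording file="1st_int_john_5Aug.wav">
    <type>interview</type>
    <speaker>john</speaker>
  </recording>
  <recording file="market_conv_mj.wav">
    <type>conversation</type>
  </recording>
</recordings>
""".strip()

root = ET.fromstring(xml_text)

# Any standard XML parser can recover the structure, not just one program.
files_by_type = {
    rec.findtext("type"): rec.get("file")
    for rec in root.findall("recording")
}
print(files_by_type)
```

Because the format is a widely supported method rather than a product, the same file can be read by many tools.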
14
Databases
Note that database has 3 senses:
- a body of related information
- type of software (eg Oracle, Access, Filemaker)
- a model for the domain of information (ie. formulation of entities and relationships)
15
Relational format
- Uses tables
- Table rows represent entities in a domain
- Table columns represent properties/attributes of entities
- Each cell represents one atomic unit of data
- The order of rows and columns has no significance
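These properties can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch only: the table `speaker`, its columns, and the placeholder rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table per kind of entity; each column is one property;
# each cell holds a single atomic value.
conn.execute("CREATE TABLE speaker (name TEXT, first_language TEXT)")

# Hypothetical rows, deliberately inserted in no particular order.
conn.executemany(
    "INSERT INTO speaker (name, first_language) VALUES (?, ?)",
    [("Speaker B", "Language Y"), ("Speaker A", "Language X")],
)

# Row order carries no meaning, so any ordering is requested explicitly.
rows = conn.execute(
    "SELECT name, first_language FROM speaker ORDER BY name"
).fetchall()

# Column order carries no meaning either: columns are selected by name.
swapped = conn.execute(
    "SELECT first_language, name FROM speaker ORDER BY name"
).fetchall()

print(rows)
print(swapped)
```

The two queries return the same information; only the presentation differs.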
16
Representing a relational design
simplest example
[Diagram: a table labelled TABLE NAME containing one field, "field name"]
17
Representing a relational design
less trivial entity
[Diagram: a table labelled TABLE NAME containing "field 1" and "field 2"]
18
Representing a relational design
less trivial domain
[Diagram: table CONTINENT (name) connected to table COUNTRY (name); the connector symbol = one to many]
19
Non-trivial domains
non-trivial domains have many-to-many relationships
[Diagram: table AUTHOR (name) connected to table SUBJECT (name) .....]
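One common way to represent a many-to-many relationship in a relational database is a third, linking table. The sketch below uses Python's `sqlite3` module; the linking table `author_subject`, the `id` columns, and the sample rows are hypothetical additions and do not appear on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE subject (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Hypothetical linking table: one row per (author, subject) pair.
CREATE TABLE author_subject (
    author_id  INTEGER NOT NULL REFERENCES author(id),
    subject_id INTEGER NOT NULL REFERENCES subject(id),
    PRIMARY KEY (author_id, subject_id)
);
INSERT INTO author  VALUES (1, 'Author A'), (2, 'Author B');
INSERT INTO subject VALUES (1, 'phonology'), (2, 'syntax');
INSERT INTO author_subject VALUES (1, 1), (1, 2), (2, 2);
""")


def subjects_of(author_name):
    """Return the subject names linked to one author, sorted by name."""
    rows = conn.execute(
        """SELECT subject.name FROM subject
           JOIN author_subject ON author_subject.subject_id = subject.id
           JOIN author ON author.id = author_subject.author_id
           WHERE author.name = ? ORDER BY subject.name""",
        (author_name,),
    ).fetchall()
    return [name for (name,) in rows]


def authors_of(subject_name):
    """Return the author names linked to one subject, sorted by name."""
    rows = conn.execute(
        """SELECT author.name FROM author
           JOIN author_subject ON author_subject.author_id = author.id
           JOIN subject ON subject.id = author_subject.subject_id
           WHERE subject.name = ? ORDER BY author.name""",
        (subject_name,),
    ).fetchall()
    return [name for (name,) in rows]


print(subjects_of("Author A"))
print(authors_of("syntax"))
```

The relationship can be followed in either direction because neither table stores a list of the other's rows.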
20
From model to implementation
implementing table relationships
[Diagram: table CONTINENT (name, id) connected to table COUNTRY (name, continent_id)]
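The CONTINENT/COUNTRY design can be sketched in SQLite via Python's `sqlite3` module. The table and column names follow the slide; the column types, the `NOT NULL` constraints, and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE continent (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- The "many" side stores the id of the one continent it belongs to.
CREATE TABLE country (
    name TEXT NOT NULL,
    continent_id INTEGER NOT NULL REFERENCES continent(id)
);
INSERT INTO continent (id, name) VALUES (1, 'Africa'), (2, 'Asia');
INSERT INTO country (name, continent_id)
    VALUES ('Kenya', 1), ('Ghana', 1), ('Japan', 2);
""")

# Following the relationship: all countries of one continent.
african = [
    name
    for (name,) in conn.execute(
        """SELECT country.name FROM country
           JOIN continent ON continent.id = country.continent_id
           WHERE continent.name = ? ORDER BY country.name""",
        ("Africa",),
    )
]
print(african)

# A country pointing at a continent that does not exist is rejected.
try:
    conn.execute(
        "INSERT INTO country (name, continent_id) VALUES ('Nowhere', 99)"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(orphan_rejected)
```

Storing `continent_id` in COUNTRY, rather than a list of countries in CONTINENT, keeps every cell atomic.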
21
Designing a database
- Determine the domain, entities and relationships
- Experiment with scenarios
- Any non-trivial model will evolve as it is thought out and tested
- Normalisation is the process of refining models
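One typical refinement step is removing repeated information by giving it a table of its own. The sketch below shows only that single step, in plain Python, and is not a full treatment of normalisation; the file names, speaker labels, and the `speaker_age` field are hypothetical.

```python
# Hypothetical flat records: speaker details are repeated on every row.
flat = [
    {"file": "rec_001.wav", "speaker": "Speaker A", "speaker_age": 54},
    {"file": "rec_002.wav", "speaker": "Speaker A", "speaker_age": 54},
    {"file": "rec_003.wav", "speaker": "Speaker B", "speaker_age": 31},
]

speakers = {}    # speaker name -> {"id": ..., "age": ...}, stored once each
recordings = []  # one row per file, referring to a speaker by id

for row in flat:
    if row["speaker"] not in speakers:
        speakers[row["speaker"]] = {
            "id": len(speakers) + 1,
            "age": row["speaker_age"],
        }
    recordings.append(
        {"file": row["file"], "speaker_id": speakers[row["speaker"]]["id"]}
    )

print(speakers)
print(recordings)
```

After the split, correcting a speaker's age means changing one value instead of every recording row that mentions the speaker.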
22
Practical example Create a database model for some audio metadata
23
What does all this achieve?
- conceptual/intellectual validity
- scalable, searchable, modular
- machine readable
- in fact, portable:
  - complete
  - explicit
  - documented
  - preservable
  - transferable
  - accessible
  - adaptable
  - not technology-specific
24
Stop here!