Data Management David Nathan & Peter Austin & Robert Munro.


1 Data Management David Nathan & Peter Austin & Robert Munro

2 This section
1. Data management
2. Properties of data
3. Relational data model
4. XML
5. Example

3 Workflows - description vs documentation
Description:
• something happened → something inscribed
• you applied knowledge, made decisions
• NOT OF INTEREST!
• representations, lists, summaries, analyses
• cleaned up, selected, analysed
• archived, presented, published
Documentation:
• something happened → recording
• you applied knowledge, techniques
• FOCUS OF INTEREST!
• representations, eg transcription, annotation
• made decisions, applied linguistic knowledge
• archived &... ??
• recapitulates

4 Choosing values/priorities
• Standards & compliance
• Adeptness with tools
• Modelling of phenomena, architecture of data
• Dissemination/publishing
• Preserving
• Ethics, responsibility, protocol
• Range, comprehensiveness
• Intellectual rigour
• Which are priorities?
• Which are dispensable?

5 Data should be:
• explicit
• consistent
• robust
• meaningful
• conventional
• adaptable, convertible, machine readable etc
• useful!

6 “Portability”
• Bird and Simons 2003: language documentation data needs to have integrity, flexibility, longevity

7 “Portability”
• complete
• explicit
• documented
• preservable
• transferable
• accessible
• adaptable
• not technology-specific
• (also appropriate, accurate, useful etc!!)

8 Data management
• the way that data is structured is itself information, and may be complex
• properly structured data allows:
  • usage, including manipulation, conversion, derivation
  • preservation
  • machine readability

9 Data management systems
• a data management system is a system you design for storing data and metadata:
  • information about content and structures
  • relationships between units of information
• it is not necessarily tied to any particular software, or even a computer

10 Naive management using filenames
• a (too) simple management system:
• information about a recording is captured in the filenames:
  1st_int_john_5Aug.wav
  market_conv_mj.wav
  ….
• what does ‘int’ mean?
• what information about the recording is missing?
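
One way to see what the filename scheme leaves out is to write the same information as an explicit record. The sketch below is illustrative only: the class `RecordingMetadata`, its field names, and the reading of `john` as a speaker name are assumptions, not part of the slides. Anything the filename does not state unambiguously is left empty.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class RecordingMetadata:
    """Hypothetical explicit record for one recording (field names are assumptions)."""
    filename: str
    session_type: Optional[str] = None  # 'int' is ambiguous, so it is left unset
    speaker: Optional[str] = None
    date: Optional[str] = None          # '5Aug' has no year, so no full date is recoverable
    location: Optional[str] = None
    language: Optional[str] = None


def missing_fields(record: RecordingMetadata) -> List[str]:
    """Return the names of all fields that carry no value."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]


# Assumption: 'john' in the filename names the speaker.
rec = RecordingMetadata(filename="1st_int_john_5Aug.wav", speaker="john")
print(missing_fields(rec))
```

Listing the empty fields makes the gaps visible, which a filename alone never does.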

11 Data modeling
• World/universe
• Domain
• Relevant
  • entities
  • properties
  • relationships
• We also need formal ways to represent these

12 Data modeling
• data modelling is the process of designing your data management system:
  • what information do you need to record?
  • what are the units of information?
  • what are their properties (attributes)?
  • what are the relationships between the units of information?
  • how is the information etc likely to change in the future?
  • how can all this be represented?

13 Data management
• two well-known formats for structured data:
  • relational database
  • eXtensible Markup Language (XML)
• these are methods, not software or hardware
• any other system for well-structured data could be OK, but generally it has:
  • a smaller community of users, so fewer tools and less support ... so errors are more likely
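
As a rough illustration of the second format, the sketch below stores recording metadata as XML and reads it back with Python's standard `xml.etree.ElementTree` module. The element and attribute names (`recordings`, `recording`, `speaker`, `genre`, `id`) and all values are hypothetical, chosen only for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document: each <recording> is one unit of information,
# and its child elements are that unit's properties.
xml_text = """
<recordings>
  <recording id="r1">
    <speaker>Speaker A</speaker>
    <genre>conversation</genre>
  </recording>
  <recording id="r2">
    <speaker>Speaker B</speaker>
    <genre>narrative</genre>
  </recording>
</recordings>
""".strip()

root = ET.fromstring(xml_text)

# Because the structure is explicit, a program can pull out any property by name.
genres = {r.get("id"): r.findtext("genre") for r in root.findall("recording")}
print(genres)
```

The point is the method, not the tool: the same document could be read by any XML-aware software.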

14 Databases
• Note that database has 3 senses:
  • a body of related information
  • type of software (eg Oracle, Access, Filemaker)
  • a model for the domain of information (ie. formulation of entities and relationships)

15 Relational format
• Uses tables
• Table rows represent entities in a domain
• Table columns represent properties/attributes of entities
• Each cell represents one atomic unit of data
• The order of rows and columns has no significance
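
The properties listed above can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch under stated assumptions: the table `speaker`, its columns, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table: each row is an entity (a speaker), each column a property.
conn.execute(
    "CREATE TABLE speaker ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " birth_year INTEGER)"  # one atomic value per cell
)
conn.executemany(
    "INSERT INTO speaker (name, birth_year) VALUES (?, ?)",
    [("Mary", 1950), ("John", 1962)],  # hypothetical sample data
)

# Row order carries no meaning in the relational model, so any order
# we want must be requested explicitly with ORDER BY.
rows = conn.execute(
    "SELECT name, birth_year FROM speaker ORDER BY name"
).fetchall()
print(rows)
```

Columns are likewise addressed by name rather than by position, which is why their order in the table definition does not matter.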

16 Representing a relational design
• simplest example
[Diagram: a table labelled TABLE NAME with one field, labelled field name]

17 Representing a relational design
• less trivial entity
[Diagram: a table labelled TABLE NAME with two fields, field 1 and field 2]

18 Representing a relational design
• less trivial domain
[Diagram: tables CONTINENT (name) and COUNTRY (name), joined by a connector; legend: connector = one to many]

19 Non-trivial domains
• non-trivial domains have many-to-many relationships
[Diagram: tables AUTHOR (name) and SUBJECT (name) .....]
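
A common way to implement a many-to-many relationship in a relational database is a third, linking table. The sketch below assumes this technique; the slides do not say how the AUTHOR/SUBJECT case is implemented. The table `author_subject` and all sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE subject (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Linking table (an assumption): one row per (author, subject) pairing.
CREATE TABLE author_subject (
    author_id  INTEGER NOT NULL REFERENCES author(id),
    subject_id INTEGER NOT NULL REFERENCES subject(id),
    PRIMARY KEY (author_id, subject_id)
);

INSERT INTO author  VALUES (1, 'Author A'), (2, 'Author B');
INSERT INTO subject VALUES (1, 'Phonology'), (2, 'Syntax');
INSERT INTO author_subject VALUES (1, 1), (1, 2), (2, 2);
""")

# An author can have many subjects, and a subject many authors.
pairs = conn.execute("""
    SELECT a.name, s.name
    FROM author a
    JOIN author_subject x ON x.author_id = a.id
    JOIN subject s        ON s.id = x.subject_id
    ORDER BY a.name, s.name
""").fetchall()
print(pairs)
```

Each pairing is stored exactly once, so neither an author's name nor a subject's name has to be repeated across rows.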

20 From model to implementation
• implementing table relationships
[Diagram: tables CONTINENT and COUNTRY, each with a name field; the fields id and continent_id implement the relationship]
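
The `id` and `continent_id` fields on this slide suggest the usual key-based implementation of a one-to-many relationship. The sketch below assumes that reading: `continent.id` is a primary key and `country.continent_id` refers to it. The column types, the sample rows, and the use of SQLite are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces references when asked
conn.executescript("""
CREATE TABLE continent (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE country (
    id           INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    continent_id INTEGER NOT NULL REFERENCES continent(id)
);
INSERT INTO continent VALUES (1, 'Africa'), (2, 'Asia');
INSERT INTO country (name, continent_id)
    VALUES ('Kenya', 1), ('Japan', 2), ('India', 2);
""")

# One continent, many countries: count the countries linked to each continent.
counts = conn.execute("""
    SELECT c.name, COUNT(k.id)
    FROM continent c JOIN country k ON k.continent_id = c.id
    GROUP BY c.id ORDER BY c.name
""").fetchall()
print(counts)

# A country pointing at a continent that does not exist is rejected.
try:
    conn.execute("INSERT INTO country (name, continent_id) VALUES ('Atlantis', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

Storing only the continent's `id` in each country row means the continent's name is recorded once and can be corrected in one place.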

21 Designing a database
• Determine the domain, entities and relationships
• Experiment with scenarios
• Any non-trivial model will evolve as it is thought out and tested
• Normalisation is the process of refining models

22 Practical example
• Create a database model for some audio metadata

23 What does all this achieve?
• conceptual/intellectual validity
• scalable, searchable, modular
• machine readable
• in fact, portable:
  • complete
  • explicit
  • documented
  • preservable
  • transferable
  • accessible
  • adaptable
  • not technology-specific

24 Stop here!

