Digital recordkeeping and preservation I

1 Digital recordkeeping and preservation I
Databases : Noark 5 ARK2100 Digital recordkeeping and preservation I 2017 Thomas Sødring P48-R407

2 Unique identification of record units
All objects (e.g. fonds, series) except documentObject should have a unique identification.
This identifier must be unique within the recordkeeping organisation.
Example of a good identifier: a universally unique identifier (UUID), e.g. f81d4fae-7dec-11d0-a765-00a0c91e6bf6.
Noark doesn't require UUIDs, but UUID is a good identifier system for Noark.
It is unlikely that two Noark objects in the same organisation will generate the same identifier.
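As a minimal sketch of the idea above, Python's standard `uuid` module can generate a random (version 4) UUID. The variable name `system_id` is illustrative only, and Noark 5 does not mandate this approach.

```python
import uuid

# Generate a random (version 4) UUID to use as a systemId.
system_id = str(uuid.uuid4())

# The canonical text form is 32 hex digits plus 4 hyphens = 36 characters,
# which fits the CHAR(36) columns used in the DDL slides.
print(system_id, len(system_id))
```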

3 Metadata for <fonds>
Nr.   Name             Occurs  Datatype
M001  systemId         1       String
M020  title
M021  description      0-1
M050  fondsStatus
M300  documentMedium
M301  storageLocation  0-M
M600  createdDate              Date and Time
M601  createdBy
M602  finalisedDate
M603  finalisedBy

4 DDL fonds
CREATE TABLE fonds (
  systemId CHAR(36),
  title VARCHAR(100) NOT NULL,
  description VARCHAR(100),
  fondsStatus CHAR(20),
  documentMedium VARCHAR(50),
  createdDate DATETIME NOT NULL,
  createdBy VARCHAR(100) NOT NULL,
  finalisedDate DATETIME,
  finalisedBy VARCHAR(100),
  PRIMARY KEY (systemId)
);

5 Metadata for <series>
Nr.   Name                Occurs  Datatype
M001  systemId            1       String
M020  title
M021  description         0-1
M051  seriesStatus
M300  documentMedium
M301  storageLocation     0-M
M600  createdDate                 Date and Time
M601  createdBy
M602  finalisedDate
M603  finalisedBy
M107  seriesStartDate             Date
M108  seriesEndDate
M202  referencePrecursor          series.systemId
M203  referenceSuccessor

6 DDL series
CREATE TABLE series (
  systemId CHAR(36),
  title VARCHAR(100) NOT NULL,
  description VARCHAR(100),
  seriesStatus CHAR(20) NOT NULL,
  documentMedium VARCHAR(50),
  createdDate DATETIME NOT NULL,
  createdBy VARCHAR(100) NOT NULL,
  finalisedDate DATETIME,
  finalisedBy VARCHAR(100),
  seriesStartDate DATE,
  seriesEndDate DATE,
  referencePrecursor CHAR(36),
  referenceSuccessor CHAR(36),
  PRIMARY KEY (systemId)
);

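7 [Diagram: the Noark 5 structure (fonds, series, classification system, class, file, record, document description, document object, electronic document), with fonds and series highlighted and the multiplicities 1 and 1..* marked.]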

8 DDL series with foreign key
CREATE TABLE series (
  systemId CHAR(36),
  title VARCHAR(100) NOT NULL,
  description VARCHAR(100),
  seriesStatus CHAR(20) NOT NULL,
  documentMedium VARCHAR(50),
  createdDate DATETIME NOT NULL,
  createdBy VARCHAR(100) NOT NULL,
  finalisedDate DATETIME,
  finalisedBy VARCHAR(100),
  seriesStartDate DATE,
  seriesEndDate DATE,
  referencePrecursor CHAR(36),
  referenceSuccessor CHAR(36),
  referenceFonds CHAR(36),
  PRIMARY KEY (systemId)
);
Note: referenceFonds is not included in the Noark 5 metadata catalogue.

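9 classificationSystem
[Diagram: the Noark 5 structure (fonds, series, classification system, class, file, record, document description, document object, electronic document), with series and classificationSystem highlighted. Multiplicities marked: 1..*, 0..*, 1 and 1..*. One multiplicity is flagged on the slide with the note "?? See 0..1??".]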

10 DDL series with primary key
CREATE TABLE series (
  systemId CHAR(36),
  title VARCHAR(100) NOT NULL,
  description VARCHAR(100),
  seriesStatus CHAR(20) NOT NULL,
  documentMedium VARCHAR(50),
  createdDate DATETIME NOT NULL,
  createdBy VARCHAR(100) NOT NULL,
  finalisedDate DATETIME,
  finalisedBy VARCHAR(100),
  seriesStartDate DATE,
  seriesEndDate DATE,
  referencePrecursor CHAR(36),
  referenceSuccessor CHAR(36),
  referenceFonds CHAR(36),
  referenceClassificationSystem CHAR(36),
  FOREIGN KEY (referenceFonds) REFERENCES fonds (systemId),
  FOREIGN KEY (referenceClassificationSystem) REFERENCES classificationSystem (systemId),
  PRIMARY KEY (systemId)
);
But I can't create a foreign key relationship to a classificationSystem if it doesn't exist.
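The ordering problem noted above can be sketched with Python's built-in `sqlite3` module, used here as a stand-in for the MySQL setup the slides assume. The tables are trimmed to a few columns and the identifiers (`f-1`, `s-1`) are hypothetical. SQLite checks the reference when rows are written, so this sketch shows the failure at INSERT time rather than at CREATE TABLE time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Referenced (parent) tables are created first, then the table that refers to them.
conn.executescript("""
CREATE TABLE fonds (systemId CHAR(36) PRIMARY KEY, title VARCHAR(100) NOT NULL);
CREATE TABLE classificationSystem (systemId CHAR(36) PRIMARY KEY, title VARCHAR(100));
CREATE TABLE series (
    systemId CHAR(36) PRIMARY KEY,
    title VARCHAR(100) NOT NULL,
    referenceFonds CHAR(36),
    referenceClassificationSystem CHAR(36),
    FOREIGN KEY (referenceFonds) REFERENCES fonds (systemId),
    FOREIGN KEY (referenceClassificationSystem) REFERENCES classificationSystem (systemId)
);
""")

# A series that points at a fonds that does not exist is rejected.
try:
    conn.execute("INSERT INTO series VALUES ('s-1', 'Series 1', 'no-such-fonds', NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Once the fonds exists, the same kind of insert is accepted.
conn.execute("INSERT INTO fonds VALUES ('f-1', 'Test fonds')")
conn.execute("INSERT INTO series VALUES ('s-1', 'Series 1', 'f-1', NULL)")
series_count = conn.execute("SELECT COUNT(*) FROM series").fetchone()[0]
print(rejected, series_count)
```

The practical consequence is the same in either system: create and populate the referenced tables before the tables that point at them.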

11 Metadata for classificationSystem
Nr.   Name                Occurs  Avl  Datatype
M001  systemId            1       A    String
M086  classificationType  0-1     A    String
M020  title               1       A    String
M021  description         0-1     A    String
M600  createdDate         1       A    Date and Time
M601  createdBy           1       A    String
M602  finalisedDate       0-1     A    Date and Time
M603  finalisedBy         0-1     A    String

12 DDL classificationSystem
CREATE TABLE classificationSystem (
  systemId CHAR(36),
  classificationType VARCHAR(100),
  title VARCHAR(100),
  description VARCHAR(100),
  createdDate DATETIME,
  createdBy VARCHAR(100),
  finalisedDate DATETIME,
  finalisedBy VARCHAR(100),
  PRIMARY KEY (systemId)
);

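13 classificationSystem
[Diagram: the Noark 5 structure (fonds, series, classification system, class, file, record, document description, document object, electronic document), with classificationSystem and class highlighted and the multiplicities 1 and 1..* marked.]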

14 Metadata for class
Name           Occurs  Avl.  Datatype
systemId       1       A     String
classId
title
description    0-1
createdDate                  Date and Time
createdBy
finalisedDate
finalisedBy

15 DDL class
CREATE TABLE class (
  systemId CHAR(36) NOT NULL,
  classId CHAR(36) NOT NULL,
  title VARCHAR(100),
  description VARCHAR(100),
  createdDate DATETIME NOT NULL,
  createdBy VARCHAR(100) NOT NULL,
  finalisedDate DATETIME,
  finalisedBy VARCHAR(100),
  referenceClassificationSystem CHAR(36),
  FOREIGN KEY (referenceClassificationSystem) REFERENCES classificationSystem (systemId),
  PRIMARY KEY (systemId)
);

16 [Diagram: the Noark 5 structure (fonds, series, classification system, class, file, record, document description, document object, electronic document), with the multiplicities 1 and 0..* marked.]

17 Metadata for file
Nr.   Name                      Occurs  Datatype
M001  systemId                  1       String
M003  fileId
M020  title
M025  officialTitle             0-1
M021  description
M022  noekkelord (keyword)      0-M
M300  documentMedium
M301  storageLocation
M600  createdDate                       Date and Time
M601  createdBy
M602  finalisedDate
M603  finalisedBy
M208  referenceSeries                   series.systemId
M711  businessSpecificMetadata          User defined structure

18 DDL file
CREATE TABLE file (
  systemId CHAR(36) NOT NULL,
  fileId CHAR(36) NOT NULL,
  title VARCHAR(100),
  officialTitle VARCHAR(100),
  description VARCHAR(100),
  documentMedium VARCHAR(50),
  createdDate DATETIME NOT NULL,
  createdBy VARCHAR(100) NOT NULL,
  finalisedDate DATETIME,
  finalisedBy VARCHAR(100),
  referenceSeries CHAR(36),
  referenceClass CHAR(36),
  FOREIGN KEY (referenceClass) REFERENCES class (systemId),
  FOREIGN KEY (referenceSeries) REFERENCES series (systemId),
  PRIMARY KEY (systemId)
);

19 Metadata for record
Nr.   Name             Occurs  Avl  Datatype
M001  systemId         1       A    String
M600  createdDate      1       A    Date and Time
M601  createdBy        1       A    String
M604  archivedDate     1       A    Date and Time
M605  archivedBy       1       A    String
M208  referenceSeries  0-M     A    series.systemId

20 [Diagram: the Noark 5 structure (fonds, series, classification system, class, file, record, document description, document object, electronic document), with the multiplicities 0..*, 1 and 0..* marked.]

21 DDL record
CREATE TABLE record (
  systemId CHAR(36),
  createdDate DATETIME NOT NULL,
  createdBy VARCHAR(100) NOT NULL,
  archivedDate DATETIME,
  archivedBy VARCHAR(100),
  referenceSeries CHAR(36), # This is 0-1, not 0-M
  referenceFile CHAR(36),
  referenceClass CHAR(36),
  FOREIGN KEY (referenceSeries) REFERENCES series (systemId),
  FOREIGN KEY (referenceFile) REFERENCES file (systemId),
  FOREIGN KEY (referenceClass) REFERENCES class (systemId),
  PRIMARY KEY (systemId)
);

22 [Diagram: the Noark 5 structure (fonds, series, classification system, class, file, record, document description, document object, electronic document), with record and documentDescription highlighted and the multiplicities 1..* and 0..* marked.]

23 Metadata for <documentDescription>
Nr.   Name                    Occurs  Datatype
M001  systemId                1       String
M083  documentType
M054  documentStatus
M020  title
M021  description             0-1
M024  author                  0-M
M600  createdDate                     Date and Time
M601  createdBy
M300  documentMedium
M301  storageLocation
M208  referenceSeries                 series.systemId
M217  associatedWithRecordAs
M007  documentNumber                  Integer
M620  associationDate
M621  associatedBy

24 DDL documentDescription
CREATE TABLE documentDescription (
  systemId CHAR(36),
  documentType VARCHAR(100) NOT NULL,
  documentStatus VARCHAR(100) NOT NULL,
  title VARCHAR(100),
  description VARCHAR(100),
  author VARCHAR(255),
  createdDate DATETIME,
  createdBy VARCHAR(100),
  documentMedium VARCHAR(50),
  referenceSeries CHAR(36),
  associatedWithRecordAs VARCHAR(100) NOT NULL,
  associationDate DATETIME,
  associatedBy VARCHAR(100),
  documentNumber INT,
  referenceFile CHAR(36),
  referenceRecord CHAR(36),
  FOREIGN KEY (referenceFile) REFERENCES file (systemId),
  FOREIGN KEY (referenceRecord) REFERENCES record (systemId),
  PRIMARY KEY (systemId)
);
# This is 0-1, not 1-M

25 [Diagram: the Noark 5 structure (fonds, series, classification system, class, file, record, document description, document object, electronic document), with the multiplicities 0..*, 1 and 0..* marked.]

26 Metadata for <documentObject>
Nr.   Name                   Occurs  Datatype
M005  versionNumber          1       Integer
M700  variantFormat                  String
M701  format
M702  formatDetails          0-1
M600  createdDate                    Date and Time
M601  createdBy
M218  referenceDocumentFile          String (location + filename)
M705  checksum
M706  checksumAlgorithm
M707  fileSize

27 DDL documentObject
CREATE TABLE documentObject (
  systemId CHAR(36),
  versionNumber INT NOT NULL,
  variantFormat VARCHAR(255) NOT NULL,
  format VARCHAR(255) NOT NULL,
  formatDetails VARCHAR(255),
  createdDate DATETIME NOT NULL,
  createdBy VARCHAR(255) NOT NULL,
  referenceDocumentFile VARCHAR(255) NOT NULL,
  checksum VARCHAR(255) NOT NULL,
  checksumAlgorithm VARCHAR(255) NOT NULL,
  fileSize VARCHAR(255) NOT NULL,
  referenceDocumentDescription CHAR(36),
  referenceRecord CHAR(36),
  PRIMARY KEY (systemId),
  FOREIGN KEY (referenceDocumentDescription) REFERENCES documentDescription (systemId),
  FOREIGN KEY (referenceRecord) REFERENCES record (systemId)
);

28 INSERT INTO TABLE (attributelist) VALUES (valuelist);

29 INSERT fonds
INSERT INTO fonds (
  systemId, title, description, fondsStatus,
  documentMedium, createdDate, createdBy
) VALUES (
  "b1b6b3f b3d9-4ebf c", "HiOA Arkivet", "Test fonds", "Created",
  "Electronic records", " ", "admin"
);

30 INSERT series
INSERT INTO series (
  systemId, title, description, seriesStatus, documentMedium,
  createdDate, createdBy, seriesStartDate, referenceFonds
) VALUES (
  "6cf257fc-262b-4a40-9e00-766d718d8a64", "Series 1", null, "Active period", "Electronic records",
  " ", "admin", " ", "b1b6b3f b3d9-4ebf c"
);

31 UPDATE SET WHERE;

32 UPDATE TABLE SET Attribute='value' WHERE;

33 UPDATE series
UPDATE series
SET description = "Updated description"
WHERE systemId = "6cf257fc-262b-4a40-9e00-766d718d8a64";

UPDATE caseFile
SET caseStatus = "Ended period"
WHERE caseStatus = "Active period";

34 DELETE FROM WHERE;

35 DELETE FROM TABLE WHERE;

36 DELETE series
# Does this work?
DELETE FROM fonds
WHERE systemId = "b1b6b3f b3d9-4ebf c";

# What about this?
DELETE FROM series
WHERE systemId = "6cf257fc-262b-4a40-9e00-766d718d8a64";
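One way to explore the two questions above is the following sketch, which uses Python's built-in `sqlite3` module as a stand-in for MySQL. It assumes the foreign key from series to fonds is enforced with the default (no cascade) behaviour, and the short identifiers `fonds-1` and `series-1` are hypothetical placeholders for the UUIDs on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE fonds (systemId CHAR(36) PRIMARY KEY, title VARCHAR(100) NOT NULL);
CREATE TABLE series (
    systemId CHAR(36) PRIMARY KEY,
    title VARCHAR(100) NOT NULL,
    referenceFonds CHAR(36),
    FOREIGN KEY (referenceFonds) REFERENCES fonds (systemId)
);
INSERT INTO fonds VALUES ('fonds-1', 'HiOA Arkivet');
INSERT INTO series VALUES ('series-1', 'Series 1', 'fonds-1');
""")

def try_delete(sql):
    """Run a DELETE and report whether the database accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# Deleting the fonds first fails while a series still refers to it.
fonds_first = try_delete("DELETE FROM fonds WHERE systemId = 'fonds-1'")
# Deleting the series is allowed, and after that the fonds can go too.
series_deleted = try_delete("DELETE FROM series WHERE systemId = 'series-1'")
fonds_second = try_delete("DELETE FROM fonds WHERE systemId = 'fonds-1'")
print(fonds_first, series_deleted, fonds_second)
```

Under these assumptions the child rows have to be removed before the parent row they reference.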

37 SQL DISTINCT
You may find that you want a list of the distinct values of an attribute in a table.
Useful to identify the different types of files in the database: doc, docx, xls, pdf, etc.
Useful to view the various status values that have been in use, e.g. document status.

38 SELECT with DISTINCT
Create a list of unique file formats in the database:
SELECT DISTINCT format FROM documentObject;
Create a list of the various document statuses in the database:
SELECT DISTINCT documentStatus FROM documentDescription;
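A small runnable sketch of the first query, using Python's built-in `sqlite3` module in place of MySQL. The table is cut down to two columns and the rows are invented test data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documentObject (systemId TEXT PRIMARY KEY, format TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO documentObject VALUES (?, ?)",
    [("d1", "pdf"), ("d2", "docx"), ("d3", "pdf"), ("d4", "xls"), ("d5", "pdf")],
)

# Without DISTINCT every row contributes a value, duplicates included.
all_formats = [row[0] for row in conn.execute("SELECT format FROM documentObject")]

# With DISTINCT each format appears once. ORDER BY is added only to make
# the order of the result predictable.
distinct_formats = [
    row[0]
    for row in conn.execute("SELECT DISTINCT format FROM documentObject ORDER BY format")
]
print(len(all_formats), distinct_formats)
```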

39 SELECT with ORDER BY
The DBMS returns results in the same order as it retrieves data from the table.
Sometimes it is desirable to be able to sort these results.

40 WHERE Attribute = 'value'
SELECT Attribute(s) FROM Relation WHERE Attribute = 'value' ORDER BY Attribute [ ASC | DESC ] ;

41 SELECT with range and Order By
SELECT * FROM file
WHERE createdDate BETWEEN ' ' AND ' ';

SELECT * FROM file
WHERE createdDate BETWEEN ' ' AND ' '
ORDER BY createdDate ASC;

SELECT * FROM file
WHERE createdDate BETWEEN ' ' AND ' '
ORDER BY createdBy DESC;

42 SELECT and LIMIT
You can limit the result set and specify the maximum number of rows you want back.
You can also limit the result set by ranges, e.g. ignore the first two rows and fetch the next 5.
SELECT * FROM file LIMIT 3;
SELECT * FROM file LIMIT 2,5;
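The two forms of LIMIT can be tried out with the sketch below, which uses Python's built-in `sqlite3` module (SQLite accepts the same `LIMIT offset, count` shorthand). The table is simplified: `fileId` is an integer here purely so the skipped and returned rows are easy to see, and an ORDER BY is added because row order is otherwise not guaranteed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file (fileId INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO file VALUES (?, ?)", [(n, f"file {n}") for n in range(1, 11)])

# At most three rows.
first_three = [r[0] for r in conn.execute("SELECT fileId FROM file ORDER BY fileId LIMIT 3")]

# Skip the first two rows, then return the next five.
skip_two_take_five = [
    r[0] for r in conn.execute("SELECT fileId FROM file ORDER BY fileId LIMIT 2, 5")
]
print(first_three, skip_two_take_five)
```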

43 SQL GROUP BY
A DBMS supports a number of built-in calculation functions.
These can also be used in conjunction with the GROUP BY clause:
AVG()
MAX()
MIN()
COUNT()
MySQL supports far more than these.

44 SQL GROUP BY
This is a very useful part of a DBMS and can answer questions like: How many documents are there of each type?
SELECT documentType, COUNT(*) AS 'Total'
FROM documentDescription
GROUP BY documentType;
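The query above can be run against a few invented rows with Python's built-in `sqlite3` module. The table is reduced to two columns and the document types are made-up test values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documentDescription (systemId TEXT PRIMARY KEY, documentType TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO documentDescription VALUES (?, ?)",
    [("a", "Letter"), ("b", "Letter"), ("c", "Invoice"), ("d", "Report"), ("e", "Letter")],
)

# One output row per documentType, with the number of rows in each group.
totals = dict(
    conn.execute(
        "SELECT documentType, COUNT(*) AS Total FROM documentDescription GROUP BY documentType"
    )
)
print(totals)
```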

45 SQL and JOINs
So far we have explored queries where all the data is contained within a single table.
This corresponds to flat files and does not show the strength of the relational model, which can join data across tables.
Data in tables can be linked together in different ways:
Inner / Natural join
Anti join
Right Outer join
Left Outer join
There are more, but we stick to these.

46 SQL Inner Join
A SQL INNER JOIN between two tables should return all rows that match on a common attribute.
Create a list of records and files:
SELECT * FROM record, file
WHERE record.referenceFile = file.systemId;
or
SELECT * FROM record INNER JOIN file
ON record.referenceFile = file.systemId;

47 SQL Inner Join
[Figure: record ⋈ file. Sample rows from the record table (systemId, createdDate, createdBy, referenceFile) and the file table (systemId, title: file 1, file 2, file 3), followed by the rows of the inner join result.]

48 SQL Antijoin
A SQL Anti Join between two tables, A and B, is the set of all rows in A that do not have a matching row in B on a common attribute.
Can be useful to answer: Which records have no associated file?
SELECT * FROM record
WHERE (referenceFile) NOT IN (
  SELECT systemId FROM file);
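One detail worth checking with an antijoin is how NULL values behave. The sketch below uses Python's built-in `sqlite3` module with invented rows: one record with a matching file, one whose referenceFile is NULL, and one that points at a file that is not in the table (possible here because foreign key enforcement is left off). It compares the NOT IN form above with a NOT EXISTS form, which is an alternative formulation and not something taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE file (systemId TEXT PRIMARY KEY, title TEXT);
CREATE TABLE record (systemId TEXT PRIMARY KEY, referenceFile TEXT);
INSERT INTO file VALUES ('f1', 'file 1'), ('f2', 'file 2');
INSERT INTO record VALUES ('r1', 'f1'), ('r2', NULL), ('r3', 'gone');
""")

# NOT IN: a NULL referenceFile compares as "unknown", so r2 is not returned.
not_in = [
    r[0]
    for r in conn.execute(
        "SELECT systemId FROM record "
        "WHERE referenceFile NOT IN (SELECT systemId FROM file) ORDER BY systemId"
    )
]

# NOT EXISTS: returns every record for which no matching file row is found.
not_exists = [
    r[0]
    for r in conn.execute(
        "SELECT systemId FROM record WHERE NOT EXISTS "
        "(SELECT 1 FROM file WHERE file.systemId = record.referenceFile) ORDER BY systemId"
    )
]
print(not_in, not_exists)
```

If records without a file are stored with a NULL referenceFile, the NOT EXISTS form (or an explicit `referenceFile IS NULL` test) is the one that finds them.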

49 SQL AntiJoin
[Figure: antijoin of record and file. Sample rows from the record table (systemId, createdDate, createdBy, referenceFile) and the file table (systemId, title: file 2, file 3), followed by the records whose referenceFile has no matching row in file.]

50 SQL Left Outer Join
A SQL LEFT OUTER JOIN between two tables, A and B, should return all rows in A and all rows in B that match rows in A on a common attribute.
SELECT * FROM record LEFT OUTER JOIN file
ON record.referenceFile = file.systemId;

51 SQL Left Outer Join
[Figure: record ⟕ file. Sample rows from the record table (systemId, createdDate, createdBy, referenceFile) and the file table (systemId, title: file 2, file 3), followed by the join result, in which records without a matching file have null in the file columns.]

52 SQL Right Outer Join
A SQL RIGHT OUTER JOIN between two tables, A and B, should return all rows in B and all rows in A that match rows in B on a common attribute.
SELECT * FROM record RIGHT OUTER JOIN file
ON record.referenceFile = file.systemId;

53 SQL Right Outer Join
[Figure: record ⟖ file. Sample rows from the record table (systemId, createdDate, createdBy, referenceFile) and the file table (systemId, title: file 2, file 3, file 4), followed by the join result, in which file 4 has no matching record and its record columns are null.]

54 Joins and extractions
Does it matter which join we use?
Inner Join between file and record:
  Only returns files that have an associated record.
  No empty files or 'orphaned' records.
Left Outer Join between file and record:
  Returns all files that have records as well as empty files.
  No 'orphaned' records.
Right Outer Join between file and record:
  Returns all files that have a record.
  All 'orphaned' records but no empty files.
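The three cases above can be compared side by side with the sketch below, using Python's built-in `sqlite3` module and invented rows: file f1 has a record, file f2 is empty, and record r2 is 'orphaned' (its referenceFile is NULL). Because older SQLite versions lack RIGHT OUTER JOIN, the right outer join is written as the equivalent left outer join with the tables swapped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE file (systemId TEXT PRIMARY KEY, title TEXT);
CREATE TABLE record (systemId TEXT PRIMARY KEY, referenceFile TEXT);
INSERT INTO file VALUES ('f1', 'file 1'), ('f2', 'file 2');
INSERT INTO record VALUES ('r1', 'f1'), ('r2', NULL);
""")

def pairs(sql):
    """Return the (file.systemId, record.systemId) pairs produced by a query."""
    return conn.execute(sql).fetchall()

inner = pairs(
    "SELECT file.systemId, record.systemId FROM file "
    "INNER JOIN record ON record.referenceFile = file.systemId"
)
left = pairs(
    "SELECT file.systemId, record.systemId FROM file "
    "LEFT OUTER JOIN record ON record.referenceFile = file.systemId "
    "ORDER BY file.systemId"
)
# file RIGHT OUTER JOIN record, expressed as record LEFT OUTER JOIN file.
right = pairs(
    "SELECT file.systemId, record.systemId FROM record "
    "LEFT OUTER JOIN file ON record.referenceFile = file.systemId "
    "ORDER BY record.systemId"
)
print(inner, left, right)
```

For an extraction, the choice of join therefore decides whether empty files or orphaned records end up in the output at all.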

55 Views
A view is a stored SQL query that looks like a table and becomes a sort of virtual table.
A view can be used to:
  define a subset of tables / data
  link data from multiple tables (JOIN)
You can limit a user's access to a view.
Views can be specified with read or write access to data.

56 View with read access
A view that provides guest access to all tuples in the file table, but where the user can only see fileId, officialTitle and createdDate:
CREATE VIEW publicFile AS
SELECT fileId, officialTitle, createdDate
FROM file;

GRANT SELECT ON noark5.publicFile TO guest;
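The column-hiding effect of the view can be checked with the sketch below, which uses Python's built-in `sqlite3` module. The file table is trimmed and the row is invented. SQLite has no user accounts, so the GRANT step is not reproduced; only the CREATE VIEW part is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE file (systemId TEXT PRIMARY KEY, fileId TEXT, "
    "officialTitle TEXT, createdDate TEXT, createdBy TEXT)"
)
conn.execute(
    "INSERT INTO file VALUES ('s1', '2017/1', 'Public title', '2017-01-15 10:00:00', 'admin')"
)

# The view exposes three of the five columns of the underlying table.
conn.execute(
    "CREATE VIEW publicFile AS SELECT fileId, officialTitle, createdDate FROM file"
)

cursor = conn.execute("SELECT * FROM publicFile")
view_columns = [d[0] for d in cursor.description]
rows = cursor.fetchall()
print(view_columns, rows)
```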

57 View for public journal
A view that can generate the public journal:
CREATE VIEW publicJournal AS
SELECT regEntry.journaldato,
       regEntry.registreringsId,
       regEntry.officialTitle,
       regEntry.description,
       regEntry.documentDate,
       caseFile.caseResponsible
FROM registryEntry AS regEntry
INNER JOIN caseFile
ON referenceFile = caseFile.systemId;

