1
j.mp/RU_ABOUT patrick.pu@nus.edu.sg WORKSHOPS SCHEDULE CONTACT US
2
William Chong D’Linkup Pte. Ltd. Data, for everyone
Workshop: Encoding text for visualization and analyses using the TEI standard
3
Principles WORK SHOP
4
5
Attack plan Hands-on Overview XML TEI Content Expertise
6
Attack plan Hands-on XML TEI Overview Content Expertise
7
Annotation Meaning Encoding Markup
8
What can be encoded? Metadata, Content, Textual Structure and Layout
9
eXtensible Markup Language: many elements (eXtensible), formal method of encoding (Markup), Language
10
The need for a common language
<person>Tan Quee Lan</person> is a seamstress.
Lim Teck is <name>Tan Quee Lan</name>’s husband.
Text Encoding Initiative (1987), P5 Standards
11
Examples of TEI
12
How to apply TEI: http://www.tei-c.org/Guidelines/P5/
13
Metadata: Header
<teiHeader>
  <fileDesc>
    <titleStmt>
      <title>Shakespeare: the first folio (1623) in electronic form</title>
      <author>Shakespeare, William ( )</author>
    </titleStmt>
    <publicationStmt>
      <distributor>Oxford Text Archive</distributor>
      <address>
        <addrLine>13 Banbury Road, Oxford OX2 6NN, UK</addrLine>
      </address>
      <idno type="OTA">119</idno>
      <availability>
        <p>Freely available on a non-commercial basis.</p>
      </availability>
      <date when="1968">1968</date>
    </publicationStmt>
    <sourceDesc>
      <bibl>The first folio of Shakespeare, prepared by Charlton Hinman (The Norton Facsimile, 1968)</bibl>
    </sourceDesc>
  </fileDesc>
</teiHeader>
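Once a header like this one is stored as well-formed XML, its fields can be read by a program. The following is a minimal sketch, assuming Python's standard-library `xml.etree.ElementTree` parser and a shortened copy of the header with the closing `</teiHeader>` tag in place and no TEI namespace; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the slide's header; the TEI namespace is left out for brevity.
HEADER = """
<teiHeader>
  <fileDesc>
    <titleStmt>
      <title>Shakespeare: the first folio (1623) in electronic form</title>
    </titleStmt>
    <publicationStmt>
      <distributor>Oxford Text Archive</distributor>
      <idno type="OTA">119</idno>
      <date when="1968">1968</date>
    </publicationStmt>
  </fileDesc>
</teiHeader>
"""

header = ET.fromstring(HEADER)

# findtext() follows a path of child element names and returns the text found there.
title = header.findtext("fileDesc/titleStmt/title")
distributor = header.findtext("fileDesc/publicationStmt/distributor")
# Attribute values are read with get().
year = header.find("fileDesc/publicationStmt/date").get("when")

print(title)
print(distributor, year)
```

The same path-style lookups work for any other header field, which is what makes consistently encoded metadata easy to reuse.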
14
Structure
<div> VISION & MISSION </div>
<p> Vision
A premier knowledge hub promoting the University’s vision as a leading global university centred in Asia </s>
NUS Libraries aspires to be the NUS communities’ first stop when seeking information
NUS Libraries provides for the information needs of diverse communities in NUS in support of learning and research
NUS Libraries equips the NUS communities with information seeking skills to enhance learning, research and scholarly communication </p>
Mission
To actively engage and partner the NUS community in advancing scholarship and research through innovative library services </p>
</div>
15
Content
<div>
<p>Recently, I visited the <org>Design Centric Programme</org> (DCP) in Engineering. One of the student projects I saw was called Snowstorm. As you can see, it is a personal flying machine designed for indoor spaces. Snowstorm received extensive media attention. Our student team, led by <persName>Shawn Sim</persName>, and their faculty mentors, were invited to the <event>Founders Forum</event> in the <country>UK</country>, where they had the chance to explain their flying machine to <persName>Prince William</persName>.</p>
</div>
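Content markup like this pays off when the tagged names are pulled back out of the text. The sketch below assumes Python's standard-library `xml.etree.ElementTree` and an abridged copy of the paragraph; the list names `people` and `countries` are illustrative.

```python
import xml.etree.ElementTree as ET

# Abridged version of the encoded paragraph from the slide.
PARAGRAPH = """
<div>
  <p>Our student team, led by <persName>Shawn Sim</persName>, were invited
  to the <event>Founders Forum</event> in the <country>UK</country>, where
  they explained their flying machine to <persName>Prince William</persName>.</p>
</div>
"""

root = ET.fromstring(PARAGRAPH)

# iter(tag) walks the whole tree in document order and yields matching elements.
people = [el.text for el in root.iter("persName")]
countries = [el.text for el in root.iter("country")]

print(people)
print(countries)
```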
16
Well-formed Valid
17
Attack plan Hands-on Overview XML TEI Content Expertise
18
eXtensible Markup Language
19
Rule 1 (optional): Declaration
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
20
Rule 2: End-tags
<name>...</name> <date>...</date> <place>...</place>
Empty elements close themselves: <lb/> <pb/>
21
Rule 3: Root element
<?xml version="1.0"?>
<TEI>
  <div> <p> <s>.....<name>...</name>...</s> </p> </div>
  <div>.....<date>...</date>...</div>
  <div>.....<place>...</place>...</div>
</TEI>
22
Rule 4: Nesting
Correct: <p>...<name>...</name>...</p>
Wrong: <p>...<name>...</p>...</name>
23
Rule 5: XML names
XML is case-sensitive.
Element names must start with a letter or the underscore “_”.
Names may contain only alphanumeric characters (letters and digits) and “_”, “-”, “.”.
The colon “:” is reserved for namespaces.
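The naming rule can be tried out with a short check. This is a minimal Python sketch of the simplified rule as stated on the slide, restricted to ASCII; the full XML specification also allows many non-ASCII letters, and the function name `is_simple_xml_name` is made up for this example.

```python
import re

# Simplified rule from the slide: the first character is a letter or "_";
# the rest are letters, digits, "_", "-" or ".". The colon is excluded
# because it is reserved for namespaces.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")


def is_simple_xml_name(name: str) -> bool:
    """Return True if name satisfies the simplified naming rule."""
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ["persName", "_note", "1stName", "my name", "tei:div"]:
    print(candidate, is_simple_xml_name(candidate))
```

Because XML is case-sensitive, `persName` and `persname` both pass this check but are still two different element names.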
24
Practice
1. Open Notepad
2. Save file as “test.xml”
3. Open Google Chrome
4. Drag “test.xml” into Google Chrome
25
Practice
5. Input the XML declaration: <?xml version="1.0"?>
6. On the next line, write a sentence: My name is _______ (your name), and I was born on _______ (a date) at _______ (a place).
7. Add the <TEI> tag
8. Add the <p> tag
9. Add the <name>, <date>, <place> tags
10. Case-sensitivity test
11. Wrong nesting test
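The case-sensitivity and nesting tests from the practice can also be run without a browser. The sketch below assumes Python's standard-library `xml.etree.ElementTree`; the helper name `is_well_formed` and the placeholder sentence are invented for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts xml_text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


good = "<TEI><p>My name is <name>Ann</name>, born at <place>Singapore</place>.</p></TEI>"
wrong_case = "<TEI><p>My name is <name>Ann</Name>.</p></TEI>"    # step 10
wrong_nesting = "<TEI><p>My name is <name>Ann</p></name></TEI>"  # step 11

print(is_well_formed(good))
print(is_well_formed(wrong_case))
print(is_well_formed(wrong_nesting))
```

Only the first string is accepted: `</Name>` does not match `<name>`, and in the last string `<p>` is closed before the `<name>` inside it.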
26
Well-formed
27
Have a break, have a Kit Kat
28
Attack plan Hands-on Overview XML TEI Content Expertise TEI
29
Text Encoding Initiative
30
Metadata Content Textual Structure and Layout
<teiHeader>Metadata</teiHeader>
31
Metadata: Header (the Shakespeare first folio <teiHeader> example shown on slide 13)
32
Four main components of the header: <fileDesc> <encodingDesc> <profileDesc> <revisionDesc> (only <fileDesc> is mandatory)
33
Using the TEI website: http://www.tei-c.org/Guidelines/P5/
34
oxygenxml: “Request a Trial License” OR ://mirror.oxygenxml.com/InstData/Editor/Windows64/VM/oxygen-64bit.exe#get_trial
35
New Document > Choose a file template > TEI P5 > All
Open “One Hundred Years’ History of the Chinese in Singapore: An annotated edition”
36
<teiHeader>
  <fileDesc>
    <titleStmt>
      <title>Title</title>
    </titleStmt>
    <publicationStmt>
      <p>Publication Information</p>
    </publicationStmt>
    <sourceDesc>
      <p>Information about the source</p>
    </sourceDesc>
  </fileDesc>
</teiHeader>
Add: Author, Publisher
Check: Well-formedness, Validity
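The "Add: Author, Publisher" exercise is done by hand in the editor, but the same change can be scripted. This is a hedged sketch using Python's standard-library `xml.etree.ElementTree`; the author and publisher values are placeholders, and the TEI namespace is omitted. It assumes, as the TEI content model for `<publicationStmt>` is generally described, that a structured `<publisher>` replaces the prose `<p>` rather than sitting next to it.

```python
import xml.etree.ElementTree as ET

TEMPLATE = """
<teiHeader>
  <fileDesc>
    <titleStmt>
      <title>Title</title>
    </titleStmt>
    <publicationStmt>
      <p>Publication Information</p>
    </publicationStmt>
    <sourceDesc>
      <p>Information about the source</p>
    </sourceDesc>
  </fileDesc>
</teiHeader>
"""

header = ET.fromstring(TEMPLATE)

# <author> goes inside <titleStmt>, after <title>.
title_stmt = header.find("fileDesc/titleStmt")
author = ET.SubElement(title_stmt, "author")
author.text = "Author Name"  # placeholder value

# Swap the prose <p> in <publicationStmt> for a structured <publisher>.
pub_stmt = header.find("fileDesc/publicationStmt")
for p in pub_stmt.findall("p"):
    pub_stmt.remove(p)
publisher = ET.SubElement(pub_stmt, "publisher")
publisher.text = "Publisher Name"  # placeholder value

print(ET.tostring(header, encoding="unicode"))
```

The result is well-formed by construction; whether it is valid still has to be checked against the TEI schema in the editor.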
37
Valid
38
Metadata Content <div><p>Textual Structure and Layout</p></div>
39
Structural markups
<div> VISION & MISSION </div>
<p>
A premier knowledge hub promoting the University’s vision as a leading global university centred in Asia </s>
NUS Libraries aspires to be the NUS communities’ first stop when seeking information
NUS Libraries provides for the information needs of diverse communities in NUS in support of learning and research
NUS Libraries equips the NUS communities with information seeking skills to enhance learning, research and scholarly communication </p>
Mission
To actively engage and partner the NUS community in advancing scholarship and research through innovative library services </p>
</div>
40
Working in Oxygen
<text>
  <body>
    <p>Some text here.</p>
  </body>
</text>
Add: Front, Back, Page breaks (pb), Line breaks (lb), Quotations (quote) p50, Footnotes (note)
41
Quick structural marking using Excel and Word
Manually indicate paragraphs (<p>) and sentences (<s>), using Excel IF and Word Replace
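The workshop does this step with Excel IF and Word Replace. As an alternative for readers who prefer a script, the Python sketch below (a different technique, not the one shown in the workshop) wraps blank-line-separated paragraphs in `<p>` and naively split sentences in `<s>`. The function name `mark_structure` and the sample text are invented, and the sentence splitter will mis-split abbreviations such as "Mr.", so the output still needs a manual check.

```python
import re
from xml.sax.saxutils import escape


def mark_structure(raw_text: str) -> str:
    """Wrap blank-line-separated paragraphs in <p> and sentences in <s>."""
    paragraphs = []
    for block in re.split(r"\n\s*\n", raw_text.strip()):
        # Collapse line breaks inside the paragraph, then split naively
        # after ".", "!" or "?" when followed by whitespace.
        flat = " ".join(block.split())
        sentences = re.split(r"(?<=[.!?])\s+", flat)
        # escape() turns characters such as "&" into XML entities.
        marked = "".join(f"<s>{escape(s)}</s>" for s in sentences if s)
        paragraphs.append(f"<p>{marked}</p>")
    return "\n".join(paragraphs)


sample = "First sentence. Second sentence!\n\nSalt & pepper."
print(mark_structure(sample))
```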
42
Metadata <name>Content</name> Textual Structure and Layout
43
Metadata Content Textual Structure and Layout
Content: Person, Place, Date
44
Persons and names: <persName>Raffles</persName>
“Major Farquhar” or “Farquhar”? “encodingDesc”, xml:id, listPerson
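One way to handle "Major Farquhar" versus "Farquhar" is to give each person a single entry with an `xml:id` in a `<listPerson>` and point every mention at it. The sketch below assumes Python's standard-library `xml.etree.ElementTree`, a `ref` attribute on each `<persName>`, and a toy document in which `<listPerson>` sits directly under `<TEI>` for brevity; the ids and the sample sentence are hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree stores xml:id under the expanded XML namespace name.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

DOC = """
<TEI>
  <listPerson>
    <person xml:id="farquhar"><persName>William Farquhar</persName></person>
    <person xml:id="raffles"><persName>Stamford Raffles</persName></person>
  </listPerson>
  <p><persName ref="#farquhar">Major Farquhar</persName> and
     <persName ref="#raffles">Raffles</persName> are mentioned here, and
     <persName ref="#farquhar">Farquhar</persName> again.</p>
</TEI>
"""

root = ET.fromstring(DOC)

# Map each xml:id to the canonical name recorded in <listPerson>.
canonical = {
    person.get(XML_ID): person.findtext("persName")
    for person in root.iter("person")
}

# Resolve every mention in the running text through its ref attribute.
resolved = []
for mention in root.find("p").iter("persName"):
    person_id = mention.get("ref").lstrip("#")
    resolved.append((mention.text, canonical[person_id]))

for surface, name in resolved:
    print(f"{surface!r} -> {name}")
```

Both spellings of Farquhar resolve to the same entry, so later counts and visualizations treat them as one person.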
45
Collaborative Work hseb4AOW4jwjsk3R4jQ/edit?usp=sharing
Share work using “03” file (listPerson)
46
Place
<country>Singapore</country>
<region>Malacca</region>
<district>…</district>
<settlement>…</settlement>
47
Date
<date>1999</date>
<date when="1999">1999</date>
<date when-iso=" ">11th June 1819</date>
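The point of a normalised attribute is that a program can use the date without interpreting the wording in the text. The sketch below assumes Python's standard-library `xml.etree.ElementTree` and uses the `when` attribute throughout; the value `1819-06-11` is this example's own normalisation of "11th June 1819", and the surrounding sentence is invented.

```python
import xml.etree.ElementTree as ET

TEXT = """
<p>One event is dated <date when="1999">1999</date>, another
<date when="1819-06-11">11th June 1819</date>, and a third only
<date>long ago</date>.</p>
"""

root = ET.fromstring(TEXT)

# Pair each normalised value with the wording that appears in the text.
normalised = [(d.get("when"), d.text) for d in root.iter("date") if d.get("when")]
# Dates without a "when" attribute cannot be processed automatically.
missing = [d.text for d in root.iter("date") if d.get("when") is None]

# For four-digit years, ISO-style values sort chronologically as plain strings.
normalised.sort()

print(normalised)
print(missing)
```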
48
Checking for errors
Copy from Google Docs into the OxygenXML Editor
Do validation
Make corrections
See “04” file for completed markups
49
Basic visualization: see “05” file
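The contents of the “05” file are not reproduced in this transcript. As a rough idea of what a basic visualization of encoded text can look like, the Python sketch below counts tagged names and prints a text bar chart. It assumes the standard-library `xml.etree.ElementTree` and `collections.Counter`; the sample sentence is invented and is not taken from the workshop files.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sample, tagged with elements used earlier in the workshop.
SAMPLE = """
<p><persName>Raffles</persName> and <persName>Farquhar</persName> are linked to
<country>Singapore</country>. <persName>Raffles</persName> is also linked to
<country>Singapore</country> and <region>Malacca</region>.</p>
"""

TAGS = {"persName", "country", "region"}

root = ET.fromstring(SAMPLE)

# Count each (element name, text) pair that carries one of the chosen tags.
counts = Counter((el.tag, el.text) for el in root.iter() if el.tag in TAGS)

# One "#" per occurrence gives a simple bar chart in the terminal.
for (tag, text), n in counts.most_common():
    print(f"{tag:<9} {text:<10} {'#' * n}")
```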
50
Questions?
51
BRAINSTORMING
How can TEI/XML be applied to your field? How would you execute it?
52
D’Linkup mission
53
D’Linkup Pte. Ltd. Data, for everyone William Chong
54
Thank you!
55
YOUR FEEDBACK IS GREATLY APPRECIATED
j.mp/RU_Feedback
57
PARSING
Parser step 1: is the XML well-formed or not well-formed?
Parser step 2 (against a DTD/Schema, e.g. TEI): is the well-formed XML valid or not valid?
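The two parser steps can be imitated in a few lines. In this Python sketch, step 1 uses the standard-library `xml.etree.ElementTree` parser, and step 2 uses a small hand-written table of allowed child elements as a stand-in for a real DTD or schema. The table `ALLOWED_CHILDREN` and the function `check` are hypothetical and far simpler than the TEI schema, which a tool such as the Oxygen editor applies for real validation.

```python
import xml.etree.ElementTree as ET

# Toy content model standing in for a DTD or schema: element -> allowed children.
ALLOWED_CHILDREN = {
    "TEI": {"teiHeader", "text"},
    "teiHeader": {"fileDesc"},
    "text": {"body"},
    "body": {"p"},
}


def check(xml_text: str) -> str:
    """Classify xml_text using the two parser steps."""
    # Step 1: well-formedness.
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return "not well-formed"

    # Step 2: validity against the toy model.
    if root.tag != "TEI":
        return "well-formed, not valid"
    for parent in root.iter():
        allowed = ALLOWED_CHILDREN.get(parent.tag)
        if allowed is None:
            continue  # elements the toy model does not describe are not checked
        for child in parent:
            if child.tag not in allowed:
                return "well-formed, not valid"
    return "well-formed, valid"


print(check("<TEI><text><body><p>Hi</p></body></text></TEI>"))
print(check("<TEI><text><p>Hi</p></text></TEI>"))
print(check("<TEI><text></TEI>"))
```

A document can only be tested for validity once it has passed the well-formedness step, which is why the order of the two steps matters.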
58
ENCODING: <XML Document>, with a Document Model (DTD, Relax NG, XML Schema)
TRANSFORM: CSS, XSLT, XQuery, XSL-FO, XPath
OUTPUT: PDF, JScript, HTML, ePub, any XML
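The slide names XSLT and related tools for the transform step. As a stand-in that needs nothing beyond Python's standard library (this is not XSLT), the sketch below turns a TEI paragraph into HTML by renaming every element other than `<p>` to a `<span>` whose `class` records the original TEI name. The function name `tei_to_html` and the sample sentence are invented.

```python
import xml.etree.ElementTree as ET


def tei_to_html(tei_text: str) -> str:
    """Rename TEI phrase-level elements to HTML <span class="..."> elements."""
    root = ET.fromstring(tei_text)
    for el in root.iter():
        if el.tag == "p":
            continue  # <p> exists in HTML too, so it is kept as-is
        # Keep the TEI element name as a CSS class so it can be styled.
        el.set("class", el.tag)
        el.tag = "span"
    return ET.tostring(root, encoding="unicode")


html = tei_to_html("<p><persName>Raffles</persName> is mentioned.</p>")
print(html)
```

A CSS rule for `.persName` could then highlight every person in the browser, which is one simple route from encoding to output.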