THE FRONT MATTERS: Capturing Journal Front Matter Content with JATS


1 THE FRONT MATTERS: Capturing Journal Front Matter Content with JATS

2

3 Obvious

4 This… not as much

5 Team Introduction
- Rachael Carter: a journal manager at PMC at the National Center for Biotechnology Information at the US National Library of Medicine. Rachael graduated in 2010 from the University of Maryland with a Master's in Library Science.
- Kathryn Funk: a technical editor for NIHMS and PubMed Health at the National Center for Biotechnology Information at the US National Library of Medicine. Kathryn graduated from The Catholic University of America with a Master's in Library and Information Science.
- Rebecca Mooney: formerly a journal manager at PMC at the National Center for Biotechnology Information at the US National Library of Medicine; recently moved to a new position as a Project Analyst in the IT Department of the American Association for the Advancement of Science (AAAS). Rebecca graduated in 2008 from the University of Maryland with a Master's in Library Science.

6 The Big Picture

7 NLM Initiative
PMC as an archive has a responsibility to answer:
- What should we preserve?
- How should we preserve it?
- Why preserve?

8 PMC Submission Method A

9 PMC structure
- Currently, PMC strives to archive data at the article level, but sees the potential benefit in preserving information about the journal in which the articles were published. Who was Editor in Chief at the time of publication? What was the journal’s philosophy at the time?
- TOCs: PMC creates its own table of contents, organized by article type. It is still very article-based, not at the issue level.

10 Front Matter “capturing” in PMC as it currently exists – through banner journal-links only

11 Scope of Front Matter within project
What PMC Front Matter IS:
- Editorial board
- Journal philosophy
- Submission guidelines
- Subscription information
- Covers
- Journal contact information
- Publisher information
What PMC Front Matter is NOT:
- Tables of contents
- Advertisements
- Forewords
- Prefaces

12 Front matter DTD development timeline
Years marked on the timeline: 2001, 2011, 2012
- NLM DTD developed
- issue-admin.dtd was made available
- pmc-journalmatter.dtd developed
- Atypon Issue XML presented at JATS-Con

13 Value of capturing front matter as XML
XML to the rescue:
- The content is queryable and reusable
- Updating just requires editing a file
- Allows for data manipulation over various platforms/formats
Limitations of PDF:
- Assumes there is an issue to scan
- Difficult to update content
- Limited to certain platforms and technologies

14 Why we chose to create an extension to JATS
- Mostly because we already use JATS
- It’s flexible
- We already had a meaningful framework to capture journal article content
- Works well within the structure of PMC
- Consistency

15 Limitations of JATS
Why JATS isn’t enough to capture front matter:
- No meaningful way to capture front matter elements such as editorial boards
- No way to tag journal metadata at a level higher than article-meta

16 Goals
- To capture front matter in the environment in which it was published
- To work as much as possible with the existing JATS framework
- To create a DTD that would allow for flexibility in both use and rendering

17 Testing (phases 1, 2, 3)
- Looking at samples
- Defined content types
- Created new elements
- Completed first iteration of the pmc-journalmatter.dtd
- Tagged samples of front matter using our DTD and made adjustments
- User testing: PMC journal managers
- Adjustments made to final DTD based on user feedback

18 Highlighted physical example of a journal’s front matter

19 Initial Classification
Anything in RED is required.
[Content-model diagram: each element "contains, in order" a sequence of child elements marked with *, ?, or OR; one element "requires one or more" children. The element names did not survive in this transcript.]

20 Created new elements

21 Tagged samples of front matter using our DTD and made adjustments

22 User testing: PMC journal managers

23 DTD technical details

24 Root element: journalmatter

25 Challenges
- How can we build a foundation for organizing and labeling the front matter content?
- Can we tag all of this content in one document?

26 Root element attribute: @journalmatter-type

27 Issue vs. Standing: The Benefits
- Prevents a hybrid of issue and non-issue content in the same document
- Changes in content can be more easily updated
- Allows a single journal to have both issue and standing documents
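The issue/standing split can be illustrated with a minimal Python sketch. It assumes only what the slides name: a `journalmatter` root carrying `@journalmatter-type` and `@content-type`. The helper `make_root` is hypothetical, and nothing here is checked against the actual pmc-journalmatter.dtd.

```python
import xml.etree.ElementTree as ET

# The two document types named on the slides for @journalmatter-type.
JOURNALMATTER_TYPES = {"standing", "issue"}


def make_root(journalmatter_type: str, content_type: str) -> ET.Element:
    """Build a bare <journalmatter> root (hypothetical helper).

    Raises ValueError for any document type other than the two above,
    so a hybrid of issue and standing content cannot be created.
    """
    if journalmatter_type not in JOURNALMATTER_TYPES:
        raise ValueError(f"unknown journalmatter-type: {journalmatter_type!r}")
    return ET.Element(
        "journalmatter",
        {"journalmatter-type": journalmatter_type, "content-type": content_type},
    )


# One journal, two separate documents: a per-issue cover and a standing page.
cover = make_root("issue", "cover")
authors = make_root("standing", "info-for-authors")
print(ET.tostring(cover, encoding="unicode"))
print(ET.tostring(authors, encoding="unicode"))
```

Keeping the two as separate documents means the standing page can be edited without touching any per-issue file.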

28 Example: Standing & Issue
- standing: Information for Authors
- issue: Cover

29 Root element: @content-type
- @content-type
- Separate documents
- Flexibility
  - In tagging and rendering
  - Update as need be
  - EX: Journal philosophy vs. ed board

30 Individual documents for each @content-type
@content-type values: edboard, cover, general-info, publisher, info-for-authors, other

31 @content-type values
- Cover ("cover"): can include cover image, caption, and cover image copyright information.
- Editorial Board ("edboard"): can include executive editors, associate editors, etc. as well as general editorial board members.
- General Journal Information ("general-info"): can include but is not limited to journal mission statement, scope, journal contact information, subscription information, copyright, and other journal-specific content.
- Publisher Information ("publisher"): can include publisher philosophy, other journals published, contact information, etc.
- Information for Authors ("info-for-authors"): can include article submission and formatting instructions.
- Other ("other"): if the document is not one of the listed types or the type of document cannot be determined, the "other" attribute value may be used.
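As a rough illustration of how these values might be handled in a tagging or rendering tool, the Python sketch below maps each `@content-type` value to its display label and falls back to "other" for anything unlisted. The names `CONTENT_TYPE_LABELS` and `normalize_content_type` are hypothetical; only the six values and their labels come from the slide.

```python
# @content-type values and display labels, as listed on the slide.
CONTENT_TYPE_LABELS = {
    "cover": "Cover",
    "edboard": "Editorial Board",
    "general-info": "General Journal Information",
    "publisher": "Publisher Information",
    "info-for-authors": "Information for Authors",
    "other": "Other",
}


def normalize_content_type(candidate: str) -> str:
    """Return the candidate if it is a listed value, otherwise "other".

    Hypothetical helper: mirrors the rule that "other" may be used when
    the document is not one of the listed types.
    """
    return candidate if candidate in CONTENT_TYPE_LABELS else "other"


for value in ("edboard", "advertisement"):
    normalized = normalize_content_type(value)
    print(f"{value} -> {normalized} ({CONTENT_TYPE_LABELS[normalized]})")
```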

32 The 4 Main elements of a document

33 <journal-meta>

34

35 <issue-meta>

36 Example: compare art-meta with issue-meta

37 <document-meta>
<!ENTITY % document-meta-model "((document-title, document-subtitle?)?, contrib-group?, pub-date*, (((fpage, lpage?, page-range?) | elocation-id)?), self-uri*, permissions?)">
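The ordering this content model imposes can be sketched in Python by translating it into a regular expression over child element names. This is an illustration only, not a substitute for a validating parser: `xml.etree.ElementTree` does not validate against DTDs, and the helper `follows_model` and the sample documents are made up for the example.

```python
import re
import xml.etree.ElementTree as ET

# The document-meta content model, one regex fragment per model particle.
# Each child name is followed by a single space when the names are joined.
DOCUMENT_META_MODEL = re.compile(
    r"(?:document-title (?:document-subtitle )?)?"           # (title, subtitle?)?
    r"(?:contrib-group )?"                                   # contrib-group?
    r"(?:pub-date )*"                                        # pub-date*
    r"(?:fpage (?:lpage )?(?:page-range )?|elocation-id )?"  # pages | elocation-id
    r"(?:self-uri )*"                                        # self-uri*
    r"(?:permissions )?"                                     # permissions?
)


def follows_model(document_meta: ET.Element) -> bool:
    """Check that the children appear in the order the model allows."""
    names = "".join(f"{child.tag} " for child in document_meta)
    return DOCUMENT_META_MODEL.fullmatch(names) is not None


# Hypothetical sample content, for illustration only.
good = ET.fromstring(
    "<document-meta><document-title>Editorial Board</document-title>"
    "<pub-date/><fpage>1</fpage><lpage>2</lpage></document-meta>"
)
bad = ET.fromstring(
    "<document-meta><fpage>1</fpage><elocation-id>e1</elocation-id></document-meta>"
)
print(follows_model(good), follows_model(bad))
```

The second sample fails because the model offers page numbers and `elocation-id` as alternatives, not as a pair.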

38

39 <body>
Borrowed directly from JATS (with a few additions)

40 Addition:

41 Person-list vs. Person-group

42 @person-list-type
- advisory-board: A board appointed to advise the editorial board
- editor: Content editors
- editorial-board: A group of editors on a publication
- guest-editor: Content editors that have been invited to edit all or part of a work
- reviewer: Content reviewer
- transed: Editors of a translated version of a work

43 @sec-type
- Not required; a suggested list is provided
- Not a controlled attribute
- Only used when content-type="general-info"
- Intent was to give meaning for searching and grouping purposes
- Used similarly to JATS’ @sec-type

44 List of @sec-types
@sec-type is not a required or controlled attribute. However, when "general-info" is the @content-type of the document, the following is a suggested list of types:
- association*
- copyright
- journal-contact
- journal-philosophy
- subscription-info
* This refers to associations that may be affiliated with a journal but do not necessarily publish it.
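Since the stated intent of @sec-type is searching and grouping, a short Python sketch of that use may help. It assumes the body uses JATS-style `sec` and `title` elements (the slides say the body is borrowed from JATS). The sample document, its section titles, and the helper `titles_by_sec_type` are invented for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Made-up "general-info" document; only the attribute names and the
# sec-type values come from the slides.
SAMPLE = """\
<journalmatter journalmatter-type="standing" content-type="general-info">
  <body>
    <sec sec-type="journal-philosophy"><title>Aims and Scope</title></sec>
    <sec sec-type="subscription-info"><title>Subscriptions</title></sec>
    <sec sec-type="journal-contact"><title>Contact</title></sec>
    <sec><title>Untyped section</title></sec>
  </body>
</journalmatter>
"""


def titles_by_sec_type(xml_text: str) -> dict:
    """Group section titles by @sec-type; untyped sections share one key."""
    root = ET.fromstring(xml_text)
    grouped = defaultdict(list)
    for sec in root.iter("sec"):
        # @sec-type is optional, so fall back to a placeholder key.
        key = sec.get("sec-type", "(untyped)")
        grouped[key].append(sec.findtext("title", default=""))
    return dict(grouped)


for sec_type, titles in titles_by_sec_type(SAMPLE).items():
    print(sec_type, titles)
```

Because the attribute is not controlled, a real tool would likely need to tolerate values outside the suggested list, which is why the sketch groups by whatever value it finds.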

45 DTD Documentation
http://dtd.nlm.nih.gov/ncbi/pmc/journalmatter/

46 So how’s it all going to look?

47 Limitations
- Still relatively untested
  - No rendering
  - No actual use
- Lack of an existing model
- Based on perceived needs of PMC as an archive; uses beyond that are unanticipated
- Different naming conventions and structures of published journal front matter

48 Looking Forward
- Trying to start a conversation
- Looking for ways to best capture front matter to suit needs both inside PMC and in the broader JATS community
- Determining whether the content types will be applicable for future applications
- Initiating usage of the DTD and seeing what happens

49 Acknowledgements
- Breena Krick
- Jeff Beck
- Audrey Hamelers
- Christopher Maloney
- PMC Journal Managers

50 References
- Andrew N. The Oxford Journals Online Archives: The Purpose and Practicalities of a Major Digitization Program. Serials Review. (2006. June). 32(12), 78-80.
- Holdsworth David. Preservation Strategies for Digital Libraries. Glasgow, UK: HATII, University of Glasgow; DCC Digital Curation Manual. (2007. November). Retrieved from: http://www.dcc.ac.uk/resource/curation-manual/chapters/preservation-strategies-digital-libraries
- Marcum D. Scholars as Partners in Digital Preservation. CLIR Issues. (2001. March/April) 20. Retrieved from: http://www.clir.org/pubs/issues/issues20.html
- Markantonatos N. Article vs Issue XML: Capturing the Table of Contents under the NLM DTD. Bethesda, MD: National Center for Biotechnology Information; Journal Article Tag Suite Conference (JATS-Con) Proceedings 2011. (2011). Retrieved from: http://www.ncbi.nlm.nih.gov/books/NBK57236/
- Wheeler B. Journal Identity in the Digital Age. Journal of Scholarly Publishing. (2010) 42(1), 45-88.
- NLM Journal Archiving and Interchange Tag Suite. Retrieved from: http://dtd.nlm.nih.gov/
- PMC Journal Matter DTD Documentation. Retrieved from: http://dtd.nlm.nih.gov/ncbi/pmc/journalmatter/
- BMC Cancer. Retrieved from: http://www.biomedcentral.com/bmccancer/
- Frontiers in Cancer Genetics. Retrieved from: http://www.frontiersin.org/cancer_genetics

51 Contact us
pmc@ncbi.nlm.nih.gov

52 Questions?

53 Multiple documents: dependent on information being captured
- 1 XML document: content-type="standing" OR "issue"
- 2 documents: 1 content-type="standing", 1 content-type="issue"
Content types available in each document (columns on the slide: Cover, Editorial Board, General Journal Information, Publisher Information, Information for Authors):
- "standing": "edboard", "general-info", "publisher", "info-for-authors"
- "issue": "cover", "edboard", "general-info", "publisher", "info-for-authors"

