Let me tell you about my grandpa: A content analysis of user annotations to online archival collections
Jessica Sedgwick, Archivist for Women in Medicine, Harvard Medical School, 13 August 2009
This presentation outlines the results and implications of my master's research, which, as the long subtitle indicates, was a content analysis of user annotations to online archival collections. First, I'm going to give you some brief background on the project.
http://beyondbrownpaper.plymouth.edu/item/23506
Image from Beyond Brown Paper, courtesy Plymouth State University
Archivists are very busy!
As we know, there are many things that limit our archival description, including:
- constraints on time and resources
- our own limited knowledge and subjectivity
- descriptive methods that address our often competing desire to make more materials available, faster (for example, "processing lite" and mass digitization)
…and other limits on archival description
Meanwhile… archivists are experimenting with "2.0" approaches.
At the same time, many archives are experimenting with new approaches and technologies, such as user commenting and tagging. In this effort, we are seeking to:
- meet evolving user needs and expectations by allowing users to actively engage with content
- attract new users
- create a more open, dynamic descriptive system
- and perhaps even allow users to contribute to our work of preserving and providing access to our shared cultural heritage
Concerns about user engagement:
- What is the value?
- Cost of implementation and maintenance
- Loosening our grip on authority control
However, there are many concerns about implementing these new approaches and technologies. For example: What value is added, for users and archivists alike, through user contributions? Is this added value worth the time and resources spent on implementation and maintenance? Are we loosening our grip on descriptive authority by allowing users to contribute, and if so, is that something we are willing to do?
Image from Library of Congress, "1930s-40s in Color" set on Flickr
Study design
- Content analysis
- Collected publicly contributed user comments from online archival collections
- Analyzed comments against a set of categories
- Counted each comment toward as many categories as it represented
This study sought to inform these concerns by asking the question: What are the characteristics of user engagement in online archival collections? My hope is that, by understanding how users actually engage with the 2.0 features and tools we implement, we can better determine the value of these tools and understand how best to manage and utilize contributions from our users. So, there on the screen you see my basic study design: I collected comments that users publicly submitted to online archival collections and categorized them to see what types of things users are saying and how they are utilizing the comment space. A small sketch of this multi-label counting follows below.
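Because a single comment could be coded into several categories at once, the category totals are tallies of codes, not of comments. Here is a minimal sketch of that multi-label tally in Python; the category labels and sample comments are hypothetical stand-ins, not the study's actual codebook or data.

```python
from collections import Counter

# Hypothetical category labels standing in for the study's codebook.
CATEGORIES = {
    "subject_identification",
    "further_information",
    "link_to_resources",
    "personal_connection",
    "correction",
}

def tally(coded_comments):
    """Count each comment toward every category it was coded with.

    `coded_comments` is a list of sets, one set of category labels per
    comment. Because one comment can carry several codes, the totals
    across categories can exceed the number of comments.
    """
    counts = Counter()
    for codes in coded_comments:
        counts.update(codes & CATEGORIES)  # ignore unknown labels
    return counts

# Example: the first comment both identifies a subject and establishes
# a personal connection, so it counts once in each category.
sample = [
    {"subject_identification", "personal_connection"},
    {"further_information"},
    {"subject_identification"},
]
print(tally(sample).most_common())
# [('subject_identification', 2), ('personal_connection', 1),
#  ('further_information', 1)]  (order of the ties may vary)
```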
Online collections examined
- Keweenaw Digital Archives, from Michigan Tech: a collection of photographs documenting Michigan's historic copper mining district
- Beyond Brown Paper, from Plymouth State University: a collection of photographs documenting the history of the Brown Paper Company of Berlin, New Hampshire
- Polar Bear Expedition Digital Collections, from the Bentley Historical Library at the University of Michigan: materials of various formats on the World War I American intervention in northern Russia
Each of these projects was launched in 2006, and I collected data from the sites in early 2008. I examined these three particular sites because, at that time, they were the only online archival collections I could find that allowed user comments.
Data set
Total comments collected: 568
Breakdown for each site:
As many comments as possible were collected from each of the three sites, for a total of 568. Thirty-four of these comments were in French and were not coded in my analysis. Here is the breakdown of how each site was represented in the total data set; note that Beyond Brown Paper accounts for nearly half the total number of comments.
Codebook This is the codebook I developed, which describes the categories used for coding the user comments.
Results overall
This table shows the results of my analysis of the total set of comments collected from the three online collections. I should note that many individual comments fit into more than one category, so each comment was counted toward as many categories as it represented. You'll note that the top categories were subject identification, providing further information, linking to additional resources, establishing a personal connection to the materials, and so on. There were far fewer instances of users treating the comment space as a venue for requesting copies, offering to donate further materials, or complaining about the site. I'm going to quickly walk through examples of some of the top categories, just to give you a sense of the types of things people were saying most frequently.
#1 Subject identification
Here is an example of the most common type of user comment, subject identification. Anecdotally, I'd venture that the majority of the identifications users provided were of people, but many users also identified buildings, equipment, streets, and other subjects; in this case, a train. Since the user goes on to describe how he and his mother used to take this train, this comment would also have been coded as a personal connection.
http://beyondbrownpaper.plymouth.edu/item/351
Image from Beyond Brown Paper, courtesy Plymouth State University
#2 Providing further information
Here is an example of a comment that provides further information about an item. (talk about example)
http://beyondbrownpaper.plymouth.edu/item/209
Image courtesy Michigan Tech Archives
#3 Linking to further resources
This is an example of a user comment that links or provides a reference to further resources; here, the user is linking to a site with additional images of this monastery.
http://polarbears.si.umich.edu/index.pl?node_id=15472&lastnode_id=18065
Image courtesy Bentley Historical Library, University of Michigan
#4 Establishing personal connection
Here's an example of a user establishing a personal connection. As you might imagine, much of the user community engaging with these collections had personal or family connections to the communities documented by the materials, hence the title of the study, "Let me tell you about my grandpa." In fact, among the 88 personal connections in the data set, the term "grandfather" occurs 37 times (42%).
Image from Beyond Brown Paper, courtesy Plymouth State University
#6 Correction
I also wanted to include an example of a correction, because this was my favorite user comment of them all. You'll see that, while the figure in the image is identified as Donald Duck in the metadata, this anonymous contributor says, "I believe this is Scrooge McDuck, the rich uncle of Donald. The monocle and money bags are dead giveaways."
Image courtesy Michigan Tech Archives
Results compared across sites
Here is a comparison of the results across the three sites. This chart is based on percentages calculated for each site, rather than simply the number of instances counted, since an uneven number of comments was collected from the three sites. I also included only categories that accounted for more than 5% of the comments for at least one of the three sites. (A sketch of this normalization follows below.)

These results are interesting because they tell us something about how users react to projects that are set up somewhat differently. For example, the Keweenaw site had by far the most archivist-created description of the three, providing metadata in 19 distinct fields for every object. Yet this site also had the highest number of corrections among the three sites, suggesting that perhaps the more metadata is provided, the more opportunity there is for users to correct that information. Conversely, Beyond Brown Paper, which provided the sparsest metadata, had the lowest number of corrections.

Another interesting finding was that the Polar Bear Expedition site had by far the most questions and answers posted by users. This may be because the Polar Bear site has a strong "archivist" presence: it encourages users to interact with the archivist through the commenting function and has a user named "The Archivist" who responds to queries and requests, which accounts for its high number of comments in the "answer" category. The other sites, which lack the same strong "archivist" presence, did not attract nearly as many questions from users.

This finding is especially interesting because one concern I came across in my research was that, if we allow users to comment on digital collections, commenting would become yet another reference venue to be monitored by public services staff, who are already stretched thin. These results suggest, however, that it may be possible to shape the way your users engage based on the type of environment you create.
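As a concrete illustration of that normalization, here is a minimal sketch in Python. The counts and per-site totals below are invented for the example; they are not the study's actual figures.

```python
# Invented counts for illustration; not the study's actual data.
site_counts = {
    "Keweenaw":           {"correction": 12, "question": 4,  "identification": 60},
    "Beyond Brown Paper": {"correction": 3,  "question": 6,  "identification": 140},
    "Polar Bear":         {"correction": 7,  "question": 30, "identification": 45},
}
site_totals = {"Keweenaw": 150, "Beyond Brown Paper": 280, "Polar Bear": 138}

# Convert raw counts to percentages of each site's own comment total,
# so that sites with uneven sample sizes can be compared fairly.
percentages = {
    site: {cat: 100 * n / site_totals[site] for cat, n in cats.items()}
    for site, cats in site_counts.items()
}

# Keep only categories that exceed 5% of comments on at least one site,
# mirroring the filter used for the chart.
kept = {
    cat
    for cats in percentages.values()
    for cat, pct in cats.items()
    if pct > 5
}
print(sorted(kept))  # ['correction', 'identification', 'question']
```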
Implications
Users are willing to contribute; how willing are archivists to let them?
Considerations:
- Encouraging and managing comments
- Maintaining clear spaces of authority
- Making user comments searchable
- Incorporating user comments into archival description
It appears that our users are ready and willing to contribute. Now we are left to figure out how to manage and utilize their contributions. Going forward, important considerations will include how to maintain clear spaces of authority through the platforms we use and the interfaces we build. We must also consider whether we want user contributions to be searchable, which would allow them to serve as access points for our materials. And perhaps the most extreme utilization of user-created content would be to incorporate it into our own metadata.
Encouraging your user community
Since the results of this study show variance in how users contribute to sites that are set up and presented differently, we should think about how we want our users to engage, and present our tools in a way that encourages that type of engagement. Here is an example from an online collection at East Carolina University. They offer a fairly standard-looking comment box, but specifically suggest that users comment or tag when they know something about the item, and direct them elsewhere to add general questions or comments.
Image and screenshot from Joyner Library Digital Collections, East Carolina University
Authority control This set of comments illustrates the concern over authority control, and why it would be desirable to maintain clearly separate spaces for user contributions as opposed to archival metadata. Here, three comments, all from anonymous users, disagree as to which church is pictured in the photograph, which could be very confusing for future viewers. Image courtesy Michigan Tech Archives
Searching across comments
The search system used in Beyond Brown Paper does not search across user comments. This means that all of the identifications, contextual information, and other contributions provided as comments cannot help other users discover and access the materials. The systems used for the Keweenaw and Polar Bear collections do search across user comments. These models seem to offer far more utility, because they allow comments to help users retrieve search results, including items with sparse or incorrect metadata. The Polar Bear Expedition site takes it a step further and returns faceted results, showing users whether the hits came from user comments or from other sections of the site. This approach allows user contributions to become part of the discovery system while clearly differentiating between user- and archivist-created description. A toy sketch of this faceted approach follows below.
Image courtesy Bentley Historical Library, University of Michigan
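To make the faceting idea concrete, here is a toy sketch of an index that records whether each hit comes from archivist metadata or a user comment. The item IDs, source labels, and sample text are hypothetical; a real site would use a full search engine rather than this simple inverted index.

```python
from collections import defaultdict

index = defaultdict(list)  # term -> list of (item_id, source)

def add_document(item_id, text, source):
    """Index every word of `text` under the given source facet."""
    for term in set(text.lower().split()):
        index[term].append((item_id, source))

def search(term):
    """Return hits grouped by facet: metadata vs. user comment."""
    facets = defaultdict(set)
    for item_id, source in index[term.lower()]:
        facets[source].add(item_id)
    return dict(facets)

# Hypothetical records: the same item is described by the archivist
# and annotated by a user, and both texts become searchable.
add_document("item-351", "locomotive at the Berlin mill", "metadata")
add_document("item-351", "my grandfather drove this locomotive", "user_comment")
print(search("locomotive"))
# {'metadata': {'item-351'}, 'user_comment': {'item-351'}}
```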
Incorporating user-contributed content
A couple of examples were found of sites incorporating user-contributed information into their own descriptive metadata, as seen here on Beyond Brown Paper. This seems to be the most involved, time-consuming, and probably most controversial way to deal with user contributions. The Polar Bear site only incorporates user contributions into its description if proper documentation is provided, such as a death certificate, discharge papers, etc. This is also a good opportunity to point out one of the limitations of the study: there was usually no way for me to tell whether these commenters were truly outside users of the archives, or whether some of these individuals might be archivists themselves, or otherwise affiliated with the repository. So, it may be the case that …. (explain example)
Image from Beyond Brown Paper, courtesy Plymouth State University
Incorporating user-contributed content Here is an example of a contribution from an anonymous user incorporated into the archival metadata on the Keweenaw Digital Archives site. Note that the contribution, which was added to the “description” field, is clearly and parenthetically attributed to an anonymous patron. Image courtesy Michigan Tech Archives
Final thoughts and questions
How is user engagement affected by:
- format of materials (textual vs. photographic)
- amount of metadata provided
Implications for finding aids?
How to verify accuracy (and why bother?)
How to encourage and shape user engagement?

It is my hope that these findings will help archivists become more comfortable incorporating archives 2.0 approaches into their own digital projects and other endeavors, and make room for user contributions alongside their archival description. This research has left me with some unanswered questions and thoughts that should be explored further. For example, how might users respond differently to collections of other formats, such as text-based rather than photographic materials? Or to "mass digitized" collections, which have only extremely minimal metadata?

I am also curious whether the results of this study have any implications for finding aids. Would users be as active and willing to contribute to and comment on archival finding aids, or is there something about the item or object being immediately visible on the screen that encourages users to comment and contribute?

If we wish to experiment with incorporating user contributions into archival metadata, how might we go about verifying the accuracy of those comments? Or should we leave the two types of description, archivist-generated and user-generated, completely separate? Or, as a third option, should we incorporate user contributions but clearly attribute them to an unverified source (as in the Keweenaw example)?

Finally, if you are thinking about allowing user comments on your online collections, it seems worthwhile to consider how you would like your users to interact with your site, and to design your system and procedures accordingly. Whatever expectations you have of your users, make them explicit on your site. Invite them to share their knowledge about the materials, have discussions with one another, and share other resources. You may also invite them to make corrections, ask questions, or request copies, if you can develop a workflow for managing and responding to those types of requests. Whatever approaches we take, I hope we can find manageable ways not only to allow our user communities to share their knowledge, resources, and thoughts about the historical record, but to somehow harness and make use of their contributions as well.
http://www.archives.gov/social-media/photo-comment-policy.html
Thank you!
email: jmsedg@gmail.com
twitter: jm_sedgwick
Complete study available online: http://etd.ils.unc.edu/dspace/handle/1901/561
Special thanks to the Donald Peterson Student Award Committee
Image from Beyond Brown Paper, courtesy Plymouth State University