‘Harvesting’ Community Knowledges: Crowdsourcing and Community Representation

by brandontlocke

I presented this paper at the James A. Rawley Annual Conference in the Humanities on March 15, 2014 in Lincoln, Nebraska. Disclaimer: I worked as project manager for History Harvest for over a year, and was involved in two of the History Harvest events (I participated in my third after this presentation). I want to clarify that I am no longer officially associated with the project, and I do not speak on its behalf, nor on behalf of the UNL History Dept. The slides for this talk are also available.

I want to open with one of my favorite ongoing series on the web. Every few months, Slate covers an American story through the same lens that American journalists use to cover news stories that occur outside of the US and Europe. The pieces critically and often demeaningly question some very basic tenets of American culture, highlighting the social, cultural, and political lenses through which we view others without even being aware of it. Librarians, archivists, and curators have a long and tumultuous relationship with these biases and the way they present truth. Curators are faced with the impossible task of relating truths about culturally constructed topics that have innumerable meanings. Although museums in the postmodern era have made tremendous strides to speak from the perspective of the cultures represented and move away from Western viewpoints and colonialist narratives, a central problem remains. In nearly all cases, the descriptions, metadata, and contexts surrounding objects and exhibits, as well as their inclusion in the first place, are created by a staff of elite experts writing from the perspective of the ivory tower (Srinivasan et al., 2009a, 667). Cultural heritage institutions, operated by dominant cultures, often miss or intentionally ignore the significance and meaning of objects and histories in other cultures and communities. The result of these practices is collections that represent communities only through the paradigm of the dominant culture and infuse shared memories with the ideological, political, and cultural influences of the present dominant community (Somerville and Echohawk, 650). UNESCO has highlighted this as an issue throughout the world, and has encouraged projects over the past two decades to reflect diversities and avoid what they call a “world without memories” (Somerville and Echohawk, 651).

Cultural heritage institutions have largely tried to combat these problems by better educating their experts, understanding and confronting their biases, and consulting community leaders. Again, these attempts have significantly improved collections and descriptions, but they still rely on authorities and (in most cases) standardized language to describe and define histories, and often do nothing to improve the collection to better reflect the community. Like decisions of significance and inclusion, the descriptions and the language in the metadata are nearly always written from the standpoint and in the language of the dominant cultural elite. The issue of problematic controlled vocabulary has been discussed for decades, since the late 1960s when “radical cataloger” Sanford Berman began criticizing the Library of Congress for only reflecting white, Christian, middle-class viewpoints, and for continuing to use terms considered offensive to the very people they were meant to represent. Although Berman’s “radical cataloging” gained traction and has made significant progress, controlled vocabulary, by nature, is going to be insufficient to represent all things to all people and to keep pace with language as it evolves. The practice of subject heading assignment is what Boast et al. describe as an “…imposition of the efficiency driven priorities [of] the public institution upon its diverse publics” (Boast, Bravo, and Srinivasan, 397). The efficiency and usefulness of subject headings do have a great deal of value. Rather than throwing them out, I believe libraries should continue to improve them, while also supplementing them with other methods of collecting terminology and knowledge.

In recent years, many cultural heritage institutions have turned to technology and the participatory nature of the so-called “Web 2.0” to improve descriptions and add community knowledge. Users can add tags to materials using whatever language they feel is appropriate. These collaborative tags, known as folksonomies, can enable better search functionality by broadening the metadata terms and injecting the vocabulary that users are likely to search with. There have been a number of successful crowdsourcing projects that have added valuable metadata to objects with sparse records, but thus far, folksonomies have been generally unsuccessful in adequately representing minority communities. While tagging does allow communities to apply their knowledge to the collection, that knowledge can be lost among the tags of members of the dominant community (Bates and Rowley, 445). Users can also add tags that lack specificity or constructive knowledge, or that are not useful to other users. Folksonomies can be useful if specific knowledge communities are targeted and solicited for input, but simply opening up the collection to tagging will likely result in tags that only represent dominant groups.

Many museum informaticists recognize that the current model is fundamentally flawed because of its inability to represent multiple knowledges, and are calling for fundamental changes to museums and cultural heritage institutions. Srinivasan and his co-authors cite a number of progressive Web 2.0 projects that bring multivocality into core documentation by “…fundamentally changing the philosophy with which these institutions approach documentation and description” (Srinivasan et al., 2009b, 275). Attempts to do so have primarily involved partnering with community leaders to document and represent objects from the perspective of the community. While this is certainly a vast improvement over current methods in representing localized and culturally specific knowledge, I feel that this model has some shortcomings worth examining. By reaching out to authorities within a community, heritage institutions are only getting knowledge from those in authoritative positions. A true representation of a community would include the multivocality within its population, and would give individuals the opportunity to voice their own histories. There has yet to be a study on the ways in which a truly crowdsourced collection can contribute to the understanding and representation of diverse communities.

History Harvest – historyharvest.unl.edu

For the past several years, History Harvest has gone out into selected communities and asked individuals to bring in objects of significance to contribute to the historical record. The significance of these objects is determined by the contributors, and an oral history interview is conducted, often revolving around the objects the contributor is sharing. History Harvest’s collection process is worth investigating through the prism of the shortcomings in cultural heritage institutions. A crowdsourced collection, combined with metadata derived from rich contextual knowledge, provides the ability to represent items with the knowledge of the contributing community, rather than the knowledge of the institution and the dominant culture (Srinivasan et al., 2010, 766-767). It is essential that the collective community narrative be built from a plethora of first-hand knowledge and stories.

Only by knowing their identity as a function of their unfolding biographical history, and their engagement with multiple knowledge groups, can [objects] be set within the dynamic and expanding negotiations that constantly work to constitute the knowledges of which they are an active part. If a dynamic and situated knowledge is discredited, overruled, or abandoned, it does not lose its validity, for it retains its place and time in the local negotiations that are knowledge. If, however, it loses its associations, it becomes uprooted, displaced, and therefore severed from the people and places that validate its social meaning (Boast, Bravo, and Srinivasan, 400-401).

Warren Taylor’s Penny

One example of the benefits of a crowdsourced model of collection development can be seen in the 2012 History Harvest. Warren Taylor, a History Harvest contributor from the predominantly African American North Omaha area, brought in a “liberty penny” owned by his great-great-grandmother. It was a family heirloom, given to him by his great aunt along with a few other objects. The penny was obtained by his great-great-grandmother while she was enslaved in the South, and, given the rarity of an enslaved person having money combined with the ‘liberty’ message on the penny, the object had value and meaning within his family. The penny also serves as a symbol of the Great Migration, which brought thousands of African Americans, including Warren Taylor’s family, northward to cities like Omaha. In this instance, the object itself, devoid of the contributor’s story and the context surrounding it, has little value or instruction in a cultural heritage collection. However, by documenting Mr. Taylor’s story and studying the intertwined biographies of the penny and his family, the public is able to understand a bit more about the experiences of African Americans directly from the source.

The biographies of objects and the meanings they hold for individuals are not always excluded from the historical record or from cultural heritage institutions, but when they are included, they generally come from someone in the dominant social group who was widely considered important by their peers. When keepsakes or talismans of this kind are represented, they most often come from the wealthy and powerful who, as part of the dominant society, are empowered to donate their possessions to museums that reflect the value and significance of the items. History Harvest reflects the significance that each individual recognizes in their own possessions and histories, and makes those available to the public. These reflections, taken in combination with others, can offer some exceptional insights into communities and their cultures, customs, and histories.

Metadata in the History Harvest is also written primarily using the language and dialect of the contributor and does not use controlled vocabulary for the majority of the fields. This avoids the longstanding issue of subject term usage, which can erase or flatten the languages and vocabularies that are natively applied to the artifacts. More importantly, oral history interviews are posted along with the objects, to represent the entire context as completely as possible. The major benefit of a process like History Harvest’s is in providing items to the historical record that are context-rich and fully informed by community knowledge. Even with the best intentions, the nature of controlled vocabularies and slow-moving institutions means that objects in typical GLAMs are described through a limited vocabulary that is not necessarily representative of all communities and vantage points.

Although the method employed by History Harvest minimizes the institutional imprint, it is not perfect. First, the oral history interviews are edited for time, and the metadata is necessarily edited and reformulated to some extent. These edits are largely unavoidable, but a consciousness of the concerns will go a long way to minimize the impact of editing and reformulation. Second, the information obtained (that is, not just the information that is recorded and displayed, but the initial interview) is shaped by the interviewer. Their cultural lens, as well as their knowledge of the topic, will shape the information received and will influence the direction of the knowledge. Like the first problem, this is unavoidable, but it can be minimized through careful training, education, interviewer selection, and other considerations. Third, this method does not necessarily resolve problems of intellectual property and different cultural contexts. Contributors could be asked for their preferred rights and customary sharing options, but then a great deal of work must go into creating an infrastructure that supports these. The social, technical, and infrastructural issues created by such a collection are many, and they must be addressed in the near future.

History Harvest and its experiences with crowdsourced content show promise for the future in building collections and adding community knowledges to existing collections. The methods put a much-needed focus on community inclusion, diverse knowledges, and the full social, political, and cultural context of historical items. These experiences with very basic, stripped-down methods can positively contribute to diversity, intellectual tension, and community knowledge grounded in the lives of those in the community. With this groundwork, institutions can either publish the work on its own as grassroots archives like the History Harvest, or they may retain this information while adding more layers of knowledge derived from multiple different places.

Works Cited

Jo Bates and Jennifer Rowley, “Social Reproduction and Exclusion in Subject Indexing: A Comparison of Public Library OPACs and LibraryThing Folksonomy,” Journal of Documentation 67, no. 3 (April 26, 2011), doi:10.1108/00220411111124532.

Robin Boast, Michael Bravo, and Ramesh Srinivasan, “Return to Babel: Emergent Diversity, Digital Resources, and Local Knowledge,” Information Society 23, no. 5 (October 2007), doi:10.1080/01972240701575635.

Mary M. Somerville and Dana Echohawk, “Recuerdos Hablados/Memories Spoken: Toward the Co-Creation of Digital Knowledge with Community Significance,” Library Trends 59, no. 4 (Spring 2011).

Ramesh Srinivasan et al., “Blobgects: Digital Museum Catalogs and Diverse User Communities,” Journal of the American Society for Information Science & Technology 60, no. 4 (April 2009a).

Ramesh Srinivasan et al., “Digital Museums and Diverse Cultural Knowledges: Moving Past the Traditional Catalog,” Information Society 25, no. 4 (July 2009b), doi:10.1080/01972240903028714.

Ramesh Srinivasan et al., “Diverse Knowledges and Contact Zones Within the Digital Museum,” Science, Technology & Human Values 35, no. 5 (September 2010).