Crowdsourcing Bib - RGC Grant
(Temporary spot until account is established on other wiki)
Project and Platform Design
Best Practices for Design and Management
- Barber, S. T. 2018. The ZOONIVERSE is expanding: Crowdsourced solutions to the hidden collections problem and the rise of the revolutionary cataloging interface. Journal of Library Metadata, 18(2), 85–111. doi.org/10.1080/19386389.2018.1489449
Provides an overview of the LIS literature on crowdsourcing, surveys a number of crowdsourced cultural heritage projects, and describes the “revolutionary cataloging interface” provided by Operation War Diary (OWD) – a major crowdsourced transcription project administered by the UK’s National Archives and the Imperial War Museum. Includes detailed discussion of a usability survey administered to evaluate the OWD tutorial and interface, as well as the quality control mechanisms built into this crowdsourcing model.
- Causer, T., Tonra, J., & Wallace, V. 2012. Transcription maximized; expense minimized? Crowdsourcing and editing The Collected Works of Jeremy Bentham. Literary and Linguistic Computing, 27(2), 119–137. doi.org/10.1093/llc/fqs004.
Reports on five key questions: is the crowdsourcing of manuscript transcription cost effective; is it exploitative; is the product high quality; is the project sustainable and the product permanent; and how can success be measured. Provides a review of major crowdsourcing projects to extract data, correct OCR, and transcribe indexes, while noting that the transcription of manuscript collections is more complicated. The bulk of the article addresses the conceptualization and execution of the project, including: development of the transcription tool (a customized MediaWiki allowing for encoding of linguistic and bibliographic data using TEI XML tags); the contributions of transcribers, which were significant; moderation by staff, which was intensive; and assessment of impact and success. Describes the user survey employed to gauge factors that either motivated or discouraged volunteers.
- Carletti, Laura, Derek McAuley, Dominic Price, and Gabriella Giannachi. 2013. Digital Humanities and Crowdsourcing: An Exploration. Paper presented at the MW2013: The Annual Conference of Museums and the Web, Portland, OR (April).
Directly addresses crowdsourcing within digital humanities. Contains a solid lit review covering major publications to date with a focus on defining crowdsourcing for digital humanities and making recommendations for best practices.
- Holley, Rose. 2009. Many Hands Make Light Work: Public Collaborative OCR Text Correction in Australian Historic Newspapers. National Library of Australia (March): 1–28.
Report on the Australian Newspaper Digitisation Project – a foundational case study on the design, management, and assessment of crowdsourcing in GLAMs. Provides an excellent model for employing usability testing to gauge the feasibility of a crowdsourced project. Furthermore, it presents a concrete model for assessing participant motivation through questionnaires distributed to the most active users. Finally, this study demonstrates how to measure the quality of crowdsourced transcription using confidence levels.
- Holley, Rose. 2010. Crowdsourcing: How and Why Should Libraries Do It? D-Lib Magazine 16, no. 3/4 (March).
Outlines tips for designing and managing crowdsourcing platforms that maximize the benefits for libraries. These include: having one main transparent and clear goal on the home page and a visible charting of progress toward that goal; making the overall environment easy to use, intuitive, quick and reliable and the activities easy and fun; keeping the site active by addition of new content/work; giving volunteers options and choices; and rewarding high achievers with rankings that encourage competition. Illustrates these tips using screenshots from successful projects.
- Mika, K., DeVeer, J., & Rinaldo, R. 2017. Crowdsourcing natural history archives: Tools for extracting transcriptions and data. Biodiversity Informatics, 12.
- Oomen, Johan, and Lora Aroyo. 2011. Crowdsourcing in the Cultural Heritage Domain: Opportunities and Challenges. In Proceedings of the 5th International Conference on Communities and Technologies: 138–149.
Identifies two main challenges: recruiting and retaining knowledgeable users over time and maintaining the quality of user-generated metadata. Concludes that crowdsourcing volunteers are primarily motivated by connectedness/membership and sharing/generosity, with secondary motivations of altruism, fun, and competition. Argues that crowdsourcing design must also build in sufficient quality control and verification safeguards.
- Ridge, Mia. 2013. From Tagging to Theorizing: Deepening Engagement with Cultural Heritage through Crowdsourcing. Curator: The Museum Journal 56, no. 4: 435–50.
Useful as a source for thinking about what makes a well-designed crowdsourcing platform work for volunteers. Outlines specific design elements that, based on a careful survey of successful crowdsourcing projects and the literature on participant motivation, are likely to encourage and sustain participation. For example, beginning with a quick and easy task can initially engage a volunteer who can be drawn in further by scaffolding, which limits initial complexity and gradually increases difficulty as mastery increases. This approach addresses a main challenge for crowdsourcing projects – that is, how to maintain participation when boredom and anxiety are both factors that can cause volunteers to drop out. Presents casual game design, in which tasks are presented as easy-to-pick-up games that can be enjoyed in short increments, as a promising way to build interfaces that respond to the changing needs of their participants with either new content, appropriately challenging tasks, or new roles and responsibilities.
- Simperl, Elena. 2015. How to Use Crowdsourcing Effectively: Guidelines and Examples. LIBER Quarterly 25, no. 1: 18–39.
An excellent how-to guide for designing crowdsourcing projects. Final third of the article is a guide to crowdsourcing that breaks project design down into four main dimensions: what is being crowdsourced (a high-level goal); who the crowd is; how crowdsourcing will be implemented; and how participation will be incentivized. The guide explains how to translate the high-level goal of a project into specific tasks performed on selected items (full pages of text, data logs, etc.). The how-to-crowdsource section helpfully discusses how to select between macro and micro tasks. It also distinguishes explicit crowdsourcing (which uses a professional crowdsourcing platform to solicit contributions) from implicit crowdsourcing (in which the crowd does not explicitly solve tasks but rather plays a game or collects and shares information similar to social media applications) and suggests potential best applications for each approach. Simperl concludes by suggesting that game elements can be used to effectively incentivize continued participation in a number of different kinds of crowdsourcing projects.
- Thomer, A., Vaidya, G., Guralnick, R., Bloom, D., & Russell, L. 2012. From documents to datasets: A MediaWiki-based method of annotating and extracting species observations in century-old field notebooks. ZooKeys, 209, 235–253. doi.org/10.3897/zookeys.209.3247.
- Transforming Libraries and Archives through Crowdsourcing. D-Lib Magazine 23, no. 5/6 (May/June 2017).
Tools, Tutorials, and Project Examples
- Crowdsourcing Transcriptions Harvard Library Wiki annotated list of tools
- Crowdsourcing Special Collections Discusses uses and limitations of Zooniverse
- How to Build Your Own Zooniverse Project Video tutorial giving a quick overview of the steps for building a project; technical.
- Emigrant City NYPL Scribe Project
- Old Weather: Whaling Zooniverse/Scribe project
Research Study
Transcribing Handwritten Records
- How to Transcribe from the Library of Congress
- Transcription Tips from The National Archives
- Instructions for Volunteers from the Smithsonian Transcription Center
- Transcription Guidelines from Transcribe Bentham Project
- Transcription Tips from DIY History at U Iowa
- Europeana Transcription Tutorial
Data Extraction and Using Spreadsheets
- Old Weather Transcribing Guide
- Transcribing between the lines: Crowd-sourcing historic data collection
- Natural History Museum of Utah Catalog Notes
- Midwives Register
Transcription Tools
Methods Articles
Feasibility Study
- McKinley, D. 2012. Practical management strategies for crowdsourcing in libraries, archives and museums.
Background Readings
- Schenk, E. & Guittard, C. (2011). Towards a characterization of crowdsourcing practices. Journal of Innovation Economics & Management, 7(1), 93-107. doi:10.3917/jie.007.0093.
According to this article, what we are doing can be characterized as a form of "content crowdsourcing" (CS) that the authors refer to as "Integrative Crowdsourcing":
Integrative CS will be relevant when the client firm seeks to build data or information bases. Therefore Integrative CS is a form of content Crowdsourcing. While gathering information or data at an individual’s level can be unproblematic, building a data base generally requires significant amounts of resources. The rationale of integrative CS therefore lies in the cost of building large data or information bases. Since individuals within the crowd are heterogeneous, Crowdsourcing enables the client firm to gather a variety of contents. The firm seeking to implement integrative CS should however be aware of integration issues. Data or information stemming from various origins might be incompatible or redundant if no precaution is taken. Precautions include the definition of a data format and the sound selection of data sources. [Emphasis added]
There are other relevant sections of this paper including:
- A way to distinguish the 1992 data transcription tasks from the more complex cognitive efforts required to extract data from the written articles from the 1961 season. See these sections of the paper:
- Crowdsourcing of simple tasks
- Crowdsourcing of complex tasks
- Discussion of incentives in the section "Discussion: benefits and pitfalls"
- Brabham, D.C. (2013). Crowdsourcing. MIT Press. Available as an ebook through UA Libraries.
This should be one of our "go to" resources. It's a short book. This quote is indicative of its usefulness:
Crowdsourcing is a problem-solving model because it enables an organization confronted with a problem and desiring a goal state to scale up the task environment dramatically and enlarge the solver base by opening up the problem to an online community through the Internet.
This quote is from Chapter 1 of the book.
- Terras, M.M. (2016). Crowdsourcing in the Digital Humanities. In Schreibman, S., Siemens, R., & Unsworth, J. (Eds.), A New Companion to Digital Humanities (pp. 420–439). Wiley-Blackwell.
This chapter has a section that is a nice background discussion of historical document transcription.
- Ridge, Mia, ed. 2014. Crowdsourcing Our Cultural Heritage. Burlington, VT: Ashgate.
The book on crowdsourcing for cultural heritage projects. Brings together scholars working on and writing about both established and emergent aspects of crowdsourcing in GLAMs. There are good essays in here on projects like Transcribe Bentham and Old Weather, along with essays that deal in more depth with how crowdsourcing can be employed to generate high-quality metadata to improve discovery and use. Oomen, Gligorov, and Hildebrand describe a competitive social tagging game called Waisda, which was developed by the Netherlands Institute for Sound and Vision. Essays broadly cover project design, management, and assessment and discuss participant motivation and project outcomes.