Digital curation

Digital curation is the selection,[1] preservation, maintenance, collection and archiving of digital assets.[2][3][4] Digital curation establishes, maintains and adds value to repositories of digital data for present and future use.[3] This is often accomplished by archivists, librarians, scientists, historians, and scholars. Enterprises are starting to use digital curation to improve the quality of information and data within their operational and strategic processes.[5] Successful digital curation will mitigate digital obsolescence, keeping the information accessible to users indefinitely.

The term curation in the past commonly referred to the work of museum and library professionals. It has since been applied to interaction with social media, including compiling digital images, web links and movie files.

Core principles and activities

The term “digital curation” was first used in the e-science and biological science fields as a means of differentiating the additional suite of activities ordinarily employed by library and museum curators to add value to their collections and enable their reuse[6][7][8] from the smaller subtask of simply preserving the data, a significantly narrower archival task.[6] Additionally, the historical understanding of the term “curator” demands more than simple care of the collection. A curator is expected to command academic mastery of the subject matter as a requisite part of appraisal and selection of assets and any subsequent adding of value to the collection through application of metadata.[6]

Principles

There are five commonly accepted principles that govern the occupation of digital curation:

  • Manage the complete birth-to-retirement life cycle of the digital asset.[4]
  • Evaluate and cull assets for inclusion in the collection.[4]
  • Apply preservation methods to strengthen the asset’s integrity and reusability for future users.[4]
  • Act proactively throughout the asset life cycle to add value to both the digital asset and the collection.[4]
  • Facilitate the appropriate degree of access to users.[4]

Methodology

The Digital Curation Centre offers the following step-by-step life cycle procedures for putting the above principles into practice:[9]

  • Conceptualize: Consider what digital material you will be creating and develop storage options. Take into account websites, publications, email, among other types of digital output.[9]
  • Create: Produce digital material and attach all relevant metadata. Typically, the more metadata, the more accessible the information.[9]
  • Access and use: Determine the level of accessibility for the range of digital material created. Some material may be accessible only by password and other material may be freely accessible to the public.[9]
  • Appraise and select: Consult the mission statement of the institution or private collection and determine what digital data is relevant. There may also be legal guidelines in place that will guide the decision process for a particular collection.[9]
  • Dispose: Discard any digital material that is not deemed necessary to the institution.[9]
  • Ingest: Send digital material to the predetermined storage solution. This may be an archive, repository or other facility.[9]
  • Preservation action: Employ measures to maintain the integrity of the digital material.[9]
  • Reappraise: Reevaluate material to ensure that it is still relevant and is true to its original form.[9]
  • Store: Secure data within the predetermined storage facility.[9]
  • Access and reuse: Routinely check that material is still accessible for the intended audience and that the material has not been compromised through multiple uses.[9]
  • Transform: If desirable or necessary, the material may be converted into a different digital format.[9]
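
The "Preservation action" and "Reappraise" steps above both depend on being able to tell whether material is still true to its original form. One common way to check this is to record a checksum at ingest and compare it later. The following Python sketch illustrates the idea; it is not part of the Digital Curation Centre's model, and the function names and the dictionary-based manifest are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def ingest(path: Path, manifest: dict) -> None:
    """Record the file's checksum at ingest time (hypothetical manifest layout)."""
    manifest[path.name] = sha256_of(path)


def is_intact(path: Path, manifest: dict) -> bool:
    """Return True if the file still matches the checksum recorded at ingest."""
    return manifest.get(path.name) == sha256_of(path)
```

A repository could call `ingest` once when material arrives and `is_intact` on a schedule afterwards, treating any mismatch as a prompt for further preservation action.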

The term "digital curation" is sometimes used interchangeably with terms such as "digital preservation" and "digital archiving". While digital preservation does focus a significant degree of energy on optimizing reusability, preservation remains a subtask to the concept of digital archiving, which is in turn a subtask of digital curation.[6][8] For example, archiving is a part of curation, but so are subsequent tasks such as themed collection-building, which is not considered an archival task. Similarly, preservation is a part of archiving, but so are the important tasks of selection and appraisal that are not necessarily part of preservation.[8]

Data curation is another term that is often used interchangeably with digital curation; however, common usage of the two terms differs. While “data” is a more all-encompassing term that can be used generally to indicate anything recorded in binary form, the term “data curation” is most common in scientific parlance and usually refers to accumulating and managing information related to the process of research.[10] So, while documents and other discrete digital assets are technically a subset of the broader concept of data,[6] in the context of scientific vernacular digital curation represents a broader purview of responsibilities than data curation due to its interest in preserving and adding value to digital assets of any kind.[7]

Challenges

Rate of creation of new data and data sets

The ever-lowering cost and increasing prevalence of entirely new categories of technology have led to a quickly growing flow of new data sets.[11] These come from well-established sources such as business and government, but the trend is also driven by new styles of sensors becoming embedded in more areas of modern life.[7] This is particularly true of consumers, whose production of digital assets is no longer confined strictly to work. Consumers now create wider ranges of digital assets, including videos, photos, location data, purchases, and fitness tracking data, and share them on a wider range of social platforms.[7]

Additionally, the advance of technology has introduced new ways of working with data. Examples include international partnerships that leverage astronomical data to create “virtual observatories”. Similar partnerships have leveraged data resulting from research at the Large Hadron Collider at CERN and the database of protein structures at the Protein Data Bank.[8]

Storage format evolution and obsolescence

Archiving of analog assets is notably passive in nature, often limited to simply ensuring a suitable storage environment. By comparison, digital preservation requires a more proactive approach.[12] Today’s artifacts of cultural significance are notably transient in nature and prone to obsolescence when social trends or dependent technologies change.[7] This rapid progression of technology occasionally makes it necessary to migrate digital asset holdings from one file format to another in order to mitigate the dangers of hardware and software obsolescence, which would render the asset unusable.[9]
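
As a rough illustration of format migration, the Python sketch below scans a list of holdings and proposes a target filename for each file stored in a format the institution considers at risk. The mapping is purely illustrative and is not a recommendation. Deciding which formats are at risk is a policy decision, and the names `MIGRATION_TARGETS` and `plan_migrations` are assumptions, not part of any standard tool.

```python
from pathlib import Path

# Hypothetical policy table: at-risk source format -> preferred target format.
# Illustrative only; a real institution would define its own.
MIGRATION_TARGETS = {
    ".wpd": ".odt",
    ".bmp": ".tiff",
}


def plan_migrations(filenames):
    """Return (source, target) filename pairs for holdings in at-risk formats."""
    plan = []
    for name in filenames:
        path = Path(name)
        # Compare suffixes case-insensitively so "LETTER.WPD" is also caught.
        target_suffix = MIGRATION_TARGETS.get(path.suffix.lower())
        if target_suffix is not None:
            plan.append((name, str(path.with_suffix(target_suffix))))
    return plan
```

The sketch only plans the migration. The conversion itself would be carried out by format-specific tools, and the original file would normally be retained until the converted copy has been verified.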

Underestimation of human labor costs

Modern tools for program planning often underestimate the human labor costs required for adequate digital curation of large collections. As a result, cost-benefit assessments often paint an inaccurate picture of both the amount of work involved and the true cost to the institution for both successful outcomes and failures.[7]

Standardization and coordination between institutions

An absence of coordination across different sectors of society and industry in areas such as the standardization of semantic and ontological definitions,[13] and in forming partnerships for proper stewardship of assets has resulted in a lack of interoperability between institutions, and a partial breakdown in digital curation practice from the standpoint of the ordinary user.[7]

Digitization of analog materials

The curation of digital objects is not limited to strictly born-digital assets. Many institutions have engaged in monumental efforts to digitize analog holdings in an effort to increase access to their collections. Examples of these materials are books, photographs, maps, audio recordings, and more.[7] The process of converting printed resources into digital collections has been exemplified to some degree by the work of librarians and related specialists. For example, the Digital Curation Centre describes itself as a "world leading centre of expertise in digital information curation"[14] and assists higher education research institutions in such conversions.

New representational formats

For some topics, knowledge is embodied in forms that have not been conducive to print. The choreography of dance, or the motion of skilled workers or artisans, for example, is difficult to encode. New digital approaches such as 3D holograms and other computer-programmed expressions are developing.

For mathematics, it seems possible for a new common language to be developed that would express mathematical ideas in ways that can be digitally stored, linked, and made accessible. The Global Digital Mathematics Library is a project to define and develop such a language.

Accessibility

The ability of the intended user community to access the repository’s holdings is of equal importance to all the preceding curatorial tasks. This must take into account not only the user community’s format and communication preferences, but also a consideration of communities that should not have access for various legal or privacy reasons.[15]

Responses to challenges

  • Specialized research institutions[16][17]
  • Academic courses
  • Dedicated symposia[18][19]
  • Peer reviewed technical and industry journals[20]

Approaches

Many approaches to digital curation exist, and have evolved over time in response to the changing technological landscape. Two examples of this are sheer curation[6] and channelization.

Sheer curation is an approach to digital curation where curation activities are quietly integrated into the normal workflow of those creating and managing data and other digital assets. The word sheer is used to emphasize the lightweight and virtually transparent nature of these curation activities. The term sheer curation was coined by Alistair Miles in the ImageStore project,[21] and the UK Digital Curation Centre's SCARP project.[22] The approach depends on curators having close contact or 'immersion' in data creators' working practices. An example is the case study of a neuroimaging research group by Whyte et al., which explored ways of building its digital curation capacity around the apprenticeship style of learning of neuroimaging researchers, through which they share access to datasets and re-use experimental procedures.[23]

Sheer curation depends on the hypothesis that good data and digital asset management at the point of creation and primary use is also good practice in preparation for sharing, publication and/or long-term preservation of these assets. Therefore, sheer curation attempts to identify and promote tools and good practices in local data and digital asset management in specific domains, where those tools and practices add immediate value to the creators and primary users of those assets. Curation can best be supported by identifying existing practices of sharing, stewardship and re-use that add value, and augmenting them in ways that both have short-term benefits, and in the longer term reduce risks to digital assets or provide new opportunities to sustain their long-term accessibility and re-use value.

The aim of sheer curation is to establish a solid foundation for other curation activities which may not directly benefit the creators and primary users of digital assets, especially those required to ensure long-term preservation. By providing this foundation, further curation activities may be carried out by specialists at appropriate institutional and organisation levels, whilst causing the minimum of interference to others.

A similar idea is curation at source, used in the context of Laboratory Information Management Systems (LIMS). This refers more specifically to automatic recording of metadata or information about data at the point of capture, and has been developed to apply semantic web techniques to integrate laboratory instrumentation and documentation systems.[24] Sheer curation and curation-at-source can be contrasted with post hoc digital preservation, where a project is initiated to preserve a collection of digital assets that have already been created and are beyond the period of their primary use.
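
The idea of recording metadata automatically at the point of capture can be sketched in a few lines of Python. This is an illustrative assumption rather than how any particular LIMS works: the function name, the metadata fields, and the use of a plain dictionary instead of semantic web formats are all hypothetical.

```python
import json
from datetime import datetime, timezone


def capture_with_metadata(instrument_id, operator, read_value):
    """Take a reading and attach descriptive metadata at the moment of capture."""
    value = read_value()
    return {
        "value": value,
        "metadata": {
            # Hypothetical fields; a real system would follow its own schema.
            "instrument_id": instrument_id,
            "operator": operator,
            "captured_at": datetime.now(timezone.utc).isoformat(),
        },
    }


# The lambda stands in for a real instrument driver call.
record = capture_with_metadata("spectrometer-01", "jdoe", lambda: 0.42)
print(json.dumps(record, indent=2))
```

Because the metadata is produced by the same call that produces the data, the person taking the measurement does not need to perform a separate documentation step, which is the property that distinguishes this approach from post hoc preservation.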

Channelization is curation of digital assets on the web, often by brands and media companies, into continuous flows of content, turning the user experience from a lean-forward interactive medium into a lean-back passive medium. The curation of content can be done by an independent third party that selects media from any number of on-demand outlets from across the globe and adds them to a playlist to offer a digital "channel" dedicated to certain subjects, themes, or interests, so that the end user sees and/or hears a continuous stream of content.

    References

    1. Erin Scime (8 December 2009). "The Content Strategist as Digital Curator". A List Apart.
    2. Rusbridge, C.; Buneman, P.; Burnhill, P.; Giaretta, D.; Ross, S.; Lyon, L.; Atkinson, M. (2005). "The Digital Curation Centre: A Vision for Digital Curation". 2005 IEEE International Symposium on Mass Storage Systems and Technology (PDF). p. 31. doi:10.1109/LGDI.2005.1612461. ISBN 0-7803-9228-0.
    3. "What is Digital Curation?". Digital Curation Centre. Retrieved 2008-04-01.
    4. Elizabeth Yakel (2007). "Digital curation". Emerald Group Publishing. Retrieved 2008-04-01.
    5. E. Curry, A. Freitas, and S. O'Riáin, "The Role of Community-Driven Data Curation for Enterprises," in Linking Enterprise Data, D. Wood, Ed. Boston, MA: Springer US, 2010, pp. 25-47. Archived 2012-01-23 at the Wayback Machine.
    6. Dallas, Costis (2016-12-01). "Digital curation beyond the "wild frontier": a pragmatic approach". Archival Science. 16 (4): 421–457. doi:10.1007/s10502-015-9252-6. ISSN 1389-0166.
    7. National Research Council (2015-04-22). Preparing the Workforce for Digital Curation. doi:10.17226/18590. ISBN 9780309296946.
    8. Beagrie, Neil (2008). "Digital Curation for Science, Digital Libraries, and Individuals". International Journal of Digital Curation. 1: 3–16. doi:10.2218/ijdc.v1i1.2.
    9. "DCC Curation Lifecycle Model | Digital Curation Centre". www.dcc.ac.uk. Retrieved 2018-02-19.
    10. Borgman, Christine L. Big data, little data, no data: scholarship in the networked world. Cambridge, Massachusetts. ISBN 9780262327862. OCLC 900409008.
    11. Ray, Joyce (2009). "Sharks, digital curation, and the education of information professionals". Museum Management and Curatorship. 24 (4): 358. doi:10.1080/09647770903314720.
    12. Higgins, Sarah (2011). "Digital Curation: The Emergence of a New Discipline". International Journal of Digital Curation. 6 (2): 78–88. doi:10.2218/ijdc.v6i2.191.
    13. Paul Watry (November 2007). "Digital Preservation Theory and Application: Transcontinental Persistent Archives Testbed Activity". The International Journal of Digital Curation. Archived from the original on 2008-03-15. Retrieved 2008-04-01.
    14. Digital Curation Centre. "About the DCC". Website. Digital Curation Centre. Retrieved 6 March 2013.
    15. Lavoie, Brian; OCLC (2014). "The Open Archival Information System (OAIS) Reference Model: Introductory Guide (2nd Edition)". doi:10.7207/twr14-02.
    16. Digital Curation Centre
    17. Digital Preservation Coalition
    18. DigCCurr 2007 - an international symposium on Digital Curation, April 18-20, 2007
    19. 1st African Digital Management and Curation Conference and Workshop - Date: 12-13 February 2008
    20. International Journal of Digital Curation
    21. The ImageStore Project - ImageWeb
    22. Digital Curation Centre: DCC SCARP Project
    23. Whyte, A., Job, D., Giles, S. and Lawrie, S. (2008) 'Meeting Curation Challenges in a Neuroimaging Group', The International Journal of Digital Curation Issue 1, Volume 3, 2008
    24. Frey, J. 'Sharing and Collaboration' keynote presentation at UK e-Science All Hands Meeting, 8–11 September 2008, Edinburgh
    This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.