Research Data in the Arts and Humanities: A Few Difficulties
1. Research Data Management in the Arts and Humanities: a few difficulties
Tom Phillips, A Humument (1970, 1986, 1998, 2004, 2012…)
Martin Donnelly, Digital Curation Centre, University of Edinburgh
SCONUL Summer Conference, Cardiff, 23 & 24 June 2016
2. About the DCC
The UK’s centre of expertise in digital preservation and data
management, established in 2004
Provide guidance, training, tools and other services on all aspects of
research data management
Organise national and international events and webinars (International
Digital Curation Conference, Research Data Management Forum)
Our primary audience has been the UK higher education sector, but we
increasingly work further afield (Europe, North America, Australia, South
Africa) and in new sectors (government, commercial, etc)
Involved in various European projects and initiatives, including FOSTER,
OpenAIRE and EUDAT
Now offering tailored consultancy and training services
3. Context and overview
Policy-driven expectations to archive, link and share the data
(evidence) underpinning scholarly publications are increasingly
becoming ‘the new normal’
The drivers behind this shift tend to be quite science-centric, to the
extent that in some circles the terms ‘research’ and ‘science’ are used
almost interchangeably. This, alongside other terminological
problems such as the use of ‘data’ as shorthand for a broad range of
quantitative and non-quantitative research objects, can serve to
alienate those working in the Arts and Humanities
These sessions will explore some of the problems inherent in treating
all research observations, records and products as ‘data’ and will
include an opportunity for attendees to reflect on the policy and
support environment within which their own institutions operate…
4. Structure of the hour
1. Talk (“Defining our terms”)
2. Exercise (“Sketching the problem”)
3. Discussion (“What can we do to
make things better?”)
5. What is research data management?
“the active management and appraisal of data over the lifecycle of scholarly and scientific interest”
6. The old way of doing research (science)
1. Researcher collects data (information)
2. Researcher interprets/synthesises data
3. Researcher writes paper based on data
4. Paper is published (and preserved)
5. Data is left to benign neglect, and
eventually ceases to be accessible
7. The new way of doing research (science)
The DataONE lifecycle model: Plan, Collect, Assure, Describe, Preserve, Discover, Integrate, Analyze
DEPOSIT …and RE-USE
8. N.B. other models are available…
Ellyn Montgomery, US Geological Survey
9. What’s “normal” is shifting…
Data management is a part of good research practice.
- RCUK Policy and Code of Conduct on the Governance of Good Research Conduct
10. Why do RDM?
In a word,
so we and others
can re-use data
in the future
11. RDM key drivers
TRANSPARENCY: The evidence that underpins research can be
made open for anyone to scrutinise, and attempt to replicate
the findings of others.
EFFICIENCY/VfM: Data collection can be funded once, and used
many times for a variety of purposes.
SPEED: Data can be distributed and accessed more quickly. In
some disciplines, such as climate science, this is vital.
RISK MANAGEMENT: A pro-active approach to data
management reduces the risk of inappropriate disclosure of
sensitive data, whether commercial or personal.
PRESERVATION: Lots of data is unique, and can only be
captured once. If lost, it can’t be replaced.
13. Definitions vary from discipline to discipline, and from funder to funder…
Here’s a science-centric definition:
“The recorded factual material commonly accepted in the scientific community as
necessary to validate research findings.” (US Office of Management and Budget,
Circular A-110)
[Addendum: This policy applies to scientific collections, known in some disciplines
as institutional collections, permanent collections, archival collections, museum
collections, or voucher collections, which are assets with long-term scientific value.
(US Office of Science and Technology Policy, Memorandum, 20 March 2014)]
And another from the visual arts:
“Evidence which is used or created to generate new knowledge and
interpretations. ‘Evidence’ may be intersubjective or subjective; physical or
emotional; persistent or ephemeral; personal or public; explicit or tacit; and is
consciously or unconsciously referenced by the researcher at some point during
the course of their research.”
(Leigh Garrett, KAPTUR project: see http://kaptur.wordpress.com/2013/01/23/what-is-visual-arts-research-data-revisited/)
So what is ‘data’ exactly?
14. Scientific and other methods…
The scientific method is a body of
techniques for investigating phenomena,
acquiring new knowledge, or correcting and
integrating previous knowledge.
To be termed scientific, a method of inquiry
must be based on empirical and measurable
evidence subject to specific principles of
reasoning.
The Oxford English Dictionary defines the
scientific method as: “a method or procedure
that has characterized natural science since
the 17th century, consisting in systematic
observation, measurement, and experiment,
and the formulation, testing, and modification
of hypotheses.”
Source:
http://en.wikipedia.org/wiki/Scientific_method
An art methodology differs from a science
methodology, perhaps mainly insofar as the artist is
not always after the same goal as the scientist. In art it
is not necessarily all about establishing the exact truth
so much as making the most effective form (painting,
drawing, poem, novel, performance, sculpture, video,
etc.) through which ideas, feelings, perceptions can
be communicated to a public. With this purpose in
mind, some artists will exhibit preliminary sketches
and notes which were part of the process leading to
the creation of a work. Sometimes, in Conceptual art,
the preliminary process is the only part of the work
which is exhibited, with no visible end result displayed.
In such a case the "journey" is being presented as
more important than the destination.
Source: http://en.wikipedia.org/wiki/Art_methodology
15. A few useful (?) quotations
“A work is never completed except by some accident such as weariness, satisfaction, the need to deliver, or death: for, in relation to who or what is making it, it can only be one stage in a series of inner transformations” – Paul Valéry
Paraphrased by Auden as “A work of art is never completed, only abandoned”
“You could not step twice into the same river” – Heraclitus, as reported by Plato (via Socrates)
“In science, one man’s noise is another man’s signal” – Edward Ng
‘Truth?’ said Pilate. ‘What is that?’ – John 18:38
16. Strengths and weaknesses (I)
There’s nothing new about data re-use in the Arts and Humanities; it’s an integral part of the culture, and always has been…
Think Shakespeare’s plots, found sounds/objects/poems (e.g. Duchamp, Morgan), variations on a theme, collage and intermedia art, T.S. Eliot’s allusions, DJ culture (sampling/breakbeat), etc.
However, it’s often more fraught than data re-use in other areas (e.g. the Physical Sciences, if not the Social Sciences). Some characteristics of Arts and Humanities data are likely to require a different kind of handling from that afforded to data in other disciplines
For starters, people do not always think of their sources/influences/outputs as ‘data’, and the value and referencing systems (and norms) may be quite different…
opinions? affect? beliefs? perceptions?
17. Strengths and weaknesses (II)
Digital ‘data’ emerging in the Arts is as likely to be an outcome of the creative research process as an input to a workflow
Furthermore, practice/praxis-based research is more or less the sole preserve of the Humanities, and research/production methods are not always rigorously methodical or linear. This is at odds with the scientific approach, and the way in which most RDM resources are described/defined/oriented
Arts ‘data’ is often personal, and creative data in particular may not be factual in nature. What matters most may not be the content itself, but rather the presentation, the arrangement, the quality of expression…
Creative researchers also care a great deal about the way in which their work is presented, or ‘showcased’: standard repository installations don’t cut it!
What do Arts and Science data have in common? Both may be financially valuable and/or precious to their creators
18. Archiving issues
Business case (“could anyone die or go to jail?”)
The law: data protection
Policy: retention and embargo periods
Financial/cultural benefit
Commercial considerations and IPR… personal data?
Multiplicity of (file) formats and creation/storage media (N.B. other
disciplines report this as well, e.g. Dryad 2015 Stats Roundup)
Most disciplinary repositories support only a limited set of
recommended file formats/object types
Linking analogue and digital, structuring collections
Scope. Respect des fonds? Ownership/IP issues may make this tricky
Non-digital material. Access arrangements/digitisation. Demand for
digitisation/archiving may outstrip capacity/budgets…
Scale. Imaging data is comparatively large in terms of disk space
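One way to get a first sense of the format multiplicity and scale issues listed above is to inventory a collection before approaching a repository. The following is a minimal Python sketch and is not part of the original talk. The function names `inventory_formats` and `print_report` are hypothetical, and grouping by file extension is an assumption made for illustration: extensions are only a rough proxy for format, and proper format identification needs dedicated tools.

```python
from collections import Counter
from pathlib import Path


def inventory_formats(root):
    """Tally file count and total bytes per file extension under root.

    Assumption: the extension is used as a rough stand-in for the format.
    """
    counts = Counter()
    sizes = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Normalise case so that .TIF and .tif are counted together.
            ext = path.suffix.lower() or "(no extension)"
            counts[ext] += 1
            sizes[ext] += path.stat().st_size
    return counts, sizes


def print_report(root):
    """Print one line per extension, most common first."""
    counts, sizes = inventory_formats(root)
    for ext, n in counts.most_common():
        megabytes = sizes[ext] / 1_000_000
        print(f"{ext:16} {n:6d} files {megabytes:10.2f} MB")
```

Calling `print_report("path/to/collection")` on a folder of research material would show at a glance how many distinct extensions are present and which ones account for most of the disk space, which may help when comparing a collection against a repository's list of recommended formats.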
19. Exercise
Make a quick list of the policies and laws which govern the management of research data in your institution. Who is responsible for creating and monitoring compliance with these?
Rank the five key drivers and benefits of RDM for (a) the STEM subject areas, and (b) the Arts and Humanities
Is it more appropriate to have a single institutional RDM policy, or should these be devolved to School or Department level? Or is it better to devolve guidance and procedures as opposed to high level policy?
KEY DRIVERS
1. TRANSPARENCY: The evidence that underpins research can be made open for anyone to scrutinise, and attempt to replicate the findings of others.
2. EFFICIENCY/VfM: Data collection can be funded once, and used many times for a variety of purposes.
3. SPEED: Data can be accessed more quickly. In some disciplines, such as climate science, this is vital.
4. RISK MANAGEMENT: A pro-active approach to data management reduces the risk of inappropriate disclosure of sensitive data, whether commercial or personal.
5. PRESERVATION: Lots of data is unique, and can only be captured once. If lost, it can’t be replaced.
20. Suggested ranking for the Sciences
TRANSPARENCY: The evidence that underpins research
can be made open for anyone to scrutinise, and attempt to
replicate the findings of others.
EFFICIENCY/VfM: Data collection can be funded once, and
used many times for a variety of purposes.
SPEED: Data can be accessed more quickly. In some
disciplines, such as climate science, this is vital.
RISK MANAGEMENT: A pro-active approach to data
management reduces the risk of inappropriate disclosure
of sensitive data, whether commercial or personal.
PRESERVATION: Lots of data is unique, and can only be
captured once. If lost, it can’t be replaced.
Rank numbers shown on the slide: 2, 4, 3, 5, 1
21. Suggested ranking for the Arts & Humanities
TRANSPARENCY: The evidence that underpins research
can be made open for anyone to scrutinise, and attempt to
replicate the findings of others.
EFFICIENCY/VfM: Data collection can be funded once, and
used many times for a variety of purposes.
SPEED: Data can be accessed more quickly. In some
disciplines, such as climate science, this is vital.
RISK MANAGEMENT: A pro-active approach to data
management reduces the risk of inappropriate disclosure
of sensitive data, whether commercial or personal.
PRESERVATION: Lots of data is unique, and can only be
captured once. If lost, it can’t be replaced.
Rank numbers shown on the slide: 1, 2, 5, 3, 4
22. Possible discussion points
Need – what do we need to archive? Is it always evidence without which the research outcomes are in doubt?
Want – do we want to archive materials for other reasons? Does preserving early/developmental work (such as artists’ sketch books) provide a richer experience/understanding of the creative work and process? How do we make a business case for this? Are datasets really “the special collections of the future”?
Liminality
Many creative researcher/practitioners are on fractional contracts, and there is not always a clear delineation between professional work and personal practice. Where and how do we draw the line?
More practically, the same notebook or sketchbook may be used for both professional and personal purposes. Its contents may be messy, personal, confusing…
How much time/effort does potentially sensitive Arts ‘data’ require in order to be prepared for archiving? Again, how do we know when it’s worth it?
23. What can we do?
Be careful with our terminology
“Data” – be clear that this is not the dictionary definition, but rather shorthand for a variety of scholarly products/by-products (see www.researchobject.org for examples)
Don’t use “science” and “research” interchangeably. Challenge those who do.
Ensure that institutional policies fit the full range of scholarly disciplines
Be mindful of the sometimes blurred lines between professional investigation and personal expression
Talk to researchers: understand their working methods, discover their needs, assuage their fears
Build bridges before they’re needed
Accept that not everything needs to be archived – prioritise!
24. Further reading and links
Paper: Marieke Guy, Martin Donnelly, Laura Molloy (2013) “Pinning It Down: Towards a Practical Definition of ‘Research Data’ for Creative Arts Institutions”, International Journal of Digital Curation, Vol. 8, No. 2, pp. 99-110. DOI: 10.2218/ijdc.v8i2.275
Projects:
KAPTUR (2011-13) URL: http://www.vads.ac.uk/kaptur/
A consortial approach to building an integrated RDM system (2014-16) (Partners: CREST, University for the Creative Arts, ULCC, Leeds Trinity University, Arkivum). URL: http://www.crest.ac.uk/welcome-to-the-crest-rdms-project-blog/
Event: “Research Data Management Forum #10: RDM in the Arts and Humanities”, September 2013, St Anne's College, University of Oxford. URL: http://www.dcc.ac.uk/events/research-data-management-forum-rdmf/rdmf10-research-data-management-arts-and-humanities
Case study: Jonathan Rans (2013) “Planning for the future: developing and preserving information resources in the Arts and Humanities” URL: http://www.dcc.ac.uk/resources/developing-rdm-services/dmps-arts-and-humanities
Blog posts:
Marieke Guy (2013) “RDM in the Performing Arts” URL: http://www.dcc.ac.uk/blog/rdm-performing-arts
Elizabeth Hull (2016) “Dryad 2015 Stats Roundup” URL: https://blog.datadryad.org/2016/04/18/2015-stats-roundup/
Laura Molloy (2015) “Digital Preservation for the Arts, Social Sciences and Humanities - benefits for everyone” URL: http://www.dcc.ac.uk/blog/digital-preservation-arts-social-sciences-and-humanities-benefits-everyone
Slides: Martin Donnelly (2013) “‘Found’ and ‘after’ - a short history of data reuse in the Arts” URL: http://www.slideshare.net/martindonnelly/data-reuse-in-the-arts
25. And if all else fails?
Eat your data
Put your data in the cloud
Abandon your data at a grocery store
Put your data in an old tube sock
Put a fake mustache on your data
Teach your data how to protect itself
Poison your data
Bury your data
Don’t make any data
Ahmed Amer – Ways to Protect Your Data, McSweeney’s, November 2015
26. Thank you / Diolch
For information about the DCC:
Website: www.dcc.ac.uk
Director: Kevin Ashley
(kevin.ashley@dcc.ac.uk)
General enquiries: info@dcc.ac.uk
Twitter: @digitalcuration
My contact details:
Email: martin.donnelly@ed.ac.uk
Twitter: @mkdDCC
Slideshare:
www.slideshare.net/martindonnelly
This work is licensed under the
Creative Commons Attribution
2.5 UK: Scotland License.
Editor's Notes
I'm painting in broad strokes here, of course… data can be output from, or input to, the research process.
Deposit = archive, share, link, publish, etc
Montgomery – “The ideas I tried to capture are:
1) it's a non-linear (and perhaps multi-threaded) process
2) multiple loops or phases (not restricted to the number drawn) that may overlap are needed
3) parts of the process are ongoing
4) there's a transition between data provider and data curator somewhere in the middle of the progression; this may vary between types of data and the eventual avenue for publication and distribution”
Note that some commentators state that the whole point of management is the possibility of re-use, and that re-use is extremely common in the humanities, and indeed the social sciences.
There’s also compliance with policies – although it’s best, when talking with researchers, not to labour this point.
WILL RETURN TO THIS SLIDE LATER TO ASK THE QUESTION - WHICH OF THESE ARE MOST IMPORTANT/COMPELLING IN THE ARTS AND HUMANITIES?
Note that small, specialist institutions may not have as much in-house expertise as larger universities, and may produce a comparatively heterogeneous range of output (data) types.
QUESTION - WHICH OF THESE ARE MOST VITAL IN THE STEM subjects? Ranked in a possible order
QUESTION - WHICH OF THESE ARE MOST VITAL IN THE ARTS AND HUMANITIES? Ranked in a possible order