Research Data in the Arts and Humanities: A Few Tricky Questions
1. Research Data in the Arts and Humanities: a few tricky questions
Tom Phillips, A Humument (1970, 1986, 1998, 2004, 2012…)
Martin Donnelly, Digital Curation Centre, University of Edinburgh (and the FOSTER project)
OCEAN launch event, University of Warsaw, 14 December 2015
2. About the DCC
• The UK’s centre of expertise in digital preservation and data management, established in 2004
• Provide guidance, training, tools and other services on all aspects of research data management
• Organise national and international events and webinars (International Digital Curation Conference, Research Data Management Forum)
• Our primary audience has been the UK higher education sector, but we increasingly work further afield (Europe, North America, Australia, South Africa) and in new sectors (government, commercial, etc)
• Involved in various European projects and initiatives, including FOSTER, OpenAIRE and EUDAT
• Now offering tailored consultancy/training services
3. Overview of talk
1. What do we mean by “research data (management)”?
2. Why is it different in the arts and humanities?
3. What can we do to make things better?
4. What is research data management?
“the active management and appraisal of data over the lifecycle of scholarly and scientific interest”
5. The old way of doing research (science)
1. Researcher collects data (information)
2. Researcher interprets/synthesises data
3. Researcher writes paper based on data
4. Paper is published (and preserved)
5. Data is left to benign neglect, and eventually ceases to be accessible
6. The new way of doing research (science)
Plan, Collect, Assure, Describe, Preserve, Discover, Integrate, Analyze
DEPOSIT …and RE-USE
The DataONE lifecycle model
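As a rough illustration only, the eight stage names listed on this slide could be modelled as an ordered enumeration. The names `Stage` and `next_stage` are hypothetical, and the wrap-around from the last stage back to the first is an assumption that the model is read as a cycle; it is a sketch, not DataONE's own tooling.

```python
from enum import Enum


class Stage(Enum):
    """The eight stage names shown on the slide, in the order listed."""
    PLAN = 1
    COLLECT = 2
    ASSURE = 3
    DESCRIBE = 4
    PRESERVE = 5
    DISCOVER = 6
    INTEGRATE = 7
    ANALYZE = 8


def next_stage(stage: Stage) -> Stage:
    """Return the following stage, wrapping from ANALYZE back to PLAN.

    The wrap-around is an assumption that the model is read as a cycle.
    """
    members = list(Stage)
    return members[(members.index(stage) + 1) % len(members)]


print(next_stage(Stage.PLAN).name)
print(next_stage(Stage.ANALYZE).name)
```

Contrast this with the "old way" of doing research, which would be a one-way list that simply stops after publication.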
7. N.B. other models are available…
Ellyn Montgomery, US Geological Survey
8. What’s “normal” is shifting…
Data management is a part of good research practice.
- RCUK Policy and Code of Conduct on the Governance of Good Research Conduct
9. Reminder: key drivers and benefits of RDM
• TRANSPARENCY: The evidence that underpins research can be made open for anyone to scrutinise, and attempt to replicate the findings of others.
• EFFICIENCY/VfM: Data collection can be funded once, and used many times for a variety of purposes.
• SPEED: Data can be accessed more quickly. In some disciplines, such as climate science, this is vital.
• RISK MANAGEMENT: A pro-active approach to data management reduces the risk of inappropriate disclosure of sensitive data, whether commercial or personal.
• PRESERVATION: Lots of data is unique, and can only be captured once. If lost, it can’t be replaced.
10. So what is ‘data’ exactly?
• Definitions vary from discipline to discipline, and from funder to funder…
• Here’s a science-centric definition:
  • “The recorded factual material commonly accepted in the scientific community as necessary to validate research findings.” (US Office of Management and Budget, Circular A-110)
  • [Addendum: This policy applies to scientific collections, known in some disciplines as institutional collections, permanent collections, archival collections, museum collections, or voucher collections, which are assets with long-term scientific value. (US Office of Science and Technology Policy, Memorandum, 20 March 2014)]
• And another from the visual arts:
  • “Evidence which is used or created to generate new knowledge and interpretations. ‘Evidence’ may be intersubjective or subjective; physical or emotional; persistent or ephemeral; personal or public; explicit or tacit; and is consciously or unconsciously referenced by the researcher at some point during the course of their research.” (Leigh Garrett, KAPTUR project: see http://kaptur.wordpress.com/2013/01/23/what-is-visual-arts-research-data-revisited/)
11. Scientific and other methods…
• The scientific method is a body of techniques for investigating phenomena, acquiring new knowledge, or correcting and integrating previous knowledge.
• To be termed scientific, a method of inquiry must be based on empirical and measurable evidence subject to specific principles of reasoning.
• The Oxford English Dictionary defines the scientific method as: “a method or procedure that has characterized natural science since the 17th century, consisting in systematic observation, measurement, and experiment, and the formulation, testing, and modification of hypotheses.”
• Source: http://en.wikipedia.org/wiki/Scientific_method

An art methodology differs from a science methodology, perhaps mainly insofar as the artist is not always after the same goal as the scientist. In art it is not necessarily all about establishing the exact truth so much as making the most effective form (painting, drawing, poem, novel, performance, sculpture, video, etc.) through which ideas, feelings, perceptions can be communicated to a public. With this purpose in mind, some artists will exhibit preliminary sketches and notes which were part of the process leading to the creation of a work. Sometimes, in Conceptual art, the preliminary process is the only part of the work which is exhibited, with no visible end result displayed. In such a case the "journey" is being presented as more important than the destination.
Source: http://en.wikipedia.org/wiki/Art_methodology
12. A few tricky quotations
• “A work is never completed except by some accident such as weariness, satisfaction, the need to deliver, or death: for, in relation to who or what is making it, it can only be one stage in a series of inner transformations” – Paul Valéry
• Paraphrased by Auden as “A work of art is never completed, only abandoned”
• “You could not step twice into the same river” – Heraclitus, as reported by Plato (via Socrates)
• “In science, one man’s noise is another man’s signal” – Edward Ng
• ‘Truth?’ said Pilate. ‘What is that?’ – John 18:38
• “What is truth? said jesting Pilate, and would not stay for an answer” – Sir Francis Bacon
13. Strengths and weaknesses re. data in the Arts and Humanities (I)
• There’s nothing new about data re-use in the Arts and Humanities; it’s an integral part of the culture, and always has been…
• Think Shakespeare’s plots, Kristeva’s intertextuality, Barthes’ “galaxy of signifiers”, found sounds/objects/poems (e.g. Duchamp, Morgan), variations on a theme, collage and intermedia art, T.S. Eliot, DJ culture (sampling/breakbeat), etc etc
• However, it’s often more fraught than data re-use in other areas (e.g. the Physical Sciences, if not the Social Sciences). Some characteristics of Arts and Humanities data are likely to require a different kind of handling from that afforded to other disciplines
• For starters, people do not always think of their sources/influences/outputs as ‘data’, and the value and referencing systems (and norms) may be quite different…
14. Strengths and weaknesses re. data in the Arts and Humanities (II)
• Digital ‘data’ emerging in the Arts is as likely to be an outcome of the creative research process as an input to a workflow (e.g. the UK AHRC policy)
• Furthermore, practice/praxis based research is more or less the sole preserve of the Humanities, and research/production methods are not always rigorously methodical or linear. This is at odds with the scientific approach, and the way in which most RDM resources are described/defined/oriented
• Arts ‘data’ is often personal, and creative data in particular may not be factual in nature. What matters most may not be the content itself, but rather the presentation, the arrangement, the quality of expression…
• This variance in emphasis tends to be the reason why Open Access embargoes are often longer in the Arts and Humanities than in other areas
• Creative researchers also care a great deal about the way in which their work is presented, or ‘showcased’: standard repository installations don’t cut it!
• What do Arts and Science data have in common? Both may be financially valuable and/or precious to their creators
15. A few tricky questions around data in the Arts and Humanities
• Are the goals – or indeed the concepts – of evidence, facts, validation, replication still central in disciplines which tend towards subjectivity, interpretation, argument and quality of expression?
• How do we identify, preserve and share ephemera, emotions, the unconscious…? How do we protect rights around creative data? What are the financial/ownership issues accompanying creative/Arts research?
• Is it clear where creative research begins and ends? How can we draw a line between funded research and unfunded personal work?
• What complexities are introduced by practice-based research?
• To what extent is non-digital material a problem? Can we share approaches to this with other subject areas (e.g. biology, geology), remembering that “the map is not the land”? (Korzybski)
• What other characteristics do Arts and Humanities data have in common with those of the Sciences? Which other disciplines share these issues more generally?
• Is the perfect the enemy of the good?
16. Archiving issues around Arts and Humanities data
• Business case (“could anyone die or go to jail?”)
  1. The law: data protection
  2. Policy: retention and embargo periods
  3. Financial/cultural benefit
• Commercial considerations and IPR… personal data?
• Access arrangements/digitisation. Demand for digitisation/archiving may outstrip capacity/budgets…
• Metadata creation (NISO types): descriptive (for discovery), administrative (for reuse), structural (for inter-relating objects) – obviously producing metadata also costs money/effort
• Multiplicity of (file) formats and creation/storage media
• Linking analogue and digital, structuring collections
• Most disciplinary repositories support a limited set of recommended file formats/object types
• Scope
• Respect des fonds? Ownership/IP issues may make this tricky
• Scale
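To make the three NISO metadata types mentioned on this slide a little more concrete, here is a minimal sketch of a record that keeps them apart. The class name `ArchivedItem`, its field contents and the `missing_types` helper are all hypothetical, chosen for illustration; no particular repository system or schema is implied.

```python
from dataclasses import dataclass, field


@dataclass
class ArchivedItem:
    """One archived object carrying the three NISO metadata types."""
    # Descriptive metadata: supports discovery.
    descriptive: dict = field(default_factory=dict)
    # Administrative metadata: supports reuse (e.g. rights, embargo).
    administrative: dict = field(default_factory=dict)
    # Structural metadata: how this object relates to other objects.
    structural: dict = field(default_factory=dict)

    def missing_types(self) -> list:
        """List the metadata types that are still empty for this item."""
        names = ("descriptive", "administrative", "structural")
        return [name for name in names if not getattr(self, name)]


# Hypothetical example: a sketchbook that has been described, but has no
# rights information or links to related objects yet.
sketchbook = ArchivedItem(
    descriptive={"title": "Sketchbook 3", "creator": "Example Artist"},
)
print(sketchbook.missing_types())
```

A check like this gives a crude indication of how much metadata effort is still outstanding for an item, which is where the cost point above tends to bite.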
17. Possible discussion points
• Need – what do we need to archive? Is it always evidence without which the research outcomes are in doubt?
• Want – do we want to archive materials for other reasons? Does preserving early/developmental work provide a richer experience/understanding of the creative work and process? How do we make a business case for this?
• Liminality
• Many creative researchers are on fractional contracts, and there is not always a clear delineation between professional work and personal practice. Where and how do we locate the line?
• More practically, the same notebook or sketchbook may be used for both professional and personal purposes. Its contents may be messy, personal, confusing…
• Is a work ever finished, or just abandoned? (Valéry) How do we know? Sometimes early versions are equally (or more) valuable… (Munch’s “Scream”, Blondie’s “Out In The Streets”)
• How much time/effort does (potentially) sensitive creative ‘data’ require in order to be prepared for archiving? How do we know when it’s worth it?
18. Reprise: key drivers and benefits of RDM
• TRANSPARENCY: The evidence that underpins research can be made open for anyone to scrutinise, and attempt to replicate the findings of others.
• EFFICIENCY/VfM: Data collection can be funded once, and used many times for a variety of purposes.
• SPEED: Data can be accessed more quickly. In some disciplines, such as climate science, this is vital.
• RISK MANAGEMENT: A pro-active approach to data management reduces the risk of inappropriate disclosure of sensitive data, whether commercial or personal.
• PRESERVATION: Lots of data is unique, and can only be captured once. If lost, it can’t be replaced.
(Numbered labels on the slide, as extracted: 1, 2, 5, 3, 4)
19. What can we do?
• Be careful with our terminology
• “Data” – be clear that this is not the dictionary definition, but rather shorthand for a variety of scholarly products/by-products (see www.researchobject.org for examples)
• Don’t use “science” and “research” interchangeably. Challenge those who do… (cf. Jan’s Wissenschaft example)
• Be mindful of the sometimes blurred lines between professional investigation and personal expression
• Talk to researchers: understand their working methods, discover their needs, assuage their fears
• Build bridges before they’re needed
• Accept that not everything needs to be archived – prioritise!
20. Further reading and links
• Paper: Marieke Guy, Martin Donnelly, Laura Molloy (2013) “Pinning It Down: Towards a Practical Definition of ‘Research Data’ for Creative Arts Institutions”, International Journal of Digital Curation, Vol. 8, No. 2, pp. 99-110. URL: doi:10.2218/ijdc.v8i2.275
• Projects:
  • KAPTUR (2011-13) URL: http://www.vads.ac.uk/kaptur/
  • A consortial approach to building an integrated RDM system (2014-16) (Partners: CREST, University for the Creative Arts, ULCC, Leeds Trinity University, Arkivum). URL: http://www.crest.ac.uk/welcome-to-the-crest-rdms-project-blog/
• Event: “Research Data Management Forum #10: RDM in the Arts and Humanities”, September 2013, St Anne's College, University of Oxford. URL: http://www.dcc.ac.uk/events/research-data-management-forum-rdmf/rdmf10-research-data-management-arts-and-humanities
• Case study: Jonathan Rans (2013) “Planning for the future: developing and preserving information resources in the Arts and Humanities” URL: http://www.dcc.ac.uk/resources/developing-rdm-services/dmps-arts-and-humanities
• Blog posts:
  • Marieke Guy (2013) “RDM in the Performing Arts” URL: http://www.dcc.ac.uk/blog/rdm-performing-arts
  • Laura Molloy (2015) “Digital Preservation for the Arts, Social Sciences and Humanities - benefits for everyone” URL: http://www.dcc.ac.uk/blog/digital-preservation-arts-social-sciences-and-humanities-benefits-everyone
• Slides: Martin Donnelly (2013) “‘Found’ and ‘after’ - a short history of data reuse in the Arts” URL: http://www.slideshare.net/martindonnelly/data-reuse-in-the-arts
21. Thank you / Dziękuję
• For information about the DCC:
  • Website: www.dcc.ac.uk
  • Director: Kevin Ashley (kevin.ashley@dcc.ac.uk)
  • General enquiries: info@dcc.ac.uk
  • Twitter: @digitalcuration
• For information about the FOSTER project:
  • Website: www.fosteropenscience.eu
  • Principal investigator: Eloy Rodrigues (eloy@sdum.uminho.pt)
  • General enquiries: Gwen Franck (gwen.franck@eifl.net)
  • Twitter: @fosterscience
• My contact details:
  • Email: martin.donnelly@ed.ac.uk
  • Twitter: @mkdDCC
  • Slideshare: www.slideshare.net/martindonnelly

This work is licensed under the Creative Commons Attribution 2.5 UK: Scotland License.
Slide 3 image credits: score, linked open data, rhizomatic network