8. Because good research needs good data
•OPEN DATA
•LINKED DATA
•RESEARCH DATA
•ACTIVITY DATA
•SENSITIVE DATA
2012-06-18 Kevin Ashley, DCC; IWMW12; CC-BY 8
9. An explosion of data…
10. “Small science”: the long tail
•GenBank
•PDB
•UniProt
•Pfam
•CATH, SCOP (Protein Structure Classification)
•ChemSpider
•Spreadsheets, Notebooks
•Local, Lost
•Slide: Carole Goble
12. Enough of anything becomes data…
• Tag galaxy: http://taggalaxy.de/
• Amsterdam: http://vimeo.com/2312662
2009-07-21 Kevin Ashley: http://dablog.ulcc.ac.uk/ 12
13. Why care?
• Data is expensive – an investment
• Reuse:
• More research
• Teaching & Learning
• Planning
• Impact – with or without publication
• Accountability
• Legal & regulatory requirements
14. 5-year programme to realise maximum value and benefit from research data
National services, support, coordination from DCC
Focus on creating capacity and capability within research institutions
15.
Without good RDM – BAD THINGS HAPPEN
With good RDM – GOOD STUFF HAPPENS
16.
•Funder pressures
•Legal issues
•Researcher demand
•Value realisation
17. EPSRC expects all those institutions it funds
•to develop a roadmap that aligns … with EPSRC’s expectations by 1st May 2012;
•to be fully compliant … by 1st May 2015.
http://www.epsrc.ac.uk/about/standards/researchdata/Pages/expectations.aspx
18.
• Awareness of regulatory environment
• Data access statement
• Policies and processes
• Data storage
• Structured metadata descriptions
• DOIs for data
• Securely preserved for a minimum of 10 years from last use
19.
•NERC
•ESRC
•MRC
•BBSRC
•NSF
•Wellcome
•European Commission
20.
TOOLS & SERVICES
ADVICE
ON-SITE SUPPORT
22. Understanding Data Requirements
http://www.dcc.ac.uk/
23. Data management plans
24. What data to keep
How to cite data
25. Data Licensing
• Bespoke licences
• Standard licences
• Multiple licensing
• Licence mechanisms
26. Institutional Engagement
• In-depth support from team of DCC staff
• Helping with:
• Re-skilling
• Policy development
• Costing
• Use of tools
• Professional liaison
27.
•Institutional Policy
•…article in International Journal of Digital Curation
www.ijdc.net
28.
•Institutional Policy
30. What’s it to you?
• Discovery is key
• Integration of data with institutional web presence
• Highlighting, visualising the gems
• Linking what you have & what you claim credit for
• Customising our tools
31. Don’t put stuff in boxes
•Publication Repository
•Web CMS
•OER Repository
•Research Admin (CRIS)
•Our data Repository
•Their data Repository
32. Data in, data out…
Andy Todd @ building-blocks.com
33.
2011-05-12 ESYM11 - Kevin Ashley, DCC - CC-BY-SA 33
35. Why care?
• Data is expensive – an investment
• Reuse:
• More research
• Teaching & Learning
• Planning
• Impact – with or without publication
• Accountability
• Legal & regulatory requirements
36. Early results: public data archiving increases scientific contribution by one third
37. Data drives impact
• Making data:
• Findable
• Reusable
• Visible
• Shared
• Linked to publications
• … increases reuse, citation & impact
38. Letting data out to play
• http://www.globe4d.com/
39. OVER TO YOU
•Or2012.ed.ac.uk
9-13 July 2012
IDCC13
14-16 January 2013
Amsterdam
Call for papers now open
•http://slideshare.net/kevinashley
Editor's Notes
Data and the Web Manager
I’m going to speak briefly today about data and its relevance to institutional web managers.
But I’m aware I need to begin by establishing some credibility. Back in the early to mid 1990s I was part of the web team at ULCC – at one point I WAS the web team. We ran web sites for a lot of external customers – people in the education sector like UCISA
… in the government sector like OFTEL
As well as our own web site. That gave me a lot of interesting data to look at from some very different web sites. One thing I learned from that is that statistics about browser market penetration at that time were nearly all wrong. Browser market segmentation differed dramatically for different types of site. Common knowledge now, perhaps, but not then. Few people were lucky enough to have access to enough data to make such assessments.
But the thing I’m proudest of from then is this – a phone dialler page. This was produced by a colleague, Martin Powell, and me as a bit of fun and a way of showing off. It was at a time when it was very difficult to find any documentation on the syntax and function of HTML form elements, and when the ‘Common’ in ‘Common Gateway Interface’ was still being argued out. It was pointless but difficult.
We were chuffed to end up in the hall of fame of the useless web pages – officially one of the 10 most useless web pages in the world. With competition like Jason’s desk inventory this was some doing. It also drove more referrals to our site for some years than any other part of it (the nearest competition was an open learning resource on the X Window System, in the days before OER was an acronym.)
We also had company like the WPS toilet cam – finally revealed to be a hoax.
So, credibility established, what am I really here to talk about? ‘Data’ makes friends with lots of other words – some of them are here. You’ll be hearing about some of these tomorrow or on Wednesday as well as the more general topic of data visualisation. I am going to focus on research data, for reasons that I hope will become clear.
One of the reasons that things like the useless pages and our phone dialler worked back in 1994 was that the web was a much smaller place. Within a year there were hundreds of phone diallers and thousands of other examples of some of the other oddities on that list. It became too much for one person to navigate, catalogue and describe. Similar things have been happening with research data. This graph illustrates change in one field – a 100 million-fold increase over 35 years in the output of a single DNA sequencing machine. It doesn’t even take account of the huge decline in the cost of these systems. That increase way outstrips the increase in computing power that has accompanied it. The problems that result – and the opportunities – have been given names such as the ‘data deluge’ and the ‘fourth paradigm’.
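To put the 100 million-fold figure in perspective, here is a minimal back-of-envelope sketch. It assumes steady exponential growth over the 35 years, and it assumes (this is not from the talk) that computing power doubles roughly every two years; both function names are made up for illustration.

```python
import math


def doubling_time_years(fold_increase: float, years: float) -> float:
    """Years per doubling, assuming steady exponential growth."""
    return years / math.log2(fold_increase)


def fold_increase_over(years: float, doubling_years: float) -> float:
    """Total growth factor after `years` at a fixed doubling time."""
    return 2 ** (years / doubling_years)


# Figures quoted in the notes: 100 million-fold over 35 years.
sequencing_doubling = doubling_time_years(1e8, 35)
print(f"Sequencing output doubles about every {sequencing_doubling * 12:.0f} months")

# ASSUMPTION for comparison only: computing power doubles every 2 years.
computing_growth = fold_increase_over(35, 2.0)
print(f"Computing power over the same period: about {computing_growth:,.0f}-fold")
```

Under those assumptions sequencing output doubles in well under a year and a half, while computing power grows by a factor several hundred times smaller over the same period.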
But there’s huge variety still. At one end are the big-iron data projects way out of the league of most of us in this room. Somewhere in between are things of global importance that can be sustained by a few individuals. And over on the right is the long tail – lots and lots of relatively small data sets which collectively can amount to more than all the rest.
‘Data’ often brings to mind dullness, numbers, spreadsheets and databases. I know that many of you will know better, but it does no harm to remind ourselves of the possibilities even simple data offers.
When we have enough data even the boring becomes useful and beautiful. Tag galaxy combines simple searches of Flickr tags with a compelling UI which allows simple exploration of related content. I make no apologies for the fact that I’ve been using this example for years – it never loses its fascination for me nor its ability to be relevant to any place or event.
Let’s look at a communication technology, which allows individuals to send messages to each other. If we have a small number of these messages we might well be interested in analysing their content. If we have a few tens of millions of them, though, other things become interesting, as this visualisation demonstrates. It shows SMS traffic in Amsterdam in the days leading up to and following New Year’s Eve. There’s a lot of information in this visualisation and a lot more that could be done with the data behind it. Yet it was done without knowing the contents of any of the messages, nor who they were from nor who they were to. We simply know where they were sent and when. When you have enough data of a given type, the useless becomes useful.
So enough of beauty. Back to business imperatives and investment decisions. There are a number of reasons to care about research data management – I’ve listed some of the more pertinent ones here. The ones about investment and reuse are those that motivated government to put money behind a programme to improve practice in this area. Some of the latter reasons – regulatory requirements, for instance – are of more concern to institutions. Only a small number matter to the researchers that create the data. But it’s important to be aware of all the motivating factors and the ones that concern different stakeholders.
We’ve got a 5-year programme – or at least a 5-year business plan – to maximise the value and benefit of research data. The plan assumes that an investment of £24m over 5 years will realise benefits of at least £120m by the end of that time. But those benefits are realised primarily towards the end of the spending. I should stress that the bulk of that investment is not going to the DCC itself; we operate on a much smaller scale. In the meantime the focus is to create capacity and capability to manage research data effectively within institutions, with a thin layer of national services, support and coordination coming from the DCC.
I could summarise our message very simply – do this badly and bad things will happen – do it well and good things will happen. The bad things range from the annoying to imprisonment; the good things include saving money and increased research impact.
Institutions are our focus because they are the focus of many of the pressures involved – from funders, from the law and compliance officers, from researcher demand and from internal wishes to save money and increase value.
Those external pressures include those from funders such as EPSRC. Looming deadlines this year and in 2015 got the attention of senior university management.
The expectations that universities need to sign up to are listed here – their roadmaps need to demonstrate how they are going to deliver on these expectations by 2015. They include a commitment to keep data for 10 years after its last use – note, not just after the project ends. Some worry that this means they need to keep data for 100 years. I say that if your data is still being used (and cited) 100 years later you should break out the champagne, not worry about paying for it.
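The "10 years from last use" rule is easy to misread as "10 years from project end", so here is a minimal sketch of the difference. The function names, the example dates and the handling of 29 February are illustrative assumptions, not part of the EPSRC expectations or of any DCC tool.

```python
from datetime import date

RETENTION_YEARS = 10  # minimum retention period stated on the slide


def earliest_disposal_date(last_use: date, years: int = RETENTION_YEARS) -> date:
    """Earliest date the data could be disposed of, counted from its last use."""
    try:
        return last_use.replace(year=last_use.year + years)
    except ValueError:
        # ASSUMPTION: a 29 February last use rolls forward to 1 March
        # when the target year is not a leap year.
        return date(last_use.year + years, 3, 1)


def record_use(current_last_use: date, new_use: date) -> date:
    """Every recorded access or citation can only push the clock forward."""
    return max(current_last_use, new_use)


# Hypothetical dataset: project ends in 2012, data is cited again in 2020.
last_use = date(2012, 6, 18)
print(earliest_disposal_date(last_use))  # counted from project end

last_use = record_use(last_use, date(2020, 3, 1))
print(earliest_disposal_date(last_use))  # clock restarted by the later use
```

The point of the sketch is that the retention clock restarts with every use, so a dataset that keeps being reused never reaches its disposal date.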
But EPSRC aren’t the only funders with requirements. Almost every UK research council, some international funders and charitable funders also have requirements about research data. Many place the onus for compliance on the researcher, not the institution.
What we’re doing involves a mixture of tools and services, advice, and on-site support.
We provide some simple visual guidance to funder policies on our web site as well as more detailed analysis.
The tools include DAF, which helps you discover what data exists, and CARDIO, which helps you understand how well prepared the institution is for research data management services. The latter was developed jointly with colleagues from ULCC, using their expertise in maturity models for digital asset management and records management.
They also include a service to help create data management plans that comply with a variety of funder requirements. It can also be customised to institutional requirements – and it is multilingual. This tool is designed to be customisable so that it appears as an offering from an institution rather than a DCC-hosted service. This is one area where we will look for collaboration with you and your colleagues to make this effective.
We also produce guidance – much of it not specific to a UK context. I hope you’ve seen some of it. These two are of wide interest, and one was produced in collaboration with ANDS. I hope we’ll see far more of these.
The advice on data licensing is still of wide interest but inevitably has to take more account of the legal context in the UK. Although we’re supporters of open access our guidance is agnostic on the issue. It helps researchers understand how to achieve their desired ends using current legal frameworks, be that completely open data or highly protected data.
And finally we have our institutional engagement programme where we send in a team of consultants to train people, help them develop policy, use tools, and build bridges between professional groups. We also use this work to develop case studies to inform others.
When you’ve got a policy, make it public. You can read about how the University of Edinburgh developed their policy in our journal, the International Journal of Digital Curation.
Bath have released their EPSRC roadmap, of which institutional policy is just a part. They seem to prefer a pictorial approach.
You can read more about the programme on our web site.
So what relevance does this all have for you as web managers? Here are some examples. Discovery of data is key, and discovery is something you know a lot about – SEO by another name. Integrating data with your institutional web presence, highlighting the treasures, is another key task. Identifying not just what you have, but what you claim credit for, is important – not all your research outputs will live in institutional systems. And I’ve already mentioned customising our tools.
Many of those points relate to avoiding boxes and silos, the temptation to treat all these systems as ends in themselves. They aren’t – they are infrastructure supporting us to pursue the goals of universities – teaching people stuff and finding stuff out. This is the web after all – linking stuff up is its very purpose.
Here’s one example of how a company working with ESRC used a Microsoft repository system to do just that. It shows data flowing to and from a variety of systems all supporting manipulation and visualisation through Zentity, linking administrative, library, scholarly and social data.
And we mustn’t forget that we’ve already got good homes for data in a number of subject areas in the UK. Some, such as the British Atmospheric Data Centre and the British Oceanographic Data Centre, are not just national bodies, designated by NERC as appropriate homes for managed data outputs – they are international bodies with an international role. We’re lucky to have them close to home – but we can’t unilaterally decide how they operate. The Archaeology Data Service is another reminder that not all data of interest to academic research emerges from academic endeavour. Much of the data that lives there is deposited by commercial bodies, usually as a prelude to property development that will destroy archaeological heritage. The 40+ years of UKDA are a reminder that we have a long history of expertise and practical knowledge to draw on for many aspects of data curation.
Here’s another example of hiding systems to build services, from Hong Kong University. A relatively modest investment in their web site and repository systems allows them to be used to provide everything that an academic staff member could want for their personal pages and for tracking and promoting their research and teaching.
So let’s come back to those motivations I talked about earlier. One key finding is that data that is shared and visible has a big effect on the impact of the research that produced it.
One of the ways you can motivate researchers and your institution is through research like this by Heather Piwowar. She shows that public data archiving has a marked positive effect on the science related to it – increased citation rates and reuse. Her work is paralleled in many other fields. And yesterday I heard of a pan-European survey that showed that although 90% of researchers would like to deposit research data in an appropriate repository, only 10% were able to. What could you do to improve that?
Impact is of concern to funders, researchers and institutions. It isn’t just about impact in research – good data finds uses in teaching as well.
I’ll end with a wonderful example of what happens when you let your data out to play. It reminds me of points Ewan McIntosh made here in Edinburgh at the 2009 JISC conference and later at the IDCC conference in Bristol in 2011. In 2009, he criticised us for keeping our learning materials in walled gardens, behind closed doors, expecting people to visit us to find them instead of putting stuff where learners already were. Here’s a great example of letting data get out there. And he also observed what could be done with data and children by creative teaching. Instead of giving them some data and some problems to solve, he recommended letting learners define their own problems and then teaching them to use the data to solve them. The result – more empowered students with a better understanding of data manipulation and the power of open data. And a huge sense of fun. This tool makes children out of everyone.
But at that point I’ll stop for questions.