Category Archives: Legal issues

LIDP Toolkit: Phase 2

We are starting to wrap up the loose ends of LIDP 2. You will have seen some bonus blogs from us today, and we have more about reading lists and focus groups to come – plus more surprises!

Here is something we said we would do from the outset – a second version of the toolkit to reflect the work we have done in Phase 2 and to build on the Phase 1 Toolkit:

Stone, Graham and Collins, Ellen (2012) Library Impact Data Project Toolkit: Phase 2. Manual. University of Huddersfield, Huddersfield.

The second phase of the Library Impact Data Project set out to explore a number of relationships between undergraduate library usage, attainment and demographic factors. There were six main work packages:

  1. Demographic factors and library usage: testing to see whether there is a relationship between demographic variables (gender, ethnicity, disability, discipline etc.) and all measures of library usage;
  2. Retention vs non-retention: testing to see whether there is a relationship between patterns of library usage and retention;
  3. Value added: using UCAS entry data and library usage data to establish whether use of library services has improved outcomes for students;
  4. VLE usage and outcome: testing to see whether there is a relationship between VLE usage and outcome (subject to data availability);
  5. MyReading and Lemon Tree: planning tests to see whether participation in these social media library services had a relationship with library usage;
  6. Predicting final grade: using demographic and library usage data to try and build a model for predicting a student’s final grade.

This toolkit explains how we reached our conclusions in work packages 1, 2 and 6 (the conclusions themselves are outlined on the project blog). Our aim is to help other universities replicate our findings. Data were not available for work package 4, but should these data become available, they can be tested in the same way as in the first phase of the project, or in the same way as the correlations outlined below. Work package 6 was also a challenge in terms of data; we made some progress, but not enough to present full results.

The toolkit aims to give general guidelines about:

1. Data Requirements
2. Legal Issues
3. Analysis of the Data
4. Focus Groups
5. Suggestions for Further Analysis
6. Release of the Data

The Final Blog Post

It has been a short but extremely productive 6 months for the Library Impact Data Project Team. Before we report on what we have done and look to the future, we have to say a huge thank you to our partners. We thought we would be taking a lot on at the start of the project in getting eight universities to partner in a six month project; however, it has all gone extremely smoothly and as always everyone has put in far more effort and work than originally agreed. So thanks go to all the partners, in particular:

Phil Adams, Leo Appleton, Iain Baird, Polly Dawes, Regina Ferguson, Pia Krogh, Marie Letzgus, Dominic Marsh, Habby Matharoo, Kate Newell, Sarah Robbins, Paul Stainthorp

Also to Dave Pattern and Bryony Ramsden at Huddersfield.

So did we do what we said we would do?

Is there a statistically significant correlation across a number of universities between library activity data and student attainment?

The answer is a YES!

There is a statistically significant relationship between student attainment and both book loans and e-resource use. And this is true across all of the universities in the study that provided data in these areas. In some cases this was more significant than in others, but our statistical testing shows that you can believe what you see when you look at our graphs and charts!

Where we didn’t find statistical significance was in entries to the library: although it looks like there is a difference between students with a 1st and those with a 3rd, there is no overall significance. This is not surprising, as many of us have group study facilities, lecture theatres, cafes and student services in the library; a student is therefore just as likely to be entering the library for these reasons as for studying purposes.

We want to stress here again that we realise THIS IS NOT A CAUSAL RELATIONSHIP!  Other factors make a difference to student achievement, and there are always exceptions to the rule, but we have been able to link use of library resources to academic achievement.

So what is our output?

Firstly, we have provided all the partners in the project with short library director reports and are in the process of sending out longer in-depth reports. Regrettably, due to the nature of the content of these reports, we cannot share this data; however, we are in the process of anonymising partners’ graphs in order to release charts of averaged results for general consumption.

Furthermore, we are planning to release the raw data from each partner for others to examine. Data will be released under an Open Data licence at

Finally, we have been astonished by how much interest there has been in our project. To date we have two articles ready for imminent publication and another two in the pipeline. In addition, by the end of October we will have delivered 11 conference papers on the project. All articles and conference presentations are accessible at:

Next steps

Although this project has had a finite goal in proving or disproving the hypothesis, we would now like to go back to the original project which provided the inspiration. This was to seek to engage low/non users of library resources and to raise student achievement by increasing the use of library resources.
This has certainly been a popular theme in questions at the SCONUL and LIBER conferences, so we feel there is a lot of interest in this in the library community. Some of these ideas have also been discussed at the recent Business Librarians Association Conference.

There are a number of ways of doing this, some based on business intelligence and others based on targeting staffing resources. However, we firmly believe that although there is a business intelligence string to what we would like to take forward, the real benefits will be achieved by actively engaging with the students to improve their experience. We think this could be covered in a number of ways.

  • Gender and socio-economic background? This came out in questions from library directors at SCONUL and LIBER. We need to re-visit the data to see whether there are any effects of gender, nationality (UK, other European and international could certainly be investigated) and socio-economic background on use and attainment.
  • We need to look into what types of data are needed by library directors, e.g. for the scenario ‘if budget cuts result in fewer resources, does attainment fall?’ The Balanced Scorecard approach could be used for this.
  • We are keen to see if we add value as a library through better use of resources, and we have thought of a number of possible scenarios we would like to investigate further:
    • Does a student who comes in with high grades leave with high grades? If so why? What do they use that makes them so successful?
    • What if a student comes in with lower grades but achieves a higher grade on graduation after using library resources? What did they do to show this improvement?
    • Quite often students who look to be heading for a 2nd drop to a 3rd in the final part of their course. Why is this so?
    • What about high achievers that don’t use our resources? What are they doing in order to be successful and should we be adopting what they do in our resources/literacy skills sessions?
  • We have not investigated VLE use, and it would be interesting to see if this had an effect
  • We have set up meetings with the University of Wollongong (Australia) and Mary Ellen Davis (executive director of ACRL) to discuss the project further. In addition we have had interest from the Netherlands and Denmark for future work surrounding the improvement of student attainment through increased use of resources

In respect to targeting non/low users we would like to achieve the following:

  • Find out what students on selected ‘non/low use’ courses think, in order to understand why students do not engage
  • To check the amount and type of contact subject teams have had with the specific courses to compare library hours to attainment (poor attainment does not reflect negatively on the library support!)
  • Use data already available to see if there is correlation across all years of the courses. We have some interesting data on course year: some courses have no correlation in year one with final grade, but others do. By delving deeper into this we could target our staffing resources more effectively to help students at the point of demand.
    • To target staffing resources
  • Begin profiling by looking at reading lists
    • To target resource allocation
    • Does use of resources + wider reading lead to better attainment – indeed, is this what high achievers actually do?
  • To flesh out themes from the focus groups to identify areas for improvement
    • To target promotion
    • Tutor awareness
    • Inductions etc.
  • Look for a connection between selected courses and internal survey results/NSS results
  • Create a baseline questionnaire or exercise for new students to establish level of info literacy skills
    • Net Generation students tend to overestimate their own skills and then demonstrate poor critical analysis once they get onto resources.
    • Use to inform use of web 2.0 technologies on different cohorts, e.g. health vs. computing
  • Set up new longitudinal focus groups or re-interview groups from last year to check progress of project
  • Use data collected to make informed decisions on stock relocation and use of space
  • Refine data collected and impact of targeted help
  • Use this information to create a toolkit which will offer best practice to a given profile
    • E.g. scenario based

Ultimately our goal will be to help increase student engagement with the library and its resources, which, as we can now demonstrate, is linked to better attainment. This work would also have an impact on library resources, by helping to target our precious staff resources in the right place at the right time and to make sure that we are spending limited funds on the resources most needed to help improve student attainment.

How can others benefit?

There has been a lot of interest from other universities throughout the project. Some universities may want to take our research as proof in itself and just look at their own data; we have provided instructions on how to do this at We will also make available the recipes written with the Synthesis project in the documentation area of the blog, and we will be adding specific recipes for different library management systems in the coming weeks:

For those libraries that want to do their own statistical analysis: this was a complex issue for the project, particularly given the nature of the data we could obtain versus the nature of the data required to specifically find correlations. As a result, we used the Kruskal-Wallis (KW) test, designed to measure whether there are differences between groups of non-normally distributed data. To confirm non-normal distribution, a Kolmogorov-Smirnov test was run first. Because KW does not tell us where any differences lie, the Mann-Whitney test was then used on specific pairings of degree results, selected on the basis of visual data represented in boxplot graphs. The number of Mann-Whitney tests has to be limited, because the more tests conducted, the stricter the required significance threshold; we therefore limited them to three, at a required significance value of 0.0167 (5% divided by 3). Once the Mann-Whitney tests had been conducted, the effect size of each difference was calculated. All tests other than effect size were run in PASW 18; effect size was calculated manually. We are aware that the size of the samples we are dealing with could have indicated relationships where none exist, but we feel our visual data demonstrates relationships that are confirmed by the analytics, and thus that we have a stable conclusion in discarding the null hypothesis that there is no relationship between library use and degree result.
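For readers who prefer open tools, the sequence of tests described above can be sketched as follows. This is a minimal illustration using SciPy rather than PASW/SPSS, run on invented data: the group sizes, loan counts and variable names are all hypothetical, not the project's actual figures.

```python
# Sketch of the project's testing sequence on hypothetical data:
# Kolmogorov-Smirnov (normality check), Kruskal-Wallis (any difference?),
# Mann-Whitney on selected pairs with a tightened alpha, then effect size.
import math
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Hypothetical item-loan counts for three degree-result groups
loans = {
    "first": rng.poisson(40, 200),
    "upper_second": rng.poisson(30, 300),
    "third": rng.poisson(15, 150),
}

# 0. Kolmogorov-Smirnov check against a fitted normal, to justify
#    using non-parametric tests on these data.
ks_stat, ks_p = stats.kstest(
    loans["first"], "norm",
    args=(loans["first"].mean(), loans["first"].std()),
)

# 1. Kruskal-Wallis: are there differences anywhere across the groups?
h, kw_p = stats.kruskal(*loans.values())

# 2. Follow-up Mann-Whitney test on one selected pairing, with the
#    significance threshold tightened to 0.05 / 3 for three planned tests.
alpha = 0.05 / 3  # i.e. 0.0167, as in the project
u, mw_p = stats.mannwhitneyu(loans["first"], loans["third"])

# 3. Effect size r = |Z| / sqrt(N), calculated by hand from the U statistic
#    (ignoring tie corrections, which the project's manual calculation may
#    have handled differently).
n1, n2 = len(loans["first"]), len(loans["third"])
n = n1 + n2
mu = n1 * n2 / 2
sigma = math.sqrt(n1 * n2 * (n + 1) / 12)
z = (u - mu) / sigma
r = abs(z) / math.sqrt(n)

print(f"KW p={kw_p:.4g}, MW p={mw_p:.4g} (alpha={alpha:.4f}), r={r:.2f}")
```

On strongly separated groups like these, the Kruskal-Wallis and Mann-Whitney p-values fall well below the corrected threshold and the effect size is large; on real library data the picture will be noisier, which is exactly why the boxplots are worth inspecting before choosing which pairings to test.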

Full instructions on how the tests were run will first be made available to partner institutions and then disseminated publicly through a toolkit in July/August.

Lessons we learned during the project

The three major lessons learned were:

Forward planning for the retention of data. Make sure all your internal systems and people are communicating with each other. Do not delete data without first checking that other parts of the University require the data. Often this appears to be based on arbitrary decisions and not on institutional policy. You can only work with what you’re able to get!

Beware e-resources data. We always made it clear that the data we were collecting for e-resource use was questionable; during the project we found that much of this data is not collected in the same way across a single institution, let alone eight! Athens, Shibboleth and EZProxy data may all be handled differently – some may not be collected at all. If others find no significant relationship between e-resource data and attainment, they should dig deeper into their data before accepting the outcome.
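As an illustration of the reconciliation this implies, the sketch below combines per-student login counts from several authentication routes into a single figure, and treats a source that is not collected at all as explicitly missing rather than silently zero. The function name, student IDs and counts are all hypothetical; real Athens, Shibboleth or EZProxy logs would first need parsing into per-student counts.

```python
# Hypothetical sketch: merge e-resource login counts collected via
# different authentication systems into one per-student total.
from collections import defaultdict

def combine_eresource_counts(*sources):
    """Sum per-student login counts across auth systems.

    Each source is a dict {student_id: login_count}. A source given as
    None means that system's data was not collected at all - in that
    case it is skipped rather than counted as zero logins.
    """
    available = [s for s in sources if s is not None]
    if not available:
        # No data collected anywhere: reporting zeros would be misleading.
        raise ValueError("no e-resource data collected")
    totals = defaultdict(int)
    for source in available:
        for student, count in source.items():
            totals[student] += count
    return dict(totals)

athens = {"s001": 12, "s002": 0}
shibboleth = {"s001": 5, "s003": 9}
ezproxy = None  # this institution does not collect EZProxy data

print(combine_eresource_counts(athens, shibboleth, ezproxy))
# s001 appears in two systems, so its logins are summed
```

The key design point is the distinction between "zero logins" and "data not collected": conflating the two is precisely what could produce a spurious non-result for e-resource use.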

Legal issues. For more details on this lesson, see our earlier blog on the legal stuff

Final thoughts

Although this post is labelled the final blog post, we will be back!

We are adding open data in the next few weeks and during August we will be blogging about the themes that have been brought out in the focus groups.

The intention is then to use this blog to talk about specific issues we come across with data etc. as we carry our findings forward. At our recent final project meeting, it was agreed that all 8 partners would continue to do this via the blog.

Finally a huge thank you to Andy McGregor for his support as Programme Manager and to the JISC for funding us.

The legal stuff…

One of the big issues for the project so far has been ensuring that we abide by legal regulations and restrictions.  The data we intend to utilise for our hypothesis is sensitive on a number of levels, and we have made efforts to ensure full anonymisation of both students and universities (should our collaborators choose to remain anonymous).  We contacted JISC Legal prior to data collection to confirm that our procedures were appropriate, and additionally liaised with our Records Manager and the University’s legal advisor.

Our data involves tying up students’ degree results with their borrowing history (i.e. the number of books borrowed), the number of times they entered the library building, and the number of times they logged into electronic resources.  In retrieving data we have ensured that any identifying information is excluded before it is handled for analysis.  We have also excluded any small courses to prevent identification of individuals, e.g. where a course has fewer than 35 students and/or fewer than 5 students at a specific degree level.
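The small-course exclusion rule could be applied as in the following sketch, assuming the records have already had identifying information stripped. The function name, thresholds' constant names and record layout are ours, not the project's.

```python
# Hypothetical sketch of the exclusion rule: drop records from any course
# with fewer than 35 students, or from any (course, degree result) group
# with fewer than 5 students, to reduce the risk of re-identification.
from collections import Counter

MIN_COURSE_SIZE = 35
MIN_GRADE_GROUP = 5

def filter_identifiable(records):
    """records: list of (course, degree_result) tuples, IDs already removed."""
    course_sizes = Counter(course for course, _ in records)
    group_sizes = Counter(records)  # counts each (course, result) pair
    return [
        (course, result)
        for course, result in records
        if course_sizes[course] >= MIN_COURSE_SIZE
        and group_sizes[(course, result)] >= MIN_GRADE_GROUP
    ]

# Example: course A has 40 students (36 with a 2:1, 4 with a 1st);
# course B has only 10 students in total.
records = [("A", "2:1")] * 36 + [("A", "1st")] * 4 + [("B", "2:1")] * 10
kept = filter_identifiable(records)
print(len(kept))  # only course A's 2:1 group survives both thresholds
```

Note that both thresholds are checked per record, so a large course can still lose its rare degree-result groups – the case most likely to identify an individual.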

To notify library and resource users of our data collection, we referred to another data project, EDINA, which provides the following statement for collaborators to use on their webpages:

“When you search for and/or access bibliographic resources such as journal articles, your request may be routed through the UK OpenURL Router Service, which is administered by EDINA at the University of Edinburgh.  The Router service captures and anonymises activity data which are then included in an aggregation of data about use of bibliographic resources throughout UK Higher Education (UK HE).  The aggregation is used as the basis of services for users in UK HE and is made available to the public so that others may use it as the basis of services.  The aggregation contains no information that could identify you as an individual.”

Focus groups have also been conducted with a briefing and a consent form, to ensure participants are fully aware of how data from the group will be used and of their anonymisation, and to advise them that they can leave the group at any point.