Data analysis update

Whilst we wait for all of the data from the project partners to arrive, Bryony and I have done a quick & dirty analysis of the data we’ve received so far.

The good news (touch wood!) is that we’re still on track to prove the project hypothesis:

“There is a statistically significant correlation across a number of universities between library activity data and student attainment”

The data we’ve looked at so far shows a weak Pearson correlation (in the region of -0.2), but one that is highly statistically significant (with a p-value below 0.01).

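We’re seeing a negative correlation because of the values we’ve assigned to the degree results (1=first, 2=upper second, 3=lower second, 4=third, etc.). A lower number means a better degree, so higher library usage going hand in hand with better results shows up as a negative coefficient.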
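To show how the coding of the degree results drives the sign, here is a minimal Python sketch. The loan counts and grades in it are invented purely for illustration and are not the project data; the `pearson` helper and the variable names are likewise just assumptions made for this example.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Invented illustrative data (NOT the project data):
# one entry per student, book loans alongside the coded degree result.
loans = [40, 25, 0, 30, 12, 3, 18, 0, 22, 5]
grades = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4]  # 1=first ... 4=third

r = pearson(loans, grades)

# Reversing the coding (so that a bigger number means a better degree)
# flips the sign of the coefficient but leaves its size unchanged.
flipped = [5 - g for g in grades]
r_flipped = pearson(loans, flipped)

print(f"coded 1=first:    r = {r:.3f}")
print(f"coded 4=first:    r = {r_flipped:.3f}")
```

With the 1=first coding the coefficient comes out negative, and with the reversed coding it is the same size but positive, so the sign on its own says nothing about whether library use is "good" or "bad".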

We suspect one of the reasons for the weak Pearson correlation is the level of non-usage and low usage (which is something we’ve looked at previously in Huddersfield’s data). Within each degree level, there are sizeable minorities of students who either never made use of a library service (e.g. they never borrowed any books) or who only made low use of it (e.g. they borrowed fewer than 5 books), and this seems partly responsible for lowering the Pearson correlation. However, the data shows that:

  • students who gained a first are less likely to be in that set of non & low users than those who gained a lower grade
  • students who gained the highest grades are more likely to be in the set of high library usage than those who gained lower grades
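
The comparison in the two points above boils down to working out, for each degree result, what share of students fall into the non & low usage set. A rough Python sketch of that calculation follows. The records are made up for illustration only, the "fewer than 5 books" cut-off is taken from the example in this post, and the function name is hypothetical.

```python
from collections import defaultdict

LOW_USE_THRESHOLD = 5  # "fewer than 5 books", as in the example in the post

def low_use_share_by_grade(records, threshold=LOW_USE_THRESHOLD):
    """Share of students per grade whose loan count is below the threshold.

    records: iterable of (grade, loans) pairs, with grade coded 1=first, etc.
    """
    totals = defaultdict(int)
    low = defaultdict(int)
    for grade, loans in records:
        totals[grade] += 1
        if loans < threshold:  # covers non-users (0 loans) and low users
            low[grade] += 1
    return {grade: low[grade] / totals[grade] for grade in sorted(totals)}

# Invented illustrative records (NOT the project data).
records = [
    (1, 40), (1, 25), (1, 0), (1, 31),
    (2, 30), (2, 12), (2, 3), (2, 0),
    (3, 18), (3, 0), (3, 2), (3, 1),
]

shares = low_use_share_by_grade(records)
for grade, share in shares.items():
    print(f"grade {grade}: {share:.0%} non or low users")
```

The same grouping could be repeated with a cut-off for high usage, though what counts as "high" would need to be agreed on first.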
