Thursday, 25 August 2016

Big Data - What we're finding out about a State-wide collection

Now that the SA public library network has all of its library branches using a shared Library Management System we can begin to do some analysis on both the composition of the State's public library holdings, and customer use of this collection.

We have started running some reports which show us which communities are making the most use of the collections of other libraries, and which libraries are supplying the items to fulfil that demand. We're also looking at collection sizes & how they compare to the size of the communities they're meant to be serving.

We have also shared the dataset of our collection, to be used by students and others who're interested in analysing our collections and also in finding new ways to represent the data using interesting tools.

A university student, Keren Sutcliffe, was interested in looking at our Non-Fiction holdings & also in using some visualisation tools to show the data. While I haven't had time to dig deeply into the data, the ways in which it is presented here are fascinating.

Rather than looking at standard columns & rows in spreadsheets, or even the standard graphs produced by Excel, these representations are really engaging. They're also interactive. You can hold your cursor over a part of the display to get more data. While this isn't new in itself, it is interesting to use on these data representations. For example, while it is easy to see that items with the Dewey number 641 are the largest single collection libraries hold, hovering over this square tells you that 641 is the Dewey number for food & drink, and that we hold 15,880 titles and 54,733 items with that Dewey number. And at the other end of the scale I can find that for the Dewey number 497 (North American native languages) we hold 3 titles and 4 items.

There is also a good, simple bar graph which shows how many titles and copies we have in each Dewey hundred group. This one shows us at a glance that our largest collection in this area is the 600's - Technology & Applied Sciences, where libraries hold 77,586 titles and 236,651 copies, at a ratio of almost exactly 3 copies per title. By contrast our 800's - Literature & Rhetoric, while not the smallest group at 19,293 titles and 45,507 copies, has a ratio of 2.36 copies per title.
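For anyone who wants to check the arithmetic, here is a minimal sketch of the copies-per-title calculation using only the two sets of figures quoted above. The dictionary layout and the function name are my own assumptions for illustration, not anything taken from the visualisation tool.

```python
# Titles and copies for two Dewey hundred groups, as quoted in this post.
holdings = {
    "600s Technology & Applied Sciences": {"titles": 77_586, "copies": 236_651},
    "800s Literature & Rhetoric": {"titles": 19_293, "copies": 45_507},
}


def copies_per_title(titles: int, copies: int) -> float:
    """Average number of copies held for each title (hypothetical helper)."""
    return copies / titles


# Round to two decimal places, matching the precision used in the post.
ratios = {
    group: round(copies_per_title(**counts), 2)
    for group, counts in holdings.items()
}

for group, ratio in ratios.items():
    print(f"{group}: {ratio} copies per title")
```

The same function could be run over every Dewey hundred group once the full dataset is to hand.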

This is all very interesting, but not overly useful at this stage. However, we were talking in the office about this yesterday & we're thinking that the next couple of datasets we could look at would be the lending patterns of the NF collection & then see whether our collecting patterns reflect demand. We could then use these visualisation tools to show the "hardest working" parts of our collection.

However it is always difficult to draw firm conclusions from such information.  Are people borrowing items because they're there & on the shelf, or do they have demand for more content in some subject areas, but libraries don't have sufficient stock in these areas?  We can't see unfulfilled demand. 

The good news is that we have the data & the tools, as well as access to people with the skills to provide us with these sorts of representations.  The more complex bit will be both the analysis of the data, and then trying to see patterns and causes. 

So, if you're interested in some high level collection overviews and you want to see the "ground floor" of our collection analysis journey then this is really worth looking at.

Tuesday, 3 May 2016

Database cleanup work (Technical)

Alert – this post contains some very important information about improved user experience – but also some technical stuff that not all of us will fully understand.  (Much of it has been written by Di Cranwell our cataloguing guru!)

PLS has engaged the services of the Database experts at SirsiDynix to complete a Database clean-up that is different from our usual one. This process will be followed by a load of Library of Congress Subject, Author and Series headings. These processes were originally scheduled for May, but given our recent network instability we have postponed them until the next available “window” in the LMS schedule to undertake this work. The work has therefore been rescheduled for August.

All Database clean up activity has a primary objective of improving the user experience in using the system – whether the users are library staff or customers. So this is something that is uppermost in the minds of many who use the system. We are aware that the database continues to contain records from various pre-One Card local cataloguing conventions, some of which impact on the user experience. For instance different libraries had their own conventions for the use of GMDs (general media designators) in Titles. Some used [Music CD], while others used [music – cd], and other errors crept in such as [musci CD]. Because these diverse terms were in titles the same items with different GMDs could not be matched by our previous automated database de-duplication processes. However our new database experts have found a way to map and change all erroneous GMDs to standard ones. Oh – and BTW the standard for music CDs is [music CD]. As soon as we get correct GMDs in records we can merge those which are for the same work.

And this same logic can be applied to other AV materials where the Material Type in the record is clear. For example if the Material Type of the record is “DVD” and the existing GMD is [videorecording] this will be updated to [DVD]. GMDs that are no longer used e.g. [book] and [text], will be stripped from the title. Once this work has been completed, we will reissue the list of One Card approved GMDs by updating the LMS Ops Guide.  There will be an expectation that libraries use these in new Bib records ensuring consistency in the future.
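For the technically curious, the GMD logic described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions and not the process SirsiDynix will actually run: the variant table, the retired list and the function names are all hypothetical, and only the examples mentioned in this post are covered.

```python
import re

# Hypothetical lookup from a normalised variant to the approved GMD.
VARIANTS = {"music cd": "music CD", "musci cd": "music CD"}

# GMDs that are no longer used and are stripped from the title.
RETIRED = {"book", "text"}

# A GMD is the bracketed text in a title, plus any spaces before it.
GMD_PATTERN = re.compile(r"\s*\[([^\]]+)\]")


def normalise_key(gmd: str) -> str:
    """Lower-case, treat hyphens and dashes as spaces, collapse whitespace."""
    return " ".join(re.sub(r"[-\u2013\u2014]", " ", gmd).lower().split())


def clean_title(title: str, material_type: str = "") -> str:
    """Return the title with its GMD standardised, replaced or removed."""

    def fix(match: re.Match) -> str:
        key = normalise_key(match.group(1))
        if key in RETIRED:
            return ""
        if key == "videorecording" and material_type == "DVD":
            return " [DVD]"
        # Unknown GMDs are left as they were found.
        return f" [{VARIANTS.get(key, match.group(1))}]"

    return GMD_PATTERN.sub(fix, title).strip()


print(clean_title("Abbey Road [music – cd]"))
print(clean_title("Jaws [videorecording]", material_type="DVD"))
print(clean_title("Dune [book]"))
```

Once two records for the same work carry the same title and the same GMD, they become candidates for the automated merge.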

A clean-up of the MARC Tags in the Bibliographic records will also take place. “Junk” Tags that have been unintentionally loaded into the database will be removed, along with obsolete Tags. This part of the clean-up will strip over 300,000 lines of data from our records, leading to a cleaner and faster database.

Some Tags will be updated to the latest standard, for example obsolete series fields 400, 410, 411, and 440 will be upgraded to field 490/8XX pairs. There are also many local Tags which were used during data loads, either with the One Card implementation or as used by previous Library Management Systems, that will be deleted. RDA tags will be added where possible and as RDA Cataloguing rules prohibit the use of abbreviations (unless they form part of an actual word in a Title/Author) these will be updated to the full version of the word if identifiable e.g. Dept. becomes Department.  SirsiDynix has also offered to add a 007 Tag using the Item Type of the item as the source of information. Therefore if the Item Type is AB [audio book CD], the 007 Tag will be added and this will update the icon on Enterprise from a “book” icon to an “audio disc” icon.
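To make the series upgrade a little more concrete, here is a rough sketch of how an obsolete series field might be replaced by a 490/8XX pair. It assumes a record is just a list of (tag, text) pairs, which is a big simplification: a real conversion also has to deal with indicators and subfields, and this is not a description of the SirsiDynix process. The function names are hypothetical.

```python
# Obsolete series fields and the 8XX added-entry field each one pairs with.
SERIES_UPGRADE = {"400": "800", "410": "810", "411": "811", "440": "830"}


def upgrade_series_field(field: tuple[str, str]) -> list[tuple[str, str]]:
    """Return the field(s) that replace one simplified (tag, text) field."""
    tag, text = field
    if tag not in SERIES_UPGRADE:
        return [field]  # not an obsolete series field, keep unchanged
    # The 490 carries the series statement; the 8XX carries the added entry.
    return [("490", text), (SERIES_UPGRADE[tag], text)]


def upgrade_record(fields: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Apply the series upgrade to every field in a simplified record."""
    upgraded: list[tuple[str, str]] = []
    for field in fields:
        upgraded.extend(upgrade_series_field(field))
    return upgraded


record = [("245", "Emma"), ("440", "Penguin classics")]
print(upgrade_record(record))
```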

All of this will mean much tidier Bibliographic records, shorter indexing report runs, faster searching for library staff and better-looking search results for our Enterprise customers.

The second part of the Authority Processing Service will be a load of Library of Congress Subject, Author and Series headings including the Children’s headings.  These will be added in addition to the existing Libraries Australia headings loaded in 2014 so that no local headings will be lost.  A report will then run to match the Tags in the Bibliographic records to the correct form of the heading and update the Tag.  Again, this will clean up and enrich our records.

There has been much discussion with SirsiDynix regarding our local headings for SCISS, Torrens Toy Library and Local History.  All of these headings will be retained.  Where the Bib record indicates the Audience level to be “Juvenile”, a subdivision “v” of “Juvenile literature” or “Juvenile fiction” will be added.  If the Form subdivision is incorrect, e.g. Fiction in an “x” subfield instead of the correct “v”, these will be updated as well.

The LMS Collection & Cataloguing Group have been eagerly awaiting the Authority Processing Service since it was first presented at a meeting in August 2015 and the results will be appreciated by all Network staff and customers.

I would like to thank the LMS User Group members for their support, along with Jo Cooper as the Chair of the Collections & Cataloguing Group and the members of the sub-group who have been working through the finer details to make this all happen; Alice Mariano (Holdfast), Chris Kennedy (PLS & Holdfast), Angela Jones (Salisbury), Peter Thomas (Mitcham), Cathy O’Brien (Campbelltown), Jane Murphy (PLS) and Di Cranwell (PLS).

And I should add that once all of this work is complete I believe that our database will be sufficiently clean to add all of our holdings to Libraries Australia.  But what that means will be the subject of another post some time in the future.

Tuesday, 1 March 2016

TROVE: "over 374,419,217 books, articles, images, historic newspapers, maps, music, archives, datasets and more"*: the library community's greatest contribution at risk

*The quote in the title comes from a great article here.

I need to start by declaring an almost obsessive fascination with Trove - the National Library of Australia's magnificent contribution to librarianship, scholarship and research for our nation.  My view is that it is the single greatest contribution to cultural heritage produced in Australia in the last decade.  It utilises inputs from many different sources and makes them available to the serious researcher and the general community in a simple to use, engaging way.

This resource is transforming both the work of serious academics wishing to study Australian history and culture, and that of the curious amateur who wants to know more about their family's history somewhere in Australia. Don't take my word for it; look at these testimonies: Internationally and locally. There are heaps of other articles I could point to that speak eloquently about how Trove has changed the face of Australian research.

Personally, I have looked up family members from previous generations whom I had heard about but wanted to know more about. It is a truly addictive, fun place to find out so much!!

I'd recommend you search by your family surname and various towns you know families have lived in & see what you can discover!!  As an Australian of German descent, it is interesting/disturbing to see how my forebears dealt with wars and prejudice in Australia.   While some family members were fighting for Australia, at the same time others were having their haystacks burnt & their cattle poisoned because they had a German surname.  Interesting & perhaps slightly relevant in our current race charged debates!!

Why am I talking about Trove?  Well sadly the continual ingest of new content into Trove is under threat because of ongoing cuts to the budget of the National Library of Australia (NLA). These annual cuts have been in place for a number of years, but have now gone beyond "trimming the fat" to hacking at the bone of what is one of our national treasures!

The NLA built Trove using its existing, recurrent grant funding - by scrimping and saving to deliver on a magnificent vision.  The NLA management - both past and present - are to be congratulated for their vision and persistence to deliver this nationally significant research tool.  It is truly sensational & has something for everyone.

I am aware that for a number of years there have been annual cuts to the real value of the NLA budget.  These cuts have largely been absorbed internally, impacting on various levels of service quality, but largely hidden from view to the average library user.  However, these ongoing annual cuts have finally reached a point where they are directly impacting on the capacity of the NLA to support the legitimate role of Trove to continue to ingest and make available new and interesting content without direct additional funding support. 

The issues for Trove and the library & research communities are more eloquently explained by various people here, here, here or here.

While I am not advocating any particular course of action, you may wish to consider the call of the #fundTrove campaign on Twitter and consider how you can contribute to the ongoing national, public campaign regarding the value of what I consider to be a wonderful national treasure.

Wednesday, 10 February 2016

Big Data - empowering business decisions

I follow OCLC on Twitter & yesterday they tweeted about a new blog, which includes a post called transforming data into impact.  It is a really interesting article - with a few key points in it. One was that of all the data being gathered in the world, only half of one percent is actually analysed. The article goes on to provide examples of where libraries are using data to drive informed decisions. The examples are worth looking at.

The article has a couple of sub-headings which read From data to insight and From insight to action. This resonated with me as I've always been a bit of a data freak, because I am convinced that the best decisions are made based on evidence. That may appear self-evident, but it is amazing how often decisions are made without a firm evidence base. People who spend any time with me will hear me say that librarianship (like all professions) is a combination of art and science. Science is the evidence we gather to inform us, but the "art" is the professional judgement we use to interpret the data, or to ask more questions to get more data or to form an hypothesis which we then want to test with more action and evidence. The "art" can be taught formally through library courses, and then it must be further developed through practice, reflection, working with peers and being actively involved in the changes that the profession continues to go through.

True library excellence is based on data, which is translated into insight which then informs action.  And without insight, data is of little value.

In relation to data, PLS has just released the 2014/15 library statistics for each library, and provided some comparisons based on the ALIA Standards and Guidelines and averages of various libraries across the State. While these ALIA standards are not perfect & don't measure much of what is important about library practice, they are a good starting place for libraries to measure themselves against an industry benchmark. (There is a project on to revise these standards - but that will be the subject of another post soon).

Each library's data was released only to that library, and not published more broadly, as we want to give library managers the opportunity to reflect on the data, to use it judiciously to inform their business planning and also to choose how they use it within their own council and community. We have also published the rankings for the metropolitan libraries, so that they can see where they stand in comparison with other libraries. I am hopeful that by providing comparative data against ALIA Standards and between libraries, managers will have an additional information layer to support their practice of the "art" of library management.

As we gather more data from within the LMS and from other sources we'll be happy to share it, and in some cases publish it as well.  I am interested in any feedback from libraries about the data we have provided to date.

Port Lincoln RFID makes the media

Congratulations to the team at Port Lincoln for attracting media attention to the installation of your public self-service RFID kiosks.  The article is here.  And it is great to see the journalist taking on some of the context of this local project - the fact that RFID is rolling out across the State & this is a follow on from the One Card rollout.  She was obviously very well briefed by library manager Louise!

Wednesday, 20 January 2016

Growth in e-Content loans

The consortium has had a subscription to the Overdrive e-content provider for several years.  Just over a year ago we were able to make the e-Books and e-Audiobooks discoverable through Enterprise - the public access "catalogue" / discovery product.  This increase in accessibility has directly contributed to a significant increase in loans.

Below are two graphs provided through the Overdrive management portal which demonstrate both the growth in loans and also the proportion of loans which are e-Books compared to e-Audiobooks. This growth in customer usage also supports the consortium's decision to increase its expenditure on e-content as well as choosing a second e-content provider to complement our current Overdrive subscription. We are currently tidying up a few loose ends with our new provider and SirsiDynix to ensure that the new content we're about to purchase will also be accessible through Enterprise. We will have more to say about this when we're ready to launch this second content source.

Compared to some library services, as a consortium we are relatively new to providing loanable e-content & we have not been as aggressive as some in promoting this service.  I was talking to staff at Brisbane City Council libraries at the end of last year & the subject of e-content came up.  They treat their e-content as a separate "branch" in that it is online, and independent from all of their physical libraries; and they tell me that late last year their e-content branch generated more loans than any of the other 30+ physical libraries.  

It will be interesting to keep score of our growth in e-content in coming years, to see when our State-wide "branch" begins to surpass many of our physical libraries. In fact, while I don't have the figures with me, I'd have to think that loans of 35,000+ per month (as seen October - December) is a figure that would challenge many of our branches.  Of course this figure is for all loans right across the State, and many customers who download e-content also borrow physical items, but all the same this is an impressive figure!  I might do some research about the costs per loan from this collection compared to others. 

As this part of our business continues to grow we will need to ensure that we continue to increase our skills and our support for the community as well as finding new ways to promote this collection to new audiences. 

Wednesday, 30 December 2015

Presenting library statistics

I'm always interested in how library statistics can tell a story about library performance.  Marissa and I are currently building report cards for each library, looking at how they rate against the ALIA Standards and Guidelines.  We hope to release these to each library manager in the next month or so.

While looking at how others have reported library statistics I came across this really interesting blog post which uses a range of technologies to report on visits to UK libraries & how this figure compares to other activities undertaken by the community.  There is the use of infographics, Slideshare (my favourite) and Sway amongst others.  And someone has re-used the approach to display stats about Canadian libraries.  The referencing of sources for the UK data at the bottom of the blog is particularly "librarian" - and great to have so that the authenticity of the data can be verified.

This has given me an idea about both Australian and South Australian library statistics, compared to other community activities.  I'm not sure I'll get around to doing anything in the near future, but it's a mini-project I may come back to at some time. Of course if someone else wants to do the comparative research of community visits to various other events I'd be delighted to publish it here (& of course give credit to the authors and sources of their data). 

As a starting point for Australian library visits, the most recent set of data about public libraries is from 2013/14 & is available here on the NSLA website. This data shows that there were over 112M visits to Australian public libraries that year, which is over 9M per month, or more than 2.1M a week / 300,000 a day. As Ned Potter's blog says, "So next time someone says libraries are no longer relevant, consider these statistics for a minute (and during that minute 536 people will visit a library)."
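The breakdown of the Australian figure is simple division, and a short sketch shows the working. It assumes a flat 112,000,000 visits as a lower bound (the data says "over 112M") spread evenly across the year, and the names are mine. The per-minute value is a naive round-the-clock average I have added for comparison; it is not a figure from the NSLA data.

```python
ANNUAL_VISITS = 112_000_000  # lower bound: the NSLA data says "over 112M"


def visits_per(periods_per_year: int) -> float:
    """Average visits per period, assuming visits are spread evenly."""
    return ANNUAL_VISITS / periods_per_year


per_month = visits_per(12)
per_week = visits_per(52)
per_day = visits_per(365)
per_minute = visits_per(365 * 24 * 60)  # ignores opening hours entirely

print(f"per month:  {per_month:,.0f}")
print(f"per week:   {per_week:,.0f}")
print(f"per day:    {per_day:,.0f}")
print(f"per minute: {per_minute:,.0f}")
```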

Collectively, the public libraries of Australia are a powerhouse of community learning and recreation & are obviously very relevant to a significant proportion of our community.  We need to shout this out loud to all who we come across.