The process begins with data being extracted in CSV format from the Talis library management system (LMS) at JRUL. A PHP script parses this data and splits it across two tables in a MySQL database: the bibliographic details describing an item go into a table called items, and the loan-specific data, including borrower ID, goes into a table called, you’ve guessed it, loans. A further PHP script then processes the data into two additional MySQL tables, nloans and nborrowers. nloans contains the total number of times each item has been borrowed; nborrowers contains, for each combination of two items, the number of unique library users who have borrowed both.
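The two derived tables can be illustrated with a small in-memory sketch. This is not the project's PHP/MySQL code: it is a Python approximation in which the function name build_counts and the sample loan records are made up for illustration, and it assumes each pair of items is stored once, in sorted order.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_counts(loans):
    """Derive nloans and nborrowers from (borrower_id, item_id) loan records.

    Hypothetical helper: mirrors the description of the two tables,
    not the actual PHP script.
    """
    nloans = Counter()                    # item -> total number of loans
    items_by_borrower = defaultdict(set)  # borrower -> distinct items borrowed

    for borrower_id, item_id in loans:
        nloans[item_id] += 1              # every loan counts, including repeats
        items_by_borrower[borrower_id].add(item_id)

    nborrowers = Counter()                # (item_a, item_b) -> unique borrowers of both
    for items in items_by_borrower.values():
        # Each borrower contributes at most once to any pair of items.
        for pair in combinations(sorted(items), 2):
            nborrowers[pair] += 1

    return nloans, nborrowers


# Made-up sample data: u2 borrows item A twice.
loans = [
    ("u1", "A"), ("u1", "B"),
    ("u2", "A"), ("u2", "B"), ("u2", "A"),
    ("u3", "A"), ("u3", "C"),
]
nloans, nborrowers = build_counts(loans)
```

Note that a repeat loan by the same user raises the nloans total for the item but does not raise the nborrowers count for any pair, which matches the "unique" wording above.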
With the above steps complete, additional processing is performed on demand by the web API. When called for a given item, say item_1, the API returns a list of items for suggested reading, derived as follows. From the nborrowers table, a list of candidate items is compiled from all combinations featuring item_1. For each candidate, the number of unique borrowers it shares with item_1 (from the nborrowers table) is divided by the total number of loans for that candidate (from the nloans table), following the logic used by Dave Pattern at the University of Huddersfield. The resulting values are ranked in descending order and the details associated with each suggested item are returned by the API.
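As a rough sketch of that ranking step, the following Python function scores each candidate by shared borrowers divided by total loans and sorts the result. The function name suggest, the dictionary layout and the sample numbers are assumptions made for illustration; the real API works against the MySQL tables. The optional threshold argument stands for the API's minimum-borrowers cut-off.

```python
def suggest(item, nloans, nborrowers, threshold=1):
    """Rank items borrowed alongside `item` (hypothetical helper).

    nloans:     dict of item -> total number of loans
    nborrowers: dict of (item_a, item_b) -> unique borrowers of both
    """
    scored = []
    for (a, b), shared in nborrowers.items():
        # Only combinations featuring the requested item are relevant.
        if item not in (a, b) or shared < threshold:
            continue
        other = b if a == item else a
        # Shared borrowers divided by the candidate's total loans.
        scored.append((other, shared / nloans[other]))

    # Highest ratio first.
    scored.sort(key=lambda entry: entry[1], reverse=True)
    return scored


# Made-up sample counts.
sample_nloans = {"item_1": 4, "item_2": 2, "item_3": 10}
sample_nborrowers = {
    ("item_1", "item_2"): 2,
    ("item_1", "item_3"): 3,
    ("item_2", "item_3"): 1,
}
ranking = suggest("item_1", sample_nloans, sample_nborrowers)
```

In this sample, item_3 shares more borrowers with item_1 than item_2 does, but it ranks lower because it has been borrowed far more often overall. Dividing by total loans is what keeps very popular items from topping every list.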
For a bit of light relief here’s an image.
This is a screenshot from a piece of code written to demonstrate the web API. For a given item, identified by its ISBN, the details are retrieved from the items table in the MySQL database and displayed in [A]. An asynchronous call is then made to the web API, which accepts the ISBN as a parameter along with threshold and format values set using the controls in [B]. threshold is the minimum number of unique borrowers that any given combination of items must have to be considered, and format specifies the format in which the data is returned (either xml or json). Results from the web API are displayed in [C], with the actual output from the API reproduced in [D]. Note that the API returns all available results; the test code only shows as many as are set by the third control in [B].
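The request made by the demo page might look something like the sketch below. The endpoint URL, the exact query-string parameter names and both helper functions are assumptions for illustration only, since the post does not give the real address or parameter spelling. No network call is made here.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the real API address is not given in this post.
API_BASE = "https://example.org/api/suggestions"


def build_request_url(isbn, threshold=1, fmt="json"):
    """Build a query URL carrying the ISBN, threshold and format values."""
    if fmt not in ("xml", "json"):
        raise ValueError("format must be 'xml' or 'json'")
    query = urlencode({"isbn": isbn, "threshold": threshold, "format": fmt})
    return API_BASE + "?" + query


def visible_results(results, limit):
    """Trim the full result list to the number chosen in the demo page.

    The API itself returns everything; the limit is applied client-side.
    """
    return results[:limit]


url = build_request_url("9780131103627", threshold=5, fmt="json")
shown = visible_results(["first", "second", "third"], 2)
```

Keeping the display limit on the client side, as the test code does, means the API response stays the same whatever number the user picks.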
The exact format of the output is yet to be ratified, but the API is now in a state where it can be incorporated into prototype interfaces at JRUL and in COPAC. In addition, the remaining 3 million or so loan transactions from JRUL will be loaded and processed in readiness for user testing.