What’s behind a recommendation?

This guide describes the types of recommendation that may suit different situations, with notes on the kinds of activity data needed to generate each.
The problem
Recommendations, such as those embedded in services like Amazon and iTunes, are typically based on attention (navigation, search) and activity (transaction) data, though they may also take account of user-generated ratings. Different types of recommendation need different amounts and kinds of data.
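For example,
‘People who bought that also bought this’
requires less data than either
‘People who bought that bought this next’
or
‘People like you bought this’.
The first needs only a list of transactions, the second also needs a time stamp for each transaction, and the third needs some way of deciding who is ‘like you’.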
The initial problem is to identify, for a given service, what kinds of recommendation might be offered.
Each type of recommendation places particular demands on the activity data to be collected, preserved, collated and processed. You should check the availability of the data sources, possibly using a pre-prepared list of activity data sources (as recommended elsewhere in these guides). You should also check whether you can use the data within the bounds of your corporate Privacy Policy (again, please see elsewhere in these guides).
The solution
Services need to make a list of the recommendations that they would like to offer, identifying the data required for each.
To save you doing that from scratch, here is a list of broad yet distinct recommendation types and the generic data from which they might be derived.
Generic recommendation types
[A] People who did that also did this – a list of all transactions (e.g. book circulation records) with anonymised user IDs and no further user details
[B] People who looked at that also looked at this – a variation on [A] based on ‘attention’, perhaps signified by retrievals of catalogue pages showing full item details
[C] People who did that did this next / first – as per [A], but each transaction will also need a time stamp
[D] People like you also did this – this can be in a different league of difficulty, depending on how you define ‘like you’. The shortcut is to treat it as a variation on [A], where likeness is based on activity. However, in education we might expect ‘like you’ to be based on a scholarly context, such as a Course or even a Module, or perhaps a discipline and a level (e.g. Undergraduate Physicists, First Year Historians). All of this implies personal data, with each transaction linked to a user and their context.
[E] People like you rated this – a variation on [D] that is possible if your system collects ratings such as stars or a ‘liked’ / ‘useful’ indicator.
[F] New items that may interest you – this ‘awareness’ service requires no information about other users, simply using a ‘precise-enough’ catalogue code (e.g. Subject heading or Dewey class in libraries) to link the user’s previous activity to new items.
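As an illustration of type [A], the following is a minimal sketch of one possible approach: counting how often two items appear in the same user's transaction history. The sample data, the identifiers and the function names (build_cooccurrence, also_did) are assumptions made for this example, and simple co-occurrence counting is only one of several ways such a recommendation could be derived.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical sample data: (anonymised_user_id, item_id) pairs,
# for example book circulation records with no further user details.
transactions = [
    ("u1", "book_a"), ("u1", "book_b"), ("u1", "book_c"),
    ("u2", "book_a"), ("u2", "book_b"),
    ("u3", "book_a"), ("u3", "book_c"),
    ("u4", "book_b"), ("u4", "book_d"),
]


def build_cooccurrence(transactions):
    """Count, for each pair of items, how many users did both."""
    items_by_user = defaultdict(set)
    for user_id, item_id in transactions:
        items_by_user[user_id].add(item_id)

    counts = defaultdict(Counter)
    for items in items_by_user.values():
        for first, second in combinations(sorted(items), 2):
            counts[first][second] += 1
            counts[second][first] += 1
    return counts


def also_did(counts, item_id, limit=3):
    """Return up to `limit` items most often seen alongside item_id."""
    related = counts.get(item_id, Counter())
    # Sort by descending count, then by item ID so ties are stable.
    ranked = sorted(related.items(), key=lambda pair: (-pair[1], pair[0]))
    return [other for other, _ in ranked[:limit]]


cooccurrence = build_cooccurrence(transactions)
print(also_did(cooccurrence, "book_a"))
```

Note that this sketch needs nothing beyond the anonymised user ID and the item ID. Type [C] would additionally need a time stamp on each record so that pairs could be counted in order, and type [D] would need each user ID to be linked to a context such as a course.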
The following are particularly interesting in libraries, though they could be supported elsewhere:
[G] People who searched using this term ended up viewing / borrowing in these subject areas – building a mapping from search terms to the eventual classification of choice may help with accessing classifications that differ from the key words used in teaching; this does not need data about individual users
[H] The following alternative titles are available – at its simplest, this can use data such as Dewey codes or reading lists to recommend alternatives; whilst this does not have to involve individual user data, the recommendation could be enhanced based on ‘people like you’.
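The mapping described in [G] could be built along the following lines. This is a sketch under stated assumptions: the event log format, the session identifiers and the function names (map_terms_to_subjects, subjects_for) are all hypothetical, and the class numbers are illustrative values only. Sessions here are not linked to individual users, in keeping with the point that [G] does not need personal data.

```python
from collections import Counter, defaultdict

# Hypothetical session log in time order: (session_id, event_type, value).
# A "search" event carries the search term; a "view" event carries the
# classification of the item that was viewed or borrowed.
events = [
    ("s1", "search", "climate change"),
    ("s1", "view", "551.6"),
    ("s1", "view", "363.7"),
    ("s2", "search", "Climate Change"),
    ("s2", "view", "551.6"),
    ("s3", "search", "tudors"),
    ("s3", "view", "942.05"),
]


def normalise(term):
    """Fold case and trim whitespace so variants of a term are merged."""
    return term.strip().lower()


def map_terms_to_subjects(events):
    """Count which classifications were reached after each search term."""
    term_subjects = defaultdict(Counter)
    last_term = {}  # most recent search term seen in each session
    for session_id, event_type, value in events:
        if event_type == "search":
            last_term[session_id] = normalise(value)
        elif event_type == "view" and session_id in last_term:
            term_subjects[last_term[session_id]][value] += 1
    return term_subjects


def subjects_for(term_subjects, term, limit=3):
    """Return the classifications most often reached from a search term."""
    counts = term_subjects.get(normalise(term), Counter())
    return [subject for subject, _ in counts.most_common(limit)]


mapping = map_terms_to_subjects(events)
print(subjects_for(mapping, "climate change"))
```

Attributing each view to the most recent search in the same session is a deliberate simplification; a real service might also want to limit how much time may pass between the search and the view.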
Taking it further
The first steps are to determine:
  • What sort of recommendations you could offer in order to assess the user experience
  • Which events you would need to capture in order to generate those recommendations
  • What the recommendation algorithm is, or how feasible it is to design an algorithm, possibly experimentally
  • What supporting personal context data would be desirable
  • Whether capturing additional data would enable you to support a wider range of recommendation types at a later stage
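One way to work through these questions is to write down the data each recommendation type needs and compare it with what your service already captures. The sketch below does this for types [A] to [H]. The field names are hypothetical labels chosen for this example, and the requirements are one reading of the list of generic recommendation types above, not a definitive specification.

```python
# Hypothetical field names; adjust them to match your own activity data.
REQUIRED_FIELDS = {
    "A": {"user_id", "item_id", "transaction"},
    "B": {"user_id", "item_id", "page_view"},
    "C": {"user_id", "item_id", "transaction", "timestamp"},
    "D": {"user_id", "item_id", "transaction", "user_context"},
    "E": {"user_id", "item_id", "user_context", "rating"},
    "F": {"user_id", "item_id", "transaction", "subject_code"},
    "G": {"search_term", "item_id", "subject_code"},
    "H": {"item_id", "subject_code"},
}


def feasible_types(captured):
    """List the recommendation types whose required data is all captured."""
    return sorted(
        rec_type
        for rec_type, required in REQUIRED_FIELDS.items()
        if required <= captured
    )


def missing_fields(rec_type, captured):
    """List the fields still needed before a given type can be offered."""
    return sorted(REQUIRED_FIELDS[rec_type] - captured)


# Example: a service that logs anonymised loans with time stamps.
captured = {"user_id", "item_id", "transaction", "timestamp"}
print(feasible_types(captured))
print(missing_fields("D", captured))
```

Running the same check with one or two extra fields added to the captured set shows which further recommendation types would become possible if additional data were collected.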
Additional resources
As illustrated in Dave Pattern’s presentation, the University of Huddersfield has identified a range of recommendation types that can be derived from library activity data – http://www.slideshare.net/gregynog/bw-dave-pattern-lidp
The RISE and SALT projects also focused on recommender services for library resources. The OpenURL project developed a recommender as a by-product of other work (http://edina.ac.uk/cgi-bin/news.cgi?filename=2011-08-09-openurldata.txt).
Websites such as Amazon illustrate a range of activity-based recommendations; you might consider what data lies behind each type. For your own account, consider what they need to know about you and about other users – www.amazon.co.uk