What is Activity Data?

Put simply, activity data is the record of any user action (online or in the physical world) that can be logged on a computer. We can usefully think of it as falling into three categories:
  • Access - logs of user access to systems indicating where users have travelled (eg log in / log out, passing through routers and other network devices, premises access turnstiles).
  • Attention - navigation of applications indicating where users are paying attention (eg page impressions, menu choices, searches).
  • Activity - ‘real activity’, records of transactions which indicate strong interest and intent (eg purchases, event bookings, lecture attendance, book loans, downloads, ratings).
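As an illustration only, the three categories can be sketched as a simple lookup over logged events. The event names (such as "turnstile_entry" and "book_loan") and the log layout are hypothetical, chosen to mirror the examples above. Real logs will use whatever vocabulary the source system provides.

```python
from collections import Counter

# Hypothetical event types, grouped by the three categories described above.
CATEGORY_BY_EVENT = {
    "login": "access",
    "logout": "access",
    "turnstile_entry": "access",
    "page_view": "attention",
    "menu_choice": "attention",
    "search": "attention",
    "purchase": "activity",
    "book_loan": "activity",
    "download": "activity",
    "rating": "activity",
}


def categorise(event_type):
    """Return the category for an event type, or 'unknown' if unmapped."""
    return CATEGORY_BY_EVENT.get(event_type, "unknown")


def count_by_category(events):
    """Count events per category; each event is a dict with an 'event' key."""
    return Counter(categorise(e["event"]) for e in events)


# A made-up log fragment, for illustration only.
sample_log = [
    {"user": "u1", "event": "login"},
    {"user": "u1", "event": "search"},
    {"user": "u1", "event": "book_loan"},
    {"user": "u2", "event": "page_view"},
    {"user": "u2", "event": "heartbeat"},  # not in the mapping
]
category_counts = count_by_category(sample_log)
```

Unmapped events are counted as 'unknown' rather than dropped, so gaps in the mapping stay visible.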
Attribution (knowing who the user is) is invaluable in analysing and building services around activity data. This allows us to tie activity together (ie by the same person or, more uncertain but still of interest, from the same IP address). Whilst knowing who people are is a hazardous proposition in the online world, the veracity and utility of activity analysis are greatly enhanced by:
  • Scale (network effect) - which highlights patterns of activity in spite of exceptions (such as a shared family login to Amazon).
  • Context - which adds detail to identity, such as an area of interest (eg enrolled course, current module, occupation, family size).
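One way to picture attribution is as a choice of grouping key. This minimal sketch assumes each event carries optional "user" and "ip" fields (hypothetical names). It prefers a known user and falls back to the IP address, keeping the two kinds of key distinct so the weaker attribution is never mistaken for the stronger one.

```python
from collections import defaultdict


def attribution_key(event):
    """Prefer a known user id; fall back to the IP address (less certain)."""
    if event.get("user"):
        return ("user", event["user"])
    if event.get("ip"):
        return ("ip", event["ip"])
    return ("anonymous", None)


def group_by_attribution(events):
    """Tie actions together under the best available attribution key."""
    groups = defaultdict(list)
    for event in events:
        groups[attribution_key(event)].append(event["action"])
    return dict(groups)


# Made-up events; the IP addresses come from documentation-only ranges.
events = [
    {"user": "u1", "ip": "192.0.2.10", "action": "search"},
    {"user": None, "ip": "192.0.2.10", "action": "page_view"},
    {"user": "u1", "ip": "198.51.100.7", "action": "download"},
    {"ip": "192.0.2.10", "action": "search"},
]
grouped = group_by_attribution(events)
```

Here the two actions by user u1 are tied together even though they come from different addresses, while the two unattributed actions are grouped only by their shared IP address.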
Analytics might be best defined as ‘what we do with activity data’ - analysing patterns of known importance (eg people who do this do that) or, more broadly, looking for clues and exploring data to identify the stories it may have to tell. Analysis may involve combining multiple data sources, some of which may not be activity data (eg exam results). On account of the scale of data involved (large numbers of records), such analysis and subsequent presentation can be assisted by a range of visualisation tools.
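The ‘people who do this do that’ pattern can be sketched as a co-occurrence count. This is a minimal illustration, assuming the activity data has already been reduced to (user, item) pairs such as book loans. The function names and sample records are hypothetical, and a real service would need to handle far larger volumes and weight results more carefully.

```python
from collections import defaultdict
from itertools import combinations


def co_occurrence(records):
    """Count how many users share each pair of items.

    records: iterable of (user_id, item_id) pairs.
    """
    items_by_user = defaultdict(set)
    for user, item in records:
        items_by_user[user].add(item)

    pair_counts = defaultdict(int)
    for items in items_by_user.values():
        # Sort so each pair is always stored in the same order.
        for a, b in combinations(sorted(items), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def also_did(item, pair_counts):
    """Rank other items by the number of users shared with `item`."""
    related = {}
    for (a, b), n in pair_counts.items():
        if a == item:
            related[b] = n
        elif b == item:
            related[a] = n
    # Most shared users first; ties broken alphabetically.
    return sorted(related.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up loan records, for illustration only.
loans = [
    ("u1", "book_a"), ("u1", "book_b"),
    ("u2", "book_a"), ("u2", "book_b"), ("u2", "book_c"),
    ("u3", "book_a"), ("u3", "book_c"),
    ("u4", "book_b"),
]
pairs = co_occurrence(loans)
suggestions = also_did("book_a", pairs)
```

With so few records a single borrower can dominate the ranking. This is why scale matters: patterns become trustworthy only once many users contribute to each count.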