When collecting data, you will need to consider several issues to ensure that the information you collect is appropriate for its intended purpose. These include what data you should collect and its quality. For instance, the data may come from a variety of sources, in which case you may need to match users up between systems. This may be easy if the systems share a common user ID, or it may require a mapping exercise if they do not. You will also need to check that the systems are actually logging the types of event you are interested in: when logging was turned on, it may not have been set to collect the data you need. Finally, you will need to consider the format in which you collect the data, which will depend on how you want to process it. Each of these is discussed below.
We have also produced a guide and a number of recipes on collecting and extracting data.
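As a rough illustration of two of the checks described above, the sketch below matches users between two systems that do not share a common ID, and then checks whether a log contains the event types of interest. Everything in it is hypothetical: the record layouts, the field names (`vle_user`, `student_no`, `event`), the mapping table `id_map` and the event names are assumptions made for the example, not the format of any particular system.

```python
from collections import Counter

# Hypothetical records exported from two systems (illustrative values only).
vle_events = [
    {"vle_user": "u1001", "event": "login"},
    {"vle_user": "u1002", "event": "login"},
    {"vle_user": "u1001", "event": "resource_view"},
]
library_loans = [
    {"student_no": "S-17", "item": "book-a"},
    {"student_no": "S-42", "item": "book-b"},
    {"student_no": "S-99", "item": "book-c"},
]

# Assumed mapping from library student numbers to VLE user IDs,
# e.g. produced by a one-off mapping exercise.
id_map = {"S-17": "u1001", "S-42": "u1002"}


def match_users(loans, id_map):
    """Attach a VLE user ID to each loan; return (matched, unmatched)."""
    matched, unmatched = [], []
    for loan in loans:
        vle_id = id_map.get(loan["student_no"])
        if vle_id is None:
            unmatched.append(loan)  # no mapping known for this user
        else:
            matched.append({**loan, "vle_user": vle_id})
    return matched, unmatched


def missing_event_types(events, required):
    """Return the required event types that never appear in the log."""
    seen = Counter(e["event"] for e in events)
    return sorted(t for t in required if seen[t] == 0)


matched, unmatched = match_users(library_loans, id_map)
missing = missing_event_types(
    vle_events, {"login", "resource_view", "quiz_submit"}
)

print(f"{len(matched)} loans matched, {len(unmatched)} unmatched")
print("Event types not being logged:", missing)
```

Records left in `unmatched` show where the mapping exercise is incomplete, and a non-empty `missing` list suggests that logging may need to be reconfigured before the data can support the intended analysis.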