Data services need to advocate quality in data and in research. The same paradigm holds for big data. And we do need to keep in mind that part of the rationale for the invention of sample surveys was to be able to ask more from fewer. If big data offers us less from more, is it worth it? If it offers us more for less, or something entirely new (for example, data from smart energy meters), then it may be worth it.
There are some significant risks in this area, however, that we can anticipate and mitigate. These risks will depend on the type of data, the context in which it was created, the owner of the data, the method of ‘analysis’, and the purpose to which the analysis is put. If personal data from a government business information system is used by respected researchers, whose research is approved by data owners for public good, who understand the complexities of the data, work in a secure environment, have their outputs vetted, and are aware of the penalties under the law, the risk of anything “bad” happening is virtually zero. But it seems from some of the public responses to the recent UK Data Sharing legislation that not everyone is quite so optimistic about the likelihood of something bad happening.
But there are some very specific risks surrounding the quality of these data. There is the possibility that decisions may be based on insights from attractive new forms of big data, without the necessary work to understand and calibrate the extent to which they provide valid alternatives to more traditional forms of data collection. In the main, traditional census and survey type data sources involve considerable effort in the design of questions, sampling frames and definitions, allowing users to understand and quantify issues such as representativeness and bias.