Richard Novak: AWARENESS OF BIG DATA ISSUES
March 18 @ 3:00 pm - 4:30 pm UTC+1
Big Data is a relatively new term that has so far not been viewed through the lens of applied ethics.
My focus in this paper is on the awareness of the conflicts arising between the Big Data phenomenon, with its associated issues, and the relevant ethical principles.
First, I review the research of other authors and give an overview of the generally accepted concepts of ethics and the Digital Divide. Second, I describe data sources and Big Data use cases from the telecommunication industry to demonstrate what is currently feasible; I then generalize them and propose a comprehensive list of twelve Big Data issues: Privacy Intrusion, New Barriers, Business Advantage, Power of All data, New Big Brother effect, Missing Transparency, Confusion, Social Pressure, Belief in Legislation, End of Theory, Data Religion and Unawareness of our Data. Third, I describe the existing regulatory framework for Big Data, with some suggestions for improvement, and verify the awareness of the twelve proposed issues through an international survey. Finally, I discuss the results and conclude the paper.
The survey (N=733) of university students, IT professionals and seniors from EU countries, mainly Czechia and Slovakia, showed that the Big Data issues group into three distinct and consistent clusters: hot, cold and warm (as suggested by Ward's method, which uses the Euclidean distance between the mean and standard deviation).
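As a rough illustration of how Ward's method can produce such a grouping, the following minimal sketch clusters items described by a (mean, standard deviation) pair. It is not the paper's analysis: the function name `ward_clusters` and the values in `toy_points` are hypothetical placeholders, not survey results, and the choice of three clusters is simply passed in as a parameter.

```python
from itertools import combinations


def ward_clusters(points, k):
    """Agglomerative clustering with Ward's criterion, stopping at k clusters.

    points: list of equal-length numeric tuples, e.g. (mean, std_dev).
    Returns a list of clusters, each a sorted list of indices into points.
    """
    # Start with every point in its own cluster.
    clusters = [[i] for i in range(len(points))]
    dims = len(points[0])

    def centroid(cluster):
        return [sum(points[i][d] for i in cluster) / len(cluster)
                for d in range(dims)]

    def merge_cost(a, b):
        # Ward's criterion: increase in within-cluster sum of squares,
        # based on the squared Euclidean distance between centroids.
        ca, cb = centroid(a), centroid(b)
        sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * sq_dist

    while len(clusters) > k:
        # Merge the pair of clusters whose union is cheapest.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: merge_cost(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return [sorted(c) for c in clusters]


# Hypothetical (mean, std_dev) pairs for twelve items; NOT data from the survey.
toy_points = [
    (4.5, 0.6), (4.4, 0.7), (4.6, 0.5), (4.3, 0.6),
    (3.4, 1.0), (3.5, 1.1), (3.3, 0.9), (3.6, 1.0),
    (2.2, 0.8), (2.1, 0.7), (2.3, 0.9), (2.0, 0.8),
]

groups = ward_clusters(toy_points, 3)
print(sorted(groups))
```

With real data, a library routine such as SciPy's hierarchical clustering would be the more usual choice; the hand-written version here only makes the merge criterion explicit.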
Using MANOVA with Pillai's trace test, I found that the clusters depend significantly on demography (IT Skills, Occupation and Sex). The warm cluster shows interesting dependencies on demographic category: for example, Social Pressure is perceived as important by pensioners and women, whereas men and IT professionals underestimate its importance. The conclusion of the paper is that the awareness of Big Data issues can be grouped into three consistent clusters that depend on a few demographic variables. I also conclude that regulatory frameworks need to move past Data Ethics by Default (law) to an a priori Data Ethics by Design approach.
Keywords: awareness, big data issues, cluster analysis, data ethics by design, demography, digital divide, manova