The university-wide workshop on “Ethical Reasoning in Big Data” will take place on March 28, from 8:30 am to 11:30 am, in BRNG B247 (B is for Basement). The workshop is organized as part of the activities funded by the NSF award (1636891) BD Spokes: PLANNING: MIDWEST: Cyberinfrastructure to Enhance Data Quality. PI Elisa Bertino, Co-PI Sorin Adam Matei.
If you want to learn more about the ethical treatment of big data, especially if it involves recombining and repurposing data collected for purposes other than your intended research, this is the place to start. The workshop will introduce participants to the Ethical Reasoning Matrix and to a reasoned, value-based decision-making process. At the end of the workshop, you will be better prepared to handle data (big and small) in an ethical manner.
The workshop will be directed by two co-authors of the chapter “A Theoretical Framework for Ethical Reflection in Big Data Research,” Michael Steinmann and Sorin Adam Matei. The chapter was published in the volume “Ethical Reasoning in Big Data” (eds. Jeff Collmann and Sorin Adam Matei).
The workshop will focus on three case studies, presented by three Purdue researchers who handle big or social media data (see below for details).
The workshop will include a presentation of an ideal-typical ethical reasoning and decision-making matrix. After the case studies are presented, participants will break into small groups to discuss the applicability of the matrix to a given case study. In the process, participants will discover new questions about, and answers to, ethical reasoning in a data manipulation context.
The workshop will be followed by a keynote address by Professor Elisa Bertino on Data Security and Privacy. This is a separate, follow-up event, which we will market separately. It will start at noon in WTHR 160.
If you have questions, please let me know (Sorin Adam Matei).
The workshop will start with presentations of three case studies of ethical challenges in big data collection. Each case will be discussed by a group of self-selected participants on the basis of our Ethical Reasoning Matrix. Please familiarize yourselves with the matrix and then read the “challenges” documents listed below, at the end of each abstract.
Ethical Reasoning in Big Data – Case Studies
CAM2: Massive Public Webcam Feed Harvesting and Interpretation
Dr. Yung-Hsiang Lu
Dr. Lu has designed a workflow for capturing and interpreting, in real time, video feeds from thousands of publicly available web cameras. His main goal is to facilitate automatic interpretation of video feeds, which can support a variety of activities, from environmental management to safety and emergency response. However, the flow of data may include, even if inadvertently, behaviors or events that raise ethical concerns. How should researchers handle such issues, especially when privacy and security are in conflict?
Dr. Lu will present and we will discuss the challenges raised by the CAM2 project.
Collecting data about religious practices in China: Lessons learned by the Center for the Study of Religion at Purdue University
Fenggang Yang and Jonathan Pettit
Although more than two-thirds of Chinese citizens declare themselves non-religious, religious practices are spreading at a very fast rate in China. Some of them are encouraged, some tolerated, and some frowned upon, according to their potential impact on the social and political stability of the country. Collecting data about religious practices and establishments raises a number of ethical issues, including the possibility of revealing the presence of religious groups that were not known to the government (and perhaps did not want to be known). We will present some of the challenges and conundrums raised by our recent attempt to create a comprehensive database and online GIS visualization/mapping of religious practice in China. We focus on our geocoding of over 72,000 temples, mosques, and churches in China, and on the ethical considerations we weighed as we made this dataset public (it will be posted to GitHub this fall). The political and ethical impacts of geocoded data, we conclude, are fundamental for all projects in the study of religion, especially for those scholars who make their data publicly available.
Dr. Pettit will discuss the ethical challenges involved in collecting data on religious practices in China.
The ethical challenges of crowdsourcing
Crowd-powered systems integrate web workers in computational processes to fill the gaps where AI still falls short. The popularity of online task markets means that such systems can quickly summon a large team of workers to perform data collection and transformation at scale, and at quality levels unachievable by any current machine algorithm. It may then be tempting to frame the interaction as a “remote person call.” However, goals that are ubiquitous in machine computing (e.g., aggressive minimization of cost and communication) would violate norms of fair labor. Human expectations must be reconciled with computational requirements, including data quality. Dr. Quinn will show how these challenges affected the design of a system developed by his lab for generating lists from internet sources on arbitrary topics (e.g., US coffee wholesalers, famous dogs, etc.).