Data Provenance and Accountability on the Web

Jan 1, 2021
Oshani W. Seneviratne
Abstract
Decentralized information-sharing media, such as the Web, were designed so that information can be shared merely by creating a document on a web server and linking to it from another document. As the Web has grown exponentially over more than three decades, we have come to face many tough questions about the data and content shared on it. How do we characterize data on the Web? Who owns the data, and how does ownership help decide anything? If we own data, how do we give consent for uncertain future analyses or activities? How can we use data for purposes beyond its intended use ethically and responsibly? Furthermore, what does morally responsible data collection, usage, and sharing mean? Because users can engage in activities anonymously or pseudonymously, individuals often lack an identity on the Web, and this is further complicated because the value of consent to data usage changes over time, especially with respect to confidentiality. While there are no simple answers to these challenging questions, several technological advances provide data provenance and accountability on the Web. This chapter reviews some of the underlying problems that give rise to data reuse and accountability issues on the Web and features technical solutions that provide provenance using the Resource Description Framework (RDF).
Type
Publication
Provenance in Data Science: From Data Models to Context-Aware Knowledge Graphs