Data portability is the concept of protecting users from having their data stored in "silos" or "walled gardens" that are incompatible with one another, i.e. closed platforms, which subjects users to vendor lock-in and makes creating data backups or moving accounts between services difficult.
Data portability requires common technical standards to facilitate the transfer from one data controller to another, such as the ability to export user data into a user-accessible local file, thus promoting interoperability as well as searchability with sophisticated tools such as grep. [1] [2]
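As a sketch of how a user-accessible local export enables such searchability, assuming a simple JSON-style export layout (the record fields here are invented for illustration, not any platform's real schema):

```python
import json

# Hypothetical contents of an exported data file: a list of post
# records, as many platforms provide in their takeout archives.
export = [
    {"date": "2023-01-04", "text": "Trying out data portability"},
    {"date": "2023-02-11", "text": "Backups matter"},
]

def search(records, term):
    """A grep-like, case-insensitive scan over the 'text' field."""
    return [r for r in records if term.lower() in r["text"].lower()]

matches = search(export, "backup")
print(matches[0]["date"])  # → 2023-02-11
```

The point is that once the data exists as a local file in an open format, any general-purpose tool can process it without the platform's cooperation.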
Data portability applies to personal data. It involves access to personal data without implying data ownership per se. [3]
At the global level, there are proponents who see the protection of digital data as a human right. Thus, in an emerging civil society draft declaration, one finds mention of the following concepts and statutes: Right to Privacy on the Internet, Right to Digital Data Protection, Rights to Consumer Protection on the Internet – United Nations Guidelines for Consumer Protection. [4]
At the regional level, there are at least three main jurisdictions where data rights are seen differently: China and India, the United States and the European Union. In the latter, personal data was given special protection under the 2018 General Data Protection Regulation (GDPR).
The GDPR thus became the fifth of the 24 types of legislation listed in Annex 1 Table of existing and proposed European Directives and Regulations in relation to data. [5]
Personal data are the basis for behavioral advertising, and early in the 21st century their value began to grow exponentially, at least as measured in the market capitalization of the major platforms holding personal data on their respective users. European Union regulators reacted to this perceived power imbalance between platforms and users, although much still hinges on the terms of consent given by users to the platforms. The concept of data portability comprises an attempt to correct the perceived power imbalance by introducing an element of competition allowing users to choose among platforms.
With the advent of the General Data Protection Regulation (GDPR), social media platforms such as Twitter, Instagram, Snapchat, and the Wall Street Journal online subscriber community have widely adopted the ability to export and download user data in a ZIP archive file. [6] Other platforms such as Google and Facebook offered export options earlier. [7] Some platforms restrict exports with time delays between each, such as once per 30 days on Twitter, and many platforms lack partial export options. [8]
Other sites such as Quora and Bumble offer no automated request form, requiring the user to request a copy of their data by contacting support via email. [9]
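A minimal sketch of how such a ZIP export can be inspected locally using Python's standard library; the archive layout and file names below are assumptions for the example, not any platform's real structure:

```python
import io
import json
import zipfile

# Build an in-memory ZIP standing in for a downloaded export archive.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("profile.json", json.dumps({"name": "alice"}))
    zf.writestr("posts.json", json.dumps([{"text": "hello"}]))

# A user (or a migration tool) can list and read the archive offline.
with zipfile.ZipFile(buf) as zf:
    names = zf.namelist()
    profile = json.loads(zf.read("profile.json"))

print(names)            # ['profile.json', 'posts.json']
print(profile["name"])  # alice
```

Because ZIP and JSON are open formats, the exported archive remains usable even if the originating service shuts down.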
Reputation portability refers to the ability of an individual to transfer their reputation or credibility from one context to another. [10] [11] This concept is becoming increasingly important in today's interconnected world, where individuals are involved in multiple online and offline communities.
The idea behind reputation portability is that an individual's reputation should not be tied solely to a single community or platform. [12] Rather, it should be transferable across different contexts, such as professional networks, social media platforms, and online marketplaces. This enables individuals to maintain a consistent reputation across various contexts, which can be beneficial in terms of building trust, and overcoming the so-called "cold-start" problem, [13] [14] and hence mitigating platform lock-in.
Overall, reputation portability is an important concept in today's digital landscape, and research has shown that imported reputation can serve as viable signals for building trust. [15] [16] [17] As technology continues to evolve, it is likely that reputation portability will become increasingly important in shaping how we interact with each other online and offline.
Some mobile apps restrict data portability by storing user data in locked directories while lacking export options. Such data may include configuration files, digital bookmarks, browsing history and sessions (e.g. lists of open tabs [a] and navigation histories), watch and search histories in multimedia streaming apps, custom playlists in multimedia player software, entries in note-taking and memorandum software, digital phone books (contact lists), call logs from the telephone app, and conversations through SMS and instant messaging software.
Locked directories are inaccessible to an end-user without extraordinary measures such as so-called rooting (Android) or jailbreaking (iOS).
The former requires the device's so-called boot loader to be in an unlocked state in advance, which it usually is not by default. Toggling that state involves a full erasure of all user data, known as a wipe, creating a catch-22 if the user's aim is to access their locked data. [18]
Other mobile apps only allow user data backups to be created using proprietary software provided by the vendor, lacking the ability to export the data directly to a local file in the mobile device's common user data directory. Such software requires an external host computer to run on. [19] [20]
Some device vendors offer cloud storage and synchronisation services for backing up data. Such services, however, require registration and depend on an internet connection, and regular use calls for high internet speeds and generous data plan limits. Some services only allow moving parts of the data, such as text messages and phone books, between locked directories on devices of the same vendor (vendor lock-in), without the ability to export the information into local files directly accessible by the end user. [21] [22]
Restrictions added in more recent versions of operating systems, such as scoped storage, claimed to have been implemented with the aim of improving user privacy, compromise both backwards compatibility with established existing software such as file managers and FTP server applications, and legitimate uses such as cross-app communication, large file transfers, and backup creation. [23] [24]
Further possible constraints on data portability are the poor reliability, stability and performance of existing means of data transfer, as described in Media Transfer Protocol § Performance.
Some digital video recorders (DVRs) that store recordings on an internal hard drive lack the ability to back up recordings, forcing the user to delete existing recordings once disk space is exhausted; this is an instance of poor data portability.
Some DVRs have an operating system that depends on an internet connection to boot and operate, meaning that recordings stored locally are inaccessible without an internet connection. If the television service provider deprecates service for the device, the existing recordings become inaccessible and are thus effectively lost. [25] [26]
Cordless landline telephone handsets, as well as their associated base stations, whose firmware offers phone book and SMS messaging functionality, commonly lack an interface for connecting to a computer to back up the data.
Some software such as the Discourse forum software offers a built-in ability for users to download their posts into an archive file.
Other software may operate locally but store user data in a proprietary format, thus causing vendor lock-in until the format is successfully reverse-engineered by third-party developers.
The right to data portability was laid down in the European Union's General Data Protection Regulation (GDPR) passed in April 2016. The regulation applies to data processors, whether inside or outside the EU, if they process data on individuals who are physically located within an EU member state.
Controllers must make the data available in a structured, commonly used, machine-readable and interoperable format that allows the individual to transfer the data to another controller. [27] [28]
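As an illustration of what "structured, commonly used, machine-readable" can mean in practice, the following sketch serialises the same hypothetical personal-data records in two such formats; the record fields are assumptions made for the example:

```python
import csv
import io
import json

# Hypothetical personal-data records held by a controller.
records = [{"field": "email", "value": "user@example.org"},
           {"field": "locale", "value": "de-DE"}]

# JSON: structured and self-describing.
as_json = json.dumps(records)

# CSV: tabular and widely supported.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["field", "value"])
writer.writeheader()
writer.writerows(records)
as_csv = out.getvalue()

# Another controller can parse either without proprietary tooling.
parsed = json.loads(as_json)
print(parsed[0]["field"])  # email
```

Interoperability in the GDPR sense goes further than the serialisation format alone: the receiving controller must also be able to interpret the fields, which is why common standards are encouraged.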
Earlier the European Data Protection Supervisor had stated that data portability could "let individuals benefit from the value created by the use of their personal data". [29]
The European-level Article 29 Data Protection Working Party held a consultation on this topic, in English, lasting until the end of January 2017.
Their guidelines and FAQ on the right to data portability contain this call for action:
WP29 strongly encourages cooperation between industry stakeholders and trade associations to work together on a common set of interoperable standards and formats to deliver the requirements of the right to data portability. This challenge has also been addressed by the European Interoperability Framework (EIF).
The French national data supervisor CNIL hosted a discussion in French. Participants offered opinions on how the legislation provides few benefits for companies but many for users. [30]
In April 2017, new guidelines were published on the Article 29 Working Party website. [31] In late 2020 the Data Governance Act was proposed by the Commission. [32]
In 2021 researchers, many of them French and Finnish, published a 46-page report covering the state of the art. [33]
In 2022 the European Commission published the Data Act. [34]
Although the United Kingdom voted to withdraw from the EU, it intends to incorporate much of the GDPR into its own legislation, including data portability, as "...the GDPR itself contains some noteworthy innovations – for instance… the introduction of a new right to data portability". [35] In November 2019, panelists at the Internet Governance Forum in Berlin reported that Article 20 GDPR is not actionable, either legally or technically. [36] In the post-Brexit UK, researchers are monitoring developments. [37] [38] [39]
Germany has called to strengthen the European Union's right to data portability using competition law. A commission was set up for the purpose of proposing improvements. [40]
Likewise, in Switzerland, which relates to the EU only on a bilateral basis and as an EFTA member state, there has been a trend in the same direction. The Swiss view was officially published in March 2018 as a PDF document. [41]
An association has proposed anchoring a right to data portability in the constitution of the Swiss Confederation. [42] A law that includes data portability was passed, as described in German [43] and in French. [44] The association partners with a cooperative called MIDATA.coop, which will offer users a place to store their data. [45]
A second association has issued its guideline on the topic. [46]
Over the longer term, Switzerland may have to account for the fact that data portability is part of the GDPR. Given that the GDPR raises compliance costs for EU-based companies, the EU is unlikely to tolerate a situation in which companies in third countries such as Switzerland are not held to the same standard, in order to keep competition fair. The legal terms involved are adequacy and reciprocity. [47]
California's Consumer Privacy Act (CCPA) of 2018 introduces data portability to the United States. [48]
Canada anticipates such a law, listing Transparency, Portability and Interoperability as Principle No. 4 of its Digital Charter. [49]
In India, data portability is included in the Personal Data Protection Bill 2019, about to become law, as section 26 in chapter VI.
In Brazil, data portability is included in the General Data Protection Law (LGPD) as its Article 18. [50]
In Australia, a Consumer Data Right has been proposed. [51]
Data portability is included in the new law. [52]
A right to data portability is enshrined in the new data protection law under clause 34. [53] However, the intentions behind the new law, its enforcement and relation to the government's new Identity management system have already been contested. [54]
It is always tricky for legislators to regulate at the right level of precision, as everyone understands that technology will evolve faster than the law. So far, only the European Union has formalized the expectations around data portability, requiring the data to be "in a structured, commonly used, machine-readable and interoperable format".
This touches on at least two distinct technical requirements for effective interoperability: the data must be machine-readable in a structured, commonly used format, and that format must be interoperable, i.e. interpretable by the receiving service.
Likewise, European researchers stress that there are both practical and legal gaps that the EU should fill. [55]
The list of these rights has grown. [56]
The data portability right is slightly different from the right of access to personal data; see the GDPR and the seventh item in the list cited immediately above. The right of access only mandates that the data subject gets to see their personal data. The old EU Data Protection Directive explicitly required the data in such cases to be provided in "intelligible" form, which has so far been interpreted as "human readable". This requirement is still somewhat present in the EU's General Data Protection Regulation, but only implicitly, in conjunction with a recital. Since the right to portability is mostly concerned with reuse by other services (i.e. most likely automated), it could be that both "human readable" and "raw format" would be inappropriate for effective data portability, and some intermediate level might need to be sought.
In addition, the GDPR limits the scope of data portability to cases where the processing is made on the basis of either consent of the data subject, or the performance of a contract.
The data portability right is related to the "right to explanation", i.e. the right to an explanation when automated decisions are made that have legal effect or significant impact on individual data subjects. One way to display an algorithm's logic is through a decision tree. This right, however, was found to be of limited use in an empirical study. [57] The right to explanation is related to the "right to not be evaluated on the basis of automated processing", shown as the last item in the list in Gabel / Hickman. [58] This includes decisions based on profiling. Such a right was included in the EU Data Protection Directive of 1995, but not much enforcement followed. An article in Wired emphasised the significance of the discussion. [59] The issue has been discussed by Bygrave [60] and by Hildebrandt, [61] who claimed this to be one of the most important transparency rights in the era of machine learning and big data. Contrary to Hildebrandt's high expectations in 2012, four years later, after many revisions, when the GDPR text was finalized, three other well-known authors contested whether a right to explanation still exists in the GDPR (see below).
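As an illustration of how a decision tree can make an automated decision explainable, the following hand-rolled sketch traces the path taken for a loan decision; the feature names and threshold are invented for the example:

```python
# Illustrative only: a two-level decision tree for a loan decision,
# returning both the outcome and the path of rules that produced it,
# which can serve as a human-readable explanation.
def decide(applicant):
    path = []
    if applicant["bankruptcy_last_year"]:
        path.append("bankruptcy declared last year")
        return "deny", path
    path.append("no recent bankruptcy")
    if applicant["income"] < 20000:
        path.append("income below 20000 threshold")
        return "deny", path
    path.append("income at or above threshold")
    return "approve", path

decision, reasons = decide({"bankruptcy_last_year": True, "income": 50000})
print(decision)            # deny
print("; ".join(reasons))  # bankruptcy declared last year
```

The traced path corresponds to the kind of explanation discussed above ("Credit bureau X reports that you declared bankruptcy last year..."), though real credit models are rarely this simple or this transparent.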
In the United States there was a description of related developments in a seminal book by law professor Frank Pasquale; [62] the relevant passages were reviewed by the Electronic Privacy Information Center (EPIC). [63] Even the U.S. Defense Advanced Research Projects Agency DARPA has an Explainable AI (XAI) program [64] cited critically by blogger Artur Kiulian. [65]
Several papers were published on these topics in 2016; the first, by Goodman / Flaxman, outlines the development of the right to explanation. [66] Pasquale does not think the approach goes far enough, as he stated in a blog entry at the London School of Economics (LSE). [67] In fact, LSE runs a whole series on Algorithmic Accountability, of which that was one entry in February 2016; other notable entries were by Joshua Kroll and Mireille Hildebrandt. [68]
Another 2016 paper, published by Katarinou et al., includes remarks on a right of appeal such that "individuals would have a right to appeal to a machine against a decision made by a human." [69]
A third 2016 paper, one co-authored by Mittelstadt et al., maps the literature and relates it to the GDPR on its pages 13–14. [70]
A fourth paper, one co-authored by Wachter, Mittelstadt and Floridi, refutes the idea that such a right might be included in the GDPR, proposes a limited 'right to be informed' instead and calls for the creation of an agency to implement the transparency requirement. [71] A further paper by Edwards and Veale claims such a right is unlikely to apply in the cases of the 'algorithmic harms' attracting recent media attention, and that insufficient attention has been paid to both the computer science literature on explanation and how other GDPR provisions, such as data protection impact assessments and data portability, might help. [72] Almost two years later a paper appeared that challenges earlier papers, especially Wachter / Mittelstadt / Floridi. [73]
On both sides of the Atlantic, there has been recent activity pertaining to this ongoing debate. Early in 2016, experts on artificial intelligence and UK government officials met in a number of meetings [74] and developed a Data Science Ethical Framework. [75] On November 7, 2016 an event was held in Brussels, organized by MEP Marietje Schaake in the European Parliament and described by danah boyd. [76] Only eleven days later, a conference on "Fairness, Accountability, and Transparency in Machine Learning" was held at New York University, where Principles for Accountable Algorithms and a Social Impact Statement for Algorithms were articulated and placed online for discussion. [77] By mid-December the IEEE came out with a document on "Ethically Aligned Design", with public comments invited until March 2017. [78] Later in 2017, data portability was analysed by professors of data protection as a central innovation of the new GDPR. [79]
Information privacy is the relationship between the collection and dissemination of data, technology, the public expectation of privacy, contextual information norms, and the legal and political issues surrounding them. It is also known as data privacy or data protection.
The Data Protection Directive, officially Directive 95/46/EC, enacted in October 1995, was a European Union directive which regulated the processing of personal data within the European Union (EU) and the free movement of such data. The Data Protection Directive was an important component of EU privacy and human rights law.
A privacy policy is a statement or legal document that discloses some or all of the ways a party gathers, uses, discloses, and manages a customer or client's data. Personal information can be anything that can be used to identify an individual, not limited to the person's name, address, date of birth, marital status, contact information, ID issue, and expiry date, financial records, credit information, medical history, where one travels, and intentions to acquire goods and services. In the case of a business, it is often a statement that declares a party's policy on how it collects, stores, and releases personal information it collects. It informs the client what specific information is collected, and whether it is kept confidential, shared with partners, or sold to other firms or enterprises. Privacy policies typically represent a broader, more generalized treatment, as opposed to data use statements, which tend to be more detailed and specific.
Personal data, also known as personal information or personally identifiable information (PII), is any information related to an identifiable person.
Information privacy, data privacy or data protection laws provide a legal framework for how to obtain, use and store the data of natural persons. The various laws around the world describe the rights of natural persons to control who is using their data. This usually includes the right to obtain details on which data are stored and for what purpose, and to request their deletion when that purpose no longer applies.
Privacy law is a set of regulations that govern the collection, storage, and utilization of personal information from healthcare, governments, companies, public or private entities, or individuals.
The Privacy and Electronic Communications Directive 2002/58/EC, otherwise known as the ePrivacy Directive (ePD), is an EU directive on data protection and privacy in the digital age. It represents a continuation of earlier efforts, most directly the Data Protection Directive. It deals with the regulation of a number of important issues such as confidentiality of information, treatment of traffic data, spam and cookies. This Directive has been amended by Directive 2009/136, which introduces several changes, especially concerning cookies, which are now subject to prior consent.
Axel Voss is a German lawyer and politician of the Christian Democratic Union of Germany who has been serving as a Member of the European Parliament since 2009 and became coordinator of the European People's Party group in the Committee on Legal Affairs in 2017. His parliamentary work focuses on digital and legal topics.
The General Data Protection Regulation, abbreviated GDPR, or RGPD is a European Union regulation on information privacy in the European Union (EU) and the European Economic Area (EEA). The GDPR is an important component of EU privacy law and human rights law, in particular Article 8(1) of the Charter of Fundamental Rights of the European Union. It also governs the transfer of personal data outside the EU and EEA. The GDPR's goals are to enhance individuals' control and rights over their personal information and to simplify the regulations for international business. It supersedes the Data Protection Directive 95/46/EC and, among other things, simplifies the terminology.
In the regulation of algorithms, particularly artificial intelligence and its subfield of machine learning, a right to explanation is a right to be given an explanation for an output of the algorithm. Such rights primarily refer to individual rights to be given an explanation for decisions that significantly affect an individual, particularly legally or financially. For example, a person who applies for a loan and is denied may ask for an explanation, which could be "Credit bureau X reports that you declared bankruptcy last year; this is the main factor in considering you too likely to default, and thus we will not give you the loan you applied for."
The ePrivacy Regulation (ePR) is a proposal for the regulation of various privacy-related topics, mostly in relation to electronic communications within the European Union. Its full name is "Regulation of the European Parliament and of the Council concerning the respect for private life and the protection of personal data in electronic communications and repealing Directive 2002/58/EC." It would repeal the Privacy and Electronic Communications Directive 2002 and would be lex specialis to the General Data Protection Regulation. It would particularise and complement the latter in respect of privacy-related topics. Key fields of the proposed regulation are the confidentiality of communications, privacy controls through electronic consent and browsers, and cookies.
The gathering of personally identifiable information (PII) refers to the collection of public and private personal data that can be used to identify individuals for various purposes, both legal and illegal. PII gathering is often seen as a privacy threat by data owners, while entities such as technology companies, governments, and organizations utilize this data to analyze consumer behavior, political preferences, and personal interests.
A data economy is a global digital ecosystem in which data is gathered, organized, and exchanged by a network of companies, individuals, and institutions to create economic value. The raw data is collected by a variety of factors, including search engines, social media websites, online vendors, brick and mortar vendors, payment gateways, software as a service (SaaS) purveyors, and an increasing number of firms deploying connected devices on the Internet of Things (IoT). Once collected, this data is typically passed on to individuals or firms, often for a fee. In the United States, the Consumer Financial Protection Bureau and other agencies have developed early models to regulate the data economy.
The right of access, also referred to as right to access and (data) subject access, is one of the most fundamental rights in data protection laws around the world. For instance, the United States, Singapore, Brazil, and countries in Europe have all developed laws that regulate access to personal data as privacy protection. The European Union states that: "The right of access occupies a central role in EU data protection law's arsenal of data subject empowerment measures." This right is often implemented as a Subject Access Request (SAR) or Data Subject Access Request (DSAR).
Sandra Wachter is a professor and senior researcher in data ethics, artificial intelligence, robotics, algorithms and regulation at the Oxford Internet Institute. She is a former Fellow of The Alan Turing Institute.
Regulation of algorithms, or algorithmic regulation, is the creation of laws, rules and public sector policies for promotion and regulation of algorithms, particularly in artificial intelligence and machine learning. For the subset of AI algorithms, the term regulation of artificial intelligence is used. The regulatory and policy landscape for artificial intelligence (AI) is an emerging issue in jurisdictions globally, including in the European Union. Regulation of AI is considered necessary both to encourage AI and to manage associated risks, but challenging. Another emerging topic is the regulation of blockchain algorithms, often mentioned alongside the regulation of AI algorithms. Many countries have enacted regulations of high-frequency trading, which is shifting due to technological progress into the realm of AI algorithms.
The Digital Services Act (DSA) is an EU regulation adopted in 2022 that addresses illegal content, transparent advertising and disinformation. It updates the Electronic Commerce Directive 2000 in EU law, and was proposed alongside the Digital Markets Act (DMA).
Michael Veale is a technology policy academic who focuses on information technology and the law. He is currently associate professor in the Faculty of Laws at University College London (UCL).
Digital self-determination is a multidisciplinary concept derived from the legal concept of self-determination and applied to the digital sphere, to address the unique challenges to individual and collective agency and autonomy arising with increasing digitalization of many aspects of society and daily life.
Automated decision-making (ADM) involves the use of data, machines and algorithms to make decisions in a range of contexts, including public administration, business, health, education, law, employment, transport, media and entertainment, with varying degrees of human oversight or intervention. ADM involves large-scale data from a range of sources, such as databases, text, social media, sensors, images or speech, that is processed using various technologies including computer software, algorithms, machine learning, natural language processing, artificial intelligence, augmented intelligence and robotics. The increasing use of automated decision-making systems (ADMS) across a range of contexts presents many benefits and challenges to human society requiring consideration of the technical, legal, ethical, societal, educational, economic and health consequences.