Guest Column

Whose data is it anyway?

ANURAG AGRAWAL

Sep 2022
from Shaastra :: vol 01 issue 05 :: Sep - Oct 2022

Rather than restricting access to health data, we need to see it as part of digital public goods for sanctioned societal use.

The rise of machine learning and artificial intelligence is being driven by the large amounts of easily accessible digital data in a digitalised and connected world. With some exceptions, much of the fundamental computer science, such as deep learning, that underpins current AI has been around for decades. Alongside continuous improvements to hardware, the post-Internet increase in the availability of large volumes of digital data was the real game-changer. Because data is so valuable, many people have started referring to it as the new oil, and ownership of this 'data-oil' has sparked controversy.

In most sectors, in the absence of any social contract, data is considered to belong to those who generate it – or pay to generate it. Beyond any compulsory disclosures, the owner of the data has the right to govern access, and to monetise such rights by selling ownership of, or access to, the data. In this context, data is a tradeable commodity, just like oil, which possibly prompted the first invocation of that analogy by Clive Humby in 2006. This view – suggesting extractive exploitation of a trapped fungible asset – proves superficial when examined in detail. Data, unlike physical commodities, is non-fungible, duplicatable, shareable, and indefinitely usable by multiple people without being consumed. Importantly, when data touches humans, such as in healthcare, the absence of an evident social contract cannot be taken as evidence of the absence of such a contract.

HEALTH (DATA) IS WEALTH

Healthcare is a high-value economic sector that is growing rapidly, and thus high value is being placed on health data. The high sensitivity of health data, from the perspective of privacy, social justice and ethics, makes it obvious that social contracts must exist, even if not explicitly stated. A high degree of social activism in this sector has led to a commonly subscribed view that health data belongs to the patients. That position is being enshrined in governance and law (see the National Health Portal of India, bit.ly/3QDNvr4). This view, although superficially attractive, is as unfounded in facts as the alternative view of data as belonging to the generator or the payor, when examined through the lens of unwritten social contracts governing health.

The primary understanding between the healthcare system and the patient, while generating such data irrespective of the payor, is that the data would be used for the benefit of the patient. Patients, if asked, voice a preference for data practices that maximise the health benefits while minimising any risks. While they would certainly prefer to be part of any data dividend that accrues from the monetisation of such data, the foremost unwritten social contract here is one that ensures that data is used in a way that maximally benefits health. Take, for example, a genome sequencing study of a patient who wants health guidance. Such a study has no value to the individual patient unless it is compared with the data of other patients with known health trajectories. Implicit in the generation of health data is a solidarity of purpose with other people for whom such data is generated – for the data to be used in a manner that benefits the collective, while not harming the individual. Such data solidarity extends in both directions – promoting beneficial use while dissuading misuse.

The concept of solidarity is not only compatible with privacy and protection from harm, but is also likely to be more robust than current practices that emphasise consent. The Lancet and Financial Times Commission for governing health futures in a digital world recognised that minimal consent-based architectures, where people are deemed to have permitted the use of their data while hastily checking a box on a pop-up window, are inadequate. Getting that box checked is not a difficult task for the tech giants, and misuse is clearly rampant.

FINDING THE BALANCE

We need a lean but balanced data governance system. Rather than trying to restrict access to health data by complex regulations and consent architectures, we need to see it as part of digital public goods for sanctioned societal use. Building the necessary trust will require technological innovation, such as federated learning systems where algorithms and parameters move, but not the actual data. Deidentification of data is far more difficult than is commonly realised, and access control as the principal security mechanism tends to veer towards either inefficiency or insufficiency. Irrespective of technology, we need a clearer articulation and understanding of stakeholder needs and a society-centred design.
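
For readers who want to see what "parameters move, but not the actual data" could look like in practice, here is a minimal sketch of one common approach, federated averaging. It is an illustration under stated assumptions, not a description of any deployed health system: the two hospitals, their records, the one-parameter model y = w * x, and every function name are hypothetical.

```python
from typing import List, Tuple

# Hypothetical (x, y) records; in this sketch they never leave their site.
Site = List[Tuple[float, float]]


def local_update(w: float, records: Site, lr: float = 0.01, steps: int = 20) -> float:
    """Run a few gradient-descent steps on one site's private records."""
    for _ in range(steps):
        # Gradient of the mean squared error of the model y = w * x.
        grad = sum(2 * (w * x - y) * x for x, y in records) / len(records)
        w -= lr * grad
    return w


def federated_round(w: float, sites: List[Site]) -> float:
    """Average the locally updated parameters, weighted by record count.

    Only the parameter w travels between a site and the coordinator.
    """
    total = sum(len(s) for s in sites)
    return sum(local_update(w, s) * len(s) for s in sites) / total


def train(sites: List[Site], rounds: int = 50) -> float:
    """Repeat federated rounds, starting from an uninformed parameter."""
    w = 0.0
    for _ in range(rounds):
        w = federated_round(w, sites)
    return w


# Assumed toy data: both hospitals hold records that follow y = 3x.
hospital_a: Site = [(1.0, 3.0), (2.0, 6.0)]
hospital_b: Site = [(3.0, 9.0), (4.0, 12.0), (5.0, 15.0)]
w_final = train([hospital_a, hospital_b])
```

In this toy setting the shared parameter settles close to 3, the slope both hospitals' records follow, although neither hospital ever sees the other's records. Real systems would still need safeguards beyond this, since shared parameters can themselves leak information about the underlying data.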

Who owns the data? We all do, but with different rights and responsibilities. Responsible flow of data will benefit all.
