ORIGINAL RESEARCH
Ethics and legal regulation of using large databases in medicine
1 Pirogov Russian National Research Medical University, Moscow, Russia
2 Scientific Research Institute of Systemic Biology and Medicine of the Federal Service for Surveillance on Consumer Rights Protection and Human Wellbeing, Moscow, Russia
3 Federal Medical and Biological Agency, Moscow, Russia
4 Russian Medical Academy of Continuous Professional Training, Moscow, Russia
5 Moscow State Legal University named after Kutafin OE, Moscow, Russia
Correspondence should be addressed: Natalia V. Orlova
ul. Ostrovityanova, 1, Moscow, 117997, Russia; vrach315@yandex.ru
Financing: The work was performed by the authors at the Federal Budgetary Institution of Science Research Institute of Health and Safety Management of Rospotrebnadzor within the framework of the state assignment ‘Development of methods for molecular genetic diagnostics for quantification of sanogenesis in healthy people’, code of the scientific topic ‘Norma’, number of state registration of research, development work in Unified State Information System for Accounting Research, Development and Technological Works for Civil Use 122030900062–5.
Author contribution: Orlova NV — analysis of scientific data, review of publications related to the topic of the article, making an abstract, writing an article text; Suvorov GN, Gorbunov KS — developing the article design, article editing.
SINGLE INFORMATION DATABASE IN RUSSIAN HEALTHCARE
In the Russian Federation, the transition to digital medicine is proceeding at an accelerated pace. The strategy for the digital transformation of healthcare includes the formation of a single digital circuit, medical platform solutions adopted at the federal level, personal medical assistants and artificial intelligence. The new projects aim at a unified approach to the provision of medical care, the introduction of a control system, statistical accounting and analysis, and the use of electronic documents to manage the healthcare system. The planned wide application of information technologies in medicine should comply with ethical standards and rest on a legislative basis.
Resolution of the Government of the Russian Federation No. 140 of February 9, 2022, ‘About a single state information system in healthcare (SSISH)’, was issued to improve information technologies [1]. The SSISH should connect all regional medical organizations. Its functions include processing and storing medical documentation and health-related data, and providing analytical information based on anonymized personal data for subsequent use in statistics and research and for developing and applying solutions driven by artificial intelligence. The SSISH should bring together information about the provision of drugs to citizens, including those entitled to preferential provision of medicines and medical products, and should include federal databases of medical documents on deaths and births, structured electronic medical cards, and the centralized systems ‘Laboratory research’ and ‘Central archive of medical images’.
The single information database in healthcare will improve interaction between medical institutions and increase the accessibility and effectiveness of medical care, including care delivered with artificial intelligence and through telemedicine consultations.
LARGE DATABASES IN MEDICINE
The collection and archiving of medical data have a long history. Accumulating big data in healthcare by means of information technologies greatly expands the possibilities of their use. Such data can become an effective tool both for practical medicine and for scientific purposes: examining the prevalence and pathogenesis of diseases, revealing risk factors and developing new effective methods of treatment. The use of medical cards, including electronic ones (EMC), and other medical documentation in Big Data is associated with a number of difficulties, such as improperly completed EMC, unnecessary duplication of data, incomplete data, the absence of a single system of EMC management, and a lack of structured records. Organizational problems are related to outdated technologies in individual medical organizations and the lack of unified regulatory and reference information.
Implementation of the healthcare digital transformation project, with the formation of the SSISH, is aimed at eliminating these problems, which will make it possible to use the database in scientific research. Information technology already makes it possible to use the EMC today. Retrieving data from the EMC for machine processing requires special preparation: extracting data, including from unstructured records, by means of artificial intelligence and other technologies, followed by cleaning, transforming, filtering, separating, translating, merging, sorting and checking the data [2].
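Some of the preparation steps listed above (cleaning, transformation, filtering, merging of duplicates, sorting and checking) can be illustrated with a minimal sketch. It is not the procedure described in [2]; the field names `patient_id`, `diagnosis` and `visit_date` and the function `prepare_records` are hypothetical and chosen only for illustration.

```python
from datetime import date


def _text(value):
    """Cleaning helper: turn a missing value into '' and strip whitespace."""
    return str(value).strip() if value is not None else ""


def prepare_records(raw_records):
    """Clean, check, de-duplicate and sort EMC-like records (illustrative only)."""
    prepared = {}
    for rec in raw_records:
        # Cleaning: strip whitespace, normalise the case of the diagnosis code.
        patient = _text(rec.get("patient_id"))
        code = _text(rec.get("diagnosis")).upper()
        visit = _text(rec.get("visit_date"))
        # Filtering: skip records that lack a required field.
        if not patient or not code or not visit:
            continue
        # Transformation and checking: parse the ISO date, reject invalid
        # or future dates.
        try:
            visit_date = date.fromisoformat(visit)
        except ValueError:
            continue
        if visit_date > date.today():
            continue
        # Merging: identical (patient, code, date) entries collapse into one.
        prepared[(patient, code, visit_date)] = {
            "patient_id": patient,
            "diagnosis": code,
            "visit_date": visit_date,
        }
    # Sorting: stable order for downstream machine processing.
    return sorted(
        prepared.values(),
        key=lambda r: (r["patient_id"], r["visit_date"], r["diagnosis"]),
    )
```

In this sketch a record is dropped rather than repaired when a check fails; a real pipeline would more likely log such records for manual review.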
In modern medicine, Big Data are most in demand in bioinformatics and biomedicine. Genome sequencing projects cover thousands of people, animals, insects and microorganisms. Large databases expand the diagnostic capabilities for interpreting the results of massively parallel sequencing. The research results are applied to determine the risk of diseases, in diagnostics (including prenatal testing), in predicting the course of diseases, and in producing qualitatively novel drugs. Technologies based on large databases are also used to study the microbiome. The databases hold billions of short reads from which compositional and functional profiles of hundreds and thousands of microbes within the microbiome can be built [3].
In pharmacology, the use of big data and blockchain makes it possible to significantly increase the number of research centers involved and to reduce the duration of clinical trials of medicines. In systemic biology and medicine, Big Data make it possible to detect markers that predict various diseases. To search for biomarkers, models of multiple effects are used to detect their association with the immunome and epigenetics (microRNA targets, DNA methylation and telomere length). The effect of the environment on the body, depending on genetic status, is assessed [4]. Big data and artificial intelligence also expand the capabilities for predicting catastrophic events, including epidemics [5].
Big data are analyzed and used in every sphere of life. It is expected that in the future one third of the world's data will relate to healthcare. Apart from EMC, large databases draw on medical registries of various diseases (HIV infection, tuberculosis, oncological diseases, etc.), databases of cohort and clinical trials, biobanks and panomics. Existing databases already include thousands to millions of people. For instance, the Danish DOC*X cohort includes social, economic and health-related data on over 6 million adults and 1.2 million children; the genomic data of the UK Biobank cover nearly half a million British citizens; and the Multiparameter Intelligent Monitoring in Intensive Care II (MIMIC-II) database covers about 30,000 intensive care unit patients.
ETHICAL ISSUES OF USING LARGE DATABASES
Medical Big Data include demographic data, the results of laboratory and instrumental studies, and information about the treatment provided. The use of patient-related data poses ethical questions for researchers. Biomedical data constitute confidential medical information protected by law. The right to confidentiality means that personal data submitted by patients will not be disclosed without their permission, except in cases defined ethically and legally. Healthcare information technologies include measures to protect data integrity and confidentiality. In accordance with regulatory documents, data depersonalization is one of the basic requirements of the SSISH [6]. This significantly limits the ability to interpret the results obtained, including assessment of the role of the environment, the hygienic characteristics of living conditions, occupational risk factors, and features of the healthcare system in the region of residence.
Epidemiological, ecological, geographical, climatic and demographic data, social network analysis, statistical data from medical institutions, and economic and sociological indicators are currently used to examine the effect of various factors on changes in biomarkers. These data are not covered by the Medical Secrecy Law. However, analysis of large databases makes it possible to identify individuals with particular characteristics from indirect values. For instance, data on the provision of medicines reveal patients with HIV, tuberculosis, orphan diseases, etc. Digital footprint technology intrudes on privacy. Social media monitoring makes it possible to predict the risk of suicide and unlawful acts. Leaks of data from registries of patients with disabilities, genetic disorders, mental illness, alcohol abuse or drug addiction may lead to discrimination in employment, insurance and lending, and to other negative social consequences. Databases are of interest both to commercial entities and to state authorities.
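How indirect values can single out a person in a depersonalized data set can be shown with a small check in the spirit of k-anonymity. The technique is named here only as an illustration and is not taken from the cited sources. The records are invented, and the field names `district`, `birth_year` and `drug_class` are hypothetical.

```python
from collections import Counter


def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of indirect values."""
    return Counter(tuple(rec[q] for q in quasi_identifiers) for rec in records)


def smallest_group(records, quasi_identifiers):
    """Size of the smallest group (the k of k-anonymity); 0 for no records."""
    sizes = group_sizes(records, quasi_identifiers)
    return min(sizes.values()) if sizes else 0


def singled_out(records, quasi_identifiers):
    """Records whose combination of indirect values is unique in the data."""
    sizes = group_sizes(records, quasi_identifiers)
    return [
        rec for rec in records
        if sizes[tuple(rec[q] for q in quasi_identifiers)] == 1
    ]


# Invented, name-free records used purely for illustration.
records = [
    {"district": "A", "birth_year": 1980, "drug_class": "antihypertensive"},
    {"district": "A", "birth_year": 1980, "drug_class": "antihypertensive"},
    {"district": "A", "birth_year": 1975, "drug_class": "antiretroviral"},
    {"district": "B", "birth_year": 1990, "drug_class": "antihypertensive"},
    {"district": "B", "birth_year": 1990, "drug_class": "antihypertensive"},
]

exposed = singled_out(records, ["district", "birth_year"])
```

Here `exposed` holds the single record from district A with birth year 1975. Although the data set contains no names, anyone who knows a person's district and year of birth learns that this person receives antiretroviral therapy.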
An analysis by Ienca M. et al. of the ethical problems that arise when large databases are used has shown that personal privacy and confidentiality are the dominant problem (n = 146), followed by informed consent (n = 49), honesty and justice (n = 34), trust (n = 23), right of ownership, etc. (figure) [7].
The use of large databases has highlighted a number of new ethical issues. Analyzing how values in large databases interact with ethnic features, geographical location and environmental pollution makes it necessary to protect group identity as well. Data about the predisposition of a large group of people united by certain attributes to mental or genetic diseases, gender identity, drug addiction or juvenile delinquency can cause problems for individuals in various spheres of life. The Council of Europe proposed recognizing the right ‘not to be subjected to profiling’ as a new right to prevent discrimination against certain persons or groups of persons [8].
Big data make it possible to reveal patterns linking environmental conditions, way of life and morbidity. The ethical risks of using these data lie not only in developing recommendations but also in urging changes in the way of life at the state level, which restricts the individual right to privacy. Interpretation of the data obtained from analysis of large databases constitutes another ethical issue. Because large databases include a large number of patients, the level of evidence is high. However, the results can depend on the experience and good faith of the institutions and persons who analyze and interpret the databases. Results obtained from analysis of large databases are not conclusive on their own; they require additional verification and supplementation with data from other trials [9].
LEGAL REGULATION OF USING LARGE DATABASES
Issues of biomedical ethics in the use of large databases are addressed in documents of various countries and professional associations:
- Big Data: Seizing Opportunities, Preserving Values (Executive Office of the President, U.S., 2014) [10].
- The collection, linking and use of data in biomedical research and health care: ethical issues, Nuffield Council on Bioethics (Great Britain, 2015) [11].
- Big Data and Health: Data Sovereignty as the Shaping of Informational Freedom (German Ethics Council, 2017) [12].
- IMIA Code of Ethics for Health Information Professionals and AMIA Code of Professional and Ethical Conduct [13, 14].
Ethical issues of using personal medical data are also documented in international instruments. In 2016, the World Medical Association published a declaration on ethical considerations regarding health databases and biobanks [15]. In the same year, Regulation (EU) 2016/679 of the European Parliament and of the Council on the protection of natural persons with regard to the processing of personal data and on the free movement of such data was adopted in Europe [16]. In 2017, the Report of the International Bioethics Committee of UNESCO on Big Data and Health was adopted [8].
ALTERNATIVE SOLUTIONS TO ETHICAL ISSUES OF USING LARGE DATABASES
Several algorithms for medical data security have been proposed to address confidentiality-related ethical issues. They aim at an information system for secure storage and at programs that exclude any link between real and pseudo-identification data. Encryption (key, algorithm) of data that could identify a patient, together with a password-based access code, is used to improve security. A further safeguard is the use of ‘empty pseudo-identifiers’, which improve security by neutralizing the possibility of matching the data to a real person [17].
Pseudonymization is another proposed method of data protection: identification data are transformed and replaced by a specifier that cannot be associated with the identification data without reference to a certain password. Confidentiality requires that personal information not be stored together with pseudonymized data, so two databases have to be formed: one with personal data and the other with pseudonymized information.
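The two-database scheme can be sketched as follows. This is one possible realization under stated assumptions, not the method of [17]. A keyed hash (HMAC-SHA256 from the Python standard library) plays the role of the ‘certain password’, and the function names, field names and sample record are hypothetical.

```python
import hashlib
import hmac


def make_pseudonym(identifier: str, secret_key: bytes) -> str:
    """Derive a pseudonym that cannot be recomputed without the secret key."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def split_record(record: dict, identifying_fields: list, secret_key: bytes):
    """Split one record into a row for each of the two separate databases."""
    identifier = "|".join(str(record[f]) for f in identifying_fields)
    pseudonym = make_pseudonym(identifier, secret_key)
    # Database 1: personal data, kept under strict access control.
    identity_row = {"pseudonym": pseudonym}
    identity_row.update({f: record[f] for f in identifying_fields})
    # Database 2: pseudonymized research data without identifying fields.
    research_row = {"pseudonym": pseudonym}
    research_row.update(
        {k: v for k, v in record.items() if k not in identifying_fields}
    )
    return identity_row, research_row


# Invented sample record; the key would normally come from a key store.
key = b"demo-key-for-illustration-only"
record = {"name": "Ivan Ivanov", "policy_no": "0000", "diagnosis": "I10"}
identity_row, research_row = split_record(record, ["name", "policy_no"], key)
```

The research row carries only the pseudonym and the clinical fields. Linking it back to a person requires either the identity database or the secret key, so both must be stored separately from the research data.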
A wider model protects access to the database by encrypting the storage and issuing access keys to service personnel. However, even such ‘access control’ models can be bypassed by technical means or by people operating inside the system [18].
Every country has a regulatory agency that takes patients’ interests into account and regulates the market of medical services: Roszdravnadzor in Russia and the Food and Drug Administration in the U.S. In Russia, confidentiality protection significantly limits access to medical data. Biomedical data belong to the patient. Data access and processing are possible only with the permission of the person who signed the informed consent form. The form names the company to which permission is granted and the purposes of data collection and processing.
The use of large databases currently faces a dilemma between the legislative protection of patients’ rights, on the one hand, and the resulting decrease in the analytical capabilities of data use, on the other [19]. Depersonalizing data and allowing their use without a patient’s consent could be a solution to this problem. Today, Russian legislation allows depersonalized data to be used only within the medical institution that obtained permission, and only for the purposes stated in that permission. In the long term, using depersonalized data without a patient’s consent would require a change in legislation, which would mean transferring data ownership to the state. An authority responsible for data storage and for regulating data usage would have to be formed.
Secondary medical data are usually studied when a patient’s consent has been obtained or when the data are completely anonymous. Today, informed consent is commonly granted once (at hospitalization, at inclusion in a clinical trial, etc.). This limits the use of databases for retrospective assessment or for other research purposes. The practice of Dynamic Consent, adopted in some countries, means periodic renewal of consent to data use: patients can give repeated consent, which may differ from the first one. This could facilitate cooperation with patients and expand their ability to control the use of their data. The practice of Wide Consent means that data can be used in more than one clinical trial if the projects belong to a certain area or direction of research [8].
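The difference between one-time, dynamic and wide consent can be expressed as a simple data model. The sketch below is an illustration only and does not reproduce any national consent system; the class `Consent`, the functions `permits` and `renew`, and all field names are hypothetical.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Consent:
    patient_id: str
    kind: str                 # "one_time", "dynamic" or "wide"
    scope: str                # a study id, or a research area for "wide"
    granted_on: date
    valid_until: Optional[date] = None  # dynamic consent expires


def permits(consent: Consent, study_id: str, study_area: str, on: date) -> bool:
    """Check whether the consent covers the given study on the given date."""
    if on < consent.granted_on:
        return False
    if consent.valid_until is not None and on > consent.valid_until:
        return False
    if consent.kind == "wide":
        # Wide consent covers any study within the agreed research area.
        return consent.scope == study_area
    # One-time and dynamic consent cover only the named study.
    return consent.scope == study_id


def renew(consent: Consent, on: date, valid_until: date,
          scope: Optional[str] = None) -> Consent:
    """Dynamic consent: the patient renews and may change the scope."""
    return replace(consent, granted_on=on, valid_until=valid_until,
                   scope=scope if scope is not None else consent.scope)
```

In this model a dynamic consent whose `valid_until` date has passed denies access until the patient renews it, while a wide consent for one research area admits any study in that area without a new form.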
CONCLUSION
Big data analysis makes it possible to reveal patterns in the influence of various factors on biosystems; it expands the capabilities of clinical trials, medical education and clinical practice, and improves the identification and prevention of diseases, the assessment of treatment effectiveness, and prediction. Healthcare big data analysis opens up a huge potential for scientific research, on the one hand, and increases the risk of unauthorized access to patients’ personal data, on the other [20].
Ethics of using large databases should be based on solving the following tasks:
- strengthening control over data storage and usage;
- respecting the privacy of individuals and groups of individuals having the same profiles;
- informed consent to data transfer and proper practice in the ways data are obtained;
- responsibility of medical workers, researchers, managers and computer specialists for their professional activity when dealing with large databases.