Publications

AI-Induced Deskilling in Medicine: A Mixed Method Literature Review for Setting a New Research Agenda

Natali, C., Marconi, L., Dias Duran L.D., Miglioretti, M., Cabitza, F.

(In review)

Abstract

The integration of Artificial Intelligence (AI) in healthcare is reshaping clinical practice, offering both opportunities for enhanced decision-making and risks of skill degradation among medical professionals. This growing impact calls for a comprehensive evaluation of its effects on medical expertise. This study presents a mixed-method literature review, combining systematic analysis with narrative synthesis to examine AI-induced deskilling and upskilling inhibition—the erosion of medical expertise and the reduction of opportunities for skill acquisition due to AI-driven decision support systems. Anchoring the discussion in the core medical competencies outlined by the Federation of Royal Colleges of Physicians of the UK—Practical Assessment of Clinical Examination Skills (PACES-MRCPUK), the systematic review identifies key vulnerabilities in physical examination, differential diagnosis, clinical judgment, and physician-patient communication, while the narrative review explores broader themes related to Human-AI Interaction and the Impact of AI on Human Skills in Organizations. In response to concerns about the Second Singularity—a scenario in which decision-making autonomy is increasingly ceded to AI, weakening human oversight—this review advocates for a research agenda that prioritizes longitudinal studies, real-time monitoring of AI’s impact, and the development of frameworks to mitigate skill erosion, ensuring the preservation of professional autonomy and the safeguarding of the irreplaceable elements of human judgment in medicine.

Machine Learning in Medical Diagnosis: A Framework for a Normative Evaluation of Chances and Risks

Leslye Denisse Dias Duran

J.B. Metzler (Springer)

Abstract

This book seeks to navigate between the optimism that has arisen from the promise of the potential of machine learning (ML) in healthcare, and the lack of clarity about what realistic risks and benefits we can foresee. Its main aim is to develop a relational, rights-based normative approach to evaluating the distribution of burdens and benefits of implementing ML in medical diagnosis. This framework, called the "Ecosystem of Moral Constellations", assumes that every person has an equal claim to the fundamental rights necessary to lead one’s life, but recognizes that there may be conflicting interests that risk violating or infringing the rights of an individual or individuals, and that therefore an assessment of these tensions requires a situational prioritization of certain rights over others. This framework proposes to consider the normative relevance of relationships at different points of moral engagement to assess the potential tensions between these burdens and benefits of these technologies. The author argues that decisions about the implementation of AI systems require more than an assessment of technical feasibility. Instead, it is imperative to consider the different normative goals and interests of the actors involved, the material capabilities of the tools, and the role they should play in the clinical workflow.


Deskilling of medical professionals: an unintended consequence of AI implementation?

Leslye Denisse Dias Duran

Giornale di filosofia, 2021

Abstract

In the last decade, the development of applications powered by artificial intelligence algorithmic architectures, especially machine learning, has increased exponentially thanks to heightened computational power, the availability of digital data in large amounts, and more mature and sophisticated algorithmic models. As a result, there has been a surge in academic, scientific, and journalistic publications concerned with the benefits that AI could bring to a variety of sectors, including healthcare [1]. So far, we have seen the rise of models capable of making highly accurate predictions in radiology [2], detecting diabetic retinopathy [3], diagnosing the presence or absence of tuberculosis [4], as well as helping in diagnosing breast cancer [5] and detecting diabetes [6], among many other projects. Some research, fascinated by these results and other recent studies, suggests that AI-powered applications have the potential to improve patient outcomes by a range of 30% to 40% while decreasing treatment costs by up to 50% [7]. However, hand-in-hand with the opportunities come risks and possible unintended consequences. Much has been said about problems of bias and discrimination [8]; the black-box issue stemming from the opacity of highly accurate models [9] and their lack of interpretability and transparency [10]; and the potential risks to privacy [11] and informed consent, among others. These risks belong to the phases of conception and design of AI models; further down the line, however, there are also risks associated with deployment and supervision. In other words, risks present at the point when applications encounter direct and indirect end users. In the case of healthcare, it is not yet clear whether direct users are medical professionals, medical administrative staff, or patients themselves. For the purpose of this paper, I take physicians and other highly skilled medical personnel (radiology clinicians, nurses, etc.) as those whose skills and employment opportunities may be negatively affected, and will attempt to consider the ethical implications of technical and moral deskilling as an unintended consequence of the application of machine learning in healthcare.

The Ghost in the Machine has an American accent: value conflict in GPT-3

Rebecca L. Johnson, Giada Pistilli, Natalia Menéndez-González, Leslye Denisse Dias Duran, Enrico Panai, Julija Kalpokiene, Donald Jay Bertulfo

arXiv preprint, March 2022

Abstract

The alignment problem in the context of large language models must consider the plurality of human values in our world. Whilst there are many resonant and overlapping values amongst the world's cultures, there are also many conflicting, yet equally valid, values. It is important to observe which cultural values a model exhibits, particularly when there is a value conflict between input prompts and generated outputs. We discuss how the co-creation of language and cultural value impacts large language models (LLMs). We explore the constitution of the training data for GPT-3 and compare that to the world's language and internet access demographics, as well as to reported statistical profiles of dominant values in some Nation-states. We stress tested GPT-3 with a range of value-rich texts representing several languages and nations; including some with values orthogonal to dominant US public opinion as reported by the World Values Survey. We observed when values embedded in the input text were mutated in the generated outputs and noted when these conflicting values were more aligned with reported dominant US values. Our discussion of these results uses a moral value pluralism (MVP) lens to better understand these value mutations. Finally, we provide recommendations for how our work may contribute to other current work in the field.