Looking for love and sex on Tinder, soldiers endanger national security

This article in 1 minute

What’s the news?

FTM identified hundreds of soldiers from nations including Germany, the UK and the US on the dating app Tinder. Many soldiers’ personal details, including home addresses, can be found, and their travel movements can be tracked via the application.

The investigation found and virtually followed army personnel across major military facilities from the US Ramstein Air Base in Germany to a NATO complex in Lithuania.

Why is this important?

Intelligence services worldwide have been warning for years about “honey traps” via dating apps such as Tinder, in which spies establish contact with people who have access to sensitive data and deceive them in order to extract the information.

The use of these apps has not been comprehensively regulated by defence ministries of major nations, and guidelines on online privacy for soldiers are thin.

FTM found that military personnel share a lot of information about their work on their dating profiles. And Tinder’s security and location settings make it pretty easy to follow their movements, in what experts said could be a “threat to national security”.

But the problem doesn’t stop there. The same risks apply to all Tinder users, from members of the public to police officers and diplomats.

How did FTM investigate this?

FTM created three fake Tinder accounts and used a programme to change the locations of these profiles and virtually place them near military bases. By doing so, FTM gathered more than 100,000 Tinder profiles, and identified at least 400 soldiers in a short timespan.

These profiles could be monitored and their movements tracked even when there was no match because FTM’s profiles had been “turned down” by the soldiers being monitored.

Using trilateration, a technique in which FTM digitally moved its accounts to different points and measured the distance to a given soldier, it was possible to home in on precise locations. In combination with information from public sources, FTM was often able to easily identify the soldiers on Tinder, verify or retrieve their home addresses and ascertain additional information about their work, hobbies, (sexual) preferences and interests.

This article is part of an ongoing series.

Some people look for love – or perhaps just sex – in unusual places.

Be it a US Air Force base in Germany, a NATO military complex in Lithuania or in the vicinity of a former UN peacekeeping facility in Mali.

The problem is that these singletons are soldiers scrolling through Tinder while on duty, either unaware of or unconcerned about the fact that the dating app is indirectly giving away their personal data, current location and even their movements between military bases, an investigation by Follow the Money has revealed.

“This is a threat to national security,” said Dutch cybersecurity consultant Matthijs Koot.

By creating three fake accounts on Tinder – under the names Jacky, Naomi and Daisy – FTM identified the dating profiles of hundreds of military personnel at bases across Europe who came from countries including France, Germany, the Netherlands, the UK and the US.

Using the bios and photos on the soldiers’ Tinder profiles in combination with information from public sources such as LinkedIn and Instagram, FTM managed to find out details such as their job titles, home addresses, and reams of other material about their personal lives.

Thanks to Tinder’s lax security settings, the locations of the fictional Jacky, Naomi and Daisy could be constantly changed and digitally moved around various military bases. Because the app shows the distance between its users, FTM was not only able to establish the whereabouts of a given soldier, but also monitor their travel over many days or even weeks.

This did not require a match (both sides swiping right); simply liking the profiles was sufficient.

Take Michael, for example. FTM’s fake account Naomi came across his Tinder profile in mid-November while her location was set near the US Ramstein Air Base in southwestern Germany.

Michael’s bio showed that he was in his mid-thirties and worked in “ballistic missile defense”. Naomi liked him and although there was no match, FTM could still track Michael’s movements through the app’s location data.

Within the space of 10 days or so, the US soldier travelled from Ramstein to Frankfurt to London to northern Spain – near Santander – before returning to the base in Germany. FTM also discovered Michael’s date of birth, his past working as a pub crawl promoter in South Korea’s capital of Seoul, and information about his father, who also served in the US military.

From Germany and Estonia to Mali

Jacky, Naomi and Daisy found many other soldiers on Tinder across Germany – from the US Army’s European headquarters in Wiesbaden to the Grafenwöhr Training Area and the Büchel Air Base.

Looking closer to Russia, FTM’s three profiles also identified military personnel at the Rukla Military Base in Lithuania, where NATO’s Enhanced Forward Presence Battlegroup is based, and around the Ämari Air Base in Estonia. FTM tracked two members of the Royal Netherlands Air Force travelling to this base in northern Estonia, where the Netherlands is part of an active mission to protect the eastern flank of NATO’s airspace from Russian threats.

In one extreme case, a US soldier identified only as “M” was tracked travelling from the Ramstein Air Base to the West African nation of Mali, where he stayed for a few days near a former UN peacekeeping base, before returning to Germany.

Intelligence agencies worldwide have warned for years about how malicious powers misuse social media and applications that process users’ location data, including dating apps.

The age-old tactic of the honey trap – whereby unsuspecting targets in positions of power or with access to sensitive data are approached by attractive individuals and deceived through flattery or other means – has evolved from smoky bars and glitzy nightclubs to the likes of Facebook and Tinder.

However, Tinder’s security flaws, along with limited online privacy guidelines from the defence ministries of major nations such as the Netherlands, Germany, and the US, suggest that the risk to army personnel is going largely unchecked, cybersecurity experts told FTM.

And it’s not just individuals who are in danger, but entire states, said the Dutch expert Koot.

“If you can follow the movements of soldiers in such a simple, structured and long-term way, that is a major problem,” he said.

“For example, if there is suddenly a lot more app activity on a military base, that could mean that they are scaling up or that an exercise is imminent. In parts of the world, such strategic information can be crucial,” Koot added.

FTM shared its findings with Tinder and asked for a response, but received no reply.

Research from Belgian university KU Leuven (see the next section) shows how Tinder masks the exact whereabouts of its users so that only an approximate location can be found through the data it sends to its servers. But in sparsely populated or remote areas – such as military bases – this still allows users’ locations to be identified, as FTM did in this investigation.

Security flaws

Dating apps are increasingly being used by spies seeking to ensnare valuable targets and extract sensitive information, according to top intelligence and security services.

In January 2020, a representative of the US Department of Justice said in an interview that such apps could be weaponised by individuals and foreign intelligence agencies.

Two years later, in February 2022, the head of Australia’s Security Intelligence Organisation (ASIO) said the agency was tracking suspicious approaches from foreign spies on dating apps including Tinder.

And in April 2023, Germany’s military intelligence agency reported that Russian spies were using Tinder to try to obtain confidential information about the war in Ukraine from German politicians and army personnel.

In Tinder’s case, the location data it shares is likely why it is seen as a useful tool for spies.

Every time someone uses the app, their location is updated. While Tinder does not reveal the exact whereabouts of its users, an individual browsing the app can see the approximate distance between themselves and their potential suitors, even if there is not a match.

For this investigation, FTM created a programme to provide Tinder with false locations for Jacky, Naomi and Daisy. These accounts were then digitally placed at various army bases so that the profiles of users in the area could be identified and collected.

Jacky, Naomi and Daisy were repeatedly moved – virtually speaking – to different points in a given area in order to measure the distance between them and individual soldiers, allowing FTM to home in on their near-exact whereabouts (a process known as trilateration).

The soldiers in question would not have had any inkling that their location and movements were being tracked in this way and, even if they had, could not have prevented it while active on Tinder.

“It is worrying that profiles can be obtained on such a large scale from such a globally operating platform,” Belgian cyber researcher Karel Dhondt told FTM after studying its findings.

“If that information falls into the wrong hands, the consequences for the people and groups of people concerned can be far-reaching,” he said. “The fact that a group of journalists managed to do this without too many technical barriers is evidence of a lack of effective protective measures at Tinder.”

Dhondt obtained his PhD from KU Leuven on the security of location-based applications and is the lead author of a recent study into the security of 15 dating apps, including Tinder.

His research found significant differences in the ways that dating apps handle location data, even within the Match Group, which includes Tinder. Some of the apps in that group, such as Plenty Of Fish (POF) and Meetic, use less precise location data such as a user’s city or neighbourhood, according to Dhondt.

“This significantly reduces the risk of leaking exact locations,” he explained. “Tinder therefore consciously makes a different choice.”

Tinder knows where you are, and can share this data

According to Tinder’s privacy policy, it can collect exact geolocation (latitude and longitude) data with users’ consent (one of the many elements contained in the T&Cs) – including when they are not using the app.

Tinder also reserves the right to share data from and about users with other Match Group companies, suppliers and advertising partners.

FTM has previously reported on how the sensitive data of military personnel, politicians, police officers and security service employees are for sale in the digital advertising market.

‘Love, sex, friendship or money’

While Tinder’s identified security shortcomings are systemic, militaries have to deal with individual lapses in judgment when it comes to sharing too much information on dating apps.

In the Tinder profiles analysed by FTM, many were instantly identifiable as soldiers: their bios stated that they worked in the military, and even included their specific divisions and roles (such as mechanic and radar specialist).

Often, photos showed them wearing army uniforms. US soldiers tend to have their surname printed on military fatigues. Other uniforms (from various nations) feature emblems indicating the military unit. Many of the photos FTM saw also featured weapons, tanks, and fighter jets.

By combining the Tinder profile information of soldiers with their location data, FTM was able to easily find out more about them from other online sources.

For example, their accounts on LinkedIn, Facebook or Instagram. These platforms can provide surnames and dates of birth, which help to track down and verify home addresses. They also offer the names of friends and family, employment histories, interests, hobbies and favourite places to go out. With just a few searches, an entire life can be pieced together.

The Dutch expert Koot said this kind of information is exactly what countries such as Russia, China and Iran use to reach out to and influence people who have access to sensitive data.

“Of course they don’t immediately ask ‘give me the launch codes’ – it is a grooming process that can be preceded by a long preparation, in order to extract specific details from the target. Even seemingly unimportant details can in fact be of great value,” Koot said.

“The information that can be obtained via Tinder and from public sources makes it easier to start a conversation, gain trust and build a bond, and ultimately try to extract sensitive information under the promise of love, sex, friendship or money,” he added.

And the dangers presented by dating apps extend beyond targets in the military, according to Koot, who said he thought it was unlikely that Tinder had fully considered these possible consequences of its security settings.

From police to diplomats to civil servants

Military personnel are not the only ones put at risk by this flaw in Tinder’s settings. FTM also identified and tracked dozens of other people (mainly Dutch) on Tinder working in law enforcement or diplomacy.

In the Netherlands, a police security analyst, an employee of the interior ministry and an IT specialist at the defence ministry all shared their job titles in their Tinder bios. FTM was able to ascertain their home addresses and dates of birth through public sources. In another case, FTM managed to find the home address of a police chief who had suggested in a photo on social media that he was part of the Netherlands’ Special Intervention Service.

FTM was even able to track a London-based Canadian diplomat on two recent visits to Ukraine.

Underestimating the risks?

Despite the threat posed by Tinder and other applications that share location data, it appears that major countries’ defence ministries are lagging when it comes to taking concrete action.

In January 2018, military analysts observed that soldiers using the fitness app Strava to track their runs were giving away the locations of secret army bases and spy outposts.

Later that year, a joint investigation by Bellingcat and Dutch publication De Correspondent found that another fitness app, Polar, was revealing exercise by military and intelligence personnel in secretive locations including Guantanamo Bay and Baghdad’s Green Zone.

In response to these revelations, the Netherlands’ then-defence minister said army personnel would be banned from using fitness apps – but only on their work phones.

Several Russian and Chinese apps have also been blacklisted by the Dutch ministry of defence in recent years, meaning that the Netherlands’ troops can only use them on their personal devices, according to internal documents seen by FTM.

However, dating apps such as Tinder are not on the blacklist, meaning that Dutch soldiers are permitted to use them on their work phones.

A spokesperson for the Dutch ministry of defence said military personnel are not allowed to use personal phones while on mission or at classified locations. The spokesperson did not respond to a question about whether these rules are actively enforced.

One of the Dutch military personnel who FTM followed via its fake account Naomi agreed to a telephone interview.

Dennis, who works at the Volkel Air Base in the Netherlands, said that troops are informed about the risks of social media and online privacy in their training, but called it “limited”.

“There are apps banned on work phones, such as Strava, but private use is still possible,” he said. “Even on bases, this is not very difficult.”

When asked if he was worried that FTM could track him through Tinder, Dennis said he had nothing to hide. However, he did acknowledge the risks to colleagues in sensitive locations.

“If you guys [FTM] can find out about that so easily, then yes, that is worrying,” he said.

Meanwhile, the US Department of Defense has no mandatory regulations on the use of dating apps but provided advice about “do’s and don’ts” as part of an online privacy and security guide it published in 2021 and re-issued in December 2023.

The guide recommends avoiding using names and photos that appear on other social media apps, and not sharing information about work. When using dating apps, US troops are encouraged to get a paid account to have more control over their privacy, to check and adjust location settings, and to critically read the terms and conditions and privacy policies.

The German ministry of defence’s guidelines on social media are also quite relaxed. A spokesperson from Germany’s Bundeswehr told FTM in response to emailed questions that the country’s military personnel are free to use social media and dating apps such as Tinder.

Military personnel are allowed to share their name, rank and job title on their social media accounts. They are warned about the fact that other people can see what they post online, and are expected to behave as representatives of the armed forces (sexualised content is banned). What German troops are allowed to share on social media – certain photos, for example – depends on the specific policy at a given base, according to the guidelines.

Ultimately, given Tinder’s popularity in several Western countries, it is not a problem that is confined to, or that can be solved by, any single nation, according to Koot.

“This [situation] offers serious opportunities for foreign intelligence services,” he said. “The only limit on how far malicious people can abuse this is their own imagination.”

Methodology

FTM created three free accounts on Tinder under the names Jacky, Naomi and Daisy. Using a computer programme, FTM was able to provide Tinder with false locations and virtually move the fake profiles around several military bases in different countries. By doing so, FTM was able to scroll through the dating profiles of individuals within a certain radius (for example, 5 kilometres), and see how far those users were from the three fake accounts.

Through trilateration, a technique in which FTM digitally changed the position of its accounts to different points to measure the distance to a given individual on Tinder, it was possible to home in on where those users were actually located.
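As a rough illustration of the geometry only, and not of FTM’s actual programme, the sketch below estimates a point on a flat plane from three or more known positions and the distance measured from each. The function name `trilaterate`, the kilometre grid and the sample numbers are all assumptions made for this example; real distance readings are rounded and approximate, so any result would be an estimate rather than an exact position.

```python
import math


def trilaterate(anchors, distances):
    """Estimate an (x, y) point from known anchor points and measured distances.

    Subtracting the first circle equation from each of the others turns the
    problem into a linear system, solved here by least squares (2x2 normal
    equations). Coordinates are treated as flat-plane kilometres, which is
    only a reasonable approximation over short ranges.
    """
    (x0, y0), d0 = anchors[0], distances[0]
    rows, rhs = [], []
    for (xi, yi), di in zip(anchors[1:], distances[1:]):
        rows.append((2 * (xi - x0), 2 * (yi - y0)))
        rhs.append(d0**2 - di**2 + xi**2 - x0**2 + yi**2 - y0**2)

    # Normal equations: (A^T A) p = A^T b
    a11 = sum(r[0] * r[0] for r in rows)
    a12 = sum(r[0] * r[1] for r in rows)
    a22 = sum(r[1] * r[1] for r in rows)
    b1 = sum(r[0] * v for r, v in zip(rows, rhs))
    b2 = sum(r[1] * v for r, v in zip(rows, rhs))

    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("anchor points lie on one line; position is ambiguous")
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


# Hypothetical measurement points (km) and the distances seen from each.
anchors = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
distances = [math.sqrt(2), math.sqrt(10), math.sqrt(5)]
estimate = trilaterate(anchors, distances)
print(estimate)
```

Three measurement points that do not lie on one line are enough in principle; extra points mainly help to average out rounding in the reported distances.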

By using keywords, inspecting profile photos, and repeatedly checking the location data indirectly provided by Tinder, FTM was able to easily identify the profiles of hundreds of military personnel. The investigation also mapped travel movements by monitoring profiles for longer periods of time, although this was not done for any longer than was necessary for the reporting.

Tinder replies:

‘At Tinder, the privacy and safety of our members are of paramount importance. We have implemented robust measures to help ensure that no user can be distinctly tracked through the app. Here’s an overview of the steps we take and our ongoing commitment to member safety: Tinder has developed a sophisticated system to protect our members’ privacy while allowing them to filter profiles based on the approximate distance of potential matches. Rather than using precise locations, we employ privacy and security centric grid snapping methods. [..]

We value the input of the global security research community in identifying and addressing potential vulnerabilities. Through our Bug Bounty Program, we incentivize responsible disclosure of security issues, ensuring that vulnerabilities are addressed promptly and effectively. [..]

That said, we have settings and systems to provide users additional control regarding how they use the Tinder app and how they show up to others in the app. Our Safety Center and privacy settings are designed to empower users to make informed decisions while maintaining control over their experience.’

Editors: Karin Spaink, Kieran Guilbert
Graphics: Leon de Korte, Thomas Kuijpers
Audio: Hannah Veldhoen, Frederique de Jong, Emma du Chatinier, Alexander Fanta
Socials: Maya Luz Hartog

Mole or cancer? The algorithm that gets one in three melanomas wrong and erases patients with dark skin

Time is money. Especially for melanoma, the most dangerous skin cancer: diagnosing this tumour as soon as possible is more decisive in saving lives than in any other cancer. In Spain, it is estimated that in 2025 there will be around 9,400 cases of melanoma, a very aggressive cancer which can spread rapidly and metastasise in just a few months. When this happens, the prognosis is usually poor, so detection errors can be fatal.

It is precisely this urgency that has led the Basque Country to commit to artificial intelligence. In 2025, the Basque health system, Osakidetza, is working on incorporating Quantus Skin, an algorithm designed to diagnose the risk of skin cancer, including melanoma, at its primary health clinics and hospitals. In theory, it promises to speed up the process: general practitioners will be able to send images of suspicious lesions to the hospital’s dermatology service, along with the algorithm’s estimate of their being malignant. The Basque government’s idea is that Quantus Skin, which is currently being tested, will help prioritise patients for treatment.

A 1.6 million euro public contract for Asisa

In 2022, the Basque Health Service, Osakidetza, awarded a €1.6 million contract to Transmural Biotech to implement “artificial intelligence algorithms in medical imaging,” which required achieving a sensitivity and specificity of “at least 85%.” The company, which was created as a spin-off from the University of Barcelona and Hospital Clínic, belongs to the private insurance company Asisa. Although the specifications included several types of cancer and other diseases, Osakidetza only selected two algorithms, including Quantus Skin, due to its “greater healthcare impact” and to “obtain a higher return on health.” The decision, moreover, was taken unilaterally, without consulting specialists, Civio has learned. In February, Osakidetza stated that Quantus Skin had passed the “validation phase” and was in the “integration phase.” In a more recent response to queries from Civio, the service now says that it continues testing the algorithm and will take decisions “accounting for the results we obtain.” However, the service did not address the fact that the published clinical results of Quantus Skin (69.1% sensitivity and 80.2% specificity) are below the 85% threshold set by the public contract. Apart from the award in the Basque Country, Transmural Biotech has only one other public contract, in Catalonia, for a much smaller amount (25,000 euros) to certify artificial intelligence algorithms in radiology.

However, the data are troubling. Transmural Biotech, the company that markets Quantus Skin, conducted an initial study with promising results, but it had significant limitations: it was conducted entirely online and was not published in any academic journal, meaning that it did not undergo the usual quality control required in science.

Later, dermatologists from Ramón y Cajal Hospital and professors from Complutense University in Madrid conducted a second study, which was published, to evaluate the actual clinical efficacy of Quantus Skin. This study, which received funding and technical assistance from Transmural Biotech, showed worse results: the algorithm misses one in three melanomas. Its sensitivity is 69.1%, which means that it misses 31% of actual cases of this potentially lethal cancer.

When Civio contacted the CEO of Transmural Biotech, David Fernández Rodríguez, he responded evasively by email: “I don’t know what it is right now.” After insisting on the phone, he changed his story: “What we were doing was testing,” to detect possible implementation problems. And, at the end of the call, Fernández Rodríguez acknowledged that Quantus Skin “didn’t stop working, it just worked much worse, but we had to figure out why.”

Fernández Rodríguez attributes these poorer results to deficiencies in image capture due to not following Quantus Skin’s instructions. This is something they have also seen in trials in the Basque Country: “Primary care doctors are not well trained to take the images,” he says, which implies a need to “train the doctors.” However, the second study involved dermatologists who specialise precisely in photographing suspicious lesions for subsequent diagnosis. According to Fernández Rodríguez, reliability improved after “cropping the images properly” because “they were not following the instructions exactly.”

Independent sources criticise the diagnostic tool

“For skin cancer, having sensitivities of 70% is very bad. It’s very poor. If you give this to someone to take a photo and tell you if it could be a melanoma and they are wrong in one out of three, it is inappropriate for skin cancer screening in a primary setting. You have to demand more,” explains Dr Josep Malvehy Guilera, director of the Skin Cancer Unit at the Hospital Clínic in Barcelona. “A 31% false negative rate sounds dangerous to say the least,” says Dr Rosa Taberner Ferrer, dermatologist at the Son Llàtzer Hospital in Mallorca and author of Dermapixel: “As a screening test it’s crap.”

However, Fernández Rodríguez attempts to downplay the problem by focusing only on data that favours his product, avoiding mention of Quantus Skin’s low sensitivity. Quantus Skin fails in two ways: according to the same study, its specificity implies a 19.8% false positive rate, i.e. it mistakes one in five benign moles for melanoma. This could lead to the unnecessary referral of around 20% of screened patients whose lesions are benign.
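The error rates quoted here follow directly from the published figures: the false negative rate is one minus the sensitivity, and the false positive rate is one minus the specificity. A minimal sketch of that arithmetic, including a check against the 85% minimum set in the public contract (the function and variable names are illustrative):

```python
def error_rates(sensitivity, specificity):
    """Return (false_negative_rate, false_positive_rate) as fractions."""
    return 1.0 - sensitivity, 1.0 - specificity


SENSITIVITY = 0.691       # published clinical result for Quantus Skin
SPECIFICITY = 0.802       # published clinical result for Quantus Skin
CONTRACT_MINIMUM = 0.85   # "at least 85%" for both, per the contract

fnr, fpr = error_rates(SENSITIVITY, SPECIFICITY)
meets_contract = SENSITIVITY >= CONTRACT_MINIMUM and SPECIFICITY >= CONTRACT_MINIMUM

print(f"Melanomas missed: {fnr:.1%}")               # 30.9%
print(f"Benign lesions flagged: {fpr:.1%}")         # 19.8%
print(f"Meets contract threshold: {meets_contract}")  # False
```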

In the study, the authors – dermatologists at the Ramón y Cajal Hospital in Madrid and optics professors at the Complutense University of Madrid – argue that it is preferable for Quantus Skin to have a higher specificity (fewer false positives) even at the cost of a lower sensitivity (more false negatives) because it is not used for definitive diagnoses. It is just a screening tool, to refer cases from primary care. They hypothesise that this can prevent specialist consultations from overcrowding, reducing waiting times and lowering medical costs.

Specialists consulted by Civio disagree. Although there is no ideal standard for cancer diagnosis – partly because it depends on the aggressiveness of each tumour – what Quantus Skin has achieved is far from acceptable. “If it misdiagnoses melanoma in lesions with a potential risk of growing rapidly and potentially even causing the patient’s death, then I must be very intolerant. I expect sensitivities of at least 92%, 93%, 94%,” says dermatologist Malvehy Guilera of the Hospital Clínic in Barcelona.

“If they intend to use it for screening, then the system should have a super high sensitivity at the expense of a slightly lower specificity,” explains Taberner Ferrer. In other words, it is preferable for an algorithm like this to be overly cautious: better to err a little by generating false alarms in healthy people than to miss a real case of cancer.
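The trade-off the specialists describe can be sketched with a toy example. It assumes the tool produces a malignancy score that is compared with a cut-off to decide on referral, which is a simplification and not a description of how Quantus Skin works internally; the scores and labels below are invented purely for illustration.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Compute (sensitivity, specificity) when scores >= threshold are referred."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical scores: label 1 = melanoma, label 0 = benign lesion.
scores = [0.95, 0.80, 0.62, 0.45, 0.30, 0.70, 0.55, 0.40, 0.20, 0.10]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

strict = sensitivity_specificity(scores, labels, threshold=0.6)
cautious = sensitivity_specificity(scores, labels, threshold=0.25)

print("strict cut-off:  ", strict)    # (0.6, 0.8)
print("cautious cut-off:", cautious)  # (1.0, 0.4)
```

In this toy data, lowering the cut-off catches every melanoma but sends more healthy patients for referral, which is the direction Taberner Ferrer argues a screening tool should lean.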

Dark skin, uncertain diagnosis

The problems with Quantus Skin go beyond its low sensitivity. The paper only evaluated its efficacy in diagnosing melanoma, but did not look at other more common but less aggressive skin cancers, such as basal cell carcinoma and squamous cell carcinoma, where Quantus Skin can also be applied. The authors also did not study how skin colour affects the performance of the algorithm, although they acknowledge that this is one of the main limitations of their research.

The diversity that Quantus Skin neglects

At the beginning of 2025, the Basque Country had 316,942 people of foreign origin, according to data from the Ikuspegi Basque Immigration Observatory. More than 60,000 came from the Maghreb and sub-Saharan Africa, while nearly 164,000 people came from Latin America, where there is a great variability of skin tones. That is not counting people born in Spain with foreign ancestry who reside in the Basque Country, such as the well-known footballers Iñaki and Nico Williams.

Quantus Skin, based on neural networks, has learned to recognise skin cancer almost exclusively in white people. The algorithm was first fed with just over 56,000 images from the International Skin Imaging Collaboration (ISIC), a public repository of medical photographs collected mainly by Western hospitals, where the majority of patients have fair skin. Subsequently, Quantus Skin was re-trained using images of 513 patients from the Ramón y Cajal Hospital in Madrid, all of whom were white.

The data set used to train Quantus Skin includes images “of Caucasian males and females,” Fernández Rodríguez says. “I don’t want to get into the issue of ethnic minorities and all that, because the tool is used by the Basque Country, by Osakidetza. What I am making available is a tool, with its limitations,” Fernández Rodríguez says. Despite the lack of training on darker skins, the Basque government says it is not necessary to “implement” any measure “to promote equality and non-discrimination,” as stated in the Quantus Skin file in the Basque Country’s catalogue of algorithms and artificial intelligence systems. However, as the neural networks have been trained almost exclusively on images of white people, they are likely to fail even more on darker skins, such as individuals of Roma ethnicity or migrants from Latin America and Africa.

“Algorithms are so easily fooled,” says Adewole Adamson, professor of dermatology at the University of Texas, who warned in 2018 of the discrimination that artificial intelligence could lead to if it was not developed in an inclusive and diverse way.

His predictions have been confirmed. In dermatology, when algorithms are fed mainly with images of white patients, “diagnostic reliability in dark skin decreases,” says Taberner Ferrer. The precision of the Skin Image Search algorithm, from the Swedish company First Derm, trained mainly on photos of white skin, dropped from 70% to 17% when tested on people with dark skin. More recent research has confirmed that such algorithms perform worse on black people, which is not due to technical problems, but to a lack of diversity in the training data.

Although melanoma is a much more common cancer in white people, people with dark skin have a significantly lower overall survival rate. American engineer Avery Smith knows these figures well. His wife, Latoya Smith, was diagnosed with melanoma only a year and a half after getting married. “I was so surprised by the survival rates listed for people with the same diagnosis as my wife, and by how they were dependent on race. My wife and I are both Black American and we were at the bottom of the survival rate. I didn’t know until it hit me like a bus. That’s scary as hell,” he tells Civio. Some time after the diagnosis, in late 2011, Latoya died.

Since then, Smith has been working to make dermatology more inclusive and to ensure that algorithms do not amplify inequalities. To remind us of the impact they can have, especially on vulnerable groups, Smith rejects talking about artificial intelligence as a “tool,” as if it were a simple pair of scissors: “It’s a marketing term. It’s a way to get people to grasp it who aren’t technologists. But it’s far more than just a tool.”

Legal expert Anabel K. Arias, spokesperson for the Federation of Consumers and Users (CECU), also speaks of these effects: “When thinking about using it to make an early diagnosis, there may be a portion of the population that is under-represented, and in that case, it may be wrong and have an impact on the health of the person. You can even call it harm.”

Invisible patients in the eyes of an algorithm

“People tend to put too much trust in artificial intelligence, we attribute to it qualities of objectivity that are not real,” says Helena Matute Greño, professor of experimental psychology at the University of Deusto. Any AI uses the information it receives to make decisions. If that input data is bad or is incomplete, it is likely to fail. When it is systematically wrong, the algorithm makes mistakes that we call biases. If they affect a certain group of people more than others – because of their origin, skin colour, gender or age – we call them discriminatory biases.

A review published in the Journal of Clinical Epidemiology showed that only 12% of studies on AI in medicine looked for bias. And when they did, the most frequent bias was racial bias, followed by gender and age, with the vast majority affecting groups that had historically suffered discrimination. These errors can occur if the training data are not sufficiently diverse and balanced: if algorithms learn from only part of the population, they perform worse on different or minority groups.

Errors are not limited to skin colour. Commercial facial recognition technologies fail more often at classifying black women because they have historically been trained on images of white men. Similarly, algorithms that analyse chest X-rays or predict cardiovascular disease perform worse in women if the training data is unbalanced. Meanwhile, one of the most widely used datasets for predicting liver disease is so biased – 75% of the training set is male – that algorithms using it perform much worse in women. In the UK, the algorithm for prioritising organ transplants discriminated against younger people. The reason? It had been trained with limited data, which only took into account survival over the next five years, and not the potentially much longer life that patients receiving a new organ might gain.

“The data used for training must represent the entire population in which it will be used,” explains Nuria Ribelles Entrena, spokesperson for the Spanish Society of Medical Oncology (SEOM) and oncologist at the Virgen de la Victoria University Hospital in Malaga: “If I only train it with a certain group of patients, it will be very effective in that group, but not in others.”

Age bias is especially problematic in paediatrics. “Children are not little adults. They have completely different physiology and pathological processes,” warn the authors of a journal article. Since children do not normally participate in clinical research, the situation is “a drama,” according to Antonio López Rueda, spokesperson for the Spanish Society of Medical Radiology (SERAM) and radiologist at Bellvitge University Hospital in Barcelona.

Ignasi Barber Martínez de la Torre, spokesperson for the Spanish Society of Paediatric Radiology (SERPE) and head of paediatric radiology at Sant Joan de Déu Hospital, illustrates this with a personal experience. His team tried to validate a chest X-ray model trained on adults in the paediatric population. “We soon realised that it made many more errors. The sensitivity and specificity were totally different,” he says. One of the errors was identifying the thymus, a very large gland in young children that disappears in adulthood, as a suspect. The same goes for the skeleton, which in young children has “unossified parts” that can be mistaken for fractures.

Navigating the bias obstacle course

A solution to avoid bias exists: “The training set has to be as large as possible,” explains López Rueda. But the data are not always available for independent analysis. So far, most artificial intelligence systems implemented in Spain that use medical images do not publish their training data. This is the case with two dermatology systems – whose names are not even public – that will first be tested in the Caudal health area and then extended to the whole Principality of Asturias. The same applies to the commercial application ClinicGram, used to detect diabetic foot ulcers at the University Hospital of Vic near Barcelona, and to several private radiology systems, such as Gleamer BoneView and ChestView, and Lunit, which are operating in the Community of Madrid, the Principality of Asturias and the Community of Valencia.

Where training datasets are accessible, another obstacle is that they do not collect metadata, such as origin, gender, age or skin type, which would allow us to check whether the datasets are inclusive and balanced. In dermatology, most public datasets do not usually tag the origin of patients or their skin tone. When this information is included, studies consistently show that the black population is severely underrepresented. “There has been a growing awareness of the problem and developers of algorithms have tried to address these shortcomings. However, there is still work to be done to create representative training data for algorithms,” Adamson says.

The quality and quantity of available data also determine how well the algorithms work. “What made us improve our diagnostic efficiency was that we used our own imaging resources,” says Julián Conejo-Mir, professor and head of dermatology at the Virgen del Rocío University Hospital in Seville. Conejo-Mir and colleagues developed an artificial intelligence algorithm to diagnose skin cancer and to identify the depth of melanoma, a parameter that is related to the aggressiveness of these tumours.

Its database, which includes images of nearly a thousand patients from the hospital in Seville and photographs from other repositories, has been used to design an algorithm, currently under research, with a 90% accuracy rate. But even in apparently successful systems like this one, it is difficult to train algorithms to recognise less frequent cases. This is precisely what happens with acral lentiginous melanoma, the most common skin cancer in the black population and the one Bob Marley died of when he was only 36 years old. This tumour is particularly treacherous because it appears in areas where people rarely look for suspicious lesions, such as the palms of the hands, the soles of the feet or under the nails, as happened to Marley.

Every year, the dermatology service at the Virgen del Rocío University Hospital in Seville diagnoses around 150 cases of melanoma, of which only 2 or 3 are acral lesions. “We had to take it out of training, because we had very few cases and, if we joined it to the rest, it failed; if we separated it, we didn’t have a sufficient number of images,” says José Juan Pereyra Rodríguez, head of section at the Virgen del Rocío University Hospital in Seville.

This artificial intelligence, which is not used for clinical screening but for research purposes, cannot be applied to cases of acral lentiginous melanoma because the team did not have enough data on this type of cancer to train the algorithm reliably. To achieve this, they would have needed about 50 years’ worth of locally available data, Pereyra Rodríguez estimates. “In our case, it’s as simple as saying: ‘Don’t use the algorithm for acral lesions, in general, because I haven’t trained it for that.’ That’s it; it’s a limitation,” he says.

“The theory says that if 90% of my population” corresponds to white skins, “I must train” with those types “because prevalence is also important when it comes to making decisions. I must train in my environment,” Pereyra Rodríguez says. In the case of systems developed abroad, hospitals should ideally evaluate the performance of the algorithms on their own patient groups. López Rueda also calls for “re-training with local data” before implementing any artificial intelligence: “It’s very expensive for both the company and the hospital, but that’s what would really work.”

Even within Spain, the characteristics of the population vary depending on the postcode. “If I develop software in the Hospital Clínic [in the centre of Barcelona] and implement it in Bellvitge [in the suburbs], it won’t work for me. If I do it the other way around, it won’t work either,” López Rueda says. The consequences of algorithmic biases can be truly disastrous: patients can be harmed by an incorrect diagnosis. “Biased algorithms rob patients of the potential benefits of this revolutionary technology,” says Adamson, who points to the root of the problem: “The problem isn’t with the algorithm but with the thought and care going into designing and developing the algorithms.”

Methodology

This is the first health article in our algorithmic transparency series. We will soon publish new information on algorithm use in the National Health System. Eva Belmonte and David Cabo collaborated on this investigation, while Ter García, Javier de Vega, Olalla Tuñas and Ana Villota helped review the text.

Civio has confirmed the incorporation of the algorithms and AI systems mentioned in the article through various sources:

1) In the case of the Basque Country, through a public information request via the Transparency Law and three subsequent responses from the press office of the Basque Health Service (Osakidetza), which, at the time of writing, has not responded to our latest questions about Quantus Skin. We expanded our investigation to cover the Transmural Biotech contract, the Basque Country’s catalogue of AI algorithms and systems, the medRxiv pre-print by Transmural Biotech employees, which is not peer-reviewed, and the subsequent peer-reviewed article in the International Journal of Environmental Research and Public Health (MDPI) evaluating Quantus Skin, published by dermatologists at the Ramón y Cajal Hospital in Madrid and optics researchers at the Complutense University of Madrid. We also interviewed the Transmural Biotech CEO, who referred to a metric favourable to Quantus Skin, known as negative predictive value (NPV). This parameter does not answer the same question as sensitivity: it gives the probability that a person who tests negative really does not have the disease. As explained in the article, sensitivity focuses on whether the system will detect all the patients who have melanoma. In other words, these metrics do not contradict each other, but they measure different things. Furthermore, although the company praises the preprint and publishes data from it on its website, including NPV, the manuscript itself states that “using online images does not constitute a clinical study” and that the findings must be validated in a real population.

2) In the Principality of Asturias, although the regional government initially told us that “there is currently no system in place in the health sector that uses artificial intelligence,” the press office of the Regional Ministry of Health subsequently confirmed the recent launch of several radiology systems, as well as the future inclusion of two dermatology systems.

3) In the Community of Madrid, the Department of Digitalisation initially reported seven AI projects in the Madrid Health Service, a figure that later rose to 70, as we confirmed through an information request. In the text, we only mention some of these systems, which have been implemented in several hospitals in the region.

4) In the Valencian Community, the Regional Ministry of Health initially responded to our request for access by stating that it had not yet decided on incorporating AI systems. We later confirmed, thanks to GVA Confía, Valencia’s Registry of Algorithms, that there are indeed three active systems.
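
As a rough illustration of why sensitivity and negative predictive value, discussed in point 1, answer different questions, the sketch below computes both from the counts of the peer-reviewed Quantus Skin study given later in this methodology (38 true positives, 17 false negatives, 142 true negatives, 35 false positives). The function names and the 1% prevalence are hypothetical values chosen for illustration; they do not come from the study or the company.

```python
# Counts reported in the peer-reviewed evaluation of Quantus Skin.
TP, FN = 38, 17    # melanoma cases: detected / missed
TN, FP = 142, 35   # healthy cases: correctly cleared / wrongly flagged


def sensitivity(tp, fn):
    """Share of people WITH melanoma whom the test flags."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of people WITHOUT melanoma whom the test clears."""
    return tn / (tn + fp)


def npv_from_counts(tn, fn):
    """Share of negative results that are truly negative, in this sample."""
    return tn / (tn + fn)


def npv_at_prevalence(sens, spec, prevalence):
    """NPV recomputed with Bayes' rule for a population with a given prevalence."""
    true_negatives = spec * (1 - prevalence)
    false_negatives = (1 - sens) * prevalence
    return true_negatives / (true_negatives + false_negatives)


sens = sensitivity(TP, FN)
spec = specificity(TN, FP)
npv_study = npv_from_counts(TN, FN)
# Hypothetical prevalence of 1%, chosen only to show the effect of rarity.
npv_rare = npv_at_prevalence(sens, spec, 0.01)

print(f"sensitivity:            {sens:.1%}")
print(f"specificity:            {spec:.1%}")
print(f"NPV in the study sample: {npv_study:.1%}")
print(f"NPV at 1% prevalence:    {npv_rare:.1%}")
```

NPV depends on how common the disease is in the group being tested, whereas sensitivity does not. That is why a system can show a high NPV while still missing a substantial share of the people who actually have melanoma.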

To prepare this report, we searched the scientific literature, mainly through PubMed, Google Scholar and preprint repositories such as medRxiv, bioRxiv and arXiv. We have also consulted health technology assessment reports, such as a document on the application of AI in breast cancer screening, specialised books such as one published by the CSIC, and other specialised documents such as a report by activist Júlia Keserű for the Mozilla Foundation. We have also attended events such as the conference on AI and medicine organised by the Federation of Spanish Scientific Medical Associations (FACME), and consulted the few existing public records on algorithms and artificial intelligence systems, such as those launched in the Basque Country and the Valencian Community.

In addition to the documentary sources above and the sources quoted in the text, we contacted several experts, whom we would like to thank: lawyers Estrella Gutiérrez David and Guillermo Lazcoz Moratinos, dermatologist Tania Díaz Corpas, and mathematician and statistician Anabel Forte Deltell, who helped us make the first visualisation as clear and accurate as possible.

We also sought evidence on the distribution of different skin types in Spain. A 2020 study in Acta Dermato-Venereologica indicated that the Spanish population is usually classified as phototypes II and III, which we corroborated with various dermatology specialists, who also helped us determine which population groups could be included in the lighter and darker phototypes.

Regarding medical image catalogues in dermatology and the limited annotation of skin tone in photographs, we searched for evidence in the scientific literature and, in addition to a 2022 Lancet Digital Health systematic review, whose data we used for the third visualisation, we also relied on a 2022 Journal of the American Academy of Dermatology systematic review, a 2021 JAMA Dermatology scoping review, and a 2022 Proceedings of the ACM on Human Computer Interaction analysis.

To identify the public contracts awarded to Transmural Biotech S.L. (B65084675), we searched the Public Sector Contracts Platform.

Regarding the first visualisation (sensitivity and specificity of the algorithms):

To create an educational visual representation of the efficacy results of the Quantus Skin algorithm at the end of the article, we approximated the probability distributions of both populations (healthy and diseased) using Gaussian curves.

To do this, based on the actual data from the study (55 cases of melanoma, of which 38 were true positives and 17 were false negatives; 177 healthy cases, of which 142 were true negatives and 35 false positives, with a diagnostic threshold of 67.33%), we mathematically calculated the parameters of two normal distributions that exactly reproduced the observed sensitivity (69.1%) and specificity (80.2%).

Results of the [Quantus Skin](https://web.archive.org/web/20250505091000/https://www.mdpi.com/1660-4601/19/7/3892) efficacy study

These curves, although simplified, allow us to intuitively visualise how the algorithm distinguishes between the two populations and how the different types of diagnostic hits and errors (false positives and false negatives) respond to different diagnostic thresholds, showing the balance between the two metrics in a simple and visual way.

It is important to note that these graphical representations are approximations for purely informative purposes.
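
A minimal sketch of how two such curves can be obtained follows. The article does not state the parameters actually used, so the shared standard deviation (`SIGMA`) and the equal-variance choice are assumptions, and this is not the code behind the published visualisation. It uses Python's standard `statistics.NormalDist` and places each mean so that the areas on either side of the 67.33% threshold match the observed sensitivity and specificity.

```python
from statistics import NormalDist

# Counts and threshold reported in the study.
TP, FN = 38, 17
TN, FP = 142, 35
THRESHOLD = 67.33

# Assumption: both populations share one standard deviation.
# Any positive value reproduces the two metrics; it only changes the scale.
SIGMA = 15.0

sensitivity = TP / (TP + FN)   # observed: about 69.1%
specificity = TN / (TN + FP)   # observed: about 80.2%

standard = NormalDist()        # mean 0, standard deviation 1

# Diseased curve: a share `sensitivity` of its area lies above the threshold.
mu_diseased = THRESHOLD + SIGMA * standard.inv_cdf(sensitivity)
# Healthy curve: a share `specificity` of its area lies below the threshold.
mu_healthy = THRESHOLD - SIGMA * standard.inv_cdf(specificity)

diseased = NormalDist(mu_diseased, SIGMA)
healthy = NormalDist(mu_healthy, SIGMA)


def metrics_at(threshold):
    """Return (sensitivity, specificity) implied by the curves at a threshold."""
    sens = 1 - diseased.cdf(threshold)   # diseased scores above the threshold
    spec = healthy.cdf(threshold)        # healthy scores below the threshold
    return sens, spec


for t in (57.33, THRESHOLD, 77.33):
    sens, spec = metrics_at(t)
    print(f"threshold {t:5.2f}: sensitivity {sens:.1%}, specificity {spec:.1%}")
```

Moving the threshold down in this sketch raises sensitivity and lowers specificity, and moving it up does the opposite, which is the trade-off the curves are meant to show.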

Regarding the second visualisation, about the underrepresentation of dark skin in the largest catalogue of dermatological images:

To design the Fitzpatrick scale (skin tone colour scale) in an educational manner, we used the RGB values from Table 1 published in 2017 in the book Cutaneous melanoma: Etiology and therapy by Codon Publications.

The choice of personalities from the worlds of sport and music as examples has been checked with dermatology specialists. The authors of the photographs used are (CC BY-SA 3.0):

Ed Sheeran: Harald Krichel
Taylor Swift: iHeartRadio CA
Rafael Nadal: Barcex
Rosario Flores: Fuenlabrada Town Council
Salma Paralluelo: Michael Emilio
Nico Williams: Maider Goikoetxea

In the visualisation, each square is equivalent to about 100 images collected in the ISIC.

Regarding the third visualisation, on the limited coding of skin tone in most medical databases:

Each group consists of 10,695 circles, each representing ten dermatological images included in a Lancet Digital Health systematic review. The percentages used in the visualisations can be found in the summary of the publication. From the different clinical information collected in the systematic review, we chose the three that we thought were most relevant and best suited to the research.

The three visualisations were developed with D3.js and Svelte.js.

Further Credits:

Lucas Laursen (English editor), Eva Belmonte (editor), David Cabo (editor), Ter García (contributor), Javier de Vega (contributor), Olalla Tuñas (contributor), Ana Villota (contributor)