Image: Researchers examined AI-generated data on airstrike victims.
A Swiss study provides evidence of language bias in generative artificial intelligence. Such mistakes can have far-reaching social consequences.
More and more people are using the AI chatbot ChatGPT instead of Google to get reliable information on a specific topic.
However, as two researchers show in a recent study, AI-generated answers should be treated with extreme caution. They asked ChatGPT about civilian casualty figures in the Israeli-Palestinian and Turkish-Kurdish conflicts and warn of a systematic distortion of the facts. Their research even provides “the first evidence of language bias in the context of conflict-related violence.”
What was investigated?
Two researchers from the University of Zurich (UZH) and the University of Konstanz used an automated process to ask ChatGPT the same questions over and over in different languages about armed conflicts such as the one in the Middle East. In Arabic and Hebrew, they asked how many casualties there were in 50 randomly selected airstrikes.
How are the answers provided by AI chatbots affected by the language of the search query? Does it make any difference whether you ask the same question in English or German, Arabic or Hebrew?
The same pattern as in the Middle East conflict emerged when the researchers, Christoph Steinert and Daniel Kazenwadel, asked about Turkish military airstrikes on Kurdish areas, posing the questions in Turkish and Kurdish.
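The study's own query script is not reproduced here, but the basic procedure can be sketched roughly as follows. This is a minimal, hypothetical sketch assuming the official OpenAI Python client (openai >= 1.0), an API key in the environment, placeholder prompts and an illustrative model name; the study's actual prompts, model and number of repetitions may differ.

```python
# Hypothetical sketch of an automated multilingual query loop (not the study's code).
# Assumptions: openai >= 1.0 installed, OPENAI_API_KEY set, placeholder prompts.
from openai import OpenAI

client = OpenAI()

# The same factual question, phrased in each query language (placeholders).
prompts = {
    "Hebrew": "<question about casualties of a specific airstrike, in Hebrew>",
    "Arabic": "<the identical question, in Arabic>",
}

def ask_repeatedly(prompt: str, runs: int = 10) -> list[str]:
    """Send the identical question several times and collect the raw answers."""
    answers = []
    for _ in range(runs):
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",  # illustrative model choice
            messages=[{"role": "user", "content": prompt}],
        )
        answers.append(response.choices[0].message.content)
    return answers

# The study then extracted the casualty figures from the answers and compared
# the per-language averages across the 50 airstrikes; here we only print raw text.
for language, prompt in prompts.items():
    print(language, ask_repeatedly(prompt, runs=3))
```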
What’s wrong with AI answers?
Depending on the language of the query, ChatGPT generates distorted answers, and it does so systematically: the researchers found “significant language bias.”
- According to the scientific study, ChatGPT reported on average one-third higher casualty figures from Middle East conflicts in Arabic than in Hebrew.
- For Israeli airstrikes in Gaza, the AI chatbot mentioned twice as many civilian casualties and six times as many killed children in Arabic as in Hebrew.
- In general, ChatGPT reports more victims when the query is asked in the language of the attacked group.
- In the language of the attacked group, ChatGPT also reports more killed children and women and is more likely to describe the airstrikes as indiscriminate and arbitrary.
The results also showed that ChatGPT was more likely to describe the airstrikes as disputed when users asked the questions in the language of the attacking country.
It’s important to remember how AI chatbots work: they are based on a large language model (LLM) that developers train on vast amounts of text to generate statistically likely answers.
The problem: this training data already contains biases from media reporting on the topic. Depending on the language of the online sources, the same airstrike is described in different tones, or not at all. ChatGPT merely processes the corresponding text without understanding its content or correcting its errors.
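To make the phrase “statistically likely answers” concrete, here is a deliberately tiny toy example, not how ChatGPT is actually built: a language model learns from its training texts how probable each continuation of a sentence is and then samples from those probabilities. If the Arabic-language and Hebrew-language portions of the training data describe the same airstrike differently, the learned probabilities, and therefore the generated answers, differ as well. The probability table below is invented purely for illustration.

```python
# Toy illustration of sampling a "statistically likely" continuation.
# The probability table is invented; real models learn such distributions
# over tens of thousands of tokens from their training corpora.
import random

next_token_probs = {
    # context                -> possible next tokens and their probabilities
    "the airstrike killed": {"dozens of": 0.5, "several": 0.3, "no": 0.2},
}

def sample_next(context: str) -> str:
    """Pick the next token at random, weighted by the learned probabilities."""
    tokens = list(next_token_probs[context])
    weights = list(next_token_probs[context].values())
    return random.choices(tokens, weights=weights, k=1)[0]

print("the airstrike killed", sample_next("the airstrike killed"))
```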
Why is this problematic from a social perspective?
“These systemic distortions can exacerbate biases in armed conflict and fuel information bubbles.”
The researchers draw a warning conclusion. Source: news.unizh.ch
The researchers say that people with different language skills receive different information through generative AI, which has a central influence on their perception of the world. Based on the information they get from ChatGPT, Israelis could, for example, assess the airstrikes in Gaza as less deadly than the Arabic-speaking population does.
While traditional news media may also distort reporting (through bias), the systematic language-related distortions of large language models such as ChatGPT are difficult for most users to understand.
“Critical consumers may be able to distinguish between high- and low-quality news sources, but they are unlikely to understand the sources of bias created by LLMs.”
Integrating these AI language models into search engines could reinforce divergent perceptions, biases, and information bubbles along language boundaries, the researchers say. In the future, this could further exacerbate armed conflicts such as the one in the Middle East.
However, researchers believe the underlying problem goes far beyond distorted information about military conflicts and casualty tolls.
“Similar language biases may affect the information generated by LLMs in other subject areas, especially if the training data are similarly heterogeneous and vary by language. This may also be the case with other contentious areas of information, such as sensitive political issues, religious beliefs, or cultural identities.”
What do we learn from this?
Anyone interested in how generative AI works and its weaknesses knows that ChatGPT's answers should be treated with extreme caution.
Due to its language bias, ChatGPT is not suitable for serious (i.e. fact-based) research on a specific topic. This issue has become especially explosive as AI chatbots are integrated into traditional search engines.
As the two scientists write in their conclusion, more research is needed to investigate the extent to which language bias occurs in LLMs in other subject areas and which languages are particularly susceptible to such distortions.