
Tackling disinformation: A learning guide

Generative AI is the ultimate disinformation amplifier

Generative artificial intelligence tools allow anyone to quickly and easily create massive amounts of fake content.

Generative artificial intelligence (GAI) adds a new dimension to the problem of disinformation. Freely available and largely unregulated tools make it possible for anyone to generate false information and fake content in vast quantities. These include imitating the voices of real people and creating photos and videos that are indistinguishable from real ones.

But there is also a positive side. Used smartly, GAI can provide a greater number of content consumers with trustworthy information, thereby counteracting disinformation.

To understand the positives and negatives of GAI, it is first important to understand what AI is, and what is so special about generative AI.

What do machine learning, AI and generative AI mean?

Artificial intelligence refers to a collection of ideas, technologies and techniques that relate to a computer system's capacity to perform tasks that normally require human intelligence. When we talk about AI in the context of journalism, we usually mean machine learning (ML), a subfield of AI.

In basic terms, machine learning is the process of training a piece of software, called a model, to make useful predictions or generate content from data. The roots of machine learning lie in statistics, which can also be thought of as the art of extracting knowledge from data. Machine learning uses data to answer questions. More formally, it refers to the use of algorithms that learn patterns from data and can then perform tasks without being explicitly programmed to do so. In other words: they learn.
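To make that idea concrete, here is a minimal, hypothetical sketch of "learning from data" in Python, using the scikit-learn library. The task and the numbers are invented for illustration: the model is shown examples and infers the pattern itself, rather than having the rule programmed in.

    # A toy example of machine learning: the relationship between article length
    # and reading time is never written out as a rule; the model infers it from data.
    # (Illustrative sketch only; the data is made up.)
    from sklearn.linear_model import LinearRegression

    lengths = [[300], [600], [900], [1200]]   # article length in words
    reading_times = [1.5, 3.0, 4.5, 6.0]      # observed reading time in minutes

    model = LinearRegression()
    model.fit(lengths, reading_times)         # "training": the model learns the pattern

    print(model.predict([[1500]]))            # prediction for an unseen article, about 7.5 minutes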

A language model (LM) is a machine learning model that aims to predict and generate plausible language (natural or human-like language). To put it very simply, it's basically a probability model that, using a data set and algorithm, predicts the next word in a sentence based on previous words.
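As a rough illustration of "predicting the next word," here is a toy, hypothetical sketch in Python: it simply counts which word follows which in a tiny made-up corpus and turns those counts into probabilities. Real language models use neural networks rather than raw counts, but the underlying idea is the same.

    # Toy next-word predictor: count continuations in a tiny corpus,
    # then express them as probabilities. (Illustrative sketch only.)
    from collections import Counter, defaultdict

    corpus = "the cat sat on the mat and the cat slept".split()

    following = defaultdict(Counter)
    for current_word, next_word in zip(corpus, corpus[1:]):
        following[current_word][next_word] += 1   # tally what follows each word

    def predict_next(word):
        counts = following[word]
        total = sum(counts.values())
        return {w: c / total for w, c in counts.items()}   # probability of each next word

    print(predict_next("the"))   # roughly {'cat': 0.67, 'mat': 0.33}: "cat" is the likeliest next word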

Such models are called generative models or generative AI, because they create new and original content and data. Traditional AI, on the other hand, focuses on performing preset tasks using preset algorithms, but doesn't create new content.

When models are trained on enormous amounts of data, their complexity and efficacy increase. Early language models could predict the probability of a single word, whereas modern large language models (LLMs) can predict the probability of sentences, paragraphs or even entire documents based on patterns in the data they were trained on.

A key development in language modeling was the introduction in 2017 of Transformers, a deep learning architecture designed around the idea of attention mechanisms. This innovation allows the model to selectively focus on the most important parts of the input when making a prediction, boosting a model's ability to capture crucial information. The computer science portal Geeks for Geeks gives Google Street View's house number identification as an example of an attention mechanism in computer vision that enables models to systematically identify certain portions of an image for processing.

Attention mechanisms also made it possible to process longer sequences by solving memory issues encountered in earlier models. Transformers are the state-of-the-art architecture for a wide variety of language model applications, such as translators and chatbots.
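For readers who want to see the mechanism itself, here is a minimal, hypothetical sketch of scaled dot-product attention, the core operation of the Transformer architecture, written in Python with NumPy. It omits the learned projections, multiple heads and stacked layers of a real model and only shows how each position computes a weighted focus over all the others.

    # Minimal scaled dot-product attention: every position scores its relevance
    # to every other position, and the scores become weights over the values.
    # (Illustrative sketch only; real Transformers add learned projections,
    # multiple attention heads and many stacked layers.)
    import numpy as np

    def attention(queries, keys, values):
        d = queries.shape[-1]
        scores = queries @ keys.T / np.sqrt(d)           # relevance of each position to every other
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)   # softmax: each row sums to 1
        return weights @ values                          # weighted mix of the values

    tokens = np.random.rand(3, 4)                        # three toy token vectors, dimension 4
    print(attention(tokens, tokens, tokens).shape)       # (3, 4): one context-aware vector per token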

ChatGPT, the best known chatbot, is based on a language model developed by OpenAI. It is built on the GPT (Generative Pre-trained Transformer) model architecture, and it is known for its natural language processing capabilities.


Large language models are the algorithmic basis for chatbots like OpenAI's ChatGPT

What does generative AI mean for disinformation?

Generative AI is the first technology to enter an area that was previously reserved for humans: the autonomous production of content in any form, and the understanding and creation of language and meaning.

And this is precisely what links generative AI to the topic of disinformation — the fact that, today, it is often impossible to tell if content originates from a human or a machine, and if we can trust what we read, see or hear.

Media users are beginning to understand that something is broken in their relationship with the media, and they are confused. "Some of the indicators that we have historically used to decide we should trust a piece of information have become distorted," Vinton G. Cerf, known as one of the "fathers of the internet," said in a 2024 video podcast by the international law firm Freshfields Bruckhaus Deringer.

Generative tools are different because they bypass many of the traditional principles of journalistic work, such as relying on trusted sources. We have to say goodbye to the idea that there is an author behind every text or a creator behind every piece of visual content. That connection no longer exists.

What are the risks of ChatGPT and open-source large language models?

Although generative AI tools are still unavailable in some countries because of internet censorship laws and regulations, the launch of ChatGPT by OpenAI in November 2022 (and later of its alternatives) was a turning point. Now, a large part of the world's internet users have access to these powerful tools and can use them for their own purposes, whether positive or negative. Widespread use also means the models can continue to learn and become better and even more powerful.

But the underlying LLMs used by ChatGPT and Google's Gemini (formerly Bard) are owned by their respective companies; that is, they are proprietary models. This raises concerns about the LLMs' lack of transparency, the use of personal data for training purposes and limited accessibility. There is also significant debate about how easily chatbots can be used to produce disinformation and fake content.

While these two chatbots in particular have garnered significant attention, other powerful large language models, the foundational technology behind such chatbots, are freely available as open-source software.

Research by Democracy Reporting International, a Berlin-based organization promoting democracy, found these open-source LLMs, when managed by someone with the relevant coding skills, can rival the quality of products like ChatGPT and Gemini.

But, it warned in its December 2023 report, "[u]nlike their more prominent counterparts, ... these LLMs frequently lack integrated safeguards, rendering them more susceptible to misuse in the creation of misinformation or hate speech."


What concrete negative effects does GAI have on disinformation?

We are seeing a whole range of disinformation created with GAI, from fully AI-generated fake news websites to fake Joe Biden robocalls telling Democrats not to vote.

And with the technology developing so quickly, media systems are having trouble adapting to it, learning how to use it safely and preventing dangers, while researchers are scrambling to identify and analyze the impacts. From the user's point of view, generative AI is causing a general loss of trust in the media and difficulties in verifying the truthfulness of content, especially around elections. Deep fakes can be used to create non-consensual explicit content using someone's likeness, leading to severe privacy violations and harm to individuals, particularly women and marginalized communities.

Problem 1: Volume, automation and amplification

With GAI, the volume of disinformation potentially becomes infinite, rendering fact-checking an insufficient tool. The marginal cost of producing disinformation is falling toward zero, and thanks to social media, the cost of disseminating it is already close to zero.

On top of this, individuals can now use user-friendly apps to easily and quickly generate sophisticated and convincing GAI content such as deep fake videos and voice clones, content that previously required entire teams of tech-savvy individuals to produce. This democratization of deep fake technology lowers the barrier to entry for creating and disseminating false narratives and misleading content online.

Malign actors can easily leverage chatbots to spread falsehoods across the internet at record speed, regardless of the language. Text-to-text chatbots such as ChatGPT or Gemini, or image generators such as Midjourney, DALL-E or Stable Diffusion, can be used to create massive amounts of text as well as highly realistic fake audio, images and videos to spread misinformation and disinformation. This can lead to false narratives, country-specific misinformation, manipulation of public opinion and even harm to individuals or organizations.

In a 2023 study, researchers at the University of Zurich in Switzerland found that generative AI can produce accurate information that is easier to understand, but that it can also produce more compelling disinformation. Participants in the study also failed to distinguish between posts on X, formerly Twitter, written by GPT-3 and those written by real people.

GAI applications can be combined to automate the whole process of content production, distribution and amplification. Fully synthetic visual material can be produced from a text prompt, and websites can be programmed automatically.

Problem 2: Disinformation and the public arena's structural transformation

Digitization has been transforming the public sphere for some time now. Generative AI is yet another element fueling this transformation, but it shouldn't be viewed in isolation: the structural shifts are driven mainly by digital media, economic pressures on traditional media organizations and the reconfiguration of attention allocation and information flows. The increase in the volume of AI-generated content, coupled with the difficulty of recognizing that content as AI-generated, is an additional factor in the public sphere's transformation.

Deliberately generated disinformation is not the only cause of information pollution. Emily M. Bender, a linguistics professor at the University of Washington, addressed this problem in testimony before the US House Committee on Science, Space and Technology.

Issues

  1. Some reputable media houses are quietly posting synthetic text as if it were real reporting (venerable tech outlet CNET was one of them, although it says it has paused this for now after an outcry). But the content can be biased or inaccurate if algorithms aren't designed properly, or if the training data sets are inherently biased.
  2. GAI can hallucinate. That means it can produce content that isn't based on existing data or examples provided during the training process but is rather made up. In one infamous example, in its very first demonstration, Google's Bard chatbot (as Gemini was called at the time) claimed that the James Webb Space Telescope had captured the first images of a planet outside our solar system, which wasn't factually true.
  3. GAI has turbocharged plagiarism. NewsGuard, a service that rates the reliability of news websites, was the first to identify the emergence of content farms using AI to copy and rewrite content from mainstream sources without credit. NewsGuard has since identified hundreds of additional unreliable AI-generated websites.
  4. Trust in democratic processes and institutions is eroding. The more polluted our information ecosystem becomes with synthetic text, the harder it will be to find trustworthy sources of information, and the harder it will be to trust them when we've found them. UN Secretary General Antonio Guterres sees this as an "existential risk to humanity."

Problem 3: Authoritarian regimes benefit

ChatGPT reproduces harmful narratives propagated by authoritarian regimes when given hypothetical prompts, research by Democracy Reporting International has found. In one case study, researchers were able to prompt ChatGPT to emulate a reporter from Russia Today, a state-controlled news organization. In doing so, they were able to get ChatGPT to circumvent its safeguards and produce problematic outputs, such as advocating the "need to de-nazify Ukraine," a common Russian narrative used to justify its 2022 invasion of Ukraine. The research demonstrated the relative ease with which AI chatbots can be co-opted by malicious actors to produce misleading or false information, regardless of the language used.

As such, generative AI models developed in authoritarian countries — with possible state involvement — have implications that extend beyond the confines of these states. "The world's most technically advanced authoritarian governments have responded to innovations in AI chatbot technology, attempting to ensure that the applications comply with or strengthen their censorship systems. Legal frameworks in at least 21 countries mandate or incentivize digital platforms to deploy machine learning to remove disfavored political, social, and religious speech," Democracy Reporting International finds. "With user-friendly online tools powered by these models, they are becoming increasingly accessible globally. This ensures that the biases and propaganda originating from these models' home countries will proliferate far beyond their borders."

Problem 4: GAI could negatively impact elections

Elections and generative AI have a special connection. This is because the actors involved in elections always pursue specific goals: either to win power for themselves or their allies, or to influence a foreign country's political landscape. GAI enables such actors to create "unreality," and it is becoming a weapon in information warfare and influence operations. Such campaigns are mostly coordinated, concerted, evaluated, measured and funded by political or foreign actors.

"These actors see information as a theater of war," says Carl Miller, the founder of the UK-based Centre for the Analysis of Social Media, in a recent podcast.


There are fears generative AI may undermine democratic processes like elections

Research by the International Center for Journalists found that election disinformation follows common and cyclical patterns regardless of the country examined. For example, the narrative that votes were cast in the names of deceased people, or disinformation about what documents were needed to vote, was found in a range of nations.

Generative AI is a perfect tool for creating such campaigns. In January 2024, the attorney general's office in the US state of New Hampshire said it was investigating an apparent robocall that used artificial intelligence to mimic US President Joe Biden's voice and discourage people from voting in the state's primary election.

Companies like OpenAI are rushing to develop safeguards to make sure GAI is not used in a way that could undermine the election process.

This article is part of Tackling Disinformation: A Learning Guide produced by DW Akademie.

The Learning Guide includes explainers, videos and articles aimed at helping those already working in the field or directly impacted by the issues, such as media professionals, civil society actors, DW Akademie partners and experts.

It offers insights for evaluating media development activities and rethinking approaches to disinformation, alongside practical solutions and expert advice, with a focus on the Global South and Eastern Europe.
