Wikimania 2025

Wikimedia’s knowledge infrastructure in a changing internet: Establishing sustainable pathways for content reuse
2025-08-07 , MOMBASA (🗣️ ar, es, fr, sw)
Language: English

The Wikimedia projects are the largest collection of open knowledge in the world. This has made our knowledge infrastructure an invaluable destination not just for humans, but also for search and AI companies that access our content automatically as a core input to their products. With the rise of AI, the demand for human-created content has grown exponentially, which in turn has led to an unsustainable increase in automated traffic to our sites. In this session we will explore the dynamics and impact of this trend on our infrastructure, discuss what a sustainable knowledge-as-a-service model could look like, and share the work that is underway to enable responsible content reuse. No technical knowledge is required to follow along!


This session will give an overview of the impact of high-volume bot traffic on Wikimedia’s infrastructure and of the work that is underway to establish sustainable pathways for content reuse and to improve the overall developer experience.

Since 2024, we have observed a significant rise in demand for Wikimedia’s content through mechanisms including scraping, APIs, and bulk downloads. Most of the increase in traffic to our sites comes from scraping bots that collect training data for large language models (LLMs), which in turn power products such as chat-based search engines and virtual assistants. This expansion has put a high load on our infrastructure, taking away time and resources that we need to support the Wikimedia projects, contributors, and readers. At the same time, this reuse happens largely without sufficient attribution, which is key to driving new users to participate in the movement.

The goal of this session is to provide insight into the challenges and opportunities in the field of automated access, explore what is required to establish a sustainable model for knowledge as a service, and share how we are approaching this work in the coming months (see objective WE5 in the 2025/26 annual plan of the Wikimedia Foundation: https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Annual_Plan/2025-2026/Product_%26_Technology_OKRs#Responsible_Use_of_Infrastructure_(WE5)).

This overview is designed for both technical and non-technical audiences. Everyone is welcome to join!


What other themes or topics does your session fit into? Please choose from the list of tags below.

Other

How does your session relate to the event theme: Wikimania@20: Inclusivity. Impact. Sustainability?

High-volume automated traffic places an increasingly unsustainable load on our infrastructure. This session explores why sustainable pathways for developers and reusers to access knowledge content are needed and how they can be established, with the ultimate aim of responsible use of our infrastructure and an improved developer experience.

What is the experience level needed for the audience for your session?

Average knowledge about Wikimedia projects or activities

How do you plan to deliver this session? You will be asked to confirm this closer to the date in case of changes to the format.

Onsite in Nairobi

Should your session be selected for the program, do you agree to release your session and supporting materials on-wiki and on the eventyay platform under CC BY-SA 4.0?

I agree

See also: Presentation slides

Birgit Mueller is the Director of Product for MediaWiki and Developer Experiences at the Wikimedia Foundation. She joined the Wikimedia movement as a staff member of Wikimedia Deutschland in 2014 and moved to the Foundation in 2019 to support Wikimedia's developer ecosystem as the Director of Technical Engagement. Since 2023, she has been focusing on evolving the MediaWiki platform and improving the experiences of developers within Wikimedia’s ecosystem.
