Wikimania 2026

India's Software Stories: A Hands-on Workshop with Software Heritage, Commons and Wikidata

Session type: Lightning talk Showcase
Track: Technology

Speakers

mathilde.fichen@inria.fr
Pavan Santhosh

I am a Wikimedia contributor and community leader from India, active in the movement since 2013, when I began editing Telugu Wikipedia. My early involvement in community initiatives led me to design on-wiki and off-wiki projects.

I later worked with the Centre for Internet and Society – Access to Knowledge as a Community Advocate for Telugu Wikimedia projects, focusing on content quality, community growth, partnerships, and capacity building. During this period, I supported the improvement of 25,000 stubs on Telugu Wikipedia, co-organised Train-the-Trainer 2019 and Wikimedia Summit India 2019, and collaborated with educational and government institutions. I have also contributed to movement governance processes, including participation in the Community Health Working Group under Movement Strategy.

More recently, I have co-organised major convenings on AI, Indian languages, and digital commons, including WikiConference India 2023, Future of Commons (2024), the Roundtable on Advancing Open and Sustainable Knowledge Networks (2025), and Bahu Bhasa (2025). I continue to work at the intersection of multilingual knowledge ecosystems, community governance, and emerging technologies.

Super nabla

I'm a computer science researcher and Wikimedian from Pisa (Italy). By day, I work at Sant'Anna School of Advanced Studies, collaborating with the Software Heritage team at Inria Paris to make large-scale code archives more efficient through compression techniques, helping preserve the world's software for future generations.
On evenings and weekends, I contribute to Wikimedia projects under the username User:Super_nabla. My interests focus on structured data and documenting South Asian culture, mainly on Italian Wikipedia. I help as a Meta-Wiki translation administrator, operate a small bot (Nablabot), and enjoy organising local meetups, such as Indian Wiki-cinema in Pisa (2025). I'm also an occasional contributor to the Diff blog.
My academic background is a PhD from the University of Pisa, where I studied compression algorithms that allow working directly on compressed data. I'm still learning every day: from fellow Wikimedians, from colleagues at Software Heritage, and from the communities I try to serve.

Abstract

Our session introduces Wikimedians to the preservation of software as cultural heritage through Software Heritage (SWH), Wikimedia Commons, and Wikidata. SWH is a UNESCO-backed Digital Public Good that aims to archive all publicly available source code.
After a brief introduction to software as cultural heritage (highlighting India's under-documented computing history, from Tamil and Telugu localisation to contemporary FOSS), participants engage in hands-on practice with the SWH Acquisition Process (SWHAP). Working in small groups, attendees will archive historically significant software and link those archives to Wikidata via the SWH identifier (Wikidata property P6138).
Crucially, to ensure success within 90 minutes, we will pre-select a set of 'pilot-ready' Indian software projects with clear licensing and historical relevance. Participants self-select into groups based on interest, with facilitators ensuring balanced skill sets (archiving, Wikidata, Commons). Beginners will create a Wikidata item and permanently archive its code; intermediate participants gain skills to contribute to the broader "Software Stories" initiative, preserving the digital heritage of emerging communities for present and future generations.
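To illustrate the linking step, here is a minimal Python sketch of how a group might prepare a Wikidata statement once a project has been archived. It is an illustration under stated assumptions, not the workshop's official tooling: the helper names are hypothetical, the item ID in the usage note is a placeholder, the check covers only the core SWHID form (no qualifiers), and we assume the statement is submitted through QuickStatements in its tab-separated format.

```python
import re

# Core SWHID form: "swh", schema version 1, an object type, 40 hex digits.
# Qualifiers (origin, path, lines, ...) are deliberately not handled here.
CORE_SWHID = re.compile(r"swh:1:(cnt|dir|rev|rel|snp):[0-9a-f]{40}")
WIKIDATA_QID = re.compile(r"Q[1-9][0-9]*")


def is_core_swhid(value: str) -> bool:
    """Return True if value looks like a core SWHID (syntax check only)."""
    return CORE_SWHID.fullmatch(value) is not None


def quickstatements_line(item_qid: str, swhid: str) -> str:
    """Build one tab-separated line adding P6138 to an item.

    Hypothetical helper: it only formats text and never contacts
    Wikidata or the Software Heritage archive.
    """
    if WIKIDATA_QID.fullmatch(item_qid) is None:
        raise ValueError(f"not a Wikidata item ID: {item_qid!r}")
    if not is_core_swhid(swhid):
        raise ValueError(f"not a core SWHID: {swhid!r}")
    # String values are wrapped in double quotes in this format.
    return f'{item_qid}\tP6138\t"{swhid}"'
```

For example, `quickstatements_line("Q4115189", "swh:1:dir:" + "0" * 40)` returns a single line that could be pasted into a batch; both the item ID and the all-zero hash are placeholders, not real archive entries.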

Additional information

How does your session relate to the event theme: Liberté, Équité, Fiabilité (Freedom, Equity, Reliability)?
As for Freedom, we will teach participants how to use free and open-source infrastructure (SWH, Wikidata, Commons) to preserve software. This aligns with the “Paris call on Software Source Code” by UNESCO (2019; https://www.softwareheritage.org/wp-content/uploads/2024/08/paris-call.pdf), which includes the call to “support efforts to gather and preserve the artefacts and narratives of the history of computing, while the earlier creators are still alive”. The project empowers emerging communities to safeguard their digital heritage, so that knowledge embodied in code remains accessible to all and is not lost when a server is decommissioned or an author dies (see also https://diff.wikimedia.org/2026/02/16/when-source-code-archival-is-recognised-as-digital-public-good-insights-from-software-heritages-10-year-journey-at-unesco/).
As for Equity, the history of computation has traditionally centred on the West. This workshop actively works to correct that imbalance. We provide tools and a platform for Indian-language communities and technologists to document and share their innovations. This centres narratives that have been marginalised, promoting equity in the global record of technological progress.
As for Reliability, the SWH archive is permanent and assigns unique, citable identifiers (SWHIDs) that serve as permalinks to specific versions, releases, or commits. Linking these to Wikidata creates a verifiable and interconnected knowledge graph. This process ensures the long-term reliability of both the software artefact and its contextual narrative (people, institutions, historical significance), making it a trustworthy source for researchers, our community members, and the wider public. The reliability lies not only in the SWHID, but in the combination of the immutable SWH archive and the curated, linked data on Wikidata, which provides the narrative and context that a raw code dump lacks.
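As a concrete illustration of why such identifiers are verifiable, the sketch below derives the SWHID of a single file from its bytes alone. It reflects our understanding that content SWHIDs reuse the Git blob hash (SHA-1 over a `blob <length>` header, a NUL byte, and the file content); the function name is our own, and directories, revisions, releases, and snapshots follow other rules not shown here.

```python
import hashlib


def content_swhid(data: bytes) -> str:
    """Compute a content ("cnt") SWHID for raw file bytes.

    Assumption: the identifier is the Git blob hash of the content,
    i.e. SHA-1 over b"blob <size>\\0" followed by the bytes themselves.
    """
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    digest = hashlib.sha1(header + data).hexdigest()
    return f"swh:1:cnt:{digest}"


if __name__ == "__main__":
    # The same bytes always yield the same identifier, on any machine.
    print(content_swhid(b""))
```

Because the identifier is computed from the content itself, anyone holding a copy of the file can recompute it and check that it matches the one recorded on Wikidata, without having to trust a central registry.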
Which Wikimedia audiences will find this content the most useful?
The project lies at the intersection of technology and heritage, so it can interest several segments of the community. Wikidata enthusiasts may be interested in expanding Wikidata's coverage of software, programming languages, and the history of technology. GLAM and partnership professionals may be interested in new models for collaborating with technical universities, archives, and museums on digital preservation. Community members, particularly from India and the Global South, who are invested in documenting their region's technological and linguistic history may find a direct interest in the “Software Stories” initiative. Technical contributors, ICT researchers, and engineers within the Wikimedia movement may want to learn best practices for archiving and citing software. Education and advocacy organizers can use software stories to demonstrate the movement's commitment to preserving all forms of knowledge.
What is the experience level needed for the audience for your session?
Average knowledge about Wikimedia projects or activities
