Wikimedia Enterprise

From Meta, a Wikimedia project coordination wiki
Wikimedia Foundation staff and partners take part alongside the volunteer community in maintaining the content of this page.
Wikimedia Enterprise
Creating new opportunities for revenue generation and free knowledge dissemination through partnerships and earned income.
enterprise.wikimedia.com

Wikimedia Enterprise is a new cross-departmental service of the Wikimedia Foundation. It is available at enterprise.wikimedia.com. Its goal is to serve high-volume commercial reusers of Wikimedia content. The service was announced in March 2021 (blog post, Wired article) and launched in October 2021 (press release, Open Future article).

The focus is on organizations that want to repurpose Wikimedia content in other contexts, providing data services at a large scale, so that they are faster and more comprehensive, reliable, and secure. Wikimedia Enterprise aims to improve the user experience of Wikimedia's readers beyond our own websites; increase the reach and discoverability of the content; and improve awareness and ease of attribution and verifiability by the organizations that reuse Wikimedia project data the most—through self-funding services.


There is a very high barrier to entry for using Wikimedia data, outside of the common use cases of reading or editing. This is because the content is hard for machines to segment and understand, which in turn affects how far Wikimedia project data reaches beyond our own ecosystem, and the scale of impact it can have.

In the Movement Strategy recommendations to increase the sustainability of our movement and improve user experience there are the recommendations to, respectively: "Explore new opportunities for both revenue generation and free knowledge dissemination through partnerships and earned income—for example...Building enterprise-level APIs," and "Make the Wikimedia API suite more comprehensive, reliable, secure and fast, in partnership with large scale users.... and improve awareness of and ease of attribution and verifiability for content reusers."

It is well known that a few massive companies use our projects' data. Those companies recognize that without the Wikimedia projects, they would not be able to provide as rich or reliable an experience to their own users. There has long been a feeling among community members that these companies should do more to reinvest in the Wikimedia communities for the benefits they gain from the content and resources they use.

This led to the idea of developing a new approach that is more sustainable in the long term and provides a much clearer relationship between Wikimedia and enterprise users. Most financial benefit for Wikimedia would likely only come from a very small handful of heavy for-profit users, and would feed back into the Wikimedia movement.

As this idea developed, it became clear there is a responsibility to democratize our data for organizations that do not possess the resources of these largest users, to ensure we are leveling the playing field and helping to foster a healthy internet without reinforcing monopolies. The benefits of such a service shouldn't just be for startups or alternatives to the internet giants, but also for universities and university researchers; archives and archivists; along with the wider Wikimedia movement.

Overview

Wikimedia Enterprise’s focus is on businesses that reuse our content, typically at a large scale—e.g., integrated into knowledge graphs, search, voice assistants, maps, news reporting, community tools, third party applications, and full-corpus research studies. Augmenting Wikimedia's many datasets to put structure behind our unstructured content will allow all our content reusers to meet their individual requirements while also setting us up to build new tools and services in the future, available to everyone. Reusers of our content are looking for three critical components:

  • Frequency: Regular current snapshots of Wikimedia projects
  • Reliability: Dependable, accessible infrastructure
  • Quality: a "best last revision"

Emphasizing a self-funding set of specific use cases allows the Wikimedia API team to focus on volunteers, teams, and organizations looking to access (and, most importantly, interact with) our data sets. This includes the majority of community editing tools, which will be out of scope for this service. For more information on improvements to the existing Wikimedia APIs see the service page on the "API Gateway" initiative.

Program Goals:

  • Content: Make more of our movement's content available in consistent machine-readable formats, freely available for all researchers and re-users.
  • Resource-load: Reduce the need for high-intensity site-scraping by the highest-frequency and highest-volume reusers, which currently target our production servers.
  • Fundraising: Provide a clearer and more consistent way for the largest re-users to reinvest derived benefits back to the movement, instead of making occasional altruistic donations that vary in size.

Community

Wikimania 2023 presentation slides

Contact the team if you would like to arrange a conversation about this service with your community, at a time and meeting software platform of your choice.

Past public meetings:

March 2021 #1 & March 2021 #2, April 2021, June 2022, February 2023

...and also at the EMWCon Spring 2021 conference (video); the March and July 2022 Strategic Wikimedia Affiliates Network (SWAN) meetings; the May 2021 Wikimedia Clinic; and at Wikimania in 2021 and 2023.

The following are the introductory paragraphs of a much more detailed Community essay.

The full essay covers the following topics

Libre and Gratis are the two meanings of "free," commonly phrased as free as in speech, or free as in beer.

Wikimedia projects are, have always been, and will always remain libre. The principles of free cultural works mean that anyone can use Wikimedia without restriction, including commercially. As a movement, we embrace this. It is why we reject ‘non-commercial’ licenses, as they would limit the kinds of reuse possible. And it is why we consider commercial reuse an important means of distributing knowledge to audiences.

Equally, Wikimedia projects are, have always been, and will always remain gratis. The ability to freely access the knowledge available across all Wikimedia projects has always been core to the mission of the Foundation and the movement. We provide this access not only to individuals visiting our websites but also programmatically to machines so that our content can be repurposed in other environments. The full corpus of Wikimedia content always has been, and will continue to be, made available for reuse in various forms (including but not limited to database dumps, APIs, and scraping) at no cost.

As a result, our content is often repurposed by for-profit organizations that rely on it to support their business models, and which consequently earn revenue from it. Outside of voluntary corporate donations to the Wikimedia Foundation, the movement has never received benefits from any of this revenue through return investment. In acknowledgement of this, under the heading "Increase the sustainability of our movement", the Movement Strategy process asked the Wikimedia Foundation to explore, among other things, "enterprise-level APIs...models for enterprise-scale for-profit reusers, taking care to avoid revenue dependencies or other undue external influence in product design and development." Furthermore, under the heading "Improve User Experience", a further recommendation stated, "Make the Wikimedia API suite more comprehensive, reliable, secure, and fast, in partnership with large scale users where that aligns with our mission and principles, to improve the user experience of both our direct and indirect users, increase the reach and discoverability of our content and the potential for data returns, and improve awareness of and ease of attribution and verifiability for content reusers."

The Enterprise project team is developing a new resource aimed at for-profit content reusers, who have product, service, and system requirements that go beyond what we freely provide. Use of this offering will not be required for for-profit content reuse; companies can continue to use the current tools available at no cost. All Enterprise API revenue will unequivocally be used to support the Wikimedia mission—for example, to fund Wikimedia programs or help grow the Wikimedia Endowment.

This project represents a new kind of activity at the Foundation. The project is at a very early stage that should be considered a learning period. We will have successes, we will make mistakes, and we will need to adapt our strategies. The team is committed to listening, engaging, and where possible, integrating the feedback we get on our work. This document is organic and is reflective of the team's current thinking; we are attempting to document as much work as possible in the open. Up until now, our work has been shaped by a series of initial interviews with community members, Wikimedia Foundation Board and staff, researchers, and reusers.

...continue to read the rest of the Community essay. See also the FAQ and Principles.

Given the nature of the service, primary decision making for it will rest with the Wikimedia Foundation. We are seeking community input, in particular from the technical community and those who have been involved in the strategy process, throughout the lifetime of the service. Technical feedback has been gathered from colleagues at the Wikimedia Foundation, industry and research partners, technical partners across the movement, and the broader technical communities via Phabricator. Input into the funding development side of the service will follow a similar pattern. We will continue gathering input via research interviews and focus groups, as well as feedback here on Meta, as per our principles.

Access

There are several methods to obtain access to the Enterprise API datasets.

All content is freely-licensed (see also the project's principles).

  • Paid
    • Realtime API (Batch and Streaming) and daily dump file in NDJSON format through the Enterprise API dedicated product website: enterprise.wikimedia.com.
  • Free
    • Creating an account via the Enterprise API product website includes 5,000 on-demand API requests that refresh monthly (including the Structured Contents endpoint) and twice-monthly snapshot API files in NDJSON format at no cost (refreshes on the 2nd and 21st of each month).
    • Several datasets are available outside of the Wikimedia Enterprise website. An update of the Enterprise API data is made available to everyone every two weeks on the Wikimedia Dumps site. Several beta datasets are also available on Hugging Face.
    • The Snapshot API and Realtime (Batch) are available via Data Services to anyone with a Wikimedia Cloud Services account.
    • Those who have a non-commercial and mission-relevant use-case, which cannot be fulfilled by existing free-access APIs/dumps etc, can request expanded access to the API service at either reduced cost or no cost depending on usage and application.
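
As a small illustration of the free-tier schedule described above, the following sketch computes the next date on which the free snapshot files refresh, based only on the stated rule (the 2nd and 21st of each month). The function name and the choice to count the refresh day itself as "next" are assumptions made for this example, not part of the service.

```python
from datetime import date

# Refresh days for the free snapshot files, as stated on this page.
REFRESH_DAYS = (2, 21)


def next_snapshot_refresh(today: date) -> date:
    """Return the next free-tier refresh date on or after `today`."""
    for day in REFRESH_DAYS:
        if today.day <= day:
            return today.replace(day=day)
    # Past the 21st: roll over to the 2nd of the following month.
    if today.month == 12:
        return date(today.year + 1, 1, REFRESH_DAYS[0])
    return date(today.year, today.month + 1, REFRESH_DAYS[0])
```

A scheduled job could call `next_snapshot_refresh(date.today())` to decide when to fetch a new file, rather than polling every day.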

Technical

For full information about the product, see the regular technical updates on MediaWiki.org and the documentation page.

Over time, the "API product" being offered will grow and improve. This information is accurate as of September 2024.

Overview

All of our APIs return the same structured JSON (or NDJSON) response format, making it easy to augment one API with another. There are three APIs, offering the same data through different retrieval methods:

  • Retrieve bulk data with the Snapshot API
  • Receive changes instantly with Realtime API streaming
  • Retrieve single articles with the On-demand API

API responses include article data such as summary, image, Wikidata QID, license, and more. Also included is data specific to the last revision, such as editor, size of change, and credibility score with revert probability.
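
Because all three APIs share one response format, a single parser can serve every retrieval method. The sketch below reads NDJSON (one JSON object per line) and pulls out the kinds of fields listed above. The field names used here (`main_entity`, `version`, `scores`, `revertrisk`, and so on) are assumptions for illustration only; consult the official data dictionary for the actual schema.

```python
import json
from typing import Iterator


def parse_ndjson(text: str) -> Iterator[dict]:
    """Yield one dict per non-empty line of an NDJSON payload."""
    for line in text.splitlines():
        if line.strip():
            yield json.loads(line)


def summarize_article(record: dict) -> dict:
    """Extract a few fields from one article record.

    All key names below are hypothetical placeholders for the real schema.
    Missing keys yield None instead of raising.
    """
    version = record.get("version", {})
    scores = version.get("scores", {})
    return {
        "name": record.get("name"),
        "qid": record.get("main_entity", {}).get("identifier"),
        "licenses": [lic.get("identifier") for lic in record.get("license", [])],
        "editor": version.get("editor", {}).get("name"),
        "revert_probability": scores.get("revertrisk", {}).get("probability"),
    }
```

Using `.get()` throughout keeps the parser tolerant of records that omit optional fields, which matters when the same code handles snapshot, streaming, and on-demand responses.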

On-demand API

Reusers whose infrastructure relies on the EventStream platform depend on services like RESTBase to pull HTML from page titles and current revisions to update their products. High-volume reusers have requested a reliable means to gather this data, as well as structures other than HTML, when incorporating our content into their knowledge graphs and products.

The Wikimedia Enterprise On-demand API allows users to retrieve single articles from any Wikimedia project at any time.

  • Make standard HTTP requests to retrieve documents by ID or name from all projects and languages, or use filters to limit response
  • Request the latest page data anytime to augment your Realtime or Snapshot API data
  • A wide range of commercial and consolidated schemas under SLAs
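
A single-article lookup of this kind might be prepared as in the sketch below, which only builds the HTTP request and does not send it. The base URL, the `/articles/{name}` path, the bearer-token header, and the shape of the `filters` body are all assumptions made for illustration; check the official documentation for the real endpoint and authentication flow.

```python
import json
from typing import Optional
from urllib.parse import quote
from urllib.request import Request

# Assumed base URL, used here for illustration only.
BASE_URL = "https://api.enterprise.wikimedia.com/v2"


def build_article_request(
    name: str,
    token: str,
    filters: Optional[list] = None,
    base_url: str = BASE_URL,
) -> Request:
    """Build (but do not send) a request for one article by name.

    `filters` is a hypothetical list of {"field": ..., "value": ...} dicts
    sent as a JSON body to narrow the response, e.g. to one project.
    """
    url = f"{base_url}/articles/{quote(name, safe='')}"
    headers = {"Authorization": f"Bearer {token}", "Accept": "application/json"}
    body = None
    if filters:
        headers["Content-Type"] = "application/json"
        body = json.dumps({"filters": filters}).encode("utf-8")
    method = "POST" if body is not None else "GET"
    return Request(url, data=body, headers=headers, method=method)
```

The returned object can be passed to `urllib.request.urlopen` once real credentials are in place. Keeping construction separate from sending makes the URL and headers easy to inspect before any quota is spent.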

Realtime API

High-volume reusers currently rely heavily on the changes that are pushed from our community to update their products in real time, using EventStream APIs to access such changes. These reusers are interested in a service that will allow them to filter the changes they receive to limit their processing, guarantee stable HTTP connections to ensure no data loss, and supply a more useful schema to limit the number of API calls they need to make per event.

The Enterprise Realtime API allows users to stream updates in real-time from any Wikimedia project.

  • Streaming: Receive streaming updates (firehose) of every change as they occur in real-time
  • Batch: Download compressed snapshot files of incremental updates every hour
  • Instant updates for new content, any edits, deletions, and breaking news events including community-curated visibility changes
  • Filtering of events by Project or Revision Namespace
  • A wide range of commercial and consolidated schemas under SLAs with guaranteed connections
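
To show what filtering by project or namespace means in practice, here is a client-side sketch that applies the same idea to a stream of NDJSON lines. It is not the service's own filter mechanism, and the field names (`is_part_of.identifier` for the project, `namespace.identifier` for the namespace) are assumptions for illustration.

```python
import json
from typing import Iterable, Iterator, Optional, Set


def filter_events(
    lines: Iterable,
    projects: Optional[Set[str]] = None,
    namespaces: Optional[Set[int]] = None,
) -> Iterator[dict]:
    """Yield decoded events whose project and namespace are allowed.

    `lines` may be any iterable of str or bytes, such as an open HTTP
    response body. A filter left as None matches everything.
    """
    for line in lines:
        if isinstance(line, bytes):
            line = line.decode("utf-8")
        line = line.strip()
        if not line:
            continue  # skip keep-alive or blank lines
        event = json.loads(line)
        # Hypothetical field names; the real schema may differ.
        project = event.get("is_part_of", {}).get("identifier")
        namespace = event.get("namespace", {}).get("identifier")
        if projects is not None and project not in projects:
            continue
        if namespaces is not None and namespace not in namespaces:
            continue
        yield event
```

Because the function is a generator, events are processed one at a time as they arrive, so memory use stays flat on a long-lived connection.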

Snapshot API

For high volume reusers that currently rely on the Wikimedia Dumps to access our information, we have created a solution to ingest Wikimedia content in near real time without excessive API calls (On-demand API) or maintaining hooks into our infrastructure (Realtime).

The Enterprise Snapshot API allows users to retrieve entire Wikimedia projects as a database dump file.

  • Download a compressed file containing everything in any project, in any language
  • Article body in HTML as well as Wikitext
  • Snapshots at up to a daily cadence
  • 24-hour JSON, Wikitext, or HTML compressed dumps of "text-based" Wikimedia projects
  • An hourly update file with revision changes of "text-based" Wikimedia projects
  • A wide range of commercial and consolidated schemas under SLAs
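
A downloaded snapshot can be too large to load into memory at once, so it is usually read as a stream. The sketch below assumes the compressed file is a gzip-compressed tar archive containing one or more NDJSON files; that layout is an assumption for illustration, so verify it against the file you actually receive.

```python
import json
import tarfile
from typing import BinaryIO, Iterator


def iter_snapshot_articles(fileobj: BinaryIO) -> Iterator[dict]:
    """Stream article records out of a snapshot archive.

    Assumes a .tar.gz whose regular-file members are NDJSON. Records are
    yielded one at a time, so the archive is never fully held in memory.
    """
    with tarfile.open(fileobj=fileobj, mode="r:gz") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories and other special entries
            extracted = archive.extractfile(member)
            if extracted is None:
                continue
            for raw_line in extracted:
                line = raw_line.strip()
                if line:
                    yield json.loads(line)
```

For a file on disk, open it in binary mode and pass the handle, for example `with open(path, "rb") as f: for article in iter_snapshot_articles(f): ...`, where `path` is wherever the snapshot was saved.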

SLA and Support

Contracted accounts receive a 99% SLA and support response time guarantees. All accounts have access to our introductory onboarding resources and help center FAQs.

Team

The Wikimedia Foundation staff who work specifically on this project:

Names in bold indicate management.

Many people from different teams also contribute significantly, including from the WMF Legal, Engineering, Partnerships, Design, Communications teams etc. Additional contract work provided by: PartnerHero provide customer support services; Vuurr are assisting our sales process; and Super Natural Design are the designers of the project website.

Governance

The board members of the LLC overseeing the project serve ex officio, drawn from Wikimedia Foundation leadership and representing their Wikimedia Foundation staff roles. This includes the Chief Advancement Officer, Lisa Seitz-Gruwell; the General Counsel, Stephen LaPorte; the Chief Product and Technology Officer, Selena Deckelmann; and Lane Becker, who serves as the LLC's president. The LLC is subject to the governance of the Wikimedia Foundation Board of Trustees as described in the Wikimedia Foundation Board Statement on Wikimedia Enterprise revenue principles.

All reports and official documents of the LLC are published on a dedicated Wikimedia Enterprise page on the Wikimedia Foundation Governance website. For convenience, annual reports are also linked here:

See also: FAQ § Legal

Press

Initial announcement - March 2021

Initial Wikimedia Foundation Diff blogpost
note: media stories listed below are written and published independently and were neither pre-reviewed nor approved by the WMF

Of particular note:

ShiftDelete.net Wikimedia launches its paid service (translated from Turkish)
Digital Trends Español Wikipedia will start charging large companies (translated from Spanish)
iPadizate Apple will have to pay Wikipedia to use its data (translated from Spanish)
iPhone Italia Wikipedia may soon ask Apple for 'the bill' (translated from Italian)
EZANIME.net Apple will soon have to pay for the Wikipedia data it uses (translated from Spanish)
RTL Nieuws Wikipedia will offer payment services to tech giants (translated from Dutch)
Tuga Tech Apple may have to pay to use Wikipedia data (translated from Portuguese)
turi2 Wikimedia launches paid service for businesses. (translated from German)
Les Echos Wikipedia wants to make Gafa pay (translated from French)
Punto Informatico Wikimedia Enterprise to debut by 2021 (translated from Italian)
C News Wikipedia becomes paid. So far for IT giants (translated from Russian)
SDP Noticias Wikipedia will have a paid version for companies (translated from Spanish)
L'usine Digitale Wikimedia wants to launch a new paid service for digital giants (translated from French)
version2.dk Wikipedia will sell its content to businesses (translated from Danish)
Cinco Días (El País) Wikipedia will launch its paid version for companies at the end of the year (translated from Spanish)
Slo Tech The Wikimedia Foundation prepares paid services (translated from Slovenian)
Sawt Beirut International Create a paid service for companies that rely on Wikipedia data (translated from Arabic)
der brutkasten Wikipedia misses out on a business model (translated from German)
Canal RCN Wikipedia will launch a paid version for companies (translated from Spanish)
Ammon News Wikipedia turns some of its services into "paid" (translated from Arabic)
Nexofin Wikipedia announces that it will have a paid version (translated from Spanish)
Thanh Niên Apple will pay for Wikipedia content (translated from Vietnamese)
Business Insider Mexico Wikimedia will offer a payment service to companies that use its content (translated from Spanish)
Engadget Wikipedia plans to charge large organizations using its encyclopedia
Feber Wikimedia launches business payment service (translated from Swedish)
Letem světem Applem Apple may have to pay Wikipedia to use Siri's content (translated from Czech)
suara.com No longer free, Wikipedia will charge users (translated from Indonesian)
Techtoday Wikipedia will be paid, but users should not worry (translated from Ukrainian)
WIRED Italia Wikipedia will have a paid service reserved for big techs (translated from Italian)
dir.bg Wikipedia will charge large organizations that use it (translated from Bulgarian)
Bug.hr Wikimedia launches a paid service for large companies (translated from Croatian)
De Standaard Wikipedia is getting a paid version (translated from Dutch)
El Universal Wikipedia plans to start charging for its content (translated from Spanish)
masralyoum.net Reports: Use of Wikipedia will not be free for these parties (translated from Arabic)
Sputnik News "Wikipedia" to charge for major IT companies (translated from Japanese)
Business AM Wikipedia will soon no longer be free for everyone (translated from Dutch)
sentieriselvaggi.it Wikipedia's payday: Wikimedia Enterprise is born (translated from Italian)
Heidi.news Wikipedia wants to bring GAFA to the cashier (translated from French)
The Friday Checkout Story 3: Wikipedia getting paid (via YouTube)
La Stampa Are the tech greats willing to pay to use Wikipedia? (translated from Italian)
China Press Wikipedia will charge a fee! (translated from Simplified Chinese)
BitPort A really weird pairing: comes the enterprise Wikipedia (translated from Hungarian)
El Mundo (Pixel) Wikipedia is paid: this is all you need to know (translated from Spanish)
kumparanTech Wikipedia Will Have Premium Version, Comotous Content is Charged (translated from Indonesian)
marketingprzykawie.pl Wikipedia is launching a paid service for big tech (translated from Polish)
El Comercio Wikipedia announces its first paid version on the internet (translated from Spanish)
Tokar.ua Wikipedia will be paid for by IT giants (translated from Ukrainian)
HICOMM Paid Wikipedia platform launched for technology giants (translated from Bulgarian)
iPro Up Wikipedia launches its paid version aimed at companies (translated from Spanish)
Agencja Informacyjna Wikipedia is commercializing (translated from Polish)
Nezavisne novine Wikimedia is launching a paid service for large companies (translated from Bosnian)
it resenja Wikimedia is launching a paid service for large companies (translated from Bosnian)
Digital de León Wikipedia will be paid, we tell you everything (translated from Spanish)
Compromiso Atresmedia Wikipedia will have a paid service (although you will never use it) (translated from Spanish)
65 y Más Wikipedia Enterprise: the new paid version for companies (translated from Spanish)

Commercial launch - October 2021

Wikimedia Foundation Press release

Of particular note:

Golem Wikipedia gets a commercial offer (translated from German)
Linux-Magazin.de Wikimedia Enterprise has a commercial offer (translated from German)

First customers - June 2022

Press Release
note: media stories listed below are written and published independently and were neither pre-reviewed nor approved by the WMF

Of particular note:

See also

  • API:Main page – MediaWiki Action API documentation
  • Wikitech: Data Services portal – A list of community-facing services that allow for direct access to databases and dumps, as well as web interfaces for querying and programmatic access to data stores.
  • Enterprise hub – a page for those interested in using the MediaWiki software in corporate contexts.
