Wikipedia The Free Encyclopedia

Open-source artificial intelligence

From Wikipedia, the free encyclopedia
Concept of open-source software applied to AI

Open-source artificial intelligence, as defined by the Open Source Initiative, is an AI system that is freely available to use, study, modify, and share.[1] [2] This includes the datasets used to train the model, its code, and its model parameters, so that someone else could create a substantially similar result; the approach promotes collaborative and transparent AI development.[3] [4]

There has been significant debate over what should count as "open-source", given the range of openness among AI projects. Some large language models touted as open-source release only model weights (but not training data and code)[5] [6] and have been criticized as "openwashing"[7] systems that are mostly closed.[8] Free and open-source software (FOSS) licenses, such as the Apache License, MIT License, and GNU General Public License, outline the terms under which open-source artificial intelligence can be accessed, modified, and redistributed.[9]

Popular open-source artificial intelligence project categories include large language models, machine translation tools, and chatbots.[10] Debate over the benefits and risks of open-source AI involves a range of factors such as security, privacy, and technological advancement.[11] [12] [8] [13]

History

The history of open-source artificial intelligence is intertwined with both the development of AI technologies and the growth of the open-source software movement.[14] [better source needed ] Open-source AI has evolved significantly over the past few decades, with contributions from various academic institutions, research labs, tech companies, and independent developers.[15] [better source needed ] This section explores the major milestones in the development of open-source AI, from its early days to its current state.

1990s: Early development of AI and open-source software

The concept of AI dates back to the mid-20th century, when computer scientists like Alan Turing and John McCarthy laid the groundwork for modern AI theories and algorithms.[16] An early form of AI, the natural language processing "doctor" ELIZA, was re-implemented and shared in 1977 by Jeff Shrager as a BASIC program, and soon translated to many other languages. Early AI research focused on developing symbolic reasoning systems and rule-based expert systems.[17]

During this period, the idea of open-source software was beginning to take shape, with pioneers like Richard Stallman advocating for free software as a means to promote collaboration and innovation in programming.[18] The Free Software Foundation, founded in 1985 by Stallman, was one of the first major organizations to promote the idea of software that could be freely used, modified, and distributed. The ideas from this movement eventually influenced the development of open-source AI, as more developers began to see the potential benefits of open collaboration in software creation, including AI models and algorithms.[19] [better source needed ][15] [better source needed ]

In the 1990s, open-source software began to gain more traction,[20] [better source needed ] and the rise of machine learning and statistical methods led to the development of more practical AI tools. In 1993, the CMU Artificial Intelligence Repository was initiated, with a variety of openly shared software.[21] [better source needed ]

2000s: Emergence of open-source AI

In the early 2000s open-source AI began to take off, with the release of more user-friendly foundational libraries and frameworks that were available for anyone to use and contribute to.[22] [better source needed ]

OpenCV was released in 2000[23] with a variety of traditional AI algorithms like decision trees, k-Nearest Neighbors (kNN), Naive Bayes and Support Vector Machines (SVM).[24]

2010s: Rise of open-source AI frameworks

The deep learning framework Torch was released in 2002 and made open-source with Torch7 in 2011; it was later followed by PyTorch and TensorFlow.[25]

AlexNet was released in 2012.[26]

GPT-1 was released in 2018.

2020s: Open-weight and open-source generative AI

With the announcement of GPT-2 in 2019, OpenAI originally planned to keep the source code of its models private, citing concerns about malicious applications.[27] After facing public backlash, however, OpenAI released the source code for GPT-2 on GitHub three months after its release.[27] OpenAI did not publicly release the source code or pretrained weights for the GPT-3 model.[28] At the time of GPT-3's release, GPT-2 was still the most powerful open-source language model in the world. Competition in building more open models came mostly from smaller efforts such as EleutherAI.[29] [30] The year 2022 also saw the rise of larger and more powerful models under licenses of varying openness, including Meta's OPT.[31]

The Open Source Initiative consulted experts over two years to create a definition of "open-source" that would fit the needs of AI software and models. The most controversial aspect relates to data access, since some models are trained on sensitive data that cannot be released. In 2024, it published the Open Source AI Definition 1.0 (OSAID 1.0).[1] [2] [3] The definition requires full release of the software for processing the data, training the model, and making inferences from the model. For the data, it requires only access to details about the data used to train the AI, so that others can understand and re-create it.[2]

In 2023, Meta's Llama 1 and Llama 2 and Mistral AI's Mistral and Mixtral open-weight models were first released,[32] [33] along with MosaicML's MPT open-source model.[34] [35]

In 2024, Meta released a collection of large AI models, including Llama 3.1 405B, which was competitive with less open models.[36] The company claimed its approach to AI would be open-source, differing from other major tech companies.[36] The Open Source Initiative and others stated that Llama is not open-source despite Meta describing it as open-source, due to Llama's software license prohibiting it from being used for some purposes.[37] [38] [39]

DeepSeek released its V3 LLM in December 2024 and its R1 reasoning model on January 20, 2025, both as open-weight models under the MIT license.[40] [41] The release drew wide attention to China's embrace of using and building more open AI systems, both to reduce reliance on Western software and gatekeeping and to give its industries faster access to higher-powered AI.[42] Projects based in China have since become more widely used around the world, and they have closed at least some of the gap with leading proprietary American models.[42] [43] [44]

Since the release of OpenAI's proprietary ChatGPT model in late 2022, only a few fully open (weights, data, code, etc.) large language models have been released. In September 2025, a Swiss consortium added to this short list by releasing a fully open model named Apertus.[45] [46] Latam-GPT, an open Latin America-focused model, launched in 2025 as a regional effort trained primarily on Spanish- and Portuguese-language content.[47] [48]

Significance

The "open-source" label can provide real benefits to companies looking to hire top talent or attract customers.[4] The debate around "openwashing" (calling a project open-source when it is mostly closed) has significant implications for the success of various projects within the industry.[7]

Open-source artificial intelligence tends to get more support and adoption in countries and companies that do not have their own leading AI model.[4] These open-source projects can help to undercut the position of business and geopolitical rivals with the strongest proprietary models.[4]

Applications

See also: Generative AI

Healthcare

In the healthcare industry, open-source AI has been used in diagnostics, patient care, and personalized treatment options.[49] Open-source libraries have been used in medical imaging for tasks such as tumor detection, improving the speed and accuracy of diagnostic processes.[50] [49] Additionally, OpenChem, an open-source library specifically geared toward chemistry and biology applications, enables the development of predictive models for drug discovery, helping researchers identify potential compounds for treatment.[51]

Military

Meta's Llama models, which have been described as open-source by Meta, were adopted by U.S. defense contractors like Lockheed Martin and Oracle after unauthorized adaptations by Chinese researchers affiliated with the People's Liberation Army (PLA) came to light.[52] [53] The Open Source Initiative and others have contested Meta's use of the term open-source to describe Llama, due to Llama's license containing an acceptable use policy that prohibits use cases including non-U.S. military use.[39] Chinese researchers used an earlier version of Llama to develop tools like ChatBIT, optimized for military intelligence and decision-making, prompting Meta to expand its partnerships with U.S. contractors to ensure the technology could be used strategically for national security.[53] These applications now include logistics, maintenance, and cybersecurity enhancements.[53]

Benefits

Privacy and independence

A Nature editorial suggests that medical care could become dependent on AI models that could be taken down at any time, are difficult to evaluate, and may threaten patient privacy.[12] Its authors propose that health-care institutions, academic researchers, clinicians, patients, and technology companies worldwide collaborate to build open-source models for health care whose underlying code and base models are easily accessible and can be freely fine-tuned with an institution's own data sets.[12]

Collaboration and faster advancements

Large-scale collaborations, such as those seen in the development of open-source frameworks like TensorFlow and PyTorch, have accelerated advancements in machine learning (ML) and deep learning.[54] The open-source nature of these platforms also facilitates rapid iteration and improvement, as contributors from across the globe can propose modifications and enhancements to existing tools.[54]

Democratizing access

Open-source AI gives countries and organizations that otherwise lack access to proprietary models a cheaper way to use and invest in AI.[4] [55] [56]

Transparency

One key benefit of open-source AI is the increased transparency it offers compared to closed-source alternatives.[57] [better source needed ] The open-sourced parts of a model, such as its algorithms and code, can be inspected, which promotes accountability and helps developers understand how the model reaches its conclusions.[58] [better source needed ] Additionally, open-weight models, such as Llama and Stable Diffusion, allow developers to directly access model parameters, which may help reduce bias and increase fairness in their applications.[58] [better source needed ] This transparency can help create systems with human-readable outputs, or "explainable AI", which is an increasingly important concern, especially in high-stakes applications such as healthcare, criminal justice, and finance, where the consequences of decisions made by AI systems can be significant.[59] [better source needed ]

Concerns

Quality and security

Open-source AI may allow bioterrorism groups to remove fine-tuning and other safeguards from AI models.[11] [4] A July 2024 White House report said it had not yet found sufficient evidence to justify restricting the release of model weights.[60]

Once an open-source model is public, it cannot be rolled back or updated if serious security issues are detected.[61] [better source needed ] However, the main barrier to developing real-world terrorist schemes lies in stringent restrictions on necessary materials and equipment.[61] [better source needed ] Furthermore, the rapid pace of AI advancement makes it less appealing to use older models, which are more vulnerable to attacks but also less capable.[61] [better source needed ]

Researchers have also raised security and ethical concerns about open-source artificial intelligence. An analysis of over 100,000 open-source models on Hugging Face and GitHub using code vulnerability scanners such as Bandit, FlawFinder, and Semgrep found that over 30% of models have high-severity vulnerabilities.[62] [better source needed ] Furthermore, closed models typically have fewer safety risks than open-source models.[61] [better source needed ] The freedom to modify open-source models has led to developers releasing models without ethical guidelines, such as GPT-4chan.[61] [better source needed ]

Practicality

Even with truly open-source AI, the cost of training a model oneself can still be prohibitively expensive for many users, unlike other open-source projects that require only downloading code.[4]

Partially open-sourced code that is released with many legal restrictions has scared off some companies from using those projects for fear of a future lawsuit.[4]

References

  1. ^ a b Williams, Rhiannon; O'Donnell, James (August 22, 2024). "We finally have a definition for open-source AI". MIT Technology Review. Retrieved 28 November 2024.
  2. ^ a b c Robison, Kylie (28 October 2024). "Open-source AI must reveal its training data, per new OSI definition". The Verge. Retrieved 28 November 2024.
  3. ^ a b "The Open Source AI Definition – 1.0". Open Source Initiative. Archived from the original on 2025-03-31. Retrieved 2024-11-14.
  4. ^ a b c d e f g h "A battle is raging over the definition of open-source AI". The Economist. November 6, 2024. ISSN 0013-0613. Retrieved 2025-12-09.
  5. ^ "Open Weights: not quite what you've been told". Open Source Initiative. Retrieved 2025-09-23.
  6. ^ "OpenAI releases lower-cost models to rival Meta, Mistral and DeepSeek". CNBC. 2025-08-05. Retrieved 2025-09-23.
  7. ^ a b Liesenfeld, Andreas; Dingemanse, Mark (5 June 2024). "Rethinking open source generative AI: Open washing and the EU AI Act". The 2024 ACM Conference on Fairness, Accountability, and Transparency. Association for Computing Machinery. pp. 1774–1787. doi:10.1145/3630106.3659005. ISBN 979-8-4007-0450-5.
  8. ^ a b Widder, David Gray; Whittaker, Meredith; West, Sarah Myers (November 2024). "Why 'open' AI systems are actually closed, and why this matters". Nature. 635 (8040): 827–833. Bibcode:2024Natur.635..827W. doi:10.1038/s41586-024-08141-1. ISSN 1476-4687. PMID 39604616.
  9. ^ "Licenses". Open Source Initiative. Archived from the original on 2018-02-10. Retrieved 2024-11-14.
  10. ^ Castelvecchi, Davide (29 June 2023). "Open-source AI chatbots are booming — what does this mean for researchers?". Nature. 618 (7967): 891–892. Bibcode:2023Natur.618..891C. doi:10.1038/d41586-023-01970-6. PMID 37340135.
  11. ^ a b Sandbrink, Jonas (2023-08-07). "ChatGPT could make bioterrorism horrifyingly easy". Vox. Retrieved 2024-11-14.
  12. ^ a b c Toma, Augustin; Senkaiahliyan, Senthujan; Lawler, Patrick R.; Rubin, Barry; Wang, Bo (December 2023). "Generative AI could revolutionize health care — but not if control is ceded to big tech". Nature. 624 (7990): 36–38. Bibcode:2023Natur.624...36T. doi:10.1038/d41586-023-03803-y. PMID 38036861.
  13. ^ Davies, Pascale (20 February 2024). "What is open source AI and why is profit so important to the debate?". Euronews. Retrieved 28 November 2024.
  14. ^ "The Evolution of Open Source: From Software to AI: Argano". argano.com. Retrieved 2024-11-24.
  15. ^ a b Daigle, Kyle (2023-11-08). "Octoverse: The state of open source and rise of AI in 2023". The GitHub Blog. Retrieved 2024-11-24.
  16. ^ "Appendix I: A Short History of AI | One Hundred Year Study on Artificial Intelligence (AI100)". ai100.stanford.edu. Retrieved 2024-11-24.
  17. ^ Kautz, Henry (2022-03-31). "The Third AI Summer: AAAI Robert S. Engelmore Memorial Lecture". AI Magazine. 43 (1): 105–125. doi:10.1002/aaai.12036. ISSN 2371-9621.
  18. ^ "Why Software Should Be Free - GNU Project - Free Software Foundation". www.gnu.org. Archived from the original on 2024-12-01. Retrieved 2024-11-24.
  19. ^ "The Power of Collaboration: How Open-Source Projects are Advancing AI". kdnuggets.com.
  20. ^ Code, Linux (2024-11-03). "A Brief History of Open Source". TheLinuxCode. Retrieved 2024-11-24.[permanent dead link]
  21. ^ "Topic: (/)". www.cs.cmu.edu. Retrieved 2025-09-11.
  22. ^ Priya (2024-03-28). "The Evolution of Open Source AI Libraries: From Basement Brawls to AI All-Stars". TheGen.AI. Retrieved 2024-11-24.
  23. ^ Pulli, Kari; Baksheev, Anatoly; Kornyakov, Kirill; Eruhimov, Victor (1 April 2012). "Realtime Computer Vision with OpenCV". ACM Queue. 10 (4): 40:40–40:56. doi:10.1145/2181796.2206309.
  24. ^ Adrian Kaehler; Gary Bradski (14 December 2016). Learning OpenCV 3: Computer Vision in C++ with the OpenCV Library. O'Reilly Media. pp. 26ff. ISBN 978-1-4919-3800-3.
  25. ^ Costa, Carlos J.; Aparicio, Manuela; Aparicio, Sofia; Aparicio, Joao Tiago (January 2024). "The Democratization of Artificial Intelligence: Theoretical Framework". Applied Sciences. 14 (18): 8236. doi:10.3390/app14188236. hdl:10362/173131. ISSN 2076-3417.
  26. ^ Lee, Timothy B. (2024-11-11). "How a stubborn computer scientist accidentally launched the deep learning boom". Ars Technica. Retrieved 2025-09-11.
  27. ^ a b Xiang, Chloe (2023-02-28). "OpenAI Is Now Everything It Promised Not to Be: Corporate, Closed-Source, and For-Profit". VICE. Retrieved 2024-11-14.
  28. ^ Hao, Karen (September 23, 2020). "OpenAI is giving Microsoft exclusive access to its GPT-3 language model". MIT Technology Review. Archived from the original on 2021-02-05. Retrieved 2024-12-08.
  29. ^ "GPT-3's free alternative GPT-Neo is something to be excited about". VentureBeat. 2021-05-15. Archived from the original on 9 March 2023. Retrieved 2023-04-14.
  30. ^ "EleutherAI: When OpenAI Isn't Open Enough". IEEE Spectrum. 2021-06-02. Archived from the original on March 27, 2022.
  31. ^ Heaven, Will (2022-05-03). "Meta has built a massive new language AI—and it's giving it away for free". MIT Technology Review. Retrieved 2023-12-26.
  32. ^ Nicol-Schwarz, Kai (2025-12-02). "French AI lab Mistral releases new AI models as it looks to keep pace with OpenAI and Google". CNBC. Retrieved 2025-12-05.
  33. ^ Heikkilä, Melissa (December 2, 2025). "Mistral unveils new models in race to gain edge in 'open' AI". Financial Times. Retrieved 2025-12-05.
  34. ^ Nunez, Michael (2023-06-22). "MosaicML challenges OpenAI with its new open-source language model". VentureBeat. Retrieved 2025-07-21.
  35. ^ Chen, Joanne (2023-07-19). "MosaicML launches MPT-7B-8K, a 7B-parameter open-source LLM with 8k context length". VentureBeat. Retrieved 2025-07-21.
  36. ^ a b Mirjalili, Seyedali (2024-08-01). "Meta just launched the largest 'open' AI model in history. Here's why it matters". The Conversation. Retrieved 2024-11-14.
  37. ^ Waters, Richard (2024-10-17). "Meta under fire for 'polluting' open-source". Financial Times. Retrieved 2024-11-14.
  38. ^ Edwards, Benj (18 July 2023). "Meta launches Llama 2, a source-available AI model that allows commercial applications". Ars Technica. Archived from the original on 7 November 2023. Retrieved 14 December 2024.
  39. ^ a b "Meta offers Llama AI to US government for national security". CIO. 5 November 2024. Archived from the original on 14 December 2024. Retrieved 14 December 2024.
  40. ^ Chen, Caiwei (January 24, 2025). "How a top Chinese AI model overcame US sanctions". MIT Technology Review. Archived from the original on 2025-01-25. Retrieved 2025-02-03.
  41. ^ Guo, Daya; et al. (18 September 2025). "DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning". Nature. 645 (8081): 633–638. Bibcode:2025Natur.645..633G. doi:10.1038/s41586-025-09422-z. PMC 12443585. PMID 40962978.
  42. ^ a b Bloom, Peter (2025-02-12). "DeepSeek: how China's embrace of open-source AI caused a geopolitical earthquake". The Conversation. Retrieved 2025-12-09.
  43. ^ Huang, Raffaele (2025-08-13). "China's Lead in Open-Source AI Jolts Washington and Silicon Valley". The Wall Street Journal. Retrieved 2025-12-09.
  44. ^ Cui, Jasmine; Perlo, Jared (2025-11-30). "More of Silicon Valley is building on free Chinese AI". NBC News. Retrieved 2025-12-09.
  45. ^ Welle, Elissa (2025-09-03). "Switzerland releases an open-weight AI model". The Verge. Retrieved 2025-10-08.
  46. ^ Allen, Matthew (2025-09-02). "Switzerland launches transparent ChatGPT alternative". SWI swissinfo.ch. Retrieved 2025-10-08.
  47. ^ Lagos, Anna (September 1, 2025). "Latam-GPT: The Free, Open Source, and Collaborative AI of Latin America". Wired. ISSN 1059-1028. Retrieved 2025-10-08.
  48. ^ Osborn, Catherine (2025-12-22). "Where Does Latin America Stand in the Global AI Race?". Foreign Policy. Retrieved 2025-12-05.
  49. ^ a b Esteva, Andre; Robicquet, Alexandre; Ramsundar, Bharath; Kuleshov, Volodymyr; DePristo, Mark; Chou, Katherine; Cui, Claire; Corrado, Greg; Thrun, Sebastian; Dean, Jeff (January 2019). "A guide to deep learning in healthcare". Nature Medicine. 25 (1): 24–29. Bibcode:2019NatMe..25...24E. doi:10.1038/s41591-018-0316-z. ISSN 1546-170X. PMID 30617335.
  50. ^ Ashraf, Mudasir; Ahmad, Syed Mudasir; Ganai, Nazir Ahmad; Shah, Riaz Ahmad; Zaman, Majid; Khan, Sameer Ahmad; Shah, Aftab Aalam (2021). "Prediction of Cardiovascular Disease Through Cutting-Edge Deep Learning Technologies: An Empirical Study Based on TENSORFLOW, PYTORCH and KERAS". In Gupta, Deepak; Khanna, Ashish; Bhattacharyya, Siddhartha; Hassanien, Aboul Ella; Anand, Sameer; Jaiswal, Ajay (eds.). International Conference on Innovative Computing and Communications. Advances in Intelligent Systems and Computing. Vol. 1165. Singapore: Springer. pp. 239–255. doi:10.1007/978-981-15-5113-0_18. ISBN 978-981-15-5113-0.
  51. ^ Korshunova, Maria; Ginsburg, Boris; Tropsha, Alexander; Isayev, Olexandr (2021-01-25). "OpenChem: A Deep Learning Toolkit for Computational Chemistry and Drug Design". Journal of Chemical Information and Modeling. 61 (1): 7–13. doi:10.1021/acs.jcim.0c00971. ISSN 1549-9596. PMID 33393291.
  52. ^ Pomfret, James; Pang, Jessie (2024-11-01). "Exclusive: Chinese researchers develop AI model for military use on back of Meta's Llama". Reuters. Retrieved 2024-11-16.
  53. ^ a b c Roth, Emma (2024-11-04). "Meta AI is ready for war". The Verge. Retrieved 2024-11-16.
  54. ^ a b Dean, Jeffrey (2022-05-01). "A Golden Decade of Deep Learning: Computing Systems & Applications". Daedalus. 151 (2): 58–74. doi:10.1162/daed_a_01900. ISSN 0011-5266.
  55. ^ Hassri, Myftahuddin Hazmi; Man, Mustafa (2023-12-07). "The Impact of Open-Source Software on Artificial Intelligence". Journal of Mathematical Sciences and Informatics. 3 (2). doi:10.46754/jmsi.202312006. ISSN 2948-3697.
  56. ^ Solaiman, Irene (May 24, 2023). "Generative AI Systems Aren't Just Open or Closed Source". Wired. Archived from the original on November 27, 2023. Retrieved July 20, 2023.
  57. ^ Machado, J. (2025). Toward a Public and Secure Generative AI: A Comparative Analysis of Open and Closed LLMs. Conference Paper. arXiv:2505.10603.
  58. ^ a b White, Matt; Haddad, Ibrahim; Osborne, Cailean; Xiao-Yang Yanglet Liu; Abdelmonsef, Ahmed; Varghese, Sachin; Arnaud Le Hors (2024). "The Model Openness Framework: Promoting Completeness and Openness for Reproducibility, Transparency, and Usability in Artificial Intelligence". arXiv:2403.13784 [cs.LG].
  59. ^ Gujar, Praveen. "Council Post: Building Trust In AI: Overcoming Bias, Privacy And Transparency Challenges". Forbes. Retrieved 2024-11-27.
  60. ^ O'Brien, Matt (2024-07-30). "White House says no need to restrict open-source AI, for now". Associated Press. PBS News. Retrieved 2024-11-14.
  61. ^ a b c d e Eiras, Francisco; Petrov, Aleksandar; Vidgen, Bertie; Schroeder, Christian; Pizzati, Fabio; Elkins, Katherine; Mukhopadhyay, Supratik; Bibi, Adel; Purewal, Aaron (2024-05-29). "Risks and Opportunities of Open-Source Generative AI". arXiv:2405.08597 [cs.LG].
  62. ^ Kathikar, Adhishree; Nair, Aishwarya; Lazarine, Ben (2023). "Assessing the Vulnerabilities of the Open-Source Artificial Intelligence (AI) Landscape: A Large-Scale Analysis of the Hugging Face Platform". 2023 IEEE International Conference on Intelligence and Security Informatics (ISI). pp. 1–6. doi:10.1109/ISI58743.2023.10297271. ISBN 979-8-3503-3773-0.