List of data science software
From Wikipedia, the free encyclopedia
This is a list of software and platforms used in data science, including programming languages, development environments, machine learning frameworks, data engineering tools, statistical software, data analysis and plotting tools, and MLOps systems.
Programming languages
Development environments
These interactive notebooks, IDEs, and platforms provide specialised development environments.
- Apache Zeppelin[6]
- Architect — Eclipse (software)
- CoCalc
- Dataiku Data Science Studio
- FreeMat
- GNU Octave
- Google Colab
- DataSpell
- Jupyter Notebook / JupyterLab
- Kaggle Notebooks
- MATLAB
- O-Matrix
- PyCharm
- RStudio
- SAS (software) and SAS Studio[7]
- Spyder
- Visual Studio Code[8]
Machine and deep learning software
These machine learning and deep learning tools support development in those fields.
See also: Comparison of deep learning software, List of data mining and machine learning software, List of open-source machine learning software, and List of neural network software
- Apache Mahout
- Apache MXNet
- Apache SINGA
- BigDL
- Caffe
- CatBoost
- Chainer
- Data Analytics Acceleration Library
- Deeplearning4j
- Dlib
- Encog
- Flux
- Google JAX
- Keras
- LIBSVM
- LightGBM
- MATLAB + Deep Learning Toolbox
- Microsoft Cognitive Toolkit
- MindsDB
- MindSpore
- ML.NET
- Neural Designer
- Neural Network Intelligence
- oneAPI
- OpenNN
- PlaidML
- PyTorch
- QLattice
- Scikit-learn
- Shogun (toolbox)
- TensorFlow
- Theano
- Torch
- Tree-based pipeline optimization tool
- XGBoost
- Weka
- Wolfram Mathematica[9]
Data engineering
Examples of data engineering tools:
See also: Comparison of data modeling tools
- Apache Airflow
- Apache Flink
- Apache Hadoop
- Apache Kafka
- Apache NiFi
- Apache Spark
- Dask
- Data build tool (dbt)
Data mining
Examples of data mining tools.
See also: List of data mining software
Free and open-source
Proprietary
Database management
See also: List of relational database management systems, List of NoSQL databases, List of SQL software and tools, List of NoSQL software and tools, Comparison of database administration tools, and List of time series databases
Data warehouses
Data warehouse environments include:
Data lakes
Data lake environments include:
Algorithms
See also: List of algorithms, List of machine learning algorithms, List of optimization algorithms, and List of database algorithms
- Apriori algorithm – frequent itemset mining and association rule learning in market basket analysis
- Backpropagation – algorithm for training artificial neural networks using gradient descent
- Decision Trees – tree-based algorithm for classification and regression
- Expectation–maximization algorithm – iterative procedure for maximum likelihood estimation with latent variables
- Gradient descent – iterative optimization algorithm for minimizing a loss function
- ID3 algorithm – used to generate a decision tree from a dataset
- K-Means – clustering algorithm based on minimizing within-cluster distances
- K-Nearest Neighbors (KNN) – instance-based learning and classification method
- Linear regression – estimation method for predicting a dependent variable based on independent variables
- Logistic regression – classification algorithm for predicting a binary outcome
- Naive Bayes – probabilistic classifier based on Bayes' theorem
- Ordinary least squares – estimation method for parameters in linear regression
- PageRank – graph-based algorithm for link analysis and search ranking
- Principal component analysis – technique for reducing the dimensionality of data while preserving as much variance as possible
- Q-learning – reinforcement learning algorithm for learning optimal actions
- Random forest – ensemble of decision trees for improved classification or regression
- Sequential minimal optimization – solver for training support vector machines
- Stochastic gradient descent – randomized variant of gradient descent for large-scale machine learning
- Support Vector Machines (SVM) – algorithm for finding a hyperplane to separate classes[13][14]
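Several of the entries above are closely related. As an illustration only, the following minimal Python sketch fits a simple linear regression in two ways: with the closed-form ordinary least squares solution, and with gradient descent on the mean squared error. The function names, the sample data, the learning rate, and the step count are all assumptions chosen for this example and do not come from any of the listed software.

```python
def ols_fit(xs, ys):
    """Closed-form ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the ratio of the x-y covariance to the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def gradient_descent_fit(xs, ys, learning_rate=0.05, steps=5000):
    """Fit the same line by gradient descent on the mean squared error."""
    n = len(xs)
    slope, intercept = 0.0, 0.0
    for _ in range(steps):
        residuals = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error.
        grad_slope = 2.0 / n * sum(r * x for r, x in zip(residuals, xs))
        grad_intercept = 2.0 / n * sum(residuals)
        # Step against the gradient (hypothetical fixed learning rate).
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Small made-up data set, roughly following y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]

ols_slope, ols_intercept = ols_fit(xs, ys)
gd_slope, gd_intercept = gradient_descent_fit(xs, ys)
print("OLS:", ols_slope, ols_intercept)
print("Gradient descent:", gd_slope, gd_intercept)
```

On this small data set both approaches should agree to within numerical tolerance. In practice the closed-form solution is typical for small linear problems, while gradient descent and its stochastic variant are used when the model or data set is too large for a direct solution.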
Statistical software
Open-source
- ADaMSoft
- ADMB
- Chronux
- DAP
- Epi Info
- Fityk
- GNU Octave
- gretl
- Intrinsic Noise Analyzer
- jamovi
- JASP
- JMulTi
- Just another Gibbs sampler (JAGS)
- Mondrian
- Neurophysiological Biomarker Toolbox
- OpenBUGS
- OpenEpi
- OpenMx
- Ploticus
- PSPP
- Programming with Big Data in R
- R Commander
- Rattle GUI
- Revolution Analytics
- RStudio
- Salstat
- Scilab
- SciPy
- Simfit
- SOCR
- SOFA Statistics
- Stan
- Statistical Lab
Public domain
Freeware
Proprietary
- Analytica
- ASReml
- BMDP
- DB Lytix
- EViews
- GAUSS
- Genedata
- GenStat
- GLIM
- GraphPad Prism
- Igor Pro
- IMSL Numerical Libraries
- JMP
- LIMDEP
- LISREL
- Maple
- Mathematica
- MATLAB
- MedCalc
- Microfit
- Minitab
- MLwiN
- Nacsport Video Analysis Software
- NAG Numerical Library
- NCSS
- NLOGIT
- nQuery Sample Size Software
- O-Matrix
- PASS Sample Size Software
- Primer-E Primer
- Qlucore
- RATS
- S-PLUS
- SHAZAM
- SigmaStat
- SIMUL
- SmartPLS
- Speakeasy
- SPSS
- Stata
- StatCrunch
- Statgraphics
- Statistica
- StatsDirect
- StatXact
- SuperCROSS
- SYSTAT
- The Unscrambler
- WarpPLS
- World Programming System
- XploRe
Data processing
Tools for data processing and analysis:
See also: List of data analysis software and Comparison of OLAP servers
- AIDA
- Alteryx
- Apache Kudu
- Aphelion
- ClickHouse
- Cubes (OLAP server)
- DADiSP
- DAP
- Data Analysis Expressions
- Databricks
- Data Discovery and Query Builder
- Dataiku
- DIVA
- Dplyr
- Easystats
- Ecu.test
- EditGrid
- EgoNet
- Epi Info
- EViews
- Endrov
- Eye-Sys
- FlexPro
- FreeMat
- Fsc2
- GNU Octave
- ILNumerics
- Imc FAMOS
- InfiniteGraph
- Informatica
- Java Analysis Studio
- JMP
- Kirix Strata
- KnetMiner
- LabWindows/CVI
- LIONsolver
- MATLAB
- MagicPlot
- MetaboAnalyst
- MEX file
- Microsoft Analysis Services
- Monarch
- Moose (analysis)
- MountainsMap
- Natural Language Toolkit
- NetMiner
- Nirvana
- Ocean Data View
- OpenRefine
- OpenScientist
- Origin
- Pandas
- Paxata
- Pipeline Pilot
- Poimapper
- Polars
- PolyAnalyst
- PowerLab
- RCFile
- ROOT
- RRDtool
- SAS
- Seeq Corporation
- SekChek Local
- SensoMotoric Instruments
- Sisense
- SmartPLS
- Social network analysis software
- SolveIT
- Speakeasy (computational environment)
- SuperCROSS
- Tidyverse
- Trifacta
- Truviso
- WarpPLS
- XLfit[15]
Data and information visualization
Software for data visualization:
See also: List of information graphics software, List of charting software, Comparison of JavaScript charting libraries, and List of free data visualization software
- Amira
- AnyChart
- Apache Superset
- Avizo
- Baudline
- BisQue (Bioimage Analysis and Management Platform)
- Calligra Sheets
- Catpac
- Chart.js
- Cloudera
- ColorBrewer
- COMPLEAT (bioinformatics tool)
- Creately
- D3.js
- DataGraph
- DataScene
- DataViva
- Diagrams.net
- Epi Map
- Eye-Sys
- FlexPro
- FreeMat
- FusionCharts
- GeoGebra
- Gephi
- ggplot2
- Gnuplot
- Gliffy
- GRAPE
- GrADS
- Grace
- Grafana
- GraphPad Prism
- Graphviz
- HippoDraw
- Histcite
- IBM Cognos Analytics
- Imc FAMOS
- Infogram
- InfoZoom
- InfiniteGraph
- Igor Pro
- Java Analysis Studio
- Jedox
- JFreeChart
- JMP
- Kig
- Kitware
- KnetMiner
- Kst
- LabPlot
- LabVIEW
- LabWindows/CVI
- Lavastorm Analytics
- LibreOffice
- LIONsolver
- LiSiCA
- MagicPlot
- Maple
- Mathcad
- Mathematica
- MATLAB
- Maxima
- MedCalc
- MetaboAnalyst
- MEX file
- Microsoft Analysis Services
- Microsoft Excel
- Microsoft Power BI
- MicroStrategy
- Monarch
- Moose (analysis)
- MountainsMap
- Molecular Evolutionary Genetics Analysis
- Netvibes
- Numbers for Mac
- Ocean Data View
- OpenOffice.org Calc
- OpenScientist
- Origin
- ParaView
- PathVisio
- Perl Data Language
- PGPLOT
- Ploticus
- Plotly
- plotutils
- Poimapper
- PolyAnalyst
- PowerLab
- Psychometric software
- Pyramid Analytics
- QtiPlot
- Qunb
- RGraph
- ROOT
- RRDtool
- SAS
- Seaborn
- Sisense
- SmartPLS
- Social network analysis software
- TAChart
- Tableau
- TeeChart
- Tomviz
- Trade Space Visualizer
- Trendalyzer
- Truviso
- Vaa3D
- Visual.ly
- WarpPLS
- XLfit
Plotting software
Software for plotting data to support processing and visualise results.
- Analytica
- CricketGraph
- Data Desk
- DISLIN
- Earth sciences graphics software
- Generic Mapping Tools
- GraphCalc
- Grapher
- Gri graphical language
- Intel Array Visualizer
- IRows
- JASP
- Kst
- LabPlot
- MapleSim
- Mondrian
- MWorks
- NuCalc
- Pipeline Pilot
- Ploticus
- PLplot
- ProStat
- PSI-Plot
- Pyxplot
- SciDAVis
- TableCurve 2D
- TableCurve 3D
- Tecplot
- TinkerPlots
- TOPCAT
- TopoFusion
- Veusz
- VisIt
- Winplot
- Wolfram Mathematica
Maps and geospatial visualization
See also: List of GIS mapping software
Machine learning
MLOps and model deployment:
- BentoML[16]
- Data Version Control (DVC)
- Kubeflow
- MLflow[17][18]
- Seldon Core[19]
- Streamlit[20][21]
- TensorFlow Serving[22][23]
- Weights & Biases[24]
Data repositories
See also: List of datasets for machine-learning research, Comparison of source-code-hosting facilities, and Comparison of data-serialization formats
- Kaggle – platform for data science competitions, datasets,[25] and notebooks.
- Zenodo – open-access repository supported by CERN and the EU.
- University of California, Irvine Machine Learning Repository[26]
- OpenML – collaborative platform for sharing datasets, algorithms, and experiments.[27]
See also
Wikibooks has a book on the topic of: Data Science: An Introduction
- Business intelligence software
- List of data science journals
- List of R software and tools
- Lists of mathematical software and List of open-source software for mathematics
- List of numerical-analysis software and List of numerical analysis topics
- List of open-source data science software
- Common Crawl – nonprofit that crawls the web and freely provides its archives and datasets to the public under an MIT License
References
1. "Top 10 Java Libraries for Data Science". GeeksforGeeks. September 22, 2024.
2. "Swift for Data Science: An Introduction - Alibaba Cloud". www.alibabacloud.com.
3. "Top 12 Data Science Programming Languages | MDS@Rice". csweb.rice.edu.
4. "5 Types of Programming Languages for Data Scientists".
5. "The Role of Programming Languages in Data Science". New York Tech Online College of Engineering & Computer Sciences.
6. "Apache Zeppelin 0.10.0 Documentation". zeppelin.apache.org.
7. Monaco, Michael A.; Dexter, Marie; Tamburro, Jennifer. "Introduction to SAS® Studio" (PDF). Proceedings 2014 Paper SAS302-2014. Cary, NC: SAS Institute Inc.
8. "6 Best Python IDEs for Data Science in 2025". www.datacamp.com.
9. "8 Best Machine Learning Software To Use in 2025".
10. Hiter, Shelby (April 25, 2023). "10 Best Data Mining Tools & Software".
11. "Cloud Data Warehouse Comparison: Amazon Redshift, Google BigQuery, Azure Synapse, Snowflake, and Databricks". www.linkedin.com.
12. Darley, James (September 24, 2025). "Top 10: AI Data Lakes". aimagazine.com.
13. "Top 10 Algorithms for Data Science".
14. "Machine Learning Algorithms". August 17, 2023.
15. Staff, Coursera (May 9, 2025). "15 Data Analysis Tools and When to Use Them". Coursera.
16. "BentoML". GitHub.
17. "MLflow". mlflow.org.
18. Zaharia, Matei A.; Chen, Andrew; Davidson, Aaron; Ghodsi, Ali; Hong, Sue Ann; Konwinski, Andy; Murching, Siddharth; Nykodym, Tomas; Ogilvie, Paul; Parkhe, Mani; Xie, Fen; Zumar, Corey (September 28, 2018). "Accelerating the Machine Learning Lifecycle with MLflow". IEEE Data Eng. Bull. – via GitHub.
19. "Production-ready ML Serving Framework | Seldon Core 2". docs.seldon.ai.
20. "Streamlit/Streamlit". GitHub.
21. https://docs.streamlit.io
22. "Serving Models | TFX". TensorFlow.
23. "tensorflow/serving". September 27, 2025 – via GitHub.
24. "wandb/wandb". September 28, 2025 – via GitHub.
25. "Find Open Datasets and Machine Learning Projects | Kaggle".
26. https://archive.ics.uci.edu
27. "OpenML".