About statsmodels
statsmodels is a Python package that provides a complement to scipy for
statistical computations including descriptive statistics and estimation
and inference for statistical models.
Documentation
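The documentation for the latest release is at
https://www.statsmodels.org/stable/
The documentation for the development version is at
https://www.statsmodels.org/dev/
Recent improvements are highlighted in the release notes
https://www.statsmodels.org/stable/release/
A backup of the documentation is available at
https://statsmodels.github.io/stable/
Main Features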
- Linear regression models:
  - Ordinary least squares
  - Generalized least squares
  - Weighted least squares
  - Least squares with autoregressive errors
  - Quantile regression
  - Recursive least squares
- Mixed Linear Model with mixed effects and variance components
- GLM: Generalized linear models with support for all of the one-parameter
exponential family distributions
- Bayesian Mixed GLM for Binomial and Poisson
- GEE: Generalized Estimating Equations for one-way clustered or longitudinal data
- Discrete models:
  - Logit and Probit
  - Multinomial logit (MNLogit)
  - Poisson and Generalized Poisson regression
  - Negative Binomial regression
  - Zero-Inflated Count models
- RLM: Robust linear models with support for several M-estimators.
- Time Series Analysis: models for time series analysis
  - Complete StateSpace modeling framework
    - Seasonal ARIMA and ARIMAX models
    - VARMA and VARMAX models
    - Dynamic Factor models
    - Unobserved Component models
  - Markov switching models (MSAR), also known as Hidden Markov Models (HMM)
  - Univariate time series analysis: AR, ARIMA
  - Vector autoregressive models, VAR and structural VAR
  - Vector error correction model, VECM
  - Exponential smoothing, Holt-Winters
  - Hypothesis tests for time series: unit root, cointegration and others
  - Descriptive statistics and process models for time series analysis
- Survival analysis:
  - Proportional hazards regression (Cox models)
  - Survivor function estimation (Kaplan-Meier)
  - Cumulative incidence function estimation
- Multivariate:
  - Principal Component Analysis with missing data
  - Factor Analysis with rotation
  - MANOVA
  - Canonical Correlation
- Nonparametric statistics: Univariate and multivariate kernel density estimators
- Datasets: Datasets used for examples and in testing
- Statistics: a wide range of statistical tests
  - Diagnostics and specification tests
  - Goodness-of-fit and normality tests
  - Functions for multiple testing
  - Various additional statistical tests
- Imputation with MICE, regression on order statistics, and Gaussian imputation
- Mediation analysis
- Graphics: plot functions for visual analysis of data and model results
- I/O
  - Tools for reading Stata .dta files (pandas provides a more recent reader)
  - Table output to ASCII, LaTeX, and HTML
- Miscellaneous models
- Sandbox: statsmodels contains a sandbox folder with code in various stages of
  development and testing which is not considered "production ready". This
  includes, among others:
  - Generalized method of moments (GMM) estimators
  - Kernel regression
  - Various extensions to scipy.stats.distributions
  - Panel data models
  - Information theoretic measures
How to get it
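The main branch on GitHub has the most up-to-date code
https://github.com/statsmodels/statsmodels
Source downloads of release tags are available on GitHub
https://github.com/statsmodels/statsmodels/tags
Binaries and source distributions are available from PyPI
https://pypi.org/project/statsmodels/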
Getting the latest code
Installing the most recent nightly wheel
The most recent nightly wheel can be installed using pip.
python -m pip install -i https://pypi.anaconda.org/scientific-python-nightly-wheels/simple statsmodels --upgrade --use-deprecated=legacy-resolver
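After installing, it can be useful to confirm which version ended up in the active environment. The sketch below uses only the Python standard library; the helper name `installed_version` is hypothetical and not part of statsmodels.

```python
from importlib import metadata


def installed_version(package="statsmodels"):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("statsmodels is not installed in this environment")
else:
    print(f"statsmodels {version} is installed")
```

Because the lookup reads package metadata rather than importing the package, it also works when the import itself would fail.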
Installing from sources
See INSTALL.txt for requirements or see the documentation
Contributing
Contributions in any form are welcome, including:
- Documentation improvements
- Additional tests
- New features to existing models
- New models
License
Modified BSD (3-clause)
Discussion and Development
Discussions take place on the mailing list
Bug Reports
Bug reports can be submitted to the issue tracker at
https://github.com/statsmodels/statsmodels/issues