Statsmodels: statistical modeling and econometrics in Python

About Statsmodels

Statsmodels is a Python package that complements SciPy for statistical computation, including descriptive statistics and estimation and inference for statistical models.

Documentation

The documentation for the latest release is at

www.statsmodels.org/stable/

The documentation for the development version is at

www.statsmodels.org/dev/

Recent improvements are highlighted in the release notes

www.statsmodels.org/stable/rele…

Backups of documentation are available at statsmodels.github.io/stable/ and statsmodels.github.io/dev/.

Main Features

  • Linear regression models:
    • Ordinary least squares
    • Generalized least squares
    • Weighted least squares
    • Least squares with autoregressive errors
    • Quantile regression
  • Mixed Linear Model with mixed effects and variance components
  • GLM: Generalized linear models with support for all of the one-parameter exponential family distributions
  • GEE: Generalized Estimating Equations for one-way clustered or longitudinal data
  • Discrete models:
    • Logit and Probit
    • Multinomial logit (MNLogit)
    • Poisson regression
    • Negative Binomial regression
  • RLM: Robust linear models with support for several M-estimators
  • Time Series Analysis: models for time series analysis
    • Complete StateSpace modeling framework
      • Seasonal ARIMA and ARIMAX models
      • VARMA and VARMAX models
      • Dynamic Factor models
    • Markov switching models (MSAR), also known as Hidden Markov Models (HMM)
    • Univariate time series analysis: AR, ARIMA
    • Vector autoregressive models, VAR and structural VAR
    • Hypothesis tests for time series: unit root, cointegration and others
    • Descriptive statistics and process models for time series analysis
  • Survival analysis:
    • Proportional hazards regression (Cox models)
    • Survivor function estimation (Kaplan-Meier)
    • Cumulative incidence function estimation
  • Nonparametric statistics: (Univariate) kernel density estimators
  • Datasets: Datasets used for examples and in testing
  • Statistics: a wide range of statistical tests
    • diagnostics and specification tests
    • goodness-of-fit and normality tests
    • functions for multiple testing
    • various additional statistical tests
  • Imputation with MICE and regression on order statistic
  • Mediation analysis
  • Principal Component Analysis with missing data
  • I/O
    • Tools for reading Stata .dta files into NumPy arrays
    • Table output to ASCII, LaTeX, and HTML
  • Miscellaneous models
  • Sandbox: statsmodels contains a sandbox folder with code in various stages of development and testing that is not considered "production ready". This covers, among others:
    • Generalized method of moments (GMM) estimators
    • Kernel regression
    • Various extensions to scipy.stats.distributions
    • Panel data models
    • Information theoretic measures

How to get it

The master branch on GitHub has the most up-to-date code

www.github.com/statsmodels…

Source downloads of release tags are available on GitHub

github.com/statsmodels…

Binaries and source distributions are available from PyPI

pypi.python.org/pypi/statsm…

Binaries can be installed in Anaconda

conda install statsmodels

Development snapshots are also available in Anaconda (infrequently updated)

conda install -c conda.binstar.org/statsmodels statsmodels
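
Whichever installation route is used, a quick way to confirm that the package is importable is to print its version from Python. This is only a sanity-check sketch; the version string shown will depend on what was installed.

```python
import statsmodels

# Report which statsmodels version the current interpreter picks up
version = statsmodels.__version__
print("statsmodels", version)
```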

Installing from sources

See INSTALL.txt for requirements or see the documentation

statsmodels.github.io/dev/install…

License

Modified BSD (3-clause)

Discussion and Development

Discussions take place on our mailing list.

groups.google.com/group/pysta…

We are very interested in feedback about usability and suggestions for improvements.

Bug Reports

Bug reports can be submitted to the issue tracker at

github.com/statsmodels…