Data management
| data transformations, match-merge, ODBC, XML, by-group processing, append files, sort, row–column transposition, labeling, saving results, more |
Basic statistics
| summaries, cross-tabulations, correlations, t tests, equality-of-variance tests, tests of proportions, confidence intervals, factor variables, more |
Linear models
| regression; bootstrap, jackknife, and robust Huber/White/sandwich variance estimates; instrumental variables; three-stage least squares; constraints; quantile regression; GLS; more |
Multilevel mixed-effects models
| continuous, binary, and count outcomes; two-, three-, and multiway random-intercepts and random-coefficients models; crossed random effects; ML and REML estimation; BLUPs of effects and fitted values; hierarchical models; residual error structures; more |
Binary, count, and limited dependent variables
| logistic, probit, tobit; Poisson and negative binomial; conditional, multinomial, nested, ordered, rank-ordered, and stereotype logistic; multinomial probit; zero-inflated and zero-truncated count models; selection models; marginal effects; more |
Panel data/longitudinal data
| random- and fixed-effects models with robust standard errors, linear mixed models, random-effects probit, GEE, random- and fixed-effects Poisson, dynamic panel-data models, and instrumental-variables regression; panel unit-root tests; AR(1) disturbances; more |
Generalized linear models (GLMs)
| ten link functions, user-defined links, seven distributions, ML and IRLS estimation, nine variance estimators, seven residuals, more |
Nonparametric methods
| Wilcoxon–Mann–Whitney, Wilcoxon signed ranks, and Kruskal–Wallis tests; Spearman and Kendall correlations; Kolmogorov–Smirnov tests; exact binomial CIs; more |
Exact statistics
| exact logistic and Poisson regression, exact case–control statistics, binomial tests, Fisher’s exact test for r × c tables, more |
ANOVA/MANOVA
| balanced and unbalanced designs; factorial, nested, and mixed designs; repeated measures; marginal means; more |
Multivariate methods
| factor analysis, principal components, discriminant analysis, rotation, multidimensional scaling, Procrustean analysis, correspondence analysis, biplots, dendrograms, user-extensible analyses, more |
Cluster analysis
| hierarchical clustering; kmeans and kmedians nonhierarchical clustering; dendrograms; stopping rules; user-extensible analyses; more |
Resampling and simulation methods
| bootstrap, jackknife, and Monte Carlo simulation; permutation tests; more |
Model testing and postestimation support
| Wald tests; LR tests; linear and nonlinear combinations, tests, and predictions; marginal means, least-squares means, adjusted means; average partial and marginal effects; Hausman tests; more |
Graphics
| line charts, scatterplots, bar charts, pie charts, hi–lo charts, Graph Editor, regression diagnostic graphs, survival plots, nonparametric smoothers, distribution Q–Q plots, more |
Survey methods
| sampling weights; multistage designs; stratification and poststratification; DEFF; means, proportions, ratios, totals; summary tables; predictive margins; bootstrap, jackknife, and linearization-based variance estimation; regression, instrumental variables, probit, Cox regression; more |
Survival analysis
| Kaplan–Meier and Nelson–Aalen estimators; Cox regression (frailty); parametric models (frailty); competing risks; hazards; time-varying covariates; left and right censoring; Weibull, exponential, and Gompertz analysis; sample size and power analysis; more |
Tools for epidemiologists
| standardization of rates, case–control, cohort, matched case–control, Mantel–Haenszel, pharmacokinetics, ROC analysis, ICD-9-CM, more |
Time series
| ARIMA, ARCH/GARCH, VAR, VECM, multivariate GARCH, dynamic factors, state-space models, high-frequency data, correlograms, periodograms, white-noise tests, unit-root tests, Holt–Winters smoothers, Haver Analytics data, rolling and recursive estimation, more |
Multiple imputation
| five univariate imputation methods, multivariate normal imputation, explore pattern of missingness, manage imputed datasets, estimate model and pool results, transform parameters, joint tests of parameter estimates, more |
Maximum likelihood
| user-specified functions; NR, DFP, BFGS, BHHH; OIM, OPG, robust, bootstrap, and jackknife matrices; Wald tests; survey data; numeric or analytic derivatives; more |
Other statistical methods
| generalized method of moments (GMM), sample size and power, nonlinear regression, stepwise regression, statistical and mathematical functions, more |
Programming language
| adding new commands, command scripting, if, while, command parsing, debugging, menu and dialog-box programming, markup and control language, more |
Matrix programming—Mata
| interactive sessions, large-scale development projects, optimization, matrix inversions, decompositions, eigenvalues and eigenvectors, LAPACK engine, real and complex numbers, string matrices, interface to Stata datasets and matrices, numerical derivatives, object-oriented programming, more |
Internet capabilities
| ability to install new commands, web updating, web file sharing, latest Stata news, more |
Accessibility
| Section 508 compliance, accessibility for persons with disabilities |
Sample session
User-written commands
| User-written commands for meta-analysis, data management, survival, econometrics, more |