Statistical Software Overview: STAT 5302-5303

STA 6166 Statistical Methods in Research I
PACKAGES With possibly a few exceptions, the latest versions of these packages have all the features needed for this course.
  MINITAB A menu-driven, all-in-one statistical and graphical analysis software package, known for its ease of use, reliability, and broad collection of methods. The professional version is expensive, but an affordable student version is also available. Two options are to order it bundled with the textbook or to rent it from Minitab for the semester. Details at www.minitab.com. Note that the student version has limitations, among them a cap on the size of datasets that can be analyzed, but it is sufficient for this course.

As of Fall 2007, Texas Tech does not have site licenses to offer students. However, apart from Minitab's own rental option, students can also rent a copy for 6 months from e-academy.

  SAS A complete statistical data analysis and data management package. Used by most applied statisticians. Expensive, but Texas Tech faculty/staff/students can get a site license from TTU's Technology Support Center for a reasonable fee. The Center also offers free training short courses in SAS. SAS is also freely available on all computers in the library.

SAS uses a scripting language to tell the computer what data manipulations and computations to perform. The learning curve for the language is much longer than for the menu-driven packages. See www.sas.com for more about the package.

 

SPSS

Another complete statistical data analysis package, one primarily designed for data analysis in the social sciences. This is a menu-driven package that provides output "objects" that can be cut and pasted as is into word processing documents. It is similar to Minitab. Texas Tech faculty/staff/students can get a site license from TTU's Technology Support Center for a reasonable fee. The Center also offers free training short courses in SPSS.

Check out http://www.spss.com/.

  JMP This is a menu-driven analysis package from SAS Institute that has gained much appeal in the biological sciences. JMP's approach to data analysis is very much exploratory, hence statistical graphics are integral to the output. You purchase JMP outright (like Minitab) and there is a student edition that can be bundled with the textbook. Check out http://www.jmpdiscovery.com/product/index.shtml. Also, there are JMP books available from the Duxbury Site.
  S+ S+ is an object-oriented, command-driven programming language that is specifically designed for statistical analysis. Many academic statisticians use S+ because it gives them the flexibility to modify and combine existing procedures and to program new, recently released methodologies. The learning curve for S+ is the longest of any of the packages discussed here, but it is also the most flexible. The most recent version of S+ has menus and associated dialog boxes, plus file import and graph export capabilities, that make it much more user-friendly. It can be purchased outright or under a license agreement and is not cheap. Check out www.statsci.com.
  Matlab Similar to S+, Matlab is an object-oriented, command-driven programming language, but is specifically designed for and by the engineering sciences. It has full statistical analysis capabilities. It is expensive, but student versions are available at a lower cost. Check out www.mathworks.com.
  R R is a free, object-oriented, command-driven programming language modeled on the S language that underlies S+. It is not a menu-driven system, but it does have all of the features needed for this course, and it costs nothing. It has the same long learning curve as S+. Check out http://cran.us.r-project.org/.
  EXCEL

A spreadsheet program, EXCEL has a statistical analysis add-in tool that will perform some of the analyses requested as part of this course.

The statistical analysis add-in has some limitations, especially as we get to more complex analyses. Graphics are excellent but often not of the type particularly useful for statistical analysis. There are other statistical analysis add-in packages for EXCEL that can be purchased (see for example Analyse-it or XLstat), but the instructor has no experience with them. Finally, EXCEL, QUATTRO PRO and other spreadsheet programs are excellent platforms for entering raw data and performing minor data manipulations. Plan on using your spreadsheet program but also add one of the statistical analysis packages above to your professional tool kit.

  Others

We cannot begin to list all of the statistical analysis packages currently available. If a package is available to you at home or work and you want to use it, first check that it has routines for the analyses listed below. Check out the list of 129 statistics and mathematics packages at the link below. The list gives United Kingdom suppliers, but you can easily find the US suppliers from these links. http://www.stats.gla.ac.uk/cti/links_stats/software.html.

Of course, you probably haven't had statistics yet, so this list may not make much sense to you. This is why the programs listed above are recommended; they meet most if not all of the requirements below. If you choose another package, check its user's guide to see whether it has sections that look somewhat like the following. Otherwise you can take your chances and hope it is all there (in most cases it will be, since these are basic statistical routines that any statistics package should have). Other statistical programs should be able to do the following:

Statistical Graphics Construct bar charts, pie charts, scatterplots and, most importantly, histograms. In addition, it would be great if the program could construct 3-dimensional plots and 2-dimensional contour plots.
One- and Two-sample tests Perform one- and two-sample z- and t-tests. Output p-values. In addition, it will simplify your life if the program has a routine for computing sample sizes for these tests.
Frequency tables Perform one- and two-sample chi-square tests and output the associated p-values.
Regression analysis Perform simple and multiple regression analysis. Estimate parameters and their confidence intervals, and produce confidence and prediction bands for the fitted regression model. Output the associated analysis of variance table. Facilitate variable selection in multiple regression. In addition, it would be nice if it also provided the facility to perform a residual analysis and had some facility for outlier or influential point detection.
Analysis of Variance Provide analysis of variance computations for standard multiple treatment factor designs in completely randomized, randomized complete block and Latin square designs.
Analysis of covariance Provide analysis of covariance computations for one covariate in a completely randomized design with one treatment factor at multiple levels.
Generalized Linear Models (GLMs) Fit GLMs with common link functions, in particular logistic and Poisson regression; unbalanced ANOVAs; incomplete block designs; etc.
Linear Mixed Models (LMMs) Fit LMMs, which are linear models with both fixed and random effects.
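
To give a feel for what the simplest routines in this list compute, here is a minimal sketch in Python using only its standard library. It is not one of the packages recommended above and is not part of the course materials. The function names and the data are made up for illustration. It covers a two-sided one-sample z-test (assuming the population standard deviation is known) and a simple least-squares regression line; any package you choose should offer both as built-in routines, along with everything else in the list.

```python
import math
from statistics import NormalDist, mean


def one_sample_z_test(sample, mu0, sigma):
    """Two-sided one-sample z-test, assuming a known population sd `sigma`.

    Returns the z statistic and the two-sided p-value.
    """
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / math.sqrt(n))
    # Two-sided p-value from the standard normal distribution.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


def simple_linear_regression(x, y):
    """Least-squares estimates (intercept, slope) for y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


if __name__ == "__main__":
    # Illustrative (made-up) measurements; test H0: mean = 5.0 with sigma = 0.2.
    z, p = one_sample_z_test([5.1, 4.9, 5.3, 5.2, 5.0], mu0=5.0, sigma=0.2)
    print(f"z = {z:.3f}, two-sided p-value = {p:.3f}")

    # Illustrative (made-up) points for a straight-line fit.
    b0, b1 = simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
    print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
```

A full package adds what this sketch leaves out: t-tests, confidence and prediction bands, ANOVA tables, and diagnostics.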
