Stat-JR is a statistical environment that allows you to explore, analyse, and display data, using in-built functionality (including an algebra system, the eSTAT MCMC engine, and a custom C++ engine) and/or a large range of third-party-authored software with which it can interoperate.
Stat-JR is implemented in Python. Users have a choice of three different Stat-JR interfaces: a chiefly menu-driven interface called TREE, an eBook-reader called DEEP (both run via a browser), and a command-line interface called runStatJR.
Stat-JR employs a modular system of templates, each of which executes a particular function with users' data. Templates can be written by users, thus allowing extension of Stat-JR's functionality, and dissemination of new statistical methodology and algorithms.
The initials of Stat-JR (pronounced "stature") are taken from those of the late Jon Rasbash, whose vision was instrumental to its conception.
Given the plethora of statistical software packages already available, why use Stat-JR? Today's statistical software environment is an exciting, fast-moving one, and we hope Stat-JR offers a valuable and useful addition to this via its unique combination of functionality, including interoperability with other software packages, its own in-house engines, the transparency of its modular template system, and the range of different ways users can choose to interact with it, including a dynamic eBook-reading interface.
Stat-JR is a universal gateway to many other statistical packages that users may be unfamiliar with - it can take your data and open, run and return results from a range of other software programmes - effectively allowing access to a huge statistical resource through a common, user-friendly interface. In doing so, it provides tools to help teach software-specific knowledge to those wishing to learn, but also circumvents the immediate need to master software-specific techniques each time functionality of a new package is required.
The modular system of templates, which users themselves can author, allows extension of Stat-JR's functionality to include novel statistical methodology and algorithms, thus allowing statistical researchers and algorithm developers to disseminate their techniques to those who would find them useful. The template system also allows Stat-JR to be readily tailored to different research fields, for example with the use of discipline-specific terminology and techniques.
Stat-JR is distributed with MLwiN, and is therefore free to UK academics, and otherwise available for purchase; see our Ordering & Installing Stat-JR page for further details on how to order / upgrade.
Stat-JR's licence agreement (PDF, 29kB) details warranties, copyright, etc.
Stat-JR currently only runs under Windows OS but can be run from a Linux/Mac machine through a virtual machine or a Windows emulator. We are hoping to make it cross-platform in the future.
Users interact with both the TREE and DEEP interfaces via a browser (although, for most users, Stat-JR itself will be running locally on their own machine). Our most extensive tests have been with Chrome and Firefox, and (to a lesser extent) with Internet Explorer (version 9, or version 10 in Compatibility View). Please let us know via our Bug Report Form if you experience any particular difficulties.
The best place to start is by looking at the software manuals: e.g. try running some of the simpler templates presented in the TREE Beginner's Guide, which also includes a Quick-start section to help you become familiar with the basics of the TREE interface (the TREE interface itself also features tooltips: i.e. help and advice which appears in some regions as your cursor hovers over them).
Also, keep an eye out for Stat-JR workshops we will be running.
See here for details of all future (and past) workshops and courses using Stat-JR.
The development of Stat-JR was a collaborative project, funded by the ESRC, between the Universities of Bristol and Southampton.
Stat-JR has been designed to cater for a wide range of users: from the novice statistical practitioner to the statistical guru and algorithm developer.
For example, its TREE and DEEP interfaces are designed to be accessible, with the former a universal hub which can communicate with a wide range of other statistical software packages (and also Stat-JR's own statistical engines). Users therefore don't have to learn the nuts and bolts of each piece of software which may contain functions useful to them. Importantly, however, when Stat-JR fits a model (or plots a chart, produces a table, etc.) it returns a variety of resources which can be used as tools to help users learn about the statistical methods they are using, and their implementation in a number of different software packages.
The DEEP interface, on the other hand, allows users to read interactive eBooks which exploit Stat-JR's statistical functionality. This opens the door for eBook authors to create valuable teaching resources: eBooks which introduce statistical methodologies, and so on. It also allows for the creation of transparent reports, in which analyses are embedded, interactive and accessible.
Finally, if you are an algorithm developer and you wish to make your new model/algorithm available to applied statisticians, then why not implement it in Stat-JR by writing a template?
If you have used any aspects of Stat-JR in your work, please use the following citation (adjusting the version number accordingly):
Charlton, C.M.J.; Michaelides, D.T.; Parker, R.M.A.; Cameron, B.; Szmaragd, C.; Yang, H.; Zhang, Z.; Frazer, A.J.; Goldstein, H.; Jones, K.; Leckie, G.; Moreau, L.; Browne, W.J. (2013) Stat-JR version 1.0. Centre for Multilevel Modelling, University of Bristol & Electronics and Computer Science, University of Southampton.
You can find instructions on installation here.
Yes, Stat-JR needs to be installed by someone with administrator rights.
Since our recommended C++ compiler, MinGW, can have problems with paths which contain spaces, we strongly recommend that you do not install it under a user name which contains spaces (e.g. JaneSmith would be fine, Jane Smith would not be).
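As a quick check before installing, the sketch below tests whether a candidate path contains a space. The helper name `has_spaces` and the example paths are hypothetical; this is not something Stat-JR itself provides.

```python
def has_spaces(path: str) -> bool:
    """Return True if the given path contains at least one space character."""
    return " " in path


# Illustrative paths only; substitute the user folder on your own machine.
print(has_spaces(r"C:\Users\JaneSmith"))
print(has_spaces(r"C:\Users\Jane Smith"))
```

The first call corresponds to a user name that would be fine, the second to one that would be likely to cause problems for MinGW.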
Our in-built estimation engine requires a C++ compiler to be installed on your machine. We suggest you install MinGW. Full details of the download and installation instructions for this C++ compiler can be found on our Ordering & Installation page.
For Stat-JR to work, the only other software required is a C++ compiler (full installation instructions are available here).
Stat-JR works with datasets saved in Stata format, i.e. with a .dta extension. It looks for these in the ...\datasets folder of the Stat-JR install, and also in a folder saved, by default, under your user name, e.g. C:\Users\YourName\.statjr\datasets (you can change the path via Settings in the black bar at the top of the browser window in the TREE interface).
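To see which datasets are sitting in these folders, something like the following sketch could be used. It simply lists `.dta` files in a set of folders; the function name `find_dta_files` is hypothetical, and the per-user folder shown is only the default location described above, so adjust it if you have changed the path in Settings.

```python
from pathlib import Path


def find_dta_files(folders):
    """Return the sorted names of .dta files found directly inside the given folders."""
    names = set()
    for folder in folders:
        folder = Path(folder)
        if folder.is_dir():  # folders that do not exist are skipped
            names.update(p.name for p in folder.glob("*.dta"))
    return sorted(names)


if __name__ == "__main__":
    # Assumed default per-user location, following the example path in the text.
    user_datasets = Path.home() / ".statjr" / "datasets"
    print(find_dta_files([user_datasets]))
```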
If your dataset is already in .dta format (see below), then you can upload it, in TREE, via (i) Dataset > Upload (menu options in the black bar at the top of the browser window), which will upload it into the temporary memory cache, or by (ii) saving your dataset in the StatJR\datasets folder, and then selecting Debug > Reload datasets (again, accessible via the black bar at the top of the browser window). If, instead, you have it (iii) saved as a .txt file, you can use Stat-JR's LoadTextFile template to save it into the temporary memory cache (the template LoadTextFileMoreOptions allows the user to specify more particulars, and can also handle string variables).
In the case of options (i) and (iii) the dataset will be available for use in the current session, but you then need to download it (as a .dta file) via Dataset > Download (e.g. saving it into the StatJR\datasets folder) for use in future sessions too.
Stat-JR has a modular system of templates, each defining a certain function (or suite of functions). Users choose a template to use in conjunction with their dataset of interest. Some templates fit models, others plot charts, some produce data summaries, and so on.
One of the advantages of this system is that Stat-JR's functionality can be extended simply by adding additional template files. Templates are written in Python, and are (typically) saved in Stat-JR's templates subdirectory, with the extension .py. Stat-JR is distributed with a 'core' set of templates, but others will be made available on our Downloads page, and it is hoped that some users will write their own too.
The Advanced User guide (see the manuals page) provides some examples of how to write templates, with some explanation about how the Stat-JR system works.
Another useful resource, of course, is the set of templates provided with Stat-JR, the code in which can serve as an aid to writing novel templates (e.g. modifying existing templates to fit your specific needs).
Since Stat-JR has been developed using Python, knowledge of Python will help you in writing templates. If you're not familiar with Python, but you know precisely what you want to do from a statistical (including data manipulation) point of view, and you have a certain level of experience of computer programming (at least of macro languages), then the Advanced User guide, and the code in the templates distributed with Stat-JR, will be of considerable help.
Stat-JR has its own built-in MCMC engine, called eStat (not to be confused with the ESRC's e-Stat quantitative node via which the project was funded!)
Stat-JR has a built-in algebra system (main developer Bruce Cameron) which takes users' template-specific input, and generates the formulae necessary to fit the model requested. The files produced by the algebra system are used to generate C++ code which is run via the eSTAT MCMC engine to fit the model.
Not all templates will use the algebra system: just those which (a) use eSTAT, and (b) don't solely engage with eSTAT via the Custom C engine.
Those templates which do use the algebra system do so via the model attribute, which you'll see in the template code itself; this produces a file with code which shares much in common with that of BUGS.
The algebra system returns the full conditional posterior distribution for each parameter, either as a known distribution with formula for each parameter or as an unknown distribution function. These are returned in a LaTeX file called algorithm.tex, and also in a series of *.xml files, one for each parameter; it is these latter files which are used to generate the C++ code.
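As an illustration of what "a known distribution with formula for each parameter" means, consider a simple textbook case (not actual Stat-JR output): a Normal model with unknown mean and precision, a flat prior on the mean, and a Gamma prior (shape a, rate b) on the precision. The notation and the choice of priors here are assumptions made purely for the example. Both full conditionals are then standard distributions:

```latex
% Model (assumed for illustration): y_i ~ N(beta, 1/tau), i = 1..n
% Priors: p(beta) \propto 1, tau ~ Gamma(a, b) with shape a and rate b
\begin{align*}
\beta \mid y, \tau &\sim \mathrm{N}\!\left(\bar{y},\ \frac{1}{n\tau}\right),
  \qquad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i \\
\tau \mid y, \beta &\sim \mathrm{Gamma}\!\left(a + \frac{n}{2},\
  b + \frac{1}{2}\sum_{i=1}^{n} (y_i - \beta)^2\right)
\end{align*}
```

When a full conditional does not reduce to a recognisable form like these, it is the "unknown distribution function" case mentioned above.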
Whilst flexible, there are some modelling scenarios the algebra system cannot currently handle (e.g. a restriction to six responses); however, the eSTAT MCMC engine can still be used in such instances, via Stat-JR's Custom C++ engine.
Stat-JR has a built-in engine which generates customised C++ code; this can be used in instances where the user wishes to employ eSTAT to fit a model which is not currently supported by Stat-JR's algebra system.
Templates supporting this feature have CustomC in their list of engines, and contain a function called customccode which includes all the code to fit the model. The engine effectively allows the user to work out all the steps for their model algorithm themselves and write these in one Python function that generates C code.
The customcode function writes out C code for the update step of the MCMC algorithm: i.e. code to perform one iteration of the MCMC algorithm. Steps for adapting (if required) are done automatically by the system and so are not included as part of the template's customcode function. C code cannot be written out directly, as it needs to be linked to inputs the user has given. At present this is done throughout the code whereas a C programmer approaching this for the first time might prefer to perform the linking to inputs initially, and then have more generic code; we will investigate this in a later Stat-JR release.
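To illustrate why the C code has to be generated rather than written out directly, here is a minimal sketch of a Python function that builds the code for one update step from the user's inputs. Everything in it is hypothetical: the function name `generate_update_code`, the input keys `y` and `x`, and the C helper names `update_beta` and `update_tau` are assumptions for the example, not Stat-JR's actual template API.

```python
def generate_update_code(inputs):
    """Build C code (as a string) for one MCMC iteration.

    `inputs` is a hypothetical dict of user choices:
    {'y': response variable name, 'x': list of predictor variable names}.
    """
    y = inputs["y"]
    lines = ["// one iteration: update each regression coefficient in turn"]
    for i, x in enumerate(inputs["x"]):
        # Each variable name chosen by the user is woven into the generated C code.
        # update_beta is a made-up C helper, used only for illustration.
        lines.append(f"beta{i} = update_beta({y}, {x}, beta{i}, tau);")
    # update_tau is likewise a made-up C helper.
    lines.append(f"tau = update_tau({y}, length({y}));")
    return "\n".join(lines)


code = generate_update_code({"y": "resp", "x": ["x1", "x2"]})
print(code)
```

Because the number of predictors and their names are only known once the user has supplied inputs, the linking happens line by line as the code is assembled, which mirrors the "done throughout the code" approach described above.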
In Stat-JR's packages directory there is a file, CustomC.py, that performs the operations required when the CustomC engine is chosen. This file borrows considerably from eStat.py (also in the packages directory) as, aside from not calling the algebra system, the two engines are fairly similar.
In TREE's Settings menu there are options to Generate standalone code (under both CustomC and eStat). If you tick this option, Stat-JR creates a standalone C++ file, engine.cpp, which is used to fit the model; since Stat-JR includes parallel processing, only one standalone file is created.
Not ticking this option (which is the default) results in the relevant C++ code necessary to fit the model being contained in a number of *.cpp files, including modelcode.cpp and a range of files containing supporting routines. Each of these pieces of C++ code is compiled separately, and Python code within Stat-JR pieces everything together.
Stat-JR can interoperate with a wide range of other software packages (see our Additional Software Packages page for further details). This relies upon template writers providing interoperability for their templates, and so there is not universal coverage for all packages (and also, of course, many templates will perform functions which aren't supported by certain statistical software packages anyway).
To interoperate with other software via Stat-JR you need to (a) have access to the third-party software package of interest and (b) inform Stat-JR of the path to this software. You can update the paths to all software packages via the Settings menu (in the black bar at the top) in the TREE interface, or by directly editing the settings.cfg file in the .statjr directory saved under your username.
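Before entering paths into Settings, it can help to confirm that each one actually points at a file. The sketch below does this for a dictionary of package names and paths. The function name `missing_packages`, the package names and the paths are all placeholders, and nothing is assumed here about how settings.cfg itself stores these values.

```python
import os


def missing_packages(package_paths):
    """Return the sorted names of packages whose configured path is not an existing file."""
    return sorted(
        name for name, path in package_paths.items()
        if not os.path.isfile(path)
    )


if __name__ == "__main__":
    # Placeholder names and paths; replace with the software installed on your machine.
    candidates = {
        "PackageA": r"C:\Program Files\PackageA\packagea.exe",
        "PackageB": r"C:\Program Files\PackageB\packageb.exe",
    }
    print(missing_packages(candidates))
```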
You can also interact with Stat-JR remotely by using the command line environment.
Stat-JR currently supports the following software packages:
Stat-JR has not found any suitable statistical software package to run the selected template. The list and locations of the statistical software packages implemented in Stat-JR can be found in the settings.cfg file in the .statjr directory under your username. You can either edit this file directly (e.g. via a text editor) or via Stat-JR's Settings, accessible via the black bar at the top of the TREE interface. Further details are available on our Additional Software Packages page.
TREE (which stands for Template Reading and Execution Environment) is a menu-driven point-and-click interface. You interact with it via a browser, and it is a flexible environment in which you pair-up templates with datasets, and then specify inputs to perform desired functions. Depending on the template chosen, outputs include results tables, graphs, equations, scripts, macros, MCMC chains, and so on. TREE contains tooltips (contextual help and advice which appears when your cursor hovers over certain areas), and also has a supporting Beginner's Guide, which includes a Quick-start guide.
There is a Beginner's Guide to TREE available via our Stat-JR manuals page, which includes a Quick-start guide as well. Otherwise, we will highlight relevant workshops as and when they become available.
Stat-JR's DEEP (which stands for Documents with Embedded Execution and Provenance) interface is a dynamic eBook-reading environment which users interact with via a browser. It can fully-exploit Stat-JR's statistical functionality, but the eBook environment also allows a lot of contextual information (compared to the TREE interface) to be provided by the eBook author: e.g. guiding a user through an analysis, describing a particular set of models, illustrating how to operate third-party statistical software, and so on; it is a less flexible environment than the TREE interface, however.
Stat-JR's command line interface, which is still under development, will, in the future, enable Stat-JR to be run from other software.