e-Stat development areas

We aim to produce software tools to cater for three types of user: novice practitioners, advanced practitioners and statistical algorithm developers. We will focus on four areas of development:

Area 1: pre-analysis tasks
Area 2: analysing complex models on large datasets
Area 3: interoperability between software packages
Area 4: learning, sharing and collaborating

Area 1: pre-analysis tasks

Users need tools to help them locate, combine and manipulate multiple data sources that are available on the web. Such tools will allow users to create datasets that are best suited to their substantive research questions.
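As a rough illustration of the "combine and manipulate" step, the sketch below joins two small tables on a shared key and derives a new variable from columns that originate in different sources. It is not e-Stat code: the inline CSV data, the column names and the helper functions `read_keyed` and `inner_join` are all hypothetical, and only the Python standard library is assumed.

```python
import csv
import io

# Hypothetical stand-ins for two web-hosted sources; in practice these
# would be downloaded files rather than inline strings.
SURVEY_CSV = """region,respondents,mean_income
N01,120,21500
N02,95,24800
N03,143,19900
"""

CENSUS_CSV = """region,population
N01,52000
N02,31000
N04,12000
"""


def read_keyed(text, key):
    """Parse CSV text into a dict mapping the key column to each row."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(text))}


def inner_join(left, right):
    """Keep only keys present in both sources and merge their columns."""
    return [
        {**left[k], **right[k]}
        for k in sorted(left.keys() & right.keys())
    ]


survey = read_keyed(SURVEY_CSV, "region")
census = read_keyed(CENSUS_CSV, "region")
combined = inner_join(survey, census)

# Derive a new variable from columns that came from different sources.
for row in combined:
    row["sampling_fraction"] = int(row["respondents"]) / int(row["population"])

for row in combined:
    print(row["region"], row["sampling_fraction"])
```

Only regions present in both sources survive the join, so a real tool would also need to report the rows that were dropped (here N03 and N04).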

Area 2: analysing complex models on large datasets

Many problems in quantitative social science require fitting complex models to large datasets and are not tractable for researchers given the computing resources they have available. We propose a variety of developments, aimed at our third category of user (algorithm developers), that will allow the creation of computationally efficient algorithms for models of 'arbitrary' complexity.

Area 3: interoperability between software packages

We will provide tools that allow statistical modelling packages used by social scientists and econometricians to exchange statistical models. This will allow users of different software packages to collaborate, iteratively exchanging models and working on them without having to learn the syntax of each other's favoured package. We will also provide a single, consistent, user-friendly interface to our own estimation algorithms and those in other packages.
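One way to picture model exchange is a package-neutral description of a model that is rendered into each package's own syntax on demand. The sketch below is an assumption made for illustration, not the e-Stat design: the `ModelSpec` class and both rendering functions are hypothetical, and R's lme4 formula syntax and Stata's `mixed` command are chosen here only as familiar targets (the text above does not name specific packages).

```python
from dataclasses import dataclass, field


@dataclass
class ModelSpec:
    """Hypothetical package-neutral description of a random-intercept model."""
    response: str
    predictors: list = field(default_factory=list)
    group: str = ""


def to_lme4(spec):
    """Render the spec as an R lme4-style formula string."""
    fixed = " + ".join(spec.predictors) or "1"
    return f"{spec.response} ~ {fixed} + (1 | {spec.group})"


def to_stata(spec):
    """Render the spec as a Stata mixed-style command string."""
    fixed = " ".join([spec.response] + spec.predictors)
    return f"mixed {fixed} || {spec.group}:"


spec = ModelSpec(response="score", predictors=["age", "sex"], group="school")
print(to_lme4(spec))   # score ~ age + sex + (1 | school)
print(to_stata(spec))  # mixed score age sex || school:
```

Because each collaborator works from the same neutral description, adding a predictor in one place changes the generated syntax for every package, which is the property the exchange tools are meant to provide.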

Area 4: learning, sharing and collaborating

We will create tools for packaging the artifacts of the research process (data, tables, models, graphs, scripts) into executable notebooks. We will also create social networking tools for sharing these notebooks, together with the results and techniques of the research process that they contain. These tools will have sophisticated tagging and search facilities to allow users to find the content they need.