Generations

  1. The first generation technology was a character-based MS-DOS application known as Arcus Pro-Stat. Developed in the late 1980s and early 1990s, this software was built with Microsoft FORTRAN, Assembler and BASIC Professional Development System tools. It focused on statistical functions commonly used in medical research, making them easier to use via an intuitive interface. Data were entered via a spreadsheet-like interface and analyses were reported in text that the user could edit, with hidden links to context-sensitive help in the output. The editor was later made available as a help compiler for authoring clinical guidelines, known as Path.Finder.
  2. The second generation technology, first known as Arcus Quickstat, translated the DOS application into a graphical program for Microsoft Windows. It built on the popular spreadsheet/word-processor paradigm for editing data and reports of analyses. The software was developed in Microsoft Visual Basic, Microsoft Assembler and Digital (later Compaq) FORTRAN. The FORTRAN was refactored as FORTRAN 90. After the number of available statistical functions was doubled and the help content expanded, the software was renamed StatsDirect in 2000.
  3. The third generation technology was introduced in August 2013 for a year of beta testing. This marked a major shift in the underlying technology: all StatsDirect 2 code was translated into managed C#, so that the code could be both portable and optimised at install time for users' environments. More demanding calculations and simulations were added, taking advantage of the loop and memory optimisation features of the .Net platform, especially when running on 64-bit Windows. StatsDirect 3 maintains the popular spreadsheet/word-processor/menu-driven/embedded-help interface. The charting interface was changed to give users more control over elements of charts. The computational engine was separated from user-interaction, reporting and help, enabling new functions to be written by external developers in R or any of the languages that can be compiled in real time under .Net.
  4. The fourth generation technology was introduced in April 2024 as a consolidation of the third generation of StatsDirect for the mature .Net (version 6.1 and above) platform. It adopts .Net Core in readiness for a future cross-platform application. Version 4 also marks the move to a free, open-source resource.
  5. The fifth generation technology is being developed to blend desktop software with AI 'conversations' that generate statistical scripts (see slides).

StatsDirect 4 Design Principles

  1. User-interaction must anticipate common statistical misconceptions, challenging them through: interaction dialogs; the format of results reported; and help materials linked to contexts of menus, dialogs and reports.
  2. User-interaction, data management, statistical calculation, reporting and help technologies must be separated in code so that different components can be substituted in future/derived applications, for example a cross-platform browser-based reporting engine with SVG charting instead of Windows-specific rich text and metafile based reports.
  3. All in-built numerical algorithms must allow compilers to manage memory and loop optimisation so that advances in compiler technologies feed through into performance gains.
  4. Power-users must be able to add their own statistical functions to the basic set distributed, such that statistical gurus can broadcast new/adapted methods to their disciplines in ways that are easy to transfer via social media and email.
  5. All functions must be described in an XML file that controls user-interaction, calculation, reporting and help pointers.
  6. Statistical algorithms are called by the controlling XML from either: a library of methods in the core application; or code embedded in the XML, written in a language that can be compiled in real time under .Net or as control syntax for an external statistical engine.
  7. StatsDirect must be able to create and run R scripts, for both users who wish to interact with the R graphical user interface and for those who don't.
  8. The StatisticalHelp learning materials (and StatsDirect's help system) will be developed (in open source) to harness and test generative AIs' 'Conversational Data Analytic' abilities, e.g. in generating R scripts for causal inference given a set of data and 'conversations' about causality.
  9. StatsDirect application source code is available to the community as shared source now and will soon be open source.
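
Principles 5 to 7 can be illustrated with a minimal sketch. It is written in Python for brevity and is not StatsDirect's C# code. The XML element and attribute names, the LIBRARY registry, and the functions mean_sd, build_r_script and run are all hypothetical, invented here only to show how one XML descriptor might drive calculation, reporting and help pointers, and how the same calculation might be emitted as an R script.

```python
import statistics
import xml.etree.ElementTree as ET

# Hypothetical descriptor: every element and attribute name below is an
# assumption made for illustration, not the real StatsDirect XML schema.
DESCRIPTOR = """
<function id="mean_sd" label="Mean and standard deviation">
  <input name="values" prompt="Select a column of numbers"/>
  <calculation source="library" method="mean_sd"/>
  <report template="n = {n}, mean = {mean:.2f}, SD = {sd:.2f}"/>
  <help topic="descriptive/mean_sd"/>
</function>
"""


def mean_sd(values):
    """Library method: sample size, mean and sample standard deviation."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
    }


# Registry standing in for "a library of methods in the core application".
LIBRARY = {"mean_sd": mean_sd}


def build_r_script(values):
    """Return R source for the same calculation; the script is not executed here."""
    vector = ", ".join(repr(float(v)) for v in values)
    return f"x <- c({vector})\ncat(length(x), mean(x), sd(x))\n"


def run(descriptor_xml, values):
    """Read the descriptor, dispatch the calculation, and format the report."""
    root = ET.fromstring(descriptor_xml.strip())
    calc = root.find("calculation")
    template = root.find("report").get("template")
    help_topic = root.find("help").get("topic")

    if calc.get("source") != "library":
        raise ValueError("this sketch only handles library methods")

    result = LIBRARY[calc.get("method")](values)
    return template.format(**result), help_topic


if __name__ == "__main__":
    data = [2, 4, 4, 4, 5, 5, 7, 9]
    report, topic = run(DESCRIPTOR, data)
    print(report)
    print("help topic:", topic)
    print(build_r_script(data))
```

Because the descriptor names the method, the report template and the help topic, a different reporting engine or calculation back end could in principle be substituted without touching the other parts, which is the separation that principle 2 asks for.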