Statistical computing research
During the week, I received final confirmation that the special issue of Statistical Science that Vince Carey and I put together is finally published. There are four papers from leaders in the field of statistical computing research: John Chambers, Duncan Temple Lang, Michael Lawrence and Martin Morgan (newly minted members of R Core) and Yihui Xie, Heike Hofmann and Xiaoyue Cheng. The links to the overview and the four papers are below.
- Four Papers on Contemporary Software Design Strategies for Statistical Methodologists, Vincent Carey and Dianne Cook
- Object-Oriented Programming, Functional Programming and R, John Chambers
- Scalable Genomics with R and Bioconductor, Michael Lawrence and Martin Morgan
- Enhancing R with Advanced Compilation Tools and Methods, Duncan Temple Lang
- Reactive Programming for Interactive Graphics, Yihui Xie, Heike Hofmann and Xiaoyue Cheng
These four papers document statistical computing research, and hopefully inspire talented young researchers to tackle ideas in these areas.
“Statistical computing” is different to “computational statistics”. Computational statistics research develops algorithms for statistical analysis, e.g. projection pursuit, importance sampling, maximum likelihood. Statistical computing research explores the availability of new technology for statistical purposes. There has been a long history of statistical computing research dating back, to my knowledge, to at least the 1960s. When computing languages were first being developed, John Chambers, Rick Becker and Allan Wilks started thinking about how a language might support data analysis, and thus was the birth of the S language, which is used today in R. Statistical computing researchers have brought us systems such as S, Splus, R, SAS, SPSS, Systat, XLispStat, XploRe, DataDesk, GENSTAT, Minitab, Antelope, Dataviewer, QUAIL, XGobi, GGobi, Mondrian, Orca. I have missed a lot, I am sure, so if you know of more, please add them in comments at the end of the blog.
More than a decade ago, I documented a discussion between four attendees at a workshop on the future of statistical computing in the newsletter of the ASA Sections of Statistical Computing and Graphics, which can be found here. (BTW, if you browse these newsletters, look for some real gems in the short articles, particularly the many by Dan Carr on elegant data visualizations. This newsletter has largely been supplanted by the R Journal today.) There was concern about the future of statistical computing research in the face of declining support for these areas in industry research labs. Substantial accomplishments have historically come from places like Bell Labs. The big question was whether the research support could make the leap from these sources into academia.
Although it is still not easy, to a large extent statistical computing research has made the leap. The authors of three of these four papers are in academic positions. In numerous departments around the world, statistical computing research is thriving. Some examples are University of Auckland, University of Waterloo, Augsburg University, Harvard Biostats (please add more in the comments!). From personal experience, we have had steady support from the administration at Iowa State University in promoting statistical computing research, especially focusing on graphics. In addition, new homes in industry have emerged, like RStudio, which is a hot-bed of open-source development, and Genentech and FHCRC have brought us Bioconductor.
With the growing importance of data science for tackling big data problems, statistical computing is a critical area for statistics research.