SlideShare a Scribd company logo
1 of 102
Download to read offline
The                                                                                Journal                 Volume 2/2, December 2010

     A peer-reviewed, open-access publication of the R Foundation
                       for Statistical Computing




Contents
Editorial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .                                                                       3

Contributed Research Articles
Solving Differential Equations in R . . . . . . . . . . . . . . . . . . . . . . . . . .                                                        .   .   .   .   .   .   .   .   .    5
Source References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .                                                      .   .   .   .   .   .   .   .   .   16
hglm: A Package for Fitting Hierarchical Generalized Linear Models . . . . .                                                                   .   .   .   .   .   .   .   .   .   20
dclone: Data Cloning in R . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .                                                          .   .   .   .   .   .   .   .   .   29
stringr: modern, consistent string processing . . . . . . . . . . . . . . . . . . .                                                            .   .   .   .   .   .   .   .   .   38
Bayesian Estimation of the GARCH(1,1) Model with Student-t Innovations . .                                                                     .   .   .   .   .   .   .   .   .   41
cudaBayesreg: Bayesian Computation in CUDA . . . . . . . . . . . . . . . . .                                                                   .   .   .   .   .   .   .   .   .   48
binGroup: A Package for Group Testing . . . . . . . . . . . . . . . . . . . . . .                                                              .   .   .   .   .   .   .   .   .   56
The RecordLinkage Package: Detecting Errors in Data . . . . . . . . . . . . . .                                                                .   .   .   .   .   .   .   .   .   61
spikeslab: Prediction and Variable Selection Using Spike and Slab Regression                                                                   .   .   .   .   .   .   .   .   .   68

From the Core
What’s New? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .                                                                          74

News and Notes
useR! 2010 . . . . . . . . . . . . . . .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . 77
Forthcoming Events: useR! 2011 . .         .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . 79
Changes in R . . . . . . . . . . . . .     .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . 81
Changes on CRAN . . . . . . . . . .        .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . 90
News from the Bioconductor Project         .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . 101
R Foundation News . . . . . . . . .        .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . 102
2




                    The          Journal is a peer-reviewed publication of the R
                    Foundation for Statistical Computing. Communications regarding
                    this publication should be addressed to the editors. All articles are
                    copyrighted by the respective authors.

                    Prospective authors will find detailed and up-to-date submission
                    instructions on the Journal’s homepage.


                                             Editor-in-Chief:
                                              Peter Dalgaard
                                            Center for Statistics
                                        Copenhagen Business School
                                             Solbjerg Plads 3
                                            2000 Frederiksberg
                                                Denmark

                                            Editorial Board:
                           Vince Carey, Martyn Plummer, and Heather Turner.

                                        Editor Programmer’s Niche:
                                                Bill Venables

                                            Editor Help Desk:
                                               Uwe Ligges

                                        Editor Book Reviews:
                                             G. Jay Kerns
                                Department of Mathematics and Statistics
                                     Youngstown State University
                                     Youngstown, Ohio 44555-0002
                                                 USA
                                           gkerns@ysu.edu

                                         R Journal Homepage:
                                    http://journal.r-project.org/

                                  Email of editors and editorial board:
                                 firstname.lastname@R-project.org

                         The R Journal is indexed/abstracted by EBSCO, DOAJ.




The R Journal Vol. 2/2, December 2010                                                       ISSN 2073-4859
3



Editorial
by Peter Dalgaard                                          putationally demanding. Especially for transient so-
                                                           lutions in two or three spatial dimensions, comput-
Welcome to the 2nd issue of the 2nd volume of The          ers simply were not fast enough in a time where nu-
R Journal.                                                 merical performance was measured in fractions of a
    I am pleased to say that we can offer ten peer-        MFLOPS (million floating point operations per sec-
reviewed papers this time. Many thanks go to the           ond). Today, the relevant measure is GFLOPS and
authors and the reviewers who ensure that our arti-        we should be getting much closer to practicable so-
cles live up to high academic standards. The tran-         lutions.
sition from R News to The R Journal is now nearly              However, raw computing power is not sufficient;
completed. We are now listed by EBSCO and the              there are non-obvious aspects of numerical analysis
registration procedure with Thomson Reuters is well        that should not be taken lightly, notably issues of sta-
on the way. We thereby move into the framework of          bility and accuracy. There is a reason that numerical
scientific journals and away from the grey-literature       analysis is a scientific field in its own right.
newsletter format; however, it should be stressed              From a statistician’s perspective, being able to fit
that R News was a fairly high-impact piece of grey         models to actual data is of prime importance. For
literature: A cited reference search turned up around      models with only a few parameters, you can get
1300 references to the just over 200 papers that were      quite far with nonlinear regression and a good nu-
published in R News!                                       merical solver. For ill-posed problems with func-
    I am particularly happy to see the paper by            tional parameters (the so-called “inverse problems”),
Soetart et al. on differential equation solvers. In        and for stochastic differential equations, there still
many fields of research, the natural formulation of         appears to be work to be done. Soetart et al. do not
models is via local relations at the infinitesimal level,   go into these issues, but I hope that their paper will
rather than via closed form mathematical expres-           be an inspiration for further work.
sions, and quite often solutions rely on simplifying           With this issue, in accordance with the rotation
assumptions. My own PhD work, some 25 years ago,           rules of the Editorial Board, I step down as Editor-
concerned diffusion of substances within the human         in-Chief, to be succeded by Heather Turner. Heather
eye, with the ultimate goal of measuring the state         has already played a key role in the transition from
of the blood-retinal barrier. Solutions for this prob-     R News to The R Journal, as well as being probably
lem could be obtained for short timespans, if one as-      the most efficient Associate Editor on the Board. The
sumed that the eye was completely spherical. Ex-           Editorial Board will be losing last year’s Editor-in-
tending the solutions to accommodate more realis-          Chief, Vince Carey, who has now been on board for
tic models (a necessity for fitting actual experimental     the full four years. We shall miss Vince, who has al-
data) resulted in quite unwieldy formulas, and even        ways been good for a precise and principled argu-
then, did not give you the kind of modelling freedom       ment and in the process taught at least me several
that you really wanted to elucidate the scientific is-      new words. We also welcome Hadley Wickham as
sue.                                                       a new Associate Editor and member of the Editorial
    In contrast, numerical procedures could fairly         Board.
easily be set up and modified to better fit reality.             Season’s greetings and best wishes for a happy
The main problem was that they tended to be com-           2011!




The R Journal Vol. 2/2, December 2010                                                             ISSN 2073-4859
4




The R Journal Vol. 2/2, December 2010   ISSN 2073-4859
C ONTRIBUTED R ESEARCH A RTICLES                                                                                                           5



Solving Differential Equations in R
by Karline Soetaert, Thomas Petzoldt and R. Woodrow                      complete repertoire of differential equations can be
Setzer1                                                                  numerically solved.
                                                                             More specifically, the following types of differen-
  Abstract Although R is still predominantly ap-                         tial equations can now be handled with add-on pack-
  plied for statistical analysis and graphical repre-                    ages in R:
  sentation, it is rapidly becoming more suitable
  for mathematical computing. One of the fields                               • Initial value problems (IVP) of ordinary differ-
  where considerable progress has been made re-                                ential equations (ODE), using package deSolve
  cently is the solution of differential equations.                            (Soetaert et al., 2010b).
  Here we give a brief overview of differential                              • Initial value differential algebraic equations
  equations that can now be solved by R.                                       (DAE), package deSolve .
                                                                             • Initial value partial differential equations
                                                                               (PDE), packages deSolve and ReacTran
Introduction                                                                   (Soetaert and Meysman, 2010).
Differential equations describe exchanges of matter,                         • Boundary value problems (BVP) of ordinary
energy, information or any other quantities, often as                          differential equations, using package bvpSolve
they vary in time and/or space. Their thorough ana-                            (Soetaert et al., 2010a), or ReacTran and root-
lytical treatment forms the basis of fundamental the-                          Solve (Soetaert, 2009).
ories in mathematics and physics, and they are in-
                                                                             • Initial value delay differential equations
creasingly applied in chemistry, life sciences and eco-
                                                                               (DDE), using packages deSolve or PBSddes-
nomics.
                                                                               olve (Couture-Beil et al., 2010).
    Differential equations are solved by integration,
but unfortunately, for many practical applications                           • Stochastic differential equations (SDE), using
in science and engineering, systems of differential                            packages sde (Iacus, 2008) and pomp (King
equations cannot be integrated to give an analytical                           et al., 2008).
solution, but rather need to be solved numerically.
                                                                             In this short overview, we demonstrate how to
    Many advanced numerical algorithms that solve
                                                                         solve the first four types of differential equations
differential equations are available as (open-source)
                                                                         in R. It is beyond the scope to give an exhaustive
computer codes, written in programming languages
                                                                         overview about the vast number of methods to solve
like FORTRAN or C and that are available from
                                                                         these differential equations and their theory, so the
repositories like GAMS (http://gams.nist.gov/) or
                                                                         reader is encouraged to consult one of the numer-
NETLIB (www.netlib.org).
                                                                         ous textbooks (e.g., Ascher and Petzold, 1998; Press
    Depending on the problem, mathematical for-                          et al., 2007; Hairer et al., 2009; Hairer and Wanner,
malisations may consist of ordinary differential                         2010; LeVeque, 2007, and many others).
equations (ODE), partial differential equations                              In addition, a large number of analytical and nu-
(PDE), differential algebraic equations (DAE), or de-                    merical methods exists for the analysis of bifurca-
lay differential equations (DDE). In addition, a dis-                    tions and stability properties of deterministic sys-
tinction is made between initial value problems (IVP)                    tems, the efficient simulation of stochastic differen-
and boundary value problems (BVP).                                       tial equations or the estimation of parameters. We
    With the introduction of R-package odesolve                          do not deal with these methods here.
(Setzer, 2001), it became possible to use R (R Devel-
opment Core Team, 2009) for solving very simple ini-
tial value problems of systems of ordinary differen-                     Types of differential equations
tial equations, using the lsoda algorithm of Hind-
marsh (1983) and Petzold (1983). However, many                           Ordinary differential equations
real-life applications, including physical transport
modeling, equilibrium chemistry or the modeling of                       Ordinary differential equations describe the change
electrical circuits, could not be solved with this pack-                 of a state variable y as a function f of one independent
age.                                                                     variable t (e.g., time or space), of y itself, and, option-
    Since odesolve, much effort has been made to                         ally, a set of other variables p, often called parameters:
improve R’s capabilities to handle differential equa-
tions, mostly by incorporating published and well                                                    dy
                                                                                               y =      = f (t, y, p)
tested numerical codes, such that now a much more                                                    dt
   1 The views expressed in this paper are those of the authors and do not necessarily reflect the views or policies of the U.S. Environmental

Protection Agency


The R Journal Vol. 2/2, December 2010                                                                                    ISSN 2073-4859
6                                                                            C ONTRIBUTED R ESEARCH A RTICLES


    In many cases, solving differential equations re-       rithm (Press et al., 2007). Two more stable solu-
quires the introduction of extra conditions. In the fol-    tion methods implement a mono implicit Runge-
lowing, we concentrate on the numerical treatment           Kutta (MIRK) code, based on the FORTRAN code
of two classes of problems, namely initial value prob-      twpbvpC (Cash and Mazzia, 2005), and the collocation
lems and boundary value problems.                           method, based on the FORTRAN code colnew (Bader
                                                            and Ascher, 1987). Some boundary value problems
                                                            can also be solved with functions from packages Re-
Initial value problems
                                                            acTran and rootSolve (see below).
If the extra conditions are specified at the initial value
of the independent variable, the differential equa-         Partial differential equations
tions are called initial value problems (IVP).
     There exist two main classes of algorithms to nu-      In contrast to ODEs where there is only one indepen-
merically solve such problems, so-called Runge-Kutta        dent variable, partial differential equations (PDE)
formulas and linear multistep formulas (Hairer et al.,      contain partial derivatives with respect to more than
2009; Hairer and Wanner, 2010). The latter contains         one independent variable, for instance t (time) and
two important families, the Adams family and the            x (a spatial dimension). To distinguish this type
backward differentiation formulae (BDF).                    of equations from ODEs, the derivatives are repre-
     Another important distinction is between explicit      sented with the ∂ symbol, e.g.
and implicit methods, where the latter methods can                            ∂y              ∂y
solve a particular class of equations (so-called “stiff”                         = f (t, x, y, , p)
                                                                              ∂t              ∂x
equations) where explicit methods have problems
with stability and efficiency. Stiffness occurs for in-      Partial differential equations can be solved by sub-
stance if a problem has components with different           dividing one or more of the continuous independent
rates of variation according to the independent vari-       variables in a number of grid cells, and replacing the
able. Very often there will be a tradeoff between us-       derivatives by discrete, algebraic approximate equa-
ing explicit methods that require little work per inte-     tions, so-called finite differences (cf. LeVeque, 2007;
gration step and implicit methods which are able to         Hundsdorfer and Verwer, 2003).
take larger integration steps, but need (much) more             For time-varying cases, it is customary to discre-
work for one step.                                          tise the spatial coordinate(s) only, while time is left in
     In R, initial value problems can be solved with        continuous form. This is called the method-of-lines,
functions from package deSolve (Soetaert et al.,            and in this way, one PDE is translated into a large
2010b), which implements many solvers from ODE-             number of coupled ordinary differential equations,
PACK (Hindmarsh, 1983), the code vode (Brown                that can be solved with the usual initial value prob-
et al., 1989), the differential algebraic equation solver   lem solvers (cf. Hamdi et al., 2007). This applies to
daspk (Brenan et al., 1996), all belonging to the linear    parabolic PDEs such as the heat equation, and to hy-
multistep methods, and comprising Adams meth-               perbolic PDEs such as the wave equation.
ods as well as backward differentiation formulae.               For time-invariant problems, usually all indepen-
The former methods are explicit, the latter implicit.       dent variables are discretised, and the derivatives ap-
In addition, this package contains a de-novo imple-         proximated by algebraic equations, which are solved
mentation of a rather general Runge-Kutta solver            by root-finding techniques. This technique applies to
based on Dormand and Prince (1980); Prince and              elliptic PDEs.
Dormand (1981); Bogacki and Shampine (1989); Cash               R-package ReacTran provides functions to gener-
and Karp (1990) and using ideas from Butcher (1987)         ate finite differences on a structured grid. After that,
and Press et al. (2007). Finally, the implicit Runge-       the resulting time-varying cases can be solved with
Kutta method radau (Hairer et al., 2009) has been           specially-designed functions from package deSolve,
added recently.                                             while time-invariant cases can be solved with root-
                                                            solving methods from package rootSolve .

Boundary value problems
                                                            Differential algebraic equations
If the extra conditions are specified at different
                                                            Differential-algebraic equations (DAE) contain a
values of the independent variable, the differen-
                                                            mixture of differential ( f ) and algebraic equations
tial equations are called boundary value problems
                                                            (g), the latter e.g. for maintaining mass-balance con-
(BVP). A standard textbook on this subject is Ascher
                                                            ditions:
et al. (1995).
    Package bvpSolve (Soetaert et al., 2010a) imple-                             y = f (t, y, p)
ments three methods to solve boundary value prob-                                 0 = g(t, y, p)
lems. The simplest solution method is the single
shooting method, which combines initial value prob-         Important for the solution of a DAE is its index.
lem integration with a nonlinear root finding algo-          The index of a DAE is the number of differentiations


The R Journal Vol. 2/2, December 2010                                                                 ISSN 2073-4859
C ONTRIBUTED R ESEARCH A RTICLES                                                                                  7


needed until a system consisting only of ODEs is ob-        Numerical accuracy
tained.
    Function daspk (Brenan et al., 1996) from pack-         Numerical solution of a system of differential equa-
age deSolve solves (relatively simple) DAEs of index        tions is an approximation and therefore prone to nu-
at most 1, while function radau (Hairer et al., 2009)       merical errors, originating from several sources:
solves DAEs of index up to 3.
                                                               1. time step and accuracy order of the solver,
                                                               2. floating point arithmetics,
Implementation details                                         3. properties of the differential system and stabil-
                                                                  ity of the solution algorithm.
The implemented solver functions are explained by
means of the ode-function, used for the solution of              For methods with automatic stepsize selection,
initial value problems. The interfaces to the other         accuracy of the computation can be adjusted us-
solvers have an analogous definition:                        ing the non-negative arguments atol (absolute tol-
                                                            erance) and rtol (relative tolerance), which control
 ode(y, times, func, parms, method = c("lsoda",             the local errors of the integration.
     "lsode", "lsodes", "lsodar",                                Like R itself, all solvers use double-precision
     "vode", "daspk", "euler", "rk4",                       floating-point arithmetics according to IEEE Stan-
     "ode23", "ode45", "radau", "bdf",                      dard 754 (2008), which means that it can represent
     "bdf_d", "adams", "impAdams",                          numbers between approx. ±2.25 10−308 to approx.
     "impAdams_d"), ...)                                    ±1.8 10308 and with 16 significant digits. It is there-
                                                            fore not advisable to set rtol below 10−16 , except set-
    To use this, the system of differential equations       ting it to zero with the intention to use absolute tol-
can be defined as an R-function (func) that computes         erance exclusively.
derivatives in the ODE system (the model definition)              The solvers provided by the packages presented
according to the independent variable (e.g. time t).        below have proven to be quite robust in most prac-
func can also be a function in a dynamically loaded         tical cases, however users should always be aware
shared library (Soetaert et al., 2010c) and, in addition,   about the problems and limitations of numerical
some solvers support also the supply of an analyti-         methods and carefully check results for plausibil-
cally derived function of partial derivatives (Jacobian     ity. The section “Troubleshooting” in the package vi-
matrix).                                                    gnette (Soetaert et al., 2010d) should be consulted as
    If func is an R-function, it must be defined as:         a first source for solving typical problems.
    func <- function(t, y, parms, ...)
where t is the actual value of the independent vari-        Examples
able (e.g. the current time point in the integration),
y is the current estimate of the variables in the ODE
                                                            An initial value ODE
system, parms is the parameter vector and ... can be
used to pass additional arguments to the function.          Consider the famous van der Pol equation (van der
    The return value of func should be a list, whose        Pol and van der Mark, 1927), that describes a non-
first element is a vector containing the derivatives         conservative oscillator with non-linear damping and
of y with respect to t, and whose next elements are         which was originally developed for electrical cir-
optional global values that can be recorded at each         cuits employing vacuum tubes. The oscillation is de-
point in times. The derivatives must be specified in         scribed by means of a 2nd order ODE:
the same order as the state variables y.
    Depending on the algorithm specified in argu-                           z − µ (1 − z2 ) z + z = 0
ment method, numerical simulation proceeds either           Such a system can be routinely rewritten as a system
exactly at the time steps specified in times, or us-         of two 1st order ODEs, if we substitute z with y1 and
ing time steps that are independent from times and          z with y2 :
where the output is generated by interpolation. With
the exception of method euler and several fixed-step                       y1 = y2
Runge-Kutta methods all algorithms have automatic
                                                                          y2 = µ · (1 − y1 2 ) · y2 − y1
time stepping, which can be controlled by setting ac-
curacy requirements (see below) or by using optional           There is one parameter, µ, and two differential
arguments like hini (initial time step), hmin (minimal      variables, y1 and y2 with initial values (at t = 0):
time step) and hmax (maximum time step). Specific
details, e.g. about the applied interpolation methods                               y 1( t =0) = 2
can be found in the manual pages and the original
                                                                                    y 2( t =0) = 0
literature cited there.


The R Journal Vol. 2/2, December 2010                                                                ISSN 2073-4859
8                                                                              C ONTRIBUTED R ESEARCH A RTICLES


   The van der Pol equation is often used as a test
problem for ODE solvers, as, for large µ, its dy-                                       IVP ODE, stiff

namics consists of parts where the solution changes




                                                               2
very slowly, alternating with regions of very sharp
changes. This “stiffness” makes the equation quite




                                                               1
challenging to solve.




                                                               0
                                                           y
   In R, this model is implemented as a function
(vdpol) whose inputs are the current time (t), the val-




                                                               −1
ues of the state variables (y), and the parameters (mu);




                                                               −2
the function returns a list with as first element the
derivatives, concatenated.                                          0     500     1000      1500     2000      2500   3000

                                                                                             time
 vdpol <- function (t, y, mu) {
   list(c(
     y[2],                                                 Figure 1: Solution of the van der Pol equation, an
     mu * (1 - y[1]^2) * y[2] - y[1]                       initial value ordinary differential equation, stiff case,
   ))                                                      µ = 1000.
 }


    After defining the initial condition of the state                               IVP ODE, nonstiff
variables (yini), the model is solved, and output              2

written at selected time points (times), using de-
Solve’s integration function ode. The default rou-
                                                               1




tine lsoda, which is invoked by ode automatically
                                                               0
                                                           y




switches between stiff and non-stiff methods, de-
pending on the problem (Petzold, 1983).
                                                               −1




    We run the model for a typically stiff (mu = 1000)
                                                               −2




and nonstiff (mu = 1) situation:
                                                                    0      5       10        15          20     25    30

 library(deSolve)                                                                            time
 yini <- c(y1 = 2, y2 = 0)
 stiff <- ode(y = yini, func = vdpol,
     times = 0:3000, parms = 1000)                         Figure 2: Solution of the van der Pol equation, an
                                                           initial value ordinary differential equation, non-stiff
 nonstiff <- ode(y = yini, func = vdpol,                   case, µ = 1.
     times = seq(0, 30, by = 0.01),
     parms = 1)
                                                                        solver      non-stiff        stiff
   The model returns a matrix, of class deSolve,                        ode23       0.37             271.19
with in its first column the time values, followed by                    lsoda       0.26             0.23
the values of the state variables:                                      adams       0.13             616.13
                                                                        bdf         0.15             0.22
 head(stiff, n = 3)                                                     radau       0.53             0.72

     time       y1            y2                           Table 1: Comparison of solvers for a stiff and a
[1,]    0 2.000000 0.0000000000                            non-stiff parametrisation of the van der Pol equation
[2,]    1 1.999333 -0.0006670373                           (time in seconds, mean values of ten simulations on
[3,]    2 1.998666 -0.0006674088                           an AMD AM2 X2 3000 CPU).

    Figures are generated using the S3 plot method             A comparison of timings for two explicit solvers,
for objects of class deSolve:                              the Runge-Kutta method (ode23) and the adams
                                                           method, with the implicit multistep solver (bdf,
 plot(stiff, type = "l", which = "y1",                     backward differentiation formula) shows a clear ad-
   lwd = 2, ylab = "y",                                    vantage for the latter in the stiff case (Figure 1). The
   main = "IVP ODE, stiff")                                default solver (lsoda) is not necessarily the fastest,
                                                           but shows robust behavior due to automatic stiff-
 plot(nonstiff, type = "l", which = "y1",                  ness detection. It uses the explicit multistep Adams
   lwd = 2, ylab = "y",                                    method for the non-stiff case and the BDF method
   main = "IVP ODE, nonstiff")                             for the stiff case. The accuracy is comparable for all


The R Journal Vol. 2/2, December 2010                                                                         ISSN 2073-4859
C ONTRIBUTED R ESEARCH A RTICLES                                                                                           9


solvers with atol = rtol = 10−6 , the default.              xi <- 0.0025
                                                            analytic <- cos(pi * x) + exp((x -
                                                                1)/sqrt(xi)) + exp(-(x + 1)/sqrt(xi))
A boundary value ODE                                        max(abs(analytic - twp[, 2]))
The webpage of Jeff Cash (Cash, 2009) contains many
                                                           [1] 7.788209e-10
test cases, including their analytical solution (see be-
low), that BVP solvers should be able to solve. We             A similar low discrepancy (4 · 10−11 ) is noted for
use equation no. 14 from this webpage as an exam-          the ξ = 0.0001 as solved by bvpcol; the shooting
ple:                                                       method is considerably less precise (1.4 · 10−5 ), al-
                                                           though the same tolerance (atol = 10−8 ) was used
           ξy − y = −(ξπ 2 + 1) cos(πx )
                                                           for all runs.
on the interval [−1, 1], and subject to the boundary           The plot shows how the shape of the solution
conditions:                                                is affected by the parameter ξ, becoming more and
                                                           more steep near the boundaries, and therefore more
                     y( x=−1) = 0                          and more difficult to solve, as ξ gets smaller.
                     y( x=+1) = 0                           plot(shoot[, 1], shoot[, 2], type = "l", lwd = 2,
                                                              ylim = c(-1, 1), col = "blue",
The second-order equation first is rewritten as two            xlab = "x", ylab = "y", main = "BVP ODE")
first-order equations:                                       lines(twp[, 1], twp[, 2], col = "red", lwd = 2)
                                                            lines(coll[, 1], coll[, 2], col = "green", lwd = 2)
         y1 = y2                                            legend("topright", legend = c("0.01", "0.0025",
         y2 = 1/ξ · (y1 − (ξπ 2 + 1) cos(πx ))                "0.0001"), col = c("blue", "red", "green"),
                                                              title = expression(xi), lwd = 2)


It is implemented in R as:                                                                 BVP ODE

 Prob14 <- function(x, y, xi) {                                                                                  ξ
                                                               1.0




   list(c(                                                                                                       0.01
     y[2],                                                                                                       0.0025
                                                                                                                 0.0001
     1/xi * (y[1] - (xi*pi*pi+1) * cos(pi*x))
                                                               0.5




   ))
 }
                                                               0.0




With decreasing values of ξ, this problem becomes
                                                           y




increasingly difficult to solve. We use three val-
ues of ξ, and solve the problem with the shooting,
                                                               −0.5




the MIRK and the collocation method (Ascher et al.,
1995).
    Note how the initial conditions yini and the con-
ditions at the end of the integration interval yend
                                                               −1.0




are specified, where NA denotes that the value is not
                                                                      −1.0         −0.5       0.0       0.5          1.0
known. The independent variable is called x here
(rather than times in ode).                                                                    x


 library(bvpSolve)
 x <- seq(-1, 1, by = 0.01)                                Figure 3: Solution of the BVP ODE problem, for dif-
 shoot <- bvpshoot(yini = c(0, NA),                        ferent values of parameter ξ.
     yend = c(0, NA), x = x, parms = 0.01,
     func = Prob14)
                                                           Differential algebraic equations
 twp <- bvptwp(yini = c(0, NA), yend = c(0,
                                                           The so called “Rober problem” describes an auto-
     NA), x = x, parms = 0.0025,
                                                           catalytic reaction (Robertson, 1966) between three
     func = Prob14)
                                                           chemical species, y1 , y2 and y3 . The problem can be
 coll <- bvpcol(yini = c(0, NA),                           formulated either as an ODE (Mazzia and Magherini,
     yend = c(0, NA), x = x, parms = 1e-04,                2008), or as a DAE:
     func = Prob14)
                                                                             y1 = −0.04y1 + 104 y2 y3
The numerical approximation generated by bvptwp
                                                                             y2 = 0.04y1 − 104 y2 y3 − 3107 y2
                                                                                                             2
is very close to the analytical solution, e.g. for ξ =
0.0025:                                                                       1 = y1 + y2 + y3


The R Journal Vol. 2/2, December 2010                                                                    ISSN 2073-4859
10                                                                                    C ONTRIBUTED R ESEARCH A RTICLES


                                                                                             IVP DAE
    where the first two equations are differential
                                                                                 y1                                           y2
equations that specify the dynamics of chemical
species y1 and y2 , while the third algebraic equation




                                                                  0.8
ensures that the summed concentration of the three




                                                                                                          2e−05
                                                          conc.




                                                                                                  conc.
species remains 1.




                                                                  0.4
    The DAE has to be specified by the residual func-




                                                                                                          0e+00
tion instead of the rates of change (as in ODEs).




                                                                  0.0
                                                                        1e−06   1e+00     1e+06                    1e−06     1e+00        1e+06
            r1 = −y1 − 0.04y1 + 104 y2 y3                                       time                                          time

            r2 = −y2 + 0.04y1 − 104 y2 y3 − 3 107 y2
                                                   2
                                                                                 y3                                          error
            r3 = −1 + y1 + y2 + y3




                                                                                                          1e−09
                                                                  0.8
     Implemented in R this becomes:




                                                                                                          −2e−09
                                                          conc.




                                                                                                  conc.
                                                                  0.4
 daefun<-function(t, y, dy, parms) {




                                                                                                          −5e−09
     res1 <- - dy[1] - 0.04 * y[1] +




                                                                  0.0
               1e4 * y[2] * y[3]
                                                                        1e−06   1e+00     1e+06                    1e−06     1e+00        1e+06
     res2 <- - dy[2] + 0.04 * y[1] -
               1e4 * y[2] * y[3] - 3e7 * y[2]^2                                 time                                          time

     res3 <- y[1] + y[2] + y[3] - 1
     list(c(res1, res2, res3),
          error = as.vector(y[1] + y[2] + y[3]) - 1)      Figure 4: Solution of the DAE problem for the sub-
 }                                                        stances y1 , y2 , y3 ; mass balance error: deviation of to-
                                                          tal sum from one.
 yini <- c(y1 = 1, y2 = 0, y3 = 0)
 dyini <- c(-0.04, 0.04, 0)
 times <- 10 ^ seq(-6,6,0.1)
                                                          Partial differential equations
                                                          In partial differential equations (PDE), the func-
    The input arguments of function daefun are the        tion has several independent variables (e.g. time and
current time (t), the values of the state variables and   depth) and contains their partial derivatives.
their derivatives (y, dy) and the parameters (parms).         Many partial differential equations can be solved
It returns the residuals, concatenated and an output      by numerical approximation (finite differencing) af-
variable, the error in the algebraic equation. The lat-   ter rewriting them as a set of ODEs (see Schiesser,
ter is added to check upon the accuracy of the results.   1991; LeVeque, 2007; Hundsdorfer and Verwer,
    For DAEs solved with daspk, both the state vari-      2003).
ables and their derivatives need to be initialised (y         Functions tran.1D, tran.2D, and tran.3D from
and dy). Here we make sure that the initial condi-        R package ReacTran (Soetaert and Meysman, 2010)
tions for y obey the algebraic constraint, while also     implement finite difference approximations of the
the initial condition of the derivatives is consistent    diffusive-advective transport equation which, for the
with the dynamics.                                        1-D case, is:
 library(deSolve)
                                                                          1     ∂         ∂C                            ∂
 print(system.time(out <-daspk(y = yini,                            −        ·    Ax −D ·                          −      ( Ax · u · C)
   dy = dyini, times = times, res = daefun,                               Ax   ∂x         ∂x                           ∂x
   parms = NULL)))
                                                          Here D is the “diffusion coefficient”, u is the “advec-
     user     system elapsed                              tion rate”, and A x is some property (e.g. surface area)
     0.07       0.00    0.11                              that depends on the independent variable, x.
                                                              It should be noted that the accuracy of the finite
   An S3 plot method can be used to plot all vari-        difference approximations can not be specified in the
ables at once:                                            ReacTran functions. It is up to the user to make sure
                                                          that the solutions are sufficiently accurate, e.g. by in-
 plot(out, ylab = "conc.", xlab = "time",
                                                          cluding more grid points.
   type = "l", lwd = 2, log = "x")
 mtext("IVP DAE", side = 3, outer = TRUE,
   line = -1)                                             One dimensional PDE

    There is a very fast initial change in concentra-     Diffusion-reaction models are a fundamental class of
tions, mainly due to the quick reaction between y1        models which describe how concentration of matter,
and y2 and amongst y2 . After that, the slow reaction     energy, information, etc. evolves in space and time
of y1 with y2 causes the system to change much more       under the influence of diffusive transport and trans-
smoothly. This is typical for stiff problems.             formation (Soetaert and Herman, 2009).


The R Journal Vol. 2/2, December 2010                                                                                      ISSN 2073-4859
C ONTRIBUTED R ESEARCH A RTICLES                                                                                   11


   As an example, consider the 1-D diffusion-                user       system elapsed
reaction model in [0, 10]:                                   0.02         0.00    0.02

              ∂C    ∂          ∂C                           The values of the state-variables (y) are plotted
                 =        D·        −Q
              ∂t   ∂x          ∂x                        against the distance, in the middle of the grid cells
                                                         (Grid$x.mid).
with C the concentration, t the time, x the distance
from the origin, Q, the consumption rate, and with        plot (Grid$x.mid, std$y, type = "l",
boundary conditions (values at the model edges):            lwd = 2, main = "steady-state PDE",
                                                            xlab = "x", ylab = "C", col = "red")
                    ∂C
                           =0
                    ∂x x=0
                                                                                   steady−state PDE
                     Cx=10 = Cext




                                                             20
To solve this model in R, first the 1-D model Grid is




                                                             10
defined; it divides 10 cm (L) into 1000 boxes (N).




                                                             0
                                                         C
 library(ReacTran)



                                                             −10
 Grid <- setup.grid.1D(N = 1000, L = 10)


                                                             −30
    The model equation includes a transport term,
approximated by ReacTran function tran.1D and                       0         2       4       6        8      10
a consumption term (Q). The downstream bound-
                                                                                          x
ary condition, prescribed as a concentration (C.down)
needs to be specified, the zero-gradient at the up-
stream boundary is the default:                          Figure 5: Steady-state solution of the 1-D diffusion-
                                                         reaction model.
 pde1D <-function(t, C, parms) {
   tran <- tran.1D(C = C, D = D,                           The analytical solution compares well with the
                   C.down = Cext, dx = Grid)$dC          numerical approximation:
   list(tran - Q) # return value: rate of change
 }                                                        analytical <- Q/2/D*(Grid$x.mid^2 - 10^2) + Cext
                                                          max(abs(analytical - std$y))
   The model parameters are:
                                                         [1] 1.250003e-05
 D    <- 1    # diffusion constant
 Q    <- 1    # uptake rate                                 Next the model is run dynamically for 100 time
 Cext <- 20
                                                         units using deSolve function ode.1D, and starting
                                                         with a uniform concentration:
    In a first application, the model is solved to
steady-state, which retrieves the condition where the
                                                          require(deSolve)
concentrations are invariant:                             times <- seq(0, 100, by = 1)
                     ∂        ∂C                          system.time(
               0=        D·         −Q                      out <- ode.1D(y = rep(1, Grid$N),
                    ∂x        ∂x                              times = times, func = pde1D,
                                                              parms = NULL, nspec = 1)
In R, steady-state conditions can be estimated using      )
functions from package rootSolve which implement
amongst others a Newton-Raphson algorithm (Press             user       system elapsed
et al., 2007). For 1-dimensional models, steady.1D is        0.61         0.02    0.63
most efficient. The initial “guess” of the steady-state
solution (y) is unimportant; here we take simply N           Here, out is a matrix, whose 1st column contains
random numbers. Argument nspec = 1 informs the           the output times, and the next columns the values of
solver that only one component is described.             the state variables in the different boxes; we print the
    Although a system of 1000 equations needs to be      first columns of the last three rows of this matrix:
solved, this takes only a fraction of a second:
                                                          tail(out[, 1:4], n = 3)
 library(rootSolve)
 print(system.time(                                                 time         1         2         3
 std    <- steady.1D(y = runif(Grid$N),                  [99,]        98 -27.55783 -27.55773 -27.55754
    func = pde1D, parms = NULL, nspec = 1)               [100,]       99 -27.61735 -27.61725 -27.61706
 ))                                                      [101,]      100 -27.67542 -27.67532 -27.67513


The R Journal Vol. 2/2, December 2010                                                                 ISSN 2073-4859
12                                                                                                    C ONTRIBUTED R ESEARCH A RTICLES


    We plot the result using a blue-yellow-red color                               R code, or in compiled languages. However, com-
scheme, and using deSolve’s S3 method image. Fig-                                  pared to odesolve, it includes a more complete set
ure 6 shows that, as time proceeds, gradients develop                              of integrators, and a more extensive set of options to
from the uniform distribution, until the system al-                                tune the integration routines, it provides more com-
most reaches steady-state at the end of the simula-                                plete output, and has extended the applicability do-
tion.                                                                              main to include also DDEs, DAEs and PDEs.
                                                                                        Thanks to the DAE solvers daspk (Brenan et al.,
    image(out, xlab = "time, days",                                                1996) and radau (Hairer and Wanner, 2010) it is now
          ylab = "Distance, cm",                                                   also possible to model electronic circuits or equilib-
          main = "PDE", add.contour = TRUE)                                        rium chemical systems. These problems are often of
                                                                                   index ≤ 1. In many mechanical systems, physical
                                                                                   constraints lead to DAEs of index up to 3, and these
                                                 PDE
                                                                                   more complex problems can be solved with radau.
                   1.0




                                                                                        The inclusion of BVP and PDE solvers have
                                                                       15

                                                                            10
                                                                                   opened up the application area to the field of re-
                                                                             5     active transport modelling (Soetaert and Meysman,
                   0.8




                                                                            0      2010), such that R can now be used to describe quan-
                                                                        −5         tities that change not only in time, but also along
                                                                                   one or more spatial axes. We use it to model how
                   0.6




                                                                       −10
Distance, cm




                                                                       −15
                                                                                   ecosystems change along rivers, or in sediments, but
                                                                                   it could equally serve to model the growth of a tu-
                   0.4




                                                                       −20
                                                                                   mor in human brains, or the dispersion of toxicants
                                                                                   in human tissues.
                                                                       −25              The open source matrix language R has great po-
                   0.2




                                                                                   tential for dynamic modelling, and the tools cur-
                                                                                   rently available are suitable for solving a wide va-
                                                                                   riety of practical and scientific problems. The perfor-
                   0.0




                         0      20        40                60   80     100        mance is sufficient even for larger systems, especially
                                               time, days
                                                                                   when models can be formulated using matrix alge-
                                                                                   bra or are implemented in compiled languages like
                                                                                   C or Fortran (Soetaert et al., 2010b). Indeed, there
Figure 6: Dynamic solution of the 1-D diffusion-                                   is emerging interest in performing statistical analysis
reaction model.                                                                    on differential equations, e.g. in package nlmeODE
                                                                                   (Tornøe et al., 2004) for fitting non-linear mixed-
    It should be noted that the steady-state model is                              effects models using differential equations, pack-
effectively a boundary value problem, while the tran-                              age FME (Soetaert and Petzoldt, 2010) for sensitiv-
sient model is a prototype of a “parabolic” partial dif-                           ity analysis, parameter estimation and Markov chain
ferential equation (LeVeque, 2007).                                                Monte-Carlo analysis or package ccems for combina-
    Whereas R can also solve the other two main                                    torially complex equilibrium model selection (Radi-
classes of PDEs, i.e. of the “hyperbolic” and “ellip-                              voyevitch, 2008).
tic” type, it is well beyond the scope of this paper to                                 However, there is ample room for extensions
elaborate on that.                                                                 and improvements. For instance, the PDE solvers
                                                                                   are quite memory intensive, and could benefit from
                                                                                   the implementation of sparse matrix solvers that are
Discussion                                                                         more efficient in this respect2 . In addition, the meth-
                                                                                   ods implemented in ReacTran handle equations de-
Although R is still predominantly applied for statis-                              fined on very simple shapes only. Extending the
tical analysis and graphical representation, it is more                            PDE approach to finite elements (Strang and Fix,
and more suitable for mathematical computing, e.g.                                 1973) would open up the application domain of R to
in the field of matrix algebra (Bates and Maechler,                                 any irregular geometry. Other spatial discretisation
2008). Thanks to the differential equation solvers, R                              schemes could be added, e.g. for use in fluid dynam-
is also emerging as a powerful environment for dy-                                 ics.
namic simulations (Petzoldt, 2003; Soetaert and Her-                                    Our models are often applied to derive unknown
man, 2009; Stevens, 2009).                                                         parameters by fitting them against data; this relies on
    The new package deSolve has retained all the                                   the availability of apt parameter fitting algorithms.
funtionalities of its predecessor odesolve (Setzer,                                     Discussion of these items is highly welcomed, in
2001), such as the potential to define models both in                               the new special interest group about dynamic mod-
               2 for instance, the “preconditioned Krylov” part of the daspk method is not yet supported
               3   https://stat.ethz.ch/mailman/listinfo/r-sig-dynamic-models


The R Journal Vol. 2/2, December 2010                                                                                    ISSN 2073-4859
C ONTRIBUTED R ESEARCH A RTICLES                                                                                    13


els3 in R.                                                    E. Hairer, S. P. Noørsett, and G. Wanner. Solving Ordi-
                                                                 nary Differential Equations I: Nonstiff Problems. Sec-
                                                                 ond Revised Edition. Springer-Verlag, Heidelberg,
Bibliography                                                     2009.

U. Ascher, R. Mattheij, and R. Russell. Numerical So-         S. Hamdi, W. E. Schiesser, and G. W. Griffiths.
  lution of Boundary Value Problems for Ordinary Dif-           Method of lines. Scholarpedia, 2(7):2859, 2007.
  ferential Equations. Philadelphia, PA, 1995.                A. C. Hindmarsh. ODEPACK, a systematized collec-
                                                                tion of ODE solvers. In R. Stepleman, editor, Scien-
U. M. Ascher and L. R. Petzold. Computer Methods
                                                                tific Computing, Vol. 1 of IMACS Transactions on Sci-
  for Ordinary Differential Equations and Differential-
                                                                entific Computation, pages 55–64. IMACS / North-
  Algebraic Equations. SIAM, Philadelphia, 1998.
                                                                Holland, Amsterdam, 1983.
G. Bader and U. Ascher. A new basis implementa-
                                                              W. Hundsdorfer and J. Verwer. Numerical Solution of
  tion for a mixed order boundary value ODE solver.
                                                               Time-Dependent Advection-Diffusion-Reaction Equa-
  SIAM J. Scient. Stat. Comput., 8:483–500, 1987.
                                                               tions. Springer Series in Computational Mathematics.
D. Bates and M. Maechler. Matrix: A Matrix Package             Springer-Verlag, Berlin, 2003.
  for R, 2008. R package version 0.999375-9.                  S. M. Iacus. sde: Simulation and Inference for Stochas-
                                                                 tic Differential Equations, 2008. R package version
P. Bogacki and L. Shampine. A 3(2) pair of Runge-
                                                                 2.0.3.
   Kutta formulas. Appl. Math. Lett., 2:1–9, 1989.
                                                              IEEE Standard 754. Ieee standard for floating-point
K. E. Brenan, S. L. Campbell, and L. R. Pet-                    arithmetic, Aug 2008.
  zold. Numerical Solution of Initial-Value Problems in
  Differential-Algebraic Equations. SIAM Classics in          A. A. King, E. L. Ionides, and C. M. Breto. pomp: Sta-
  Applied Mathematics, 1996.                                    tistical Inference for Partially Observed Markov Pro-
                                                                cesses, 2008. R package version 0.21-3.
P. N. Brown, G. D. Byrne, and A. C. Hindmarsh.
   VODE, a variable-coefficient ode solver. SIAM J.            R. J. LeVeque. Finite Difference Methods for Ordinary
   Sci. Stat. Comput., 10:1038–1051, 1989.                      and Partial Differential Equations, Steady State and
                                                                Time Dependent Problems. SIAM, 2007.
J. C. Butcher. The Numerical Analysis of Ordinary Dif-
   ferential Equations, Runge-Kutta and General Linear        F. Mazzia and C. Magherini. Test Set for Initial Value
   Methods. Wiley, Chichester, New York, 1987.                   Problem Solvers, release 2.4. Department of Mathe-
                                                                 matics, University of Bari, Italy, 2008. URL http://
J. R. Cash. 35 Test Problems for Two Way Point Bound-            pitagora.dm.uniba.it/~testset. Report 4/2008.
   ary Value Problems, 2009. URL http://www.ma.ic.
   ac.uk/~jcash/BVP_software/PROBLEMS.PDF.                    L. R. Petzold. Automatic selection of methods for
                                                                solving stiff and nonstiff systems of ordinary dif-
J. R. Cash and A. H. Karp.           A variable order           ferential equations. SIAM J. Sci. Stat. Comput., 4:
   Runge-Kutta method for initial value problems                136–148, 1983.
   with rapidly varying right-hand sides. ACM Trans-
                                                              T. Petzoldt. R as a simulation platform in ecological
   actions on Mathematical Software, 16:201–222, 1990.
                                                                 modelling. R News, 3(3):8–16, 2003.
J. R. Cash and F. Mazzia. A new mesh selection                W. H. Press, S. A. Teukolsky, W. T. Vetterling, and
   algorithm, based on conditioning, for two-point             B. P. Flannery. Numerical Recipes: The Art of Scien-
   boundary value codes. J. Comput. Appl. Math., 184:          tific Computing. Cambridge University Press, 3rd
   362–381, 2005.                                              edition, 2007.
A. Couture-Beil, J. T. Schnute, and R. Haigh. PB-             P. J. Prince and J. R. Dormand. High order embed-
  Sddesolve: Solver for Delay Differential Equations,            ded Runge-Kutta formulae. J. Comput. Appl. Math.,
  2010. R package version 1.08.11.                               7:67–75, 1981.
J. R. Dormand and P. J. Prince. A family of embed-            R Development Core Team. R: A Language and Envi-
   ded Runge-Kutta formulae. J. Comput. Appl. Math.,            ronment for Statistical Computing. R Foundation for
   6:19–26, 1980.                                               Statistical Computing, Vienna, Austria, 2009. URL
                                                                http://www.R-project.org. ISBN 3-900051-07-0.
E. Hairer and G. Wanner. Solving Ordinary Differen-
  tial Equations II: Stiff and Differential-Algebraic Prob-   T. Radivoyevitch.    Equilibrium model selection:
  lems. Second Revised Edition. Springer-Verlag, Hei-           dTTP induced R1 dimerization. BMC Systems Bi-
  delberg, 2010.                                                ology, 2:15, 2008.


The R Journal Vol. 2/2, December 2010                                                                 ISSN 2073-4859
14                                                                          C ONTRIBUTED R ESEARCH A RTICLES


H. H. Robertson. The solution of a set of reaction rate
  equations. In J. Walsh, editor, Numerical Analysis:       Karline Soetaert
  An Introduction, pages 178–182. Academic Press,           Netherlands Institute of Ecology
  London, 1966.                                             K.Soetaert@nioo.knaw.nl

W. E. Schiesser. The Numerical Method of Lines: In-         Thomas Petzoldt
 tegration of Partial Differential Equations. Academic      Technische Universität Dresden
 Press, San Diego, 1991.                                    Thomas.Petzoldt@tu-dresden.de

R. W. Setzer. The odesolve Package: Solvers for Ordi-       R. Woodrow Setzer
  nary Differential Equations, 2001. R package version      US Environmental Protection Agency
  0.1-1.                                                    Setzer.Woodrow@epamail.epa.gov
K. Soetaert. rootSolve: Nonlinear Root Finding, Equi-
  librium and Steady-State Analysis of Ordinary Differ-
  ential Equations, 2009. R package version 1.6.

K. Soetaert and P. M. J. Herman. A Practical Guide
  to Ecological Modelling. Using R as a Simulation Plat-
  form. Springer, 2009. ISBN 978-1-4020-8623-6.

K. Soetaert and F. Meysman. ReacTran: Reactive
  Transport Modelling in 1D, 2D and 3D, 2010. R pack-
  age version 1.2.

K. Soetaert and T. Petzoldt. Inverse modelling, sensi-
  tivity and Monte Carlo analysis in R using package
  FME. Journal of Statistical Software, 33(3):1–28, 2010.
  URL http://www.jstatsoft.org/v33/i03/.

K. Soetaert, J. R. Cash, and F. Mazzia. bvpSolve:
  Solvers for Boundary Value Problems of Ordinary Dif-
  ferential Equations, 2010a. R package version 1.2.

K. Soetaert, T. Petzoldt, and R. W. Setzer. Solving dif-
  ferential equations in R: Package deSolve. Journal
  of Statistical Software, 33(9):1–25, 2010b. ISSN 1548-
  7660. URL http://www.jstatsoft.org/v33/i09.

K. Soetaert, T. Petzoldt, and R. W. Setzer. R Pack-
  age deSolve: Writing Code in Compiled Languages,
  2010c. deSolve vignette - R package version 1.8.

K. Soetaert, T. Petzoldt, and R. W. Setzer. R Package
  deSolve: Solving Initial Value Differential Equations,
  2010d. deSolve vignette - R package version 1.8.

M. H. H. Stevens. A Primer of Ecology with R. Use R
 Series. Springer, 2009. ISBN: 978-0-387-89881-0.

G. Strang and G. Fix. An Analysis of The Finite Element
  Method. Prentice Hall, 1973.

C. W. Tornøe, H. Agersø, E. N. Jonsson, H. Mad-
  sen, and H. A. Nielsen. Non-linear mixed-effects
  pharmacokinetic/pharmacodynamic modelling in
  nlme using differential equations. Computer Meth-
  ods and Programs in Biomedicine, 76:31–40, 2004.

B. van der Pol and J. van der Mark. Frequency de-
  multiplication. Nature, 120:363–364, 1927.


The R Journal Vol. 2/2, December 2010                                                            ISSN 2073-4859
C ONTRIBUTED R ESEARCH A RTICLES                                                                             15




                  Table 2: Summary of the main functions that solve differential equations.
  Function           Package             Description
  ode                deSolve             IVP of ODEs, full, banded or arbitrary sparse Jacobian
  ode.1D             deSolve             IVP of ODEs resulting from 1-D reaction-transport problems
  ode.2D             deSolve             IVP of ODEs resulting from 2-D reaction-transport problems
  ode.3D             deSolve             IVP of ODEs resulting from 3-D reaction-transport problems
  daspk              deSolve             IVP of DAEs of index ≤ 1, full or banded Jacobian
  radau              deSolve             IVP of DAEs of index ≤ 3, full or banded Jacobian
  dde                PBSddesolve         IVP of delay differential equations, based on Runge-Kutta formu-
                                         lae
  dede               deSolve             IVP of delay differential equations, based on Adams and BDF for-
                                         mulae
  bvpshoot           bvpSolve            BVP of ODEs; the shooting method
  bvptwp             bvpSolve            BVP of ODEs; mono-implicit Runge-Kutta formula
  bvpcol             bvpSolve            BVP of ODEs; collocation formula
  steady             rootSolve           steady-state of ODEs; full, banded or arbitrary sparse Jacobian
  steady.1D          rootSolve           steady-state of ODEs resulting from 1-D reaction-transport prob-
                                         lems
  steady.2D          rootSolve           steady-state of ODEs resulting from 2-D reaction-transport prob-
                                         lems
  steady.3D          rootSolve           steady-state of ODEs resulting from 3-D reaction-transport prob-
                                         lems
  tran.1D            ReacTran            numerical approximation of 1-D advective-diffusive transport
                                         problems
  tran.2D            ReacTran            numerical approximation of 2-D advective-diffusive transport
                                         problems
  tran.3D            ReacTran            numerical approximation of 3-D advective-diffusive transport
                                         problems




                Table 3: Summary of the auxilliary functions that solve differential equations.
  Function           Package             Description
  lsoda              deSolve             IVP ODEs, full or banded Jacobian, automatic choice for stiff or
                                         non-stiff method
  lsodar             deSolve             same as lsoda, but includes a root-solving procedure.
  lsode, vode        deSolve             IVP ODEs, full or banded Jacobian, user specifies if stiff or non-
                                         stiff
  lsodes             deSolve             IVP ODEs, arbitrary sparse Jacobian, stiff method
  rk4, rk, euler     deSolve             IVP ODEs, using Runge-Kutta and Euler methods
  zvode              deSolve             IVP ODEs, same as vode, but for complex variables
  runsteady          rootSolve           steady-state ODEs by dynamically running, full or banded Jaco-
                                         bian
  stode              rootSolve           steady-state ODEs by Newton-Raphson method, full or banded
                                         Jacobian
  stodes             rootSolve           steady-state ODEs by Newton-Raphson method, arbitrary sparse
                                         Jacobian




The R Journal Vol. 2/2, December 2010                                                             ISSN 2073-4859
16                                                                          C ONTRIBUTED R ESEARCH A RTICLES



Source References
by Duncan Murdoch                                          [1] 3

     Abstract Since version 2.10.0, R includes ex-         > typeof(parsed)
     panded support for source references in R code        [1] "expression"
     and ‘.Rd’ files. This paper describes the origin
     and purposes of source references, and current        The first element is the assignment, the second ele-
     and future support for them.                          ment is the for loop, and the third is the single x at
                                                           the end:
One of the strengths of R is that it allows “compu-        > parsed[[1]]
tation on the language”, i.e. the parser returns an R
object which can be manipulated, not just evaluated.       x <- 1:10
This has applications in quality control checks, de-       > parsed[[2]]
bugging, and elsewhere. For example, the codetools
package (Tierney, 2009) examines the structure of          for (i in x) {
parsed source code to look for common program-                 print(i)
ming errors. Functions marked by debug() can be            }
executed one statement at a time, and the trace()          > parsed[[3]]
function can insert debugging statements into any
function.                                                  x
    Computing on the language is often enhanced by
                                                           The first two elements are both of type "language",
being able to refer to the original source code, rather
                                                           and are made up of smaller components. The dif-
than just to a deparsed (reconstructed) version of
                                                           ference between an "expression" and a "language"
it based on the parsed object. To support this, we
                                                           object is mainly internal: the former is based on the
added source references to R 2.5.0 in 2007. These are
                                                           generic vector type (i.e. type "list"), whereas the
attributes attached to the result of parse() or (as of
                                                           latter is based on the "pairlist" type. Pairlists are
2.10.0) parse_Rd() to indicate where a particular part
                                                           rarely encountered explicitly in any other context.
of an object originated. In this article I will describe
                                                           From a user point of view, they act just like generic
their structure and how they are used in R. The arti-
                                                           vectors.
cle is aimed at developers who want to create debug-
                                                               The third element x is of type "symbol". There are
gers or other tools that use the source references, at
                                                           other possible types, such as "NULL", "double", etc.:
users who are curious about R internals, and also at
                                                           essentially any simple R object could be an element.
users who want to use the existing debugging facili-
                                                               The comments in the source code and the white
ties. The latter group may wish to skip over the gory
                                                           space making up the indentation of the third line are
details and go directly to the section “Using Source
                                                           not part of the parsed object.
References".
                                                               The parse_Rd() function parses ‘.Rd’ documenta-
                                                           tion files. It also returns a recursive structure contain-
The R parsers                                              ing objects of different types (Murdoch and Urbanek,
                                                           2009; Murdoch, 2010).
We start with a quick introduction to the R parser.
The parse() function returns an R object of type
"expression". This is a list of statements; the state-     Source reference structure
ments can be of various types. For example, consider
                                                           As described above, the result of parse() is essen-
the R source shown in Figure 1.
                                                           tially a list (the "expression" object) of objects that
1:   x <- 1:10           # Initialize x
                                                           may be lists (the "language" objects) themselves, and
2:   for (i in x) {                                        so on recursively. Each element of this structure from
3:     print(i)          # Print each entry                the top down corresponds to some part of the source
4:   }                                                     file used to create it: in our example, parse[[1]] cor-
5:   x                                                     responds to the first line of ‘sample.R’, parse[[2]] is
                                                           the second through fourth lines, and parse[[3]] is
          Figure 1: The contents of ‘sample.R’.            the fifth line.
                                                               The comments and indentation, though helpful
   If we parse this file, we obtain an expression of        to the human reader, are not part of the parsed object.
length 3:                                                  However, by default the parsed object does contain a
                                                           "srcref" attribute:
> parsed <- parse("sample.R")
> length(parsed)                                           > attr(parsed, "srcref")


The R Journal Vol. 2/2, December 2010                                                              ISSN 2073-4859
C ONTRIBUTED R ESEARCH A RTICLES                                                                                    17


[[1]]                                                          The "srcfile" attribute is actually an environment
x <- 1:10                                                      containing an encoding, a filename, a timestamp,
                                                               and a working directory. These give information
[[2]]                                                          about the file from which the parser was reading.
for (i in x) {                                                 The reason it is an environment is that environments
  print(i)            # Print each entry
                                                               are reference objects: even though all three source ref-
}
                                                               erences contain this attribute, in actuality there is
[[3]]                                                          only one copy stored. This was done to save mem-
x                                                              ory, since there are often hundreds of source refer-
                                                               ences from each file.
Although it appears that the "srcref" attribute con-               Source references in objects returned by
tains the source, in fact it only references it, and the       parse_Rd() use the same structure as those returned
print.srcref() method retrieves it for printing. If            by parse(). The main difference is that in Rd objects
we remove the class from each element, we see the              source references are attached to every component,
true structure:                                                whereas parse() only constructs source references
                                                               for complete statements, not for their component
> lapply(attr(parsed, "srcref"), unclass)                      parts, and they are attached to the container of the
                                                               statements. Thus for example a braced list of state-
[[1]]                                                          ments processed by parse() will receive a "srcref"
[1] 1 1 1 9 1 9                                                attribute containing source references for each state-
attr(,"srcfile")                                               ment within, while the statements themselves will
sample.R                                                       not hold their own source references, and sub-
                                                               expressions within each statement will not generate
[[2]]
                                                               source references at all. In contrast the "srcref" at-
[1] 2 1 4 1 1 1
attr(,"srcfile")
                                                               tribute for a section in an ‘.Rd’ file will be a source
sample.R                                                       reference for the whole section, and each component
                                                               part in the section will have its own source reference.
[[3]]
[1] 5 1 5 1 1 1
attr(,"srcfile")                                               Relation to the "source" attribute
sample.R
                                                               By default the R parser also creates an attribute
Each element is a vector of 6 integers: (first line, first       named "source" when it parses a function definition.
byte, last line, last byte, first character, last character).   When available, this attribute is used by default in
The values refer to the position of the source for each        lieu of deparsing to display the function definition.
element in the original source file; the details of the         It is unrelated to the "srcref" attribute, which is in-
source file are contained in a "srcfile" attribute on           tended to point to the source, rather than to duplicate
each reference.                                                the source. An integrated development environment
    The reason both bytes and characters are                   (IDE) would need to know the correspondence be-
recorded in the source reference is historical. When           tween R code in R and the true source, and "srcref"
they were introduced, they were mainly used for re-            attributes are intended to provide this.
trieving source code for display; for this, bytes are
needed. Since R 2.9.0, they have also been used to
                                                               When are "srcref" attributes added?
aid in error messages. Since some characters take up
more than one byte, users need to be informed about            As mentioned above, the parser adds a "srcref"
character positions, not byte positions, and the last          attribute by default. For this, it is assumes that
two entries were added.                                        options("keep.source") is left at its default setting
    The "srcfile" attribute is also not as simple as it        of TRUE, and that parse() is given a filename as argu-
looks. For example,                                            ment file, or a character vector as argument text.
                                                               In the latter case, there is no source file to refer-
> srcref <- attr(parsed, "srcref")[[1]]
                                                               ence, so parse() copies the lines of source into a
> srcfile <- attr(srcref, "srcfile")
                                                               "srcfilecopy" object, which is simply a "srcfile"
> typeof(srcfile)
                                                               object that contains a copy of the text.
[1] "environment"                                                  Developers may wish to add source references in
                                                               other situations. To do that, an object inheriting from
> ls(srcfile)                                                  class "srcfile" should be passed as the srcfile ar-
                                                               gument to parse().
[1] "Enc"       "encoding"       "filename"                        The other situation in which source references
[4] "timestamp" "wd"                                           are likely to be created in R code is when calling


The R Journal Vol. 2/2, December 2010                                                                 ISSN 2073-4859
18                                                                         C ONTRIBUTED R ESEARCH A RTICLES


source(). The source() function calls parse(), cre-        In this simple example it is easy to see where the
ating the source references, and then evaluates the        problem occurred, but in a more complex function
resulting code. At this point newly created functions      it might not be so simple. To find it, we can convert
will have source references attached to the body of        the warning to an error using
the function.
                                                           > options(warn=2)
    The section “Breakpoints” below discusses how
to make sure that source references are created in         and then re-run the code to generate an error. After
package code.                                              generating the error, we can display a stack trace:
                                                           > traceback()
Using source references                                    5: doWithOneRestart(return(expr), restart)
                                                           4: withOneRestart(expr, restarts[[1L]])
Error locations                                            3: withRestarts({
                                                                  .Internal(.signalCondition(
For the most part, users need not be concerned with                   simpleWarning(msg, call), msg, call))
source references, but they interact with them fre-               .Internal(.dfltWarn(msg, call))
quently. For example, error messages make use of              }, muffleWarning = function() NULL) at badabs.R#2
them to report on the location of syntax errors:           2: .signalSimpleWarning("the condition has length
                                                                  > 1 and only the first element will be used",
> source("error.R")                                               quote(if (x < 0) x <- -x)) at badabs.R#3
                                                           1: badabs(c(5, -10))
Error in source("error.R") : error.R:4:1: unexpected
'else'                                                     To read a traceback, start at the bottom. We see our
3:    print( "less" )                                      call from the console as line “1:”, and the warning
4: else                                                    being signalled in line “2:”. At the end of line “2:”
   ^                                                       it says that the warning originated “at badabs.R#3”,
                                                           i.e. line 3 of the ‘badabs.R’ file.
    A more recent addition is the use of source ref-
erences in code being executed. When R evaluates
a function, it evaluates each statement in turn, keep-
                                                           Breakpoints
ing track of any associated source references. As of       Users may also make use of source references when
R 2.10.0, these are reported by the debugging sup-         setting breakpoints. The trace() function lets us set
port functions traceback(), browser(), recover(),          breakpoints in particular R functions, but we need to
and dump.frames(), and are returned as an attribute        specify which function and where to do the setting.
on each element returned by sys.calls(). For ex-           The setBreakpoint() function is a more friendly
ample, consider the function shown in Figure 2.            front end that uses source references to construct a
                                                           call to trace(). For example, if we wanted to set a
1: # Compute   the absolute value                          breakpoint on ‘badabs.R’ line 3, we could use
2: badabs <-   function(x) {
3:   if (x <   0)                                          > setBreakpoint("badabs.R#3")
4:      x <-   -x
5:   x                                                     D:svnpaperssrcrefsbadabs.R#3:
6: }                                                        badabs step 2 in <environment: R_GlobalEnv>

                                                           This tells us that we have set a breakpoint in step 2 of
        Figure 2: The contents of ‘badabs.R’.              the function badabs found in the global environment.
                                                           When we run it, we will see
    This function is syntactically correct, and works
to calculate the absolute value of scalar values, but is   > badabs( c(5, -10) )
not a valid way to calculate the absolute values of the
elements of a vector, and when called it will generate     badabs.R#3
an incorrect result and a warning:                         Called from: badabs(c(5, -10))

> source("badabs.R")                                       Browse[1]>
> badabs( c(5, -10) )
                                                           telling us that we have broken into the browser at the
[1]   5 -10                                                requested line, and it is waiting for input. We could
                                                           then examine x, single step through the code, or do
Warning message:                                           any other action of which the browser is capable.
In if (x < 0) x <- -x :                                        By default, most packages are built without
  the condition has length > 1 and only the first          source reference information, because it adds quite
  element will be used                                     substantially to the size of the code. However, setting


The R Journal Vol. 2/2, December 2010                                                             ISSN 2073-4859
C ONTRIBUTED R ESEARCH A RTICLES                                                                                19


the environment variable R_KEEP_PKG_SOURCE=yes            length 6 with a class and "srcfile" attribute. It is
before installing a source package will tell R to keep    hard to measure exactly how much space this takes
the source references, and then breakpoints may be        because much is shared with other source references,
set in package source code. The envir argument to         but it is on the order of 100 bytes per reference.
setBreakpoints() will need to be set in order to tell     Clearly a more efficient design is possible, at the ex-
it to search outside the global environment when set-     pense of moving support code to C from R. As part of
ting breakpoints.                                         this move, the use of environments for the "srcfile"
                                                          attribute could be dropped: they were used as the
                                                          only available R-level reference objects. For develop-
The #line directive
                                                          ers, this means that direct access to particular parts
In some cases, R source code is written by a program,     of a source reference should be localized as much as
not by a human being. For example, Sweave() ex-           possible: They should write functions to extract par-
tracts lines of code from Sweave documents before         ticular information, and use those functions where
sending the lines to R for parsing and evaluation. To     needed, rather than extracting information directly.
support such preprocessors, the R 2.10.0 parser rec-      Then, if the implementation changes, only those ex-
ognizes a new directive of the form                       tractor functions will need to be updated.
                                                              Finally, source level debugging could be imple-
#line nn "filename"                                       mented to make use of source references, to single
where nn is an integer. As with the same-named            step through the actual source files, rather than dis-
directive in the C language, this tells the parser to     playing a line at a time as the browser() does.
assume that the next line of source is line nn from
the given filename for the purpose of constructing
source references. The Sweave() function doesn’t          Bibliography
currently make use of this, but in the future, it (and
                                                          D. Murdoch. Parsing Rd files. 2010. URL http:
other preprocessors) could output #line directives
                                                            //developer.r-project.org/parseRd.pdf.
so that source references and syntax errors refer to
the original source location rather than to an inter-     D. Murdoch and S. Urbanek. The new R help system.
mediate file.                                                The R Journal, 1/2:60–65, 2009.
    The #line directive was a late addition to R
2.10.0. Support for this in Sweave() appeared in R        L. Tierney. codetools: Code Analysis Tools for R, 2009. R
2.12.0.                                                      package version 0.2-2.


The future                                                Duncan Murdoch
                                                          Dept. of Statistical and Actuarial Sciences
The source reference structure could be improved.         University of Western Ontario
First, it adds quite a lot of bulk to R objects in mem-   London, Ontario, Canada
ory. Each source reference is an integer vector of        murdoch@stats.uwo.ca




The R Journal Vol. 2/2, December 2010                                                              ISSN 2073-4859
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2
R journal 2010-2

More Related Content

What's hot

M.Sc Dissertation: Simple Digital Libraries
M.Sc Dissertation: Simple Digital LibrariesM.Sc Dissertation: Simple Digital Libraries
M.Sc Dissertation: Simple Digital LibrariesLighton Phiri
 
Group Violence Intervention: Implementation Guide
Group Violence Intervention: Implementation GuideGroup Violence Intervention: Implementation Guide
Group Violence Intervention: Implementation GuidePatricia Hall
 
Xeriscape Conversion Study - Southern Nevada Water Authority
Xeriscape Conversion Study - Southern Nevada Water AuthorityXeriscape Conversion Study - Southern Nevada Water Authority
Xeriscape Conversion Study - Southern Nevada Water AuthorityEric851q
 
Analysis of messy data vol i designed experiments 2nd ed
Analysis of messy data vol i designed experiments 2nd edAnalysis of messy data vol i designed experiments 2nd ed
Analysis of messy data vol i designed experiments 2nd edJavier Buitrago Gantiva
 
A Case Study of a New High School Choir at CAIS
A Case Study of a New High School Choir at CAISA Case Study of a New High School Choir at CAIS
A Case Study of a New High School Choir at CAISSelana Kong
 
NEWCOMB-BENFORD’S LAW APPLICATIONS TO ELECTORAL PROCESSES, BIOINFORMATICS, AN...
NEWCOMB-BENFORD’S LAW APPLICATIONS TO ELECTORAL PROCESSES, BIOINFORMATICS, AN...NEWCOMB-BENFORD’S LAW APPLICATIONS TO ELECTORAL PROCESSES, BIOINFORMATICS, AN...
NEWCOMB-BENFORD’S LAW APPLICATIONS TO ELECTORAL PROCESSES, BIOINFORMATICS, AN...David Torres
 
Poverty in a rising Africa
Poverty in a rising AfricaPoverty in a rising Africa
Poverty in a rising AfricaJohan Westerholm
 
The Hidden Frontier of Forest Degradation
The Hidden Frontier of Forest DegradationThe Hidden Frontier of Forest Degradation
The Hidden Frontier of Forest Degradationdaveganz
 
Monitoring and evaluation_plan____a_practical_guide_to_prepare_good_quality_m...
Monitoring and evaluation_plan____a_practical_guide_to_prepare_good_quality_m...Monitoring and evaluation_plan____a_practical_guide_to_prepare_good_quality_m...
Monitoring and evaluation_plan____a_practical_guide_to_prepare_good_quality_m...Malik Khalid Mehmood
 
Dimensional modeling in a bi environment
Dimensional modeling in a bi environmentDimensional modeling in a bi environment
Dimensional modeling in a bi environmentdivjeev
 
Guide to Hydrological Practices: Data Acquisition and Processing, Analysis, F...
Guide to Hydrological Practices: Data Acquisition and Processing, Analysis, F...Guide to Hydrological Practices: Data Acquisition and Processing, Analysis, F...
Guide to Hydrological Practices: Data Acquisition and Processing, Analysis, F...indiawrm
 
A Bilevel Optimization Approach to Machine Learning
A Bilevel Optimization Approach to Machine LearningA Bilevel Optimization Approach to Machine Learning
A Bilevel Optimization Approach to Machine Learningbutest
 
Sanbi Biodiversity Series 111
Sanbi Biodiversity Series 111Sanbi Biodiversity Series 111
Sanbi Biodiversity Series 111melissagmoye
 

What's hot (19)

Rand rr2637
Rand rr2637Rand rr2637
Rand rr2637
 
M.Sc Dissertation: Simple Digital Libraries
M.Sc Dissertation: Simple Digital LibrariesM.Sc Dissertation: Simple Digital Libraries
M.Sc Dissertation: Simple Digital Libraries
 
Group Violence Intervention: Implementation Guide
Group Violence Intervention: Implementation GuideGroup Violence Intervention: Implementation Guide
Group Violence Intervention: Implementation Guide
 
Xeriscape Conversion Study - Southern Nevada Water Authority
Xeriscape Conversion Study - Southern Nevada Water AuthorityXeriscape Conversion Study - Southern Nevada Water Authority
Xeriscape Conversion Study - Southern Nevada Water Authority
 
Analysis of messy data vol i designed experiments 2nd ed
Analysis of messy data vol i designed experiments 2nd edAnalysis of messy data vol i designed experiments 2nd ed
Analysis of messy data vol i designed experiments 2nd ed
 
A Case Study of a New High School Choir at CAIS
A Case Study of a New High School Choir at CAISA Case Study of a New High School Choir at CAIS
A Case Study of a New High School Choir at CAIS
 
NEWCOMB-BENFORD’S LAW APPLICATIONS TO ELECTORAL PROCESSES, BIOINFORMATICS, AN...
NEWCOMB-BENFORD’S LAW APPLICATIONS TO ELECTORAL PROCESSES, BIOINFORMATICS, AN...NEWCOMB-BENFORD’S LAW APPLICATIONS TO ELECTORAL PROCESSES, BIOINFORMATICS, AN...
NEWCOMB-BENFORD’S LAW APPLICATIONS TO ELECTORAL PROCESSES, BIOINFORMATICS, AN...
 
Poverty in a rising Africa
Poverty in a rising AfricaPoverty in a rising Africa
Poverty in a rising Africa
 
The Hidden Frontier of Forest Degradation
The Hidden Frontier of Forest DegradationThe Hidden Frontier of Forest Degradation
The Hidden Frontier of Forest Degradation
 
Monitoring and evaluation_plan____a_practical_guide_to_prepare_good_quality_m...
Monitoring and evaluation_plan____a_practical_guide_to_prepare_good_quality_m...Monitoring and evaluation_plan____a_practical_guide_to_prepare_good_quality_m...
Monitoring and evaluation_plan____a_practical_guide_to_prepare_good_quality_m...
 
Dimensional modeling in a bi environment
Dimensional modeling in a bi environmentDimensional modeling in a bi environment
Dimensional modeling in a bi environment
 
Knustthesis
KnustthesisKnustthesis
Knustthesis
 
Rand rr3242 (1)
Rand rr3242 (1)Rand rr3242 (1)
Rand rr3242 (1)
 
25 quick formative assessments
25 quick formative assessments25 quick formative assessments
25 quick formative assessments
 
Guide to Hydrological Practices: Data Acquisition and Processing, Analysis, F...
Guide to Hydrological Practices: Data Acquisition and Processing, Analysis, F...Guide to Hydrological Practices: Data Acquisition and Processing, Analysis, F...
Guide to Hydrological Practices: Data Acquisition and Processing, Analysis, F...
 
A Bilevel Optimization Approach to Machine Learning
A Bilevel Optimization Approach to Machine LearningA Bilevel Optimization Approach to Machine Learning
A Bilevel Optimization Approach to Machine Learning
 
Sanbi Biodiversity Series 111
Sanbi Biodiversity Series 111Sanbi Biodiversity Series 111
Sanbi Biodiversity Series 111
 
Slima thesis carnegie mellon ver march 2001
Slima thesis carnegie mellon ver march 2001Slima thesis carnegie mellon ver march 2001
Slima thesis carnegie mellon ver march 2001
 
sg248293
sg248293sg248293
sg248293
 

Viewers also liked

Modelo curricular por competencia.doc
Modelo curricular por competencia.docModelo curricular por competencia.doc
Modelo curricular por competencia.docAldo Thomas
 
El curriculum educativo Panameño
El  curriculum educativo PanameñoEl  curriculum educativo Panameño
El curriculum educativo Panameñocei montessori
 
Modelo curricular panama
Modelo curricular panamaModelo curricular panama
Modelo curricular panamaalexandergracia
 
Curriculo nivel primario Dominicano 2016
Curriculo nivel primario Dominicano 2016Curriculo nivel primario Dominicano 2016
Curriculo nivel primario Dominicano 2016Julio Cesar Silverio
 
Innovaciones curriculares República Dominicana
 Innovaciones curriculares República Dominicana Innovaciones curriculares República Dominicana
Innovaciones curriculares República Dominicanayanairaseverino
 
Componentes del Currículo Dominicano
Componentes del Currículo DominicanoComponentes del Currículo Dominicano
Componentes del Currículo Dominicanoclaradiazb
 
Fundamentos del Enfoque por Competencias
Fundamentos del Enfoque por CompetenciasFundamentos del Enfoque por Competencias
Fundamentos del Enfoque por CompetenciasJ Obeso Velázquez
 
El diseño curricular sus tareas, componentes y niveles
El diseño curricular sus tareas, componentes y nivelesEl diseño curricular sus tareas, componentes y niveles
El diseño curricular sus tareas, componentes y nivelesManduk Padron
 

Viewers also liked (9)

Modelo curricular por competencia.doc
Modelo curricular por competencia.docModelo curricular por competencia.doc
Modelo curricular por competencia.doc
 
El curriculum educativo Panameño
El  curriculum educativo PanameñoEl  curriculum educativo Panameño
El curriculum educativo Panameño
 
Modelo curricular panama
Modelo curricular panamaModelo curricular panama
Modelo curricular panama
 
Curriculo nivel primario Dominicano 2016
Curriculo nivel primario Dominicano 2016Curriculo nivel primario Dominicano 2016
Curriculo nivel primario Dominicano 2016
 
Innovaciones curriculares República Dominicana
 Innovaciones curriculares República Dominicana Innovaciones curriculares República Dominicana
Innovaciones curriculares República Dominicana
 
Nuevo diseño curricular dominicano
Nuevo diseño curricular dominicano Nuevo diseño curricular dominicano
Nuevo diseño curricular dominicano
 
Componentes del Currículo Dominicano
Componentes del Currículo DominicanoComponentes del Currículo Dominicano
Componentes del Currículo Dominicano
 
Fundamentos del Enfoque por Competencias
Fundamentos del Enfoque por CompetenciasFundamentos del Enfoque por Competencias
Fundamentos del Enfoque por Competencias
 
El diseño curricular sus tareas, componentes y niveles
El diseño curricular sus tareas, componentes y nivelesEl diseño curricular sus tareas, componentes y niveles
El diseño curricular sus tareas, componentes y niveles
 

Similar to R journal 2010-2

R journal 2010-1
R journal 2010-1R journal 2010-1
R journal 2010-1Ajay Ohri
 
Documentation - LibraryRandom
Documentation - LibraryRandomDocumentation - LibraryRandom
Documentation - LibraryRandomMichel Alves
 
Stochastic Processes and Simulations – A Machine Learning Perspective
Stochastic Processes and Simulations – A Machine Learning PerspectiveStochastic Processes and Simulations – A Machine Learning Perspective
Stochastic Processes and Simulations – A Machine Learning Perspectivee2wi67sy4816pahn
 
Specification of the Linked Media Layer
Specification of the Linked Media LayerSpecification of the Linked Media Layer
Specification of the Linked Media LayerLinkedTV
 
An Optical Character Recognition Engine For Graphical Processing Units
An Optical Character Recognition Engine For Graphical Processing UnitsAn Optical Character Recognition Engine For Graphical Processing Units
An Optical Character Recognition Engine For Graphical Processing UnitsKelly Lipiec
 
A Comparative Study Of Generalized Arc-Consistency Algorithms
A Comparative Study Of Generalized Arc-Consistency AlgorithmsA Comparative Study Of Generalized Arc-Consistency Algorithms
A Comparative Study Of Generalized Arc-Consistency AlgorithmsSandra Long
 
1026332_Master_Thesis_Eef_Lemmens_BIS_269.pdf
1026332_Master_Thesis_Eef_Lemmens_BIS_269.pdf1026332_Master_Thesis_Eef_Lemmens_BIS_269.pdf
1026332_Master_Thesis_Eef_Lemmens_BIS_269.pdfssusere02009
 
Data struture and aligorism
Data struture and aligorismData struture and aligorism
Data struture and aligorismmbadhi barnabas
 
Ibm watson analytics
Ibm watson analyticsIbm watson analytics
Ibm watson analyticsLeon Henry
 
ChucK_manual
ChucK_manualChucK_manual
ChucK_manualber-yann
 
Design of a bionic hand using non invasive interface
Design of a bionic hand using non invasive interfaceDesign of a bionic hand using non invasive interface
Design of a bionic hand using non invasive interfacemangal das
 

Similar to R journal 2010-2 (20)

R journal 2010-1
R journal 2010-1R journal 2010-1
R journal 2010-1
 
Liebman_Thesis.pdf
Liebman_Thesis.pdfLiebman_Thesis.pdf
Liebman_Thesis.pdf
 
Documentation - LibraryRandom
Documentation - LibraryRandomDocumentation - LibraryRandom
Documentation - LibraryRandom
 
Stochastic Processes and Simulations – A Machine Learning Perspective
Stochastic Processes and Simulations – A Machine Learning PerspectiveStochastic Processes and Simulations – A Machine Learning Perspective
Stochastic Processes and Simulations – A Machine Learning Perspective
 
Specification of the Linked Media Layer
Specification of the Linked Media LayerSpecification of the Linked Media Layer
Specification of the Linked Media Layer
 
An Optical Character Recognition Engine For Graphical Processing Units
An Optical Character Recognition Engine For Graphical Processing UnitsAn Optical Character Recognition Engine For Graphical Processing Units
An Optical Character Recognition Engine For Graphical Processing Units
 
Analytical-Chemistry
Analytical-ChemistryAnalytical-Chemistry
Analytical-Chemistry
 
outiar.pdf
outiar.pdfoutiar.pdf
outiar.pdf
 
A Comparative Study Of Generalized Arc-Consistency Algorithms
A Comparative Study Of Generalized Arc-Consistency AlgorithmsA Comparative Study Of Generalized Arc-Consistency Algorithms
A Comparative Study Of Generalized Arc-Consistency Algorithms
 
1026332_Master_Thesis_Eef_Lemmens_BIS_269.pdf
1026332_Master_Thesis_Eef_Lemmens_BIS_269.pdf1026332_Master_Thesis_Eef_Lemmens_BIS_269.pdf
1026332_Master_Thesis_Eef_Lemmens_BIS_269.pdf
 
Dsa book
Dsa bookDsa book
Dsa book
 
Dsa
DsaDsa
Dsa
 
Data struture and aligorism
Data struture and aligorismData struture and aligorism
Data struture and aligorism
 
Ibm watson analytics
Ibm watson analyticsIbm watson analytics
Ibm watson analytics
 
IBM Watson Content Analytics Redbook
IBM Watson Content Analytics RedbookIBM Watson Content Analytics Redbook
IBM Watson Content Analytics Redbook
 
R data
R dataR data
R data
 
Lecturenotesstatistics
LecturenotesstatisticsLecturenotesstatistics
Lecturenotesstatistics
 
ChucK_manual
ChucK_manualChucK_manual
ChucK_manual
 
Notes econometricswithr
Notes econometricswithrNotes econometricswithr
Notes econometricswithr
 
Design of a bionic hand using non invasive interface
Design of a bionic hand using non invasive interfaceDesign of a bionic hand using non invasive interface
Design of a bionic hand using non invasive interface
 

More from Ajay Ohri

Introduction to R ajay Ohri
Introduction to R ajay OhriIntroduction to R ajay Ohri
Introduction to R ajay OhriAjay Ohri
 
Introduction to R
Introduction to RIntroduction to R
Introduction to RAjay Ohri
 
Social Media and Fake News in the 2016 Election
Social Media and Fake News in the 2016 ElectionSocial Media and Fake News in the 2016 Election
Social Media and Fake News in the 2016 ElectionAjay Ohri
 
Download Python for R Users pdf for free
Download Python for R Users pdf for freeDownload Python for R Users pdf for free
Download Python for R Users pdf for freeAjay Ohri
 
Install spark on_windows10
Install spark on_windows10Install spark on_windows10
Install spark on_windows10Ajay Ohri
 
Ajay ohri Resume
Ajay ohri ResumeAjay ohri Resume
Ajay ohri ResumeAjay Ohri
 
Statistics for data scientists
Statistics for  data scientistsStatistics for  data scientists
Statistics for data scientistsAjay Ohri
 
National seminar on emergence of internet of things (io t) trends and challe...
National seminar on emergence of internet of things (io t)  trends and challe...National seminar on emergence of internet of things (io t)  trends and challe...
National seminar on emergence of internet of things (io t) trends and challe...Ajay Ohri
 
Tools and techniques for data science
Tools and techniques for data scienceTools and techniques for data science
Tools and techniques for data scienceAjay Ohri
 
How Big Data ,Cloud Computing ,Data Science can help business
How Big Data ,Cloud Computing ,Data Science can help businessHow Big Data ,Cloud Computing ,Data Science can help business
How Big Data ,Cloud Computing ,Data Science can help businessAjay Ohri
 
Training in Analytics and Data Science
Training in Analytics and Data ScienceTraining in Analytics and Data Science
Training in Analytics and Data ScienceAjay Ohri
 
Software Testing for Data Scientists
Software Testing for Data ScientistsSoftware Testing for Data Scientists
Software Testing for Data ScientistsAjay Ohri
 
A Data Science Tutorial in Python
A Data Science Tutorial in PythonA Data Science Tutorial in Python
A Data Science Tutorial in PythonAjay Ohri
 
How does cryptography work? by Jeroen Ooms
How does cryptography work?  by Jeroen OomsHow does cryptography work?  by Jeroen Ooms
How does cryptography work? by Jeroen OomsAjay Ohri
 
Using R for Social Media and Sports Analytics
Using R for Social Media and Sports AnalyticsUsing R for Social Media and Sports Analytics
Using R for Social Media and Sports AnalyticsAjay Ohri
 
Kush stats alpha
Kush stats alpha Kush stats alpha
Kush stats alpha Ajay Ohri
 
Analyze this
Analyze thisAnalyze this
Analyze thisAjay Ohri
 

More from Ajay Ohri (20)

Introduction to R ajay Ohri
Introduction to R ajay OhriIntroduction to R ajay Ohri
Introduction to R ajay Ohri
 
Introduction to R
Introduction to RIntroduction to R
Introduction to R
 
Social Media and Fake News in the 2016 Election
Social Media and Fake News in the 2016 ElectionSocial Media and Fake News in the 2016 Election
Social Media and Fake News in the 2016 Election
 
Pyspark
PysparkPyspark
Pyspark
 
Download Python for R Users pdf for free
Download Python for R Users pdf for freeDownload Python for R Users pdf for free
Download Python for R Users pdf for free
 
Install spark on_windows10
Install spark on_windows10Install spark on_windows10
Install spark on_windows10
 
Ajay ohri Resume
Ajay ohri ResumeAjay ohri Resume
Ajay ohri Resume
 
Statistics for data scientists
Statistics for  data scientistsStatistics for  data scientists
Statistics for data scientists
 
National seminar on emergence of internet of things (io t) trends and challe...
National seminar on emergence of internet of things (io t)  trends and challe...National seminar on emergence of internet of things (io t)  trends and challe...
National seminar on emergence of internet of things (io t) trends and challe...
 
Tools and techniques for data science
Tools and techniques for data scienceTools and techniques for data science
Tools and techniques for data science
 
How Big Data ,Cloud Computing ,Data Science can help business
How Big Data ,Cloud Computing ,Data Science can help businessHow Big Data ,Cloud Computing ,Data Science can help business
How Big Data ,Cloud Computing ,Data Science can help business
 
Training in Analytics and Data Science
Training in Analytics and Data ScienceTraining in Analytics and Data Science
Training in Analytics and Data Science
 
Tradecraft
Tradecraft   Tradecraft
Tradecraft
 
Software Testing for Data Scientists
Software Testing for Data ScientistsSoftware Testing for Data Scientists
Software Testing for Data Scientists
 
Craps
CrapsCraps
Craps
 
A Data Science Tutorial in Python
A Data Science Tutorial in PythonA Data Science Tutorial in Python
A Data Science Tutorial in Python
 
How does cryptography work? by Jeroen Ooms
How does cryptography work?  by Jeroen OomsHow does cryptography work?  by Jeroen Ooms
How does cryptography work? by Jeroen Ooms
 
Using R for Social Media and Sports Analytics
Using R for Social Media and Sports AnalyticsUsing R for Social Media and Sports Analytics
Using R for Social Media and Sports Analytics
 
Kush stats alpha
Kush stats alpha Kush stats alpha
Kush stats alpha
 
Analyze this
Analyze thisAnalyze this
Analyze this
 

Recently uploaded

20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-pyJamie (Taka) Wang
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...DianaGray10
 
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsIgniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsSafe Software
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdfPedro Manuel
 
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?IES VE
 
VoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXVoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXTarek Kalaji
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7DianaGray10
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.YounusS2
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioChristian Posta
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UbiTrack UK
 
Computer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsComputer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsSeth Reyes
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...Aggregage
 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfDaniel Santiago Silva Capera
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaborationbruanjhuli
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAshyamraj55
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024D Cloud Solutions
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding TeamAdam Moalla
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxUdaiappa Ramachandran
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfinfogdgmi
 

Recently uploaded (20)

20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-py
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
 
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsIgniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdf
 
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?
 
201610817 - edge part1
201610817 - edge part1201610817 - edge part1
201610817 - edge part1
 
VoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXVoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBX
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and Istio
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
 
Computer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsComputer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and Hazards
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptx
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdf
 

R journal 2010-2

  • 1. The Journal Volume 2/2, December 2010 A peer-reviewed, open-access publication of the R Foundation for Statistical Computing Contents Editorial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 Contributed Research Articles Solving Differential Equations in R . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 Source References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 hglm: A Package for Fitting Hierarchical Generalized Linear Models . . . . . . . . . . . . . . 20 dclone: Data Cloning in R . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 stringr: modern, consistent string processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 Bayesian Estimation of the GARCH(1,1) Model with Student-t Innovations . . . . . . . . . . . 41 cudaBayesreg: Bayesian Computation in CUDA . . . . . . . . . . . . . . . . . . . . . . . . . . 48 binGroup: A Package for Group Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 The RecordLinkage Package: Detecting Errors in Data . . . . . . . . . . . . . . . . . . . . . . . 61 spikeslab: Prediction and Variable Selection Using Spike and Slab Regression . . . . . . . . . 68 From the Core What’s New? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 News and Notes useR! 2010 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 Forthcoming Events: useR! 2011 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 Changes in R . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 Changes on CRAN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 News from the Bioconductor Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 R Foundation News . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
  • 2. 2 The Journal is a peer-reviewed publication of the R Foundation for Statistical Computing. Communications regarding this publication should be addressed to the editors. All articles are copyrighted by the respective authors. Prospective authors will find detailed and up-to-date submission instructions on the Journal’s homepage. Editor-in-Chief: Peter Dalgaard Center for Statistics Copenhagen Business School Solbjerg Plads 3 2000 Frederiksberg Denmark Editorial Board: Vince Carey, Martyn Plummer, and Heather Turner. Editor Programmer’s Niche: Bill Venables Editor Help Desk: Uwe Ligges Editor Book Reviews: G. Jay Kerns Department of Mathematics and Statistics Youngstown State University Youngstown, Ohio 44555-0002 USA gkerns@ysu.edu R Journal Homepage: http://journal.r-project.org/ Email of editors and editorial board: firstname.lastname@R-project.org The R Journal is indexed/abstracted by EBSCO, DOAJ. The R Journal Vol. 2/2, December 2010 ISSN 2073-4859
  • 3. 3 Editorial by Peter Dalgaard putationally demanding. Especially for transient so- lutions in two or three spatial dimensions, comput- Welcome to the 2nd issue of the 2nd volume of The ers simply were not fast enough in a time where nu- R Journal. merical performance was measured in fractions of a I am pleased to say that we can offer ten peer- MFLOPS (million floating point operations per sec- reviewed papers this time. Many thanks go to the ond). Today, the relevant measure is GFLOPS and authors and the reviewers who ensure that our arti- we should be getting much closer to practicable so- cles live up to high academic standards. The tran- lutions. sition from R News to The R Journal is now nearly However, raw computing power is not sufficient; completed. We are now listed by EBSCO and the there are non-obvious aspects of numerical analysis registration procedure with Thomson Reuters is well that should not be taken lightly, notably issues of sta- on the way. We thereby move into the framework of bility and accuracy. There is a reason that numerical scientific journals and away from the grey-literature analysis is a scientific field in its own right. newsletter format; however, it should be stressed From a statistician’s perspective, being able to fit that R News was a fairly high-impact piece of grey models to actual data is of prime importance. For literature: A cited reference search turned up around models with only a few parameters, you can get 1300 references to the just over 200 papers that were quite far with nonlinear regression and a good nu- published in R News! merical solver. For ill-posed problems with func- I am particularly happy to see the paper by tional parameters (the so-called “inverse problems”), Soetart et al. on differential equation solvers. In and for stochastic differential equations, there still many fields of research, the natural formulation of appears to be work to be done. Soetart et al. do not models is via local relations at the infinitesimal level, go into these issues, but I hope that their paper will rather than via closed form mathematical expres- be an inspiration for further work. sions, and quite often solutions rely on simplifying With this issue, in accordance with the rotation assumptions. My own PhD work, some 25 years ago, rules of the Editorial Board, I step down as Editor- concerned diffusion of substances within the human in-Chief, to be succeded by Heather Turner. Heather eye, with the ultimate goal of measuring the state has already played a key role in the transition from of the blood-retinal barrier. Solutions for this prob- R News to The R Journal, as well as being probably lem could be obtained for short timespans, if one as- the most efficient Associate Editor on the Board. The sumed that the eye was completely spherical. Ex- Editorial Board will be losing last year’s Editor-in- tending the solutions to accommodate more realis- Chief, Vince Carey, who has now been on board for tic models (a necessity for fitting actual experimental the full four years. We shall miss Vince, who has al- data) resulted in quite unwieldy formulas, and even ways been good for a precise and principled argu- then, did not give you the kind of modelling freedom ment and in the process taught at least me several that you really wanted to elucidate the scientific is- new words. We also welcome Hadley Wickham as sue. a new Associate Editor and member of the Editorial In contrast, numerical procedures could fairly Board. easily be set up and modified to better fit reality. Season’s greetings and best wishes for a happy The main problem was that they tended to be com- 2011! The R Journal Vol. 2/2, December 2010 ISSN 2073-4859
  • 4. 4 The R Journal Vol. 2/2, December 2010 ISSN 2073-4859
  • 5. C ONTRIBUTED R ESEARCH A RTICLES 5 Solving Differential Equations in R by Karline Soetaert, Thomas Petzoldt and R. Woodrow complete repertoire of differential equations can be Setzer1 numerically solved. More specifically, the following types of differen- Abstract Although R is still predominantly ap- tial equations can now be handled with add-on pack- plied for statistical analysis and graphical repre- ages in R: sentation, it is rapidly becoming more suitable for mathematical computing. One of the fields • Initial value problems (IVP) of ordinary differ- where considerable progress has been made re- ential equations (ODE), using package deSolve cently is the solution of differential equations. (Soetaert et al., 2010b). Here we give a brief overview of differential • Initial value differential algebraic equations equations that can now be solved by R. (DAE), package deSolve . • Initial value partial differential equations (PDE), packages deSolve and ReacTran Introduction (Soetaert and Meysman, 2010). Differential equations describe exchanges of matter, • Boundary value problems (BVP) of ordinary energy, information or any other quantities, often as differential equations, using package bvpSolve they vary in time and/or space. Their thorough ana- (Soetaert et al., 2010a), or ReacTran and root- lytical treatment forms the basis of fundamental the- Solve (Soetaert, 2009). ories in mathematics and physics, and they are in- • Initial value delay differential equations creasingly applied in chemistry, life sciences and eco- (DDE), using packages deSolve or PBSddes- nomics. olve (Couture-Beil et al., 2010). Differential equations are solved by integration, but unfortunately, for many practical applications • Stochastic differential equations (SDE), using in science and engineering, systems of differential packages sde (Iacus, 2008) and pomp (King equations cannot be integrated to give an analytical et al., 2008). solution, but rather need to be solved numerically. In this short overview, we demonstrate how to Many advanced numerical algorithms that solve solve the first four types of differential equations differential equations are available as (open-source) in R. It is beyond the scope to give an exhaustive computer codes, written in programming languages overview about the vast number of methods to solve like FORTRAN or C and that are available from these differential equations and their theory, so the repositories like GAMS (http://gams.nist.gov/) or reader is encouraged to consult one of the numer- NETLIB (www.netlib.org). ous textbooks (e.g., Ascher and Petzold, 1998; Press Depending on the problem, mathematical for- et al., 2007; Hairer et al., 2009; Hairer and Wanner, malisations may consist of ordinary differential 2010; LeVeque, 2007, and many others). equations (ODE), partial differential equations In addition, a large number of analytical and nu- (PDE), differential algebraic equations (DAE), or de- merical methods exists for the analysis of bifurca- lay differential equations (DDE). In addition, a dis- tions and stability properties of deterministic sys- tinction is made between initial value problems (IVP) tems, the efficient simulation of stochastic differen- and boundary value problems (BVP). tial equations or the estimation of parameters. We With the introduction of R-package odesolve do not deal with these methods here. (Setzer, 2001), it became possible to use R (R Devel- opment Core Team, 2009) for solving very simple ini- tial value problems of systems of ordinary differen- Types of differential equations tial equations, using the lsoda algorithm of Hind- marsh (1983) and Petzold (1983). However, many Ordinary differential equations real-life applications, including physical transport modeling, equilibrium chemistry or the modeling of Ordinary differential equations describe the change electrical circuits, could not be solved with this pack- of a state variable y as a function f of one independent age. variable t (e.g., time or space), of y itself, and, option- Since odesolve, much effort has been made to ally, a set of other variables p, often called parameters: improve R’s capabilities to handle differential equa- tions, mostly by incorporating published and well dy y = = f (t, y, p) tested numerical codes, such that now a much more dt 1 The views expressed in this paper are those of the authors and do not necessarily reflect the views or policies of the U.S. Environmental Protection Agency The R Journal Vol. 2/2, December 2010 ISSN 2073-4859
  • 6. 6 C ONTRIBUTED R ESEARCH A RTICLES In many cases, solving differential equations re- rithm (Press et al., 2007). Two more stable solu- quires the introduction of extra conditions. In the fol- tion methods implement a mono implicit Runge- lowing, we concentrate on the numerical treatment Kutta (MIRK) code, based on the FORTRAN code of two classes of problems, namely initial value prob- twpbvpC (Cash and Mazzia, 2005), and the collocation lems and boundary value problems. method, based on the FORTRAN code colnew (Bader and Ascher, 1987). Some boundary value problems can also be solved with functions from packages Re- Initial value problems acTran and rootSolve (see below). If the extra conditions are specified at the initial value of the independent variable, the differential equa- Partial differential equations tions are called initial value problems (IVP). There exist two main classes of algorithms to nu- In contrast to ODEs where there is only one indepen- merically solve such problems, so-called Runge-Kutta dent variable, partial differential equations (PDE) formulas and linear multistep formulas (Hairer et al., contain partial derivatives with respect to more than 2009; Hairer and Wanner, 2010). The latter contains one independent variable, for instance t (time) and two important families, the Adams family and the x (a spatial dimension). To distinguish this type backward differentiation formulae (BDF). of equations from ODEs, the derivatives are repre- Another important distinction is between explicit sented with the ∂ symbol, e.g. and implicit methods, where the latter methods can ∂y ∂y solve a particular class of equations (so-called “stiff” = f (t, x, y, , p) ∂t ∂x equations) where explicit methods have problems with stability and efficiency. Stiffness occurs for in- Partial differential equations can be solved by sub- stance if a problem has components with different dividing one or more of the continuous independent rates of variation according to the independent vari- variables in a number of grid cells, and replacing the able. Very often there will be a tradeoff between us- derivatives by discrete, algebraic approximate equa- ing explicit methods that require little work per inte- tions, so-called finite differences (cf. LeVeque, 2007; gration step and implicit methods which are able to Hundsdorfer and Verwer, 2003). take larger integration steps, but need (much) more For time-varying cases, it is customary to discre- work for one step. tise the spatial coordinate(s) only, while time is left in In R, initial value problems can be solved with continuous form. This is called the method-of-lines, functions from package deSolve (Soetaert et al., and in this way, one PDE is translated into a large 2010b), which implements many solvers from ODE- number of coupled ordinary differential equations, PACK (Hindmarsh, 1983), the code vode (Brown that can be solved with the usual initial value prob- et al., 1989), the differential algebraic equation solver lem solvers (cf. Hamdi et al., 2007). This applies to daspk (Brenan et al., 1996), all belonging to the linear parabolic PDEs such as the heat equation, and to hy- multistep methods, and comprising Adams meth- perbolic PDEs such as the wave equation. ods as well as backward differentiation formulae. For time-invariant problems, usually all indepen- The former methods are explicit, the latter implicit. dent variables are discretised, and the derivatives ap- In addition, this package contains a de-novo imple- proximated by algebraic equations, which are solved mentation of a rather general Runge-Kutta solver by root-finding techniques. This technique applies to based on Dormand and Prince (1980); Prince and elliptic PDEs. Dormand (1981); Bogacki and Shampine (1989); Cash R-package ReacTran provides functions to gener- and Karp (1990) and using ideas from Butcher (1987) ate finite differences on a structured grid. After that, and Press et al. (2007). Finally, the implicit Runge- the resulting time-varying cases can be solved with Kutta method radau (Hairer et al., 2009) has been specially-designed functions from package deSolve, added recently. while time-invariant cases can be solved with root- solving methods from package rootSolve . Boundary value problems Differential algebraic equations If the extra conditions are specified at different Differential-algebraic equations (DAE) contain a values of the independent variable, the differen- mixture of differential ( f ) and algebraic equations tial equations are called boundary value problems (g), the latter e.g. for maintaining mass-balance con- (BVP). A standard textbook on this subject is Ascher ditions: et al. (1995). Package bvpSolve (Soetaert et al., 2010a) imple- y = f (t, y, p) ments three methods to solve boundary value prob- 0 = g(t, y, p) lems. The simplest solution method is the single shooting method, which combines initial value prob- Important for the solution of a DAE is its index. lem integration with a nonlinear root finding algo- The index of a DAE is the number of differentiations The R Journal Vol. 2/2, December 2010 ISSN 2073-4859
  • 7. C ONTRIBUTED R ESEARCH A RTICLES 7 needed until a system consisting only of ODEs is ob- Numerical accuracy tained. Function daspk (Brenan et al., 1996) from pack- Numerical solution of a system of differential equa- age deSolve solves (relatively simple) DAEs of index tions is an approximation and therefore prone to nu- at most 1, while function radau (Hairer et al., 2009) merical errors, originating from several sources: solves DAEs of index up to 3. 1. time step and accuracy order of the solver, 2. floating point arithmetics, Implementation details 3. properties of the differential system and stabil- ity of the solution algorithm. The implemented solver functions are explained by means of the ode-function, used for the solution of For methods with automatic stepsize selection, initial value problems. The interfaces to the other accuracy of the computation can be adjusted us- solvers have an analogous definition: ing the non-negative arguments atol (absolute tol- erance) and rtol (relative tolerance), which control ode(y, times, func, parms, method = c("lsoda", the local errors of the integration. "lsode", "lsodes", "lsodar", Like R itself, all solvers use double-precision "vode", "daspk", "euler", "rk4", floating-point arithmetics according to IEEE Stan- "ode23", "ode45", "radau", "bdf", dard 754 (2008), which means that it can represent "bdf_d", "adams", "impAdams", numbers between approx. ±2.25 10−308 to approx. "impAdams_d"), ...) ±1.8 10308 and with 16 significant digits. It is there- fore not advisable to set rtol below 10−16 , except set- To use this, the system of differential equations ting it to zero with the intention to use absolute tol- can be defined as an R-function (func) that computes erance exclusively. derivatives in the ODE system (the model definition) The solvers provided by the packages presented according to the independent variable (e.g. time t). below have proven to be quite robust in most prac- func can also be a function in a dynamically loaded tical cases, however users should always be aware shared library (Soetaert et al., 2010c) and, in addition, about the problems and limitations of numerical some solvers support also the supply of an analyti- methods and carefully check results for plausibil- cally derived function of partial derivatives (Jacobian ity. The section “Troubleshooting” in the package vi- matrix). gnette (Soetaert et al., 2010d) should be consulted as If func is an R-function, it must be defined as: a first source for solving typical problems. func <- function(t, y, parms, ...) where t is the actual value of the independent vari- Examples able (e.g. the current time point in the integration), y is the current estimate of the variables in the ODE An initial value ODE system, parms is the parameter vector and ... can be used to pass additional arguments to the function. Consider the famous van der Pol equation (van der The return value of func should be a list, whose Pol and van der Mark, 1927), that describes a non- first element is a vector containing the derivatives conservative oscillator with non-linear damping and of y with respect to t, and whose next elements are which was originally developed for electrical cir- optional global values that can be recorded at each cuits employing vacuum tubes. The oscillation is de- point in times. The derivatives must be specified in scribed by means of a 2nd order ODE: the same order as the state variables y. Depending on the algorithm specified in argu- z − µ (1 − z2 ) z + z = 0 ment method, numerical simulation proceeds either Such a system can be routinely rewritten as a system exactly at the time steps specified in times, or us- of two 1st order ODEs, if we substitute z with y1 and ing time steps that are independent from times and z with y2 : where the output is generated by interpolation. With the exception of method euler and several fixed-step y1 = y2 Runge-Kutta methods all algorithms have automatic y2 = µ · (1 − y1 2 ) · y2 − y1 time stepping, which can be controlled by setting ac- curacy requirements (see below) or by using optional There is one parameter, µ, and two differential arguments like hini (initial time step), hmin (minimal variables, y1 and y2 with initial values (at t = 0): time step) and hmax (maximum time step). Specific details, e.g. about the applied interpolation methods y 1( t =0) = 2 can be found in the manual pages and the original y 2( t =0) = 0 literature cited there. The R Journal Vol. 2/2, December 2010 ISSN 2073-4859
  • 8. 8 C ONTRIBUTED R ESEARCH A RTICLES The van der Pol equation is often used as a test problem for ODE solvers, as, for large µ, its dy- IVP ODE, stiff namics consists of parts where the solution changes 2 very slowly, alternating with regions of very sharp changes. This “stiffness” makes the equation quite 1 challenging to solve. 0 y In R, this model is implemented as a function (vdpol) whose inputs are the current time (t), the val- −1 ues of the state variables (y), and the parameters (mu); −2 the function returns a list with as first element the derivatives, concatenated. 0 500 1000 1500 2000 2500 3000 time vdpol <- function (t, y, mu) { list(c( y[2], Figure 1: Solution of the van der Pol equation, an mu * (1 - y[1]^2) * y[2] - y[1] initial value ordinary differential equation, stiff case, )) µ = 1000. } After defining the initial condition of the state IVP ODE, nonstiff variables (yini), the model is solved, and output 2 written at selected time points (times), using de- Solve’s integration function ode. The default rou- 1 tine lsoda, which is invoked by ode automatically 0 y switches between stiff and non-stiff methods, de- pending on the problem (Petzold, 1983). −1 We run the model for a typically stiff (mu = 1000) −2 and nonstiff (mu = 1) situation: 0 5 10 15 20 25 30 library(deSolve) time yini <- c(y1 = 2, y2 = 0) stiff <- ode(y = yini, func = vdpol, times = 0:3000, parms = 1000) Figure 2: Solution of the van der Pol equation, an initial value ordinary differential equation, non-stiff nonstiff <- ode(y = yini, func = vdpol, case, µ = 1. times = seq(0, 30, by = 0.01), parms = 1) solver non-stiff stiff The model returns a matrix, of class deSolve, ode23 0.37 271.19 with in its first column the time values, followed by lsoda 0.26 0.23 the values of the state variables: adams 0.13 616.13 bdf 0.15 0.22 head(stiff, n = 3) radau 0.53 0.72 time y1 y2 Table 1: Comparison of solvers for a stiff and a [1,] 0 2.000000 0.0000000000 non-stiff parametrisation of the van der Pol equation [2,] 1 1.999333 -0.0006670373 (time in seconds, mean values of ten simulations on [3,] 2 1.998666 -0.0006674088 an AMD AM2 X2 3000 CPU). Figures are generated using the S3 plot method A comparison of timings for two explicit solvers, for objects of class deSolve: the Runge-Kutta method (ode23) and the adams method, with the implicit multistep solver (bdf, plot(stiff, type = "l", which = "y1", backward differentiation formula) shows a clear ad- lwd = 2, ylab = "y", vantage for the latter in the stiff case (Figure 1). The main = "IVP ODE, stiff") default solver (lsoda) is not necessarily the fastest, but shows robust behavior due to automatic stiff- plot(nonstiff, type = "l", which = "y1", ness detection. It uses the explicit multistep Adams lwd = 2, ylab = "y", method for the non-stiff case and the BDF method main = "IVP ODE, nonstiff") for the stiff case. The accuracy is comparable for all The R Journal Vol. 2/2, December 2010 ISSN 2073-4859
  • 9. C ONTRIBUTED R ESEARCH A RTICLES 9 solvers with atol = rtol = 10−6 , the default. xi <- 0.0025 analytic <- cos(pi * x) + exp((x - 1)/sqrt(xi)) + exp(-(x + 1)/sqrt(xi)) A boundary value ODE max(abs(analytic - twp[, 2])) The webpage of Jeff Cash (Cash, 2009) contains many [1] 7.788209e-10 test cases, including their analytical solution (see be- low), that BVP solvers should be able to solve. We A similar low discrepancy (4 · 10−11 ) is noted for use equation no. 14 from this webpage as an exam- the ξ = 0.0001 as solved by bvpcol; the shooting ple: method is considerably less precise (1.4 · 10−5 ), al- though the same tolerance (atol = 10−8 ) was used ξy − y = −(ξπ 2 + 1) cos(πx ) for all runs. on the interval [−1, 1], and subject to the boundary The plot shows how the shape of the solution conditions: is affected by the parameter ξ, becoming more and more steep near the boundaries, and therefore more y( x=−1) = 0 and more difficult to solve, as ξ gets smaller. y( x=+1) = 0 plot(shoot[, 1], shoot[, 2], type = "l", lwd = 2, ylim = c(-1, 1), col = "blue", The second-order equation first is rewritten as two xlab = "x", ylab = "y", main = "BVP ODE") first-order equations: lines(twp[, 1], twp[, 2], col = "red", lwd = 2) lines(coll[, 1], coll[, 2], col = "green", lwd = 2) y1 = y2 legend("topright", legend = c("0.01", "0.0025", y2 = 1/ξ · (y1 − (ξπ 2 + 1) cos(πx )) "0.0001"), col = c("blue", "red", "green"), title = expression(xi), lwd = 2) It is implemented in R as: BVP ODE Prob14 <- function(x, y, xi) { ξ 1.0 list(c( 0.01 y[2], 0.0025 0.0001 1/xi * (y[1] - (xi*pi*pi+1) * cos(pi*x)) 0.5 )) } 0.0 With decreasing values of ξ, this problem becomes y increasingly difficult to solve. We use three val- ues of ξ, and solve the problem with the shooting, −0.5 the MIRK and the collocation method (Ascher et al., 1995). Note how the initial conditions yini and the con- ditions at the end of the integration interval yend −1.0 are specified, where NA denotes that the value is not −1.0 −0.5 0.0 0.5 1.0 known. The independent variable is called x here (rather than times in ode). x library(bvpSolve) x <- seq(-1, 1, by = 0.01) Figure 3: Solution of the BVP ODE problem, for dif- shoot <- bvpshoot(yini = c(0, NA), ferent values of parameter ξ. yend = c(0, NA), x = x, parms = 0.01, func = Prob14) Differential algebraic equations twp <- bvptwp(yini = c(0, NA), yend = c(0, The so called “Rober problem” describes an auto- NA), x = x, parms = 0.0025, catalytic reaction (Robertson, 1966) between three func = Prob14) chemical species, y1 , y2 and y3 . The problem can be coll <- bvpcol(yini = c(0, NA), formulated either as an ODE (Mazzia and Magherini, yend = c(0, NA), x = x, parms = 1e-04, 2008), or as a DAE: func = Prob14) y1 = −0.04y1 + 104 y2 y3 The numerical approximation generated by bvptwp y2 = 0.04y1 − 104 y2 y3 − 3107 y2 2 is very close to the analytical solution, e.g. for ξ = 0.0025: 1 = y1 + y2 + y3 The R Journal Vol. 2/2, December 2010 ISSN 2073-4859
  • 10. 10 C ONTRIBUTED R ESEARCH A RTICLES IVP DAE where the first two equations are differential y1 y2 equations that specify the dynamics of chemical species y1 and y2 , while the third algebraic equation 0.8 ensures that the summed concentration of the three 2e−05 conc. conc. species remains 1. 0.4 The DAE has to be specified by the residual func- 0e+00 tion instead of the rates of change (as in ODEs). 0.0 1e−06 1e+00 1e+06 1e−06 1e+00 1e+06 r1 = −y1 − 0.04y1 + 104 y2 y3 time time r2 = −y2 + 0.04y1 − 104 y2 y3 − 3 107 y2 2 y3 error r3 = −1 + y1 + y2 + y3 1e−09 0.8 Implemented in R this becomes: −2e−09 conc. conc. 0.4 daefun<-function(t, y, dy, parms) { −5e−09 res1 <- - dy[1] - 0.04 * y[1] + 0.0 1e4 * y[2] * y[3] 1e−06 1e+00 1e+06 1e−06 1e+00 1e+06 res2 <- - dy[2] + 0.04 * y[1] - 1e4 * y[2] * y[3] - 3e7 * y[2]^2 time time res3 <- y[1] + y[2] + y[3] - 1 list(c(res1, res2, res3), error = as.vector(y[1] + y[2] + y[3]) - 1) Figure 4: Solution of the DAE problem for the sub- } stances y1 , y2 , y3 ; mass balance error: deviation of to- tal sum from one. yini <- c(y1 = 1, y2 = 0, y3 = 0) dyini <- c(-0.04, 0.04, 0) times <- 10 ^ seq(-6,6,0.1) Partial differential equations In partial differential equations (PDE), the func- The input arguments of function daefun are the tion has several independent variables (e.g. time and current time (t), the values of the state variables and depth) and contains their partial derivatives. their derivatives (y, dy) and the parameters (parms). Many partial differential equations can be solved It returns the residuals, concatenated and an output by numerical approximation (finite differencing) af- variable, the error in the algebraic equation. The lat- ter rewriting them as a set of ODEs (see Schiesser, ter is added to check upon the accuracy of the results. 1991; LeVeque, 2007; Hundsdorfer and Verwer, For DAEs solved with daspk, both the state vari- 2003). ables and their derivatives need to be initialised (y Functions tran.1D, tran.2D, and tran.3D from and dy). Here we make sure that the initial condi- R package ReacTran (Soetaert and Meysman, 2010) tions for y obey the algebraic constraint, while also implement finite difference approximations of the the initial condition of the derivatives is consistent diffusive-advective transport equation which, for the with the dynamics. 1-D case, is: library(deSolve) 1 ∂ ∂C ∂ print(system.time(out <-daspk(y = yini, − · Ax −D · − ( Ax · u · C) dy = dyini, times = times, res = daefun, Ax ∂x ∂x ∂x parms = NULL))) Here D is the “diffusion coefficient”, u is the “advec- user system elapsed tion rate”, and A x is some property (e.g. surface area) 0.07 0.00 0.11 that depends on the independent variable, x. It should be noted that the accuracy of the finite An S3 plot method can be used to plot all vari- difference approximations can not be specified in the ables at once: ReacTran functions. It is up to the user to make sure that the solutions are sufficiently accurate, e.g. by in- plot(out, ylab = "conc.", xlab = "time", cluding more grid points. type = "l", lwd = 2, log = "x") mtext("IVP DAE", side = 3, outer = TRUE, line = -1) One dimensional PDE There is a very fast initial change in concentra- Diffusion-reaction models are a fundamental class of tions, mainly due to the quick reaction between y1 models which describe how concentration of matter, and y2 and amongst y2 . After that, the slow reaction energy, information, etc. evolves in space and time of y1 with y2 causes the system to change much more under the influence of diffusive transport and trans- smoothly. This is typical for stiff problems. formation (Soetaert and Herman, 2009). The R Journal Vol. 2/2, December 2010 ISSN 2073-4859
  • 11. C ONTRIBUTED R ESEARCH A RTICLES 11 As an example, consider the 1-D diffusion- user system elapsed reaction model in [0, 10]: 0.02 0.00 0.02 ∂C ∂ ∂C The values of the state-variables (y) are plotted = D· −Q ∂t ∂x ∂x against the distance, in the middle of the grid cells (Grid$x.mid). with C the concentration, t the time, x the distance from the origin, Q, the consumption rate, and with plot (Grid$x.mid, std$y, type = "l", boundary conditions (values at the model edges): lwd = 2, main = "steady-state PDE", xlab = "x", ylab = "C", col = "red") ∂C =0 ∂x x=0 steady−state PDE Cx=10 = Cext 20 To solve this model in R, first the 1-D model Grid is 10 defined; it divides 10 cm (L) into 1000 boxes (N). 0 C library(ReacTran) −10 Grid <- setup.grid.1D(N = 1000, L = 10) −30 The model equation includes a transport term, approximated by ReacTran function tran.1D and 0 2 4 6 8 10 a consumption term (Q). The downstream bound- x ary condition, prescribed as a concentration (C.down) needs to be specified, the zero-gradient at the up- stream boundary is the default: Figure 5: Steady-state solution of the 1-D diffusion- reaction model. pde1D <-function(t, C, parms) { tran <- tran.1D(C = C, D = D, The analytical solution compares well with the C.down = Cext, dx = Grid)$dC numerical approximation: list(tran - Q) # return value: rate of change } analytical <- Q/2/D*(Grid$x.mid^2 - 10^2) + Cext max(abs(analytical - std$y)) The model parameters are: [1] 1.250003e-05 D <- 1 # diffusion constant Q <- 1 # uptake rate Next the model is run dynamically for 100 time Cext <- 20 units using deSolve function ode.1D, and starting with a uniform concentration: In a first application, the model is solved to steady-state, which retrieves the condition where the require(deSolve) concentrations are invariant: times <- seq(0, 100, by = 1) ∂ ∂C system.time( 0= D· −Q out <- ode.1D(y = rep(1, Grid$N), ∂x ∂x times = times, func = pde1D, parms = NULL, nspec = 1) In R, steady-state conditions can be estimated using ) functions from package rootSolve which implement amongst others a Newton-Raphson algorithm (Press user system elapsed et al., 2007). For 1-dimensional models, steady.1D is 0.61 0.02 0.63 most efficient. The initial “guess” of the steady-state solution (y) is unimportant; here we take simply N Here, out is a matrix, whose 1st column contains random numbers. Argument nspec = 1 informs the the output times, and the next columns the values of solver that only one component is described. the state variables in the different boxes; we print the Although a system of 1000 equations needs to be first columns of the last three rows of this matrix: solved, this takes only a fraction of a second: tail(out[, 1:4], n = 3) library(rootSolve) print(system.time( time 1 2 3 std <- steady.1D(y = runif(Grid$N), [99,] 98 -27.55783 -27.55773 -27.55754 func = pde1D, parms = NULL, nspec = 1) [100,] 99 -27.61735 -27.61725 -27.61706 )) [101,] 100 -27.67542 -27.67532 -27.67513 The R Journal Vol. 2/2, December 2010 ISSN 2073-4859
  • 12. 12 C ONTRIBUTED R ESEARCH A RTICLES We plot the result using a blue-yellow-red color R code, or in compiled languages. However, com- scheme, and using deSolve’s S3 method image. Fig- pared to odesolve, it includes a more complete set ure 6 shows that, as time proceeds, gradients develop of integrators, and a more extensive set of options to from the uniform distribution, until the system al- tune the integration routines, it provides more com- most reaches steady-state at the end of the simula- plete output, and has extended the applicability do- tion. main to include also DDEs, DAEs and PDEs. Thanks to the DAE solvers daspk (Brenan et al., image(out, xlab = "time, days", 1996) and radau (Hairer and Wanner, 2010) it is now ylab = "Distance, cm", also possible to model electronic circuits or equilib- main = "PDE", add.contour = TRUE) rium chemical systems. These problems are often of index ≤ 1. In many mechanical systems, physical constraints lead to DAEs of index up to 3, and these PDE more complex problems can be solved with radau. 1.0 The inclusion of BVP and PDE solvers have 15 10 opened up the application area to the field of re- 5 active transport modelling (Soetaert and Meysman, 0.8 0 2010), such that R can now be used to describe quan- −5 tities that change not only in time, but also along one or more spatial axes. We use it to model how 0.6 −10 Distance, cm −15 ecosystems change along rivers, or in sediments, but it could equally serve to model the growth of a tu- 0.4 −20 mor in human brains, or the dispersion of toxicants in human tissues. −25 The open source matrix language R has great po- 0.2 tential for dynamic modelling, and the tools cur- rently available are suitable for solving a wide va- riety of practical and scientific problems. The perfor- 0.0 0 20 40 60 80 100 mance is sufficient even for larger systems, especially time, days when models can be formulated using matrix alge- bra or are implemented in compiled languages like C or Fortran (Soetaert et al., 2010b). Indeed, there Figure 6: Dynamic solution of the 1-D diffusion- is emerging interest in performing statistical analysis reaction model. on differential equations, e.g. in package nlmeODE (Tornøe et al., 2004) for fitting non-linear mixed- It should be noted that the steady-state model is effects models using differential equations, pack- effectively a boundary value problem, while the tran- age FME (Soetaert and Petzoldt, 2010) for sensitiv- sient model is a prototype of a “parabolic” partial dif- ity analysis, parameter estimation and Markov chain ferential equation (LeVeque, 2007). Monte-Carlo analysis or package ccems for combina- Whereas R can also solve the other two main torially complex equilibrium model selection (Radi- classes of PDEs, i.e. of the “hyperbolic” and “ellip- voyevitch, 2008). tic” type, it is well beyond the scope of this paper to However, there is ample room for extensions elaborate on that. and improvements. For instance, the PDE solvers are quite memory intensive, and could benefit from the implementation of sparse matrix solvers that are Discussion more efficient in this respect2 . In addition, the meth- ods implemented in ReacTran handle equations de- Although R is still predominantly applied for statis- fined on very simple shapes only. Extending the tical analysis and graphical representation, it is more PDE approach to finite elements (Strang and Fix, and more suitable for mathematical computing, e.g. 1973) would open up the application domain of R to in the field of matrix algebra (Bates and Maechler, any irregular geometry. Other spatial discretisation 2008). Thanks to the differential equation solvers, R schemes could be added, e.g. for use in fluid dynam- is also emerging as a powerful environment for dy- ics. namic simulations (Petzoldt, 2003; Soetaert and Her- Our models are often applied to derive unknown man, 2009; Stevens, 2009). parameters by fitting them against data; this relies on The new package deSolve has retained all the the availability of apt parameter fitting algorithms. funtionalities of its predecessor odesolve (Setzer, Discussion of these items is highly welcomed, in 2001), such as the potential to define models both in the new special interest group about dynamic mod- 2 for instance, the “preconditioned Krylov” part of the daspk method is not yet supported 3 https://stat.ethz.ch/mailman/listinfo/r-sig-dynamic-models The R Journal Vol. 2/2, December 2010 ISSN 2073-4859
  • 13. C ONTRIBUTED R ESEARCH A RTICLES 13 els3 in R. E. Hairer, S. P. Noørsett, and G. Wanner. Solving Ordi- nary Differential Equations I: Nonstiff Problems. Sec- ond Revised Edition. Springer-Verlag, Heidelberg, Bibliography 2009. U. Ascher, R. Mattheij, and R. Russell. Numerical So- S. Hamdi, W. E. Schiesser, and G. W. Griffiths. lution of Boundary Value Problems for Ordinary Dif- Method of lines. Scholarpedia, 2(7):2859, 2007. ferential Equations. Philadelphia, PA, 1995. A. C. Hindmarsh. ODEPACK, a systematized collec- tion of ODE solvers. In R. Stepleman, editor, Scien- U. M. Ascher and L. R. Petzold. Computer Methods tific Computing, Vol. 1 of IMACS Transactions on Sci- for Ordinary Differential Equations and Differential- entific Computation, pages 55–64. IMACS / North- Algebraic Equations. SIAM, Philadelphia, 1998. Holland, Amsterdam, 1983. G. Bader and U. Ascher. A new basis implementa- W. Hundsdorfer and J. Verwer. Numerical Solution of tion for a mixed order boundary value ODE solver. Time-Dependent Advection-Diffusion-Reaction Equa- SIAM J. Scient. Stat. Comput., 8:483–500, 1987. tions. Springer Series in Computational Mathematics. D. Bates and M. Maechler. Matrix: A Matrix Package Springer-Verlag, Berlin, 2003. for R, 2008. R package version 0.999375-9. S. M. Iacus. sde: Simulation and Inference for Stochas- tic Differential Equations, 2008. R package version P. Bogacki and L. Shampine. A 3(2) pair of Runge- 2.0.3. Kutta formulas. Appl. Math. Lett., 2:1–9, 1989. IEEE Standard 754. Ieee standard for floating-point K. E. Brenan, S. L. Campbell, and L. R. Pet- arithmetic, Aug 2008. zold. Numerical Solution of Initial-Value Problems in Differential-Algebraic Equations. SIAM Classics in A. A. King, E. L. Ionides, and C. M. Breto. pomp: Sta- Applied Mathematics, 1996. tistical Inference for Partially Observed Markov Pro- cesses, 2008. R package version 0.21-3. P. N. Brown, G. D. Byrne, and A. C. Hindmarsh. VODE, a variable-coefficient ode solver. SIAM J. R. J. LeVeque. Finite Difference Methods for Ordinary Sci. Stat. Comput., 10:1038–1051, 1989. and Partial Differential Equations, Steady State and Time Dependent Problems. SIAM, 2007. J. C. Butcher. The Numerical Analysis of Ordinary Dif- ferential Equations, Runge-Kutta and General Linear F. Mazzia and C. Magherini. Test Set for Initial Value Methods. Wiley, Chichester, New York, 1987. Problem Solvers, release 2.4. Department of Mathe- matics, University of Bari, Italy, 2008. URL http:// J. R. Cash. 35 Test Problems for Two Way Point Bound- pitagora.dm.uniba.it/~testset. Report 4/2008. ary Value Problems, 2009. URL http://www.ma.ic. ac.uk/~jcash/BVP_software/PROBLEMS.PDF. L. R. Petzold. Automatic selection of methods for solving stiff and nonstiff systems of ordinary dif- J. R. Cash and A. H. Karp. A variable order ferential equations. SIAM J. Sci. Stat. Comput., 4: Runge-Kutta method for initial value problems 136–148, 1983. with rapidly varying right-hand sides. ACM Trans- T. Petzoldt. R as a simulation platform in ecological actions on Mathematical Software, 16:201–222, 1990. modelling. R News, 3(3):8–16, 2003. J. R. Cash and F. Mazzia. A new mesh selection W. H. Press, S. A. Teukolsky, W. T. Vetterling, and algorithm, based on conditioning, for two-point B. P. Flannery. Numerical Recipes: The Art of Scien- boundary value codes. J. Comput. Appl. Math., 184: tific Computing. Cambridge University Press, 3rd 362–381, 2005. edition, 2007. A. Couture-Beil, J. T. Schnute, and R. Haigh. PB- P. J. Prince and J. R. Dormand. High order embed- Sddesolve: Solver for Delay Differential Equations, ded Runge-Kutta formulae. J. Comput. Appl. Math., 2010. R package version 1.08.11. 7:67–75, 1981. J. R. Dormand and P. J. Prince. A family of embed- R Development Core Team. R: A Language and Envi- ded Runge-Kutta formulae. J. Comput. Appl. Math., ronment for Statistical Computing. R Foundation for 6:19–26, 1980. Statistical Computing, Vienna, Austria, 2009. URL http://www.R-project.org. ISBN 3-900051-07-0. E. Hairer and G. Wanner. Solving Ordinary Differen- tial Equations II: Stiff and Differential-Algebraic Prob- T. Radivoyevitch. Equilibrium model selection: lems. Second Revised Edition. Springer-Verlag, Hei- dTTP induced R1 dimerization. BMC Systems Bi- delberg, 2010. ology, 2:15, 2008. The R Journal Vol. 2/2, December 2010 ISSN 2073-4859
  • 14. 14 C ONTRIBUTED R ESEARCH A RTICLES H. H. Robertson. The solution of a set of reaction rate equations. In J. Walsh, editor, Numerical Analysis: Karline Soetaert An Introduction, pages 178–182. Academic Press, Netherlands Institute of Ecology London, 1966. K.Soetaert@nioo.knaw.nl W. E. Schiesser. The Numerical Method of Lines: In- Thomas Petzoldt tegration of Partial Differential Equations. Academic Technische Universität Dresden Press, San Diego, 1991. Thomas.Petzoldt@tu-dresden.de R. W. Setzer. The odesolve Package: Solvers for Ordi- R. Woodrow Setzer nary Differential Equations, 2001. R package version US Environmental Protection Agency 0.1-1. Setzer.Woodrow@epamail.epa.gov K. Soetaert. rootSolve: Nonlinear Root Finding, Equi- librium and Steady-State Analysis of Ordinary Differ- ential Equations, 2009. R package version 1.6. K. Soetaert and P. M. J. Herman. A Practical Guide to Ecological Modelling. Using R as a Simulation Plat- form. Springer, 2009. ISBN 978-1-4020-8623-6. K. Soetaert and F. Meysman. ReacTran: Reactive Transport Modelling in 1D, 2D and 3D, 2010. R pack- age version 1.2. K. Soetaert and T. Petzoldt. Inverse modelling, sensi- tivity and Monte Carlo analysis in R using package FME. Journal of Statistical Software, 33(3):1–28, 2010. URL http://www.jstatsoft.org/v33/i03/. K. Soetaert, J. R. Cash, and F. Mazzia. bvpSolve: Solvers for Boundary Value Problems of Ordinary Dif- ferential Equations, 2010a. R package version 1.2. K. Soetaert, T. Petzoldt, and R. W. Setzer. Solving dif- ferential equations in R: Package deSolve. Journal of Statistical Software, 33(9):1–25, 2010b. ISSN 1548- 7660. URL http://www.jstatsoft.org/v33/i09. K. Soetaert, T. Petzoldt, and R. W. Setzer. R Pack- age deSolve: Writing Code in Compiled Languages, 2010c. deSolve vignette - R package version 1.8. K. Soetaert, T. Petzoldt, and R. W. Setzer. R Package deSolve: Solving Initial Value Differential Equations, 2010d. deSolve vignette - R package version 1.8. M. H. H. Stevens. A Primer of Ecology with R. Use R Series. Springer, 2009. ISBN: 978-0-387-89881-0. G. Strang and G. Fix. An Analysis of The Finite Element Method. Prentice Hall, 1973. C. W. Tornøe, H. Agersø, E. N. Jonsson, H. Mad- sen, and H. A. Nielsen. Non-linear mixed-effects pharmacokinetic/pharmacodynamic modelling in nlme using differential equations. Computer Meth- ods and Programs in Biomedicine, 76:31–40, 2004. B. van der Pol and J. van der Mark. Frequency de- multiplication. Nature, 120:363–364, 1927. The R Journal Vol. 2/2, December 2010 ISSN 2073-4859
  • 15. C ONTRIBUTED R ESEARCH A RTICLES 15 Table 2: Summary of the main functions that solve differential equations. Function Package Description ode deSolve IVP of ODEs, full, banded or arbitrary sparse Jacobian ode.1D deSolve IVP of ODEs resulting from 1-D reaction-transport problems ode.2D deSolve IVP of ODEs resulting from 2-D reaction-transport problems ode.3D deSolve IVP of ODEs resulting from 3-D reaction-transport problems daspk deSolve IVP of DAEs of index ≤ 1, full or banded Jacobian radau deSolve IVP of DAEs of index ≤ 3, full or banded Jacobian dde PBSddesolve IVP of delay differential equations, based on Runge-Kutta formu- lae dede deSolve IVP of delay differential equations, based on Adams and BDF for- mulae bvpshoot bvpSolve BVP of ODEs; the shooting method bvptwp bvpSolve BVP of ODEs; mono-implicit Runge-Kutta formula bvpcol bvpSolve BVP of ODEs; collocation formula steady rootSolve steady-state of ODEs; full, banded or arbitrary sparse Jacobian steady.1D rootSolve steady-state of ODEs resulting from 1-D reaction-transport prob- lems steady.2D rootSolve steady-state of ODEs resulting from 2-D reaction-transport prob- lems steady.3D rootSolve steady-state of ODEs resulting from 3-D reaction-transport prob- lems tran.1D ReacTran numerical approximation of 1-D advective-diffusive transport problems tran.2D ReacTran numerical approximation of 2-D advective-diffusive transport problems tran.3D ReacTran numerical approximation of 3-D advective-diffusive transport problems Table 3: Summary of the auxilliary functions that solve differential equations. Function Package Description lsoda deSolve IVP ODEs, full or banded Jacobian, automatic choice for stiff or non-stiff method lsodar deSolve same as lsoda, but includes a root-solving procedure. lsode, vode deSolve IVP ODEs, full or banded Jacobian, user specifies if stiff or non- stiff lsodes deSolve IVP ODEs, arbitrary sparse Jacobian, stiff method rk4, rk, euler deSolve IVP ODEs, using Runge-Kutta and Euler methods zvode deSolve IVP ODEs, same as vode, but for complex variables runsteady rootSolve steady-state ODEs by dynamically running, full or banded Jaco- bian stode rootSolve steady-state ODEs by Newton-Raphson method, full or banded Jacobian stodes rootSolve steady-state ODEs by Newton-Raphson method, arbitrary sparse Jacobian The R Journal Vol. 2/2, December 2010 ISSN 2073-4859
  • 16. 16 C ONTRIBUTED R ESEARCH A RTICLES Source References by Duncan Murdoch [1] 3 Abstract Since version 2.10.0, R includes ex- > typeof(parsed) panded support for source references in R code [1] "expression" and ‘.Rd’ files. This paper describes the origin and purposes of source references, and current The first element is the assignment, the second ele- and future support for them. ment is the for loop, and the third is the single x at the end: One of the strengths of R is that it allows “compu- > parsed[[1]] tation on the language”, i.e. the parser returns an R object which can be manipulated, not just evaluated. x <- 1:10 This has applications in quality control checks, de- > parsed[[2]] bugging, and elsewhere. For example, the codetools package (Tierney, 2009) examines the structure of for (i in x) { parsed source code to look for common program- print(i) ming errors. Functions marked by debug() can be } executed one statement at a time, and the trace() > parsed[[3]] function can insert debugging statements into any function. x Computing on the language is often enhanced by The first two elements are both of type "language", being able to refer to the original source code, rather and are made up of smaller components. The dif- than just to a deparsed (reconstructed) version of ference between an "expression" and a "language" it based on the parsed object. To support this, we object is mainly internal: the former is based on the added source references to R 2.5.0 in 2007. These are generic vector type (i.e. type "list"), whereas the attributes attached to the result of parse() or (as of latter is based on the "pairlist" type. Pairlists are 2.10.0) parse_Rd() to indicate where a particular part rarely encountered explicitly in any other context. of an object originated. In this article I will describe From a user point of view, they act just like generic their structure and how they are used in R. The arti- vectors. cle is aimed at developers who want to create debug- The third element x is of type "symbol". There are gers or other tools that use the source references, at other possible types, such as "NULL", "double", etc.: users who are curious about R internals, and also at essentially any simple R object could be an element. users who want to use the existing debugging facili- The comments in the source code and the white ties. The latter group may wish to skip over the gory space making up the indentation of the third line are details and go directly to the section “Using Source not part of the parsed object. References". The parse_Rd() function parses ‘.Rd’ documenta- tion files. It also returns a recursive structure contain- The R parsers ing objects of different types (Murdoch and Urbanek, 2009; Murdoch, 2010). We start with a quick introduction to the R parser. The parse() function returns an R object of type "expression". This is a list of statements; the state- Source reference structure ments can be of various types. For example, consider As described above, the result of parse() is essen- the R source shown in Figure 1. tially a list (the "expression" object) of objects that 1: x <- 1:10 # Initialize x may be lists (the "language" objects) themselves, and 2: for (i in x) { so on recursively. Each element of this structure from 3: print(i) # Print each entry the top down corresponds to some part of the source 4: } file used to create it: in our example, parse[[1]] cor- 5: x responds to the first line of ‘sample.R’, parse[[2]] is the second through fourth lines, and parse[[3]] is Figure 1: The contents of ‘sample.R’. the fifth line. The comments and indentation, though helpful If we parse this file, we obtain an expression of to the human reader, are not part of the parsed object. length 3: However, by default the parsed object does contain a "srcref" attribute: > parsed <- parse("sample.R") > length(parsed) > attr(parsed, "srcref") The R Journal Vol. 2/2, December 2010 ISSN 2073-4859
  • 17. C ONTRIBUTED R ESEARCH A RTICLES 17 [[1]] The "srcfile" attribute is actually an environment x <- 1:10 containing an encoding, a filename, a timestamp, and a working directory. These give information [[2]] about the file from which the parser was reading. for (i in x) { The reason it is an environment is that environments print(i) # Print each entry are reference objects: even though all three source ref- } erences contain this attribute, in actuality there is [[3]] only one copy stored. This was done to save mem- x ory, since there are often hundreds of source refer- ences from each file. Although it appears that the "srcref" attribute con- Source references in objects returned by tains the source, in fact it only references it, and the parse_Rd() use the same structure as those returned print.srcref() method retrieves it for printing. If by parse(). The main difference is that in Rd objects we remove the class from each element, we see the source references are attached to every component, true structure: whereas parse() only constructs source references for complete statements, not for their component > lapply(attr(parsed, "srcref"), unclass) parts, and they are attached to the container of the statements. Thus for example a braced list of state- [[1]] ments processed by parse() will receive a "srcref" [1] 1 1 1 9 1 9 attribute containing source references for each state- attr(,"srcfile") ment within, while the statements themselves will sample.R not hold their own source references, and sub- expressions within each statement will not generate [[2]] source references at all. In contrast the "srcref" at- [1] 2 1 4 1 1 1 attr(,"srcfile") tribute for a section in an ‘.Rd’ file will be a source sample.R reference for the whole section, and each component part in the section will have its own source reference. [[3]] [1] 5 1 5 1 1 1 attr(,"srcfile") Relation to the "source" attribute sample.R By default the R parser also creates an attribute Each element is a vector of 6 integers: (first line, first named "source" when it parses a function definition. byte, last line, last byte, first character, last character). When available, this attribute is used by default in The values refer to the position of the source for each lieu of deparsing to display the function definition. element in the original source file; the details of the It is unrelated to the "srcref" attribute, which is in- source file are contained in a "srcfile" attribute on tended to point to the source, rather than to duplicate each reference. the source. An integrated development environment The reason both bytes and characters are (IDE) would need to know the correspondence be- recorded in the source reference is historical. When tween R code in R and the true source, and "srcref" they were introduced, they were mainly used for re- attributes are intended to provide this. trieving source code for display; for this, bytes are needed. Since R 2.9.0, they have also been used to When are "srcref" attributes added? aid in error messages. Since some characters take up more than one byte, users need to be informed about As mentioned above, the parser adds a "srcref" character positions, not byte positions, and the last attribute by default. For this, it is assumes that two entries were added. options("keep.source") is left at its default setting The "srcfile" attribute is also not as simple as it of TRUE, and that parse() is given a filename as argu- looks. For example, ment file, or a character vector as argument text. In the latter case, there is no source file to refer- > srcref <- attr(parsed, "srcref")[[1]] ence, so parse() copies the lines of source into a > srcfile <- attr(srcref, "srcfile") "srcfilecopy" object, which is simply a "srcfile" > typeof(srcfile) object that contains a copy of the text. [1] "environment" Developers may wish to add source references in other situations. To do that, an object inheriting from > ls(srcfile) class "srcfile" should be passed as the srcfile ar- gument to parse(). [1] "Enc" "encoding" "filename" The other situation in which source references [4] "timestamp" "wd" are likely to be created in R code is when calling The R Journal Vol. 2/2, December 2010 ISSN 2073-4859
  • 18. 18 C ONTRIBUTED R ESEARCH A RTICLES source(). The source() function calls parse(), cre- In this simple example it is easy to see where the ating the source references, and then evaluates the problem occurred, but in a more complex function resulting code. At this point newly created functions it might not be so simple. To find it, we can convert will have source references attached to the body of the warning to an error using the function. > options(warn=2) The section “Breakpoints” below discusses how to make sure that source references are created in and then re-run the code to generate an error. After package code. generating the error, we can display a stack trace: > traceback() Using source references 5: doWithOneRestart(return(expr), restart) 4: withOneRestart(expr, restarts[[1L]]) Error locations 3: withRestarts({ .Internal(.signalCondition( For the most part, users need not be concerned with simpleWarning(msg, call), msg, call)) source references, but they interact with them fre- .Internal(.dfltWarn(msg, call)) quently. For example, error messages make use of }, muffleWarning = function() NULL) at badabs.R#2 them to report on the location of syntax errors: 2: .signalSimpleWarning("the condition has length > 1 and only the first element will be used", > source("error.R") quote(if (x < 0) x <- -x)) at badabs.R#3 1: badabs(c(5, -10)) Error in source("error.R") : error.R:4:1: unexpected 'else' To read a traceback, start at the bottom. We see our 3: print( "less" ) call from the console as line “1:”, and the warning 4: else being signalled in line “2:”. At the end of line “2:” ^ it says that the warning originated “at badabs.R#3”, i.e. line 3 of the ‘badabs.R’ file. A more recent addition is the use of source ref- erences in code being executed. When R evaluates a function, it evaluates each statement in turn, keep- Breakpoints ing track of any associated source references. As of Users may also make use of source references when R 2.10.0, these are reported by the debugging sup- setting breakpoints. The trace() function lets us set port functions traceback(), browser(), recover(), breakpoints in particular R functions, but we need to and dump.frames(), and are returned as an attribute specify which function and where to do the setting. on each element returned by sys.calls(). For ex- The setBreakpoint() function is a more friendly ample, consider the function shown in Figure 2. front end that uses source references to construct a call to trace(). For example, if we wanted to set a 1: # Compute the absolute value breakpoint on ‘badabs.R’ line 3, we could use 2: badabs <- function(x) { 3: if (x < 0) > setBreakpoint("badabs.R#3") 4: x <- -x 5: x D:svnpaperssrcrefsbadabs.R#3: 6: } badabs step 2 in <environment: R_GlobalEnv> This tells us that we have set a breakpoint in step 2 of Figure 2: The contents of ‘badabs.R’. the function badabs found in the global environment. When we run it, we will see This function is syntactically correct, and works to calculate the absolute value of scalar values, but is > badabs( c(5, -10) ) not a valid way to calculate the absolute values of the elements of a vector, and when called it will generate badabs.R#3 an incorrect result and a warning: Called from: badabs(c(5, -10)) > source("badabs.R") Browse[1]> > badabs( c(5, -10) ) telling us that we have broken into the browser at the [1] 5 -10 requested line, and it is waiting for input. We could then examine x, single step through the code, or do Warning message: any other action of which the browser is capable. In if (x < 0) x <- -x : By default, most packages are built without the condition has length > 1 and only the first source reference information, because it adds quite element will be used substantially to the size of the code. However, setting The R Journal Vol. 2/2, December 2010 ISSN 2073-4859
  • 19. C ONTRIBUTED R ESEARCH A RTICLES 19 the environment variable R_KEEP_PKG_SOURCE=yes length 6 with a class and "srcfile" attribute. It is before installing a source package will tell R to keep hard to measure exactly how much space this takes the source references, and then breakpoints may be because much is shared with other source references, set in package source code. The envir argument to but it is on the order of 100 bytes per reference. setBreakpoints() will need to be set in order to tell Clearly a more efficient design is possible, at the ex- it to search outside the global environment when set- pense of moving support code to C from R. As part of ting breakpoints. this move, the use of environments for the "srcfile" attribute could be dropped: they were used as the only available R-level reference objects. For develop- The #line directive ers, this means that direct access to particular parts In some cases, R source code is written by a program, of a source reference should be localized as much as not by a human being. For example, Sweave() ex- possible: They should write functions to extract par- tracts lines of code from Sweave documents before ticular information, and use those functions where sending the lines to R for parsing and evaluation. To needed, rather than extracting information directly. support such preprocessors, the R 2.10.0 parser rec- Then, if the implementation changes, only those ex- ognizes a new directive of the form tractor functions will need to be updated. Finally, source level debugging could be imple- #line nn "filename" mented to make use of source references, to single where nn is an integer. As with the same-named step through the actual source files, rather than dis- directive in the C language, this tells the parser to playing a line at a time as the browser() does. assume that the next line of source is line nn from the given filename for the purpose of constructing source references. The Sweave() function doesn’t Bibliography currently make use of this, but in the future, it (and D. Murdoch. Parsing Rd files. 2010. URL http: other preprocessors) could output #line directives //developer.r-project.org/parseRd.pdf. so that source references and syntax errors refer to the original source location rather than to an inter- D. Murdoch and S. Urbanek. The new R help system. mediate file. The R Journal, 1/2:60–65, 2009. The #line directive was a late addition to R 2.10.0. Support for this in Sweave() appeared in R L. Tierney. codetools: Code Analysis Tools for R, 2009. R 2.12.0. package version 0.2-2. The future Duncan Murdoch Dept. of Statistical and Actuarial Sciences The source reference structure could be improved. University of Western Ontario First, it adds quite a lot of bulk to R objects in mem- London, Ontario, Canada ory. Each source reference is an integer vector of murdoch@stats.uwo.ca The R Journal Vol. 2/2, December 2010 ISSN 2073-4859