Template Model Builder (TMB) has a number of built-in macros, such as DATA_VECTOR, for importing R data objects into your C++ likelihood function. These built-ins work fine for the vast majority of models.

However, I recently had a case that just didn’t fit well with these provided macros. I had a likelihood function that needed to loop over a variable number of datasets, where each dataset was a vector of variable length.

The conventional TMB approach would be to join all those vectors into a single one and then use an additional indexing vector to loop over parts of it. This works, of course, but it adds bookkeeping in both R and C++, and generally makes the code a pain to work with. Ideally, I could just provide TMB with a list object where all elements are known to be vectors. Then inside the likelihood function, I could simply iterate over each vector.
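To make the difference concrete, here is a minimal sketch of both layouts in plain C++ (standard library only, no TMB or R). The function names `means_flat` and `means_nested`, and the choice of per-dataset means as the computation, are assumptions made for illustration; they are not part of TMB.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Flat layout: all datasets concatenated into `values`, with `lengths[i]`
// giving the number of entries that belong to dataset i.
// Returns one mean per dataset.
std::vector<double> means_flat(const std::vector<double>& values,
                               const std::vector<std::size_t>& lengths) {
  std::vector<double> out;
  std::size_t start = 0;  // offset of the current dataset inside `values`
  for (std::size_t len : lengths) {
    assert(len > 0 && start + len <= values.size());
    double total = std::accumulate(values.begin() + start,
                                   values.begin() + start + len, 0.0);
    out.push_back(total / static_cast<double>(len));
    start += len;  // advance to the next dataset
  }
  return out;
}

// Nested layout: one vector per dataset, so no offsets are needed.
std::vector<double> means_nested(
    const std::vector<std::vector<double>>& datasets) {
  std::vector<double> out;
  for (const std::vector<double>& d : datasets) {
    assert(!d.empty());
    out.push_back(std::accumulate(d.begin(), d.end(), 0.0) /
                  static_cast<double>(d.size()));
  }
  return out;
}
```

With the flat layout, the caller has to build `lengths` on the R side and keep it in sync with `values`. With the nested layout, that bookkeeping disappears, which is the convenience a list-of-vectors input is meant to provide.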

TMB does have the macro DATA_STRUCT, which lets you map list objects into struct containers. That works well when you have a list with a fixed set of named elements. In my case, I wanted a list of variable size whose elements were unnamed, so it wasn’t a great fit.

Going off the DATA_STRUCT source code, I made my own macro called DATA_VECTORLIST. The macro loads the SEXP list object and stores it in a custom class I called VectorList. The VectorList class holds the SEXP list object from R and provides a single access function, getElement, which retrieves list elements by their index. Additionally, the class has a length attribute holding the size of the list, so I can easily write loops to iterate over all the elements.

For a simple exercise, below I’ve modified the linreg.cpp example from TMB. Instead of loading two vectors with DATA_VECTOR, I changed it to load a single DATA_VECTORLIST object, and then used the getElement member function to extract the vectors stored as the first and second elements.

The example is simple (and impractical), but it should give a good illustration of how easy it can be to write your own macros in TMB for custom data types.

// Simple linear regression.
#include <TMB.hpp>
#include <Rinternals.h>

// See Writing R Extensions, section 5.9.7 for info on handling list objects
// in cpp

// Class for storing list filled with numeric vectors
template<class Type>
class VectorList {
  
  public:
    
    // R object that is a list containing R vectors
    SEXP list;
    // Size of list
    int length;
    
    // Constructor (using initializer list syntax)
    VectorList(SEXP object): list(object) {
      // Get length of list
      length = Rf_length(object);
    }
    
    // Function for returning vector via index
    vector<Type> getElement(int idx){
      // Get vector from list and store as TMB object
      vector<Type> ret = asVector<Type>(VECTOR_ELT(list, idx));
      // Return the vector
      return(ret);
      
    }
    
};

// Make TMB macro for loading a list of numeric vectors
#define DATA_VECTORLIST(name)                                       \
VectorList<Type> name(getListElement(TMB_OBJECTIVE_PTR -> data, #name));

template<class Type>
Type objective_function<Type>::operator() ()
{
  
  DATA_VECTORLIST(lst_vectors);
  
  // Get x and Y vectors from list
  vector<Type> x = lst_vectors.getElement(int(0));
  vector<Type> Y = lst_vectors.getElement(int(1));
  
  PARAMETER(a);
  PARAMETER(b);
  PARAMETER(logSigma);
  ADREPORT(exp(logSigma));
    
  Type nll = -sum(dnorm(Y, a+b*x, exp(logSigma), true));
  return nll;
}

Below is the corresponding R code to make use of the likelihood function above.

# Load TMB
library(TMB)

# Compile and link cpp objective function
compile("linreg.cpp")
dyn.load(dynlib("linreg"))

# Random seed (for reproducibility)
set.seed(123)

data <- list(
  lst_vectors = list(x=1:30, Y = 10 + rnorm(n = 30, sd = 0.2) + (1:30)*2))

parameters <- list(a=0, b=0, logSigma=0)
obj <- MakeADFun(data, parameters, DLL="linreg")
obj$hessian <- TRUE
opt <- do.call("optim", obj)
# Compute the standard error report once and reuse it
sdr <- sdreport(obj)
sdr

summary(sdr, "fixed")
summary(sdr, "report")

Making custom macros requires a little bit of digging through R’s documentation. Skimming section 5 of “Writing R Extensions” and Rinternals.h (which ships with your R installation and should be somewhere on your local machine) was really helpful in figuring this out.

The above example is simple, but it’s easy to see how far more complicated objects can be loaded into TMB. Effectively, any S3 object type can be imported into TMB as long as you write a struct or class that fits its structure.
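Coming back to the original motivation, the linreg example only pulls out two fixed elements, whereas the use case was a loop over a variable number of vectors. Here is a rough sketch of that loop in plain C++, with no TMB or R involved. `MockVectorList` is a hypothetical stand-in that mimics the `length` and `getElement` interface of the VectorList class but is backed by `std::vector`, and `total_nll` is an illustrative function name, not anything provided by TMB.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the TMB VectorList class: same `length` and
// `getElement` interface, but backed by std::vector instead of an R list.
class MockVectorList {
 public:
  int length;  // number of vectors in the list

  explicit MockVectorList(std::vector<std::vector<double>> data)
      : length(static_cast<int>(data.size())), data_(std::move(data)) {}

  // Return a copy of the idx-th vector (0-based).
  std::vector<double> getElement(int idx) const {
    assert(idx >= 0 && idx < length);
    return data_[static_cast<std::size_t>(idx)];
  }

 private:
  std::vector<std::vector<double>> data_;
};

// Negative log-likelihood of every observation in every dataset under a
// single Normal(mu, sigma) model, accumulated dataset by dataset.
double total_nll(const MockVectorList& lst, double mu, double sigma) {
  const double kPi = 3.141592653589793;
  const double log_norm = std::log(sigma) + 0.5 * std::log(2.0 * kPi);
  double nll = 0.0;
  for (int i = 0; i < lst.length; ++i) {
    std::vector<double> y = lst.getElement(i);  // datasets may differ in size
    for (double v : y) {
      const double z = (v - mu) / sigma;
      nll += log_norm + 0.5 * z * z;
    }
  }
  return nll;
}
```

Inside a TMB objective function, the same loop shape would presumably carry over, using the `length` and `getElement` members of the object loaded by DATA_VECTORLIST and TMB's `vector<Type>` in place of `std::vector<double>`.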