Banging My Head Against the Keyboard: Tips for Debugging in Rcpp

If you’ve been following along with my Banging My Head Against the Keyboard series you know that I’ve been adventuring in the realm of c++ in R, or Rcpp. The tutorials linked above show how to use basic c++ functionality in R, basic MCMC++ and RcppArmadillo. When we’re coding an Gibbs sampling function, we’re usually talking about an enormous amount of code- where a small problem in one sample can destroy others. We have to find a way to debug this, the best bet is Rprintfvalue.

Rcpp can be kindof buggy and will crash for minor errors if the program is allowed to continue. An example of this is a problem that came up in my research with reversible jump MCMC which involves changing a vectors dimension at each iteration. If say we have a NumericVector with size 5 and we accidentally try to index element 6, then c++ will return an error, but the MCMC will continue to the next loop! This makes R crash and you get the blue box of death (with a bomb to boot!) without seeing the real error, that is we are accessing a entry that is not there. We can use Rprintfvalue to print at many points in our sampler, so we can see exactly where the crash is.

PRO TIP: If you have a NumericVector X and you want to index an element, when you are testing and building your MCMC++, use X(6) instead of X[6]. The reason is the following: (6) tells c++ to go check and make sure there is an entry at entry 6 and to do something with it if it’s there while X[6] just grabs the 6th entry of X, even if it’s not there. This can cause c++ to CRASH! Instead X(6) will return, “index out of range” and exit the function. This can save a lot of searching in your MCMC++.

Back to debugging, here is an example of finding a bug in c++ code that a maintainer in my department set up.

DEBUG2

This way every time we use the Likelihood function in our MCMC, the R console will print “Entering Like” when the function starts and “Exit Like” when the function ends. If we do this with all of our MCMC functions and add additional comments to likely culprits of the errors, we can pinpoint where the error is. The output looks something like this for a large MCMC++.

Debugoutput

So we see that the MCMC++ bugs in the function called Birth 3 and that this happens in a loop. Additionally we see that this bugs when Ind=(0), k=(4) and skip=(1) so let’s find this in our code and see if we can rewrite it to work outside of the Birth3 function as a function that we can call.

Here is the code:

BugFound.png

We see that the bottom loop is the culprit, which happens to be filling in a J+1 vector with a J vector and two numeric entries. So I rewrote this code outside of this function as:

// [[Rcpp::export]]
NumericVector lamBirth(int Ind,double newLam1,double newLam2, NumericVector lam1){
   NumericVector lam=lam1;
   lam.push_back(0);
   int k;
   if(Ind==0 || Ind==lam1.size()-1){
     if(Ind==0){
       lam(0)=newLam1;
       lam(1)=newLam2;
       for(k=2;k<lam.size();k++){
         lam(k)=lam1(k-1);
       }
     }
     if(Ind==lam1.size()-1){
       lam(lam.size()-2)=newLam1;
       lam(lam.size()-1)=newLam2;
       for(k=0;k<lam.size()-2;k++){
         lam(k)=lam1(k);
       }

     }

   }
   else{
     lam(Ind)=newLam1;
     lam(Ind+1)=newLam2;
     for(k=0;k<Ind;k++){
       lam(k)=lam1(k);
     }
     for(k=Ind+2;k<lam.size();k++){
       lam(k)=lam1(k-1);
     }
   }
   return(lam);
}

And when I plugged this in place of the for loop, it works fine! We can really do a lot of good with printing statements in out functions to tell us where the error happens! We don’t have the advantage of R in that the loop stops and we can see where it ended, so we need to use print statements! Special thanks to my maintainer Cliff for helping me find this bug!

Happy Coding!

If you enjoyed this post, take a look at some of my other Adventures in Statistics.

-Andrew G Chapple

 

 

 

One thought on “Banging My Head Against the Keyboard: Tips for Debugging in Rcpp

Leave a comment