
Add a new datatype to RcppDeepState

Fabrizio Sandri edited this page Nov 11, 2022 · 2 revisions

This guide is a reference for users who want to extend RcppDeepState with support for additional datatypes. At the time of writing, RcppDeepState supports only the datatypes listed below. If the package you want to analyze contains a function that uses at least one unsupported datatype, RcppDeepState cannot automatically generate a test harness for that function: it terminates with an error, and you cannot analyze that code. As a workaround, you can write the test harness file manually; RcppDeepState will use it if it is placed in the /inst/testfiles directory.

The alternative is to add support for the new datatype to RcppDeepState itself, and this article walks you through that process.

List of datatypes supported by RcppDeepState:

  • Rcpp::NumericVector
  • Rcpp::NumericMatrix
  • Rcpp::CharacterVector
  • Rcpp::IntegerVector
  • arma::mat
  • double
  • std::string
  • int
  • float

Steps

A new datatype can be added in two steps:

  • defining a function that randomly initializes this datatype;
  • adding the new datatype to the table that lists all the available datatypes.

First step

The first step is to change the /inst/include/RcppDeepState.h file and define a generator function, which is a function that generates random values for your datatype. This function must adhere to the following rules:

  • it must be named as the concatenation of RcppDeepState_ and your datatype;
  • it must return the datatype you specified;
  • all randomly generated values must be obtained through the DeepState API, specifically using the functions listed below:
    • DeepState_Int()
    • DeepState_UInt()
    • DeepState_Size() (for size_t)
    • DeepState_Bool()
    • DeepState_Char()
    • DeepState_Float()
    • DeepState_Double()
    • DeepState_CStr_C(size_t len, const char* allowed)
    • DeepState_CStrUpToLen(size_t maxLen, const char* allowed)
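
The three rules above can be illustrated with a minimal sketch for a hypothetical bool datatype, which is not in the supported list and is used here only as an example. The DeepState_Bool() defined in this snippet is a stand-in stub so that the code compiles on its own; inside /inst/include/RcppDeepState.h you would remove the stub and rely on the function provided by the DeepState headers, whose exact signature may differ.

```cpp
#include <cstdlib>

// Stand-in for the DeepState API (an assumption made for this sketch only).
// In RcppDeepState.h, delete this stub and use the real DeepState_Bool().
int DeepState_Bool() { return std::rand() % 2; }

// Rule 1: the name is RcppDeepState_ followed by the datatype (here, bool).
// Rule 2: the return type is the datatype itself.
// Rule 3: every random value comes from a DeepState_* call.
bool RcppDeepState_bool() {
  return DeepState_Bool() != 0;
}
```

The generator takes no parameters, so the test harness can call it without knowing anything about the datatype beyond its name.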

Second step

Once your function has been defined, add it to the data table holding all of the supported datatypes, which is named types_table and is placed within the deepstate_fun_create function within /R/fun_harness_create.R. This table has three columns that define, in order:

  • the name of the supported datatype;
  • an alternative datatype used when running qs::c_qsave or NA if the datatype you are adding can be directly saved using the qs::c_qsave function;
  • an alternative function to generate this datatype that accepts additional parameters as input, such as the size and the lower and upper bounds. If you define only a single function that takes no parameters, set this column to NA.

Examples

In this section, we'll look at several examples of how to add a new datatype.

Example 1

If you want to add support for the int datatype, for example, you can write two functions in /inst/include/RcppDeepState.h: one with no input parameters, and another that takes the lower and upper bounds of the range from which the random integer is drawn:

  • int RcppDeepState_int()
  • int RcppDeepState_int(int low, int high)
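
As a rough sketch, the two overloads could be written as thin wrappers around the DeepState API. This is one possible shape, not necessarily the implementation that ships with RcppDeepState. The DeepState_Int() and DeepState_IntInRange() definitions below are stand-in stubs so that the snippet compiles on its own; in RcppDeepState.h you would remove them and use the versions from the DeepState headers. The sketch also assumes that low is not greater than high.

```cpp
#include <cstdlib>

// Stand-ins for the DeepState API (assumptions made for this sketch only).
// In RcppDeepState.h, delete these stubs and use the real DeepState functions.
int DeepState_Int() { return std::rand(); }
int DeepState_IntInRange(int low, int high) {
  // Assumes low <= high and a range small enough not to overflow int.
  return low + std::rand() % (high - low + 1);
}

// Generator with no parameters: any int value DeepState chooses.
int RcppDeepState_int() {
  return DeepState_Int();
}

// Generator with bounds: a value in the closed range [low, high].
int RcppDeepState_int(int low, int high) {
  return DeepState_IntInRange(low, high);
}
```

Both functions share the name RcppDeepState_int and differ only in their parameter lists, so C++ overload resolution picks the right one depending on whether bounds are supplied.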

Because int cannot be saved directly with qs::c_qsave, an IntegerVector must be used as the alternative datatype. Furthermore, because a function that accepts extra parameters has been defined for the int datatype, the third column must match the parameter list of that function's prototype, which is int RcppDeepState_int(int low, int high). The following is the new row that must be added to the types_table:

types_table <- rbind(
  datatype("int", "IntegerVector", "(low,high)"),   # new row for int
  datatype("double", "NumericVector", "(low,high)"),
  datatype("string", "CharacterVector", NA),
  datatype("NumericVector", NA, "(size,low,high)"),
  datatype("IntegerVector", NA, "(size,low,high)"),
  datatype("NumericMatrix", NA, "(size,low,high)"),
  datatype("CharacterVector", NA, NA),
  datatype("mat", NA, NA))

Example 2

To add support for LogicalVector, define a function Rcpp::LogicalVector RcppDeepState_LogicalVector() in /inst/include/RcppDeepState.h that uses the DeepState API to fill the vector with random values. This is one possible implementation:

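Rcpp::LogicalVector RcppDeepState_LogicalVector() {
  // Pick a random length between 1 and 20 elements.
  int rand_size = DeepState_IntInRange(1, 20);
  Rcpp::LogicalVector rand_vec(rand_size);
  // Fill every element with a random boolean from DeepState.
  for (int i = 0; i < rand_size; i++) {
    rand_vec[i] = DeepState_Bool();
  }
  return rand_vec;
}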

Once this function has been defined, a new row must be added to the types_table:

types_table <- rbind(
  datatype("int", "IntegerVector", "(low,high)"),
  datatype("double", "NumericVector", "(low,high)"),
  datatype("string", "CharacterVector", NA),
  datatype("NumericVector", NA, "(size,low,high)"),
  datatype("IntegerVector", NA, "(size,low,high)"),
  datatype("NumericMatrix", NA, "(size,low,high)"),
  datatype("CharacterVector", NA, NA),
  datatype("LogicalVector", NA, NA),    # new row for LogicalVector
  datatype("mat", NA, NA))