Yet another operator to simplify data preparation with memisc
The recently published version 0.99.31.6 of the memisc package also contains an
%$$% operator that simplifies routine data preparation steps that would
otherwise involve calls to the function within(). It is analogous to the
operator %$%, which is provided by the “magrittr” package but is also defined
by memisc.
Both operators are illustrated by the following code examples.
library(magrittr)
library(memisc)
set.seed(42)
Here we create a simple example data frame:
df <- data.frame(a = 1:7, x = rnorm(7))
df
The following code creates two new variables, b and x.sq, in the data frame
using within():
df <- within(df, {
  b <- a + 4
  x.sq <- x^2
})
df
This is a bit tedious, because we have to write the name of the data frame
(i.e. “df”) twice. Using the operator %<>% from the magrittr package, one
needs to write the name of the data frame only once:
df %<>% within({
  b <- a + 4
  x.sq <- x^2
})
df
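To make the role of %<>% more concrete, the following minimal sketch compares it with an explicit reassignment through the ordinary pipe %>%. The data frames df1 and df2 are illustrative names that do not appear in the original example, and the sketch assumes that magrittr is installed.

```r
library(magrittr)

# Two identical starting points (illustrative names, not from the text above)
df1 <- data.frame(a = 1:7)
df2 <- df1

# Explicit reassignment: the name of the data frame appears twice
df1 <- df1 %>% within({
  b <- a + 4
})

# Compound assignment pipe: the name of the data frame appears once
df2 %<>% within({
  b <- a + 4
})

# Both approaches should lead to the same data frame
identical(df1, df2)
```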
The magrittr package defines an operator %$% that can be used as a shorthand
for with():
with(df, mean(x))
df %$% mean(x)
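The same equivalence holds for expressions that involve several variables. The sketch below uses a small data frame d with arbitrarily chosen values (an assumption made only for this illustration) and checks that with() and %$% give the same result.

```r
library(magrittr)

# Small illustrative data frame (values chosen arbitrarily for this sketch)
d <- data.frame(a = 1:4, x = c(2, 4, 6, 8))

# with() evaluates the expression using the columns of `d` as variables
r1 <- with(d, sum(a * x))

# %$% does the same, with the data frame on the left-hand side
r2 <- d %$% sum(a * x)

# Compare the two results
identical(r1, r2)
```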
Thus it does not seem far-fetched to have an analogous shorthand for within().
Such a shorthand, %$$%, is defined in the most recent version of memisc:
# Remove the variables created above so that they can be created afresh
df[c("b","x.sq")] <- NULL
df %$$% {
  b <- a + 4
  x.sq <- x^2
}
df
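For readers curious how an operator of this kind can modify a data frame in place, here is a rough base-R sketch. The operator %within_sketch% and the data frame demo are hypothetical names invented for this illustration. This is not memisc's actual implementation, and it simply delegates to within(), so it makes no attempt to reproduce any other behaviour of %$$%.

```r
# Hypothetical operator, NOT the memisc implementation: it evaluates `expr`
# via within() and assigns the result back to the variable on the left.
`%within_sketch%` <- function(data, expr) {
  target <- substitute(data)      # the symbol on the left-hand side
  stopifnot(is.name(target))      # this sketch supports plain variable names only
  caller <- parent.frame()        # the environment the operator was called from
  # Build the call within(<data>, <expr>) and evaluate it in the caller's frame
  result <- eval(call("within", target, substitute(expr)), caller)
  # Overwrite the original variable with the modified data frame
  assign(as.character(target), result, envir = caller)
  invisible(result)
}

# Usage with an illustrative data frame
demo <- data.frame(a = 1:3)
demo %within_sketch% {
  b <- a + 4
}
demo
```

The design choice here is to capture the unevaluated left-hand side with substitute(), which is what allows the result to be written back under the same name.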
Besides being shorter than a call to within(), the %$$% operator yields a data
frame (or data set) in which the variables are ordered by their creation:
variables created first appear first in the resulting data frame.
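As a point of comparison, the following base-R sketch inspects the column order returned by within() and then puts the new variables into creation order by hand. The names base_df and res are illustrative, and the manual reordering shown is just one possible approach under these assumptions.

```r
# Illustrative data frame (not part of the original example)
base_df <- data.frame(a = 1:7)

res <- within(base_df, {
  b <- a + 4
  a.sq <- a^2
})

# Column order as returned by within()
names(res)

# Explicitly arrange the new columns in their order of creation
res <- res[c(names(base_df), "b", "a.sq")]
names(res)
```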