Transform skewed variables (so that they conform more closely to a normal distribution) in .omv-files for the statistical spreadsheet 'jamovi' (https://www.jamovi.org)

Usage

transform_vars_omv(
  dtaInp = NULL,
  fleOut = "",
  varXfm = NULL,
  psvAnl = FALSE,
  usePkg = c("foreign", "haven"),
  selSet = "",
  ...
)

Arguments

dtaInp

Either a data frame or the name of a data file to be read (including the path, if required; "FILENAME.ext"; default: NULL); files can be of any supported file type, see Details below

fleOut

Name of the data file to be written (including the path, if required; "FILE_OUT.omv"; default: ""); if empty, the resulting data frame is returned instead

varXfm

Named list where each name indicates the transformation to be carried out and each list entry contains the name(s) of one or more variables to be transformed using this transformation. See Details for more information.

psvAnl

Whether analyses that are contained in the input file shall be transferred to the output file (default: FALSE)

usePkg

Name of the package ("foreign" or "haven") that shall be used to read SPSS, Stata and SAS files; "foreign" is the default (it ships with R as a recommended package), but "haven" is newer and more comprehensive

selSet

Name of the data set that is to be selected from the workspace (only applies when reading .RData-files)

...

Additional arguments passed on to methods; see Details below

Value

a data frame (only returned if fleOut is empty) that contains the input data set with the requested transformations applied

Details

  • varXfm has to be a named list where the names indicate either the type of transformation or the kind and degree of skewness that shall be corrected. For the type of transformation, the following names are valid: posSqr, negSqr, posLog, negLog, posInv, negInv. The second part of the name indicates the transformation to be carried out (...Sqr: square root; ...Log: base-10 logarithm; ...Inv: inversion, i.e., 1 / original value). The first part of the name indicates whether the original value is used (pos...) or whether the original value is subtracted from the maximum value of that variable (neg...; a constant of 1 is added to the maximum value for ...Log and ...Inv transformations). For the degree and kind of skewness, the following names are valid: mdrPos, strPos, svrPos, mdrNeg, strNeg, svrNeg (degree: moderate, strong, severe; kind: positive or negative).

  • The ellipsis-parameter (...) can be used to pass arguments / parameters to the functions that are used for reading and writing the data. By clicking on the respective function under “See also”, you can get a more detailed overview of which parameters each of those functions takes. The functions are: read_omv and write_omv (for jamovi-files), read.table (for CSV / TSV files; using similar defaults as read.csv for CSV and read.delim for TSV, which are both based upon read.table), load (for .RData-files), readRDS (for .rds-files), read_sav (needs the R-package haven) or read.spss (needs the R-package foreign) for SPSS-files, read_dta (haven) / read.dta (foreign) for Stata-files, read_sas (haven) for SAS-data-files, and read_xpt (haven) / read.xport (foreign) for SAS-transport-files. If you would like to use haven, you may need to install it using install.packages("haven", dep = TRUE).
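The six transformation names described above can be illustrated with a small base-R sketch. It only restates the formulas given in the first bullet and is not the package's implementation. The helper name sketch_xfm is hypothetical, and the sketch assumes that the input is suitable for the chosen transformation (e.g., strictly positive values for posLog and posInv); how transform_vars_omv itself handles zero or negative values is not covered here.

```r
# Hypothetical helper (not part of jmvReadWrite): applies one of the six
# transformations to a numeric vector, following the formulas in Details.
sketch_xfm <- function(x, type = c("posSqr", "negSqr", "posLog",
                                   "negLog", "posInv", "negInv")) {
  type <- match.arg(type)
  mx <- max(x, na.rm = TRUE)  # maximum of the variable, used by the neg... variants
  switch(type,
    posSqr = sqrt(x),             # square root of the original value
    negSqr = sqrt(mx - x),        # square root of (maximum - value)
    posLog = log10(x),            # base-10 logarithm of the original value
    negLog = log10(mx + 1 - x),   # constant of 1 added to the maximum
    posInv = 1 / x,               # inversion of the original value
    negInv = 1 / (mx + 1 - x)     # constant of 1 added to the maximum
  )
}

x <- c(1, 4, 9, 100)
sketch_xfm(x, "posSqr")  # 1 2 3 10
sketch_xfm(x, "negLog")  # log10 of c(100, 97, 92, 1)
```

Adding 1 to the maximum in the negLog and negInv variants keeps the largest value of the variable from producing log10(0) or a division by zero.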

See also

transform_vars_omv internally uses the following functions for reading and writing data files in different formats: read_omv() and write_omv() for jamovi-files, utils::read.table() for CSV / TSV files, load() for reading .RData-files, readRDS() for .rds-files, haven::read_sav() or foreign::read.spss() for SPSS-files, haven::read_dta() or foreign::read.dta() for Stata-files, haven::read_sas() for SAS-data-files, and haven::read_xpt() or foreign::read.xport() for SAS-transport-files.

Examples

if (FALSE) { # \dontrun{
# generate skewed variables
set.seed(335)
dtaInp <- data.frame(MP = rnorm(1000) * 1e-1 + rexp(1000, 2) * (1 - 1e-1),
                     MN = rnorm(1000) * 1e-1 - rexp(1000, 2) * (1 - 1e-1),
                     SP = rnorm(1000) * 1e-2 + rexp(1000, 2) * (1 - 1e-2),
                     SN = rnorm(1000) * 1e-2 - rexp(1000, 2) * (1 - 1e-2),
                     EP = rnorm(1000) * 1e-4 + rexp(1000, 2) * (1 - 1e-4),
                     EN = rnorm(1000) * 1e-4 - rexp(1000, 2) * (1 - 1e-4))
jmv::descriptives(data = dtaInp, skew = TRUE, sw = TRUE)

crrXfm <- list(posSqr = c("MP"), negSqr = c("MN"), posLog = c("MP", "SP"), negLog = c("SN"),
               posInv = c("MP", "SP", "EP"), negInv = c("EN"))
dtaOut <- jmvReadWrite::transform_vars_omv(dtaInp = dtaInp, varXfm = crrXfm)
jmv::descriptives(data = dtaOut, skew = TRUE, sw = TRUE)

crrXfm <- list(mdrPos = c("MP"), mdrNeg = c("MN"), strPos = c("SP"), strNeg = c("SN"),
               svrPos = c("EP"), svrNeg = c("EN"))
dtaOut <- jmvReadWrite::transform_vars_omv(dtaInp = dtaInp, varXfm = crrXfm)
jmv::descriptives(data = dtaOut, skew = TRUE, sw = TRUE)

} # }