The parse_setup function is what this packages uses to convert the .sps or .sas setup files into an usable format for R.

This will return a list of length 3 containing the objects “setup”, “value_labels”, and “missing”.

# Using the example .sps setup file included with the package
sps_name <- system.file("extdata", "sacramento.sps",
example <- asciiSetupReader::parse_setup(sps_name)

## setup

The first object in the returned list is a data.frame with 4 columns and as many rows as there are columns in the data. The “column_number” column is the non-descriptive name of the column while the “column_name” is the descriptive name of the column. In read_ascii_setup, setting use_clean_names to TRUE will set the data column names to the “column_name” names, otherwise it will remain as the “column_number” names. Since the data is in fixed-width format, you need to know the location of each column. The “begin” and “end” columns in this object provide that location for each column in the data.

head(example$setup) #> column_number column_name begin end #> 1 TRINUM TRIAL_NUMBER 1 2 #> 2 SUBNO SUBJECT_NUMBER 3 5 #> 3 TODDATYR TODAYS_DATE_YEAR 6 9 #> 4 DATSTAR DATE_TRIAL_STARTED 10 15 #> 5 CONSTATE COUNTY_STATE 16 16 #> 6 Q1JSEX JUROR_GENDER 17 17 ## value_labels To make the data more compact, the data often provides values that represent a label. For example, in a column about participant’s gender it may only include “M” and “F” which stands for “Male” and “Female”. The setup file will say the M = Male and F = Female. The value labels tell us that we need to convert M to Male in the given column. This is a list of named vectors indicating the value and its corresponding label. If there are no value labels this object will be NULL. example$value_labels[1:3]
#> $CONSTATE #> Maricopa, AZ Sacramento, CA Unknown #> "1" "2" "9" #> #>$DATSTAR
#>  Blanked
#> "888888"
#>
#> $DURAT #> Undoc code Unknown #> "-2" "99" There is one named vector for each column in the data that has value labels. We can see how many there are using length(). length(example$value_labels)
#> [1] 194

## missing

The final object in the list a data.frame with two columns and as many rows as there are missing values in the data. The column “variable” indicates the column in the data and the column “values” says that the value in that row is to be replaced with NA. For example, if there are 10 columns in the data with missing values and each column has two missing values (e.g. -8 and -9) there will be 20 rows in this data.frame. A missing value is when the data includes a value that should be replaced with NA. For example, data often includes negative values such as -8 or -9 mean that that value is missing and should be NA. If there are no missing values this object will be NULL.

head(example\$missing)
#>   variable values
#> 1 TODDATYR   9999
#> 2  DATSTAR 888888
#> 3 CONSTATE      9
#> 4   Q3JETH      9
#> 5 Q5JSUPDP      9
#> 6   Q6JVIC      9