In this article we will import .sas7bdat (SAS) files into R from a directory on your computer using the read_sas() function from the haven package.


Theory

.sas7bdat is the file extension of the native dataset format used by SAS software.



Application

Below are the steps we are going to take to master importing .sas7bdat files into R:

  1. Installing the haven package
  2. Basic read_sas() command description
  3. Importing a .sas7bdat file from your computer



Part 1. Installing the haven package

Base R has no built-in function for reading .sas7bdat files, so we need an additional package to import them.

You can learn more about the haven package here.

To install the package and load it into your workspace, use the following code:


install.packages("haven")
library(haven)
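
If you rerun your script often, you may not want to reinstall the package every time. A minimal sketch, assuming you are happy to install from your default CRAN mirror, that installs haven only when it is missing:

```r
# Install haven only if it is not already available on this machine.
if (!requireNamespace("haven", quietly = TRUE)) {
  install.packages("haven")
}

# Attach the package so read_sas() can be called directly.
library(haven)

# Show which version is attached (handy when checking the documentation).
print(packageVersion("haven"))
```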



Part 2. Basic read_sas() command description

The function takes the following arguments (newer haven releases add a few more, such as col_select, skip and n_max):

read_sas(data_file, catalog_file = NULL, encoding = NULL, catalog_encoding = encoding, cols_only = NULL)

Assuming we have our file ready and it doesn’t need any additional data manipulation, we only need the first argument, “data_file”.

A complete set of explanations for each of the arguments is available here on page 7.
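
For cases where the first argument alone is not enough, here is a rough sketch of a call that also sets catalog_file and encoding. The file names "cola.sas7bdat" and "formats.sas7bcat" and the "latin1" encoding are assumptions chosen for illustration; substitute the values that apply to your own files.

```r
library(haven)

# Hypothetical file names, used only for illustration.
data_path    <- "cola.sas7bdat"
catalog_path <- "formats.sas7bcat"

# Only attempt the import when both files are actually present.
if (file.exists(data_path) && file.exists(catalog_path)) {
  mydata <- read_sas(
    data_file    = data_path,
    catalog_file = catalog_path, # catalog file that holds the value labels
    encoding     = "latin1"      # assumed encoding; NULL uses the one stored in the file
  )
} else {
  message("Data or catalog file not found; nothing was imported.")
}
```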



Part 3. Importing a .sas7bdat file from your computer

In order to import a .sas7bdat file from your computer, we need the exact location of the file.

So how do we do that? Let’s find out!

First of all, we need to know where the file is stored on your computer.

In my case, I use macOS and my file is stored on my desktop. To find the location of the file on macOS, right-click the file and choose “Get Info”; on Windows, right-click the file and choose “Properties”.

In my case, the location of the file in R format is: /Users/DataSharkie/Desktop/cola.sas7bdat (on Windows, write the path with forward slashes or doubled backslashes).

Pass this path to the read_sas() command to import the file. Don’t forget to assign the result to a variable that will hold the dataset (I called mine “mydata”).
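
If you prefer not to type the full path by hand, base R can build and check it for you. A small sketch, assuming the file sits in a "Desktop" folder inside your home directory (on Windows, "~" may expand to a different folder, so check the printed path):

```r
# Build the path from the home directory instead of typing it out.
file_path <- file.path(path.expand("~"), "Desktop", "cola.sas7bdat")

# Print the resulting path and check that a file really exists there.
print(file_path)
print(file.exists(file_path))
```

If file.exists() returns FALSE, the path is wrong or the file is somewhere else, and read_sas() would fail on it.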

The .sas7bdat file used in this article is available for download here. It is called “cola” and is the second file in the first row of the table, readily available in the .sas7bdat format.

Using the following code, we can import the .sas7bdat file into R; the result is already a data frame:


mydata <- read_sas("/Users/DataSharkie/Desktop/cola.sas7bdat")
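
After importing, it is worth checking that the data arrived as expected. A minimal sketch, assuming the file lives at the desktop path quoted earlier in this article (adjust file_path to your own location):

```r
library(haven)

# Path quoted in the article; replace it with the location of your own file.
file_path <- "/Users/DataSharkie/Desktop/cola.sas7bdat"

if (file.exists(file_path)) {
  mydata <- read_sas(file_path)

  # read_sas() returns a tibble, which is also a data frame.
  print(class(mydata))
  print(dim(mydata))  # number of rows and columns
  print(head(mydata)) # first few rows
} else {
  message("File not found: ", file_path)
}
```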



If you are interested in learning more about importing different data formats into R, you can find more articles here.