In this article we will import .dta (Stata) files into R from a directory on your computer, using the read.dta() command from the foreign package.


Theory

.dta is the file extension of the binary format that Stata uses to store its datasets.



Application

Below are the steps we will take to master importing .dta files into R:

  1. Installing foreign package
  2. Basic read.dta() command description
  3. Importing .dta file into R from your computer



Part 1. Installing foreign package

Base R has no built-in command for reading .dta files, so we need an additional package, foreign, in order to import them.

You can learn more about the foreign package in its documentation on CRAN.

To install the package and load (“call”) it into your workspace, use the following code:


install.packages("foreign")
library(foreign)
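If you run the script more than once, reinstalling the package every time is unnecessary. The sketch below is one possible variation, assuming you only want to install foreign when it is missing; foreign is distributed as a recommended package with many standard R installations, so the install step may never run on your machine.

```r
# Install foreign only when it is not already available, then load it
if (!requireNamespace("foreign", quietly = TRUE)) {
  install.packages("foreign")
}
library(foreign)

# Confirm that read.dta() can now be found
exists("read.dta")
```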



Part 2. Basic read.dta() command description

The complete list of arguments of the function is the following:

read.dta(file, convert.dates = TRUE, convert.factors = TRUE, missing.type = FALSE,
convert.underscore = FALSE, warn.missing.labels = TRUE)

 

Assuming our file is ready and needs no additional data manipulation, we only need the first argument, “file”.

A complete explanation of each argument is available on page 5 of the foreign package manual, or by running ?read.dta in R.
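To see what one of the optional arguments does, here is a small sketch that compares the default convert.factors = TRUE with convert.factors = FALSE. The toy data frame and the temporary file are made up for illustration; the file is created with write.dta() from the same package so that the example runs without any download.

```r
library(foreign)

# Hypothetical toy data, saved to a temporary .dta file
toy <- data.frame(id = 1:3,
                  region = factor(c("East", "West", "East")))
tmp <- tempfile(fileext = ".dta")
write.dta(toy, tmp)

# Default: Stata value labels are converted into R factors
with_labels <- read.dta(tmp)

# convert.factors = FALSE: keep the underlying numeric codes instead
with_codes <- read.dta(tmp, convert.factors = FALSE)

class(with_labels$region)
class(with_codes$region)
```

Keeping the codes can be useful when the numeric values themselves matter for later analysis.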



Part 3. Importing .dta file into R from your computer

In order to import a .dta file from the computer, we need the exact location of the file.

So how do we do that? Let’s find out!

First of all, we need to know where the file is stored on your computer.

In my case, I use macOS and my file is stored on my desktop. To find the location of a file on macOS, right-click on the file and choose “Get Info”; on Windows, right-click on the file and choose “Properties”.

In my case, the file path, written the way R expects it, is: /Users/DataSharkie/Desktop/can-pop.dta

Pass this path as the “file” argument of the read.dta() command to import the file. Don’t forget to define a variable that will hold the imported dataset (I called mine “mydata”).
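Before importing, it can help to confirm that R sees a file at the path you found. The following is a minimal sketch, assuming the file sits on the desktop of the current user; the Windows path in the comment is hypothetical.

```r
# Build the path from pieces; "~" expands to the current user's home directory
path <- file.path(path.expand("~"), "Desktop", "can-pop.dta")

# TRUE if a file exists at that location, FALSE otherwise
file.exists(path)

# On Windows, use forward slashes or doubled backslashes in the path, e.g.
# "C:/Users/YourName/Desktop/can-pop.dta"   (hypothetical path)

# Alternatively, file.choose() opens a file picker and returns the chosen path
# (interactive sessions only, so it is left commented out here)
# path <- file.choose()
```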

The .dta file used in this article is available for download here. It is called “Can-pop” and is the fourth dataset in the table, readily available in the .dta format.

The following code imports the .dta file into R; the result is already a data frame:


mydata <- read.dta("/Users/DataSharkie/Desktop/can-pop.dta")
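A mistyped path is the most common reason this step fails. As a sketch, the hypothetical helper import_dta() below (not part of the foreign package) checks that the file exists before calling read.dta(), so a wrong path produces a clear message.

```r
library(foreign)

# Hypothetical helper: stop with a readable message if the path is wrong,
# otherwise read the file and return it as a data frame
import_dta <- function(path) {
  if (!file.exists(path)) {
    stop("No file found at: ", path)
  }
  read.dta(path)
}

# Example call, using the path from this article (adjust it to your machine):
# mydata <- import_dta("/Users/DataSharkie/Desktop/can-pop.dta")
# str(mydata)    # variable names and types
# head(mydata)   # first six rows
```

Note that the foreign documentation describes read.dta() as reading Stata version 5 through 12 files, so a file saved by a newer Stata release may need to be re-saved in an older format first.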



If you are interested in learning more about importing different data formats into R, you can find more articles here.