In this article we will learn how to generate random numbers in R from various distributions (uniform and normal).


Theory

Why is random number generation important, and where is it used?

Random number generation has applications in various fields, such as statistical sampling, simulation, and test design. Generally, when a data scientist needs a set of random numbers, they will have a particular distribution in mind.

The R programming language lets users generate randomly distributed numbers with a set of built-in functions, such as runif() and rnorm().

In this article we will discuss generating random numbers from the following two distributions: uniform and normal.



Application

Below are the steps we are going to take to learn how to generate random numbers in R from different distributions:

  1. Generate random numbers from uniform distribution
  2. Generate random numbers from normal distribution


Part 1: Generate random numbers from uniform distribution in R

Let’s first discuss what a uniform distribution is and why it is often the most popular choice for generating random numbers. In simple words, a uniform distribution is a type of probability distribution in which all of the possible values have an equal probability of being the outcome.

For example, suppose you roll a die. You know that you can only get one of the following outcomes: 1, 2, 3, 4, 5, 6. Each of these numbers has an equal probability of occurring, since the die has one number per side.

In R, to generate random numbers from a uniform distribution, you will need to use the runif() function. Its general form is:

runif(n, min=a, max=b)

Here, n refers to how many random numbers to generate. a and b are the lower and upper limits of the distribution respectively. The default values for min and max are 0 and 1.
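As a quick illustration of these arguments, here is a minimal sketch that calls runif() once with the defaults and once with custom limits. The variable names and the set.seed() call (used only to make the run repeatable) are additions for this example and are not part of the walkthrough.

```r
set.seed(42)  # assumption: fixed seed so repeated runs give the same draws

u_default <- runif(5)                      # min and max default to 0 and 1
u_custom  <- runif(5, min = 10, max = 20)  # limits set explicitly

print(u_default)
print(u_custom)

# Every draw lies inside the requested interval
print(all(u_default >= 0 & u_default <= 1))
print(all(u_custom >= 10 & u_custom <= 20))
```

Both checks should print TRUE, since runif() only returns values between min and max.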

Now, we will try to replicate rolling a die 10 times. From above we know that the minimum value on the die is 1 and the maximum value is 6. Let’s try it:


x <- runif(10, min = 1, max = 6)
print(x)

Output:

 [1] 3.830754 4.144301 5.386413 1.008171 2.187963 1.568800 5.439202 1.102950 1.638472 3.734701

Almost the result we needed. The generated numbers aren't integers, since we were drawing from a continuous distribution. We can round them to 0 decimal places to get the result we need:


x<-round(x)
print(x)

Output:

 [1] 4 4 5 1 2 2 5 1 2 4

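Perfect. Exactly what we wanted to see. You can think of it as if the function rolled the die for you 10 times, and these are the numbers it got on each of the 10 rolls. Strictly speaking, rounding draws from the interval 1 to 6 makes 1 and 6 only half as likely as the other faces, so treat this as an approximation of a fair die.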
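The sketch below is an addition to the walkthrough, not the article's own method. It first tabulates how often each face appears when draws from runif(min = 1, max = 6) are rounded: only values below 1.5 round to 1, while values from 1.5 to 2.5 round to 2, so the end faces come up about half as often. It then shows two ways that should give fair rolls: sample() on the integers 1 to 6, and floor() applied to draws from the interval 1 to 7. The variable names and seed are chosen for illustration only.

```r
set.seed(1)  # assumption: fixed seed for a repeatable run

# Share of each face when rounding continuous draws from [1, 6]
round_freq <- table(round(runif(60000, min = 1, max = 6))) / 60000
print(round_freq)  # faces 1 and 6 appear roughly half as often as 2 to 5

# Option 1: sample whole numbers directly from 1..6, with replacement
rolls_sample <- sample(1:6, size = 10, replace = TRUE)

# Option 2: draw from [1, 7) and drop the fractional part,
# so each face covers an interval of width 1
rolls_floor <- floor(runif(10, min = 1, max = 7))

print(rolls_sample)
print(rolls_floor)
```

For simulating dice, sample() is usually the more direct choice, because it draws from a discrete set rather than converting continuous values.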


Part 2: Generate random numbers from normal distribution in R

We have an article that explains the normal distribution in detail, so here we will summarize a few key features:

  • Looks like a "bell"
  • Mean=mode=median
  • About 68% of observations are within 1 standard deviation of the mean
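The 68% figure can be checked numerically with pnorm(), R's built-in cumulative distribution function for the normal distribution. This is a small sketch; the variable names and the mean of 70 with standard deviation 10 are illustrative choices.

```r
# Probability that a standard normal value falls within 1 standard deviation of the mean
within_one_sd <- pnorm(1) - pnorm(-1)
print(within_one_sd)

# The same proportion holds for any mean and sd, e.g. mean 70 and sd 10
within_60_80 <- pnorm(80, mean = 70, sd = 10) - pnorm(60, mean = 70, sd = 10)
print(within_60_80)
```

Both values come out at roughly 0.683, which is where the "68%" rule of thumb comes from.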

An example that most of you will be familiar with is grades on final exams in university courses. You have probably heard your professor talk about grade adjustments, and this is exactly what they are referring to. Their goal is to have grades normally distributed around some mean value.

In R, to generate random numbers from a normal distribution, you will need to use the rnorm() function. Its general form is:

rnorm(n, mean=a, sd=b)

Here, n refers to how many random numbers to generate. a and b are the mean and standard deviation of the distribution respectively. The default values for the mean and standard deviation are 0 and 1.
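To make the role of the two parameters concrete, the following sketch compares rnorm() called with a mean and standard deviation against standard normal draws that are shifted and scaled by hand. With the same seed, the two should agree. The variable names and the seed are assumptions made for this example.

```r
set.seed(123)  # assumption: fixed seed so both calls start from the same state
direct <- rnorm(5, mean = 70, sd = 10)

set.seed(123)
z <- rnorm(5)            # defaults: mean = 0, sd = 1
rescaled <- 70 + 10 * z  # shift by the mean, scale by the standard deviation

print(direct)
print(rescaled)
print(all.equal(direct, rescaled))
```

In other words, mean moves the center of the distribution and sd stretches or shrinks its spread.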

We will try to replicate a sample set of grades for 40 students in a class, with a mean of 70 and a standard deviation of 10. This means that about 68% of students are expected to have exam grades between 60 and 80. Let's try it:


x<-rnorm(40, mean = 70, sd = 10)
print(x)

Output:

 [1] 81.44031 74.53427 83.62584 63.31253 72.93072 43.12650 65.63602 69.81147 57.77994 65.55200 70.14918 81.24505 73.32076 70.56342 75.78013 70.95782 82.70436 83.46286 78.64195 66.67930 61.04818 73.47958 73.43050 75.50523 71.95195 74.90248 67.83296 62.59096 73.61193 68.02580 56.19261 61.96315 69.00750 72.29529 60.88323 63.61325 61.38831 72.15759 66.37761 69.66935

Almost the result we needed. As before, the generated numbers aren't integers, since we were drawing from a continuous distribution. We can round them to 0 decimal places to get whole-number grades:


x<-round(x)
print(x)

Output:

 [1] 81 75 84 63 73 43 66 70 58 66 70 81 73 71 76 71 83 83 79 67 61 73 73 76 72 75 68 63 74 68 56 62 69 72 61 64 61 72 66 70

Looks much better. We suggest quickly visualizing this sequence by plotting a histogram to see that it follows a bell curve. Let's just confirm the mean and standard deviation of the created sequence of numbers.


print(mean(x))
print(sd(x))

Output:

[1] 69.725
[1] 8.193141

We see that both values are close to the parameters that we set. They deviate from them partly because of rounding and partly because the sample is small. As we increase the number of sample observations, the sample mean and standard deviation approach the population parameter values.
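Two points from this part, the bell shape and the effect of sample size, can be sketched as follows. The example draws a fresh set of 40 grades (not the exact values printed above), bins them with hist(), and then compares the sample mean and standard deviation for increasing sample sizes. The seed, variable names, and chosen sample sizes are assumptions for illustration.

```r
set.seed(2024)  # assumption: arbitrary seed, used only to make the run repeatable

# 1. Histogram bins for 40 simulated grades
grades <- round(rnorm(40, mean = 70, sd = 10))
h <- hist(grades, plot = FALSE)  # drop plot = FALSE to draw the chart
print(h$breaks)
print(h$counts)

# 2. Sample mean and standard deviation for increasing sample sizes
sizes <- c(40, 400, 4000, 40000)
results <- t(sapply(sizes, function(n) {
  draws <- rnorm(n, mean = 70, sd = 10)
  c(n = n, mean = mean(draws), sd = sd(draws))
}))
print(results)
```

Calling hist(grades) without plot = FALSE draws the histogram itself. In the results table, the rows for larger n should generally sit closer to 70 and 10, although any single run can fluctuate.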



This concludes the article on generating random numbers in R from various distributions. You can find more related articles in our Statistics in R section.