Introduction
This article is a tutorial on working with histograms in the R programming language. We first cover the syntax of the hist() function and then work through several examples of creating histograms with it.
Syntax of Histogram hist() function in R
The basic syntax of the hist() function is as follows:
hist(v, main, xlab, xlim, ylim, breaks, col, border)
- v: A numeric vector of values for which the histogram is drawn.
- main: Sets the title of the chart.
- xlab: Sets the label of the horizontal axis.
- xlim: Sets the range of values shown on the x-axis.
- ylim: Sets the range of values shown on the y-axis.
- breaks: Controls how the data is split into bins, given either as a vector of breakpoints or as a suggested number of bins.
- col: Sets the fill color of the bars.
- border: Sets the border color of the bars.
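The arguments listed above can be combined in a single call. The following is a minimal sketch on synthetic data; the vector `ages` and its values are made up for illustration and are not part of the mall dataset used later.

```r
# Synthetic ages, generated only for illustration (not the mall dataset)
set.seed(42)
ages <- round(runif(200, min = 18, max = 70))

# One call that uses every argument from the list above
h <- hist(ages,
          main   = "Synthetic ages",       # chart title
          xlab   = "Age",                  # x-axis label
          xlim   = c(10, 80),              # visible x-axis range
          ylim   = c(0, 60),               # visible y-axis range
          breaks = seq(10, 80, by = 10),   # bin boundaries
          col    = "orange",               # fill color of the bars
          border = "white")                # border color of the bars

# hist() invisibly returns the computed bins, which can be inspected
h$counts
```

Storing the return value in `h` is optional; it is done here only to show that the bin counts are available after plotting.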
Examples of Histogram in R Language
Load Dataset
Here we load a CSV file of mall customers into a data frame df. It contains columns such as Genre, Age, income, and Score (the spending score). This dataset is used in all the histogram examples below.
df <- read.table("mall.csv",header=TRUE,sep=',')
options( warn = -1 )
head(df)
CustomerID | Genre | Age | income | Score |
---|---|---|---|---|
1 | Male | 19 | 15 | 39 |
2 | Male | 21 | 15 | 81 |
3 | Female | 20 | 16 | 6 |
4 | Female | 23 | 16 | 77 |
5 | Female | 31 | 17 | 40 |
6 | Female | 22 | 17 | 76 |
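If mall.csv is not at hand, the six rows printed above can be typed in by hand to try out the calls. This is only a stand-in: the name `df_head` is hypothetical, and six rows are far too few for a meaningful histogram.

```r
# The six rows shown by head(df), typed in manually as a stand-in
df_head <- data.frame(
  CustomerID = 1:6,
  Genre  = c("Male", "Male", "Female", "Female", "Female", "Female"),
  Age    = c(19, 21, 20, 23, 31, 22),
  income = c(15, 15, 16, 16, 17, 17),
  Score  = c(39, 81, 6, 77, 40, 76)
)

# hist() needs a numeric vector, so check the column type first
is.numeric(df_head$Age)

# Overview of all columns and their types
str(df_head)
```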
Example 1: Basic Histogram in R Language
To plot a basic histogram, pass a numeric column of the data frame to the hist() function, selecting the column with the ‘$’ operator. In this example, we plot the histogram of the Age column of the data frame df.
hist(df$Age)
Example 2: Adding Title to Histogram in R Language
We can add a title to the histogram using the main parameter and label the x-axis using the xlab parameter. The xlim parameter restricts the x-axis to the range 15 to 75.
hist(df$Age,
main="MALL CUSTOMERS", xlab="AGE",
xlim=c(15,75))
Example 3: Adding Color to Histogram
Color can be added to the histogram with the col parameter. In the example below, the bars are filled with orange.
hist(df$Age,
main="MALL CUSTOMERS", xlab="AGE",
xlim=c(15,75),
col="orange")
Example 4: Adding Hatched Fill Pattern to Histogram
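In the example below, we create a hatched fill with lines slanted at 45°. The density parameter sets how closely the shading lines are spaced (in lines per inch), and the angle parameter sets their slope in degrees.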
hist(df$Age,
main="MALL CUSTOMERS", xlab="AGE",
col="dodgerblue3",
density=25,
angle=45)
Example 5: Setting X and Y Axes limits
To set limits on the X and Y axes, we use the xlim and ylim arguments. The range passed to each argument sets the visible extent of the corresponding axis of the histogram plot.
hist(df$Age, xlim=c(30,70), ylim=c(0,70),col='pink')
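Note that xlim only changes the visible part of the axis; it does not filter the data or recompute the bins. The sketch below checks this on synthetic data (the vector `ages` is made up for illustration) by comparing the bins returned with and without xlim.

```r
# Synthetic ages, generated only for illustration (not the mall dataset)
set.seed(10)
ages <- round(runif(200, min = 18, max = 70))

# Same data, once with default axes and once zoomed in with xlim
h_full <- hist(ages, col = "pink")
h_zoom <- hist(ages, xlim = c(30, 70), col = "pink")

# The bins are the same in both cases; only the drawn range differs
identical(h_full$breaks, h_zoom$breaks)
identical(h_full$counts, h_zoom$counts)
```

To actually exclude values outside a range, subset the vector before passing it to hist().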
Example 6: Add Values on Top of Histogram Plot
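Now we use the text() function to print the count of each bin on top of its bar. This makes the histogram easier to read, since the exact values no longer have to be estimated from the y-axis. The hist() function returns the bin midpoints (mids) and counts, which we store in m and pass to text().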
m<-hist(df$Age,
main="MALL CUSTOMERS", xlab="AGE", ylab="Frequency",
col = "royalblue", border = "pink")
# Setting labels
text(m$mids, m$counts, labels = m$counts, adj = c(0.5, -0.5))
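With the default y-axis range, the label above the tallest bar may be clipped at the top of the plot. One possible workaround, sketched here on synthetic data (the names `ages`, `bins`, and `top` are made up for illustration), is to compute the bins first without plotting and then leave some headroom through ylim.

```r
# Synthetic ages, generated only for illustration (not the mall dataset)
set.seed(1)
ages <- round(rnorm(200, mean = 40, sd = 12))

# Compute the bins without drawing anything
bins <- hist(ages, plot = FALSE)

# Leave about 15% headroom above the tallest bar for its label
top <- max(bins$counts) * 1.15

hist(ages,
     main = "Synthetic ages", xlab = "Age", ylab = "Frequency",
     ylim = c(0, top),
     col = "royalblue", border = "pink")

# Print each count just above its bar
text(bins$mids, bins$counts, labels = bins$counts, adj = c(0.5, -0.5))
```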
Example 7: Histogram with Breaks
The breaks parameter controls the bins (bars) of the histogram. In the example below, we pass breakpoints from 0 to 80 in steps of 5, so each bin is 5 years wide.
hist(df$Age,breaks=seq(0,80,by=5),col='lightgreen')
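The breaks argument behaves differently depending on what is passed. A vector gives the exact bin boundaries and must span the whole range of the data, otherwise hist() stops with an error. A single number is only a suggestion for the number of bins. The sketch below compares the two on synthetic data (the vector `ages` is made up for illustration) and uses plot = FALSE to inspect the bins without drawing.

```r
# Synthetic ages, generated only for illustration (not the mall dataset)
set.seed(7)
ages <- round(runif(200, min = 18, max = 70))

# Exact breakpoints: 0, 5, 10, ..., 80 gives 16 bins of width 5
exact <- hist(ages, breaks = seq(0, 80, by = 5), plot = FALSE)
exact$breaks
length(exact$counts)

# A single number is treated as a suggestion; R picks "pretty" boundaries
suggested <- hist(ages, breaks = 10, plot = FALSE)
suggested$breaks
```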
Example 8: Overlay Histogram with Density Line
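To overlay a density curve on a histogram, use the density() function to compute the non-parametric (kernel) density estimate of the distribution and the lines() function to draw it. The histogram is drawn with freq = FALSE so that its y-axis shows densities rather than counts and matches the scale of the curve. The polygon() call then shades the area under the curve with a semi-transparent color.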
hist(df$Age,
col="lightblue1",
freq = FALSE)
lines(density(df$Age))
polygon(density(df$Age),
col=rgb(1,0,1,.2))
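The reason freq = FALSE is needed can be checked numerically: on the density scale, the areas of the bars add up to 1, which is the same scale a density curve uses. The following sketch verifies this on synthetic data (the vector `ages` is made up for illustration).

```r
# Synthetic ages, generated only for illustration (not the mall dataset)
set.seed(3)
ages <- round(rnorm(200, mean = 40, sd = 12))

# Histogram on the density scale
h <- hist(ages, freq = FALSE, col = "lightblue1",
          main = "Synthetic ages", xlab = "Age")

# Total bar area: bar height times bin width, summed over all bins
bar_area <- sum(h$density * diff(h$breaks))
bar_area

# Kernel density estimate drawn on top of the bars
d <- density(ages)
lines(d, lwd = 2)
```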