Graphical representation of the distribution of numerical data / From Wikipedia, the free encyclopedia


A histogram is an approximate representation of the distribution of numerical data. The term was first introduced by Karl Pearson.[1] To construct a histogram, the first step is to "bin" (or "bucket") the range of values—that is, divide the entire range of values into a series of intervals—and then count how many values fall into each interval. The bins are usually specified as consecutive, non-overlapping intervals of a variable. The bins (intervals) must be adjacent and are often (but not required to be) of equal size.[2]
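The binning-and-counting step can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `histogram_counts`, the sample data, and the convention that each bin is half-open with the last bin closed on the right are assumptions made for the example, and it assumes the data contain at least two distinct values.

```python
def histogram_counts(values, num_bins):
    """Count how many values fall into each of num_bins equal-width bins.

    Assumes max(values) > min(values), so the bin width is non-zero.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    # num_bins adjacent, non-overlapping intervals need num_bins + 1 edges.
    edges = [lo + i * width for i in range(num_bins + 1)]
    counts = [0] * num_bins
    for v in values:
        # Bins are [edge_i, edge_{i+1}); the maximum value is placed
        # in the last bin rather than in a bin of its own.
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    return edges, counts


data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]  # illustrative sample
edges, counts = histogram_counts(data, 4)
print(edges)   # bin boundaries
print(counts)  # number of values in each bin
```

Every value lands in exactly one bin, so the counts always add up to the number of observations.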

Quick facts: Histogram
One of the Seven Basic Tools of Quality
First described by: Karl Pearson
Purpose: To roughly assess the probability distribution of a given variable by depicting the frequencies of observations occurring in certain ranges of values.

If the bins are of equal size, a bar is drawn over the bin with height proportional to the frequency—the number of cases in each bin. A histogram may also be normalized to display "relative" frequencies showing the proportion of cases that fall into each of several categories, with the sum of the heights equaling 1.
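As a small sketch of the normalization described above, each count can be divided by the total number of cases to obtain relative frequencies. The function name `relative_frequencies` and the example counts are hypothetical, chosen only for illustration.

```python
def relative_frequencies(counts):
    """Convert raw bin counts into proportions of the total."""
    total = sum(counts)
    return [c / total for c in counts]


counts = [3, 5, 1, 1]  # illustrative bin counts
relative = relative_frequencies(counts)
print(relative)
print(sum(relative))  # the bar heights sum to 1, up to rounding
```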

However, bins need not be of equal width; in that case, the rectangle erected over each bin is defined to have its area proportional to the frequency of cases in the bin.[3] The vertical axis is then not the frequency but the frequency density, that is, the number of cases per unit of the variable on the horizontal axis. Examples of variable bin widths are shown for the Census Bureau data below.
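For unequal bins, the bar height can be computed as the count divided by the bin width, so that area (height times width) recovers the count. The sketch below assumes hypothetical bin edges and counts; `frequency_density` is a name introduced only for this example.

```python
def frequency_density(edges, counts):
    """Height of each bar: number of cases per unit of the x-axis variable."""
    return [
        c / (right - left)
        for c, left, right in zip(counts, edges[:-1], edges[1:])
    ]


edges = [0, 10, 20, 50]  # illustrative bins of width 10, 10 and 30
counts = [20, 30, 30]    # illustrative cases per bin
density = frequency_density(edges, counts)

# Area of each rectangle (height * width) equals the bin's frequency.
areas = [d * (r - l) for d, l, r in zip(density, edges[:-1], edges[1:])]
print(density)
print(areas)
```

Here the second and third bins hold the same number of cases, but the third bar is one third as tall because its bin is three times as wide.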

As the adjacent bins leave no gaps, the rectangles of a histogram touch each other to indicate that the original variable is continuous.[4]

Histograms give a rough sense of the density of the underlying distribution of the data, and are often used for density estimation: estimating the probability density function of the underlying variable. The total area of a histogram used for probability density is always normalized to 1. If the lengths of the intervals on the x-axis are all 1, then a histogram is identical to a relative frequency plot.
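The normalization to unit area can be illustrated by dividing each count by the total number of cases times the bin width. This is a sketch under assumed inputs: `probability_density`, the bin edges and the counts are hypothetical. The second example checks the special case of unit-width bins, where the density heights coincide with the relative frequencies.

```python
def probability_density(edges, counts):
    """Bar heights scaled so that the total area of the histogram is 1."""
    n = sum(counts)
    return [
        c / (n * (right - left))
        for c, left, right in zip(counts, edges[:-1], edges[1:])
    ]


# Unequal bins: heights differ from relative frequencies, but area sums to 1.
edges = [0, 10, 20, 50]
counts = [20, 30, 30]
heights = probability_density(edges, counts)
total_area = sum(
    h * (r - l) for h, l, r in zip(heights, edges[:-1], edges[1:])
)
print(total_area)

# Unit-width bins: density heights equal the relative frequencies.
unit_edges = [0, 1, 2, 3]
unit_counts = [2, 5, 3]
unit_heights = probability_density(unit_edges, unit_counts)
unit_relative = [c / sum(unit_counts) for c in unit_counts]
print(unit_heights == unit_relative)
```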

The histogram is one of the seven basic tools of quality control.[5]

Histograms are sometimes confused with bar charts. A histogram is used for continuous data, where the bins represent ranges of data, while a bar chart is a plot of categorical variables. Some authors recommend that bar charts have gaps between the rectangles to clarify the distinction.[6][7]

A bar graph and a histogram are two common types of graphical representations of data. While they may look similar, there are some key differences between the two that are important to understand.

A bar graph is a chart that uses bars to represent the frequency or quantity of different categories of data. The bars can be either vertical or horizontal, and they are typically placed side by side to make it easy to compare the different categories. Bar graphs are useful for displaying data that can be divided into discrete categories, such as the number of students in different grade levels at a school.
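By way of contrast with binning numerical data, the bar heights of a bar graph come from counting occurrences of each category directly; there are no intervals to choose. The sketch below uses Python's standard `collections.Counter`; the list of grade levels is made up for illustration.

```python
from collections import Counter

# Illustrative categorical data: the grade level of each student.
grades = ["9th", "10th", "9th", "11th", "10th", "9th"]

# One bar per distinct category; the height is that category's count.
bar_heights = Counter(grades)
for category, height in bar_heights.items():
    print(category, height)
```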

A histogram, on the other hand, is a graph that shows the distribution of numerical data. It resembles a bar chart, but its bars show the frequency, or number of observations, within consecutive, non-overlapping numerical ranges called bins. This gives a visual representation of how the data are distributed, which can be useful for identifying patterns and trends in the data, and for making comparisons between different datasets.[8]