This Picostat.com statistics page describes the peas data set, which pertains to Galton's Peas. The peas data set is found in the psych R package. To load it, attach the package with library(psych) and issue data("peas") at the R console, or issue data("peas", package = "psych") without attaching the package. Either command loads the data into a variable called peas. If R reports that the peas data set is not found, try installing the package with install.packages("psych") and then reload the data. If you need to download R, you can go to the R project website. A CSV (comma-separated values) version of the peas data set is also available for download; the file is about 6,317 bytes.
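The loading steps above can be sketched as a small helper. The function name load_peas is hypothetical and not part of psych; the sketch assumes the installed psych version still ships peas, as this page states, and returns NULL instead of failing when it does not.

```r
# Return the peas data frame, or NULL when it cannot be found.
# Assumption: the installed psych version still contains peas.
load_peas <- function(pkg = "psych") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    message("Package '", pkg, "' is not installed; try install.packages(\"", pkg, "\")")
    return(NULL)
  }
  env <- new.env()
  # data() warns rather than errors when the data set is absent
  suppressWarnings(data(list = "peas", package = pkg, envir = env))
  if (!exists("peas", envir = env, inherits = FALSE)) {
    return(NULL)
  }
  get("peas", envir = env)
}

peas <- load_peas()
if (!is.null(peas)) {
  str(peas)  # the page describes 700 observations on 2 variables
}
```

Loading into a private environment first keeps a failed lookup from leaving a half-defined peas variable in the workspace.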
Francis Galton introduced the correlation coefficient with an analysis of the similarities of the parent and child generation of 700 sweet peas.
A data frame with 700 observations on the following 2 variables:
parent: The mean diameter of the mother pea for 700 sweet peas
child: The mean diameter of the daughter pea for 700 sweet peas
Galton's introduction of the correlation coefficient was perhaps the most important contribution to the study of individual differences. The data set lends itself to graphical analysis. There are two graphic examples: one shows the regression lines for both relationships (child on parent and parent on child), and the other reports the correlation as well.
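The graphical analysis described above can be approximated in base R. This is a minimal sketch, not the package's own example code: the helper names both_regressions and plot_both_lines are hypothetical, and the column names parent and child are assumed to match the peas data frame.

```r
# Slopes of both regressions plus the correlation, for a data frame
# with columns `parent` and `child` (assumed names).
both_regressions <- function(d) {
  fit_child  <- lm(child ~ parent, data = d)   # child predicted from parent
  fit_parent <- lm(parent ~ child, data = d)   # parent predicted from child
  list(
    child_on_parent = unname(coef(fit_child)["parent"]),
    parent_on_child = unname(coef(fit_parent)["child"]),
    r = cor(d$parent, d$child)
  )
}

# Scatterplot with both regression lines drawn on the same axes.
plot_both_lines <- function(d) {
  plot(child ~ parent, data = d, xlab = "Parent diameter", ylab = "Child diameter")
  abline(lm(child ~ parent, data = d), lty = 1)
  # parent = b0 + b1 * child, rewritten as child = -b0/b1 + (1/b1) * parent
  b <- coef(lm(parent ~ child, data = d))
  abline(a = -b[[1]] / b[[2]], b = 1 / b[[2]], lty = 2)
  invisible(NULL)
}

# Usage, guarded so the script still runs when psych or peas is unavailable.
if (requireNamespace("psych", quietly = TRUE)) {
  suppressWarnings(data(list = "peas", package = "psych"))
  if (exists("peas", inherits = FALSE) &&
      all(c("parent", "child") %in% names(peas))) {
    print(both_regressions(peas))
    plot_both_lines(peas)
  }
}
```

The two slopes are linked to the correlation: their product equals the squared correlation coefficient, which is why the two regression lines coincide only when the correlation is perfect.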
Stanton, Jeffrey M. (2001) Galton, Pearson, and the Peas: A brief history of linear regression for statistics instructors, Journal of Statistics Education, 9 (retrieved from http://www.amstat.org/publications/jse/v9n3/stanton.html) reproduces the table from Galton, 1894, Table 2.
The data were generated from this table.
Galton, Francis (1877) Typical laws of heredity. Paper presented to the weekly evening meeting of the Royal Institution, London. Volume VIII (66) is the first reference to this data set. The data appear in Galton, Francis (1894) Natural Inheritance (5th Edition), New York: MacMillan.
The other Galton data sets:
Dataset imported from https://www.r-project.org.