R Dataset / Package HistData / PearsonLee
Attachment | Size |
---|---|
dataset-42701.csv | 26.88 KB |
Documentation |
---|
On this Picostat.com statistics page, you will find information about the PearsonLee data set, which contains Pearson and Lee's data on the heights of parents and children classified by gender. The PearsonLee data set is found in the HistData R package. To load it in R, issue the command data("PearsonLee") at the console. This loads the data into a variable called PearsonLee. If R says the PearsonLee data set is not found, install the package with install.packages("HistData"), then reload the data with library("HistData") followed by data("PearsonLee"). Perhaps strangely, if R gives you no output after you enter a command, the command succeeded. You can then see the data by typing PearsonLee at the command line, which displays the entire dataset.

If you need to download R, you can go to the R project website. You can download a CSV (comma-separated values) version of the PearsonLee R data set. The size of this file is about 27,525 bytes.

Pearson and Lee's data on the heights of parents and children classified by gender

Description
Wachsmuth et al. (2003) noticed that a loess smooth through Galton's data on heights of mid-parents and their offspring exhibited a slightly non-linear trend, and asked whether this might be due to Galton having pooled the heights of fathers and mothers and sons and daughters in constructing his tables and graphs. To answer this question, they used analogous data from English families at about the same time, tabulated by Karl Pearson and Alice Lee (1896, 1903), but where the heights of parents and children were each classified by gender of the parent.

Usage
data(PearsonLee)

Format
A frequency data frame with 746 observations on the following 6 variables.
Details
The variables

Source
Pearson, K. and Lee, A. (1896). Mathematical contributions to the theory of evolution. On telegony in man, etc. Proceedings of the Royal Society of London, 60, 273-283.
Pearson, K. and Lee, A. (1903). On the laws of inheritance in man: I. Inheritance of physical characters. Biometrika, 2(4), 357-462. (Tables XXII, p. 415; XXV, p. 417; XXVIII, p. 419 and XXXI, p. 421.)

References
Wachsmuth, A.W., Wilkinson L., Dallal G.E. (2003). Galton's bend: A previously undiscovered nonlinearity in Galton's family stature regression data. The American Statistician, 57, 190-192. http://www.cs.uic.edu/~wilkinson/Publications/galton.pdf

See Also
Examples

    data(PearsonLee)
    str(PearsonLee)

    with(PearsonLee, {
      lim <- c(55, 80)
      xv <- seq(55, 80, .5)
      sunflowerplot(parent, child, number = frequency, xlim = lim, ylim = lim,
                    seg.col = "gray", size = .1)
      abline(lm(child ~ parent, weights = frequency), col = "blue", lwd = 2)
      lines(xv, predict(loess(child ~ parent, weights = frequency),
                        data.frame(parent = xv)),
            col = "blue", lwd = 2)
      # NB: dataEllipse doesn't take frequency into account
      if (require(car)) {
        dataEllipse(parent, child, xlim = lim, ylim = lim, plot.points = FALSE)
      }
    })

    ## separate plots for combinations of (chl, par)
    # this doesn't quite work, because xyplot can't handle weights
    require(lattice)
    xyplot(child ~ parent | par + chl, data = PearsonLee,
           type = c("p", "r", "smooth"), col.line = "red")

    # Using ggplot [thx: Dennis Murphy]
    require(ggplot2)
    ggplot(PearsonLee, aes(x = parent, y = child, weight = frequency)) +
      geom_point(size = 1.5, position = position_jitter(width = 0.2)) +
      geom_smooth(method = lm, aes(weight = frequency, colour = 'Linear'),
                  se = FALSE, size = 1.5) +
      geom_smooth(aes(weight = frequency, colour = 'Loess'),
                  se = FALSE, size = 1.5) +
      facet_grid(chl ~ par) +
      scale_colour_manual(breaks = c('Linear', 'Loess'),
                          values = c('green', 'red')) +
      theme(legend.position = c(0.14, 0.885),
            legend.background = element_rect(fill = 'white'))

    # inverse regression, as in Wachsmuth et al. (2003)
    ggplot(PearsonLee, aes(x = child, y = parent, weight = frequency)) +
      geom_point(size = 1.5, position = position_jitter(width = 0.2)) +
      geom_smooth(method = lm, aes(weight = frequency, colour = 'Linear'),
                  se = FALSE, size = 1.5) +
      geom_smooth(aes(weight = frequency, colour = 'Loess'),
                  se = FALSE, size = 1.5) +
      facet_grid(chl ~ par) +
      scale_colour_manual(breaks = c('Linear', 'Loess'),
                          values = c('green', 'red')) +
      theme(legend.position = c(0.14, 0.885),
            legend.background = element_rect(fill = 'white'))

-- Dataset imported from https://www.r-project.org. |
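The plots in the examples compare linear and loess fits visually. As a numeric complement, the following minimal sketch fits a frequency-weighted linear regression separately for each parent/child combination and prints the slopes. It is not part of the HistData documentation; it assumes the HistData package is installed, and the names groups, fits and slopes are introduced here only for illustration.

```r
# Assumption: the HistData package is already installed.
data("PearsonLee", package = "HistData")

# Split the frequency table into the four parent/child combinations
# defined by the factors par and chl.
groups <- split(PearsonLee, list(PearsonLee$par, PearsonLee$chl))

# Fit child ~ parent within each group, weighting each row of the
# frequency table by its frequency.
fits <- lapply(groups, function(d) {
  lm(child ~ parent, data = d, weights = frequency)
})

# Extract the slope on parent height from each fitted model.
slopes <- sapply(fits, function(m) coef(m)[["parent"]])
print(round(slopes, 3))
```

Because the weights come from the frequency column, each fit treats a table row as representing that many families, which is the same weighting the sunflowerplot and ggplot examples use.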