R Dataset / Package HistData / ZeaMays

Attachment: dataset-75564.csv (350 bytes)
Dataset License: GNU General Public License v2.0
Documentation License: GNU General Public License v2.0
R Dataset Help

This Picostat.com page describes the ZeaMays data set, Darwin's Heights of Cross- and Self-fertilized Zea May Pairs, which is part of the HistData R package. To load it in R, run data("ZeaMays") at the console; this creates a data frame named ZeaMays. If R reports that the data set is not found, install the package with install.packages("HistData"), attach it with library(HistData), and load the data again. R itself can be downloaded from the R project website. A CSV (comma-separated values) version of the ZeaMays data set is also available for download; the file is about 350 bytes.

Documentation

Darwin's Heights of Cross- and Self-fertilized Zea May Pairs

Description

Darwin (1876) studied the growth of pairs of Zea mays (maize, or corn) seedlings, one produced by cross-fertilization and the other by self-fertilization, but otherwise grown under identical conditions. His goal was to demonstrate the greater vigour of the cross-fertilized plants. The data recorded are the final heights (in inches, to the nearest 1/8 inch) of the plants in each pair.

In The Design of Experiments, Fisher (1935) used these data to illustrate a paired t-test (equivalently, a one-sample test on the mean difference, cross - self). Later in the book (section 21), he used the same data to illustrate an early example of a non-parametric permutation test, treating each paired difference as having (randomly) either a positive or negative sign.
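The equivalence between the paired t-test and the one-sample test on the differences can be checked directly. The following is a minimal sketch; the heights are hypothetical toy values chosen for illustration, not Darwin's measurements, and the variable names are assumptions of this example.

```r
# Hypothetical toy heights (NOT Darwin's measurements), used only to show the equivalence.
cross <- c(20.5, 18.0, 22.25, 19.0, 21.5, 17.75)
self  <- c(18.0, 18.5, 20.0, 16.25, 19.0, 18.0)
d <- cross - self

# Paired t-test on the two columns.
paired <- t.test(cross, self, paired = TRUE)

# One-sample t-test on the pairwise differences.
one_sample <- t.test(d, mu = 0)

# The same statistic by hand: mean difference divided by its standard error.
t_manual <- mean(d) / (sd(d) / sqrt(length(d)))

# All three agree up to floating-point tolerance.
print(all.equal(unname(paired$statistic), unname(one_sample$statistic)))
print(all.equal(unname(paired$statistic), t_manual))
```

With the package data loaded, the corresponding call would be with(ZeaMays, t.test(cross, self, paired = TRUE)), which should match with(ZeaMays, t.test(diff)).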

Usage

data(ZeaMays)

Format

A data frame with 15 observations on the following 5 variables.

pair

pair number, a numeric vector

pot

pot, a factor with levels 1 2 3 4

cross

height of cross fertilized plant, a numeric vector

self

height of self fertilized plant, a numeric vector

diff

cross - self for each pair

Details

In addition to the standard paired t-test, several types of non-parametric tests can be contemplated:

(a) Permutation test, where the values of, say, self are permuted and diff = cross - self is calculated for each permutation. There are 15! permutations, but a reasonably large number of random permutations would suffice. However, this approach does not take the pairing of the samples into account.

(b) Permutation test based on assigning each abs(diff) a + or - sign and calculating mean(diff). There are 2^15 such sign assignments. This is essentially what Fisher proposed. The two-sided p-value for the test is the proportion of absolute mean differences under such randomization that exceed the observed mean difference.

(c) Wilcoxon signed rank test: tests the hypothesis that the median signed rank of diff is zero, or that the distribution of diff is symmetric about 0, vs. a location-shifted alternative.
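Options (b) and (c) can be sketched in a few lines of base R. The helper names sign_flip_test and signed_rank_V and the toy differences below are hypothetical, introduced only for illustration. This sketch counts ties with >=, a common convention that differs slightly from the strict "exceed" wording above.

```r
# Exact sign-flip permutation test for a vector of paired differences d.
# Enumerates all 2^n sign assignments, so keep n small (n = 15 gives 32768 rows).
sign_flip_test <- function(d) {
  n <- length(d)
  # Each row is one assignment of -1 / +1 signs to the n absolute differences.
  signs <- as.matrix(expand.grid(rep(list(c(-1, 1)), n)))
  perm_means <- as.vector(signs %*% abs(d)) / n
  obs <- mean(d)
  tol <- 1e-12  # tolerance so floating-point ties count as ties
  list(
    observed = obs,
    p_upper  = mean(perm_means >= obs - tol),
    p_two    = mean(abs(perm_means) >= abs(obs) - tol)
  )
}

# Wilcoxon signed rank statistic V: sum of the ranks of |d| over the positive d.
signed_rank_V <- function(d) {
  sum(rank(abs(d))[d > 0])
}

# Hypothetical toy differences (NOT Darwin's data).
d_toy <- c(1.5, -0.5, 2, 3, -1, 2.5)

res <- sign_flip_test(d_toy)
V   <- signed_rank_V(d_toy)
print(res)
print(V)

# V should agree with the statistic from the one-sample wilcox.test().
print(unname(wilcox.test(d_toy)$statistic))
```

With the package data loaded, the same helpers could be applied as sign_flip_test(ZeaMays$diff) and signed_rank_V(ZeaMays$diff).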

Source

Darwin, C. (1876). The Effect of Cross- and Self-fertilization in the Vegetable Kingdom, 2nd Ed. London: John Murray.

Andrews, D. and Herzberg, A. (1985) Data: a collection of problems from many fields for the student and research worker. New York: Springer. Data retrieved from: https://www.stat.cmu.edu/StatDat/

References

Fisher, R. A. (1935). The Design of Experiments. London: Oliver & Boyd.

See Also

wilcox.test

independence_test in the coin package, a general framework for conditional inference procedures (permutation tests)

Examples

data(ZeaMays)

##################################
## Some preliminary exploration ##
##################################
boxplot(ZeaMays[,c("cross", "self")], ylab="Height (in)", xlab="Fertilization")

# examine large individual differences
largediff <- subset(ZeaMays, abs(diff) > 2*sd(abs(diff)))
with(largediff, segments(1, cross, 2, self, col="red"))

# plot cross vs. self.  NB: unusual trend and some unusual points
with(ZeaMays, plot(self, cross, pch=16, cex=1.5))
abline(lm(cross ~ self, data=ZeaMays), col="red", lwd=2)

# pot effects ?
anova(lm(diff ~ pot, data=ZeaMays))

##############################
## Tests of mean difference ##
##############################
# Wilcoxon signed rank test
# signed ranks:
with(ZeaMays, sign(diff) * rank(abs(diff)))
# paired=TRUE gives the signed rank test; without it, wilcox.test() runs the two-sample rank sum test
wilcox.test(ZeaMays$cross, ZeaMays$self, paired=TRUE, conf.int=TRUE, exact=FALSE)

# t-tests
with(ZeaMays, t.test(cross, self))   # Welch two-sample test (ignores the pairing)
with(ZeaMays, t.test(diff))          # one-sample test on the differences (paired analysis)

mean(ZeaMays$diff)

# complete permutation distribution of diff, for all 2^15 ways of assigning
# one value to cross and the other to self (thx: Bert Gunter)
N <- nrow(ZeaMays)
allmeans <- as.matrix(expand.grid(as.data.frame(
                         matrix(rep(c(-1,1),N), nrow=2))))  %*% abs(ZeaMays$diff) / N

# upper-tail p-value
sum(allmeans > mean(ZeaMays$diff)) / 2^N
# two-tailed p-value
sum(abs(allmeans) > mean(ZeaMays$diff)) / 2^N

hist(allmeans, breaks=64, xlab="Mean difference, cross-self",
     main="Histogram of all mean differences")
abline(v=c(1, -1)*mean(ZeaMays$diff), col="red", lwd=2, lty=1:2)

plot(density(allmeans), xlab="Mean difference, cross-self",
     main="Density plot of all mean differences")
abline(v=c(1, -1)*mean(ZeaMays$diff), col="red", lwd=2, lty=1:2)
Dataset imported from https://www.r-project.org.
