OpenIntro Statistics Dataset - email

How To Create a Barplot

Webform
The Drupal File ID of the selected dataset. The user may load another using the search bar on the operation's page.

How To Create a Stacked Barplot

How To Create a Pie Chart

How To Compute the Mean

How To Create a Plot

How To Compute the Median

Boxplot

Correlation Coefficient

Cumulative Frequency Histogram

Dotplot

Hollow Histogram

Numerical Summaries

Pie Chart

Plot

Regression

Stem and Leaf Plots

Visual Summaries

Attachment Size
dataset-265634117.csv 293.34 KB
Dataset License
Unknown
Documentation License
No license (All rights reserved)
Dataset Help

This dataset was taken from the list of OpenIntro dataset files found at https://www.openintro.org/data/.

OpenIntro features a number of free books that can be used in high school and AP statistics courses. The license on these datasets is currently unknown. You can find out more about OpenIntro at https://www.openintro.org.

email

These data represent incoming emails for the first three months of 2012 for an email account (see Source).

Variables

  • spam - Indicator for whether the email was spam.
  • to_multiple - Indicator for whether the email was addressed to more than one recipient.
  • from - Whether the message was listed as from anyone (this is usually set by default for regular outgoing email).
  • cc - Indicator for whether anyone was CCed.
  • sent_email - Indicator for whether the sender had been sent an email in the last 30 days.
  • time - Time at which email was sent.
  • image - The number of images attached.
  • attach - The number of attached files.
  • dollar - The number of times a dollar sign or the word "dollar" appeared in the email.
  • winner - Indicates whether "winner" appeared in the email.
  • inherit - The number of times "inherit" (or an extension, such as "inheritance") appeared in the email.
  • viagra - The number of times "viagra" appeared in the email.
  • password - The number of times "password" appeared in the email.
  • num_char - The number of characters in the email, in thousands.
  • line_breaks - The number of line breaks in the email (does not count text wrapping).
  • format - Indicates whether the email was written using HTML (e.g. may have included bolding or active links).
  • re_subj - Whether the subject started with "Re:", "RE:", "re:", or "rE:".
  • exclaim_subj - Whether there was an exclamation point in the subject.
  • urgent_subj - Whether the word "urgent" was in the email subject.
  • exclaim_mess - The number of exclamation points in the email message.
  • period_mess - The number of periods in the message.
  • signoff - Whether a sign-off of "Cheers", "Regards", or "Best" (also, "Best Regards") was used.
  • number - Factor variable saying whether there was no number, a small number (under 1 million), or a big number.
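As a rough illustration of how the variables above might be used, the following Python sketch groups emails by the spam indicator and computes the count, mean, and median of a numeric column such as num_char. It makes several assumptions that this page does not confirm: that the CSV has a header row whose column names match the variable names listed above, and that spam is coded as 0 or 1. The rows in SAMPLE_CSV are made up for the example and are not values from the real dataset; summarize_by_spam is a hypothetical helper name.

```python
import csv
import io
from statistics import mean, median

# Made-up rows for illustration only; these are NOT values from the real dataset.
# Assumption: the real CSV has a header row using the variable names listed above.
SAMPLE_CSV = """spam,num_char
0,4.0
0,10.0
0,7.0
1,1.0
1,3.0
"""


def summarize_by_spam(csv_file, column):
    """Return {spam_flag: (count, mean, median)} for one numeric column."""
    groups = {}
    for row in csv.DictReader(csv_file):
        # Group the numeric values by the raw text of the spam column.
        groups.setdefault(row["spam"], []).append(float(row[column]))
    return {
        flag: (len(values), mean(values), median(values))
        for flag, values in groups.items()
    }


summary = summarize_by_spam(io.StringIO(SAMPLE_CSV), "num_char")
for flag, (n, avg, med) in sorted(summary.items()):
    print(f"spam={flag}: n={n}, mean={avg:.3f}, median={med:.3f}")
```

To try this on the downloaded attachment, one could pass an open file handle instead of the in-memory sample, for example open("dataset-265634117.csv", newline=""), again assuming the file has the header row described above.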

Source

David Diez's Gmail account, early months of 2012. All personally identifiable information has been removed.

Taken from: https://www.openintro.org/data/index.php?data=email.

Documentation

No documentation is available yet.
