This dataset was taken from the list of OpenIntro dataset files found at https://www.openintro.org/data/.
OpenIntro features a number of free books that can be used in high school and AP statistics courses. The license on these datasets is currently unknown. You can find out more about OpenIntro at https://www.openintro.org.
Photo classifications: fashion or not
This is a simulated data set that compares a machine learning algorithm's photo classifications with the true classifications of those photos. While the data are not real, they resemble the performance one could reasonably expect from a well-built classifier.
- mach_learn - The prediction by the machine learning system as to whether the photo is about fashion or not.
- truth - The actual classification of the photo by a team of humans.
The hypothetical ML algorithm has a precision of about 90%, meaning that of the photos it claims are about fashion, about 90% actually are. The recall of the ML algorithm is about 64%, meaning that of the photos that are about fashion, it correctly predicts about 64% of them as fashion.
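As a rough illustration of how precision and recall relate to the two variables, the sketch below counts true positives, false positives, and false negatives from paired mach_learn and truth values. The label strings ("pred_fashion", "pred_not", "fashion", "not") and the row counts are assumptions made up for this example; they are not taken from the data file, so check the actual level names before reusing it.

```python
def precision_recall(mach_learn, truth,
                     pred_positive="pred_fashion", truth_positive="fashion"):
    """Return (precision, recall) for paired predictions and true labels.

    The default label strings are assumed for illustration only.
    """
    tp = fp = fn = 0
    for pred, actual in zip(mach_learn, truth):
        predicted_fashion = pred == pred_positive
        actually_fashion = actual == truth_positive
        if predicted_fashion and actually_fashion:
            tp += 1  # predicted fashion, and it is fashion
        elif predicted_fashion:
            fp += 1  # predicted fashion, but it is not
        elif actually_fashion:
            fn += 1  # fashion photo the system missed
    # Precision: share of predicted-fashion photos that are truly fashion.
    precision = tp / (tp + fp) if tp + fp else float("nan")
    # Recall: share of truly-fashion photos that were predicted as fashion.
    recall = tp / (tp + fn) if tp + fn else float("nan")
    return precision, recall


# Made-up rows (NOT the real data): 90 true positives, 10 false positives,
# 50 false negatives, 850 true negatives.
mach_learn = ["pred_fashion"] * 100 + ["pred_not"] * 900
truth = ["fashion"] * 90 + ["not"] * 10 + ["fashion"] * 50 + ["not"] * 850

precision, recall = precision_recall(mach_learn, truth)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

With these made-up counts the precision is 90 / 100 and the recall is 90 / 140, which lands near the 90% and 64% figures described above.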
The data are simulated / hypothetical.
Taken from: https://www.openintro.org/data/index.php?data=photo_classify.