R Dataset / Package robustbase / salinity
Attachment | Size |
---|---|
dataset-13260.csv | 506 bytes |
Documentation |
---|
This Picostat.com page documents the salinity data set (Salinity Data), which is part of the robustbase R package. To load it in R, run library("robustbase") followed by data("salinity") at the console; this loads the data into a variable called salinity. If R reports that the package or the data set is not found, install the package with install.packages("robustbase") and then repeat the two commands. Perhaps strangely, R prints nothing when a command such as data("salinity") succeeds. Once the data are loaded, typing salinity at the console displays the entire data set. If you need to download R, you can go to the R project website. You can also download a CSV (comma-separated values) version of the salinity data set; the file is about 506 bytes.

Salinity Data

Description

This is a data set consisting of measurements of water salinity (i.e., its salt concentration) and river discharge taken in North Carolina's Pamlico Sound, recording some bi-weekly averages in March, April, and May from 1972 to 1977. The data set was listed by Ruppert and Carroll (1980). Carroll and Ruppert (1985) describe the physical background of the data. They indicated that observations 5 and 16 correspond to periods of very heavy discharge, and showed that the discrepant observation 5 was masked by observations 3 and 16: only after deleting these two observations was it possible to identify the influential observation 5. This data set is a prime example of the masking effect.

Usage

    data(salinity)

Format

A data frame with 28 observations on the following 4 variables (in parentheses are the names used in the 1980 reference).
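The variable list is not shown on this page, but the structure of the data frame can be checked directly in R. The following is a minimal sketch, assuming the robustbase package is already installed; it loads the data without attaching the package and prints the column names, dimensions, and first rows.

```r
# Assumption: robustbase is installed; otherwise run
# install.packages("robustbase") first.
data("salinity", package = "robustbase")  # creates a data frame named `salinity`

str(salinity)   # column names and types
dim(salinity)   # number of rows and columns
head(salinity)  # first six observations
```

The column names reported by str() are the ones used in the examples further down (Y as the response, X3 as the discharge).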

Note

The boot package contains another version of this salinity data set, also attributed to Ruppert and Carroll (1980), but with two clear transcription errors; see the examples.

Source

P. J. Rousseeuw and A. M. Leroy (1987) Robust Regression and Outlier Detection; Wiley, p.82, table 5.

Ruppert, D. and Carroll, R.J. (1980) Trimmed least squares estimation in the linear model. JASA 75, 828–838; table 3, p.835.

Carroll, R.J. and Ruppert, D. (1985) Transformations in regression: A robust analysis. Technometrics 27, 1–12.

Examples

    library(robustbase)  # provides salinity, ltsReg() and covMcd()
    data(salinity)

    ## Least squares, M-estimation and least trimmed squares fits
    summary(lm.sali  <-        lm(Y ~ . , data = salinity))
    summary(rlm.sali <- MASS::rlm(Y ~ . , data = salinity))
    summary(lts.sali <-    ltsReg(Y ~ . , data = salinity))

    ## Robust covariance (MCD) of the three predictors
    salinity.x <- data.matrix(salinity[, 1:3])
    c_sal <- covMcd(salinity.x)
    plot(c_sal, "tolEllipsePlot")

    ## Connection with boot package's version :
    if(requireNamespace("boot")) { ## 'always'
      print( head(boot.sal <- boot::salinity         ) )
      print( head(robb.sal <- salinity [, c(4, 1:3)]) ) # difference: has one digit more
      ## Otherwise the same ?
      dimnames(robb.sal) <- dimnames(boot.sal)
      ## apart from the 4th column, they are "identical":
      stopifnot( all.equal(boot.sal[, -4], robb.sal[, -4], tolerance = 1e-15) )

      ## But the discharge ('X3', 'dis' or 'H2OFLOW')  __differs__ in two places:
      plot(cbind(robustbase = robb.sal[,4], boot = boot.sal[,4]))
      abline(0,1, lwd=3, col=adjustcolor("red", 1/4))
      D.sal <- robb.sal[,4] - boot.sal[,4]
      stem(D.sal)
      which(abs(D.sal) > 0.01) ## 2 8
      ## *two* typos (=> difference ~= 1) in the version of 'boot': obs. 2 & 8 !!!
      cbind(robb = robb.sal[,4], boot = boot.sal[,4], D.sal)
    }# boot

Dataset imported from https://www.r-project.org.
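
The masking effect described for observations 3, 5 and 16 can be probed with a short sketch. This is only an illustration under stated assumptions: it uses Cook's distance from an ordinary least-squares fit as a convenient influence measure, which is not necessarily the diagnostic used in the cited papers, and the object names fit_all, fit_sub and fit_lts are chosen here for illustration. It assumes the robustbase package is installed.

```r
# Assumption: robustbase is installed.
data("salinity", package = "robustbase")

# Least-squares fit on all 28 observations
fit_all <- lm(Y ~ ., data = salinity)

# Refit without observations 3 and 16, which the description says
# mask observation 5
fit_sub <- lm(Y ~ ., data = salinity[-c(3, 16), ])

# Cook's distance of observation 5 in each fit
# (row names are preserved when rows are dropped, so "5" still refers
# to the original observation 5)
cooks.distance(fit_all)["5"]
cooks.distance(fit_sub)["5"]

# For comparison: observations given zero weight by a least trimmed
# squares fit, i.e. those it treats as outlying
fit_lts <- robustbase::ltsReg(Y ~ ., data = salinity)
which(fit_lts$lts.wt == 0)
```

Comparing the two Cook's distances shows how much the apparent influence of observation 5 changes once observations 3 and 16 are removed, while the robust fit flags outlying observations without any manual deletion.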