Haberman's Survival data
The data set (and description) can be downloaded here:
http://archive.ics.uci.edu/ml/machine-learning-databases/haberman/haberman.data
Description:
1. Title: Haberman's Survival Data
2. Sources:
   (a) Donor: Tjen-Sien Lim (limt@stat.wisc.edu)
   (b) Date: March 4, 1999
3. Past Usage:
   1. Haberman, S. J. (1976). Generalized Residuals for Log-Linear
      Models, Proceedings of the 9th International Biometrics
      Conference, Boston, pp. 104-122.
   2. Landwehr, J. M., Pregibon, D., and Shoemaker, A. C. (1984),
      Graphical Models for Assessing Logistic Regression Models (with
      discussion), Journal of the American Statistical Association 79:
      61-83.
   3. Lo, W.-D. (1993). Logistic Regression Trees, PhD thesis,
      Department of Statistics, University of Wisconsin, Madison, WI.
4. Relevant Information:
   The dataset contains cases from a study that was conducted between
   1958 and 1970 at the University of Chicago's Billings Hospital on
   the survival of patients who had undergone surgery for breast
   cancer.
5. Number of Instances: 306
6. Number of Attributes: 4 (including the class attribute)
7. Attribute Information:
   1. Age of patient at time of operation (numerical)
   2. Patient's year of operation (year - 1900, numerical)
   3. Number of positive axillary nodes detected (numerical)
   4. Survival status (class attribute)
      1 = the patient survived 5 years or longer
      2 = the patient died within 5 years
8. Missing Attribute Values: None
Citation Request:
Please refer to the repository http://archive.ics.uci.edu/ml (see citation policy).
See also Frank, A. & Asuncion, A. (2010). UCI Machine Learning Repository
[http://archive.ics.uci.edu/ml].
Irvine, CA: University of California, School of Information and Computer Science.
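Reading the file:
The attribute list in the description maps directly onto a small record type. The following Python sketch is an illustration only. It assumes that the file stores one patient per line as four comma-separated integers in the order given under "Attribute Information", which the description does not state explicitly. The names Patient, parse_haberman and SAMPLE are hypothetical, and the rows in SAMPLE were written for this sketch rather than quoted from the file.

```python
import csv
import io
from typing import NamedTuple


class Patient(NamedTuple):
    age: int            # age at time of operation
    op_year: int        # full calendar year (the file stores year - 1900)
    nodes: int          # number of positive axillary nodes detected
    survived_5y: bool   # True for class 1, False for class 2


def parse_haberman(text):
    """Parse comma-separated rows (assumed layout: age, year, nodes, class)."""
    patients = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        age, year, nodes, status = (int(value) for value in row)
        if status not in (1, 2):
            raise ValueError(f"unexpected class label: {status}")
        patients.append(Patient(age, 1900 + year, nodes, status == 1))
    return patients


# Illustrative rows in the assumed format, not taken from the real file.
SAMPLE = "45,60,0,1\n52,66,4,2\n61,63,1,1\n"

patients = parse_haberman(SAMPLE)
n_class1 = sum(p.survived_5y for p in patients)
n_class2 = len(patients) - n_class1
print(len(patients), n_class1, n_class2)
```

Applied to the contents of the real file, the three counts would be expected to correspond to n= 306, Class1: n= 225 and Class2: n= 81 in the statistics below.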
Descriptive statistics:
Dataset= haberman : n= 306 , d= 3
Class1: n= 225
Covariance matrix:
[,1] [,2] [,3]
[1,] 121.2675 6.2748 -5.5498
[2,] 6.2748 10.3872 0.7077
[3,] -5.5498 0.7077 34.4606
Correlation matrix:
[,1] [,2] [,3]
[1,] 1.0000 0.1768 -0.0859
[2,] 0.1768 1.0000 0.0374
[3,] -0.0859 0.0374 1.0000
Median: 51.8281 62.6214 1.7184
Mean: 52.0178 62.8622 2.7911
MCD-estimated:
MCD-0.975-Mean: 54.0769 63.1966 0
MCD-0.750-Mean: 54.0769 63.1966 0
MCD-0.500-Mean: 54.0769 63.1966 0
Class2: n= 81
Covariance matrix:
[,1] [,2] [,3]
[1,] 103.3707 -5.5437 -8.9390
[2,] -5.5437 11.1698 -2.1951
[3,] -8.9390 -2.1951 84.3762
Correlation matrix:
[,1] [,2] [,3]
[1,] 1.0000 -0.1631 -0.0957
[2,] -0.1631 1.0000 -0.0715
[3,] -0.0957 -0.0715 1.0000
Median: 53.0616 62.9596 6.0796
Mean: 53.679 62.8272 7.4568
MCD-estimated:
MCD-0.975-Mean: 53.7213 62.8197 3.2459
MCD-0.750-Mean: 53.7213 62.8197 3.2459
MCD-0.500-Mean: 53.7213 62.8197 3.2459
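Checking the correlation matrices:
Each correlation matrix above follows from the covariance matrix printed before it: every entry is divided by the product of the two corresponding standard deviations. The Python sketch below redoes this for Class1 from the rounded covariance values. The function name cov_to_corr is hypothetical, and the variable order (age, year of operation, nodes) is inferred from the reported means rather than stated on the page.

```python
import math

# Class1 covariance matrix as reported above
# (apparent variable order: age, year of operation, positive nodes).
COV_CLASS1 = [
    [121.2675, 6.2748, -5.5498],
    [6.2748, 10.3872, 0.7077],
    [-5.5498, 0.7077, 34.4606],
]


def cov_to_corr(cov):
    """Convert a covariance matrix to a correlation matrix."""
    d = len(cov)
    sd = [math.sqrt(cov[i][i]) for i in range(d)]  # standard deviations
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(d)] for i in range(d)]


corr = cov_to_corr(COV_CLASS1)
for row in corr:
    print(" ".join(f"{value:7.4f}" for value in row))
```

The off-diagonal entries agree with the reported Class1 correlations 0.1768, -0.0859 and 0.0374 up to rounding.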
Measures:
Mah.Dist: 0.7096
Mah.Dist-MCD-0.975: 1.3992
Mah.Dist-MCD-0.750: 1.444
Mah.Dist-MCD-0.500: 1.444
All the MCD estimates have been obtained after a slight perturbation of the data set.
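Recomputing the Mahalanobis distance:
The page does not say which covariance matrix enters the Mah.Dist figures. A common choice for the distance between two class means is the pooled within-class covariance, and the Python sketch below makes that assumption using the classical (non-MCD) means and covariances listed above. The function names pooled_covariance, solve and mahalanobis are hypothetical.

```python
import math

# Classical estimates as reported above (rounded to four decimals).
N1, N2 = 225, 81
MEAN1 = [52.0178, 62.8622, 2.7911]
MEAN2 = [53.679, 62.8272, 7.4568]
COV1 = [[121.2675, 6.2748, -5.5498],
        [6.2748, 10.3872, 0.7077],
        [-5.5498, 0.7077, 34.4606]]
COV2 = [[103.3707, -5.5437, -8.9390],
        [-5.5437, 11.1698, -2.1951],
        [-8.9390, -2.1951, 84.3762]]


def pooled_covariance(s1, n1, s2, n2):
    """Pool two sample covariances, weighting by degrees of freedom (assumption)."""
    w1, w2 = n1 - 1, n2 - 1
    return [[(w1 * a + w2 * b) / (w1 + w2) for a, b in zip(r1, r2)]
            for r1, r2 in zip(s1, s2)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def mahalanobis(mean_a, mean_b, cov):
    """sqrt((a - b)^T cov^{-1} (a - b)), without forming the inverse."""
    diff = [p - q for p, q in zip(mean_a, mean_b)]
    x = solve(cov, diff)
    return math.sqrt(sum(d * xi for d, xi in zip(diff, x)))


pooled = pooled_covariance(COV1, N1, COV2, N2)
distance = mahalanobis(MEAN1, MEAN2, pooled)
print(round(distance, 4))
```

With the rounded inputs above, the result should land close to the reported Mah.Dist of 0.7096, which is consistent with a pooled covariance but does not confirm it. The MCD variants cannot be rechecked in the same way, because the page lists only the MCD means and not the MCD covariance matrices.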
Last modified on 17.02.2013