Indian Liver Patient (1 vs 2) data
The data set (and description) can be downloaded here:
http://archive.ics.uci.edu/ml/machine-learning-databases/00225/
Description:
ILPD (Indian Liver Patient Dataset) Data Set
Abstract: This data set contains 10 variables: age, gender, total
bilirubin, direct bilirubin, total proteins, albumin, A/G ratio, SGPT, SGOT
and Alkphos.
Data Set Characteristics: Multivariate
Number of Instances: 583
Area: Life
Attribute Characteristics: Integer, Real
Number of Attributes: 10
Date Donated: 2012-05-21
Associated Tasks: Classification
Missing Values? N/A
Number of Web Hits: 9021
Source:
1. Bendi Venkata Ramana
ramana.bendi '@' gmail.com
Associate Professor,
Department of Information Technology,
Aditya Institute of Technology and Management,
Tekkali - 532201, Andhra Pradesh, India.
2. Prof. M. Surendra Prasad Babu
drmsprasadbabu '@' yahoo.co.in
Department of Computer Science & Systems Engineering,
Andhra University College of Engineering,
Visakhapatnam-530 003 Andhra Pradesh, India.
3. Prof. N. B. Venkateswarlu
venkat_ritch '@' yahoo.com
Department of Computer Science and Engineering,
Aditya Institute of Technology and Management,
Tekkali - 532201, Andhra Pradesh, India.
Data Set Information:
This data set contains 416 liver patient records and 167 non-liver patient
records. The data set was collected from the north east of Andhra Pradesh, India.
Selector is a class label used to divide the records into two groups (liver
patient or not).
This data set contains 441 male patient records and 142 female patient records.
Attribute Information:
1. Age: Age of the patient
2. Gender: Gender of the patient
3. TB: Total Bilirubin
4. DB: Direct Bilirubin
5. Alkphos: Alkaline Phosphatase
6. Sgpt: Alanine Aminotransferase
7. Sgot: Aspartate Aminotransferase
8. TP: Total Proteins
9. ALB: Albumin
10. A/G Ratio: Albumin and Globulin Ratio
11. Selector: field used to split the data into two sets (labeled by the experts)
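The attribute list above can be turned into a small reader. The following Python sketch assumes the data file is comma-separated, has no header row, and lists the fields in the order given above; none of this is stated on this page. The column spellings "AG_Ratio" and "Selector", the function names, and the two sample rows are made up for illustration. Blank numeric fields are read as missing values; dropping such records is one possible reason, not confirmed here, why a summary could report fewer than 583 records.

```python
import csv
import io

# Column names follow the attribute list above. "AG_Ratio" and "Selector"
# are spellings chosen for this sketch, not names defined by the data file.
COLUMNS = ["Age", "Gender", "TB", "DB", "Alkphos", "Sgpt", "Sgot",
           "TP", "ALB", "AG_Ratio", "Selector"]


def parse_ilpd(text):
    """Parse header-less CSV text into a list of dicts, one per patient."""
    records = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) != len(COLUMNS):
            continue  # skip blank or malformed lines
        record = {}
        for name, value in zip(COLUMNS, row):
            value = value.strip()
            if name == "Gender":
                record[name] = value            # kept as text
            elif name in ("Age", "Selector"):
                record[name] = int(value)       # assumed to be always present
            else:
                # Measurements are floats; an empty field becomes None.
                record[name] = float(value) if value else None
        records.append(record)
    return records


def complete_cases(records):
    """Keep only the records without any missing value."""
    return [r for r in records if all(v is not None for v in r.values())]


# Two made-up rows, for illustration only (the second lacks the A/G ratio).
sample = ("65,Female,0.7,0.1,187,16,18,6.8,3.3,0.9,1\n"
          "62,Male,10.9,5.5,699,64,100,7.5,3.2,,2\n")
records = parse_ilpd(sample)
print(len(records), len(complete_cases(records)))
```

Splitting the result by the Selector value (1 or 2) gives the two classes that the descriptive statistics are reported for.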
Relevant Papers:
1. Bendi Venkata Ramana, Prof. M. S. Prasad Babu and Prof. N. B. Venkateswarlu,
"A Critical Comparative Study of Liver Patients from USA and INDIA: An
Exploratory Analysis", International Journal of Computer Science Issues,
ISSN: 1694-0784, May 2012.
2. Bendi Venkata Ramana, Prof. M. S. Prasad Babu and Prof. N. B. Venkateswarlu,
"A Critical Study of Selected Classification Algorithms for Liver Disease
Diagnosis", International Journal of Database Management Systems (IJDMS),
Vol. 3, No. 2, ISSN: 0975-5705, pp. 101-114, May 2011.
Citation Request:
Please refer to the repository http://archive.ics.uci.edu/ml (see citation policy).
See also Frank, A. & Asuncion, A. (2010). UCI Machine Learning Repository
[http://archive.ics.uci.edu/ml].
Irvine, CA: University of California, School of Information and Computer Science.
Descriptive statistics:
Dataset= indian-liver-patient_1vs2 : n= 579 , d= 10
Class1: n= 414
Covariance matrix:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 246.1872 0.6203 -2.6722 -1.7320 376.8664 -436.8823 -243.4587 -3.1780 -3.4974 -1.3309
[2,] 0.6203 0.1719 0.2344 0.1186 -8.4775 7.0353 10.6412 -0.0520 -0.0365 -0.0007
[3,] -2.6722 0.2344 51.2424 19.9290 318.8957 279.9389 509.4240 0.0609 -1.2365 -0.4557
[4,] -1.7320 0.1186 19.9290 10.3204 164.7544 137.6200 247.9962 0.0671 -0.5629 -0.1916
[5,] 376.8664 -8.4775 318.8957 164.7544 72277.3631 5390.7891 13157.7419 -5.9795 -29.9191 -18.3495
[6,] -436.8823 7.0353 279.9389 137.6200 5390.7891 45461.4642 56759.0071 -10.6641 -0.8769 1.9593
[7,] -243.4587 10.6412 509.4240 247.9962 13157.7419 56759.0071 114336.1150 -7.4679 -18.6288 -5.9487
[8,] -3.1780 -0.0520 0.0609 0.0671 -5.9795 -10.6641 -7.4679 1.2040 0.6577 0.0721
[9,] -3.4974 -0.0365 -1.2365 -0.5629 -29.9191 -0.8769 -18.6288 0.6577 0.6200 0.1703
[10,] -1.3309 -0.0007 -0.4557 -0.1916 -18.3495 1.9593 -5.9487 0.0721 0.1703 0.1064
Correlation matrix:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1.0000 0.0954 -0.0238 -0.0344 0.0893 -0.1306 -0.0459 -0.1846 -0.2831 -0.2601
[2,] 0.0954 1.0000 0.0790 0.0890 -0.0761 0.0796 0.0759 -0.1142 -0.1118 -0.0055
[3,] -0.0238 0.0790 1.0000 0.8666 0.1657 0.1834 0.2105 0.0078 -0.2194 -0.1952
[4,] -0.0344 0.0890 0.8666 1.0000 0.1908 0.2009 0.2283 0.0190 -0.2225 -0.1829
[5,] 0.0893 -0.0761 0.1657 0.1908 1.0000 0.0940 0.1447 -0.0203 -0.1413 -0.2093
[6,] -0.1306 0.0796 0.1834 0.2009 0.0940 1.0000 0.7873 -0.0456 -0.0052 0.0282
[7,] -0.0459 0.0759 0.2105 0.2283 0.1447 0.7873 1.0000 -0.0201 -0.0700 -0.0539
[8,] -0.1846 -0.1142 0.0078 0.0190 -0.0203 -0.0456 -0.0201 1.0000 0.7612 0.2013
[9,] -0.2831 -0.1118 -0.2194 -0.2225 -0.1413 -0.0052 -0.0700 0.7612 1.0000 0.6631
[10,] -0.2601 -0.0055 -0.1952 -0.1829 -0.2093 0.0282 -0.0539 0.2013 0.6631 1.0000
Median: 46.6575 1.8028 2.7957 1.2738 233.0326 46.1209 59.6008 6.5122 3.136 0.9329
Mean: 46.1449 1.7802 4.1804 1.9316 319.5362 99.9734 138.1739 6.4587 3.0585 0.9142
MCD-estimated:
MCD-0.975-Mean: 47.3773 2 1.5278 0.6189 247.1415 44.198 54.8586 6.4363 3.1174 0.9419
MCD-0.750-Mean: 47.4365 2 1.5502 0.631 247.4554 44.5445 55.1363 6.4371 3.1202 0.9434
MCD-0.500-Mean: 47.4365 2 1.5479 0.6286 248.7934 44.1924 54.911 6.4315 3.1113 0.9398
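The Class1 covariance and correlation matrices above are related by r_ij = s_ij / sqrt(s_ii * s_jj). The following Python sketch checks this for variables 3 and 4, assuming the matrix columns follow the order of the attribute list (so that 3 is TB and 4 is DB); the function name is illustrative.

```python
import math


def correlation_from_covariance(cov):
    """Convert a covariance matrix (list of lists) into a correlation matrix."""
    d = len(cov)
    sd = [math.sqrt(cov[i][i]) for i in range(d)]  # standard deviations
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(d)] for i in range(d)]


# Class1 entries for variables 3 and 4, copied from the covariance matrix above.
cov_3_4 = [[51.2424, 19.9290],
           [19.9290, 10.3204]]

corr = correlation_from_covariance(cov_3_4)
print(round(corr[0][1], 4))  # 0.8666, the [3,4] entry of the correlation matrix
```

The same function applies unchanged to the full 10 x 10 matrices.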
Class2: n= 165
Covariance matrix:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 291.0133 -0.4584 0.2049 0.1733 -190.6957 -46.7999 -61.6452 -3.2687 -2.2266 -0.2056
[2,] -0.4584 0.2101 0.0695 0.0379 5.0583 1.1157 2.5635 -0.0205 -0.0071 0.0066
[3,] 0.2049 0.0695 1.0193 0.5198 80.9240 8.5153 14.2241 -0.1654 -0.1453 -0.0473
[4,] 0.1733 0.0379 0.5198 0.2724 41.6836 4.3909 7.3869 -0.0794 -0.0733 -0.0248
[5,] -190.6957 5.0583 80.9240 41.6836 20030.1196 1341.8262 1384.5409 -4.3997 -16.1198 -9.8257
[6,] -46.7999 1.1157 8.5153 4.3909 1341.8262 632.3328 610.3452 0.9815 0.8766 0.0663
[7,] -61.6452 2.5635 14.2241 7.3869 1384.5409 610.3452 1336.8645 -4.0711 -2.3126 0.2008
[8,] -3.2687 -0.0205 -0.1654 -0.0794 -4.3997 0.9815 -4.0711 1.1095 0.7056 0.0988
[9,] -2.2266 -0.0071 -0.1453 -0.0733 -16.1198 0.8766 -2.3126 0.7056 0.6062 0.1650
[10,] -0.2056 0.0066 -0.0473 -0.0248 -9.8257 0.0663 0.2008 0.0988 0.1650 0.0825
Correlation matrix:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1.0000 -0.0586 0.0119 0.0195 -0.0790 -0.1091 -0.0988 -0.1819 -0.1676 -0.0420
[2,] -0.0586 1.0000 0.1502 0.1586 0.0780 0.0968 0.1530 -0.0426 -0.0200 0.0504
[3,] 0.0119 0.1502 1.0000 0.9864 0.5663 0.3354 0.3853 -0.1556 -0.1848 -0.1631
[4,] 0.0195 0.1586 0.9864 1.0000 0.5643 0.3345 0.3871 -0.1445 -0.1803 -0.1656
[5,] -0.0790 0.0780 0.5663 0.5643 1.0000 0.3770 0.2676 -0.0295 -0.1463 -0.2417
[6,] -0.1091 0.0968 0.3354 0.3345 0.3770 1.0000 0.6638 0.0371 0.0448 0.0092
[7,] -0.0988 0.1530 0.3853 0.3871 0.2676 0.6638 1.0000 -0.1057 -0.0812 0.0191
[8,] -0.1819 -0.0426 -0.1556 -0.1445 -0.0295 0.0371 -0.1057 1.0000 0.8604 0.3265
[9,] -0.1676 -0.0200 -0.1848 -0.1803 -0.1463 0.0448 -0.0812 0.8604 1.0000 0.7376
[10,] -0.0420 0.0504 -0.1631 -0.1656 -0.2417 0.0092 0.0191 0.3265 0.7376 1.0000
Median: 42.9088 1.6847 0.9593 0.2991 187.7775 28.9654 31.8993 6.6418 3.4259 1.0516
Mean: 41.3636 1.703 1.1448 0.3964 220.6848 33.8364 40.7636 6.5394 3.3394 1.0296
MCD-estimated:
MCD-0.975-Mean: 41.3259 2.0001 0.9225 0.2888 197.4046 30.0561 33.0337 6.5978 3.4101 1.0492
MCD-0.750-Mean: 41.3 2.0001 0.93 0.2945 197.7667 30.6999 33.4889 6.6133 3.4167 1.0487
MCD-0.500-Mean: 41.3259 2.0001 0.9225 0.2888 197.4046 30.0561 33.0337 6.5978 3.4101 1.0492
Measures:
Mah.Dist: 0.8122
Mah.Dist-MCD-0.975: 0.8385
Mah.Dist-MCD-0.750: 0.8264
Mah.Dist-MCD-0.500: 0.8264
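This page does not state which covariance matrix enters the Mah.Dist values. A common choice for the Mahalanobis distance between two class means m1 and m2 is d = sqrt((m1 - m2)' S^-1 (m1 - m2)), with S the pooled covariance of the two classes. The following Python sketch works under that assumption; the function names are illustrative, and it is not claimed to reproduce the figures above.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Raises ZeroDivisionError if the matrix is singular.
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def pooled_covariance(cov1, n1, cov2, n2):
    """Weight two class covariance matrices by their degrees of freedom."""
    d = len(cov1)
    return [[((n1 - 1) * cov1[i][j] + (n2 - 1) * cov2[i][j]) / (n1 + n2 - 2)
             for j in range(d)] for i in range(d)]


def mahalanobis(mean1, mean2, cov):
    """Mahalanobis distance between two mean vectors under covariance cov."""
    diff = [a - b for a, b in zip(mean1, mean2)]
    weighted = solve(cov, diff)  # equals cov^-1 * diff
    return math.sqrt(sum(d * w for d, w in zip(diff, weighted)))
```

With the Mean rows and covariance matrices of Class1 (n = 414) and Class2 (n = 165) as inputs, the call would be mahalanobis(mean1, mean2, pooled_covariance(cov1, 414, cov2, 165)). The MCD variants would substitute robust mean and covariance estimates for the classical ones.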
Last modified on 17.02.2013