
Data Portal of the Chair of Statistics and Econometrics

 

Vowel (MvsF) data


The data set (and description) can be downloaded here:
http://archive.ics.uci.edu/ml/machine-learning-databases/undocumented/connectionist-bench/vowel/vowel-context.data


Description:



                Introduction
                ============

In my work on context-sensitive learning, I used the "Deterding Vowel
Recognition Data", but I found it necessary to reformulate the data.
Implicit in the original data is contextual information on the
speaker's gender and identity. For my work, it was necessary to make
this information explicit. The file "vowel-context.data" adds the
speaker's sex and identity as new features. The format of the data file
is described below.


Peter Turney
peter@ai.iit.nrc.ca



                References
                ==========

P. Turney. "Robust Classification With Context-Sensitive Features."
Proceedings of the Sixth International Conference on Industrial
and Engineering Applications of Artificial Intelligence and Expert
Systems (IEA/AIE-93): 268-276. 1993.

URL: ftp://ai.iit.nrc.ca/pub/ksl-papers/NRC-35074.ps.Z


P. Turney. "Exploiting Context When Learning to Classify."
Proceedings of the European Conference on Machine Learning
(ECML-93): 402-407. 1993.

URL: ftp://ai.iit.nrc.ca/pub/ksl-papers/NRC-35058.ps.Z



                File Structure
                ==============


        Column          Description
        -------------------------------
        0               Train or Test
        1               Speaker Number
        2               Sex
        3               Feature 0
        4               Feature 1
        5               Feature 2
        6               Feature 3
        7               Feature 4
        8               Feature 5
        9               Feature 6
        10              Feature 7
        11              Feature 8
        12              Feature 9
        13              Class
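
As a minimal sketch of how one record could be read according to the column layout above: the snippet assumes the fields are separated by whitespace, which this page does not state, and the snake_case column names are chosen here for illustration only. The example line is made up and is not taken from the data file.

```python
# Column order follows the File Structure table; the identifiers themselves
# are hypothetical names chosen for this sketch.
COLUMNS = (
    ["train_or_test", "speaker", "sex"]
    + [f"feature_{i}" for i in range(10)]
    + ["vowel_class"]
)
INTEGER_COLUMNS = {"train_or_test", "speaker", "sex", "vowel_class"}


def parse_record(line):
    """Split one whitespace-delimited line into a dict keyed by column name."""
    fields = line.split()
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    record = {}
    for name, text in zip(COLUMNS, fields):
        # Codes are integers; the ten features are real-valued.
        record[name] = int(text) if name in INTEGER_COLUMNS else float(text)
    return record


# Made-up line for illustration only; the values are NOT from the data file.
example = "0 4 1 -1.0 0.5 0.25 1.5 -0.75 2.0 -0.5 0.0 1.0 -2.0 3"
record = parse_record(example)
print(record["speaker"], record["sex"], record["feature_0"], record["vowel_class"])
```

Keeping the codes as integers makes it straightforward to look them up in the Numerical Codes tables.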




                Numerical Codes
                ===============


        Speaker         Code Number
        ---------------------------
        Andrew          0
        Bill            1
        David           2
        Mark            3
        Jo              4
        Kate            5
        Penny           6
        Rose            7
        Mike            8
        Nick            9
        Rich            10
        Tim             11
        Sarah           12
        Sue             13
        Wendy           14



        Set             Number
        ---------------------------
        Train           0
        Test            1



        Sex             Number
        ---------------------------
        Male            0
        Female          1



        Class           Number
        ---------------------------
        hid             0
        hId             1
        hEd             2
        hAd             3
        hYd             4
        had             5
        hOd             6
        hod             7
        hUd             8
        hud             9
        hed             10





        Speaker         Code Number     Sex             Train/Test
        ---------------------------------------------------------------
        Andrew          0               0               0
        Bill            1               0               0
        David           2               0               0
        Mark            3               0               0
        Jo              4               1               0
        Kate            5               1               0
        Penny           6               1               0
        Rose            7               1               0
        Mike            8               0               1
        Nick            9               0               1
        Rich            10              0               1
        Tim             11              0               1
        Sarah           12              1               1
        Sue             13              1               1
        Wendy           14              1               1
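
The speaker table can be cross-checked against the descriptive statistics further down. The sketch below assumes that every speaker contributes the same number of records (990 / 15 = 66), which this page does not state explicitly. Under that assumption the male group has 528 records and the female group 462, matching the sizes reported for Class1 and Class2, and the group means of the Train/Test and Speaker codes match the first two entries of the reported mean vectors. This suggests that Class1 corresponds to male and Class2 to female speakers, and that the 13 variables are the file columns without Sex.

```python
# (name, code, sex, train_or_test) copied from the speaker table above.
SPEAKERS = [
    ("Andrew", 0, 0, 0), ("Bill", 1, 0, 0), ("David", 2, 0, 0),
    ("Mark", 3, 0, 0), ("Jo", 4, 1, 0), ("Kate", 5, 1, 0),
    ("Penny", 6, 1, 0), ("Rose", 7, 1, 0), ("Mike", 8, 0, 1),
    ("Nick", 9, 0, 1), ("Rich", 10, 0, 1), ("Tim", 11, 0, 1),
    ("Sarah", 12, 1, 1), ("Sue", 13, 1, 1), ("Wendy", 14, 1, 1),
]

# Assumption: the 990 records are split evenly across the 15 speakers.
RECORDS_PER_SPEAKER = 990 // len(SPEAKERS)


def summarize(sex):
    """Return (group size, mean Train/Test code, mean speaker code) for one sex."""
    group = [s for s in SPEAKERS if s[2] == sex]
    n = len(group) * RECORDS_PER_SPEAKER
    mean_set = sum(s[3] for s in group) / len(group)
    mean_speaker = sum(s[1] for s in group) / len(group)
    return n, round(mean_set, 4), round(mean_speaker, 4)


male = summarize(0)
female = summarize(1)
print(male)    # (528, 0.5, 5.5)
print(female)  # (462, 0.4286, 8.7143)
```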


Citation Request:

Please refer to the repository http://archive.ics.uci.edu/ml (see citation policy).
See also Frank, A. & Asuncion, A. (2010). UCI Machine Learning Repository
[http://archive.ics.uci.edu/ml].
Irvine, CA: University of California, School of Information and Computer Science.


Descriptive statistics:

Dataset= vowel_MvsF : n= 990 , d= 13 


Class1: n= 528 

Covariance matrix:
         [,1]    [,2]    [,3]    [,4]    [,5]    [,6]    [,7]    [,8]    [,9]   [,10]   [,11]   [,12]   [,13]
 [1,]  0.2505  2.0038 -0.0834  0.0821 -0.0115 -0.0612  0.0893 -0.0184  0.0330 -0.0330 -0.0529 -0.0546  0.0000
 [2,]  2.0038 17.2827 -0.5461  0.6315 -0.0508 -0.5634  0.8658 -0.2462  0.3434 -0.2071 -0.4088 -0.4880  0.0000
 [3,] -0.0834 -0.5461  0.8118 -0.5088 -0.2941 -0.1309  0.0855  0.0474  0.0073  0.0923 -0.0885 -0.0866 -1.7375
 [4,]  0.0821  0.6315 -0.5088  1.2556  0.1582 -0.2629 -0.3330 -0.3351 -0.0014 -0.0144  0.2397  0.0591  2.0623
 [5,] -0.0115 -0.0508 -0.2941  0.1582  0.5472  0.0383 -0.1417 -0.2189 -0.0969 -0.0566  0.1868  0.2024  0.8977
 [6,] -0.0612 -0.5634 -0.1309 -0.2629  0.0383  0.4924 -0.0328  0.0867 -0.1012  0.0127 -0.0491  0.0675 -0.0315
 [7,]  0.0893  0.8658  0.0855 -0.3330 -0.1417 -0.0328  0.3676  0.1126  0.0688 -0.0631 -0.1678 -0.1463 -0.7607
 [8,] -0.0184 -0.2462  0.0474 -0.3351 -0.2189  0.0867  0.1126  0.3733  0.0424  0.0283 -0.1638 -0.1012 -0.5770
 [9,]  0.0330  0.3434  0.0073 -0.0014 -0.0969 -0.1012  0.0688  0.0424  0.1717 -0.0257 -0.0476 -0.0898 -0.2811
[10,] -0.0330 -0.2071  0.0923 -0.0144 -0.0566  0.0127 -0.0631  0.0283 -0.0257  0.2106 -0.0034  0.0250 -0.0379
[11,] -0.0529 -0.4088 -0.0885  0.2397  0.1868 -0.0491 -0.1678 -0.1638 -0.0476 -0.0034  0.2885  0.0965  0.7798
[12,] -0.0546 -0.4880 -0.0866  0.0591  0.2024  0.0675 -0.1463 -0.1012 -0.0898  0.0250  0.0965  0.2689  0.5671
[13,]  0.0000  0.0000 -1.7375  2.0623  0.8977 -0.0315 -0.7607 -0.5770 -0.2811 -0.0379  0.7798  0.5671 10.0190

Correlation matrix:
         [,1]    [,2]    [,3]    [,4]    [,5]    [,6]    [,7]    [,8]    [,9]   [,10]   [,11]   [,12]   [,13]
 [1,]  1.0000  0.9631 -0.1849  0.1464 -0.0311 -0.1742  0.2944 -0.0602  0.1589 -0.1437 -0.1969 -0.2104  0.0000
 [2,]  0.9631  1.0000 -0.1458  0.1356 -0.0165 -0.1931  0.3435 -0.0969  0.1993 -0.1086 -0.1831 -0.2264  0.0000
 [3,] -0.1849 -0.1458  1.0000 -0.5040 -0.4413 -0.2070  0.1565  0.0862  0.0196  0.2233 -0.1828 -0.1854 -0.6092
 [4,]  0.1464  0.1356 -0.5040  1.0000  0.1909 -0.3343 -0.4901 -0.4894 -0.0029 -0.0280  0.3983  0.1017  0.5814
 [5,] -0.0311 -0.0165 -0.4413  0.1909  1.0000  0.0737 -0.3159 -0.4844 -0.3160 -0.1667  0.4700  0.5277  0.3834
 [6,] -0.1742 -0.1931 -0.2070 -0.3343  0.0737  1.0000 -0.0770  0.2021 -0.3479  0.0393 -0.1302  0.1856 -0.0142
 [7,]  0.2944  0.3435  0.1565 -0.4901 -0.3159 -0.0770  1.0000  0.3039  0.2737 -0.2266 -0.5151 -0.4653 -0.3964
 [8,] -0.0602 -0.0969  0.0862 -0.4894 -0.4844  0.2021  0.3039  1.0000  0.1676  0.1008 -0.4991 -0.3195 -0.2983
 [9,]  0.1589  0.1993  0.0196 -0.0029 -0.3160 -0.3479  0.2737  0.1676  1.0000 -0.1350 -0.2140 -0.4179 -0.2143
[10,] -0.1437 -0.1086  0.2233 -0.0280 -0.1667  0.0393 -0.2266  0.1008 -0.1350  1.0000 -0.0140  0.1050 -0.0261
[11,] -0.1969 -0.1831 -0.1828  0.3983  0.4700 -0.1302 -0.5151 -0.4991 -0.2140 -0.0140  1.0000  0.3466  0.4586
[12,] -0.2104 -0.2264 -0.1854  0.1017  0.5277  0.1856 -0.4653 -0.3195 -0.4179  0.1050  0.3466  1.0000  0.3455
[13,]  0.0000  0.0000 -0.6092  0.5814  0.3834 -0.0142 -0.3964 -0.2983 -0.2143 -0.0261  0.4586  0.3455  1.0000
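
The correlation matrix follows from the covariance matrix by dividing each covariance by the product of the two standard deviations, r_ij = s_ij / sqrt(s_ii * s_jj). A small sketch using Class1 entries copied from the matrices above; because the printed values are rounded to four decimals, the recomputed correlations agree with the printed ones only to about three decimals.

```python
import math


def correlation(cov_ij, var_i, var_j):
    """Pearson correlation from one covariance and the two variances."""
    return cov_ij / math.sqrt(var_i * var_j)


# Class1 covariance entries [1,1], [2,2] and [1,2] as printed above.
r_12 = correlation(2.0038, 0.2505, 17.2827)

# Class1 covariance entries [3,3], [4,4] and [3,4] as printed above.
r_34 = correlation(-0.5088, 0.8118, 1.2556)

print(round(r_12, 3), round(r_34, 3))  # 0.963 -0.504
```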

Median:          0.5007 5.5049 -2.9235 1.7224 -0.526  0.5637 -0.5822 0.7309 -0.0554 0.5883  0.0034 -0.208  4.9953 

Mean:            0.5    5.5    -2.9803 1.6699 -0.5887 0.5886 -0.531  0.8074 -0.0202 0.5767 -0.0438 -0.246  5 
MCD-estimated:
MCD-0.975-Mean:  0.45   5.0767 -2.5109 1.3426 -0.8244 0.6016 -0.4631 0.825   0.0098 0.5845 -0.1444 -0.3825 3
MCD-0.750-Mean:  0.5159 5.7134 -2.5315 1.4633 -0.8073 0.4965 -0.4265 0.8424  0.0078 0.6055 -0.1521 -0.4031 3.1242
MCD-0.500-Mean:  0.5808 6.1078 -2.8884 1.7077 -0.786  0.4062 -0.4314 0.8473  0.0914 0.4699 -0.0901 -0.4441 3.9311


Class2: n= 462 

Covariance matrix:
         [,1]    [,2]    [,3]    [,4]    [,5]    [,6]    [,7]    [,8]    [,9]   [,10]   [,11]   [,12]   [,13]
 [1,]  0.2454  1.8407  0.0346  0.0912 -0.0487  0.0531  0.0121  0.0584 -0.0614 -0.0745 -0.0006  0.0919  0.0000
 [2,]  1.8407 14.8076  0.0192  0.8937 -0.4954  0.3621  0.1423  0.4198 -0.4907 -0.6062 -0.0874  0.8986  0.0000
 [3,]  0.0346  0.0192  0.5694 -0.2785 -0.1439 -0.0223 -0.1300  0.0216  0.0420 -0.0774 -0.0119  0.0786 -0.6309
 [4,]  0.0912  0.8937 -0.2785  1.4177 -0.3074 -0.6624 -0.3132 -0.0019  0.2496  0.2919 -0.0161 -0.2901  2.3842
 [5,] -0.0487 -0.4954 -0.1439 -0.3074  0.4458  0.2091  0.0815 -0.1447 -0.1605  0.0351  0.0312  0.0140 -0.5654
 [6,]  0.0531  0.3621 -0.0223 -0.6624  0.2091  0.6607  0.1885 -0.0691 -0.2809 -0.2335  0.0785  0.2196 -1.4070
 [7,]  0.0121  0.1423 -0.1300 -0.3132  0.0815  0.1885  0.4026  0.0503 -0.1393 -0.1593 -0.0996  0.0979 -0.3427
 [8,]  0.0584  0.4198  0.0216 -0.0019 -0.1447 -0.0691  0.0503  0.2786  0.0518 -0.0075 -0.0636 -0.0625  0.0734
 [9,] -0.0614 -0.4907  0.0420  0.2496 -0.1605 -0.2809 -0.1393  0.0518  0.2608  0.0900 -0.0126 -0.1327  0.4993
[10,] -0.0745 -0.6062 -0.0774  0.2919  0.0351 -0.2335 -0.1593 -0.0075  0.0900  0.3229  0.0065 -0.2279  0.9318
[11,] -0.0006 -0.0874 -0.0119 -0.0161  0.0312  0.0785 -0.0996 -0.0636 -0.0126  0.0065  0.2027 -0.0364 -0.2090
[12,]  0.0919  0.8986  0.0786 -0.2901  0.0140  0.2196  0.0979 -0.0625 -0.1327 -0.2279 -0.0364  0.4004 -1.0323
[13,]  0.0000  0.0000 -0.6309  2.3842 -0.5654 -1.4070 -0.3427  0.0734  0.4993  0.9318 -0.2090 -1.0323 10.0217

Correlation matrix:
         [,1]    [,2]    [,3]    [,4]    [,5]    [,6]    [,7]    [,8]    [,9]   [,10]   [,11]   [,12]   [,13]
 [1,]  1.0000  0.9656  0.0926  0.1546 -0.1471  0.1319  0.0385  0.2234 -0.2427 -0.2647 -0.0028  0.2931  0.0000
 [2,]  0.9656  1.0000  0.0066  0.1951 -0.1928  0.1158  0.0583  0.2067 -0.2497 -0.2772 -0.0504  0.3690  0.0000
 [3,]  0.0926  0.0066  1.0000 -0.3100 -0.2856 -0.0363 -0.2715  0.0543  0.1091 -0.1805 -0.0351  0.1647 -0.2641
 [4,]  0.1546  0.1951 -0.3100  1.0000 -0.3867 -0.6845 -0.4145 -0.0030  0.4105  0.4314 -0.0301 -0.3851  0.6325
 [5,] -0.1471 -0.1928 -0.2856 -0.3867  1.0000  0.3853  0.1924 -0.4107 -0.4706  0.0924  0.1038  0.0331 -0.2675
 [6,]  0.1319  0.1158 -0.0363 -0.6845  0.3853  1.0000  0.3655 -0.1611 -0.6766 -0.5056  0.2144  0.4269 -0.5468
 [7,]  0.0385  0.0583 -0.2715 -0.4145  0.1924  0.3655  1.0000  0.1501 -0.4299 -0.4417 -0.3486  0.2439 -0.1706
 [8,]  0.2234  0.2067  0.0543 -0.0030 -0.4107 -0.1611  0.1501  1.0000  0.1920 -0.0251 -0.2677 -0.1873  0.0439
 [9,] -0.2427 -0.2497  0.1091  0.4105 -0.4706 -0.6766 -0.4299  0.1920  1.0000  0.3101 -0.0549 -0.4106  0.3088
[10,] -0.2647 -0.2772 -0.1805  0.4314  0.0924 -0.5056 -0.4417 -0.0251  0.3101  1.0000  0.0253 -0.6339  0.5180
[11,] -0.0028 -0.0504 -0.0351 -0.0301  0.1038  0.2144 -0.3486 -0.2677 -0.0549  0.0253  1.0000 -0.1278 -0.1467
[12,]  0.2931  0.3690  0.1647 -0.3851  0.0331  0.4269  0.2439 -0.1873 -0.4106 -0.6339 -0.1278  1.0000 -0.5153
[13,]  0.0000  0.0000 -0.2641  0.6325 -0.2675 -0.5468 -0.1706  0.0439  0.3088  0.5180 -0.1467 -0.5153  1.0000

Median:          0.3587 8.2849 -3.4217 2.1937 -0.4829 0.3294 -0.1385 0.41   0.0857 0.0998 -0.6072  0.12   5.0087 

Mean:            0.4286 8.7143 -3.4591 2.1239 -0.4153 0.4319 -0.0481 0.4278 0.0137 0.0621 -0.5992  0.1282 5 
MCD-estimated:
MCD-0.975-Mean:  0      5.5    -3.5196 1.9647 -0.3303 0.3392 -0.0692 0.3258 0.121  0.1922 -0.5981 -0.0322 5
MCD-0.750-Mean:  0      5.5    -3.5196 1.9647 -0.3303 0.3392 -0.0692 0.3258 0.121  0.1922 -0.5981 -0.0322 5
MCD-0.500-Mean:  0      5.5    -3.5196 1.9647 -0.3303 0.3392 -0.0692 0.3258 0.121  0.1922 -0.5981 -0.0322 5


Measures:
Mah.Dist:                        3.9868 
Mah.Dist-MCD-0.975:              4.281 
Mah.Dist-MCD-0.750:              4.0891 
Mah.Dist-MCD-0.500:              4.0898 
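
This page does not say how the Mahalanobis distances were computed. One common definition for two groups is sqrt((m1 - m2)' S^-1 (m1 - m2)), with S the pooled covariance matrix; the sketch below assumes that definition. To stay short it uses only two of the 13 variables (columns 3 and 4 of the matrices above), so it illustrates the formula and does not reproduce the reported values. The function names are chosen here for illustration.

```python
import math


def pooled_covariance(s1, n1, s2, n2):
    """Pooled 2x2 covariance, weighting each class by its degrees of freedom."""
    w1, w2 = n1 - 1, n2 - 1
    return [
        [(w1 * s1[r][c] + w2 * s2[r][c]) / (w1 + w2) for c in range(2)]
        for r in range(2)
    ]


def mahalanobis_2d(diff, cov):
    """sqrt(diff' * inverse(cov) * diff) for a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    x, y = diff
    # The inverse of [[a, b], [c, d]] is [[d, -b], [-c, a]] / det.
    quadratic_form = (d * x * x - (b + c) * x * y + a * y * y) / det
    return math.sqrt(quadratic_form)


# Variables 3 and 4 only; values copied from the Class1 and Class2 output above.
mean_1, mean_2 = (-2.9803, 1.6699), (-3.4591, 2.1239)
cov_1 = [[0.8118, -0.5088], [-0.5088, 1.2556]]
cov_2 = [[0.5694, -0.2785], [-0.2785, 1.4177]]

pooled = pooled_covariance(cov_1, 528, cov_2, 462)
diff = (mean_1[0] - mean_2[0], mean_1[1] - mean_2[1])
distance = mahalanobis_2d(diff, pooled)
print(distance)
```

With an identity covariance matrix the function reduces to the Euclidean distance, which gives a quick sanity check: `mahalanobis_2d((3, 4), [[1, 0], [0, 1]])` returns 5.0.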
 



 

Last modified on 17.02.2013