Home Research Publications

 

 


Learning and Classification Algorithms

Within the scope of this project we developed several learning and classification algorithms using techniques from statistical pattern recognition and machine learning. One of the key elements in building these systems is the concept of a statistical confidence measure. We applied this concept to three problems: partitioning a feature space, designing an adaptive k-nearest neighbor (k-NN) algorithm, and selecting a subset of the training data for support vector machines (SVMs).

The main inspiration for this project is the Reduced Coulomb Energy (RCE) learning algorithm. Proposed in 1982 by Reilly, Cooper and Elbaum (whose initials also spell RCE), it was one of the first classification algorithms able to learn any non-linearly separable function. In contrast to other classification algorithms, such as Multilayer Perceptron (MLP) and Radial Basis Function (RBF) networks, the RCE algorithm automatically adjusts the number of hidden units and converges in only a few epochs. However, the RCE algorithm has its shortcomings; most importantly, it depends on user-specified parameters that are computationally expensive to optimize.
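To make the sphere-covering idea concrete, here is a minimal sketch of a simplified, textbook-style RCE training rule. It is not the code used in this project. The function names `rce_train` and `rce_predict`, the parameter `r_max` (the user-specified maximum radius) and the choice to return `None` for uncovered or ambiguous points are all assumptions made for illustration.

```python
import numpy as np

def rce_train(X, y, r_max=1.0, max_epochs=20):
    """Simplified RCE training (illustrative sketch, not the authors' code).

    Each prototype is a sphere (center, radius, label). A training point that
    is not covered by a sphere of its own class becomes a new prototype; a
    sphere of the wrong class that covers the point is shrunk to exclude it.
    """
    X = np.asarray(X, dtype=float)
    centers, radii, labels = [], [], []
    for _ in range(max_epochs):
        changed = False
        for x, label in zip(X, y):
            covered = False
            for i, c in enumerate(centers):
                d = np.linalg.norm(x - c)
                if d < radii[i]:
                    if labels[i] == label:
                        covered = True
                    else:
                        # Wrong-class sphere covers x: shrink it so x lies outside.
                        radii[i] = d
                        changed = True
            if not covered:
                # Commit a new prototype (hidden unit) centered on x.
                centers.append(x)
                radii.append(r_max)
                labels.append(label)
                changed = True
        if not changed:
            break  # a full epoch without changes: training has converged
    return np.array(centers), np.array(radii), labels

def rce_predict(x, centers, radii, labels):
    """Return the label if all covering spheres agree, otherwise None."""
    x = np.asarray(x, dtype=float)
    hits = {labels[i] for i, c in enumerate(centers)
            if np.linalg.norm(x - c) < radii[i]}
    return hits.pop() if len(hits) == 1 else None

# Toy data (made up for illustration): two well-separated clusters.
X_demo = [[0.0, 0.0], [0.2, 0.0], [3.0, 0.0], [3.2, 0.0]]
y_demo = [0, 0, 1, 1]
centers, radii, labels = rce_train(X_demo, y_demo, r_max=1.0)
```

In this sketch the number of prototypes grows only when a point is left uncovered, which mirrors the automatic adjustment of hidden units described above, while `r_max` stands in for the kind of user-specified parameter that has to be tuned.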

Over the last several years, we have developed a number of algorithms that, like the RCE algorithm, use the idea of covering the feature space with spheres, or prototypes. Although the word "sphere" appears often in the titles of our papers, most of the models are unrelated to one another, and the "spheres" play quite different roles in each.

Confidence Measure

Adaptive k-NN Rule

Data Selection for SVM

Minimal Sphere Covering Algorithm

Single Sphere Algorithm

Minimum Bounding Spheres

 

The following table reports the experimental results (error rates) of the Adaptive k-NN rule (A-k-NN) and the Minimal Sphere Covering Algorithm (MSCA) in comparison to the SVM and k-NN algorithms, tested on several datasets from the UCI Machine Learning Repository. The numbers in parentheses are the corresponding standard deviations.

 

Dataset          k-NN            SVMs            MSCA            A-k-NN
Breast Cancer    2.79 (0.67)     3.68 (0.66)     3.24 (0.72)     2.65 (0.84)
Ionosphere       12.86 (1.96)    4.86 (1.05)     3.71 (0.73)     4.00 (0.87)
Pima             24.61 (1.36)    27.50 (1.68)    26.67 (1.21)    24.21 (1.39)
Liver            30.88 (3.32)    31.47 (2.63)    30.14 (1.40)    30.59 (2.33)
Sonar            17.00 (2.26)    11.00 (2.33)    9.76 (1.51)     13.00 (1.70)

 


 

 
