Online Discriminative Kernel Density Estimation

Building classifiers is central to many applications in pattern recognition. When all input data-points are available, we can use the batch methods for learning the classifier. Batch methods, however, may not be suitable when data-points arrive in a sequence, and only a few or a single data-point is observed at a time. We therefore require methods for building the models in an online manner.

We have adapted the online Kernel Density Estimation (oKDE) framework, which allows online adaptation of generative models from new data-points. We extend the oKDE to take into account the interclass discrimination through a new distance function between the classifiers. The result is an online discriminative Kernel Density Estimator (odKDE).

Online Discriminative Kernel Density Estimator in a Nutshell:

We want to estimate a classifier for K classses from a stream of data, each class ci described by a GMM p(x|ci) and a prior probability p(ci). Discriminative oKDE (doKDE) is an oKDE, which penalizes discrimination loss in the compression step, and proceeds as in the example below:

We have K=3 classes and observe a data-point for the class 1:


How to measure discrimination loss during compresson?

From a classification point of view, two classifiers are equivalent as long as they classify the relevant part fo the feature space equally. Therefore, we can accept compressions which do not change significantly the posterior distribution over the classes.

An example below shows a (line 1) reference classifier, (line 2) an equivalent compressed classifier and (line 3) a nonequivalent compressed classifier:


We have proposed a measure that evaluates how much the posterior changes during compression of the positive class model in the classifier. This measure is used with the odKDE to determine the extent of the allowed compression during online adaptation.

Some preliminary experimental results:

We have compared the odKDE to oKDE, several batch state-of-the-art KDE’s and the batch SVM with an RBF kernel. Experiments were performed on a standard database from the UCI machine learning repository .The Table below (see the paper at the bottom of this page for details) show that odKDE achieves comparable performance in classification accuracy to the batch methods, but produces models with significantly lower complexity and allows online adaptation.


How does the odKDE performance evolve with new observations?


Upper row of figures shows the classification results (Score) and the lower shows the number of components per class (Ncmp) w.r.t. the number of samples (Nobs). The results for the oKDE and odKDE are depicted by darker (red) line and bright (green) line, respectively along with one standard deviation bars. For reference, we also show results for the batch methods after observing all samples.

How does the relation between classes affect the odKDE?

The two videos below shows how the odKDE classifier for two classes evolves with time and how the complexity of the classifier is affected by how “close” the classes are. In both videos, the distributions of classes are equal, the only difference is that in the second video, the second class is shifter further from the first class. As a result, the generative part of the odKDE classifier is different.

Video 1:
Video 2:

Relevant publications:


Download the Matlab code for online Gaussian Mixture Models using the Online Kernel Density Estimation:

NEW! Version 3.5 This is a Matlab research code that is based on the papers on Online Kernel Density Estimation with Gaussian Kernels and Online Discriminative Kernel Density Estimation with Gaussin Kernles. The code essentially demonstrates estimation of a Gaussian Mixture Model from a stream of data.

Download the Matlab code for batch Multivariate Bandwidth:

The code is a minimal implementation of a batch kernel density estimator. Since the code is based on our new bandwidth estimator, it allows KDE construction even from preclustered/compressed sets of samples and weighted data. This is also a minimal demonstration of the general bandwidth estimator proposed in “Online Kernel Density Estimation with Gaussian Kernels”.