UVM Theses and Dissertations
Format:
Print
Author:
Awad, Mariette
Dept./Program:
Electrical and Computer Engineering
Year:
2007
Degree:
PhD
Abstract:
Recent advances in computing, inexpensive sensors, and high-throughput acquisition technologies have made data more available and easier to collect than ever before. However, given their observational nature, data instances are typically finite and sampled non-uniformly in a high-dimensional input space. Thus, the main challenge in modeling a process of induction from empirical data is to ensure good generalization and avoid over-fitting. Consequently, machine learning (ML) that relies solely on traditional empirical risk minimization (ERM) performs poorly on instances that are yet to be observed. The Support Vector Machine (SVM), as an ML technique, offers a principled approach to learning tasks because its mathematical foundations are rooted in statistical learning theory and its formulation embodies structural risk minimization (SRM) rather than the traditional ERM over misclassified training data. However, as originally introduced by Boser, Guyon, and Vapnik in 1992, SVM was primarily geared toward off-line binary classification, so once the classifier's model is defined, online learning would require a complete model retrain. Dynamic, incremental, or online learning refers, in this context, to the situation where the training dataset is not fully available at the beginning of the learning process: the data can arrive at different time intervals and need to be efficiently incorporated into the training set to preserve the class concept.
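For context on the ERM/SRM distinction invoked above, the standard soft-margin SVM objective (a textbook formulation, not one specific to this dissertation) can be written as

\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \;\; \frac{1}{2}\,\lVert \mathbf{w} \rVert^{2} \; + \; C \sum_{i=1}^{n} \xi_{i}
\quad \text{s.t.} \quad y_{i}\bigl(\mathbf{w}^{\top}\phi(\mathbf{x}_{i}) + b\bigr) \ge 1 - \xi_{i}, \quad \xi_{i} \ge 0,

where the capacity term \tfrac{1}{2}\lVert \mathbf{w} \rVert^{2} controls the margin (the SRM component) and the slack sum C\sum_{i}\xi_{i} penalizes empirical training error (the ERM component); C, the feature map \phi, and the slack variables \xi_{i} follow the usual textbook notation.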
Within the context of dynamic SVM classification, we propose two techniques and apply them to three learning tasks, making this dissertation, to the best of our knowledge, a novel research work. Starting with an off-line learning model, our first dynamic SVM technique (hereafter, DSVM1) sequentially updates the hyper-plane parameters when necessary, based on our proposed weighted incremental criteria. DSVM1 is attractive because it is simple to implement, offers faster computing time, and has lower storage and memory requirements than the retrain model. We applied DSVM1 to three different sets of experiments, investigating behavior learning on video stream data for an articulated humanoid, decision fusion in a distributed network, and computed tomography colonography (CTC) classification. The extreme imbalance observed in the CTC data motivated us to develop DSVM2, our second proposed dynamic SVM technique, which uses a data-dependent kernel for online novelty detection. Results from all four learning tasks demonstrate the feasibility and merits of the proposed incremental classification approaches using the SVM technique. In essence, the resource-intensive ML training phase was reduced without drastically affecting the classifier's accuracy. Instead, the effectiveness of the existing learner and its ability to continuously learn were improved through successive incorporation of the incremental knowledge into the existing classifier model.
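To make the retrain-avoidance idea concrete, the following is a minimal sketch, assuming scikit-learn's SVC, binary labels in {-1, +1}, and a simple margin-violation trigger; the class name IncrementalSVM and the trigger rule are illustrative stand-ins, not the dissertation's DSVM1 weighted incremental criteria.

# Sketch of incremental SVM updating: retrain only when a new batch violates
# the current margin, keeping just the retained support vectors plus the
# violating points instead of the full history.
import numpy as np
from sklearn.svm import SVC

class IncrementalSVM:
    def __init__(self, C=1.0, kernel="rbf"):
        self.clf = SVC(C=C, kernel=kernel)
        self._X = None   # retained training points (support set)
        self._y = None

    def fit_initial(self, X, y):
        """Train the off-line base model on the initially available data."""
        self.clf.fit(X, y)
        self._keep_support_vectors(X, y)

    def update(self, X_new, y_new):
        """Incorporate a new batch; retrain only if the margin is violated."""
        # Functional margin y*f(x); assumes labels are -1/+1 and binary classes.
        margins = y_new * self.clf.decision_function(X_new)
        violators = margins < 1.0          # inside the margin or misclassified
        if not np.any(violators):
            return False                   # current hyper-plane still adequate
        X = np.vstack([self._X, X_new[violators]])
        y = np.concatenate([self._y, y_new[violators]])
        self.clf.fit(X, y)
        self._keep_support_vectors(X, y)
        return True

    def _keep_support_vectors(self, X, y):
        idx = self.clf.support_            # indices of support vectors
        self._X, self._y = X[idx], y[idx]

Because only support vectors and margin-violating points are retained, storage and retraining cost stay bounded by the support set rather than growing with the full data stream, which is the practical appeal the abstract attributes to DSVM1.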