Ask a Librarian

There are lots of ways to contact a librarian. Choose what works best for you.

HOURS TODAY

Reference Desk: 11:00 am - 3:00 pm

CONTACT US BY PHONE

Voice: (802) 656-2022

Text: (802) 503-1703

MAKE AN APPOINTMENT OR EMAIL A QUESTION

Schedule an Appointment

Meet with a librarian or subject specialist for in-depth help.

Email a Librarian

Submit a question and receive a reply by email.


Library Hours for Friday, April 19th

All of the hours for today can be found below. We look forward to seeing you in the library.
HOURS TODAY

Main Library: 8:00 am - 6:00 pm

SEE ALL LIBRARY HOURS

WITHIN HOWE LIBRARY

Maps: M-Th by appointment, email govdocs@uvm.edu

Media Services: 8:00 am - 4:30 pm

Reference Desk: 11:00 am - 3:00 pm

OTHER DEPARTMENTS

Special Collections: 10:00 am - 5:00 pm

Dana Health Sciences Library: 7:30 am - 6:00 pm


UVM Theses and Dissertations

Format: Print
Author: Chen, Qijun
Dept./Program: Computer Science
Year: 2004
Degree: M.S.
Abstract:
Inductive learning is a typical learning task in machine learning. Given a data set, inductive learning aims to discover patterns in the data and form concepts that describe the data. Research in inductive learning has been sustained for decades. However, much of the existing work focuses on relatively small amounts of data, an approach that is infeasible in large, real-world situations. With the rapid advancement of information technology, scalability has become a necessity for learning algorithms that deal with large, real-world data repositories. This thesis aims to design scalable inductive learning algorithms. Scalability is defined as the ability to process large data sets or to handle data sets that are distributed across different sites. In our work, scalability is achieved through a data reduction technique, which partitions a large data set into subsets, applies the learning algorithm to each subset sequentially or concurrently, and then integrates the learned results.
Five strategies for achieving scalability (Rule-Example Conversion, Rule Weighting, Iteration, Good Rule Selection, and Data Dependent Rule Selection) have been identified, and their corresponding scalable schemes have been designed and developed. A substantial number of experiments have been performed to evaluate these schemes. Experimental results demonstrate that, through data reduction, some of our schemes can effectively generate accurate classifiers from the inaccurate classifiers learned on data subsets. Furthermore, our schemes require significantly less training time than generating a global classifier does. Among the five investigated strategies, Iteration and Data Dependent Rule Selection are the two most effective with respect to the classification accuracy of the generated classifiers and the variety of data sets they can handle. These two strategies, combined with a Voting strategy, yield schemes that consistently outperform Voting.
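As a rough illustration of the partition-learn-integrate workflow described in the abstract, the following Python sketch splits a data set into subsets, trains one classifier per subset, and combines predictions by simple majority vote (one reading of the Voting baseline the abstract names). The synthetic data, scikit-learn decision trees, and helper names learn_on_partitions and majority_vote are illustrative assumptions, not the rule learner or integration schemes actually developed in the thesis.

# Hypothetical sketch (not the thesis' code): partition a large data set,
# learn one classifier per subset, then integrate by majority voting.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

def learn_on_partitions(X, y, n_partitions=4, seed=0):
    """Split the data into disjoint subsets and fit one classifier per subset."""
    rng = np.random.default_rng(seed)
    index_blocks = np.array_split(rng.permutation(len(X)), n_partitions)
    return [DecisionTreeClassifier(random_state=seed).fit(X[idx], y[idx])
            for idx in index_blocks]

def majority_vote(models, X_new):
    """Integrate the subset classifiers by per-example majority vote."""
    preds = np.stack([m.predict(X_new) for m in models]).astype(int)
    return np.array([np.bincount(column).argmax() for column in preds.T])

# Synthetic stand-in for a "large" data set.
X, y = make_classification(n_samples=10_000, n_features=20, random_state=0)
models = learn_on_partitions(X, y)
print("votes for the first five examples:", majority_vote(models, X[:5]))

Majority voting appears here only because the abstract names Voting as the baseline combination; the five listed strategies replace or augment this integration step in ways the abstract does not detail.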