Ask a Librarian

Threre are lots of ways to contact a librarian. Choose what works best for you.

HOURS TODAY

11:00 am - 3:00 pm

Reference Desk

CONTACT US BY PHONE

(802) 656-2022

Voice

(802) 503-1703

Text

MAKE AN APPOINTMENT OR EMAIL A QUESTION

Schedule an Appointment

Meet with a librarian or subject specialist for in-depth help.

Email a Librarian

Submit a question for reply by e-mail.

WANT TO TALK TO SOMEONE RIGHT AWAY?

Library Hours for Friday, April 19th

All of the hours for today can be found below. We look forward to seeing you in the library.
HOURS TODAY
8:00 am - 6:00 pm
MAIN LIBRARY

SEE ALL LIBRARY HOURS
WITHIN HOWE LIBRARY

MapsM-Th by appointment, email govdocs@uvm.edu

Media Services8:00 am - 4:30 pm

Reference Desk11:00 am - 3:00 pm

OTHER DEPARTMENTS

Special Collections10:00 am - 5:00 pm

Dana Health Sciences Library7:30 am - 6:00 pm

 

CATQuest

Search the UVM Libraries' collections

UVM Theses and Dissertations

Browse by Department
Format:
Print
Author:
Li, Zhao
Dept./Program:
Computer Science
Year:
2012
Degree:
PhD
Abstract:
Clustering analysis, an unsupervised learning paradigm that divides similar objects into groups, has long played an important role in data mining. Given a data matrix, in contrast to traditional single-sided clustering on either columns or rows of the data matrix, co-clustering treats the data matrix in a symmetric fashion such that a partitioning of rows can induce a partitioning of columns or vice verse. By clustering both rows and columns of a data matrix simultaneously co-clustering methods have been successfully applied in text mining, gene expressions, image retrieval and recommendation systems. However, an important issue in applying co-clustering methods is how to choose a good algorithm and make it more practical in real applications. There are several research issues, for example, how to find the relationship between different co-clustering algorithms; how to effectively discover the structure of high dimensional data; how to design a fast co-clustering algorithm with competitive results; and how to determine the number of co-clusters automatically. In this dissertation, we propose effective strategies as follows to address these questions from the perspective of matrix computation.
1. A Weighting Scheme for Co-clustering: By proposing a weighting scheme the relationship between different co-clustering algorithms is developed and the weighting scheme can lead to better clustering results.
2. Nonnegative Matrix Tri-factorization based on Orthogonal Subspaces: By effectively adjusting orthogonality through matrix factorization the parts-based structure of high dimensional data is clearly revealed.
3. Fast Co-clustering by Ranking and Sampling: By discriminately sampling representative data points the computational complexity of co-clustering analysis is achieved in linear time with competitive results.
4. Single Updated Matrix Decomposition: By incorporating alternative least squares to nonnegative matrix tri-factorization only one small factor matrix needs to be updated. Thus the space and time complexities are tremendously reduced. Meanwhile, our method outperforms stae-of-the-art co-clustering algorithms on large datasets.
We have also built an online search engine named "Descriptive Clustering Meta Search Engine" to address several application problems in web information retrieval, such as determining the number of clusters and describing the resulting clusters.