UVM Theses and Dissertations

Format: Print
Author: Zheleva, Elena M.
Dept./Program: Computer Science
Year: 2005
Degree: MS
Abstract:
Pattern matching with regular expressions is widely used in computational biology, and searching a database of character sequences can take a long time. In this thesis, we explore and evaluate alternative suffix-tree representations of the sequence database for two types of query problems: 1) decide whether a match exists, and 2) find all matches to a given pattern. We propose a framework in which the first problem can be solved faster than with previous solutions without slowing down the solution of the second. We apply several heuristics at two levels: at suffix tree creation, which results in modified tree representations, and at regular expression matching, where we search subtrees in a predefined order by simulating a deterministic finite automaton built from the given regular expression. The focus of the work is a method for faster retrieval of PROSITE motif (a restricted form of regular expression) matches from a protein sequence database. We show empirically the effectiveness of our solution on several real protein data sets. Our method can be applied to any problem that requires answering restricted regular expression queries on a flat-file string database.
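
The general idea in the abstract, walking an index of suffixes while stepping an automaton derived from the pattern, can be illustrated with a small sketch. This is not the thesis's implementation: it assumes an uncompressed suffix trie instead of a suffix tree, tracks automaton states on the fly instead of building a deterministic finite automaton up front, visits subtrees in no particular order, and supports only a subset of PROSITE syntax (no `<` or `>` anchors). All function names (`parse_prosite`, `build_suffix_trie`, `search`) are hypothetical.

```python
import re

AMINO = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acid codes

def parse_prosite(pattern):
    """Parse a PROSITE-style pattern into (allowed_chars, min_rep, max_rep) elements."""
    elements = []
    token_re = r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?"
    for token in pattern.rstrip(".").split("-"):
        m = re.fullmatch(token_re, token)
        if m is None:
            raise ValueError(f"unsupported element: {token!r}")
        core, lo, hi = m.groups()
        if core == "x":                      # any residue
            allowed = frozenset(AMINO)
        elif core.startswith("["):           # any of the listed residues
            allowed = frozenset(core[1:-1])
        elif core.startswith("{"):           # any residue except those listed
            allowed = frozenset(AMINO - set(core[1:-1]))
        else:                                # one specific residue
            allowed = frozenset(core)
        lo = int(lo) if lo else 1
        hi = int(hi) if hi else lo
        elements.append((allowed, lo, hi))
    return elements

def closure(states, elements):
    """Add every state reachable without consuming input.
    A state (i, c) means: c characters consumed so far by element i."""
    stack, seen = list(states), set(states)
    while stack:
        i, c = stack.pop()
        if i < len(elements) and c >= elements[i][1] and (i + 1, 0) not in seen:
            seen.add((i + 1, 0))
            stack.append((i + 1, 0))
    return frozenset(seen)

def step(states, ch, elements):
    """Advance the set of automaton states by one input character."""
    moved = set()
    for i, c in states:
        if i < len(elements):
            allowed, _, hi = elements[i]
            if c < hi and ch in allowed:
                moved.add((i, c + 1))
    return closure(moved, elements)

def build_suffix_trie(sequences):
    """Naive suffix trie (quadratic space); each node records the
    (sequence id, start offset) of every suffix passing through it."""
    root = {"children": {}, "starts": []}
    for sid, seq in enumerate(sequences):
        for start in range(len(seq)):
            node = root
            for ch in seq[start:]:
                node = node["children"].setdefault(ch, {"children": {}, "starts": []})
                node["starts"].append((sid, start))
    return root

def search(root, elements, first_only=False):
    """Return matches as (sequence id, start, end) with end exclusive.
    With first_only=True, stop at the first accepting node (existence query)."""
    accept = (len(elements), 0)
    matches = []
    stack = [(root, closure({(0, 0)}, elements), 0)]
    while stack:
        node, states, depth = stack.pop()
        if accept in states and depth > 0:
            matches.extend((sid, s, s + depth) for sid, s in node["starts"])
            if first_only:
                return matches[:1]
        for ch, child in node["children"].items():
            nxt = step(states, ch, elements)
            if nxt:  # prune subtrees where the automaton has no live state
                stack.append((child, nxt, depth + 1))
    return matches

if __name__ == "__main__":
    trie = build_suffix_trie(["MKCAAHC", "GGCDEHW"])
    print(sorted(search(trie, parse_prosite("C-x(2,3)-H."))))
```

The two query types from the abstract map onto the `first_only` flag: an existence query can return as soon as one accepting node is reached, while a find-all query continues through every subtree in which the automaton still has a live state. Because each trie path is shared by all suffixes that begin with it, one automaton walk reports every occurrence recorded at the accepting node.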