UVM Theses and Dissertations

Author:

Huang, Lulu

Title:

Application of models for count data with overdispersion and zero-inflation

Dept./Program:

Statistics Program

Year:

2007

Degree:

M.S.

Abstract:

Count data are nonnegative integer-valued data that record the number of times an event occurs. Count data are usually assumed to follow a Poisson distribution, but in practice this assumption may not be appropriate. The Poisson distribution has the characteristic that its mean and variance are equal, whereas the variance of actual counts can be greater than their mean (overdispersion). The number of zeros may also be excessive compared to the number expected under a Poisson distribution (zero-inflation). Various strategies for modeling the relationship between the counts and a set of predictor variables have been proposed, such as the linear, the Poisson, the overdispersed Poisson, the negative binomial, the zero-inflated Poisson (ZIP) and the zero-inflated negative binomial (ZINB) regression models. Investigators or data analysts must decide which to report, and may consider both the fit of a model and its ease of interpretation when selecting an appropriate one. We applied these models to mammography participation data obtained from a survey of low-income minority women conducted in 1990. We were interested in these data from a modeling perspective because few low-income minority women were getting mammograms at that time. The response count variable is the number of mammograms; the predictors are age, income, health insurance and geographic area.
For this example, the Poisson regression model did not fit well. The negative binomial and ZIP models fit better, with the negative binomial model slightly outperforming the ZIP model. Unfortunately, our attempt to fit the ZINB model with all four variables was unsuccessful. All models indicated that a greater number of mammograms was associated with having insurance and a lesser number of mammograms was associated with lower income. In practice, the Poisson regression model is not always a good choice for count data. If overdispersion exists, the Poisson regression model tends to underestimate standard errors, so the negative binomial and the overdispersed Poisson regression models may be more appropriate. If zero-inflation exists, the ZIP model may be used. If both zero-inflation and overdispersion exist, the ZINB, the negative binomial, and the ZIP models may be used.
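
As a rough illustration of the two diagnostics the abstract describes (variance exceeding the mean, and more zeros than a Poisson distribution predicts), the following Python sketch computes both for a small set of counts. The function name and the data are hypothetical, invented for illustration; they are not taken from the thesis or from the mammography survey.

```python
import math


def dispersion_and_zeros(counts):
    """Return (variance/mean ratio, observed zeros, zeros expected under Poisson).

    Assumes at least two observations and a positive sample mean.
    """
    n = len(counts)
    mean = sum(counts) / n
    # Sample variance with the usual n - 1 denominator.
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    observed_zeros = sum(1 for c in counts if c == 0)
    # Under a Poisson distribution with this mean, P(Y = 0) = exp(-mean).
    expected_zeros = n * math.exp(-mean)
    return var / mean, observed_zeros, expected_zeros


# Hypothetical counts, made up for illustration only (not the thesis data).
counts = [0, 0, 0, 0, 0, 0, 1, 1, 2, 3, 5, 8]
ratio, obs0, exp0 = dispersion_and_zeros(counts)
print(f"variance/mean ratio: {ratio:.2f}")
print(f"zeros observed: {obs0}, expected under Poisson: {exp0:.1f}")
```

A ratio well above 1 hints at overdispersion, and an observed zero count well above the Poisson expectation hints at zero-inflation. These are informal checks on the raw counts only; they ignore predictors and are no substitute for comparing fitted regression models as the thesis does.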
