In statistics, the Poisson regression model attributes to a response variable Y a Poisson distribution whose expected value depends on a predictor variable x (written in lower case because the model treats x as non-random) in the following way:
$$\log(\operatorname{E}(Y)) = a + bx$$
(where "log" means natural logarithm). Poisson regression models are generalized linear models with "log" as the (canonical) link function and a Poisson-distributed response.
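Written out, and assuming the log-linear form log E(Y) = a + bx with intercept a and slope b, the model implies the following mean and probability mass function. The symbol μ(x) is introduced here only for convenience.

```latex
\begin{aligned}
\mu(x) &= \operatorname{E}(Y \mid x) = e^{a + bx}, \\
\Pr(Y = y \mid x) &= \frac{\mu(x)^{y}\, e^{-\mu(x)}}{y!}, \qquad y = 0, 1, 2, \ldots
\end{aligned}
```

Under this form, a one-unit increase in x multiplies the expected count by a factor of e^b.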
If $Y_i$ are independent observations with corresponding values $x_i$ of the predictor variable, then a and b can be estimated by maximum likelihood, provided the number of distinct x values is at least 2. The maximum-likelihood estimates lack a closed-form expression and must be found by numerical methods.
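One such numerical method is Newton-Raphson iteration on the Poisson log-likelihood. The following is a minimal sketch in plain Python, assuming the model log E(Y) = a + bx. The function name `fit_poisson`, the starting values, and the small data set are hypothetical choices made for illustration, not part of any particular library.

```python
import math

def fit_poisson(xs, ys, max_iter=100, tol=1e-10):
    """Estimate (a, b) in log E(Y) = a + b*x by Newton-Raphson."""
    # Start from the intercept-only fit; this assumes mean(ys) > 0.
    a = math.log(sum(ys) / len(ys))
    b = 0.0
    for _ in range(max_iter):
        mus = [math.exp(a + b * x) for x in xs]  # fitted means
        # Score: gradient of the log-likelihood with respect to (a, b).
        g_a = sum(y - mu for y, mu in zip(ys, mus))
        g_b = sum(x * (y - mu) for x, y, mu in zip(xs, ys, mus))
        # Entries of the 2x2 information matrix (negative Hessian).
        i_aa = sum(mus)
        i_ab = sum(x * mu for x, mu in zip(xs, mus))
        i_bb = sum(x * x * mu for x, mu in zip(xs, mus))
        # The determinant is zero when all x values are equal,
        # which is why at least 2 distinct x values are needed.
        det = i_aa * i_bb - i_ab * i_ab
        da = (i_bb * g_a - i_ab * g_b) / det
        db = (i_aa * g_b - i_ab * g_a) / det
        a, b = a + da, b + db
        if max(abs(da), abs(db)) < tol:
            break
    return a, b

# Hypothetical counts observed at five values of the predictor.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1, 3, 4, 9, 20]
a_hat, b_hat = fit_poisson(xs, ys)
print(a_hat, b_hat)
```

At the returned estimates both components of the score are numerically zero, which is the defining condition of the maximum-likelihood solution. A production fit would normally add safeguards such as step halving.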
Poisson regression is appropriate when the dependent variable is a count, for instance of events such as the arrival of a telephone call at a call centre (see the "Occurrence" section of the article on the Poisson distribution). The events must be independent in the sense that the arrival of one call will not make another more or less likely, but the probability per unit time of events is understood to be related to covariates such as time of day.
A common problem with Poisson regression is excess zeros: if there are two processes at work, one determining whether there are zero events or any events, and a Poisson process determining how many events there are, there will be more zeros than a Poisson regression would predict. An example would be the distribution of cigarettes smoked in an hour by members of a group where some individuals are non-smokers.
Other generalized linear models such as the negative binomial model may function better in these cases.
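The excess-zeros effect can be illustrated with a small simulation of the cigarette example. This is a sketch under stated assumptions: the share of non-smokers (`p_nonsmoker`), the smokers' mean count (`lam`), and the helper `poisson_draw` are all hypothetical values and names chosen for illustration.

```python
import math
import random

def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate (Knuth's method, fine for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

rng = random.Random(42)
n = 10_000
p_nonsmoker = 0.4  # hypothetical share of the group that never smokes
lam = 2.0          # hypothetical mean cigarettes per hour among smokers

# Two processes: first whether the person smokes at all, then how many.
counts = [0 if rng.random() < p_nonsmoker else poisson_draw(rng, lam)
          for _ in range(n)]

mean = sum(counts) / n
variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
observed_zero = counts.count(0) / n
# Zero probability implied by a single Poisson with the same mean.
poisson_zero = math.exp(-mean)

print(f"mean={mean:.3f} variance={variance:.3f}")
print(f"observed zeros={observed_zero:.3f} Poisson-implied zeros={poisson_zero:.3f}")
```

In this setup the observed share of zeros is well above what a single Poisson distribution with the same mean implies, and the sample variance exceeds the sample mean.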
This article is licensed under the GNU Free Documentation License.
It uses material from the Wikipedia article "Poisson regression".