\name{g.test}
\alias{g.test}
\title{Log-Likelihood Ratio Test for Count Data}
\usage{
g.test(x, y = NULL, correct = "none",
       p = rep(1/length(x), length(x)),
       simulate.p.value = FALSE, B = 2000)
}
\arguments{
  \item{x}{a vector or matrix.}
  \item{y}{a vector; ignored if \code{x} is a matrix.}
  \item{correct}{a character string indicating which correction to apply when computing the test statistic: \code{"none"}, \code{"yates"} for continuity correction, or \code{"williams"} for Williams' correction.}
  \item{p}{a vector of probabilities of the same length as \code{x}.}
  \item{simulate.p.value}{a logical indicating whether to compute p-values by Monte Carlo simulation.}
  \item{B}{an integer specifying the number of replicates used in the Monte Carlo simulation.}
}
\description{
  \code{g.test} performs log-likelihood ratio tests on contingency tables.
}
\details{
  If \code{x} is a matrix with one row or column, or if \code{x} is a vector and \code{y} is not given, \code{x} is treated as a one-dimensional contingency table. In this case, the hypothesis tested is whether the population probabilities equal those in \code{p}, or are all equal if \code{p} is not given.

  If \code{x} is a matrix with at least two rows and columns, it is taken as a two-dimensional contingency table, and hence its entries should be nonnegative integers. Otherwise, \code{x} and \code{y} must be vectors or factors of the same length; incomplete cases are removed, the objects are coerced into factor objects, and the contingency table is computed from these. Then, a log-likelihood ratio test is performed of the null hypothesis that the joint distribution of the cell counts in a two-dimensional contingency table is the product of the row and column marginals.

  Continuity correction is applied only in the df = 1 case, and only if \code{correct} is \code{"yates"}.
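
  For reference, the uncorrected statistic is usually written in the following textbook form. This is a sketch only: the symbols \eqn{O_i} and \eqn{E_i} are introduced here for illustration and are not taken from the function's source.
  \deqn{G = 2 \sum_i O_i \ln(O_i / E_i)}{G = 2 * sum(O_i * log(O_i / E_i))}
  Here \eqn{O_i} denotes the observed count in cell \eqn{i} and \eqn{E_i} the corresponding expected count under the null hypothesis. Under the null hypothesis, \eqn{G} is approximately chi-squared distributed.
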
  Williams' correction to the chi-squared approximation is applied if \code{correct} is \code{"williams"}.
}
\value{
  A list with class \code{"htest"} containing the following components:
  \item{statistic}{the value of the G test statistic.}
  \item{parameter}{the degrees of freedom of the approximate chi-squared distribution of the test statistic.}
  \item{p.value}{the p-value for the test.}
  \item{method}{a character string indicating the type of test performed, and whether continuity correction or Williams' correction was used.}
  \item{data.name}{a character string giving the name(s) of the data.}
  \item{observed}{the observed counts.}
  \item{expected}{the expected counts under the null hypothesis.}
}
\references{
  Robert R. Sokal & F. James Rohlf (1995), \emph{Biometry}, 3rd ed. New York: W. H. Freeman & Company. Pages 690--760.

  Jerrold H. Zar (1999), \emph{Biostatistical Analysis}, 4th ed. New Jersey: Prentice-Hall Inc. Pages 473--514.
}
\examples{
## Goodness of fit example (Sokal & Rohlf Box 17.1 Part 1, p. 699)
pheno <- c(63, 31, 28, 12, 39, 16, 40, 12)
mendel <- c(18, 6, 6, 2, 12, 4, 12, 4)
g.test(pheno, p = mendel / sum(mendel), correct = "williams")$p.value
## p = 0.2696

## 2x2 test of independence example (Sokal & Rohlf Box 17.6, p. 731)
x <- matrix(c(12, 16, 22, 50), ncol = 2)
g.test(x)$statistic
## G = 1.332489
g.test(x, correct = "williams")$p.value
## p = 0.2537082

## Effect of simulating p-values
x <- matrix(c(12, 5, 7, 7), ncol = 2)
g.test(x)$p.value
## p = 0.2408640
g.test(x, simulate.p.value = TRUE, B = 10000)$p.value
## around 0.13!
}
\keyword{htest}