Saturday, July 21, 2012

Confidence Intervals for Proportions

Suppose we have a population proportion of interest. There are many examples:
1. The proportion of left-handed professional baseball players.
2. President Clinton's rating.
3. The proportion of patients with a specific disease who are under a new drug.
4. The proportion of graduating high school students who can read at the eighth grade level.
5. The proportion of Republicans who will vote for Bush.
6. The proportion of Democrats who will vote for Bush.
7. The proportion of Republicans who will vote for Gore.
8. The proportion of Democrats who will vote for Gore.
9. The proportion of citizens who will not vote.

Let p denote the population proportion. To estimate p, we sample the population and form the sample proportion which we will call $\hat{p}$.
Baseball Example. Consider the first example above: the proportion of left-handed professional baseball players. We have a sample of size 59 from this population.

Of these 59 players, 15 are left-handed, so the sample proportion is $\hat{p} = 15/59 = .2542$. Thus .2542 is our estimate of the proportion of left-handed professional baseball players. How far off might this estimate be?

In general, $\hat{p}$ is a sample average: record each success as 1 and each failure as 0; the sum of these 0's and 1's is then the number of successes, and their average (the sum divided by n) is $\hat{p}$. Hence we can invoke the Central Limit Theorem to determine a confidence interval for p. We use a slightly different standard error, though. The (approximate 95%) confidence interval for p is:
\begin{displaymath}\big(\hat{p} - 1.96 \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}, \hat{p} + 1.96 \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\big)
\end{displaymath}

You have already used this. The error term here is the same one we used for our estimates of probability based on resampling, except that 1.96 replaces 2. This confidence interval has the same interpretation as the one in the last section; i.e., we are fairly confident that the true population proportion is contained in the interval.
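
As a rough illustration of the 0/1 coding and the interval formula, here is a minimal Python sketch. The function name `proportion_ci` and the data (40 successes, 60 failures) are made up for this illustration; they are not part of the original notes.

```python
import math

def proportion_ci(outcomes, z=1.96):
    """Approximate confidence interval for a proportion.

    `outcomes` is a list of 0/1 codes (1 = success, 0 = failure).
    Returns (p_hat, lower, upper).
    """
    n = len(outcomes)
    # The average of the 0/1 codes is the sample proportion.
    p_hat = sum(outcomes) / n
    # Margin of error: z times the estimated standard error of p_hat.
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin

# Hypothetical data: 40 successes and 60 failures.
outcomes = [1] * 40 + [0] * 60
p_hat, lower, upper = proportion_ci(outcomes)
print(f"p_hat = {p_hat:.4f}, interval = ({lower:.4f}, {upper:.4f})")
```

Passing the raw 0/1 codes, rather than a count, is only meant to make the "sample average" point visible; in practice the number of successes and n are all the formula needs.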
Baseball Example. For the baseball data,
\begin{displaymath}1.96 \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 1.96 \sqrt{\frac{.2542(1-.2542)}{59}} = .1111
\end{displaymath}

Hence, the confidence interval is (.2542 - .1111, .2542 + .1111) = (.143, .365). So provided these data came from a random sample, we are fairly confident that the true percentage of left-handed professional baseball players is between 14% and 37%.
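
The arithmetic for the baseball data can be checked with a few lines of Python. This is just a sketch that plugs the sample values (15 left-handers out of 59) into the formula; the variable names are chosen here for illustration.

```python
import math

n = 59              # sample size
left_handed = 15    # left-handed players in the sample

p_hat = left_handed / n
# 1.96 times the estimated standard error of p_hat
margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - margin, p_hat + margin

print(f"p_hat  = {p_hat:.4f}")
print(f"margin = {margin:.4f}")
print(f"interval = ({lower:.3f}, {upper:.3f})")
```

Rounded to the same number of places, these should agree with the values worked out by hand.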



By: Lera Gay Bacay
