hey guys
I'm having some stats problems... (or maybe I'm just not good at justifying things).
I have to analyse the frequency of letters and produce statistics about the population frequencies of certain letters occurring within a given random text.
The problem is: what is the smallest sample that is still sufficiently large?
There are a few formulae around the internet, but I don't know which to use.
I am given nothing about the population statistics for the letters I am analysing, so no mean or standard deviation.
I have no idea where to begin; the teacher says to do whatever you want as long as you can justify it.
someone please help!!
thanks 
You don't need population parameters to make a statistical inference. There are SO many things you can do given what little information you seem to have, so let's do a confidence interval.
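We know that if
$$X \sim \mathrm{Bin}(n, p)$$
is binomial, then, writing $\bar{X} = X/n$ for the sample proportion,
$$\frac{\bar{X}-p}{\sqrt{p(1-p)}/\sqrt{n}}=\frac{X/n-p}{\sqrt{p(1-p)/n}}$$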
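has an approximately standard normal distribution when n is large. This means that
$$P\left(-z_{0.975}<\frac{X/n-p}{\sqrt{p(1-p)/n}}<z_{0.975}\right)\approx 0.95$$
so this range covers the inner 95% of our distribution (here $z_{0.975}\approx 1.96$).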
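If you want to convince yourself (or your teacher) that the normal approximation is reasonable, here is a rough simulation sketch. The function name, the default of 2000 trials, and the example values n = 500 and p = 0.1 are my own assumptions for illustration, and it treats letters as independent draws, which real text only roughly satisfies.

```python
import math
import random


def coverage_of_normal_approx(n, p, trials=2000, z=1.96, seed=0):
    """Fraction of simulated samples whose standardised proportion lies in (-z, z)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    se = math.sqrt(p * (1 - p) / n)  # standard error using the true p
    inside = 0
    for _ in range(trials):
        # X = number of times the letter shows up among n independent letters
        x = sum(1 for _i in range(n) if rng.random() < p)
        standardised = (x / n - p) / se
        if -z < standardised < z:
            inside += 1
    return inside / trials


if __name__ == "__main__":
    # Should print something in the neighbourhood of 0.95 if the approximation holds.
    print(coverage_of_normal_approx(n=500, p=0.1))
```

If the printed fraction is far from 0.95 for your n and p, the sample is probably too small for this approach.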
So, let's model this problem as a binomial distribution, where p is the probability that a letter in our sample is the letter we want, and n is the number of letters we analyse. We can then use
$$\hat{p}=\frac{X}{n}$$
to estimate the population proportion p. Replacing p with $\hat{p}$ in the standard error and rearranging for p gives the approximate 95% interval:
$$\left[\hat{p} - z_{0.975}\sqrt{\frac{1}{n}\hat{p}\left(1-\hat{p}\right)},\ \hat{p} + z_{0.975}\sqrt{\frac{1}{n}\hat{p}\left(1-\hat{p}\right)}\right]$$
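As a rough sketch of how you might compute this interval from an actual piece of text: the function name `letter_proportion_ci` is made up for illustration, and I'm assuming you only count alphabetic characters, ignore case, and use z = 1.96 for 95%.

```python
import math


def letter_proportion_ci(text, letter, z=1.96):
    """Return (p_hat, lower, upper) for the proportion of `letter` among the letters of `text`."""
    # Assumption: only alphabetic characters count, and case is ignored.
    letters = [ch.lower() for ch in text if ch.isalpha()]
    n = len(letters)
    if n == 0:
        raise ValueError("text contains no letters")
    x = letters.count(letter.lower())  # X, the number of occurrences
    p_hat = x / n                      # estimate of p
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    # Clip to [0, 1], since a proportion cannot fall outside that range.
    return p_hat, max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


if __name__ == "__main__":
    sample = "the quick brown fox jumps over the lazy dog"
    print(letter_proportion_ci(sample, "e"))
```

The clipping is a convenience choice of mine; for very small samples or very rare letters the interval is not trustworthy anyway.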
Then all you have to do is choose n such that this interval is narrow enough for you to be satisfied with your result. Note that the expected value of X, np, is the expected number of times the letter appears in a sample of n letters.
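To turn "narrow enough" into a number, one option is to pick a margin of error E (the half-width of the interval) and solve $z\sqrt{p(1-p)/n} \le E$ for n, which gives $n \ge z^2 p(1-p)/E^2$. Since $p(1-p) \le 1/4$, using p = 0.5 gives a worst-case n when you know nothing about p. A minimal sketch, where the function name and defaults are my own assumptions:

```python
import math


def required_sample_size(margin, p_guess=0.5, z=1.96):
    """Smallest n with z * sqrt(p_guess * (1 - p_guess) / n) <= margin, up to float rounding."""
    if not 0 < margin < 1:
        raise ValueError("margin must be between 0 and 1")
    if not 0 < p_guess < 1:
        raise ValueError("p_guess must be between 0 and 1")
    # p_guess = 0.5 maximises p * (1 - p), so it is the conservative default.
    return math.ceil(z ** 2 * p_guess * (1 - p_guess) / margin ** 2)


if __name__ == "__main__":
    # Worst case: no prior idea of the letter's frequency, margin of one percentage point.
    print(required_sample_size(0.01))
    # With a rough pilot estimate of p (0.1 here is purely illustrative), fewer letters are needed.
    print(required_sample_size(0.01, p_guess=0.1))
```

A small pilot sample to get a rough p_guess, followed by this calculation, is easy to justify in a write-up.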