ATAR Notes: Forum

VCE Mathematical Methods CAS => Topic started by: ElephantStew on August 11, 2008, 07:14:01 pm

Title: quick question - discrete prob
Post by: ElephantStew on August 11, 2008, 07:14:01 pm

quick question regarding discrete probability, how do you find the median???
thanks

Title: Re: quick question - discrete prob
Post by: Mao on August 11, 2008, 08:05:43 pm

the median is the value where 50% of the data is above it and 50% of the data is below it

generally, just add up the probabilities from the smallest value of X until you get to 0.5

more formally, you'd use the cumulative distribution function (cdf). the point where the probability passes 0.5 (or reaches it) is the median

Title: Re: quick question - discrete prob
Post by: excal on August 11, 2008, 08:09:05 pm

You can also construct a cumulative probability table in order to 'count up to 0.5'.
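If it helps to see the table idea spelled out, here is a minimal Python sketch. The distribution is made up purely for illustration (it is not from this thread), and the names `cumulative_table` and `first_reaching_half` are hypothetical. Exact fractions are used so that the comparison with 0.5 is not thrown off by floating-point rounding.

```python
from fractions import Fraction
from itertools import accumulate

# Hypothetical distribution, made up for illustration only.
values = [0, 1, 2, 3]
probs = [Fraction("0.2"), Fraction("0.2"), Fraction("0.4"), Fraction("0.2")]


def cumulative_table(values, probs):
    """Return rows of (x, Pr(X = x), Pr(X <= x)), with x in increasing order."""
    return list(zip(values, probs, accumulate(probs)))


table = cumulative_table(values, probs)

# Print the table: value, probability, running total.
for x, p, cum in table:
    print(f"{x}\t{float(p)}\t{float(cum)}")

# 'Count up to 0.5': the first value whose running total reaches or passes 0.5.
first_reaching_half = next(x for x, _, cum in table if cum >= Fraction(1, 2))
print(first_reaching_half)
```

Here the running totals are 0.2, 0.4, 0.8 and 1, so the total first passes 0.5 at X = 2.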

Title: Re: quick question - discrete prob
Post by: /0 on August 11, 2008, 08:52:24 pm

Quote from: Mao on August 11, 2008, 08:05:43 pm

the point where the probability passes 0.5 (or reaches it) is the median

Won't this 'median' be as much in the current value of X as it is in the next value of X? i.e. the current and next values of X share the median. So then what do you do if it adds exactly up to 50%?

Title: Re: quick question - discrete prob
Post by: Mao on August 11, 2008, 10:02:14 pm

depends on if we are talking about discrete or continuous

consider the following discrete distribution:

X	1	2	3	4
p(X)	0.1	0.4	0.3	0.2

in this case, the cumulative probability is exactly 0.5 at X = 2 (0.1 + 0.4 = 0.5), so the median is "shared" by 2 and 3. For the purposes of Maths Methods, you take an average, i.e. the median is 2.5. [I am unsure whether there is a more correct way of doing this]
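As a rough sketch of the averaging convention above (not an official VCAA method), the function below counts up the probabilities and averages the two neighbouring values when the total lands exactly on 0.5. The name `discrete_median` is hypothetical, and the sketch assumes the values are sorted, every listed value has non-zero probability, and the probabilities are given as exact fractions that sum to 1.

```python
from fractions import Fraction


def discrete_median(values, probs):
    """Median of a discrete distribution, averaging on an exact 0.5 tie.

    Assumes values are in increasing order, each has non-zero probability,
    and probs are Fractions summing to 1.
    """
    half = Fraction(1, 2)
    cumulative = Fraction(0)
    for i, (x, p) in enumerate(zip(values, probs)):
        cumulative += p
        if cumulative > half:
            # The running total passes 0.5 at x, so x is the median.
            return Fraction(x)
        if cumulative == half:
            # The running total is exactly 0.5: the median is shared
            # between x and the next value, so take their average.
            return Fraction(x + values[i + 1]) / 2
    raise ValueError("probabilities do not reach 0.5; check that they sum to 1")


# The distribution from the table above.
values = [1, 2, 3, 4]
probs = [Fraction("0.1"), Fraction("0.4"), Fraction("0.3"), Fraction("0.2")]

median = discrete_median(values, probs)
print(float(median))  # 2.5
```

If you pass ordinary floats instead of fractions, the `== half` check can miss a tie because of rounding, which is why exact fractions are assumed here.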

if we have a continuous random variable however, the continuity of the variable means that we can sleep soundly knowing that there is no categorical jump from the left 0.5 to the right 0.5 (there is an infinitesimal difference if you'd like). Hence, we can safely use this:

$cdf(m)=\int_a^m p(x)\; dx = 0.5$ , where a is the lower bound of the domain of the probability density function (pdf) p, and m, a number within the domain of p, is the median
for the purposes of MM, p(x) would be simple to integrate
the reverse works the same way: $1-cdf(m)=\int_m^b p(x)\; dx = 0.5$ , where b is the right-most (greatest) value in the defined domain of p

more strictly however, probability density functions are usually defined from $-\infty$ to $\infty$ , and the equation would look more like $cdf(m)=\int_{-\infty}^m p(x)\; dx = 0.5$ . Evaluating integrals involving infinity requires knowledge of improper integrals (1st year/UMEP), and is not required for any year 12 courses. We accept that our pdf is a piecewise function with non-zero values within a defined domain, and 0 elsewhere.
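
a quick worked example of the continuous case, using a made-up pdf (not one from this thread): take $p(x)=2x$ for $0 \le x \le 1$ and 0 elsewhere, so a = 0 and b = 1
integrating from the left: $\int_0^m 2x\; dx = m^2 = 0.5$ , so $m = \frac{1}{\sqrt{2}} \approx 0.707$
checking from the right: $\int_m^1 2x\; dx = 1 - m^2 = 0.5$ , which gives the same m, as expected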

Title: Re: quick question - discrete prob
Post by: /0 on August 11, 2008, 10:10:34 pm

Lol, doesn't feel quite right doing that in a discrete probability distribution... but I guess whatever the VCAA says is gospel anyway

Title: Re: quick question - discrete prob
Post by: Mao on August 11, 2008, 10:17:55 pm

The median is formally defined as a value separating the top half of the population from the bottom half of the population. For a discrete random variable, when the population splits exactly evenly between two neighbouring values, any number between those two values will suffice. Conventionally, we use the simple arithmetic average. [wikipedia doesn't have a problem with this: http://en.wikipedia.org/wiki/Median]

If you'd like to be REAL pedantic, you'd try to calculate the regression by plotting the pdf, then integrating the curve of best fit and finding at which value half the area of the whole interval is achieved.
but if you require THAT kind of accuracy, you wouldn't be using discrete probability distribution anyways :P

Title: Re: quick question - discrete prob
Post by: excal on August 12, 2008, 05:49:01 pm

Quote from: DivideBy0 on August 11, 2008, 10:10:34 pm

Lol, doesn't feel quite right doing that in a discrete probability distribution... but I guess whatever the VCAA says is gospel anyway

Medians and means are ways to find the 'centre' point of the data, with varying accuracy depending on the underlying factors in data collection (cf. robustness).

In cases of a 'tie' between two numbers in a median calculation, it is perfectly acceptable to find the centre point between these two using what is essentially a mean of those two - remembering that the purpose of the median is a measure of central tendency.