A few questions
1. How would you determine the location of the upper fence and the lower fence when you're shown a boxplot with outliers?
2. If you're shown a boxplot with outliers and you're asked to estimate the percentage of values that are less than a certain number, do you include the outliers when determining the percentage?
3. The distribution of heights of 19-year-old women is approximately normal, with a mean of 170 cm and a standard deviation of 5 cm.
What percentage of these women have heights between 160 and 175 cm?
160 is 2 standard deviations below the mean and 175 is 1 standard deviation above the mean, so how would you work it out in this case, since the two ends are a different number of standard deviations from the mean (2 and 1)?
4. If you're given two variables, number in theatre and time (minutes), with values for both, and you're asked to draw a scatterplot, how do you determine the response and explanatory variables if they aren't given to you?
Thanks
Hey!
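1. The fences aren't drawn on the boxplot, so you work them out from the box. The ends of the box are Q1 and Q3, and IQR = Q3 - Q1. Then lower fence = Q1 - 1.5 × IQR and upper fence = Q3 + 1.5 × IQR. Any value beyond a fence is shown as an outlier.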
2. Yes.
When interpreting a box plot, outliers only affect how you read the maximum and the minimum - if there is an outlier at either end, the most extreme outlier is the maximum or minimum, not the end of the whisker. All other interpretations - such as finding the IQR, median, upper fence, lower fence, and the percentage of values below a data value that lines up with a line on the box - are done exactly as if there were no outliers. Rarely, if ever, will they ask for the percentage or number of values that are not outliers.
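If it helps to see the numbers, here is a rough sketch of the fence calculation and of counting the percentage of values below a cut-off. The data set is made up purely for illustration, and it assumes the usual 1.5 × IQR convention for fences. Note that Python's quartile method may give slightly different Q1 and Q3 from the "median of each half" method some textbooks use.

```python
import statistics

# Made-up data for illustration only; 40 is a deliberate high outlier.
data = [2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 40]

# Quartiles (the "inclusive" method is one common convention; textbooks vary).
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Fences are calculated from the box, not read off the whiskers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Anything beyond a fence is an outlier.
outliers = [x for x in data if x < lower_fence or x > upper_fence]

# Percentage of values below a cut-off: every value counts, outliers included.
cutoff = 9
pct_below = 100 * sum(x < cutoff for x in data) / len(data)

print(q1, median, q3, iqr)        # 5.5 8.0 10.5 5.0
print(lower_fence, upper_fence)   # -2.0 18.0
print(outliers)                   # [40]
print(round(pct_below, 1))        # 54.5
```

The outlier 40 is still one of the 11 values, so it stays in the denominator when working out the percentage.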
3. Use the normal distribution and the 68-95-99.7% diagram (you should have one of these). As you have identified, 160 is 2 s.d. below the mean and 175 is 1 s.d. above it, so add up the sections in between: 13.5% + 34% + 34%, or 68% + 13.5%, which gives 81.5%.
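As a check on the 68-95-99.7 answer, this short sketch compares the rule-based estimate with the value from the normal distribution itself, using the mean of 170 and standard deviation of 5 from the question. The variable names are just mine.

```python
from statistics import NormalDist

# Heights from the question: mean 170 cm, standard deviation 5 cm.
heights = NormalDist(mu=170, sigma=5)

# 68-95-99.7 rule: 13.5% (from -2 s.d. to -1 s.d.) + 34% + 34% (from -1 s.d. to +1 s.d.).
rule_estimate = 13.5 + 34 + 34

# Exact area under the normal curve between 160 cm and 175 cm, as a percentage.
exact = 100 * (heights.cdf(175) - heights.cdf(160))

print(rule_estimate)     # 81.5
print(round(exact, 1))   # 81.9
```

The two differ slightly only because 68, 95 and 99.7 are rounded figures; the rule-based 81.5% is the answer expected when you are told to use the diagram.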
4. Think of the variables - which one depends on the other? Does time depend on the number of people in the theatre, or does the number of people in the theatre depend on the time? Which one makes sense? The response variable is the one that depends on, or is affected by, the explanatory variable. In this example, the time that has passed affects the number of people in the theatre, so time is the explanatory variable (horizontal axis) and number in theatre is the response variable (vertical axis). (As a side note, time will almost always be the explanatory variable, as time is time. It is not usually affected by anything else.)
5. A positive association occurs when the values of the response variable tend to increase as the explanatory variable increases (positive r value, regression line goes upwards), whereas a negative association occurs when the values of the response variable tend to decrease as the explanatory variable increases (negative r value, regression line goes downwards). The example requires more thinking than determining which variable is the response or explanatory. Think of real life.
For (1), does population density (RV) increase when distance from city centre (EV) increases, i.e. more towards the country? It doesn't, population density decreases, so there is a negative association between these variables. For (2), is it true that time studying is less when time using social media increases? (The RV and EV between these two can be interchangeable.)
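To tie the direction of an association to the sign of r, here is a small sketch that calculates Pearson's r by hand. The distance and density figures are invented for illustration only (they are not from any real city), and `pearson_r` is just a helper name I made up.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Invented values: distance from city centre in km (EV),
# population density in people per square km (RV).
distance = [1, 5, 10, 20, 40]
density = [5000, 3500, 2000, 900, 200]

r = pearson_r(distance, density)

# Density falls as distance rises, so r is negative.
direction = "negative" if r < 0 else "positive"
print(direction)  # negative
```

Swapping which list is passed first does not change r, which fits the point that for some pairs of variables the RV and EV can be interchanged without changing the direction of the association.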
I hope this was clear to you.