Hello, I need clarification on validity and reliability, as I can't really nail down how to determine the extent of each one in a given scenario. Here is a research scenario from Jacaranda:
For her extended VCE Psychology practical investigation, Amelia decided to investigate encoding in short-term memory. She used a random sample of 30 students from a cohort of 150 Year 10 students at her school.
Two lists of monosyllabic words were read out to participants in the investigation:
• List 1 – key, pea, ski, flea, tea, bee, knee, tree, sea (monosyllabic words that rhyme)
• List 2 – sock, bean, stick, ant, milk, fly, leg, leaf, sand (monosyllabic words that do not rhyme)
All 30 participants listened to two readings of the words in List 1 and were then given two minutes to write down the words that they recalled. Next, they all listened to two readings of the words in List 2 and were then given two minutes to write down the words that they recalled.
There were two questions asking about validity and reliability (2 marks each). For validity I wrote the definition correctly, but for the scenario I wrote that validity is the extent to which the amount of time taken to do the puzzle was actually due to the IV, which was not marked correct. My teacher wrote 'How do you know time taken to complete puzzle seems valid in objectively measuring concentration skills'. tbh I don't know how it is actually 'valid' either, so I need help on that. For reliability I wrote the definition and then wrote whether results will be consistent or the same in different experimental conditions or if other experimenters used different items. Another mark taken off. I looked at the solutions and I sorta understand reliability, but I need help with validity in particular.
Thanks!
Validity - does the experiment test the relationship between IV and DV?
I don't understand... is the puzzle you refer to from a different scenario? Are we still talking about the Jacaranda prompt?
Having few extraneous variables contributes positively to validity, because you know the IV is isolated in making changes in the DV (so the measured effect of the IV on the DV is genuine, not produced by other variables). Ask yourself: is it a good design to test one against the other? Validity also refers to whether the results can be generalised (external validity). Are the results valid for people in the past? People of a different age group? A different culture/ethnicity? This is rare to see in a VCE context.
Reliability - does the test give the same results time and time again?
You can have reliability within the experiment itself. For example, in your scenario above, it would have poor reliability if the word list changed each time, or if one list was allowed to be recalled over 3 minutes but the other over 2. The procedure wouldn't be consistent and would give different results each time - not a reliable measure.
You can also have external reliability. This is a measure of the experiment itself - if the experiment was run again by someone else, would they get similar results? If they would (e.g. the method is easy to follow, the trials are consistent measures), then there is a high level of reliability. If the results were different each time the test was re-run, it is not a reliable tool for testing whatever you're trying to test.
Inter-rater reliability was also mentioned in the previous study design, and refers to how closely different 'raters' (basically the people observing and scoring the experiment) agree when they assess the same thing. If their ratings match, your measure is probably reliable. If not, it probably isn't. For example, if you said that the DV is 'violent behaviour' - some raters might only think that a kick (but not a punch) is violent, while others might think a punch is violent. In that case, you can improve reliability by operationalising or objectifying your DV ("violent behaviour" -> number of times bodily contact was made in any form).
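If it helps to see the 'violent behaviour' example as numbers, here is a rough Python sketch. Everything in it is made up for illustration: the six observed events (kick, punch, push, kick, punch, shout), the two raters' codes, and the function names. Cohen's kappa is a standard agreement statistic that corrects for agreement by chance. It isn't something this thread or the scenario asks for, so treat it as an optional extra rather than what the study design requires.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of observations that both raters coded the same way."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same non-empty set of events")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    # Chance agreement: probability both raters pick the same category at random.
    p_chance = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_chance == 1:
        return 1.0  # both raters used a single identical category throughout
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical events, in order: kick, punch, push, kick, punch, shout.
# Vague DV ("violent behaviour"): rater A only counts kicks, rater B only punches.
vague_a = ["violent", "not", "not", "violent", "not", "not"]
vague_b = ["not", "violent", "not", "not", "violent", "not"]

# Operationalised DV ("any bodily contact"): both raters code the events identically.
operational_a = ["contact", "contact", "contact", "contact", "contact", "none"]
operational_b = ["contact", "contact", "contact", "contact", "contact", "none"]

print("Vague DV agreement:", round(percent_agreement(vague_a, vague_b), 2))
print("Vague DV kappa:", round(cohens_kappa(vague_a, vague_b), 2))
print("Operationalised DV agreement:", percent_agreement(operational_a, operational_b))
print("Operationalised DV kappa:", cohens_kappa(operational_a, operational_b))
```

With these made-up codes the raters agree on only 2 of 6 events under the vague definition, but on all 6 once the DV is operationalised, which is the point of operationalising it.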
The reason why "results will be consistent or same in different experimental conditions or if other experimenters used different items" is not correct is that if you change the experimental conditions or use different items, how is that meant to show consistency? E.g. if one experiment showed that 15/20 words could be remembered on average when given two minutes to recall, and another showed an average of 17/20 when given five minutes to recall, how are they testing the same thing? How does that show that your method is reliable when you've actually changed it? Using different items is the same deal. Let's say one experiment used monosyllabic words (rhymed vs. not rhymed), and another used polysyllabic words (rhymed vs. not rhymed). Then you aren't repeating the same test of rhyme vs. non-rhyme; comparing the two experiments really compares monosyllabic vs. polysyllabic words, which has nothing to do with your topic.
Sorry, I think that's confusing but I hope you get what I mean.
Idk I hope that helped a bit. If you could clear up the puzzle thing I can update my answer. Good luck