---
title: "Assignment: Review of basic statistical tests"
output: html_document
---

# General remarks

This week you are asked to report the results of your statistical analyses in a more "professional" way. A good place to start is the APA standard (or "APA style"). There are many materials on the web, but I like this short guide from the University of Washington (read it!): https://psych.uw.edu/storage/writing_center/stats.pdf

Let's analyze two relevant examples:

**One sample *t*-test**

*Students taking statistics courses in psychology at the University of Washington reported studying more hours for tests (M = 121, SD = 14.2) than did UW college students in general, t(33) = 2.10, p = .034.*

**Repeated measures *t*-test**

*Results indicate a significant preference for pecan pie (M = 3.45, SD = 1.11) over cherry pie (M = 3.00, SD = .80), t(15) = 4.00, p = .001.*

**Take home messages:**

1. The direction of the difference/relationship is explicitly stated (more hours/preference for X over Y).
2. Descriptive statistics are reported: the mean (abbreviated as M) and the standard deviation (abbreviated as SD). You might also consider reporting the sample size and the median here.
3. All numbers are rounded to increase readability. Statistics (both descriptive and inferential) are rounded to two decimal places; the p-value is rounded to three decimal places.
4. For very low p-values ($< 0.001$), the exact value with its many decimal places is not reported. Instead, we write that it is lower than some threshold (e.g. $p < 0.001$).
5. For common statistical tests there is often no need to state which test was used. We need to report the value of the test statistic with its degrees of freedom (e.g. $t(33) = 2.10$) and the corresponding p-value.

**Extra things to consider:**

1. Many researchers argue that some measure of effect size should always be reported, so in the case of a *t*-test Cohen's $d$ or a similar measure should be included.
For the chi-squared test of independence, you should look into the Phi and Cramér's V statistics.

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

# Exercise 1

The chairperson of a psychology department suspects that some of her faculty are more popular with students than are others. There are three sections of introductory psychology, taught at 10:00 a.m., 11:00 a.m., and 12:00 p.m. by Professors Anderson, Klatsky, and Kamm. The number of students who enroll for each is:

* Professor Anderson: 32
* Professor Klatsky: 25
* Professor Kamm: 10

State the null hypothesis, run the appropriate chi-squared test, and interpret the results.

```{r}
# Put your code here
```

*Put your answer here*

From the point of view of designing a valid experiment (as opposed to the arithmetic of calculation), the data in this exercise will not really answer the question the chairperson wants answered. What is the problem and how could the experiment be improved?

*Put your answer here*

# Exercise 2

In a classic study by Clark and Clark (1939), African American children were shown black dolls and white dolls and were asked to select the one with which they wished to play. Out of 252 children, 169 chose the white doll and 83 chose the black doll. What can we conclude about the behavior of these children?

```{r}
# Put your code here
```

*Put your answer here*

# Exercise 3

Thirty years after the Clark and Clark study, Hraba and Grant (1970) repeated the study described in Exercise 2. The studies, though similar, were not exactly equivalent, but the results were interesting. Hraba and Grant found that out of 89 African American children, 28 chose the white doll and 61 chose the black doll. Run the appropriate chi-squared test on their data and interpret the results.

```{r}
# Put your code here
```

*Put your answer here*

# Exercise 4

Combine the data from Exercises 2 and 3 into a two-way contingency table (for example using the `rbind` or `cbind` function) and run the appropriate test.
How does the question that the two-way classification addresses differ from the questions addressed by Exercises 2 and 3?

```{r}
# Put your code here
```

*Put your answer here*

# Exercise 5

We know that smoking has a variety of ill effects on people; among other things, there is evidence that it affects fertility. Weinberg and Gladen (1986, `weinberg1986.csv`) examined the effect of smoking on the ease with which women become pregnant. They took 586 women who had planned pregnancies and asked how many menstrual cycles it had taken for them to become pregnant after discontinuing contraception. Weinberg and Gladen also sorted the women into smokers and non-smokers. Does smoking affect the ease with which women become pregnant?

```{r}
# Put your code here
```

*Put your answer here*

# Exercise 6 (SHARKS)

You may have noticed that the chi-squared goodness-of-fit test can be thought of as an "inexact" version of the binomial test when we have only two categories. You may also have wondered: is there a test that is exact (like the binomial test) and works with an arbitrary number of categories (like the chi-squared test)? Of course there is, and it is called the "multinomial test" (who would have thought?).

* Google how to perform the multinomial test using R (feel free to use external libraries) and re-analyze data from Exercise 1.
* Do the conclusions differ? If yes, how?
* Try to explain why the multinomial test is not widely used.

```{r}
# Put your code here
```

*Put your answer here*

# Exercise 7 (SHARKS)

Some people say that the chi-squared test of independence should be abandoned because there are better alternatives. One of them is called the *G* test.

* Google how to perform the *G* test using R (feel free to use external libraries) and re-analyze data from Exercise 6.
* Do the conclusions differ? If yes, how?
* Why is the *G* test thought to be better than the chi-squared test of independence?
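
As a hint for the last question, it can help to see what the *G* statistic actually computes. The sketch below works it out by hand in base R on a made-up 2 x 2 table (the counts and labels are assumptions for illustration, not data from any exercise) and compares it with the Pearson statistic from `chisq.test`. It is a minimal illustration rather than a replacement for a dedicated package function.

```r
# Made-up 2 x 2 table of counts, for illustration only
observed <- matrix(c(30, 20,
                     15, 35),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(group = c("A", "B"),
                                   outcome = c("yes", "no")))

# Expected counts under independence: (row total * column total) / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# G statistic (likelihood-ratio chi-squared): 2 * sum(O * log(O / E))
# This simple version assumes that no observed count is zero
g_stat <- 2 * sum(observed * log(observed / expected))
df <- (nrow(observed) - 1) * (ncol(observed) - 1)
p_value <- pchisq(g_stat, df = df, lower.tail = FALSE)

# Pearson chi-squared without continuity correction, for comparison
pearson <- chisq.test(observed, correct = FALSE)

c(G = g_stat, X2 = unname(pearson$statistic), df = df, p_G = p_value)
```

Both statistics are referred to the same chi-squared distribution, so any difference in conclusions comes from how each one measures the gap between observed and expected counts.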
```{r}
# Put your code here
```

*Put your answer here*

# Exercise 8

This example is taken from a classic paper by Kaufman and Rock (1962) on the moon illusion. The moon illusion has fascinated psychologists for years, and refers to the fact that when we see the moon near the horizon, it appears to be considerably larger than when we see it high in the sky. Kaufman and Rock hypothesized that this illusion could be explained on the basis of the greater apparent distance of the moon when it is at the horizon. As part of a very complete series of experiments, the authors initially sought to estimate the moon illusion by asking subjects to adjust a variable "moon" that appeared to be on the horizon so as to match the size of a standard "moon" that appeared at its zenith, or vice versa. (In these measurements, they used not the actual moon but an artificial one created with special apparatus.) One of the first questions we might ask is whether this apparatus really produces a moon illusion, that is, whether a larger setting is required to match a horizon moon or a zenith moon. The data for 10 subjects (`kaufman1962.csv`) are taken from Kaufman and Rock’s paper and present the ratio of the diameter of the variable and standard moons. A ratio of 1.00 would indicate no illusion, whereas a ratio other than 1.00 would represent an illusion. (For example, a ratio of 1.50 would mean that the horizon moon appeared to have a diameter 1.50 times the diameter of the zenith moon.)

```{r}
# Put your code here
```

*Put your answer here*

# Exercise 9

Hoaglin, Mosteller, and Tukey (1983) present data on blood levels of beta-endorphin as a function of stress. They took beta-endorphin levels for 19 patients 12 hours before surgery, and again 10 minutes before surgery. The observations are in the `hoaglin1983.csv` file, in fmol/ml. Based on these data, what effect does increased stress have on endorphin levels? Calculate and interpret an effect size for the data.
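
For repeated measurements there are several ways to define Cohen's $d$. The sketch below shows one common variant, the mean of the within-person differences divided by their standard deviation (often written $d_z$), using made-up numbers rather than the exercise data. The variable names are illustrative assumptions, and other variants (for example, ones that standardize by the average of the two SDs) will give different values.

```r
# Made-up before/after measurements for six hypothetical participants
before <- c(10, 12, 9, 14, 11, 13)
after  <- c(12, 15, 9, 17, 12, 16)

# Work with the within-person differences, as the paired t-test does
diffs <- after - before

# One common variant: mean difference divided by the SD of the differences
d_z <- mean(diffs) / sd(diffs)

# Equivalent route via the paired t statistic: d_z = t / sqrt(n)
t_stat <- unname(t.test(after, before, paired = TRUE)$statistic)
d_from_t <- t_stat / sqrt(length(diffs))

round(c(d_z = d_z, d_from_t = d_from_t), 2)
```

Whichever variant you choose, state it explicitly in your answer, since the same data can yield noticeably different values of $d$.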
```{r}
# Put your code here
```

*Put your answer here*

# Exercise 10

Hout, Duncan, and Sobel (1987) reported on the relative sexual satisfaction of married couples. They asked each member of 91 married couples to rate the degree to which they agreed with “Sex is fun for me and my partner” on a four-point scale ranging from 1, "never or occasionally", to 4, "almost always." The data appear in the `hout1987.csv` file. Start out by running a matched-sample *t*-test on these data. Why is a matched-sample test appropriate?

```{r}
# Put your code here
```

*Put your answer here*
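
When writing up the exercises above, the rounding rules from the general remarks can be applied with a small helper. The sketch below is one possible way to do it, not a required approach: `format_p` and `report_t` are hypothetical helper names (they do not come from any package), and the scores are made up for illustration.

```r
# Hypothetical helper: formats a p-value following the rules from the
# general remarks (three decimals, threshold notation below .001)
format_p <- function(p) {
  if (p < 0.001) {
    return("p < .001")
  }
  # Round to three decimals, then drop the leading zero (APA convention)
  paste0("p = ", sub("^0", "", sprintf("%.3f", p)))
}

# Hypothetical helper: builds an APA-style string from a t.test() result
report_t <- function(test) {
  sprintf("t(%s) = %.2f, %s",
          format(round(unname(test$parameter), 2)),  # degrees of freedom
          unname(test$statistic),                    # t rounded to 2 decimals
          format_p(test$p.value))
}

# Made-up scores, for illustration only
scores <- c(5.1, 4.8, 6.2, 5.9, 5.5, 6.4, 4.9, 5.7)
result <- t.test(scores, mu = 5)
report_t(result)
```

The resulting string covers only the inferential part; the direction of the effect and the descriptive statistics (M, SD) still have to be written out in the sentence around it.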