**Course:** TThu 15:00-16:20, SBS N310

**Instructor:** Jeffrey Heinz, jeffrey.heinz@stonybrook.edu

**Office Hours:** TThu 12:30-14:00, SBS N237

- Syllabus
- Books we are using:

- We determined the order of the presentations of the following papers for next week.
- We discussed Mixed Effects Models (Sonderegger chapter 7).

- Class canceled.

- We clarified any issues with the paper presentation.
- We wrapped up the remaining issues with logistic regression (Sonderegger chapter 6).
- We briefly reviewed some aspects of Sonderegger chapter 7.

- We continued discussion of logistic regression (Sonderegger chapter 6).
- We discussed the paper presentation you need to do in the last week of the course.

- We discussed HW exercises in chapter 5.
- We began discussion of logistic regression (Sonderegger chapter 6).

- We wrapped up the remaining issues with linear regression (Sonderegger chapter 5).
- Homework for next Thursday: chapter 5, exercises 1-3 (page 172 in Sonderegger)

- We discussed issues with linear regression (up to 5.6 in Sonderegger).

- We spent the day exploring linear models of the preliminary NN/subregular lg data set.

- We finished chapter 4 of Sonderegger.

- We reviewed multiple linear regression (4.4 in Sonderegger).
- We discussed the NN/subreg data set.
- Homework for next Tuesday: Recall the Tati vowel data. Let us see how well we can predict the duration of a vowel from some of its properties. Write an Rmd file which conducts simple linear and multiple regressions to answer the following questions. For each regression conducted, make sure to present the summary (or tidy) of the model obtained, and include 1-2 sentences explaining what it means.
- Conduct a simple linear regression predicting the vowel duration from the F1.
- Conduct a simple linear regression predicting the vowel duration from the F2.
- Conduct a simple linear regression predicting the vowel duration from the pitch.
- Conduct a multiple linear regression predicting the vowel duration from the F1 + F2 + pitch.
- Conduct a simple linear regression predicting the vowel duration from the stress (S or U).
- Conduct a multiple linear regression predicting the vowel duration from the Segment and stress.
- What are your overall conclusions about vowel duration in Tati?
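
The three kinds of model in this homework (one continuous predictor, several continuous predictors, one categorical predictor) can be sketched as follows. This is only an illustration on simulated data: the data frame `vowels` and its column names (`duration`, `F1`, `F2`, `pitch`, `stress`) are assumptions, and the real Tati file may be organized differently.

```r
# Hypothetical stand-in for the Tati vowel data; real column names may differ.
set.seed(1)
n <- 200
vowels <- data.frame(
  F1     = rnorm(n, mean = 500, sd = 100),
  F2     = rnorm(n, mean = 1500, sd = 300),
  pitch  = rnorm(n, mean = 180, sd = 30),
  stress = factor(sample(c("S", "U"), n, replace = TRUE))
)
# Simulated durations (ms): stressed vowels are made longer on purpose.
vowels$duration <- 80 + 0.02 * vowels$F1 +
  25 * (vowels$stress == "S") + rnorm(n, sd = 10)

# Simple linear regression: one continuous predictor.
m_f1 <- lm(duration ~ F1, data = vowels)

# Multiple linear regression: several continuous predictors.
m_multi <- lm(duration ~ F1 + F2 + pitch, data = vowels)

# Categorical predictor: R treatment-codes stress, with "S" as the reference level.
m_stress <- lm(duration ~ stress, data = vowels)

summary(m_multi)
coef(m_stress)
```

With treatment coding, the `stressU` coefficient is the estimated difference in mean duration between unstressed and stressed vowels, and the intercept is the estimated mean for stressed vowels.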

- We reviewed simple linear regression and this fish example.
- We discussed the NN/subreg data set.
- We began to look at regression with categorical variables.

- We began discussion of regression. Regression Handout
- We covered the first couple of sections of Sonderegger chapter 4.

- We discussed power and the incomplete neutralization example (chapter 3 section 3 of Sonderegger).
- For Thursday:
- Using ggplot, draw a plot like the one in Figure 3.3 on page 61 of Sonderegger. As in Figure 3.3, the power should be plotted on the y-axis and sample size on the x-axis. However, instead of considering three effect sizes (5, 10, 30), just consider the effect size (delta) to be 10. And instead of fixing the standard deviation at 45 ms, consider three possibilities: 15, 45, and 135. So your plot should have three lines in it like Figure 3.3, but each one corresponding to one of the standard deviations instead of to one of the effect sizes.
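
One possible way to set up such a plot is sketched below. It assumes a two-sample t-test with a significance level of 0.05 and uses `power.t.test()` from base R's stats package. Sonderegger's Figure 3.3 may have been computed differently (for example by simulation), and the range of sample sizes here is arbitrary.

```r
library(ggplot2)

# Assumed design: two-sample t-test, alpha = 0.05, n = observations per group.
grid <- expand.grid(n = seq(10, 500, by = 10), sd = c(15, 45, 135))
grid$power <- mapply(
  function(n, sd) {
    power.t.test(n = n, delta = 10, sd = sd, sig.level = 0.05)$power
  },
  grid$n, grid$sd
)

# One line per standard deviation; effect size is fixed at delta = 10.
p <- ggplot(grid, aes(x = n, y = power, colour = factor(sd))) +
  geom_line() +
  labs(x = "Sample size per group", y = "Power", colour = "SD (ms)")
print(p)
```

For a fixed effect size, a larger standard deviation means a smaller standardized effect, so the curve for the largest standard deviation rises most slowly.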

- We discussed effect size and the incomplete neutralization example (chapter 3 sections 1 and 2 of Sonderegger).
- For Tuesday:
- Do exercise 1 in Sonderegger Chapter 3 (p.81)

- Today we discussed z-tests, one-sample and two-sample t-tests, Wilcoxon tests, and checking normality.
- For Thursday:
- Finish reading chapter 2 of Sonderegger.

- We reviewed different kinds of probability distributions (dists.rmd).
- We reviewed standard error.
- We studied section 2.4 on hypothesis testing.
- For Tuesday:
- Do exercise 1 in Sonderegger Chapter 2 (p.44)
- Consider the Tati vowel data (will be sent in an email). Draw a vowel plot with F1 and F2 values forming the axes.
- Conduct a Welch two sample t-test to see whether the mean heights of the [i] and [u] vowels are equal. (Update: No need to do this one; we will do it in class on Tuesday.)
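
As a rough illustration of the Welch test mentioned above, here is a minimal sketch on simulated numbers. The vectors `f1_i` and `f1_u` are hypothetical, and using F1 as the measure of vowel height is an assumption; the Tati data itself is not used.

```r
set.seed(42)
# Hypothetical F1 values (Hz) standing in for the [i] and [u] tokens.
f1_i <- rnorm(30, mean = 300, sd = 40)
f1_u <- rnorm(25, mean = 330, sd = 60)

# t.test() runs Welch's test by default (var.equal = FALSE),
# so the two groups are not assumed to have equal variances.
res <- t.test(f1_i, f1_u)
res
```

The printed result reports the Welch-adjusted degrees of freedom, which are generally not a whole number.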

- We discussed the first part of Sonderegger Chapter 2.
- Seeing Theory

- We went over Franke chapter 6.
- For Tuesday 2/22: In an R Markdown file, load the fish file, and write some R commands to draw the following plots.
- Draw a bar chart showing the average weights of each kind of fish.
- Draw a histogram with 50 bins showing the weights of the fish. Color the different kinds of fish.
- Draw a boxplot showing the weights of the different kinds of fish.

- We finished discussing covariance and correlation from Franke chapter 5.
- We discussed visualization including part of 6.3.
- For Thursday 2/17: finish reading 6.3, 6.4, and 6.5.

- We went over Franke chapter 5.
- For Tuesday 2/15: In an R Markdown file, load the fish and nettle files, and write some R commands which answer the following questions.
- In `nettle`, what is the mean and standard deviation of the number of languages (per country)?
- In `nettle`, what is the mean and standard deviation of the Mean Growing Season (per country)?
- In `fish`, calculate the covariance and correlation for each *kind* of fish.
- In `fish`, change the weight and length of each fish to the imperial units (pounds and inches). Calculate the covariance and correlation for each *kind* of fish now. What has changed?
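
The effect of a unit change on covariance and correlation can be sketched on simulated data. The data frame below is a hypothetical stand-in for the fish file: the column names `Species`, `Weight` (grams), and `Length` (centimetres) are assumptions, as is the choice to compute covariance between weight and length.

```r
library(dplyr)

set.seed(7)
# Hypothetical stand-in for the fish file; real column names may differ.
fish <- data.frame(
  Species = rep(c("Bream", "Perch"), each = 20),
  Length  = c(rnorm(20, 30, 3), rnorm(20, 25, 4))      # centimetres
)
fish$Weight <- 20 * fish$Length + rnorm(40, sd = 30)   # grams

# Covariance and correlation of weight and length, per kind of fish.
stats_by <- function(df) {
  df %>%
    group_by(Species) %>%
    summarize(
      covariance  = cov(Weight, Length),
      correlation = cor(Weight, Length)
    )
}

metric <- stats_by(fish)

# Convert grams to pounds and centimetres to inches, then recompute.
imperial <- fish %>%
  mutate(Weight = Weight / 453.592, Length = Length / 2.54) %>%
  stats_by()

metric
imperial
```

Rescaling both variables divides the covariance by the product of the two conversion factors, while the correlation stays the same because it is unit-free.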

- We went over the HW.
- We went over parts of Franke chapters 3 and 4.

- We went over Franke's 2.2 and 2.3 and studied anonymous functions and map functions.
- We went over some parts of Franke's chapter 4.
- For Tuesday 2/8: In an R Markdown file, load the fish and nettle files, and write some R commands which answer the following questions.
- In `nettle`, what are the minimum, maximum, and average number of languages (per country)?
- In `nettle`, what are the minimum, maximum, and average Mean Growing Season (per country)?
- In `fish`, what are the minimum, maximum, and average weight for each *kind* of fish?
- In `fish`, what are the minimum, maximum, and average length for each *kind* of fish? (Hint: For #3 and #4, use the `group_by` and `summarize` commands in 4.4 of Franke.)
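
The `group_by` and `summarize` pattern from the hint looks roughly like this. The tiny data frame is made up for illustration, and the column names `Species` and `Weight` are assumptions about the fish file.

```r
library(dplyr)

# Hypothetical stand-in for the fish file; real column names may differ.
fish <- data.frame(
  Species = c("Bream", "Bream", "Perch", "Perch", "Perch"),
  Weight  = c(242, 290, 100, 120, 110)
)

# One row per kind of fish, with the three summary statistics as columns.
weight_summary <- fish %>%
  group_by(Species) %>%
  summarize(
    min_weight  = min(Weight),
    max_weight  = max(Weight),
    mean_weight = mean(Weight)
  )
weight_summary
```

`group_by` only marks the groups; `summarize` is what collapses each group to a single row.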

- We went over the HW exercises.
- We reviewed sections 2.5, 2.6 and part of 2.4 of Franke.
- For Thursday 2/3:
- Do the R markdown tutorial
- Review 2.3 and 2.4 and come prepared with questions.
- Review chapter 4 (at least 4.1 and 4.2 - I hope to also get to 4.3 on Thursday)

- We discussed Chapter 1 of Sonderegger 2022.
- We reviewed Franke Section 2.1 and part of 2.2
- For Tuesday 2/1:
- Complete exercises 1.17.1 - 1.17.5 from Winter 2019
- Finish reading chapter 2 of Franke 2021.
- Bring questions on Tuesday

- We went over the syllabus.
- We talked about the course and our individual interests.
- For Thursday 1/27:
- Install R and RStudio
- Open up RStudio and install the tidyverse in R: `install.packages('tidyverse')`
- Read Chapter 1 of Sonderegger 2022 (look for the file `rmld_V1.0.pdf` in the `output` folder).
- Read Chapter 1 to section 2.3 of Franke 2021.