29-CIsTesting-MeanDifference.Rmd


# CIs and tests: mean differences (paired data) {#AnalysisPaired}
\index{Research question!relational}\index{Mean difference}


<!-- Introductions; easier to separate by format -->
```{r, child = if (knitr::is_html_output()) {'./introductions/29-CIsTesting-MeanDifference-HTML.Rmd'} else {'./introductions/29-CIsTesting-MeanDifference-LaTeX.Rmd'}}
```

<!-- Define colours as appropriate -->
```{r, child = if (knitr::is_html_output()) {'./children/coloursHTML.Rmd'} else {'./children/coloursLaTeX.Rmd'}}
```


## Introduction: six-minute walk test {#PairedIntro}

The Six-Minute Walk Test (6MWT) measures how far subjects can walk in six minutes, and is used as a simple, low-cost evaluation of fitness and other health-related measures.
The recommended setting for the test is usually a walkway of at least\ $30\ms$.
@saiphoklang2022comparison measured the 6MWT distance when the same subjects used *both*\ $20\ms$ and\ $30\ms$\ walkways.

The comparison is *within* individuals (Sect.\ \@ref(RQsRepeatedMeasures));\index{Comparison!within individuals} this is a *repeated-measures* study.
Each subject has a *pair* of 6MWT measurements, and the study produced *paired data*
`r if (knitr::is_latex_output()) {
'(Table\\ \\@ref(tab:Data6MWT)),'
} else {
'(below),'
}`
the topic of this chapter.\index{Data!paired}\index{Study types!paired}


```{r Data6MWT}
data(SixMWT)

WTlen <- dim(SixMWT)[1]

Labels <- 1 : WTlen

tb1 <- array( cbind( Labels[1:5 ],
                     SixMWT$Distance20[1:5 ],
                     SixMWT$Distance30[1:5 ],
                     round( SixMWT$Distance30[1:5 ] - SixMWT$Distance20[1:5 ], 2)),
              dim = c(5, 4) )


T1 <- knitr::kable(y <- pad(tb1,
                            surroundMaths = TRUE,
                            targetLength = c(0, 5, 5, 5),
                            decDigits = c(0, 1, 1, 1)),
                   format = "latex",
                   valign = 't',
                   align = "c",
                   linesep = "",
                   col.names = c("Person", 
                                 "$20\\ms$\\ w'way", 
                                 "$30\\ms$\\ w'way", 
                                 "Diff."),
                   row.names = FALSE,
                   escape = FALSE,
                   booktabs = TRUE) %>%
  add_header_above(c( " " = 1, 
                      "Distance walked (in m)" = 3),
                   line = TRUE,
                   bold = TRUE) %>%
  row_spec(0, bold = TRUE)


tb2 <- array( cbind( Labels[(WTlen - 4):WTlen ],
                     SixMWT$Distance20[(WTlen - 4):WTlen ],
                     SixMWT$Distance30[(WTlen - 4):WTlen ],
                     SixMWT$Distance30[(WTlen - 4):WTlen ] - SixMWT$Distance20[(WTlen - 4):WTlen ]),
              dim = c(5, 4) )


T2 <- knitr::kable(pad(tb2,
                       surroundMaths = TRUE,
                       targetLength = c(0, 5, 5, 4),
                       decDigits = c(0, 1, 1, 1)),
                   format = "latex",
                   valign = 't',
                   align = "c",
                   linesep = "",
                   col.names = c("Person", 
                                 "$20\\ms$\\ w'way", 
                                 "$30\\ms$\\ w'way", 
                                 "Diff."),
                   row.names = FALSE,
                   escape = FALSE,
                   booktabs = TRUE) %>%
  add_header_above(c( " " = 1, 
                      "Distance walked (in m)" = 3),
                   line = TRUE,
                   bold = TRUE) %>%
  row_spec(0, bold = TRUE)

out <- knitr::kables(list(T1, T2),
                     format = "latex",
                     label = "Data6MWT",
                     caption = "The six-minute walk test (6MWT) distance, for walkways of $20\\ms$\\ and $30\\ms$\\ length. These are the first five and the last five of the $50$ total observations. (A negative difference means the $20\\ms$\\ distance is greater than the $30\\ms$\\ distance.)") %>% 
  kable_styling(font_size = 8)
out2 <- prepareSideBySideTable(out,
                               gap = "\\quad") 
out2

```

```{r}
if( knitr::is_html_output() ) {
  SixMWT$Diff <- round(SixMWT$Distance30 - SixMWT$Distance20, 2)
  
  DT::datatable(SixMWT,
                fillContainer = FALSE, # Make more room, so we don't just have ten values
                colnames = c("Subject", 
                             "Distance (20 m walkway)",
                             "Distance (30 m walkway)",
                             "Age",
                             "Difference"),
                filter = "none",
                options = list(searching = FALSE), # Remove searching: See: https://stackoverflow.com/questions/35624413/remove-search-option-but-leave-search-columns-option
                caption = "The six-minute walk test (6MWT) distance, for walkways of $20\\ms$ and $30\\ms$ length. (A negative difference means the $20\\ms$ distance is greater than the $30\\ms$ distance.)")
}
```


::: {.importantBox .important data-latex="{iconmonstr-warning-8-240.png}"}
Some differences are *negative*.
This does *not* mean a negative distance.
Since the differences are computed as the $30\ms$\ distance minus the $20\ms$\ distance, a negative difference means the $20\ms$\ distance is a larger value than the $30\ms$\ distance.
:::
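
The sign convention can be illustrated with a minimal R sketch.
The three pairs of distances below are *hypothetical* values (not taken from the 6MWT data), and the object names are illustrative only.

```r
# Hypothetical 6MWT distances (in m) for three subjects; NOT the study data
dist20 <- c(400.0, 350.5, 420.0)   # 20 m walkway
dist30 <- c(425.0, 340.5, 450.0)   # 30 m walkway

# Differences defined as the 30 m distance minus the 20 m distance
diffs <- dist30 - dist20
diffs
# 25 -10 30

# A negative difference flags a subject who walked further on the 20 m walkway
which(diffs < 0)
# 2
```

Here, the second (hypothetical) subject walked $10\ms$ further using the $20\ms$\ walkway, so the difference is negative.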


## Paired data {#PairedData}
\index{Data!paired}

<!-- The RQ is a special case of a *repeated-measures RQs* (Sect.\ \@ref(RQsRepeatedMeasures)),\index{Research question!repeated-measures} where each unit of analysis has just two observations. -->
The data
`r if( knitr::is_latex_output() ) {
'in Table\\ \\@ref(tab:Data6MWT)'
} else {
'above'
}`
are *paired*.\index{Data!paired}
Computing the *differences* or *changes* between the pairs of observations makes sense, since the values for each pair belong to the same unit of analysis (the same person, in this case).

Pairing data, when appropriate, is useful because individuals can vary substantially.
Pairing means that extraneous variables\index{Variables!extraneous} (potentially, *confounding* variables)\index{Variables!confounding} are held constant for those paired observations.
For example, each pair of measurements in
`r if (knitr::is_latex_output()) {
'Table\\ \\@ref(tab:Data6MWT)'
} else {
'the data above'
}`
are recorded for the same person, so both measurements are recorded for someone of the same age, same sex, and with the same physical attributes.

Pairing is a form of blocking\index{Blocking} (Sect.\ \@ref(ManagingConfounding)).
Pairing is a good design strategy when the individuals in the pair are the same, or are very similar for many extraneous variables.
(For example, the pair may comprise two different people, of the same sex, with similar age, height and weight.)
Pairing often involves taking two measurements from the *same* individuals, as in
`r if (knitr::is_latex_output()) {
'Table\\ \\@ref(tab:Data6MWT).'
} else {
'the data above.'
}`


::: {.definition #PairedData name="Paired data"}
*Paired data* occur when the outcome is compared for two different, distinct situations for each unit of analysis.
:::


Paired studies appear in many situations; as examples:

* Heart rate is measured for each twin in a pair (the twin-pair is the 'individual'), one of whom exercises regularly and one who does not.
Pairing the twins is reasonable, given the shared genetics (and probably childhood environments also).
The *difference* between the heart rates of the twins can be recorded for each pair.
* The body temperature of dogs (the 'individuals') is measured using *both* rectal and ear thermometers for each dog.
The *difference* between the two recorded temperatures from the thermometers for each dog is recorded.
* Blood pressure is recorded from some individuals (Group\ A) after receiving Drug\ A, and from another group of individuals (Group\ B) after receiving Drug\ B.
Each person in Group\ A is matched with someone in Group\ B of the same sex, similar age and similar weight (e.g., in one of the pairs, both individuals are male, about $30$\ years-of-age, and about $180\cms$\ tall).
The *difference* between the blood pressure measurements for the individual in Group\ A and the matched person in Group\ B is recorded for each pair.
* The number of campers is recorded at many national parks (the 'individuals') on the first weekend in summer, and on the first weekend in winter.
The *difference* in camper numbers for each national park between these time points is recorded.

Many of these examples can be extended beyond two measurements.
For instance, temperatures can be compared on each dog using three different types of thermometers.
We only study *pairs* of measurements, and only for quantitative variables.


## Summarising the data {#SummarisingPairedCI}

For the 6MWT study, the distance is measured for the same subjects for two different walkway distances.
Each subject receives two measurements, and the *difference* between the distances walked for each individual is computed.\index{Mean difference}


Since the data are paired, an appropriate graph is a histogram of the differences (Sect.\ \@ref(HistoDiffPlot)); specifically, $30\ms$\ distance minus the $20\ms$\ distance.
A boxplot comparing 6MWT distance for both walkway lengths (that is, *not* pairing the data) shows the distribution of distances, and the median distances, are very similar (Fig.\ \@ref(fig:ComparePairedBoxplotsHistogram), left panel).
Any difference in individuals' 6MWT distances is difficult to see and detect.
In addition, linking the $20\ms$ and $30\ms$ distances that belong together for each individual is not possible.

Using a histogram of the differences makes the individuals' differences easier to see (Fig.\ \@ref(fig:ComparePairedBoxplotsHistogram), right panel).
The histogram also makes it easy to see that some subjects walked further with a $20\ms$\ walkway, and some further for a $30\ms$\ walkway.
Graphing the distances for each walkway length individually (using two histograms) may also be useful, but a graph of the differences is *crucial*, as the RQ is about those differences.
A case-profile plot (Sect.\ \@ref(CaseProfilePlot)) is also appropriate, but is difficult to read for these data because the sample size is large (a line is needed for each of the $50$\ units of analysis).


(ref:WalkwayPlots) Plots of the 6MWT data. Left: graphing the data *incorrectly* as unpaired. Right: a histogram of the differences in 6MWT distance ($30\ms$\ walkway distance *minus* $20\ms$\ walkway distance; the vertical grey line represents no change in distance).

```{r ComparePairedBoxplotsHistogram, fig.align="center", out.width='95%', fig.cap="(ref:WalkwayPlots)", fig.height = 3.25, fig.width = 7.75}

par(mfrow = c(1, 2))

boxplot( cbind(SixMWT$Distance20,
               SixMWT$Distance30),
         names = c("20m", "30m"),
         las = 1,
         col = plot.colour,
         ylim = c(180, 600),
         ylab = "Walk distance (in m)",
         xlab = "Walkway distance",
         main = "POOR: A boxplot of 6MWT\ndistance for two walkway lengths")

out <- hist( SixMWT$Distance30 - SixMWT$Distance20,
             breaks = seq(-40, 80, by = 10),
             plot = FALSE)

plot(x = c(-40, 80),
     y = c(0, 18),
     xlim = c(-45, 80),
     xlab = "Difference in distances (in m)",
     ylab = "Frequency",
     sub = "(30m distance minus 20m distance)",
     las = 2,
     type = "n",
     main = "Histogram of differences in 6MWT\ndistance for two walkway lengths")
box()

abline(v = 0, 
       col = "grey", 
       lty = 1,
       lwd = 2)
plot(out,
     add = TRUE,
     col = plot.colour)

arrows(x0 = 5,
       x1 = 75,
       y0 = 16,
       y1 = 16,
       code = 2,
       angle = 15,
       length = 0.1,
       col = "grey")
text( x = 45,
      y = 16,
      pos = 1,
      cex = 0.9,
      label = "30m greater")


arrows(x0 = -5,
       x1 = -41,
       y0 = 16,
       y1 = 16,
       code = 2,
       angle = 15,
       length = 0.1,
       col = "grey")
text( x = -23,
      y = 16,
      pos = 1,
      cex = 0.9,
      label = "20m greater")
```


The 6MWT distances for each walkway length can be summarised individually (the first two rows of Table\ \@ref(tab:SMWTSummary)) using the methods of Chap.\ \@ref(OneMeanConfInterval) and software (Fig.\ \@ref(fig:SMWTNumericalOutput)).
All statistics are slightly different for the two walkway distances; in particular, the mean\ $30\ms$ walkway distance is slightly larger.
However, since the RQ is about the difference between the distances, a numerical summary of the *differences* is essential (third row of Table\ \@ref(tab:SMWTSummary), based on Fig.\ \@ref(fig:SMWTNumericalOutput)).
Notice that the third row of information is computed from the values in the **Diff.** column in
`r if (knitr::is_latex_output()) {
'Table\\ \\@ref(tab:Data6MWT),'
} else {
'the data above,'
}`
not by (for instance) finding the difference between the standard deviations in the first two rows.
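
A small R sketch may make this point concrete.
The four pairs of distances are *hypothetical* (chosen only so the arithmetic is easy to follow), and the object names are illustrative.
The mean of the differences happens to equal the difference between the two means, but the standard deviation of the differences is *not* the difference between the two standard deviations.

```r
# Hypothetical paired distances (in m) for four subjects; NOT the study data
dist20 <- c(300, 350, 400, 450)   # 20 m walkway
dist30 <- c(320, 365, 430, 465)   # 30 m walkway
diffs  <- dist30 - dist20         # 20 15 30 15

# The mean of the differences equals the difference between the means...
mean(diffs)                  # 20
mean(dist30) - mean(dist20)  # 20

# ...but the standard deviation must be computed from the differences themselves
sd(diffs)                    # about 7.07
sd(dist30) - sd(dist20)      # about 0.39: not the same quantity
```

The same reasoning applies to the standard error, which depends on the standard deviation of the differences.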


```{r SMWTSummary}
SMWT.DataSummary <- array( dim = c(3, 5))

SMWT.DataSummary[1, 1] <- mean(SixMWT$Distance20)
SMWT.DataSummary[2, 1] <- mean(SixMWT$Distance30)
SMWT.DataSummary[3, 1] <- mean(SixMWT$Distance30 - SixMWT$Distance20)

SMWT.DataSummary[1, 2] <- median(SixMWT$Distance20)
SMWT.DataSummary[2, 2] <- median(SixMWT$Distance30)
SMWT.DataSummary[3, 2] <- median(SixMWT$Distance30 - SixMWT$Distance20)

SMWT.DataSummary[1, 3] <- sd(SixMWT$Distance20)
SMWT.DataSummary[2, 3] <- sd(SixMWT$Distance30)
SMWT.DataSummary[3, 3] <- sd(SixMWT$Distance30 - SixMWT$Distance20)

SMWT.DataSummary[1, 4] <- findStdError(SixMWT$Distance20)
SMWT.DataSummary[2, 4] <- findStdError(SixMWT$Distance30)
SMWT.DataSummary[3, 4] <- findStdError(SixMWT$Distance30 - SixMWT$Distance20)

SMWT.DataSummary[1, 5] <- realLength(SixMWT$Distance20)
SMWT.DataSummary[2, 5] <- realLength(SixMWT$Distance30)
SMWT.DataSummary[3, 5] <- realLength(SixMWT$Distance30 - SixMWT$Distance20)

rownames(SMWT.DataSummary) <- c("20m walkway distance (in m)", 
                                "30m walkway distance (in m)", 
                                "Difference (in m)")

if( knitr::is_latex_output() ) {
  kable(pad(SMWT.DataSummary,
            targetLength = c(6, 5, 6, 6, 2),
            surroundMaths = TRUE,
            decDigits = c(2, 1, 3, 3, 0)),
        format = "latex",
        booktabs = TRUE,
        longtable = FALSE,
        escape = FALSE,
        align = "c",
        col.names = c("Mean", "Median", "deviation", "error", "size"),
        digits = 2,
        caption = "The numerical summary of the 6MWT data. (The differences are the $30\\ms$\\ distances minus the $20\\ms$\\ distances.)") %>%
    row_spec(0, bold = TRUE) %>%
    row_spec(3, italic = TRUE) %>%
    row_spec(2,
             hline_after = TRUE) %>%    
    add_header_above(header = c(" " = 1,
                                " " = 1,
                                " " = 1,
                                "Standard" = 1,
                                "Standard",
                                "Sample"), 
                     bold = TRUE, 
                     line = FALSE,
                     align = "c") %>%
    kable_styling(font_size = 8)
} else {
  kable(pad(SMWT.DataSummary,
            targetLength = c(6, 5, 6, 6, 2),
            surroundMaths = TRUE,
            decDigits = c(2, 1, 3, 3, 0)),
        format = "html",
        booktabs = TRUE,
        longtable = FALSE,
        escape  = FALSE,
        align = "c",
        col.names = c("Mean", "Median", "Std deviation", "Std error", "Sample size"),
        digits = 2,
        caption = "The numerical summary of the 6MWT data.") %>%
    row_spec(0, bold = TRUE) 
}
```


```{r SMWTNumericalOutput, fig.cap="The 6MWT data: numerical summary software output for each group (top), and the CI and test results (bottom).", fig.align="center", out.width=c("65%","101%"), fig.show="hold"}
knitr::include_graphics("jamovi/SMWT/SMWT-NumericalSummary.png")
knitr::include_graphics("jamovi/SMWT/SMWT-PairedT.png")
```


The differences (i.e., the **Diff.**\ column in 
`r if (knitr::is_latex_output()) {
'Table\\ \\@ref(tab:Data6MWT))'
} else {
'the data given in Sect.\\ \\@ref(PairedIntro))'
}`
can be treated like a single sample of data (Table\ \@ref(tab:PairedNotation)), with the notation adapted accordingly:

* $\mu_d$: the mean *difference* in the *population* (in m).
* $\bar{d}$: the mean *difference* in the *sample* (in m).
* $s_d$: the *sample* standard deviation of the *differences* (in m).
* $n$: the number of *differences*.


```{r PairedNotation}
DiffNotation <- array(dim = c(6, 2))
colnames(DiffNotation) <- c(	"One sample mean", 
                             "Mean difference")
rownames(DiffNotation) <- c(	"The observations:",
                             "Population mean:",
                             "Sample mean:",
                             "Standard deviation:",
                             "Standard error of $\\bar{x}$:",
                             "Sample size:")


if( knitr::is_latex_output() ) {
  DiffNotation[1, ] <- c(	"Values: $x$", 
                          "Differences: $d$")
  DiffNotation[2, ] <- c(	"$\\mu$",		
                          "$\\mu_d$")
  DiffNotation[3, ] <- c(	"$\\bar{x}$",		
                          "$\\bar{d}$")
  DiffNotation[4, ] <- c(	"$s$", 			
                          "$s_d$")
  DiffNotation[5, ] <- c(	"$\\displaystyle\\text{s.e.}(\\bar{x}) = \\frac{s}{\\sqrt{n}}$",
                          "$\\displaystyle\\text{s.e.}(\\bar{d}) = \\frac{s_d}{\\sqrt{n}}$")
  DiffNotation[6, ] <- c(	"Number of \\emph{observations}: $n$",
                          "Number of \\emph{differences}: $n$")
  
  kable( DiffNotation,
         format = "latex",
         booktabs = TRUE,
         align = c("c", "c"),
         longtable = FALSE,
         escape = FALSE,
         col.names = colnames(DiffNotation),
         caption = "The notation used for mean differences (paired data) compared to the notation used for one sample mean.") %>%
    kable_styling(font_size = 8) %>%
    row_spec(0, bold = TRUE) 
}
if( knitr::is_html_output() ) {
  
  DiffNotation[1, ] <- c(	"Values: $x$", 	
                          "Differences: $d$")
  DiffNotation[2, ] <- c(	"$\\mu$",		
                          "$\\mu_d$")
  DiffNotation[3, ] <- c(	"$\\bar{x}$",		
                          "$\\bar{d}$")
  DiffNotation[4, ] <- c(	"$s$", 			
                          "$s_d$")
  DiffNotation[5, ] <- c(	"$\\displaystyle\\text{s.e.}(\\bar{x}) = \\frac{s}{\\sqrt{n}}$",
                          "$\\displaystyle\\text{s.e.}(\\bar{d}) = \\frac{s_d}{\\sqrt{n}}$")
  DiffNotation[6, ] <- c(	"Number of *observations*: $n$",
                          "Number of *differences*: $n$")
  
  kable( DiffNotation,
         format = "html",
         booktabs = TRUE,
         longtable = FALSE,
         align = c("c", "c"),
         col.names = colnames(DiffNotation),
         caption = "The notation used for mean differences (paired data) compared to the notation used for one sample mean.") %>%
    row_spec(0, bold = TRUE) 
}
```


## Confidence intervals for $\mu_d$  {#MeanDiffCI}
\index{Sampling distribution!paired quantitative data}\index{Confidence intervals!paired quantitative data|(}

The data in
`r if (knitr::is_latex_output()) {
'Table\\ \\@ref(tab:Data6MWT)'
} else {
'Sect.\\ \\@ref(PairedIntro)'
}`
can be used to answer this repeated-measures, estimation RQ:

> For Thai patients with chronic obstructive pulmonary disease, what is the mean difference between the 6MWT distance when subjects use a\ $20\ms$ walkway and a\ $30\ms$ walkway?

Every possible sample of $n = 50$ subjects comprises different people, and hence produces different 6MWT distances for\ $20\ms$ and\ $30\ms$ walkways.
For this reason, the 6MWT distance summaries in Table\ \@ref(tab:SMWTSummary) include standard errors.
Since the 6MWT distances vary from sample to sample, the *differences* between the distances for each person vary from sample to sample too, and so the sample mean difference has a *sampling distribution*.


::: {.definition #DEFSamplingDistributionDbar name="Sampling distribution of a sample mean difference"}
The *sampling distribution of a sample mean difference* is (when certain conditions are met; Sect.\ \@ref(ValiditySampleMeanDiff)) described by:

* an approximate normal distribution,
* centred around the *sampling mean* whose value is the population mean *difference*\ $\mu_d$,
* with a standard deviation, called the standard error of the difference, of 
$\displaystyle\text{s.e.}(\bar{d}) = \frac{s_d}{\sqrt{n}}$,

where\ $n$ is the number of differences, and\ $s_d$ is the standard deviation of the individual differences in the sample.
:::


<!-- ```{r PairedNotation} -->
<!-- DiffNotation <- array(dim = c(6, 2)) -->
<!-- colnames(DiffNotation) <- c(	"One sample mean",  -->
<!--                              "Mean difference") -->
<!-- rownames(DiffNotation) <- c(	"The observations:", -->
<!--                              "Sample mean:", -->
<!--                              "Standard deviation:", -->
<!--                              "Standard error of $\\bar{x}$:", -->
<!--                              "Sample size:", -->
<!--                              "Confidence interval:") -->


<!-- if( knitr::is_latex_output() ) { -->
<!--   DiffNotation[1, ] <- c(	"Values: $x$",  -->
<!--                           "Differences: $d$") -->
<!--   DiffNotation[2, ] <- c(	"$\\bar{x}$",		 -->
<!--                           "$\\bar{d}$") -->
<!--   DiffNotation[3, ] <- c(	"$s$", 			 -->
<!--                           "$s_d$") -->
<!--   DiffNotation[4, ] <- c(	"$\\displaystyle\\text{s.e.}(\\bar{x}) = \\frac{s}{\\sqrt{n}}$", -->
<!--                           "$\\displaystyle\\text{s.e.}(\\bar{d}) = \\frac{s_d}{\\sqrt{n}}$") -->
<!--   DiffNotation[5, ] <- c(	"Number of \\emph{observations}: $n$", -->
<!--                           "Number of \\emph{differences}: $n$") -->
<!--   DiffNotation[6, ] <- c(	"$\\bar{x}\\pm\\big(\\text{multiplier}\\times \\text{s.e.}(\\bar{x})\\big)$", -->
<!--                           "$\\bar{d}\\pm\\big(\\text{multiplier}\\times \\text{s.e.}(\\bar{d})\\big)$") -->

<!--   kable( DiffNotation, -->
<!--          format = "latex", -->
<!--          booktabs = TRUE, -->
<!--          align = c("c", "c"), -->
<!--          longtable = FALSE, -->
<!--          escape = FALSE, -->
<!--          col.names = colnames(DiffNotation), -->
<!--          caption = "The notation used for mean differences (paired data) compared to the notation used for one sample mean.") %>% -->
<!--     kable_styling(font_size = 8) %>% -->
<!--     row_spec(0, bold = TRUE)  -->
<!-- } -->
<!-- if( knitr::is_html_output() ) { -->

<!--   DiffNotation[1, ] <- c(	"Values: $x$", 	 -->
<!--                           "Differences: $d$") -->
<!--   DiffNotation[2, ] <- c(	"$\\bar{x}$",		 -->
<!--                           "$\\bar{d}$") -->
<!--   DiffNotation[3, ] <- c(	"$s$", 			 -->
<!--                           "$s_d$") -->
<!--   DiffNotation[4, ] <- c(	"$\\displaystyle\\text{s.e.}(\\bar{x}) = \\frac{s}{\\sqrt{n}}$", -->
<!--                           "$\\displaystyle\\text{s.e.}(\\bar{d}) = \\frac{s_d}{\\sqrt{n}}$") -->
<!--   DiffNotation[5, ] <- c(	"Number of *observations*: $n$", -->
<!--                           "Number of *differences*: $n$") -->
<!--   DiffNotation[6, ] <- c(	"$\\bar{x}\\pm\\big(\\text{multiplier}\\times \\text{s.e.}(\\bar{x})\\big)$", -->
<!--                           "$\\bar{d}\\pm\\big(\\text{multiplier}\\times \\text{s.e.}(\\bar{d})\\big)$") -->

<!--   kable( DiffNotation, -->
<!--                 format = "html", -->
<!--                 booktabs = TRUE, -->
<!--                 longtable = FALSE, -->
<!--                 align = c("c", "c"), -->
<!--                 col.names = colnames(DiffNotation), -->
<!--                 caption = "The notation used for mean differences (paired data) compared to the notation used for one sample mean.") %>% -->
<!--     row_spec(0, bold = TRUE)  -->
<!-- } -->
<!-- ``` -->


For the 6MWT data, the sample mean differences\ $\bar{d}$ are described by (Fig.\ \@ref(fig:SMWTSamplingDist)):

* an approximate normal distribution,
* with a sampling mean whose value is\ $\mu_{{d}}$,
* with a *standard error* of
\begin{equation}
\text{s.e.}(\bar{d}) =  \frac{22.039}{\sqrt{50}} = 3.117.
(\#eq:StdErrorDifferences)
\end{equation}
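
The arithmetic in Eq.\ \@ref(eq:StdErrorDifferences) can be checked with a short R sketch.
The values of\ $s_d$ and\ $n$ are those used in the equation; the object names are illustrative only.

```r
# Summary statistics for the differences, as used in the equation above
s_d <- 22.039   # sample standard deviation of the differences (in m)
n   <- 50       # number of differences

# Standard error of the sample mean difference
se_d <- s_d / sqrt(n)
round(se_d, 3)
# 3.117
```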


```{r SMWTSamplingDist, fig.cap="The sampling distribution is a normal distribution; it describes how the sample mean difference between the 6MWT distances varies in samples of size $n = 50$.", fig.align="center", fig.width=9.25, fig.height=2.5, out.width='100%'}
mn <- mean(SixMWT$Distance30 - SixMWT$Distance20) # Differences: 30 m minus 20 m, as defined in the text
n <- length(SixMWT$Distance20)
stdd <- sd(SixMWT$Distance30 - SixMWT$Distance20)

se <- stdd/sqrt(n)

par( mar = c(4, 0.25, 0.5, 0.25) )
out <- plotNormal(0,
                  se,
                  xlab = "Sample mean difference in 6MWT distances (in m)", 
                  cex.axis = 0.95,
                  ylim = c(0, 0.18),
                  xlim.hi = 0 + 3.25 * se,
                  xlim.lo = 0 - 3.25 * se,
                  showXlabels = c( 	
                    expression( mu[d]-"9.350"),
                    expression( mu[d]-6.234), 
                    expression( mu[d]-3.117), 
                    expression( mu[d] ),
                    expression( mu[d] + 3.117), 
                    expression( mu[d] + 6.234), 
                    expression( mu[d] + "9.350") ) )

arrows(x0 = 0,
       x1 = 0,
       y0 = max(out$y) * 1.25,
       y1 = max(out$y),
       angle = 15,
       length = 0.1)

text(x = 0,
     y = max(out$y) * 1.2,
     pos = 3,
     labels = expression(Sampling~mean~difference))


arrows(x0 = 0,
       x1 = 0 + se,
       y0 = 0.30 * max(out$y),
       y1 = 0.30 * max(out$y),
       code = 3, # Arrows both ends
       angle = 15,
       length = 0.1)

text(x = 0 + (se / 2),
     y = 0.30 * max(out$y),
     labels = expression( atop(Std~error,
                               s.e.(bar(italic(d)))==3.117)) )


arrows(x0 = mn,
       x1 = mn,
       y0 = 0.7 * max(out$y),
       y1 = 0,
       angle = 15,
       length = 0.1)
text(x = mn,
     y = 0.7 * max(out$y),
     pos = 3,
     labels = expression(bar(italic(d)) == 22.03) )
```


The CI for the mean difference has the same form as for a single mean (Chap.\ \@ref(OneMeanConfInterval)).
The $95$%\ confidence interval (CI) for\ $\mu_d$ is
$$
\bar{d} \pm \big(\text{multiplier} \times\text{s.e.}(\bar{d})\big).
$$
As usual when the sampling distribution has an approximate normal distribution, an approximate $95$%\ confidence interval (CI) uses the approximate multiplier of\ $2$ (from the $68$--$95$--$99.7$ rule).
This is the same as the CI for a single mean\ $\mu$ if the differences are treated as the data.

For the 6MWT data:
$$
22.03 \pm (2 \times 3.117),
$$
or $22.03\pm 6.234\ms$ (so the *margin of error* is\ $6.234\ms$).
Equivalently, the CI is from $22.03 - 6.234 = 15.796\ms$, up to $22.03 + 6.234 = 28.264\ms$.
We write:

> The mean difference in the 6MWT distances when using a\ $20\ms$ and\ $30\ms$ walkway is\ $22.03\ms$ ($\text{s.e.} = 3.117$; $n = 50$), with an approximate $95$%\ CI from\ $15.80\ms$ to\ $28.26\ms$, further for a $30\ms$\ walkway.

The CI means that the reasonable values for the population mean difference in 6MWT distances are between\ $15.80\ms$ and\ $28.26\ms$.
Alternatively, we are $95$%\ confident that the population mean difference between the 6MWT distances is between\ $15.80\ms$ and\ $28.26\ms$ (further for the $30\ms$\ walkway).
A difference of this magnitude probably has practical importance.\index{Practical importance}
Also notice that the *direction* of the difference is given: 'further for $30\ms$\ walkway'.


<iframe src="https://learningapps.org/watch?v=piue8vvyk22" style="border:0px;width:100%;height:600px" allowfullscreen="true" webkitallowfullscreen="true" mozallowfullscreen="true"></iframe>


Statistical software\index{Software output!mean differences} produces *exact* $95$%\ CIs, which may be slightly different from the *approximate* $95$%\ CI (recall: the $68$--$95$--$99.7$ rule gives *approximate* multipliers).
For the 6MWT data, the *approximate* and *exact* $95$%\ CIs are the same to one decimal place (Fig.\ \@ref(fig:SMWTNumericalOutput)).
We write:

> The mean difference in the 6MWT distances when using a\ $20\ms$ and\ $30\ms$ walkway is $22.03\ms$ ($\text{s.e.} = 3.117$; $n = 50$), with a $95$%\ CI from\ $15.76\ms$ to\ $28.29\ms$, further for a $30\ms$\ walkway.
\index{Confidence intervals!paired quantitative data|)}
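
One way to reproduce the exact CI from the summary statistics is to replace the approximate multiplier of\ $2$ with a multiplier from the $t$-distribution.
The sketch below *assumes* $n - 1 = 49$ degrees of freedom (the usual choice for a CI for one mean, though not stated in the software output discussion above), and uses the less-rounded sample mean difference of\ $22.026\ms$.
The object names are illustrative.

```r
# Summary statistics for the differences
d_bar <- 22.026   # sample mean difference (in m)
s_d   <- 22.039   # standard deviation of the differences (in m)
n     <- 50       # number of differences

se_d <- s_d / sqrt(n)

# Exact multiplier from the t-distribution (assumes df = n - 1)
multiplier <- qt(0.975, df = n - 1)   # slightly larger than 2

ci_exact <- d_bar + c(-1, 1) * multiplier * se_d
round(ci_exact, 2)
# 15.76 28.29
```

The multiplier is only slightly larger than\ $2$, which is why the approximate and exact CIs are so similar here.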


## Hypothesis tests for $\mu_d$: $t$-test  {#MeanDiffTest}
\index{Sampling distribution!paired quantitative data}\index{Hypothesis testing!paired quantitative data|(}


The data in
`r if (knitr::is_latex_output()) {
'Table\\ \\@ref(tab:Data6MWT)'
} else {
'Sect.\\ \\@ref(PairedIntro)'
}`
can be used to answer this repeated-measures, decision-making RQ:

> For Thai patients with chronic obstructive pulmonary disease, is there a mean increase in 6MWT distance using a\ $30\ms$ walkway compared to a $20\ms$ walkway?

In Sect.\ \@ref(PairedIntro), the differences were defined as the $30\ms$\ distance minus the $20\ms$\ distance, which is consistent with the wording in this RQ. 
This RQ asks if the mean walking distance is, in general, smaller when subjects use a\ $20\ms$ walkway than when they use a\ $30\ms$ walkway (which would produce a *positive* mean difference).
The *parameter* is the *population mean difference* in 6MWT, $\mu_d$. 
The RQ is one-tailed, since the shorter walkway means more time spent changing direction, possibly negatively impacting the walking distance.

The *null* hypothesis is that 'there is *no mean change* in 6MWT, in the population' (Sect.\ \@ref(AboutHypotheses)):

* $H_0$: $\mu_d = 0$.

This hypothesis, which we initially *assume* to be true, postulates that the population mean difference is zero; any non-zero mean difference in the *sample* is then due to sampling variation.

Since the RQ asks specifically if the mean distance is *smaller* for a $20\ms$ walkway, the alternative hypothesis is *one-tailed* (Sect.\ \@ref(AboutHypotheses)).
According to how the differences have been defined, the alternative hypothesis is:

* $H_1$: $\mu_d > 0$ (i.e., one-tailed).

This hypothesis says that the mean change in the population is *greater than* zero, because of the wording of the RQ, and because of how the differences were defined.
If the differences were defined in the opposite way (as 'the $20\ms$\ distance minus the $30\ms$\ distance') then the alternative hypothesis would be $\mu_d < 0$, which has the same *meaning*.

The sampling distribution, as described in Definition\ \@ref(def:DEFSamplingDistributionDbar), still applies, where $\mu_d$ is assumed to be the value given in $H_0$ (see Fig.\ \@ref(fig:SMWTSamplingDistHT)):

* an approximate normal distribution,
* centred around the *sampling mean* whose value is the population mean *difference*\ $\mu_d = 0$ (from $H_0$),
* with a standard deviation of $\displaystyle\text{s.e.}(\bar{d}) = 3.117$ (from Eq.\ \@ref(eq:StdErrorDifferences)).

The sample mean difference can be located on the sampling distribution by computing the $t$-score:\index{Hypothesis testing!mean difference}\index{Test statistic!t@$t$-score}
$$
t
= \frac{\bar{d} - \mu_{d}}{\text{s.e.}(\bar{d})}
= \frac{22.026 - 0}{3.117} = 7.07,
$$
following the ideas in Eq.\ \@ref(eq:tscore).
Software displays the same $t$-score (Fig.\ \@ref(fig:SMWTNumericalOutput)).
This is a *huge* $t$-score.


```{r SMWTSamplingDistHT, fig.cap="The sampling distribution is a normal distribution; it describes how the sample mean difference between the 6MWT distances varies in samples of size $n = 50$ if the null hypothesis is true.", fig.align="center", fig.width=9.25, fig.height=2.5, out.width='100%'}
mn <- mean(SixMWT$Distance30 - SixMWT$Distance20) # Differences: 30 m minus 20 m, as defined in the text
n <- length(SixMWT$Distance20)
stdd <- sd(SixMWT$Distance30 - SixMWT$Distance20)

se <- stdd/sqrt(n)

par( mar = c(4, 0.25, 0.5, 0.25) )
out <- plotNormal(0,
                  se,
                  xlab = "Sample mean difference in 6MWT distances (in m)", 
                  cex.axis = 0.95,
                  ylim = c(0, 0.18),
                  xlim.hi = 0 + 3.25 * se,
                  xlim.lo = 0 - 3.25 * se,
                  # showXlabels = c( 	
                  #   expression( -9.35),
                  #   expression( -6.234), 
                  #   expression( -3.117), 
                  #   expression( 0 ),
                  #   expression( 3.117), 
                  #   expression( 6.234), 
                  #   expression( 9.350) ) 
)

arrows(x0 = 0,
       x1 = 0,
       y0 = max(out$y) * 1.25,
       y1 = max(out$y),
       angle = 15,
       length = 0.1)

text(x = 0,
     y = max(out$y) * 1.2,
     pos = 3,
     labels = expression(Sampling~mean~difference))


arrows(x0 = 0,
       x1 = 0 + se,
       y0 = 0.30 * max(out$y),
       y1 = 0.30 * max(out$y),
       code = 3, # Arrows both ends
       angle = 15,
       length = 0.1)

text(x = 0 + (se / 2),
     y = 0.30 * max(out$y),
     labels = expression( atop(Std~error,
                               s.e.(bar(italic(d)))==3.117)) )


arrows(x0 = mn,
       x1 = mn,
       y0 = 0.7 * max(out$y),
       y1 = 0,
       angle = 15,
       length = 0.1)
text(x = mn,
     y = 0.7 * max(out$y),
     pos = 3,
     labels = expression(bar(italic(d)) == 22.026) )
```


A $P$-value determines if the sample data are consistent with the assumption (Table\ \@ref(tab:PvaluesInterpretation)).
Since $t = 7.07$, and since $t$-scores are like $z$-scores, the *one*-tailed $P$-value will be very small (based on the $68$--$95$--$99.7$ rule).\index{68@$68$--$95$--$99.7$ rule}
Software (Fig.\ \@ref(fig:SMWTNumericalOutput)) reports that the *two*-tailed $P$-value is less than\ $0.0001$.
Hence, the *one*-tailed $P$-value is less than $0.0001/2 = 0.00005$.
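
For readers working in R rather than from the software output, the $t$-score and $P$-values can be sketched from the summary statistics alone.
This is a minimal sketch, not the software's own code: the object names (`d_bar`, `se_d`, `n`) are illustrative, the values are the rounded ones quoted above, and the $P$-values are assumed to come from a $t$-distribution with $n - 1$ degrees of freedom.

```r
# Summary statistics quoted in the text (differences: 30 m minus 20 m)
d_bar <- 22.026   # sample mean difference (in m)
se_d  <- 3.117    # standard error of the mean difference
n     <- 50       # number of differences

# Locate the sample mean difference on the sampling distribution (H0: mu_d = 0)
t_score <- (d_bar - 0) / se_d

# One-tailed P-value for H1: mu_d > 0 (area to the right of the t-score)
p_one <- pt(t_score, df = n - 1, lower.tail = FALSE)

# Two-tailed P-value, for comparison with the software output
p_two <- 2 * p_one

# Defining the differences the opposite way (20 m minus 30 m) flips the sign
# of the t-score and the direction of H1 (mu_d < 0), but not the P-value
t_rev     <- (-d_bar - 0) / se_d
p_one_rev <- pt(t_rev, df = n - 1, lower.tail = TRUE)

round(t_score, 2)
c(p_one, p_two, p_one_rev)
```

The two one-tailed $P$-values agree, which illustrates that the direction in which the differences are defined changes the sign of the $t$-score but not the conclusion.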


::: {.importantBox .important data-latex="{iconmonstr-warning-8-240.png}"}
The software clarifies *how* the differences have been computed.
At the left of the output (Fig.\ \@ref(fig:SMWTNumericalOutput)), the order implies the differences are found as `Distance30` (the $30\ms$ walk distance) minus `Distance20` (the $20\ms$ walk distance), the same as our definition.
:::


`r if (knitr::is_latex_output()) '<!--'`
<iframe src="https://learningapps.org/watch?v=pj3pt56fk22" style="border:0px;width:100%;height:500px" allowfullscreen="true" webkitallowfullscreen="true" mozallowfullscreen="true"></iframe>
`r if (knitr::is_latex_output()) '-->'`


The one-tailed $P$-value is less than\ $0.00005$, suggesting very strong evidence (Table\ \@ref(tab:PvaluesInterpretation)) to support $H_1$.
A conclusion requires an *answer to the RQ*, a summary of the *evidence* leading to that conclusion, and some *summary statistics*:

> Very strong evidence exists in the sample (paired $t = 7.07$; one-tailed $P < 0.00005$) of a mean reduction in 6MWT distance for a $20\ms$ walkway compared to a $30\ms$ walkway (mean reduction: $22.03\ms$; $95$% CI: $15.76\ms$ to\ $28.29\ms$; $n = 50$).

Note that the direction of the difference is provided.


::: {.importantBox .important data-latex="{iconmonstr-warning-8-240.png}"}
Saying 'there is evidence of a difference' is insufficient.
You must state *which* measurement is, on average, higher (that is, what the differences *mean*).
:::
\index{Hypothesis testing!paired quantitative data|)}


## Statistical validity conditions {#ValiditySampleMeanDiff}
\index{Sampling distribution!mean difference}\index{Statistical validity (for inference)!mean differences}

As with any confidence interval and hypothesis test, these results apply under certain conditions.
The conditions under which the results are statistically valid for paired data are similar to those for one sample mean, rephrased for differences.

Statistical validity can be assessed using these criteria:

* When $n \ge 25$, the test and the CI are statistically valid.
(If the distribution of the differences is highly skewed, the sample size may need to be larger.)
* When $n < 25$, the test and the CI are statistically valid only if the data come from a *population* of differences with a normal distribution.

The sample size of\ $25$ is a rough figure; some books give other values (such as\ $30$).

This condition ensures that the *distribution of the sample mean differences has an approximate normal distribution* (so that, for example, the $68$--$95$--$99.7$ rule can be used).
Provided the sample size is at least about\ $25$, this will be approximately true *even if* the distribution of the differences in the population does not have a normal distribution.
That is, when $n \ge 25$ the sample mean differences generally have an approximate normal distribution, even if the differences themselves don't have a normal distribution.
The units of analysis are also assumed to be *independent* (e.g., from a simple random sample).
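
The claim that sample mean differences behave approximately normally when $n \ge 25$ can be explored by simulation.
The following is a minimal sketch in R under stated assumptions: the population of differences is a made-up, right-skewed (exponential) population with mean\ $1$ and standard deviation\ $1$, *not* the 6MWT data, and the names `d_bars` and `se_theory` are illustrative only.

```r
set.seed(2024)      # for reproducibility

n      <- 25        # size of each sample
n_sims <- 5000      # number of simulated samples

# Hypothetical population of differences: right-skewed (exponential),
# with mean 1 and standard deviation 1
d_bars <- replicate(n_sims, mean(rexp(n, rate = 1)))

# Theory: the sample mean differences centre on 1, with standard error 1/sqrt(n)
se_theory <- 1 / sqrt(n)
mean(d_bars)
sd(d_bars)

# 68-95-99.7 rule: about 95% of the sample mean differences should lie
# within two standard errors of the population mean difference
mean(abs(d_bars - 1) < 2 * se_theory)
```

If the final proportion is close to\ $0.95$, the $68$--$95$--$99.7$ rule is working reasonably well for the sample mean differences, even though the simulated differences themselves are far from normal.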

If the statistical validity conditions are not met, other methods (e.g., non-parametric methods\index{Non-parametric statistics} [@conover2003practical]; resampling methods\index{Resampling methods} [@efron2021computer]) may be used.
For paired qualitative data, McNemar's test can be used [@conover2003practical].


::: {.example #StatisticalValidityWeightGain name="Statistical validity"}
For the 6MWT data, the sample size is $n = 50$, so the results are statistically valid.
Neither the differences *in the population*, nor the distances *in the population* for the individual walkway lengths, need to follow a normal distribution.
:::


## Example: invasive plants {#PairedInvasivePlants}

Skypilot is an alpine wildflower native to the Colorado Rocky Mountains (USA).
In recent years, a willow shrub (*Salix*) has been encroaching on skypilot territory and, because willow often flowers early, @kettenbach2017shrub studied whether the willow may 'negatively affect pollination regimes of resident alpine wildflower species' (p.\ $6\,965$).

Data for both species were collected at $n = 25$ different sites, so the data are *paired* by site (Sect.\ \@ref(PairedIntro)).\index{Data!paired}
The data are shown in
`r if( knitr::is_latex_output() ) {
'Table\\ \\@ref(tab:FloweringData).'
} else {
'Sect.\\ \\@ref(CompareWithinInvasivePlants).'
}`
The parameter is\ $\mu_d$, the population mean *difference* between the day of first flowering for skypilot and the day of first flowering for willow (skypilot minus willow).
A *positive* value for the difference means that the skypilot values are larger, and hence that willow flowered first.
The RQ is:

> In the Colorado Rocky Mountains, is there a mean difference between first-flowering day for the native skypilot and encroaching willow?

The hypotheses are
$$
\text{$H_0$: $\mu_d = 0$}\quad\text{and}\quad\text{$H_1$: $\mu_d\ne 0$},
$$
where the alternative hypothesis is two-tailed, and $\mu_d$ is the mean difference between first-flowering day for the native skypilot and encroaching willow.


::: {.tipBox .tip data-latex="{iconmonstr-info-6-240.png}"}
Explaining *how* the differences are computed is important.
The differences here are skypilot minus willow first-flowering days.

However, the differences could be computed as willow minus skypilot first-flowering days.
*Either is fine*, as long as you remain consistent.
The *meaning* of any conclusions will be the same.  
:::


The data are summarised graphically in Fig.\ \@ref(fig:FloweringPlots) and numerically (Table\ \@ref(tab:FloweringSummaryHT)), using software output (Fig.\ \@ref(fig:FloweringjamoviHT)).


```{r FloweringjamoviHT, fig.cap="Software output for the flowering-day data.", fig.align="center", out.width=c("100%"), fig.show='hold'}
knitr::include_graphics("jamovi/Flowering/FloweringAll.png")
```


```{r FloweringSummaryHT}
data(Flowering)
FloweringSummary <- array( dim = c(3, 4))

FloweringTab <- cbind( Flowering[, 1:2], 
                       Change = Flowering[, 2] - Flowering[, 1])

rownames(FloweringSummary) <- c("Willow (encroaching)",
                                "Skypilot (native)",
                                "Differences")
colnames(FloweringSummary) <- c("Mean",
                                "Standard deviation",
                                "Standard error",
                                "Sample size")

FloweringSummary[, 1] <- colMeans(FloweringTab,
                                  na.rm = TRUE)
FloweringSummary[, 2] <- apply(FloweringTab,
                               2,
                               "sd",
                               na.rm = TRUE)
FloweringSummary[, 3] <- apply(FloweringTab,
                               2,
                               "findStdError",
                               na.rm = TRUE)
FloweringSummary[, 4] <- apply(FloweringTab,
                               2,
                               "realLength")

# Do some appropriate rounding
FloweringSummary <- round(FloweringSummary, 4)
FloweringSummary[1:2, 1] <- round(FloweringSummary[1:2, 1], 3)

if( knitr::is_latex_output() ) {
  
  knitr::kable(pad(FloweringSummary,
                   surroundMaths = TRUE,
                   targetLength = c(6, 6, 5, 0),
                   decDigits = c(2, 3, 3, 0)),
               format = "latex",
               align = "c",
               linesep = "",
               caption = "The day of first flowering for encroaching willow and native skypilot.",
               col.names = c("Mean", "Standard deviation", "Standard error", "Sample size"),
               row.names = TRUE,
               escape = FALSE,
               booktabs = TRUE) %>%
    row_spec(0, bold = TRUE) %>%
    row_spec(3, italic = TRUE) %>%
    row_spec(2, hline_after = TRUE) %>%
    kable_styling(font_size = 8)
  
}

if( knitr::is_html_output() ) {
  kable( pad(FloweringSummary,
             surroundMaths = TRUE,
             targetLength = c(6, 6, 5, 0),
             decDigits = c(2, 3, 3, 0)),
         format = "html",
         align = "c",
         booktabs = TRUE,
         longtable = FALSE,
         col.names =  c("Mean", "Standard deviation", "Standard error", "Sample size"),
         caption = "The day of first flowering for encroaching willow and native skypilot.") %>% 
    row_spec(0, bold = TRUE)
}
```

The standard error of the mean difference is $\text{s.e.}(\bar{d}) = 0.9396$ (Fig.\ \@ref(fig:FloweringjamoviHT) or Table\ \@ref(tab:FloweringSummaryHT)).
The sampling distribution for $\bar{d}$ is an approximate normal distribution, centred around $\mu_d$, with a standard deviation of $\text{s.e.}(\bar{d}) = 0.9396$.

The approximate $95$%\ CI for the mean difference is
$$
1.36 \pm ( 2\times 0.9396),
$$
or from $-0.519$ to\ $3.24$\ days.
The exact $95$% CI (Fig.\ \@ref(fig:FloweringjamoviHT)) is $-0.579$ to\ $3.30$\ days; the difference is because the approximate CI uses the *approximate* multiplier of\ $2$ from the $68$--$95$--$99.7$ rule.

The value of the test statistic (i.e., the $t$-score) is
\begin{align*}
t 
= \frac{\bar{d} - \mu_d}{\text{s.e.}(\bar{d})}
= \frac{1.36 - 0}{0.9396} = 1.45,
\end{align*}
as in the output.
This is a relatively small value of\ $t$, so a large $P$-value is expected using the $68$--$95$--$99.7$ rule.
Indeed, the output shows that $P = 0.161$: there is *no evidence* of a mean difference in first-flowering day (i.e., the sample mean difference could reasonably be explained by sampling variation if $\mu_d = 0$).
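
The CIs and the test above can be reproduced approximately in R from the summary statistics.
This is a sketch only: the object names are illustrative, the inputs are the rounded values quoted in the text (so results may differ slightly from the software output in the last decimal place), and the exact multiplier and $P$-value are assumed to come from a $t$-distribution with $n - 1$ degrees of freedom.

```r
# Summary statistics quoted in the text (differences: skypilot minus willow)
d_bar <- 1.36     # sample mean difference (in days)
se_d  <- 0.9396   # standard error of the mean difference
n     <- 25       # number of sites

# Approximate 95% CI: multiplier of 2 from the 68-95-99.7 rule
ci_approx <- d_bar + c(-1, 1) * 2 * se_d

# Exact 95% CI: multiplier from the t-distribution with n - 1 df
t_mult   <- qt(0.975, df = n - 1)
ci_exact <- d_bar + c(-1, 1) * t_mult * se_d

# Test statistic and two-tailed P-value for H0: mu_d = 0
t_score <- (d_bar - 0) / se_d
p_two   <- 2 * pt(abs(t_score), df = n - 1, lower.tail = FALSE)

round(rbind(ci_approx, ci_exact), 3)
round(c(t = t_score, P = p_two), 3)
```

The exact multiplier is slightly larger than\ $2$, which is why the exact CI is a little wider than the approximate CI.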

Since *positive* differences mean willow flowers earlier, we write (using the approximate CI):

> No evidence exists ($t = 1.45$; two-tailed $P = 0.161$) that the day of first-flowering is different for the encroaching willow and the native skypilot (mean difference: $1.36$ days earlier for willow; approximate $95$%\ CI from $0.52$\ days earlier for skypilot to $3.24$\ days earlier for willow; $n = 25$).

The test and the CI are statistically valid since $n = 25$, which (just) meets the condition $n \ge 25$.


::: {.importantBox .important data-latex="{iconmonstr-warning-8-240.png}"}
Be clear in your conclusion about *how* the differences are computed.
Make sure to interpret the test and CI consistently with how the differences are defined.
:::


::: {.importantBox .important data-latex="{iconmonstr-warning-8-240.png}"}
We *do not* say whether the evidence supports the null hypothesis.
We assume the null hypothesis is true, so we state how strong the evidence is to support the alternative hypothesis.
The current sample presents no evidence to contradict the assumption (but future evidence may emerge).
:::


## Example: chamomile tea {#ChamomileTea-Paired}

@rafraf2015effectiveness studied patients with Type\ 2 diabetes mellitus (T2DM).
They randomly allocated $32$\ patients into a control group (who drank hot water), and another $32$\ patients to receive chamomile tea (p.\ 164):\index{Control}\index{Blinding!researchers}

> The study was blinded so that the allocation of the intervention or control group was concealed from the researchers and statistician [...]
> The intervention group ($n = 32$) consumed one cup of chamomile tea [...] three times a day immediately after meals (breakfast, lunch, and dinner) for $8$\ weeks. 
> The control group ($n = 32$) consumed an equivalent volume of warm water during the $8$-week period...

The total glucose (TG) was measured for each individual both *before* the intervention and *after* eight weeks on the intervention, in both the control and treatment groups.
The data are not available, so no graphical summary of the data can be produced; however, the article gives a data summary (motivating Table\ \@ref(tab:TGsummaryTable)).


(ref:TGsummary) The total glucose (TG; in mg.dl$^{-1}$) for two groups: those who drank chamomile tea, and those who drank hot water (the control group). The **Reduction** columns summarise the reduction in TG for each group.

```{r TGsummaryTable}
TGsummary <- array(dim = c(3, 8) )

rownames(TGsummary) <- c("Chamomile tea",
                         "Control",
                         "Difference")
colnames(TGsummary) <- c("n",
                         "BaselineMean",
                         "BaselineSD",
                         "PostMean",
                         "PostSD",
                         "ChangeMean",
                         "ChangeSD",
                         "ChangeSE")

TGsummary[1, ] <- c(32, 
                    203.00, 54.96,
                    164.37, 50.70,
                    38.62, 30.37, 30.37/sqrt(32) )
TGsummary[2, ] <- c(32,
                    178.25, 53.06,
                    185.37, 52.59,
                    -7.12, 36.66, 36.66/sqrt(32) )
TGsummary[3, ] <- c(NA,
                    24.75, NA,
                    21.00, NA,
                    45.74, NA, NA)

if( knitr::is_latex_output() ) {
  
  knitr::kable(pad(TGsummary,
                   surroundMaths = TRUE,
                   targetLength = c(2, 6, 5, 6, 5, 5, 5, 5),
                   decDigits = c(0, 2, 2, 2, 2, 2, 2, 2)),
               format = "latex",
               align = "c",
               linesep = "",
               caption = "(ref:TGsummary)",
               col.names = c("$n$",
                             "Mean", "Std dev.", 
                             "Mean", "Std dev.",
                             "Mean", "Std dev.", 
                             "Std error"),
               row.names = TRUE,
               escape = FALSE,
               booktabs = TRUE) %>%
    row_spec(0, bold = TRUE) %>%
    row_spec(3, italic = TRUE) %>%
    kable_styling(font_size = 8) %>%
    row_spec(2,
             hline_after = TRUE) %>%
    add_header_above( c(" " = 2,
                        "Baseline" = 2,
                        "After 8 weeks" = 2,
                        "Reduction" = 3),
                      bold = TRUE)
}

if( knitr::is_html_output() ) {
  kable( pad(TGsummary,
             surroundMaths = TRUE,
             poorMansNegative = TRUE,
             targetLength = c(2, 6, 5, 6, 5, 5, 5, 5),
             decDigits = c(0, 2, 2, 2, 2, 2, 2, 2)),
         format = "html",
         align = "c",
         booktabs = TRUE,
         longtable = FALSE,
         col.names =  c("$n$",
                        "Mean", "Std dev.", 
                        "Mean", "Std dev.",
                        "Mean", "Std dev.", 
                        "Std error"),
         caption = "(ref:TGsummary)") %>% 
    row_spec(0, bold = TRUE) %>%
    add_header_above( c(" " = 2,
                        "Baseline" = 2,
                        "After 8 weeks" = 2,
                        "Reduction" = 3),
                      bold = TRUE)
}

```


Is there a mean reduction in TG in either group?
Estimates of the mean reduction in each group can be found by constructing a CI for each group.
First, the standard errors for each reduction are needed:

* \makebox[31mm][l]{Tea-drinking group:}  $\text{s.e.}(\bar{d}) = 30.37/\sqrt{32} = 5.37$.
* \makebox[31mm][l]{Control group:}       $\text{s.e.}(\bar{d}) = 36.66/\sqrt{32} = 6.48$.

Then the approximate $95$% CIs are:

* \makebox[31mm][l]{Tea-drinking group:} $38.62\pm (2\times 5.37)$, or from\ $27.88$ to\ $49.36$ mg.dl$^{-1}$.
* \makebox[31mm][l]{Control group:} $-7.12\pm (2\times 6.48)$, or from\ $-20.08$ to\ $5.84$ mg.dl$^{-1}$.

(A *negative reduction* in TG means an *increase* in TG.)
The first CI suggests that the population mean difference is almost certainly larger than zero; the second suggests that a population mean difference of zero could reasonably have produced the sample data.

Of course, the sample mean differences in TG may be non-zero due to sampling variation.
So, the following repeated-measures RQs can be asked:

> * For patients with T2DM, is there a mean *change* in TG after eight weeks drinking *chamomile tea*?
> * For patients with T2DM, is there a mean *change* in TG after eight weeks drinking *hot water*?

Then, the hypotheses are (where $\mu_d$ represents the mean change in TG (in\ mg.dl^$-1$^) after eight weeks):

* \makebox[31mm][l]{Tea-drinking group:}  $H_0$:\ $\mu_d = 0$\quad\ vs\ $H_1$:\ $\mu_d \ne 0$.
* \makebox[31mm][l]{Control group:}       $H_0$:\ $\mu_d = 0$\quad\ vs\ $H_1$:\ $\mu_d \ne 0$.

The two test statistics are:
$$
  t_T = \frac{38.62 - 0}{5.37} = 7.19\qquad\text{and}\qquad t_W = \frac{-7.12 - 0}{6.48} = -1.10,
$$
where the subscripts\ $T$ and\ $W$ refer to the tea and hot-water groups respectively.
The $t$-score for the tea-drinking group is *huge*, so the two-tailed $P$-value will be *very small* using the $68$--$95$--$99.7$ rule, and certainly smaller than\ $0.001$.
This means that there is evidence that chamomile tea had an impact on the mean change in\ TG.

In contrast, the $t$-score for the water-drinking group is *small*, so the two-tailed $P$-value will be *large* using the $68$--$95$--$99.7$ rule, and certainly larger than\ $0.10$.
This means there is no evidence that placebo treatment (hot water) had any impact on mean change in TG (as one might expect for a placebo).
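
Because only summary statistics are available for this study, the standard errors, approximate CIs and $t$-scores for both groups can be sketched in R with a small helper function.
The function `paired_from_summary()` is a hypothetical helper written for this illustration (it is not part of any package), and the two-tailed $P$-values are assumed to come from a $t$-distribution with $n - 1$ degrees of freedom.

```r
# Hypothetical helper: approximate 95% CI and t-test for a mean difference,
# computed from summary statistics only
paired_from_summary <- function(mean_d, sd_d, n) {
  se_d    <- sd_d / sqrt(n)          # standard error of the mean difference
  t_score <- (mean_d - 0) / se_d     # test statistic for H0: mu_d = 0
  p_two   <- 2 * pt(abs(t_score), df = n - 1, lower.tail = FALSE)
  c(se    = se_d,
    ci_lo = mean_d - 2 * se_d,       # approximate 95% CI (multiplier of 2)
    ci_hi = mean_d + 2 * se_d,
    t     = t_score,
    P     = p_two)
}

# Mean and standard deviation of the reductions in TG, from the summary table
tea   <- paired_from_summary(mean_d = 38.62, sd_d = 30.37, n = 32)
water <- paired_from_summary(mean_d = -7.12, sd_d = 36.66, n = 32)

round(rbind(tea, water), 4)
```

Working from unrounded standard errors may change the last decimal place of some values compared with the hand calculations above.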


We write:

> There is very strong evidence ($t = 7.19$; two-tailed $P < 0.001$) of a mean change in TG for the chamomile-drinking group (mean reduction: $38.62\mgs$.dl^$-1$^; approx. $95$%\ CI: $27.88$ to\ $49.36\mgs$.dl^$-1$^; $n = 32$), but *no* evidence ($t = -1.10$; two-tailed $P > 0.10$) of a mean change in the hot-water drinking group (mean reduction: $-7.12\mgs$.dl^$-1$^; approx. $95$%\ CI: $-20.08$ to\ $5.84\mgs$.dl^$-1$^; $n = 32$).

The intervals have a $95$%\ chance of straddling the population mean reduction in TG.\spacex
The sample sizes are larger than\ $25$, so the results are statistically valid.

These hypothesis tests have allowed decisions to be made about each group individually.
However, the tea-drinking group and the water-drinking group ultimately need to be *compared* with each other.
This is considered in Sect.\ \@ref(ChamomileTea-TwoMeans).




## Chapter summary {#Chap29-Summary}

To compute a confidence interval (CI) for a mean difference, compute the sample mean difference,\ $\bar{d}$, and identify the sample size\ $n$.
Then compute the standard error, which quantifies how much the value of\ $\bar{d}$ varies across all possible samples:
$$
\text{s.e.}(\bar{d})
=
\frac{ s_d }{\sqrt{n}},
$$
where\ $s_d$ is the sample standard deviation of the differences.
The *margin of error* is (multiplier${}\times{}$standard error), where the multiplier is\ $2$ for an approximate $95$%\ CI (using the $68$--$95$--$99.7$ rule).
Then the CI is:
$$
\bar{d} \pm \left( \text{multiplier}\times\text{standard error} \right).
$$
The statistical validity conditions should also be checked.


To test a hypothesis about a population mean difference $\mu_d$:

* Write the null hypothesis ($H_0$) and the alternative hypothesis ($H_1$).
* Initially *assume* the value of\ $\mu_d$ in the null hypothesis to be true.
* Then, describe the *sampling distribution*, which describes what to *expect* from the sample mean difference based on this assumption: under certain statistical validity conditions, the sample mean difference varies with:
    * an approximate normal distribution,
    * with sampling mean whose value is the value of\ $\mu_d$ (from\ $H_0$), and
    * having a standard deviation of $\displaystyle \text{s.e.}(\bar{d}) =\frac{s_d}{\sqrt{n}}$.
* Compute the value of the *test statistic*:
$$
t = \frac{ \bar{d} - \mu_d}{\text{s.e.}(\bar{d})},
$$
where\ $\mu_d$ is the hypothesised value given in the null hypothesis.
* The $t$-value is like a $z$-score, and so an approximate *$P$-value* can be estimated using the $68$--$95$--$99.7$ rule, or found using software.
* Make a decision, and write a conclusion.
* Check the statistical validity conditions.
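
The steps above can be sketched in R.
The data below are made up purely for illustration (they are *not* from any study in this chapter), and the object names are hypothetical; the sketch shows that a paired $t$-test amounts to a one-sample analysis of the differences, and that the built-in function `t.test()` with `paired = TRUE` gives the same $t$-score and two-tailed $P$-value.

```r
# Hypothetical paired measurements (NOT data from this chapter)
before <- c(12.1, 10.4, 13.3, 11.8,  9.9, 12.7, 10.8, 11.5)
after  <- c(11.2, 10.1, 12.0, 11.9,  9.1, 11.6, 10.2, 10.7)

# Work with the differences (here: before minus after)
d     <- before - after
n     <- length(d)
d_bar <- mean(d)              # sample mean difference
se_d  <- sd(d) / sqrt(n)      # standard error of the mean difference

# Test statistic for H0: mu_d = 0, and the two-tailed P-value
t_score <- (d_bar - 0) / se_d
p_two   <- 2 * pt(-abs(t_score), df = n - 1)

# Approximate 95% CI, using a multiplier of 2
ci_approx <- d_bar + c(-1, 1) * 2 * se_d

# The built-in paired t-test uses the same differences (first minus second)
builtin <- t.test(before, after, paired = TRUE)

c(t_by_hand = t_score, t_builtin = unname(builtin$statistic))
c(P_by_hand = p_two,   P_builtin = builtin$p.value)
```

With only eight made-up differences, the statistical validity conditions would still need to be checked before relying on these results.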


`r if (knitr::is_html_output()){
'The following short video may help explain some of these concepts:'
}`


<div style="text-align:center;">
```{r}
htmltools::tags$video(src = "./videos/PairedTTest.mp4", 
                      width = "550", 
                      controls = "controls", 
                      loop = "loop", 
                      style = "padding:5px; border: 2px solid gray;")
```
</div>


## Quick review questions {#Chap34-QuickReview}

::: {.webex-check .webex-box}
@bacho2019effects compared joint pain in stroke patients receiving a supervised exercise treatment.
The same participants ($n = 34$) were assessed *before* and *after* treatment.
The mean *improvement* in joint pain after $13$\ weeks was\ $1.27$ (with a standard error of\ $0.57$) measured using a standardised tool.

Are the following statements *true* or *false*?

1. For paired data, the mean of the *differences* is treated like the mean of a single variable.\tightlist  
`r if( knitr::is_html_output() ) { torf( answer=TRUE )}`
1. An appropriate graph for displaying these data is a histogram of the differences.  
`r if( knitr::is_html_output() ) { torf( answer=TRUE )}`
1. The *population* mean difference is denoted $\mu_d$.  
`r if( knitr::is_html_output() ) { torf( answer=TRUE )}`
1. The standard error of the sample mean difference is denoted $s_d$.  
`r if( knitr::is_html_output() ) { torf( answer=FALSE )}`
1. Only 'before and after' studies can be paired. \tightlist
`r if( knitr::is_html_output() ) { torf( answer=FALSE )}`
1. The null hypothesis is about the *population* mean difference.
`r if( knitr::is_html_output() ) { torf( answer=TRUE )}`
1. The value of the test statistic is $2.23$.
`r if( knitr::is_html_output() ) { torf( answer=TRUE )}`
1. The approximate value of the two-tailed $P$-value is small (less than\ $0.05$).
`r if( knitr::is_html_output() ) { torf( answer=TRUE )}`
1. The 'test statistic' for this test is a $t$-score.
`r if( knitr::is_html_output() ) { torf( answer=TRUE )}`
:::


## Exercises {#TestPairedMeansExercises}

[Answers to odd-numbered exercises] are given at the end of the book. 

`r if( knitr::is_latex_output() ) "\\captionsetup{font=small}"`


::: {.exercise #MeanDiffWhichPaired}
Which (if any) of these scenarios are *paired*?

1. Heart rate is measured for each individual when sitting and when standing.
(Some individuals have their heart rate recorded first while sitting, and some first while standing.)
Each person receives two measurements, and the *difference* in heart rate between sitting and standing is recorded.
1. The mean protein concentrations were compared in sea turtles before and after being rehabilitated [@data:March2018:turtles].
:::


::: {.exercise #MeanDiffWhichPaired2}
Which (if any) of these scenarios are *paired*?

1. The mean HDL cholesterol concentration is recorded for a group of males and a group of females, and the means compared.
1. Heart rate was recorded for $36$\ people, both before and after exercise, to determine how much the average heart rate increased.
:::


::: {.exercise #MeanDiffGDiffsA}
A group of primary school children were asked to complete a certain task on both a personal computer (PC) and using a tablet computer.

If the differences were defined as the time to complete the task on the PC, minus the time to complete the same task on a tablet (one difference for each child), what do the differences *mean*?
:::


::: {.exercise #MeanDiffGDiffsB}
Suppose water quality was recorded $500\ms$ upstream and $500\ms$ downstream of $28$\ different copper mines.

If the differences were defined as the water pH downstream minus the water pH upstream for each mine, what do the differences *mean*?
:::


::: {.exercise #MeanDiffFlowering}
Suppose, in the example of Sect.\ \@ref(PairedInvasivePlants), the differences were defined as the day of first flowering for willow, less the day of first flowering for skypilot.

Write down, and interpret the meaning of, the confidence interval for the mean difference in first-flowering times.
:::


::: {.exercise #MeanDiffTea}
Suppose, in the example of Sect.\ \@ref(ChamomileTea-Paired), the differences were defined as *increase* in total glucose (TG).

Write down, and interpret the meaning of, the confidence interval for the mean increase in TG for the tea-drinking group.
:::


::: {.exercise #MeanDiffGrowingSquash}
[*Dataset*: `Fruit`]
@mukherjee2019diversity studied the effect of rainfall on growing Chayote squash (*Sechium edule*).
They compared the size of the fruit in a year with normal rainfall (2015) with that in a dry year (2014) on $24$ farms:

> For Chayote squash grown in Bangalore, what is the mean difference in fruit weight between a normal and dry year?

Ten fruits were gathered from each farm in both years, and the average (mean) weight of the fruit recorded for the farm.
Since the same farms are used in both years, the data are *paired* 
`r if( knitr::is_latex_output() ) {
'(Table\\ \\@ref(tab:FruitsData)).'
} else {
'(see above).'
}`
Data are missing for Farm\ 20 in the dry year (2014), so there are $n = 23$ differences.


```{r FruitsData, echo=FALSE}
data(Fruit) ### Exercise

Fruit$Farm <- 1 : length(Fruit$FWeight2014)

FruitTab <- dplyr::select(Fruit,
                          Farm,
                          FWeight2014,
                          FWeight2015)
FruitTab$Change <- FruitTab$FWeight2015 - FruitTab$FWeight2014

if( knitr::is_latex_output() ) {
  
  T1 <- knitr::kable(pad(FruitTab[1:5, ],
                         decDigits = c(0, 2, 2, 2),
                         surroundMaths = TRUE,
                         targetLength = c(2, 6, 6, 6)),
                     format = "latex",
                     valign = 't',
                     align = "c",
                     linesep = "",
                     col.names = c("Farm", 
                                   "Dry", 
                                   "Normal",
                                   "Change"),
                     row.names = FALSE,
                     escape = FALSE,
                     booktabs = TRUE) %>%
    add_header_above(c( " " = 1,
                        "Average fruit weight (g)" = 3),
                     bold = TRUE) %>%
    row_spec(0, bold = TRUE) 
  
  
  T2 <- knitr::kable(pad(FruitTab[20:24, ],
                         decDigits = c(0, 2, 2, 2),
                         surroundMaths = TRUE,
                         targetLength = c(2, 6, 6, 7)),
                     format = "latex",
                     valign = 't',
                     align = "c",
                     linesep = "",
                     col.names = c("Farm", 
                                   "Dry", 
                                   "Normal",
                                   "Change"),
                     row.names = FALSE,
                     escape = FALSE,
                     booktabs = TRUE)  %>%
    add_header_above(c( " " = 1,
                        "Average fruit weight (g)" = 3),
                     bold = TRUE) %>%
    row_spec(0, bold = TRUE) 
  
  out <- knitr::kables(list(T1, T2),
                       format = "latex",
                       label = "FruitsData",
                       caption = "The average weight of fruits (in g) in two different years, from $24$ farms. One observation is missing for Farm\\ 20. The change is computed as the normal year minus dry year.") %>% 
    kable_styling(font_size = 8)
  out2 <- prepareSideBySideTable(out) 
  out2 
}

if( knitr::is_html_output() ) {
  kable( pad(FruitTab,
             decDigits = c(0, 2, 2, 2),
             surroundMaths = TRUE,
             poorMansNegative = TRUE,
             targetLength = c(2, 6, 6, 7)),
         format = "html",
         align = "c",
         booktabs = TRUE,
         longtable = FALSE,
         col.names = c("Farm", 
                       "Dry", 
                       "Normal",
                       "Change (in g)"),
         caption = "The average weight of fruits (in g) in two different years, from $24$ farms. One observation is missing for Farm\\ 20. The change is computed as the normal year minus dry year.") %>% 
    row_spec(0, bold = TRUE)
}
```


```{r FruitDescriptivesjamovi, fig.cap="Software output for the fruit data.", fig.align="center", out.width='80%'}
knitr::include_graphics("jamovi/Fruit/Fruit-Descriptives.png")
```


1. What are the *units of analysis*?\index{Units of analysis}
What are the *units of observation*?\index{Units of observation}
1. What is the advantage of using the same $24$\ farms twice each?
1. Construct a suitable graph to display the differences.
1. Create a numerical summary table for the data (use Fig.\ \@ref(fig:FruitDescriptivesjamovi)).
1. What is the parameter?
Carefully describe what it means.
1. Write down the hypotheses.
1. Sketch the sampling distribution.
1. Compute the $t$-score.
1. Determine the $P$-value.
1. Construct an approximate $95$%\ CI for the mean difference in fruit weight.
1. Are the test and the CI statistically valid?
1. Write a conclusion.
:::


::: {.exercise #TestPairedMeansCaptopril}
[*Dataset*: `Captopril`]
In a study of hypertension [@data:hand:handbook; @data:macgregor:essential], patients were given a drug (Captopril) and their systolic blood pressure measured (in mm Hg) immediately before and two hours after being given the drug.

The aim is to see if there is evidence of a *reduction* in blood pressure after taking Captopril.
Use the data (Table\ \@ref(tab:CICaptoprilData)) and the software output (Fig.\ \@ref(fig:Captoriljamovi)) to answer these questions.

1. Explain why it is probably more sensible to compute differences as the *Before* minus the *After* measurements. 
What do the differences *mean* when computed this way?
1. What is the advantage of using the same patients for both the before and after measurements, rather than one group for before measurements and a different group of people for after measurements?
1. What is the parameter?
Carefully describe what it means.
1. Construct a suitable graph to display the differences.
1. Write down the hypotheses.
1. Sketch the sampling distribution.
1. Write down the $t$-score.
1. Write down the $P$-value.
1. Write down the *exact* $95$%\ CI using the computer output (Fig.\ \@ref(fig:Captoriljamovi)). 
1. Compute an *approximate* $95$%\ CI for the mean difference.
1. Why are the two CIs different?
1. Write a conclusion.
1. Are the CI and test statistically valid?
:::
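
An *approximate* and an *exact* $95$%\ CI differ only in the multiplier applied to the standard error.
The following minimal R sketch compares the two multipliers, assuming the approximate CI uses a multiplier of\ $2$ and the exact CI uses a multiplier from the $t$-distribution with $n - 1$ degrees of freedom.
The sample sizes are hypothetical, chosen only for illustration; they are not taken from any exercise.

```r
# Hypothetical sample sizes (numbers of pairs), for illustration only
n <- c(10, 30, 100)

# 'Exact' multiplier: the 97.5th percentile of the t-distribution with n - 1 df
exactMultiplier <- qt(0.975, df = n - 1)

# Approximate multiplier, from the 68-95-99.7 rule
approxMultiplier <- 2

# Compare the two multipliers side by side
data.frame(n = n,
           exact = round(exactMultiplier, 3),
           approx = approxMultiplier)
```

The exact multiplier moves closer to\ $2$ as the sample size increases, so the two CIs are usually more alike for larger samples.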


```{r}
data(Captopril) ### Exercise

Captopril$Differences <- Captopril$Before - Captopril$After

bloodS <- subset(Captopril, BP == "S")
bloodS <- bloodS[, c("Before", 
                     "After", 
                     "Differences")]

bloodS2 <- cbind( "Before" = bloodS$Before[1:8], 
                  "After" = bloodS$After[1:8],
                  "Before" = c(bloodS$Before[9:15], NA), 
                  "After" = c(bloodS$After[9:15], NA) )
```


```{r Captoriljamovi, fig.cap="Software output for the Captopril data.", fig.align="center", out.width="80%", fig.show="hold"}
knitr::include_graphics("jamovi/CaptoprilAll/CaptoprilAll-PairedTOutput.png") 
#knitr::include_graphics("SPSS/CaptoprilAll/CaptoprilAll-PairedTOutput.png")
```


::: {.exercise #TestPairedMeansTasteOfBroccoli}
People often struggle to eat the recommended intake of vegetables.
@data:Fritts2018:Vegetables explored ways to increase vegetable intake in teens.
Teens rated the taste of raw broccoli, and raw broccoli served with a specially-made dip.

Each teen ($n = 100$) had a *pair* of measurements: the taste rating of the broccoli *with* and *without* dip.
Taste was assessed using a '$100\mms$\ visual analogue scale', where a *higher* score means a *better* taste.
In summary:

* For raw broccoli, the mean taste rating was\ $56.0$ (with a standard deviation of\ $26.6$);
<!-- %  (SDs); so if $n = 100$ we'd get SE: 2.647 -->
* For raw broccoli served with dip, the mean taste rating was\ $61.2$ (with a standard deviation of\ $28.7$).

Because the data are paired, the *differences* are the best way to describe the data.
The mean difference in the ratings was\ $5.2$, with a standard error of\ $3.06$.
<!-- (working backwards from the $t$-score). Looks like $n = 101$. -->

1. Construct a suitable numerical summary table.
1. What does a positive difference mean?
1. Perform a hypothesis test to see if the use of dip *increases* the mean taste rating.
1. Compute the approximate $95$%\ CI for the mean difference in taste ratings.
1. Are the CI and test statistically valid?
:::
<!-- (working backwards from the $t$-score). Looks like $n=101$. n=100...? -->


::: {.exercise #TestPairedMeansSmokingAndExercise}
@data:Allen2018:Smoking examined the effect of exercise on smoking.
Men and women were each assessed on their 'intention to smoke' both before and after exercise (using two quantitative questionnaires).
Smokers ('smoking at least five cigarettes per day') aged\ $18$ to\ $40$ were enrolled for the study.
For the $23$\ women in the study, the mean intention to smoke after exercise *reduced* by\ $0.66$ (with a standard error of\ $0.37$).
(Larger values for 'intention to smoke' mean a greater intent to smoke.)

1. What does a positive difference mean?
1. Perform a hypothesis test to determine if there is evidence of a population mean *reduction* in intention-to-smoke for women after exercising.
1. Find an approximate $95$% confidence interval for the population mean reduction in intention to smoke for women after exercising.
1. Are the CI and test statistically valid?
:::
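
When only the sample mean difference and its standard error are reported, the $t$-score, a one-tailed $P$-value and an approximate $95$%\ CI can be computed directly.
The following is a minimal R sketch; the values of `meanDiff`, `seDiff` and `n` are hypothetical (not from any exercise), and the approximate CI assumes a multiplier of\ $2$.

```r
# Hypothetical summary values, for illustration only
meanDiff <- 1.5   # sample mean difference
seDiff   <- 0.6   # standard error of the mean difference
n        <- 25    # number of pairs

# t-score: how many standard errors the sample mean difference is from zero
tScore <- (meanDiff - 0) / seDiff

# One-tailed P-value, for an alternative hypothesis of a positive mean difference
pValue <- pt(tScore, df = n - 1, lower.tail = FALSE)

# Approximate 95% CI, using a multiplier of 2
approxCI <- meanDiff + c(-1, 1) * 2 * seDiff

round(c(t = tScore, P = pValue), 3)
approxCI
```

The direction of the one-tailed test depends on how the differences are defined, so the `lower.tail` argument may need to change to match the alternative hypothesis.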


::: {.exercise #TestPairedMeansFerritin}
[*Dataset*: `Ferritin`]
In a study [@cressie1984use] conducted at the Adelaide Children's Hospital (p.\ 107; emphasis added):

> ... a group of beta thalassemia patients [...] were treated by a continuous infusion of desferrioxamine, in order to *reduce* their ferritin content...


Using the data 
`r if (knitr::is_latex_output()) {
'in Table\\ \\@ref(tab:FerritinTable),'
} else {
'shown below,'
}`
conduct a hypothesis test to determine if there is evidence that the treatment reduces the ferritin content, as intended.
Make sure to include a $95$%\ CI in the conclusion.
:::

```{r FerritinTable}
data(Ferritin) ### Exercise

FR <- Ferritin

if( knitr::is_latex_output() ) {
  T1 <- kable( pad(FR[1:10, ],
                   surroundMaths = TRUE,
                   targetLength = c(4, 4, 5),
                   decDigits = 0),
               format = "latex",
               row.names = FALSE,
               escape = FALSE,
               align = "c",
               col.names = c("Sept.", "March", "Reduction"),
               booktabs = TRUE, 
               linesep = c("", "", "", "\\addlinespace"),
               longtable = FALSE) %>%
    row_spec(0, bold = TRUE)
  T2 <- kable( pad( FR[11:20, ],
                    surroundMaths = TRUE,
                    targetLength = c(4, 4, 5),
                    decDigits = 0),
               format = "latex",
               row.names = FALSE,
               escape = FALSE,
               align = "c",
               col.names = c("Sept.", "March", "Reduction"),
               booktabs = TRUE,
               linesep = c("", "", "", "\\addlinespace"),
               longtable = FALSE) %>%
    row_spec(0, bold = TRUE)
  
  out <- knitr::kables(list(T1, T2),
                       format = "latex",
                       label = "FerritinTable",
                       caption = "The ferritin content (in $\\mu$g/L) for $20$\\ thalassemia patients at the Adelaide Children's Hospital.") %>% 
    kable_styling(font_size = 8)  
  out2 <- prepareSideBySideTable(out, 
                                 gap = "\\quad") 
  out2
}
if( knitr::is_html_output()) {
  kable( pad(head(FR, 10),
             surroundMaths = TRUE,
             targetLength = c(4, 4, 5),
             poorMansNegative = TRUE,
             decDigits = 0),
         format = "html",
         align = "c",
         longtable = FALSE,
         caption = "The ferritin content (in $\\mu$g/L) for $20$\\ thalassemia patients at the Adelaide Children's Hospital (first ten observations).",
         booktabs = TRUE)
}
```


::: {.exercise #StressSurgeryHT}
[*Dataset*: `Stress`]
The concentration of beta-endorphins in the blood is a sign of stress.
@hoaglin2011exploring measured the beta-endorphin concentration for $19$\ patients about to undergo surgery [@data:hand:handbook].
Each patient had their beta-endorphin concentrations measured $12$--$14$ hours before surgery, and also $10\mins$ before surgery.

A numerical summary (Table\ \@ref(tab:StressDescriptivesjamoviHT)) was produced from the software output.

1. Use the output to test the RQ.
1. Use the software output in Fig.\ \@ref(fig:StressDescriptivesjamovi) to construct an *approximate* $95$%\ CI for the *increase* in beta-endorphin concentrations as surgery gets closer.
1. Use the software output in Fig.\ \@ref(fig:StressDescriptivesjamovi) to write down the *exact* $95$%\ CI for the *increase* in beta-endorphin concentrations as surgery gets closer.
1. Why is there a difference between the two CIs?
1. Are the CI and test statistically valid?
:::

```{r StressDescriptivesjamoviHT,  fig.align="center", out.width='80%'}
data(Stress) ### Exercise

ST <- Stress

StressTab <- data.frame(
  "Means" = c( colMeans(ST),
               mean(ST$BeforeMins - ST$BeforeHours) ),
  "Std deviation" = c( apply(ST, 2, "sd"),
                       sd(ST$BeforeMins - ST$BeforeHours) ),
  "Std Error" = c( apply(ST, 2, function(x){sd(x)/sqrt(length(x))}),
                   sd(ST$BeforeMins - ST$BeforeHours)/sqrt(nrow(ST)) ),
  "Sample size" = c( apply(ST, 2, length),
                     length(ST$BeforeMins - ST$BeforeHours) )
)
rownames(StressTab) <- c("12--14 hours before surgery",
                         "10 minutes before surgery",
                         "Increase")

if( knitr::is_latex_output() ) {
  knitr::kable(pad(StressTab,
                   surroundMaths = TRUE,
                   targetLength = c(5, 5, 4, 2),
                   decDigits = c(2, 2, 2, 0)),
               format = "latex",
               align = c("c", "c", "c", "c"),
               booktabs = TRUE,
               longtable = FALSE,
               escape = FALSE,
               caption = "The surgery-stress data.",
               col.names = c("Sample mean",
                             "Standard deviation",
                             "Standard error",
                             "$n$"),
               row.names = TRUE,
               digits = 2) %>%
    row_spec(0, bold = TRUE) %>%
    row_spec(2,
             hline_after = TRUE) %>%
    row_spec(3, italic = TRUE) %>%
    
    kable_styling(font_size = 8)
} 

if( knitr::is_html_output() ) {
  kable(pad(StressTab,
                   surroundMaths = TRUE,
                   targetLength = c(5, 5, 4, 2),
                   decDigits = c(2, 2, 2, 0)),
        format = "html",
        booktabs = TRUE,
        longtable = FALSE,
        align = "r",
        caption = "The surgery-stress data.") %>%
    column_spec(1, bold = TRUE) %>%
    row_spec(1, bold = TRUE)
}
```


```{r StressDescriptivesjamovi, fig.cap="Software output for the surgery-stress data.", fig.align="center", out.width='70%'}
knitr::include_graphics("jamovi/Stress/StressDescriptives.png")
```


::: {.exercise #MeanDiffCOVIDCI}
A study of $n = 213$ Spanish health students [@romero2020physical] measured (among other things) the number of minutes of vigorous physical activity (PA) performed by students *before* and *during* the <span style="font-variant:small-caps;">covid</span>-19 lockdown (from March\ to April\ 2020 in Spain).
Since PA was measured both *before* and *during* lockdown on *each* participant, the data are *paired*.
The data are summarised in Table\ \@ref(tab:COVIDsummaryTable).

1. Explain what the differences *mean*.
1. Compute the standard error of the differences.
1. Perform a hypothesis test for the mean change in PA, and include a CI.
:::
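
For paired data, the standard error of the mean difference is computed from the standard deviation of the *differences* (not from the standard deviations of the two separate sets of measurements) and the number of pairs.
The following is a minimal R sketch, using hypothetical values for `sdDiff` and `n` that are not from any exercise.

```r
# Hypothetical values, for illustration only
sdDiff <- 12   # standard deviation of the differences
n      <- 36   # number of pairs (that is, number of differences)

# Standard error of the mean difference
seDiff <- sdDiff / sqrt(n)
seDiff
```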

(ref:COVIDtable) Summary information for the <span style="font-variant:small-caps;">covid</span>-lockdown exercise data for $n = 213$ Spanish students.

```{r COVIDsummaryTable}
COVID.summary     <- array( dim = c(3, 2))
colnames(COVID.summary) <- c("Mean (mins)",
                             "Std dev. (mins)")
rownames(COVID.summary) <- c("Before",
                             "During",
                             "Increase")


COVID.summary[1, ] <- c(28.47,
                        54.13)
COVID.summary[2, ] <- c(30.66,
                        30.04)
COVID.summary[3, ] <- c(2.68,
                        51.30)


if( knitr::is_latex_output() ) {
  knitr::kable( pad(COVID.summary,
                    surroundMaths = TRUE,
                    targetLength = c(5, 5),
                    decDigits = 2),
                format = "latex",
                booktabs = TRUE,
                longtable = FALSE,
                escape = FALSE,
                caption = "(ref:COVIDtable)",
                align = "c") %>%
    row_spec(0, bold = TRUE) %>%
    row_spec(3, italic = TRUE) %>%
    row_spec(2,
             hline_after = TRUE) %>%
    kable_styling(font_size = 8)
}
if( knitr::is_html_output() ) {
  knitr::kable( pad(COVID.summary,
                    surroundMaths = TRUE,
                    targetLength = c(5, 5),
                    decDigits = 2),
                format = "html",
                align = "c",
                caption = "(ref:COVIDtable)" )
}
```


::: {.exercise #MeanDiffStudentEatingHabits}
What happens to students' eating habits when they start university?
Many students will be responsible for their own meals for the first time, so some may forgo healthy foods for convenient, but less healthy, foods.
Alternatively, some may not be able to afford sufficient or healthy food.

@levitsky2004freshman recorded some students' weights as they began university, and then *the same* students' weights some time later.
They asked the RQ:

> For Cornell University students, what is the *mean weight change* in students after $12$ weeks at university?

The data collected to answer this RQ are shown
`r if (knitr::is_latex_output()) {
'in Table\\ \\@ref(tab:DataWeightChange) [@DASL:WeightChange].'
} else {
'below [@DASL:WeightChange].'
}`


1. Use the software output (Fig.\ \@ref(fig:WeightGainOutput)) to compute an approximate $95$%\ CI for the weight *gain* from Weeks\ $1$ to\ $12$.
2. Use the software output to write down an exact $95$%\ CI for the weight *gain* from Weeks\ $1$ to\ $12$.
3. Comment on the two CIs.
4. Are the CIs statistically valid?
5. Conduct a hypothesis test to determine if there is a mean weight *change* from Weeks\ $1$ to\ $12$.
6. Do you think the weight gain would be of practical importance?


```{r DataWeightChange}
data(StudentWt) ### Exercise

SWlen <- length(StudentWt$Week1)

Labels <- 1 : length(StudentWt$Student)

tb1 <- array( cbind( Labels[1:5 ],
                     StudentWt$Week1[1:5 ],
                     StudentWt$Week12[1:5 ],
                     StudentWt$GainWt[1:5 ]),
              dim = c(5, 4) )


T1 <- knitr::kable(pad(tb1,
                       surroundMaths = TRUE,
                       targetLength = 4,
                       decDigits = c(0, 1, 1, 1)),
                   format = "latex",
                   valign = 't',
                   align = "c",
                   linesep = "",
                   col.names = c("Student", 
                                 "Week 1", 
                                 "Week 12", 
                                 "Weight gain"),
                   row.names = FALSE,
                   escape = FALSE,
                   booktabs = TRUE) %>%
  add_header_above(c( " " = 1, 
                      "Weight (in kg)" = 3),
                   line = TRUE,
                   bold = TRUE) %>%
  row_spec(0, bold = TRUE)


tb2 <- array( cbind( Labels[(SWlen - 4):SWlen ],
                     StudentWt$Week1[(SWlen - 4):SWlen ],
                     StudentWt$Week12[(SWlen - 4):SWlen ],
                     StudentWt$GainWt[(SWlen - 4):SWlen ]),
              dim = c(5, 4) )


T2 <- knitr::kable(pad(tb2,
                       surroundMaths = TRUE,
                       targetLength = 4,
                       decDigits = c(0, 1, 1, 1)),
                   format = "latex",
                   valign = 't',
                   align = "c",
                   linesep = "",
                   col.names = c("Student", 
                                 "Week 1", 
                                 "Week 12", 
                                 "Weight gain"),
                   row.names = FALSE,
                   escape = FALSE,
                   booktabs = TRUE) %>%
  add_header_above(c( " " = 1, 
                      "Weight (in kg)" = 3),
                   line = TRUE,
                   bold = TRUE) %>%
  row_spec(0, bold = TRUE)

out <- knitr::kables(list(T1, T2),
                     format = "latex",
                     label = "DataWeightChange",
                     caption = "The student weight-change data, showing the weight of students in Week\\ 1 at university, in Week\\ 12, and the weight gain (all in kg). These are the first five and the last five of the $68$ total observations. (A negative weight gain means a weight loss.)") %>% 
  kable_styling(font_size = 8)
out2 <- prepareSideBySideTable(out,
                               gap = "\\enskip") 
if( knitr::is_latex_output() ) out2

```


```{r}
if( knitr::is_html_output() ) {
  DT::datatable(StudentWt,
                fillContainer = FALSE, # Make more room, so we don't just have ten values
                colnames = c("Student", 
                             "Week 1",
                             "Week 12",
                             "Weight gain"),
                filter = "none",
                options = list(searching = FALSE), # Remove searching: See: https://stackoverflow.com/questions/35624413/remove-search-option-but-leave-search-columns-option
                caption = "Weight at 1 and 12 weeks after the start of semester (in kg). (A negative weight gain means a weight loss.)")
}
```


```{r WeightGainOutput, fig.cap="The weight-gain data: software output.", fig.align="center", out.width=c("60%", "100%"), fig.show="hold"}
knitr::include_graphics("jamovi/WeightGain/WeightGain-Numericals.png")
knitr::include_graphics("jamovi/WeightGain/WeightGain-PairedT.png")
```
:::


```{r out.width='80%'}
data(Anorexia) ### Exercise

ANCB <- subset(Anorexia, 
               Treatment=="CB")
```


::: {.exercise #PairedCIExercisesAnorexia}
[*Dataset*: `Anorexia`]
Young girls with anorexia ($n = 29$) received cognitive behavioural treatment [@data:hand:handbook], and their weights before and after treatment were recorded.
In summary:

* Before the treatment, the mean weight was $82.69$\ pounds ($s = 4.845$\ pounds);
* After the treatment, the mean weight was $85.70$\ pounds ($s = 8.352$\ pounds).

The mean weight gain per girl was\ $`r round(mean( ANCB$After - ANCB$Before), 2)`$\ pounds, with a standard deviation of\ $`r round(sd( ANCB$After - ANCB$Before), 2)`$\ pounds.
Find an approximate $95$%\ CI for the population mean weight gain. 
Do you think the treatment had any meaningful impact on the mean weight gain of the girls, based solely on these data?
:::


::: {.exercise #PairedCIExercisesSoilN}
[*Dataset*: `SoilCN`]
@lambie2021microbial compared the percentage nitrogen (%N) in soils from intensively-grazed irrigated and non-irrigated pastures.
The researchers *paired*\index{Comparison!within individuals} similar irrigated and non-irrigated sites (p.\ 338):

> The irrigated and non-irrigated pairs within each site were within\ $100\ms$ of each other and were on the same soil, landform and usually the same farm with the same farm management...

One RQ in the study was:

> For intensively grazed pasture sites, is there a mean reduction in percentage soil nitrogen (%N) when sites are irrigated, compared to non-irrigated?

The data are shown in
`r if( knitr::is_latex_output() ) {
'Table\\ \\@ref(tab:SoilCN).'
} else {
'the table below.'
}`
Use the data to answer the RQ.


```{r}
data(SoilCN) ### Exercise

SoilN <- subset(SoilCN, 
                select = c(IrrigatedN, NonirrigatedN))
SoilN$Change <- SoilN$NonirrigatedN - SoilN$IrrigatedN

Nlen <- nrow(SoilN)   # Number of paired sites

if( knitr::is_latex_output() ) {
  
  T1 <- knitr::kable(pad(SoilN[1:14, ],
                         surroundMaths = TRUE,
                         targetLength = 5,
                         decDigits = 2),
                     format = "latex",
                     valign = 't',
                     align = "c",
                     linesep = "",
                     col.names = c("irrigated", 
                                   "irrigated", 
                                   "when irrigated"),
                     row.names = FALSE,
                     escape = FALSE,
                     booktabs = TRUE) %>%
    add_header_above( c("%N:" = 1, 
                        "%N: Not" = 1, 
                        "%N: reduction" = 1),
                      line = FALSE,
                      bold = TRUE) %>%
    row_spec(0, bold = TRUE) 
  
  
  T2 <- knitr::kable(pad(SoilN[15:28, ],
                         surroundMaths = TRUE,
                         targetLength = 5,
                         decDigits = 2),
                     format = "latex",
                     valign = 't',
                     align = "c",
                     linesep = "",
                     col.names = c("irrigated", 
                                   "irrigated", 
                                   "when irrigated"),
                     row.names = FALSE,
                     escape = FALSE,
                     booktabs = TRUE) %>%
    add_header_above( c("%N:" = 1, 
                        "%N: Not" = 1, 
                        "%N: reduction" = 1),
                      line = FALSE,
                      bold = TRUE) %>%
    row_spec(0, bold = TRUE)
  
  out <- knitr::kables(list(T1, T2),
                       format = "latex",
                       label = "SoilCN",
                       caption = "The percentage total soil nitrogen (\\%N) in irrigated and non-irrigated soils in $28$\\ sites.") %>% 
    kable_styling(font_size = 8)
  out2 <- prepareSideBySideTable(out,
                                 gap = "\\enskip") 
  out2 
}

if( knitr::is_html_output() ) {
  kable( pad(SoilN,
             surroundMaths = TRUE,
             poorMansNegative = TRUE,
             targetLength = 5,
             decDigits = 2),
         format = "html",
         align = "c",
         booktabs = TRUE,
         longtable = FALSE,
         col.names = c("Irrigated", "Not irrigated", "Reduction"),
         caption = "The percentage total nitrogen in irrigated and non-irrigated soils in $n = 28$ sites.") %>% 
    row_spec(0, bold = TRUE)
}
```


```{r Nitrogenjamovi, fig.cap="Software output for the nitrogen data. In the top table, the difference is implied as non-irrigated minus irrigated.", fig.align="center", out.width=c("100%","60%"), fig.show='hold'}
knitr::include_graphics("jamovi/SoilCN/SoilCN-Testing.png")
knitr::include_graphics("jamovi/SoilCN/SoilCN-Summary.png")
```
:::


:::{.exercise #PairedCIJumping}
[*Dataset*: `Jumping`]
@hebert2023effect recorded double-legged jumping distance for $80$ healthy people, both when they wore shoes and when they were barefoot (Exercise\ \@ref(exr:CompareWithinJumping)).
Use the data to form a $95$%\ CI to estimate how much further, on average, people can jump when barefoot.
:::


:::{.exercise #PairedCIWCTennis}
[*Dataset*: `WCTennis`]
@alberca2022sprint recorded the push-time for French wheelchair tennis players, while holding a racquet and not holding a racquet (Table\ \@ref(tab:WCTennis); @alberca2022sprintDATA).
Use the data to form a $95$%\ CI to estimate the mean difference between push-times with and without a racquet.
:::
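
When the original paired data are available, software can produce the exact $95$%\ CI directly.
The following is a minimal R sketch using `t.test()`; the vectors `first` and `second` are hypothetical measurements (not from any dataset in this book), and would be replaced by the relevant columns of the dataset being analysed.

```r
# Hypothetical paired measurements, for illustration only
first  <- c(5.1, 4.8, 6.0, 5.5, 4.9, 5.7)
second <- c(5.6, 5.0, 6.4, 5.4, 5.3, 6.1)

# Differences: second measurement minus first measurement
diffs <- second - first

# Paired t-test, with the exact 95% CI for the mean difference
out <- t.test(second, first,
              paired = TRUE,
              conf.level = 0.95)
out$conf.int

# Equivalent approach: a one-sample t-test on the differences
t.test(diffs, mu = 0)$conf.int
```

Both approaches give the same CI, since a paired analysis is a one-sample analysis of the differences.
The order of the two vectors in `t.test()` determines the direction of the differences, so it should match how the differences are defined in the RQ.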


`r if( knitr::is_latex_output() ) "\\captionsetup{font=normalsize}"`


<!-- QUICK REVIEW ANSWERS -->
`r if (knitr::is_html_output()) '<!--'`
::: {.EOCanswerBox .EOCanswer data-latex="{iconmonstr-check-mark-14-240.png}"}
**Answers to *Quick Revision* questions:**
**1.** True.
**2.** True.
**3.** True.
**4.** False. 
**5.** False. 
**6.** True. 
**7.** True. 
**8.** True.
**9.** True.
:::
`r if (knitr::is_html_output()) '-->'`