# (PART) Tools for answering RQs {-}
# Probability {#Probability}
```{r, child = if (knitr::is_html_output()) {'./introductions/18-Tools-Probability-HTML.Rmd'} else {'./introductions/18-Tools-Probability-LaTeX.Rmd'}}
```
<!-- Define colours as appropriate -->
```{r, child = if (knitr::is_html_output()) {'./children/coloursHTML.Rmd'} else {'./children/coloursLaTeX.Rmd'}}
```
## Introduction {#Chap19Intro}
This chapter briefly discusses *probability*.
*Probability* quantifies the chance of observing a specific, unknown result (an 'event').
Before discussing probability, some associated terms need defining.
First, a *random procedure* must be defined.
::: {.definition #RandomProcedure name="Random procedure"}
\index{Random procedure}
A *random procedure* is a sequence of well-defined steps that (a)\ can be repeated, in theory, indefinitely under essentially identical conditions; (b)\ has well-defined results; and (c)\ has results that are unpredictable for any individual repetition.
:::
Using this definition, rolling a die is a 'random procedure', with possible results
`r if (knitr::is_latex_output()) {
'\\largedice{1}'
} else {
'<span class="larger-die">⚀</span>'
}`,
`r if (knitr::is_latex_output()) {
'\\largedice{2}'
} else {
'<span class="larger-die">⚁</span>'
}`,
`r if (knitr::is_latex_output()) {
'\\largedice{3}'
} else {
'<span class="larger-die">⚂</span>'
}`,
`r if (knitr::is_latex_output()) {
'\\largedice{4}'
} else {
'<span class="larger-die">⚃</span>'
}`,
`r if (knitr::is_latex_output()) {
'\\largedice{5}'
} else {
'<span class="larger-die">⚄</span>'
}`
and
`r if (knitr::is_latex_output()) {
'\\largedice{6}'
} else {
'<span class="larger-die">⚅</span>'
}`.
Similarly, tossing a coin is a random procedure with two possible results: **Heads** or **Tails**.
## Sample spaces, events and probability {#SampleSpaceEvents}
A list of all distinct possible results from one instance of a random procedure is the *sample space*.
A *simple event* is any element of the sample space.\index{Event}
::: {.definition #SampleSpace name="Sample space"}
\index{Sample space}
The *sample space* is a list of all possible and distinct results after administering a random procedure once.
:::
::: {.definition #SimpleEvent name="Simple event"}
\index{Event!simple}
A *simple event* is a single element of the sample space.
:::
<div style="float:right; width: 222px; border: 1px; padding:10px">
<img src="Illustrations/pexels-skitterphoto-705171.jpg" width="200px"/>
</div>
::: {.example #SampleSpaceDie name="Sample spaces"}
Consider rolling a fair, six-sided die (the random procedure).
We do not know what face will be uppermost until we roll the die.
However, the *sample space* for this procedure can be listed:
`r if (knitr::is_latex_output()) {
'\\largedice{1}'
} else {
'<span class="larger-die">⚀</span>'
}`,
`r if (knitr::is_latex_output()) {
'\\largedice{2}'
} else {
'<span class="larger-die">⚁</span>'
}`,
`r if (knitr::is_latex_output()) {
'\\largedice{3}'
} else {
'<span class="larger-die">⚂</span>'
}`,
`r if (knitr::is_latex_output()) {
'\\largedice{4}'
} else {
'<span class="larger-die">⚃</span>'
}`,
`r if (knitr::is_latex_output()) {
'\\largedice{5}'
} else {
'<span class="larger-die">⚄</span>'
}`
and\
`r if (knitr::is_latex_output()) {
'\\largedice{6}'
} else {
'<span class="larger-die">⚅</span>'
}`.
These are all mutually exclusive\index{Mutually exclusive} (or distinct) results and cover all possible results (exhaustive)\index{Exhaustive} from a single roll.
The sample space is *discrete*.\index{Quantitative data!discrete}
The event 'rolling a
`r if (knitr::is_latex_output()) {
'\\largedice{1}'
} else {
'<span class="larger-die">⚀</span>'
}`'
is a simple event.
:::
Combinations of the elements in the sample space are usually of more interest than simple events.
These are called *compound events*.
::: {.definition #CompoundEvent name="Compound event"}
\index{Event!compound}
A *compound event* is any combination of simple events (i.e., of elements in the sample space).
:::
::: {.example #Events name="Events"}
Some *events* that can be defined using the sample space in Example\ \@ref(exm:SampleSpaceDie) include:
* Rolling a
`r if (knitr::is_latex_output()) {
'\\largedice{4}'
} else {
'<span class="larger-die">⚃</span>'
}`:
this *simple event* includes one element of the sample space:
`r if (knitr::is_latex_output()) {
'\\largedice{4}'
} else {
'<span class="larger-die">⚃</span>'
}`.
* Rolling an odd number:
this *compound event* includes three elements of the sample space:
`r if (knitr::is_latex_output()) {
'\\largedice{1}'
} else {
'<span class="larger-die">⚀</span>'
}`,
`r if (knitr::is_latex_output()) {
'\\largedice{3}'
} else {
'<span class="larger-die">⚂</span>'
}` and
`r if (knitr::is_latex_output()) {
'\\largedice{5}'
} else {
'<span class="larger-die">⚄</span>'
}`.
* Rolling a number larger than
`r if (knitr::is_latex_output()) {
'\\largedice{2}'
} else {
'<span class="larger-die">⚁</span>'
}`:
this *compound event* includes four elements of the sample space:
`r if (knitr::is_latex_output()) {
'\\largedice{3}'
} else {
'<span class="larger-die">⚂</span>'
}`,
`r if (knitr::is_latex_output()) {
'\\largedice{4}'
} else {
'<span class="larger-die">⚃</span>'
}`,
`r if (knitr::is_latex_output()) {
'\\largedice{5}'
} else {
'<span class="larger-die">⚄</span>'
}` and
`r if (knitr::is_latex_output()) {
'\\largedice{6}'
} else {
'<span class="larger-die">⚅</span>'
}`.
The sample space is *discrete* (see Sect.\ \@ref(QuantData)).
:::
::: {.example #SampleSpaceThrowing name="Sample spaces and events"}
Consider the distance you can throw a baseball (the random procedure).
We do not know beforehand what distance your next throw will be, but the *sample space* comprises all possible throwing distances: any number greater than $0\ms$.
This sample space is *continuous*.\index{Quantitative data!continuous}
Many *compound events* can be defined using this sample space; for example:
* throwing more than\ $50\ms$.
* throwing between\ $10$ and\ $40\ms$.
Because the sample space is continuous, throwing an *exact* distance (such as *exactly*\ $10\ms$) is technically not possible (see Sect.\ \@ref(QuantData)).
:::
Events are often defined using **and**, **or**, **not**.
Consider two events called\ $A$ and\ $B$.
Then, '$A$ **and**\ $B$' is the event comprising events in\ $A$ and also in\ $B$.
(In other words, events in *both*\ $A$ and\ $B$.)
'$A$ **or**\ $B$' is the event comprising events in\ $A$, events in\ $B$, and events in both.
The event '**not**\ $A$' comprises all the events in the sample space that are *not* in Event\ $A$.
::: {.example #ComplicatedEvents name="Defining events"}
Consider rolling a fair, six-sided die again (Example\ \@ref(exm:SampleSpaceDie)).
Suppose these two (compound) events are defined:
* Event\ $A$: Roll a number divisible by\ $2$.
* Event\ $B$: Roll a number divisible by\ $3$.
Event\ $A$ comprises the simple events
`r if (knitr::is_latex_output()) {
"'roll a \\largedice{2}'"
} else {
'*roll a <span class="larger-die">⚁</span>*'
}`,
`r if (knitr::is_latex_output()) {
"'roll a \\largedice{4}'"
} else {
'*roll a <span class="larger-die">⚃</span>*'
}` and
`r if (knitr::is_latex_output()) {
"'roll a \\largedice{6}'"
} else {
'*roll a <span class="larger-die">⚅</span>*'
}`.
Event\ $B$ comprises the simple events
`r if (knitr::is_latex_output()) {
"'roll a \\largedice{3}'"
} else {
'*roll a <span class="larger-die">⚂</span>*'
}` and
`r if (knitr::is_latex_output()) {
"'roll a \\largedice{6}'"
} else {
'*roll a <span class="larger-die">⚅</span>*'
}`.
Then, the Event\ '$A$ **and**\ $B$' includes all events in\ $A$ and *also* in\ $B$; that is, '$A$ **and**\ $B$' comprises the single simple event
`r if (knitr::is_latex_output()) {
"'roll a \\largedice{6}'"
} else {
'*roll a <span class="larger-die">⚅</span>*'
}`.
Event\ '$A$ **or**\ $B$' includes the events in\ $A$, the events in\ $B$, and those in both; that is, '$A$ **or**\ $B$' comprises the four simple events
`r if (knitr::is_latex_output()) {
"'roll a \\largedice{2}'"
} else {
'*roll a <span class="larger-die">⚁</span>*'
}`,
`r if (knitr::is_latex_output()) {
"'roll a \\largedice{3}'"
} else {
'*roll a <span class="larger-die">⚂</span>*'
}`,
`r if (knitr::is_latex_output()) {
"'roll a \\largedice{4}'"
} else {
'*roll a <span class="larger-die">⚃</span>*'
}` and\
`r if (knitr::is_latex_output()) {
"'roll a \\largedice{6}'"
} else {
'*roll a <span class="larger-die">⚅</span>*'
}`.
The event '**not**\ $A$' comprises the three simple events
`r if (knitr::is_latex_output()) {
"'roll a \\largedice{1}'"
} else {
'*roll a <span class="larger-die">⚀</span>*'
}`,
`r if (knitr::is_latex_output()) {
"'roll a \\largedice{3}'"
} else {
'*roll a <span class="larger-die">⚂</span>*'
}` and\
`r if (knitr::is_latex_output()) {
"'roll a \\largedice{5}'"
} else {
'*roll a <span class="larger-die">⚄</span>*'
}`.
:::
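The events in Example\ \@ref(exm:ComplicatedEvents) can also be listed using set operations.
The following is a minimal sketch in base R; the object names (`S`, `A`, `B` and so on) are illustrative only, and each die face is represented by its number.

```r
# Sample space for one roll of a six-sided die
S <- 1:6

# Events A and B, stored as subsets of the sample space
A <- S[S %% 2 == 0]   # numbers divisible by 2
B <- S[S %% 3 == 0]   # numbers divisible by 3

# 'A and B': results in both A and B
A_and_B <- intersect(A, B)

# 'A or B': results in A, in B, or in both
A_or_B <- sort(union(A, B))

# 'not A': results in the sample space that are not in A
not_A <- setdiff(S, A)

A_and_B
A_or_B
not_A
```

Here, `intersect()` plays the role of **and**, `union()` the role of **or**, and `setdiff()` (applied to the whole sample space) the role of **not**.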
Using these definitions, a *probability* can be defined.\index{Probability}
<div style="float:right; width: 222px; border: 1px; padding:10px">
<img src="Pics/iconmonstr-coin-5-240.png" width="50px"/>
</div>
::: {.definition #Probability name="Probability"}
A *probability* is a number between $0$ and $1$ inclusive (or between\ $0$% and\ $100$% inclusive) that quantifies the likelihood of a certain event occurring.
:::
A probability of\ $0$ (or\ $0$%) means the event is 'impossible' (will *never* occur), and a probability of\ $1$ (or\ $100$%) means that the event is *certain* to happen (will *always* occur).
Most events have a probability between the extremes of\ $0$% and\ $100$%.
::: {.example #Probabilities name="Probabilities"}
Consider these cases:
* The probability of receiving negative rainfall is\ $0$; it is impossible.
* The probability of receiving some rain in London next year is\ $1$; it is certain.
* The probability of receiving rain on 01\ January next year in London is between\ $0$ and\ $1$ inclusive.
:::
## Determining probabilities {#DetermineProbabilities}
The probability of an event occurring can be determined, or approximated, in different ways, including:
* the *classical approach* (Sect.\ \@ref(ProbClassical));
* the *relative frequency approach* (Sect.\ \@ref(ProbRelFreq)); and
* the *subjective approach* (Sect.\ \@ref(ProbSubjective)).
### Classical approach {#ProbClassical}
\index{Probability!classical approach}
What is the probability of rolling a
`r if (knitr::is_latex_output()) {
'\\largedice{4}'
} else {
'<span class="larger-die">⚃</span>'
}`
on a die?
The sample space has six possible outcomes (see Example\ \@ref(exm:SampleSpaceDie)) that are *equally likely* to occur, and the event 'rolling a
`r if (knitr::is_latex_output()) {
"\\largedice{4}'"
} else {
'<span class="larger-die">⚃</span>\''
}`
comprises just *one* of those events.
Thus,
`r if (knitr::is_html_output()) '<!--'`
$$
\text{Probability of rolling a $\largedice{4}$}
= \frac{\text{The number of results that are a $\largedice{4}$}}{\text{The number of possible results}}
= \frac{1}{6}.
$$
`r if (knitr::is_html_output()) '-->'`
`r if (knitr::is_latex_output()) '<!--'`
$$
\text{Probability of rolling a 4}
= \frac{\text{The number of results that are a 4}}{\text{The number of possible results}}
= \frac{1}{6}.
$$
`r if (knitr::is_latex_output()) '-->'`
This approach to computing probabilities is called the *classical* approach to probability, and is only appropriate when all events in the sample space are *equally likely*.
::: {.definition #ClassicalApproachToProbability name="Classical approach to probability"}
In the *classical approach to probability*, the probability of an event occurring is the number of elements of the sample space included in the event, divided by the total number of elements in the sample space, *when all outcomes are equally likely*.
:::
By this definition:
$$
\text{Prob. of an event}
=
\frac{\text{Number of results in the event of interest}}{\text{Total number of equally-likely results}}.
$$
We can say that 'the probability of rolling a
`r if (knitr::is_latex_output()) {
'\\largedice{4}'
} else {
'<span class="larger-die">⚃</span>'
}`
is $1/6$', or 'the probability of rolling a
`r if (knitr::is_latex_output()) {
'\\largedice{4}'
} else {
'<span class="larger-die">⚃</span>'
}`
is (approximately) $0.1667$'.
The answer can also be expressed as a *percentage*: 'the probability of rolling a
`r if (knitr::is_latex_output()) {
'\\largedice{4}'
} else {
'<span class="larger-die">⚃</span>'
}`
is (approximately) $16.67$%'.\index{Percentages}
The answer could also be interpreted as 'the *expected* proportion of rolls that are a
`r if (knitr::is_latex_output()) {
'\\largedice{4}'
} else {
'<span class="larger-die">⚃</span>'
}`
is (approximately) $0.1667$'.\index{Proportions}
That is, about\ $16.67$% of a very large number of future rolls are likely to be a
`r if (knitr::is_latex_output()) {
'\\largedice{4}'
} else {
'<span class="larger-die">⚃</span>'
}`.
The probability of rolling a
`r if (knitr::is_latex_output()) {
'\\largedice{4}'
} else {
'<span class="larger-die">⚃</span>'
}`
is about\ $0.1667$, but any single roll of the die either *will* or *will not* produce a
`r if (knitr::is_latex_output()) {
'\\largedice{4}'
} else {
'<span class="larger-die">⚃</span>'
}`, and we don't know which will occur.
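The idea of an *expected* proportion over many rolls can be explored by simulation.
The sketch below assumes a fair die, and uses an arbitrary seed and an arbitrary number of rolls ($10\,000$); the object names are illustrative only.
The proportion found in any one simulation will be close to, but usually not exactly equal to,\ $1/6$.

```r
# Arbitrary seed, used only so the simulation can be repeated exactly
set.seed(1)

# Simulate many rolls of a fair six-sided die
number_of_rolls <- 10000
rolls <- sample(1:6, size = number_of_rolls, replace = TRUE)

# Proportion of simulated rolls showing a 4
prop_four <- mean(rolls == 4)

prop_four   # compare with the classical probability
1 / 6
```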
::: {.example #SimpleProb name="Probabilities for compound events"}
Consider rolling a standard six-sided die.
With six equally-likely results (Example\ \@ref(exm:SampleSpaceDie)), the probability of rolling an even number is\ $3/6$, since there are three even numbers in the sample space.
:::
::: {.example #ProbabilityOutcomes name="Describing probability"}
Consider rolling a standard six-sided die.
* The *probability* of rolling an even number is $3 \div 6 = 0.5$.
* The *percentage* of rolls expected to be even is $3 \div 6 \times 100 = 50$%.
* The *odds* of rolling an even number is $3\div 3 = 1$.
:::
::: {.example #EventsAndProb name="Probabilities"}
The probabilities of the events listed in Example\ \@ref(exm:Events) are:
* The probability of rolling a
`r if (knitr::is_latex_output()) {
'\\largedice{4}'
} else {
'<span class="larger-die">⚃</span>'
}` is\ $1/6$ (or about\ $0.1667$).
* The probability of rolling an odd number is\ $3/6$, or\ $1/2$ (or\ $0.5$).
* The probability of rolling a number larger than
`r if (knitr::is_latex_output()) {
'\\largedice{2}'
} else {
'<span class="larger-die">⚁</span>'
}`
is\ $4/6$, or\ $2/3$ (or about\ $0.6667$).
:::
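The classical approach amounts to counting.
As a rough sketch (the function name `classical_prob()` is hypothetical, and the function is only appropriate when all results in the sample space are equally likely), the probabilities in Example\ \@ref(exm:EventsAndProb) could be computed as follows.

```r
# Classical probability: the number of results in the event, divided by
# the number of (equally likely) results in the sample space
classical_prob <- function(event, sample_space) {
  sum(sample_space %in% event) / length(sample_space)
}

# Sample space for one roll of a fair six-sided die
S <- 1:6

p_four   <- classical_prob(4, S)             # rolling a 4
p_odd    <- classical_prob(S[S %% 2 == 1], S) # rolling an odd number
p_over_2 <- classical_prob(S[S > 2], S)       # rolling a number larger than 2

c(p_four, p_odd, p_over_2)
```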
::: {.importantBox .important data-latex="{iconmonstr-warning-8-240.png}"}
*Probabilities* describe the likelihood that an event will occur *before* the result is known.
*Odds* and *proportions* can be used either *before* or *after* the result is known, provided the wording is correct.
For example, *proportions* describe how often an event has occurred *after* the result is known, and *expected proportions* describe the likelihood that an event will occur in many repetitions *before* the result is known.
:::
The following example may help explain.
::: {.example #ProbProportioOdds name="Probabilities, proportions and odds"}
*Before* a fair coin is tossed:\index{Probability}\index{Proportions}\index{Odds}
* The *probability* of throwing a head is $1/2 = 0.5$ (or\ $50$%).
* The *expected proportion* of heads for many coin tosses is\ $0.5$.
* The *odds* of throwing a head is $1/1 = 1$.
If we have *already* tossed a coin $100$ times and found $47$\ heads:
* The *proportion* of heads in the sample is $47/100 = 0.47$ (or\ $47$%).
* The odds that we *threw* a head in the sample is $47/53 = 0.887$.
The 'probability that we just threw a head' makes no sense, because the result is known.
:::
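Probabilities (or proportions) and odds carry the same information, and either can be found from the other.
The following sketch assumes the usual definition of odds as the probability of the event divided by the probability that the event does not occur; the function names are hypothetical.

```r
# Convert a probability (or proportion) to odds
prob_to_odds <- function(p) {
  p / (1 - p)
}

# Convert odds back to a probability (or proportion)
odds_to_prob <- function(odds) {
  odds / (1 + odds)
}

# Before tossing a fair coin: probability 0.5 of a head
odds_head <- prob_to_odds(0.5)

# After 100 tosses with 47 heads: sample proportion and sample odds
prop_heads <- 47 / 100
odds_heads_sample <- prob_to_odds(prop_heads)  # the same as 47 / 53

odds_head
round(odds_heads_sample, 3)
```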
### Relative frequency approach {#ProbRelFreq}
\index{Probability!relative frequency approach}
<div style="float:right; width: 222px; border: 1px; padding:10px">
<img src="Illustrations/father-22194_640.jpg" width="200px"/>
</div>
What is the probability that a newborn baby will be a boy?
The sample space could be listed as: *boy* and *non-boy*.
Since the sample space has two elements, the classical approach suggests the probability is $1\div2 = 0.5$.
However, this approach is appropriate *only if* boys and non-boys are *equally likely* to be born.
But are they?
In
`r if (knitr::is_latex_output()) {
'Australia'
} else {
'[Australia](https://www.aihw.gov.au/reports/mothers-babies/australias-mothers-babies/data)'
}`
in 2021, $289\,603$ live births occurred, with $148\,636$ male births, $140\,944$ female births, and $23$\ others (or 'not stated').
The *proportion* of boys born in the 2021 sample is $148\,636\div 289\,603 = 0.513$, or about\ $51.3$%.
An *estimate* of the probability that the next birth will be a boy is about\ $0.513$ (or\ $51.3$%), using past data.
This is the *relative frequency* approach to calculating probabilities: using past data.
Using the relative frequency method can only ever produce an *approximate* probability, as it is based on a limited number of past observations.
An actual probability would require an infinite number of observations.
::: {.definition #RelativeFrequencyApproachToProbability name="Relative frequency approach to probability"}
In the *relative frequency approach to probability*, the probability of an event is *approximately* the number of times the outcome of interest has appeared in the past, divided by the number of 'attempts' in the past.
This produces an *approximate* probability.
:::
::: {.example #RFProbability name="Relative frequency probability"}
Based on the earlier information, the *odds* that a new baby will be a boy is *approximately* $0.513\div (1 - 0.513) = 1.053$.\index{Odds}
According to the
`r if (knitr::is_latex_output()) {
'*Australian Bureau of Statistics* (ABS):'
} else {
'[*Australian Bureau of Statistics* (ABS)](http://www.abs.gov.au/ausstats/abs@.nsf/0/B8865D71D84F5210CA2579330016754C?opendocument):'
}`
> The sex ratio for all births registered in Australia generally fluctuates around\ $105.5$ male births per\ $100$ female births.
This is close to the odds of\ $1.053$ found above.
:::
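The relative frequency calculations above can be reproduced from the 2021 birth counts.
In this sketch, the counts are typed in by hand from the figures quoted earlier, and the object names are illustrative only.
The odds are computed twice: once from the rounded proportion (as in Example\ \@ref(exm:RFProbability)), and once from the unrounded proportion, which gives a slightly different third decimal place.

```r
# Live births in Australia in 2021, as quoted in the text
births <- c(male = 148636, female = 140944, other = 23)
total_births <- sum(births)

# Relative frequency estimate of the probability of a boy
p_boy <- unname(births["male"] / total_births)

# Odds using the proportion rounded to three decimal places
p_boy_rounded <- round(p_boy, 3)
odds_from_rounded <- p_boy_rounded / (1 - p_boy_rounded)

# Odds using the unrounded proportion
odds_unrounded <- p_boy / (1 - p_boy)

total_births
round(p_boy, 3)
round(odds_from_rounded, 3)
round(odds_unrounded, 3)
```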
### Subjective approach {#ProbSubjective}
\index{Probability!subjective approach}
<div style="float:right; width: 222px; border: 1px; padding:10px">
<img src="Illustrations/cyclone-62957_640.jpg" width="200px"/>
</div>
Many probabilities cannot be computed using the classical or relative frequency approach; for example, what is the probability that your sporting team wins their next game?
In this case, only a *subjective probability* can be given.
'Subjective' probabilities may be based on personal judgement or experience.
They can also be given by considering all the relevant issues that may impact the probability (and may, for example, be based on mathematical models that incorporate information from numerous inputs).
Depending on how these other issues are considered and combined, different subjective probabilities may be given.
Weather forecasts are one example: they incorporate data from sea surface temperatures, local topography, air pressures, air temperatures and so on.
Different models use different inputs, and may combine these inputs differently to produce different (subjective) forecast probabilities.
Subjective probabilities are deductive probabilities (based on reasoning).
::: {.definition #SubjectiveApproachToProbability name="Subjective approach to probability"}
In the *subjective approach to probability*, various factors are incorporated, perhaps subjectively, to determine the probability of an event occurring.
:::
::: {.example #SubjectiveProbElNino name="Subjective probability"}
During El `r readr::parse_character( c("Niño"), locale = locale(encoding = "UTF-8"))` events, eastern Australia typically experiences drier-than-average winters and springs.
The
`r if (knitr::is_latex_output()) {
"*Australian Broadcasting Corporation*'s news website"
} else {
"[*Australian Broadcasting Corporation*'s news website](https://www.abc.net.au/news/2023-05-23/noaa-bom-el-nino-chances-explored/102341466)"
}`
reported (on 23\ May 2023) that the Australian *Bureau of Meteorology* predicted a $50$%\ probability of an El `r readr::parse_character( c("Niño"), locale=locale(encoding="UTF-8"))` event in\ 2023, while the American *National Oceanic and Atmospheric Administration* predicted a $90$%\ chance of an El `r readr::parse_character( c("Niño"), locale=locale(encoding="UTF-8"))` event in\ 2023.
Despite this, '[both] agencies are looking at the same part of the Pacific Ocean' to make their predictions.
However, 'the US and Australia base their probability on different criteria'.
The probabilities are subjective probabilities, based on complex mathematical models.
:::
## Independence of events {#Independence}
\index{Independence}
One important concept in probability is *independence*.
Two events are *independent* if the probability of one event happening is the same, whether or not the other event has happened.
For example, the probability of getting a head on a coin toss is the same whether you are sitting or not sitting: the result of the coin toss is *independent* of your position.
<div style="float:right; width: 222x; border: 1px; padding:10px">
<img src="Pics/iconmonstr-arrow-44-240.png" width="50px"/>
</div>
::: {.definition #Independence name="Independence"}
Two events are *independent* if the probability of one event is the same, whether or not the other event has happened.
:::
::: {.example #IndependenceCards name="Independence"}
Consider drawing two cards from a well-shuffled, fair pack (of $52$\ cards), *without* returning the first card.
For the *first* card, the sample space contains every card in the pack, and drawing any card is as likely as drawing any other.
Since four cards are **Aces**, the probability of drawing an **Ace** on the first draw is $4/52$ (using the classical approach).
If we drew an **Ace** for the first card, the probability of drawing an **Ace** for the *second* card is $3/51$ (*three* **Aces** remain among the $51$ remaining cards).
Alternatively, if we *don't* draw an **Ace** for the first card, the probability of drawing an **Ace** for the *second* card is $4/51$ (*four* **Aces** remain among the $51$\ remaining cards).
That is, the probability of drawing an **Ace** for the second card *depends* on whether an **Ace** was drawn for the first card.
The two events 'Drawing an **Ace** for the first card' and 'Drawing an **Ace** for the second card' are *not independent* events.
:::
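The probabilities in Example\ \@ref(exm:IndependenceCards) can be checked by counting cards.
In this sketch, the pack is reduced to what matters here (whether each card is an **Ace** or not); the object names are illustrative only.

```r
# A standard pack: 4 Aces and 48 other cards
pack <- rep(c("Ace", "Other"), times = c(4, 48))

# First card: every card is equally likely to be drawn
p_first_ace <- mean(pack == "Ace")                 # 4/52

# Second card, if the first card was an Ace (one Ace removed)
pack_after_ace <- pack[-match("Ace", pack)]
p_second_if_ace <- mean(pack_after_ace == "Ace")   # 3/51

# Second card, if the first card was not an Ace (one other card removed)
pack_after_other <- pack[-match("Other", pack)]
p_second_if_other <- mean(pack_after_other == "Ace")  # 4/51

c(p_first_ace, p_second_if_ace, p_second_if_other)
```

The last two values differ, which is another way of seeing that the two events are not independent.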
::: {.tipBox .tip data-latex="{iconmonstr-info-6-240.png}"}
A 'standard' pack of cards has $52$\ cards, organised into four *suits*: spades $\spadesuit$, clubs $\clubsuit$ (both black),
hearts $\heartsuit$ and diamonds $\diamondsuit$ (both red).
Each *suit* has $13$\ *denominations*: $2$, $3$, $4$, $5$, $6$, $7$, $8$, $9$, $10$, Jack\ (J), Queen\ (Q), King\ (K), Ace\ (A).
The Ace, King, Queen and Jack cards are called *picture cards*.
(Most packs also contain two jokers, which are not considered part of a *standard* pack.)
:::
::: {.importantBox .important data-latex="{iconmonstr-warning-8-240.png}"}
Random samples produce *independent* units of analysis.\index{Sampling!random}\index{Units of analysis}
:::
<iframe src="https://learningapps.org/watch?v=pbd3ekn3k22" style="border:0px;width:100%;height:800px" allowfullscreen="true" webkitallowfullscreen="true" mozallowfullscreen="true"></iframe>
## Conditional probability {#ConditionalProbability}
\index{Probability!conditional}
*Conditional probability* refers to adjusting probabilities when extra information is known.
For example, the probability of rolling a
`r if (knitr::is_latex_output()) {
'\\largedice{1}'
} else {
'<span class="larger-die">⚀</span>'
}`
is $1/6$ using the classical approach, as the sample space has six equally-likely elements.
However, if we are told that the number rolled is an *odd number*, only three elements in the sample space need now be considered (rolls of
`r if (knitr::is_latex_output()) {
'\\largedice{1}'
} else {
'<span class="larger-die">⚀</span>'
}`,
`r if (knitr::is_latex_output()) {
'\\largedice{3}'
} else {
'<span class="larger-die">⚂</span>'
}` and
`r if (knitr::is_latex_output()) {
'\\largedice{5})'
} else {
'<span class="larger-die">⚄</span>)'
}`
rather than six elements.
No other outcome is possible, so the probability of rolling a
`r if (knitr::is_latex_output()) {
'\\largedice{1}'
} else {
'<span class="larger-die">⚀</span>'
}`
is $1/3$.
We say 'the probability of rolling a
`r if (knitr::is_latex_output()) {
'\\largedice{1}'
} else {
'<span class="larger-die">⚀</span>'
}`,
*given that the roll is an odd number*, is\ $1/3$'.
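Written in the same form as the classical approach, but counting only the results that remain possible once the extra information is known:

$$
  \text{Prob. of rolling a 1, given the roll is odd}
  = \frac{\text{The number of odd results that are a 1}}{\text{The number of odd results}}
  = \frac{1}{3}.
$$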
:::{.example #ConditionalCards name="Conditional probability"}
Suppose someone draws a card from a pack of cards.
The probability that the card is a\ $\clubsuit$ is $13/52 = 1/4$, or\ $25$%.
However, if that person tells you that the card is a *black* card, then the card must be either a\ $\clubsuit$ or\ $\spadesuit$.
Hence, the probability that the card is a\ $\clubsuit$, *given* that the card is black, is $13/26 = 1/2$, or\ $50$%.
:::
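Example\ \@ref(exm:ConditionalCards) can be reproduced by building a standard pack and then keeping only the cards that remain possible once the colour is known.
This is a sketch only; the object and column names are illustrative.

```r
# Build a standard pack: 13 denominations in each of 4 suits
pack <- expand.grid(
  denomination = c(2:10, "J", "Q", "K", "A"),
  suit = c("spades", "clubs", "hearts", "diamonds"),
  stringsAsFactors = FALSE
)

# Spades and clubs are black; hearts and diamonds are red
pack$colour <- ifelse(pack$suit %in% c("spades", "clubs"), "black", "red")

# Probability of a club, with no extra information
p_club <- mean(pack$suit == "clubs")

# Probability of a club, given that the card is black:
# only the black cards need to be considered
black_cards <- pack[pack$colour == "black", ]
p_club_given_black <- mean(black_cards$suit == "clubs")

c(p_club, p_club_given_black)
```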
:::{.example #Sunglasses name="Wearing sunglasses"}
@data:Dexter2019:SunProtection recorded the number of people at the foot of the Goodwill Bridge, Brisbane, who wore sunglasses between $11$:$30$am and $12$:$30$pm (Table\ \@ref(tab:SunglassesTableProb)).
The probability of an observed person wearing sunglasses is
$$
\frac{126 + 123}{126 + 123 + 240 + 263} = 0.3311,
$$
or about $33.1$%.
Conditional probabilities can also be computed:
* *If the observed person is female*, the probability that she is wearing sunglasses is $126\div (240 + 126) = 0.3443$, or about\ $34.4$%.
* *If the observed person is male*, the probability that he is wearing sunglasses is $123\div (263 + 123) = 0.3187$, or about\ $31.9$%.
These probabilities are close, but not exactly equal.
If the two events were *independent*, then these two conditional probabilities would be the same: the probability of wearing sunglasses would be the same for females and males.
In other words, the probability of wearing sunglasses would not depend on whether a female or a male was observed.
We might say that wearing sunglasses is close to, but not exactly, independent of the sex of the person, *in the sample*.
We cannot be sure if wearing sunglasses is independent of the sex of the person in the *population*.
:::
```{r SunglassesTableProb}
# Cross-tabulate the counts: sunglasses use (rows) by gender (columns)
data(HatSunglasses)
SG.Table <- xtabs(Count ~ Sunglasses + Gender,
                  data = HatSunglasses)
rownames(SG.Table) <- c("Not wearing sunglasses",
                        "Wearing sunglasses")

# Table for LaTeX (PDF) output
if( knitr::is_latex_output() ) {
  kable(pad(SG.Table,
            surroundMaths = TRUE,
            targetLength = 3,
            decDigits = 0),
        format = "latex",
        booktabs = TRUE,
        longtable = FALSE,
        escape = FALSE,
        align = "c",
        caption = "Females and males wearing sunglasses on the Goodwill Bridge, Brisbane.") %>%
    row_spec(0, bold = TRUE) %>%
    kable_styling(font_size = 8)
}

# Table for HTML output
if( knitr::is_html_output() ) {
  kable(pad(SG.Table,
            surroundMaths = TRUE,
            targetLength = 3,
            decDigits = 0),
        format = "html",
        booktabs = TRUE,
        longtable = FALSE,
        align = "c",
        caption = "Females and males wearing sunglasses on the Goodwill Bridge, Brisbane.") %>%
    row_spec(0, bold = TRUE)
}
```
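The probabilities in Example\ \@ref(exm:Sunglasses) can be computed from the table of counts.
In this sketch, the counts are typed in by hand from the numbers used in the example, rather than read from the original data; the object names are illustrative only.

```r
# Counts from the sunglasses example: rows are sunglasses use,
# columns are the sex of the person observed
counts <- matrix(c(240, 126,    # females: not wearing, wearing
                   263, 123),   # males:   not wearing, wearing
                 nrow = 2,
                 dimnames = list(Sunglasses = c("Not wearing", "Wearing"),
                                 Gender = c("Female", "Male")))

# Probability that an observed person is wearing sunglasses
p_wearing <- sum(counts["Wearing", ]) / sum(counts)

# Conditional probabilities: proportions within each column (each sex)
column_props <- prop.table(counts, margin = 2)
p_wearing_given_sex <- column_props["Wearing", ]

round(p_wearing, 4)
round(p_wearing_given_sex, 4)
```

Using `margin = 2` in `prop.table()` divides each count by its column total, so each column gives the probabilities *given* the sex of the person observed.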
## Chapter summary {#ToolsProbabilitySummary}
A *probability* is a number between $0$ and $1$ inclusive (or between\ $0$% and\ $100$% inclusive) that quantifies the likelihood of a certain event occurring.
Three ways to compute probabilities are:
* the *classical approach*, which requires all outcomes to be *equally likely*;
* the *relative frequency* approach (giving approximate probabilities); and
* the *subjective approach* (deductive probabilities).
Two events are *independent* if the probability of one event is the same, whether the other event has happened or not.
Conditional probability incorporates extra information when the probability is computed.
## Quick review questions {#ToolsProbabilityQuickReview}
::: {.webex-check .webex-box}
Suppose Event\ $A$ is defined as '*Rolling a
`r if (knitr::is_latex_output()) {
'\\largedice{1}'
} else {
'<span class="larger-die">⚀</span>'
}`
or a
`r if (knitr::is_latex_output()) {
'\\largedice{2}'
} else {
'<span class="larger-die">⚁</span>'
}`
on a fair die*'.
Also, suppose Event\ $B$ is defined as '*Rolling an even number on the same die*'.
Are the following statements *true* or *false*?
1. The best *approach* to computing the probability of Event\ $A$ occurring is the *classical* approach.\tightlist
`r if( knitr::is_html_output() ) {torf( answer=TRUE )}`
2. The *probability* of Event\ $A$ occurring is $2/6$.
`r if( knitr::is_html_output() ) {torf( answer=TRUE )}`
3. Rolling a
`r if (knitr::is_latex_output()) {
'\\largedice{1}'
} else {
'<span class="larger-die">⚀</span>'
}`
on the first roll is *independent* of rolling a
`r if (knitr::is_latex_output()) {
'\\largedice{1}'
} else {
'<span class="larger-die">⚀</span>'
}` on a second roll.
`r if( knitr::is_html_output() ) {torf(answer = TRUE )}`
4. The *probability* of\ '$A$ **and**\ $B$' occurring is $1/6$.
`r if( knitr::is_html_output() ) {torf( answer=TRUE )}`
5. The *probability* of\ '$A$ **or**\ $B$' occurring is $4/6$.
`r if( knitr::is_html_output() ) {torf( answer=TRUE )}`
6. The *probability* of '**not**\ $B$' occurring is $3/6$.
`r if( knitr::is_html_output() ) {torf( answer=TRUE )}`
7. The *odds* of '**not**\ $B$' occurring are $3/6$.
`r if( knitr::is_html_output() ) {torf( answer=FALSE )}`
8. The probability of Event\ $B$ occurring, *if* Event\ $A$ has already occurred, is $1/2$.
`r if( knitr::is_html_output() ) {torf( answer=TRUE )}`
:::
## Exercises {#ProbabilityExercises}
[Answers to odd-numbered exercises] are given at the end of the book.
`r if( knitr::is_latex_output() ) "\\captionsetup{font=small}"`
::: {.exercise #ProbabilityMethod}
Which *approach* is best used to estimate a probability in these situations?
1. The probability that the stock market will rise next month.
2. The probability that a randomly-chosen person writes left-handed.
:::
::: {.exercise #ProbabilityMethodB}
Which *approach* is best used to estimate a probability in these situations?
1. The probability that a **King** will be chosen from a pack of cards.
2. The probability that Paris receives more than\ $50\mms$ of rain next May.
:::
::: {.exercise #ProbabilityAndOrNot}
Consider drawing cards from a fair pack.
*Event\ A* is 'drawing a picture card', *Event\ B* is 'drawing a **King** or **Ace**' and *Event\ C* is 'drawing a $\spadesuit$'.
1. What events are in '$A$ **and**\ $B$'? \tightlist
2. Compute the probability of '$A$ **and**\ $B$'.
3. What events are in '$A$ **or**\ $B$'? \tightlist
4. Compute the probability of '$A$ **or**\ $B$'.
5. What events are in '$A$ **and**\ $C$'? \tightlist
6. Compute the probability of '$A$ **and**\ $C$'.
7. What events are in '**not**\ $C$'? \tightlist
8. Compute the probability of '**not**\ $C$'.
9. Compute the probability of $C$, if\ $A$ has already occurred.
10. Compute the probability of $A$, if\ $C$ has already occurred.
:::
::: {.exercise #ProbabilityAndOrNot2}
Consider rolling a fair die.
*Event\ A* is 'rolling an *even* number', *Event\ B* is 'rolling an *odd* number' and *Event\ C* is 'rolling a
`r if (knitr::is_latex_output()) {
'\\largedice{2}'
} else {
'<span class="larger-die">⚁</span>'
}`'.
1. What events are in '$A$ **and**\ $B$'? \tightlist
2. Compute the probability of '$A$ **and**\ $B$'.
3. What events are in '$A$ **or** $B$'? \tightlist
4. Compute the probability of '$A$ **or**\ $B$'.
5. What events are in '$A$ **and**\ $C$'? \tightlist
6. Compute the probability of '$A$ **and**\ $C$'.
7. What events are in '**not**\ $C$'? \tightlist
8. Compute the probability of '**not**\ $C$'.
9. Compute the probability of $C$, if\ $A$ has already occurred.
10. Compute the probability of $C$, if\ $B$ has already occurred.
:::
::: {.exercise #ProbabilityThreeEvents}
Consider these three events about tossing two fair coins, say Coin\ A and Coin\ B:
*Event\ 1* is 'toss a **Head** on Coin\ A'; *Event\ 2* is 'toss a **Tail** on Coin\ A'; and *Event\ 3* is 'toss a **Head** on Coin\ B'.
1. Are *Event\ 1* and *Event\ 2* independent events? \tightlist
2. Are *Event\ 1* and *Event\ 3* independent events?
3. Compute the probability of *Event\ 3*.
4. What is the probability of *Event\ 3* occurring, if *Event\ 1* has already occurred?
5. List the sample space for the random procedure.
:::
::: {.exercise #ProbabilityThreeEvents2}
Consider these three events about drawing one card from a fair pack:
*Event\ 1* is 'draw a **Jack**'; *Event\ 2* is 'draw a $\heartsuit$'; and *Event\ 3* is 'draw a $\clubsuit$'.
1. Compute the probability of *Event\ 1*. \tightlist
1. Compute the probability of *Event\ 1*, if *Event\ 2* has occurred.
1. Compute the probability of *Event\ 1*, if *Event\ 2* has *not* occurred.
1. Are *Event\ 1* and *Event\ 2* independent?
Explain.
1. Compute the probability of *Event\ 3*.
1. Compute the probability of *Event\ 3*, if *Event\ 2* has occurred.
1. Compute the probability of *Event\ 3*, if *Event\ 2* has *not* occurred.
1. Are *Event\ 2* and *Event\ 3* independent?
Explain.
:::
::: {.exercise #ProbabilityDie2}
Suppose I roll a standard six-sided die.
1. What is the *probability* that I will roll a number larger than
`r if (knitr::is_latex_output()) {
'\\largedice{2}'
} else {
'<span class="larger-die">⚁</span>'
}`? \tightlist
`r if( knitr::is_html_output() ) {mcq( c(
"0",
"1/6",
"2/6",
"3/6",
answer = "4/6",
"5/6",
"1"))}`
2. What are the *odds* of rolling a number smaller than
`r if (knitr::is_latex_output()) {
'\\largedice{6}'
} else {
'<span class="larger-die">⚅</span>'
}`?
`r if( knitr::is_html_output() ) {mcq( c(
"0",
"1/6",
"1/5",
"5/6",
"6/5",
answer = "5",
"1"))}`
3. Suppose I toss a coin after rolling the die.
Is the result from the coin toss *independent* of what I rolled on the die?
`r if( knitr::is_html_output() ) {longmcq( c(
"No: because there are six possible outcomes from rolling a die, but only two from tossing a coin",
"Yes: because the die was not rolled at the same time as the coin was tossed",
answer = "Yes: what happens on the die won't change what happens on the coin",
"No: because the die was not rolled at the same time as the coin was tossed"))}`
4. What is the probability that I roll a number divisible by\ $2$ on the die?
`r if( knitr::is_html_output() ) {mcq( c(
"0",
"1/6",
"2/6",
answer = "3/6",
"4/6",
"5/6",
"1"))}`
5. What is the probability that I roll a number divisible by\ $2$ **and** divisible by\ $3$ on the die?
`r if( knitr::is_html_output() ) {mcq( c(
"0",
answer = "1/6",
"2/6",
"3/6",
"4/6",
"5/6",
"1"))}`
6. What is the probability of rolling a
`r if (knitr::is_latex_output()) {
'\\largedice{2}'
} else {
'<span class="larger-die">⚁</span>'
}`, *given* that the number is smaller than
`r if (knitr::is_latex_output()) {
'\\largedice{4}'
} else {
'<span class="larger-die">⚃</span>'
}`?
:::
::: {.exercise #ProbabilityCards}
Suppose you have a well-shuffled, standard pack of $52$\ cards.
1. What is the *probability* that you will draw a **King**?