update ch.19
pglpm committed Sep 28, 2024
1 parent 52b39f7 commit e2fdf32
Showing 6 changed files with 36 additions and 37 deletions.
10 changes: 4 additions & 6 deletions conditional_probability.qmd
Original file line number Diff line number Diff line change
@@ -9,19 +9,17 @@ When we introduced the notion of degree of belief -- a.k.a. probability -- in [c
This term must be understood in a way analogous to "marginal probability": it applies in situations where we have two or more sentences of interest. We speak of a "conditional probability" when we want to emphasize that additional sentences appear in the conditional (right side of "$\|$") of that probability. For instance, in a scenario with these two probabilities:

$$
\P(\se{A} \| \se{B} \and \yI)
\P(\se{A} \| \se{\yellow B} \and \yI)
\qquad
\P(\se{A} \| \yI)
$$

we call the first [**conditional probability**]{.blue} of $\se{A}$ ([**given**]{.blue} $\se{B}$) to emphasize or point out that its conditional includes the additional sentence ($\se{B}$), whereas the conditional of the second probability doesn't include this sentence.
we call the first [**conditional probability**]{.blue} of $\se{A}$ ([**given**]{.blue} $\se{\yellow B}$) to emphasize or point out that its conditional includes the additional sentence $\se{\yellow B}$, whereas the conditional of the second probability doesn't include this sentence.
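
To make the distinction concrete, here is a minimal numeric sketch (the two-sentence scenario and all numbers are invented for illustration, not taken from the text): an agent's joint degrees of belief about two sentences $\se{A}$ and $\se{B}$, from which both probabilities above can be computed.

```python
# Invented joint degrees of belief P(A and B | I), encoded over the
# four truth-value combinations of the sentences A and B.
joint = {
    (True, True): 0.30,
    (True, False): 0.20,
    (False, True): 0.10,
    (False, False): 0.40,
}

# P(A | I): marginalize over the truth values of B.
p_A = joint[(True, True)] + joint[(True, False)]

# P(A | B and I): renormalize within the cases where B is true.
p_B = joint[(True, True)] + joint[(False, True)]
p_A_given_B = joint[(True, True)] / p_B

print(p_A)          # 0.5
print(p_A_given_B)  # 0.75
```

The two values differ, which is exactly why it can be worth pointing out the additional sentence $\se{B}$ in the conditional.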


## The relation between *learning* and Conditional probability {#sec-conditional-prob_learning}
## The relation between *learning* and conditional probability {#sec-conditional-prob_learning}

Why do we need to emphasize that a particular degree of belief is conditional on an *additional* sentence?

Because the additional sentence usually represents *new information that the agent has learned*.
Why do we need to emphasize that a particular degree of belief is conditional on an additional sentence? Because the additional sentence usually represents *new information that the agent has learned*.

Remember that the conditional of a probability usually contains all factual information known to the agent^[Exceptions are, for instance, when the agent does *counterfactual* or *hypothetical* reasoning, as we discussed in [§@sec-inference-scenarios].]. Therefore if an agent acquires new data or a new piece of information expressed by a sentence $\yellow\se{D}$, it should draw inferences and make decisions using probabilities that include $\yellow\se{D}$ in their conditional. In other words, the agent before was drawing inferences and making decisions using some probabilities

13 changes: 7 additions & 6 deletions connection-3-ML.qmd
@@ -12,11 +12,11 @@ $$

The correspondence about [training data]{.green} and [architecture]{.yellow} seems somewhat convincing; the one about [outcome]{.red} needs more exploration.

Having introduced the notion of quantity in the latest chapters [-@sec-quantities-types-basic] and [-@sec-quantities-types-multi], we recognize that "training data" are nothing else but quantities with given values. So a datum $\se{D}_i$ can be expressed by a sentence like $Z_i\mo z_i$, where
Having introduced the notion of quantity in the previous chapters [-@sec-quantities-types-basic] and [-@sec-quantities-types-multi], we recognize that "training data" are just quantities whose values the agent has learned. So a datum $\se{D}_i$ can be expressed by a sentence like $Z_i\mo z_i$, where

- $i$ is the instance: $1,2,\dotsc,N, N+1$
- $Z_i$, a quantity, describes the type of data at instance $i$, for example "128 × 128 image with 24-bit colour depth, with a character label"
- $z_i$ is the value of the quantity $Z_i$ at instance $i$, for example the specific image & label enclosed here:
- $i$ is the instance: $1,2,\dotsc,N, N+1$.
- $Z_i$, a quantity, describes the type of data at instance $i$, for example "128 × 128 image with 24-bit colour depth, with a character label".
- $z_i$ is the value of the quantity $Z_i$ at instance $i$, for example the specific image & label displayed here:

:::{.column-margin}
![label = "Saitama"](saitama_smile.png){width=128 fig-cap-location="center"}
@@ -30,6 +30,7 @@ $$
\black\and \underbracket[0ex]{\yellow\yI}_{\mathrlap{\yellow\uparrow\ \text{architecture?}}})
$$

This is the kind of inference that we explored in the "next-three-patients" scenario of [§@sec-conditional-joint-sim] and in some of its following sections. In [chapter @sec-beyond-ML], after a review of conventional machine-learning methods and terminology, we shall discuss with more care what these inferences are about, what kind of information they use, and how they can be concretely calculated.
This is the kind of inference that we explored in the "next-three-patients" scenario of [§@sec-conditional-joint-sim] and in some of the subsequent sections. In [chapter @sec-beyond-ML], after a review of conventional machine-learning methods and terminology, we shall discuss with more care what these inferences are about, what kind of information they use, and how they can be concretely calculated.
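
As a small numeric taste of what such an inference can look like, here is a sketch under an assumption of my own (not a method stated in this chapter): that the agent's predictive probabilities reduce to smoothed observed frequencies, via Laplace's rule of succession. The data values and the domain are invented for illustration.

```python
from collections import Counter

# Invented training data: observed values z_1 ... z_N of similar quantities.
observed = ["a", "b", "a", "a", "b", "c"]
# Possible values of the next instance Z_{N+1} ("d" was never observed).
domain = ["a", "b", "c", "d"]

counts = Counter(observed)
N, M = len(observed), len(domain)

# Laplace's rule of succession (assumed here, not derived):
# P(Z_{N+1} = z | z_1 ... z_N, I) = (count(z) + 1) / (N + M)
predictive = {z: (counts[z] + 1) / (N + M) for z in domain}

print(predictive)  # {'a': 0.4, 'b': 0.3, 'c': 0.2, 'd': 0.1}
```

Note that the never-observed value "d" still receives nonzero probability; whether and when a formula of this kind is justified is part of what the coming chapters examine.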

But we have been speaking of "task instances" and "instances of quantities" quite vaguely so far. These are important notions: the whole idea of "learning from examples" hinges on them. In the next few chapters we shall therefore make them more rigorous. The theory that makes them rigorous is *Statistics*. As a bonus we shall find out that a rigorous analysis of the notion of "instances" also leads to concrete formulae to calculate the probabilities discussed above.
\
In the last sections we have often been speaking about "instances", "instances of similar quantities", "task instances", and similar expressions. What exactly do we mean by "instance"? It is time to make this and related notions more precise: the whole idea of "learning from examples" hinges on them. In the next few chapters we shall therefore make these ideas more rigorous and quantifiable. *Statistics* is the theory that deals with these ideas. As a bonus we shall find out that a rigorous analysis of the notion of "instances" also leads to concrete formulae for calculating the kind of probabilities discussed in the present chapter.
17 changes: 8 additions & 9 deletions docs/conditional_probability.html
@@ -6,7 +6,7 @@

<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">

<meta name="dcterms.date" content="2024-09-27">
<meta name="dcterms.date" content="2024-09-28">

<title>ADA511 0.2 Foundations of data science and data-driven engineering - 17&nbsp; Conditional probability and learning</title>
<style>
@@ -133,7 +133,7 @@ <h1 class="quarto-secondary-nav-title"><span id="sec-learning" class="quarto-sec
<img src="./ada511logo8_small.png" alt="" class="sidebar-logo py-0 d-lg-inline d-none">
</a>
<div class="sidebar-title mb-0 py-0">
<a href="./">ADA511 <span class="small grey">0.2</span><br><span class="small grey">2024-09-27</span><br></a><a href="http://creativecommons.org/licenses/by-sa/4.0"><img src="cc_by_sa.png" class="img-fluid" style="width:3em" alt="CC BY-SA 4.0"> <span class="small grey">licence</span></a><br>
<a href="./">ADA511 <span class="small grey">0.2</span><br><span class="small grey">2024-09-28</span><br></a><a href="http://creativecommons.org/licenses/by-sa/4.0"><img src="cc_by_sa.png" class="img-fluid" style="width:3em" alt="CC BY-SA 4.0"> <span class="small grey">licence</span></a><br>
<div class="sidebar-tools-main">
<a href="" class="quarto-reader-toggle quarto-navigation-tool px-1" onclick="window.quartoToggleReader(); return false;" title="Toggle reader mode">
<div class="quarto-reader-toggle-btn">
Expand Down Expand Up @@ -552,7 +552,7 @@ <h2 id="toc-title">Table of contents</h2>

<ul>
<li><a href="#sec-conditional-probs" id="toc-sec-conditional-probs" class="nav-link active" data-scroll-target="#sec-conditional-probs"><span class="header-section-number">17.1</span> The meaning of the term “conditional probability”</a></li>
<li><a href="#sec-conditional-prob_learning" id="toc-sec-conditional-prob_learning" class="nav-link" data-scroll-target="#sec-conditional-prob_learning"><span class="header-section-number">17.2</span> The relation between <em>learning</em> and Conditional probability</a></li>
<li><a href="#sec-conditional-prob_learning" id="toc-sec-conditional-prob_learning" class="nav-link" data-scroll-target="#sec-conditional-prob_learning"><span class="header-section-number">17.2</span> The relation between <em>learning</em> and conditional probability</a></li>
<li><a href="#sec-conditional-joint-dis" id="toc-sec-conditional-joint-dis" class="nav-link" data-scroll-target="#sec-conditional-joint-dis"><span class="header-section-number">17.3</span> Learning about a quantity from a <em>different</em> quantity</a></li>
<li><a href="#sec-conditional-joint-sim" id="toc-sec-conditional-joint-sim" class="nav-link" data-scroll-target="#sec-conditional-joint-sim"><span class="header-section-number">17.4</span> Learning about a quantity from instances of <em>similar</em> quantities</a></li>
<li><a href="#sec-conditional-joint-general" id="toc-sec-conditional-joint-general" class="nav-link" data-scroll-target="#sec-conditional-joint-general"><span class="header-section-number">17.5</span> Learning in the general case</a></li>
@@ -578,7 +578,7 @@ <h1 class="title d-none d-lg-block"><span id="sec-learning" class="quarto-sectio
<div>
<div class="quarto-title-meta-heading">Published</div>
<div class="quarto-title-meta-contents">
<p class="date">2024-09-27</p>
<p class="date">2024-09-28</p>
</div>
</div>

@@ -604,16 +604,15 @@ <h2 data-number="17.1" class="anchored" data-anchor-id="sec-conditional-probs"><
<p>When we introduced the notion of degree of belief – a.k.a. probability – in <a href="probability_inference.html" class="quarto-xref">chapter&nbsp;&nbsp;<span>8</span></a>, we emphasized that <em>every probability is conditional on some state of knowledge or information</em>. So the term “conditional probability” sounds like a <a href="https://dictionary.cambridge.org/dictionary/english/pleonasm">pleonasm</a>, just like saying “round circle”.</p>
<p>This term must be understood in a way analogous to “marginal probability”: it applies in situations where we have two or more sentences of interest. We speak of a “conditional probability” when we want to emphasize that additional sentences appear in the conditional (right side of <span style="display:inline-block;"><span class="math inline">\(\nonscript\:\vert\nonscript\:\mathopen{}\)</span></span>) of that probability. For instance, in a scenario with these two probabilities:</p>
<p><span class="math display">\[
\mathrm{P}(\mathsfit{A} \nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{B} \mathbin{\mkern-0.5mu,\mkern-0.5mu}\mathsfit{I})
\mathrm{P}(\mathsfit{A} \nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{\color[RGB]{204,187,68}B} \mathbin{\mkern-0.5mu,\mkern-0.5mu}\mathsfit{I})
\qquad
\mathrm{P}(\mathsfit{A} \nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{I})
\]</span></p>
<p>we call the first <span class="blue"><strong>conditional probability</strong></span> of <span style="display:inline-block;"><span class="math inline">\(\mathsfit{A}\)</span></span> (<span class="blue"><strong>given</strong></span> <span style="display:inline-block;"><span class="math inline">\(\mathsfit{B}\)</span>)</span> to emphasize or point out that its conditional includes the additional sentence (<span style="display:inline-block;"><span class="math inline">\(\mathsfit{B}\)</span>),</span> whereas the conditional of the second probability doesn’t include this sentence.</p>
<p>we call the first <span class="blue"><strong>conditional probability</strong></span> of <span style="display:inline-block;"><span class="math inline">\(\mathsfit{A}\)</span></span> (<span class="blue"><strong>given</strong></span> <span style="display:inline-block;"><span class="math inline">\(\mathsfit{\color[RGB]{204,187,68}B}\)</span>)</span> to emphasize or point out that its conditional includes the additional sentence <span style="display:inline-block;"><span class="math inline">\(\mathsfit{\color[RGB]{204,187,68}B}\)</span>,</span> whereas the conditional of the second probability doesn’t include this sentence.</p>
</section>
<section id="sec-conditional-prob_learning" class="level2 page-columns page-full" data-number="17.2">
<h2 data-number="17.2" class="anchored" data-anchor-id="sec-conditional-prob_learning"><span class="header-section-number">17.2</span> The relation between <em>learning</em> and Conditional probability</h2>
<p>Why do we need to emphasize that a particular degree of belief is conditional on an <em>additional</em> sentence?</p>
<p>Because the additional sentence usually represents <em>new information that the agent has learned</em>.</p>
<h2 data-number="17.2" class="anchored" data-anchor-id="sec-conditional-prob_learning"><span class="header-section-number">17.2</span> The relation between <em>learning</em> and conditional probability</h2>
<p>Why do we need to emphasize that a particular degree of belief is conditional on an additional sentence? Because the additional sentence usually represents <em>new information that the agent has learned</em>.</p>
<p>Remember that the conditional of a probability usually contains all factual information known to the agent<a href="#fn1" class="footnote-ref" id="fnref1" role="doc-noteref"><sup>1</sup></a>. Therefore if an agent acquires new data or a new piece of information expressed by a sentence <span style="display:inline-block;"><span class="math inline">\(\color[RGB]{204,187,68}\mathsfit{D}\)</span>,</span> it should draw inferences and make decisions using probabilities that include <span style="display:inline-block;"><span class="math inline">\(\color[RGB]{204,187,68}\mathsfit{D}\)</span></span> in their conditional. In other words, the agent before was drawing inferences and making decisions using some probabilities</p>
<div class="no-row-height column-margin column-container"><p><sup>1</sup>&nbsp;Exceptions are, for instance, when the agent does <em>counterfactual</em> or <em>hypothetical</em> reasoning, as we discussed in <a href="inference.html#sec-inference-scenarios" class="quarto-xref">§&nbsp;<span>5.1</span></a>.</p></div><p><span class="math display">\[
\mathrm{P}(\dotso \nonscript\:\vert\nonscript\:\mathopen{} \mathsfit{K})