#+title: CS101 - Cognitive Science and Artificial Intelligence - Lecture 2
#+startup: beamer
#+latex_class: beamer
#+latex_class_options: [14pt,aspectratio=169,hyperref={pdfpagelabels=false},svgnames]
#+latex_header_extra: \usetheme{strath}
#+latex_header_extra: \usepackage{tikz}
#+latex_header_extra: \usepackage{pgfplots}
#+latex_header_extra: \usepackage{filecontents}
#+options: toc:nil
* CS101 - Cog. Sci. and AI
** Today
- Recap of assignment 1 - git
- Probability and text
* Assignment 1
- 74 students have attempted to submit the assignment
- of these, 72 have actually submitted the assignment
*This is only slightly more than half the class!*
Marking will begin this week.
If you are having trouble, *ask a demonstrator for help*.
If you do not do this assignment, you will not be able to submit any
of the other assignments and you will get *0* for this module.
* Probability Distributions: Definition
- Support :: a set $S$,
$$
S = \left\{ x_1, x_2, \ldots, x_n \right\}
$$
- Probability map :: a mapping
$$
\Pr: S \rightarrow \left[0,1\right]
$$
- Condition ::
$$
\sum_{x \in S} \Pr(x) = 1
$$
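* Probability Distributions: in Code
A distribution with finite support can be written down directly in code
as a mapping from outcomes to probabilities. A minimal sketch in Python
(the values below are illustrative only):
#+begin_src python
# A probability distribution as a Python dict: support -> probability.
# The numbers here are made up for illustration.
pr = {"x1": 0.2, "x2": 0.5, "x3": 0.3}

# The defining condition: the probabilities must sum to 1.
assert abs(sum(pr.values()) - 1.0) < 1e-9
#+end_src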
* Probability Distributions: a Fair Coin
Consider a fair coin$\ldots$
$$
S_1 = \left\{ \mathtt{heads}, \mathtt{tails} \right\}
$$
$$
\Pr_1(x) = \begin{cases}
\frac{1}{2} & x = \mathtt{heads}\\
\frac{1}{2} & x = \mathtt{tails}
\end{cases}
$$
$$
\Pr_1(\mathtt{heads}) + \Pr_1(\mathtt{tails}) = \frac{1}{2} + \frac{1}{2} = 1
$$
* Probability Distributions: Two Fair Coins
\begin{align*}
S_2 = S_1 \times S_1 = \left\{&
(\mathtt{heads}, \mathtt{heads}), (\mathtt{heads}, \mathtt{tails}),\right.\\
&\left.(\mathtt{tails}, \mathtt{heads}), (\mathtt{tails}, \mathtt{tails})
\right\}
\end{align*}
\begin{align*}
\Pr_2(x,y) &= \Pr_1(x)\Pr_1(y),\qquad\forall (x,y) \in S_2\\
&= \frac{1}{2}\cdot\frac{1}{2} = \frac{1}{4}
\end{align*}
*Assumption:* these two coin tosses are /independent/
* Probability Distributions: Fair Dice
Consider a (fair) six-sided die,
$$
S_{d6} = \left\{1,2,3,4,5,6\right\}
$$
$$
\Pr_\text{fair}(x) = \frac{1}{6}, \qquad\forall x \in S_{d6}
$$
* Probability Distributions: Unfair Dice
Now consider an /unfair/ six-sided die,
$$
\Pr_\text{unfair}(x) = \begin{cases}
\frac{1}{5} & x = 1\\
\frac{1}{10} & x \in \left\{2,3,4,5\right\}\\
\frac{2}{5} & x = 6
\end{cases}
$$
* Questions
- Can we quantify /fairness/?
- Can we distinguish /information/ from random noise?
* Entropy
\begin{align*}
H(\Pr) &= -\sum_{x\in S} \Pr(x) \log_2 \Pr(x)\\
\Pr &: S \rightarrow \left[0,1\right]
\end{align*}
Define: $ 0 \log_2 0 \equiv 0 $
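* Entropy -- in Code
The formula translates directly into Python. A minimal sketch (the
function name is illustrative; the course materials' =entropy.py=
contains the version used for the examples below):
#+begin_src python
import math

# H(Pr) = -sum over x of Pr(x) * log2(Pr(x)),
# using the convention 0 * log2(0) == 0 (skip zero-probability outcomes).
def entropy(dist):
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

print(entropy({"heads": 1/2, "tails": 1/2}))   # 1.0 -- the fair coin
#+end_src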
* Entropy
\begin{center}
\vspace{-2\baselineskip}
\hspace{2ex}\includegraphics[height=\textheight]{./img/neglog.png}
\end{center}
* Entropy -- Fair Coin
\begin{align*}
H(\Pr_1)
&= -\left[ \Pr_1(\mathtt{heads})\log_2 \Pr_1(\mathtt{heads}) +
\Pr_1(\mathtt{tails})\log_2 \Pr_1(\mathtt{tails}) \right]\\
&= -\left[ \frac{1}{2}\log_2 \frac{1}{2} + \frac{1}{2}\log_2 \frac{1}{2}\right]\\
&= -2 \frac{1}{2} \log_2 \frac{1}{2}\\
&= -\log_2 \frac{1}{2}\\
&= 1
\end{align*}
* Entropy -- Two Fair Coins
\begin{align*}
S_2 = \left\{&
(\mathtt{heads}, \mathtt{heads}), (\mathtt{heads}, \mathtt{tails}),\right.\\
&\left. (\mathtt{tails}, \mathtt{heads}), (\mathtt{tails}, \mathtt{tails})
\right\}
\end{align*}
\begin{align*}
\Pr_2(x) &= \frac{1}{4}, \qquad\forall x \in S_2
\end{align*}
* Entropy -- Two Fair Coins
\begin{align*}
H(\Pr_2)
&= -\left[ \Pr_2(\mathtt{HH})\log_2 \Pr_2(\mathtt{HH}) +
\Pr_2(\mathtt{HT})\log_2 \Pr_2(\mathtt{HT}) +\right.\\
&\qquad\,\left.
\Pr_2(\mathtt{TH})\log_2 \Pr_2(\mathtt{TH}) +
\Pr_2(\mathtt{TT})\log_2 \Pr_2(\mathtt{TT})\right]\\
&= -4 \frac{1}{4} \log_2 \frac{1}{4}\\
&= -\log_2 \frac{1}{4}\\
&= 2
\end{align*}
* Entropy -- Fair Dice
$$
H(\Pr_\text{fair}) = -\log_2 \frac{1}{6} \approx 2.58\ldots
$$
* Entropy -- Unfair Dice
\begin{align*}
\Pr_\text{unfair}(S) &= \left\{\Pr_\text{unfair}(1), \Pr_\text{unfair}(2), \Pr_\text{unfair}(3), \Pr_\text{unfair}(4), \Pr_\text{unfair}(5), \Pr_\text{unfair}(6) \right\}\\
&= \left\{\frac{1}{5}, \frac{1}{10}, \frac{1}{10}, \frac{1}{10}, \frac{1}{10}, \frac{2}{5} \right\}
\end{align*}
* Entropy -- Unfair Dice
\begin{align*}
\Pr_\text{unfair}(S) &= \left\{\Pr_\text{unfair}(1), \Pr_\text{unfair}(2), \Pr_\text{unfair}(3), \Pr_\text{unfair}(4), \Pr_\text{unfair}(5), \Pr_\text{unfair}(6) \right\}\\
&= \left\{\frac{1}{5}, \frac{1}{10}, \frac{1}{10}, \frac{1}{10}, \frac{1}{10}, \frac{2}{5} \right\}
\end{align*}
\begin{align*}
H(\Pr_\text{unfair}) &= -\left[
\frac{1}{5}\log_2 \frac{1}{5} + 4\frac{1}{10}\log_2 \frac{1}{10} + \frac{2}{5}\log_2 \frac{2}{5}
\right]\\
&\approx 2.32\ldots
\end{align*}
* Entropy -- Really Unfair Dice
What if...
$$
\Pr(x) = \begin{cases}
1 & x = 1\\
0 & \text{otherwise}
\end{cases}
$$
\begin{align*}
H(\Pr) &= -1 \log_2 1 - 5\cdot0\log_2 0\\
&= 0
\end{align*}
* Normalised Entropy
$$
\bar{H} = \frac{\text{entropy of a distribution}}{\text{entropy of the uniform distribution}}
$$
* Normalised Entropy
$$
\bar{H} = \frac{\text{entropy of a distribution}}{\text{entropy of the uniform distribution}}
$$
Recall that the entropy of the uniform distribution is just $-\log_2 \frac{1}{|S|}$.
Notation -- $|S|$ is the number of elements in the support.
* Normalised Entropy
For our unfair vs fair (uniform) dice,
$$
\bar{H}(\Pr_\text{unfair}) = \frac{H(\Pr_\text{unfair})}{-\log_2\frac{1}{|S_{d6}|}} = \frac{2.32\ldots}{2.58\ldots} = 0.898\ldots
$$
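* Normalised Entropy -- in Code
Checking the numbers above in Python (a self-contained sketch; the
unfair die is the one defined earlier):
#+begin_src python
import math

# The unfair die from the earlier slide.
unfair = {1: 1/5, 2: 1/10, 3: 1/10, 4: 1/10, 5: 1/10, 6: 2/5}

H = -sum(p * math.log2(p) for p in unfair.values() if p > 0)
H_uniform = math.log2(len(unfair))     # -log2(1/6), the fair die's entropy
print(H, H / H_uniform)                # approx. 2.32  0.898
#+end_src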
* Normalised Entropy
There is a better measure of the relative entropy of two distributions,
the *Kullback-Leibler divergence*. You would learn about this in a course
on Information Theory.
For our purposes, the normalised entropy will do.
* So what about text? Choose an alphabet
$$
S_\alpha = \mathtt{'abcdefghijklmnopqrstuvwxyz\_'}
$$
(by '_' we mean a space)
Some light pre-processing:
- make all letters lower case
- ignore punctuation etc
* So what about text? Probability distribution
Choose a letter at random from a text.
What is the chance you pick =e= or =q= or =' '= (space)?
- Support :: all (ascii) letters + space
- Mapping ::
$$
\Pr_\alpha(x) = \frac{\mathrm{count}(x)}{\mathrm{count}(\text{all letters})}
$$
* Letter probabilities
#+begin_src
a 0.0654 b 0.0124 c 0.0214 d 0.0311 e 0.1061
f 0.0195 g 0.0144 h 0.0547 i 0.0604 j 0.0014
k 0.0043 l 0.0316 m 0.0196 n 0.0586 o 0.0633
p 0.0145 q 0.0009 r 0.0483 s 0.0537 t 0.0783
u 0.0236 v 0.0078 w 0.0177 x 0.0013 y 0.0164
z 0.0004  _ 0.1727
#+end_src
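* Letter probabilities -- in Code
A table like the one above could be produced with a short script. A
sketch, assuming a plain-text input and the preprocessing described
earlier (the filename and variable names are illustrative; the course
materials' =letters.py= contains the official version):
#+begin_src python
from collections import Counter

letters = "abcdefghijklmnopqrstuvwxyz_"

# Filename is an example; any plain-text file will do.
with open("republic.txt") as f:
    text = f.read().lower()

# Keep only letters and whitespace, mapping any whitespace to '_'.
counts = Counter("_" if c.isspace() else c
                 for c in text
                 if c.isspace() or c in letters)

total = sum(counts.values())
prob = {c: counts[c] / total for c in letters}
#+end_src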
* Normalised entropy of letter probabilities
$$
H(\Pr_\alpha) = 4.095\ldots
$$
Maximum entropy, $-\log_2 \frac{1}{27} = \log_2 27 \approx 4.754\ldots$
$$
\bar{H}(\Pr_\alpha) = \frac{4.095\ldots}{4.754\ldots} = 0.861\ldots
$$
* Pair and conditional probability
\begin{align*}
\Pr(y | x) &\qquad\text{probability of }y\text{ given }x\\
\Pr(x,y) &\qquad\text{probability of seeing the pair }(x,y)
\end{align*}
\begin{align*}
\Pr(x,y) &= \Pr(y|x) \Pr(x)\\
\Pr(y) &= \sum_x \Pr(y|x) \Pr(x)
\end{align*}
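* Conditional letter probabilities -- a sketch
The same counting idea extends to pairs of letters. A sketch of
estimating $\Pr(y|x)$ from pair counts (same preprocessing assumptions
as before; the filename and names are illustrative):
#+begin_src python
from collections import Counter

letters = "abcdefghijklmnopqrstuvwxyz_"

with open("republic.txt") as f:            # filename is an example
    text = f.read().lower()
clean = ["_" if c.isspace() else c for c in text if c.isspace() or c in letters]

pair_counts = Counter(zip(clean, clean[1:]))   # count(x, y) for adjacent pairs
prev_counts = Counter(clean[:-1])              # count(x) as a first element

def p_given(y, x):
    """Pr(y | x) = count(x, y) / count(x)."""
    return pair_counts[(x, y)] / prev_counts[x]

print(p_given("u", "q"))   # in English text, 'u' almost always follows 'q'
#+end_src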
* Assignment 2a - Probability and Text (10%)
Write a program named =assignment-2a= that takes a file and
prints out a letter according to the distribution of letters in
that file, e.g.
#+begin_src
xgb21195@cafe:~/cs101$ ./assignment-2a filename
e
xgb21195@cafe:~/cs101$
#+end_src
* Assignment 2a - Probability and Text (10%)
/hint: use the Python 3 built-in/ =random.choices= /function/
Note:
#+begin_src
random.choices(population, weights=None, ...)
#+end_src
- =population= $\rightarrow$ support
- =weights= $\rightarrow$ probabilities
https://docs.python.org/3/library/random.html
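* =random.choices= -- a sketch
For example, sampling a single symbol from a small hand-made
distribution (the probabilities below are made up for illustration,
not the real letter frequencies):
#+begin_src python
import random

support = ["e", "t", "_"]        # illustrative support
probs   = [0.3, 0.2, 0.5]        # illustrative probabilities (sum to 1)

# random.choices returns a list of k samples drawn with the given weights.
print(random.choices(support, weights=probs, k=1)[0])
#+end_src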
* Assignment 2b - Probability and Text (10%)
Write a program named =assignment-2b= that takes a file and a
letter and prints out a following letter according to the
/conditional/ distribution of letters given the previous letter in
that file, e.g.
#+begin_src
xgb21195@cafe:~/cs101$ ./assignment-2b filename e
r
xgb21195@cafe:~/cs101$
#+end_src
* Assignment 2c - Probability and Text (10%)
Write a program named =assignment-2c= that takes a filename and a
number and prints out a sequence of characters of that length
according to the conditional distribution from 2b, e.g.
#+begin_src
xgb21195@cafe:~/cs101$ ./assignment-2c filename 25
end usanve n imemas hely
xgb21195@cafe:~/cs101$
#+end_src
* Assignment 2
- Your programs *must* be named =assignment-2a=, =assignment-2b=, and
=assignment-2c= in your git repository
- You can write your programs in whatever language you wish. If you
  use a compiled language, you must include a binary that runs on the
  CIS linux machines, along with the source code and a Makefile to build it.
- You *must* make sure the programs run on the CIS linux machines; if they
  do not, you will receive no marks, no exceptions.
* Assignment 2 bonus (10%)
Write a program like 2c that, instead of using the probability
conditioned on /one/ previous letter, conditions on the previous /two/ letters.
Call this program =assignment-2d=
* Materials
https://gitlab.cis.strath.ac.uk/xgb21195/cs101-csai
look in the =lec2= subdirectory.
- letters.py :: definition of "letters" and functions to get
  distributions of letters from a file
- entropy.py :: implementation of entropy function and example of
using it on a file (for letters)
- example.py :: examples of using these functions
- republic.txt :: Plato's Republic from Project Gutenberg
* Reading for next week
The Game, Anatoly Dneprov, 1961
http://q-bits.org/images/Dneprov.pdf