CS101 - Cognitive Science and Artificial Intelligence - Lecture 2
Today
- Recap of assignment 1 - git
- Probability and text
Assignment 1
- 74 students have attempted to submit the assignment
- of these, 72 have actually submitted the assignment
This is only slightly more than half the class!
Marking will begin this week.
If you are having trouble, ask a demonstrator for help.
If you do not do this assignment, you will not be able to submit any of the other assignments and you will get 0 for this module.
Probability Distributions: Definition
- Support
- a set $S$, $$ S = \left\{ x_1, x_2, \ldots, x_n \right\} $$
- Probability map
- a mapping $$ Pr: S → \left[0,1\right] $$
- Condition
- $$ ∑_{x ∈ S} Pr(x) = 1 $$
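The definition above can be made concrete with a minimal sketch (plain Python; the names are ours, not from the course code): a distribution as a dict whose keys are the support and whose values are the probability map, plus a check of the conditions.

```python
# A distribution as a dict: the keys are the support S,
# the values are the probability map Pr.
fair_coin = {"heads": 0.5, "tails": 0.5}

def is_distribution(pr, tol=1e-9):
    """Check that all values lie in [0,1] and that they sum to 1."""
    return (all(0.0 <= p <= 1.0 for p in pr.values())
            and abs(sum(pr.values()) - 1.0) < tol)

assert is_distribution(fair_coin)
```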
Probability Distributions: a Fair Coin
Consider a fair coin$\ldots$
$$ S_1 = \left\{ \mathtt{heads}, \mathtt{tails} \right\} $$
$$
Pr_1(x) = \begin{cases}
\frac{1}{2} & x\ \text{is}\ \mathtt{heads} \\
\frac{1}{2} & x\ \text{is}\ \mathtt{tails}
\end{cases}
$$
$$ Pr_1(\mathtt{heads}) + Pr_1(\mathtt{tails}) = \frac{1}{2} + \frac{1}{2} = 1 $$
Probability Distributions: Two Fair Coins
\begin{align*}
S_2 = S_1 × S_1 = \left\{\right. & (\mathtt{heads}, \mathtt{heads}), (\mathtt{heads}, \mathtt{tails}), \\
& \left.(\mathtt{tails}, \mathtt{heads}), (\mathtt{tails}, \mathtt{tails}) \right\}
\end{align*}
\begin{align*}
Pr_2(x,y) &= Pr_1(x)Pr_1(y),\qquad\forall (x,y) ∈ S_2 \\
&= \frac{1}{2}⋅\frac{1}{2} = \frac{1}{4}
\end{align*}
Assumption: these two coin tosses are independent
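As an illustration (our own sketch, not course code), the joint distribution for two independent tosses is just the product of the marginals:

```python
from itertools import product

pr1 = {"heads": 0.5, "tails": 0.5}

# Independence: Pr2(x, y) = Pr1(x) * Pr1(y) for every (x, y) in S1 × S1.
pr2 = {(x, y): pr1[x] * pr1[y] for x, y in product(pr1, pr1)}

assert all(abs(p - 0.25) < 1e-9 for p in pr2.values())
```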
Probability Distributions: Fair Dice
Consider a (fair) six-sided die, $$ S_{d6} = \left\{1,2,3,4,5,6\right\} $$ $$ Pr_\text{fair}(x) = \frac{1}{6}, \qquad\forall x ∈ S_{d6} $$
Probability Distributions: Unfair Dice
Now consider an unfair six-sided die,
$$
Pr_\text{unfair}(x) = \begin{cases}
\frac{1}{5} & x = 1 \\
\frac{1}{10} & x ∈ \left\{2,3,4,5\right\} \\
\frac{2}{5} & x = 6
\end{cases}
$$
Questions
- Can we quantify fairness?
- Can we distinguish information from random noise?
Entropy
\begin{align*}
H(Pr) &= -∑_{x ∈ S} Pr(x) \log_2 Pr(x) \\
Pr &: S → \left[0,1\right]
\end{align*}
Define: $ 0 \log_2 0 ≡ 0 $
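A direct transcription of this formula into Python might look like the sketch below; the course materials include their own version in entropy.py, which may differ.

```python
from math import log2

def entropy(pr):
    """Shannon entropy in bits: H(Pr) = -sum over x of Pr(x) * log2(Pr(x)).
    Skipping zero-probability outcomes implements the 0 * log2(0) = 0 convention."""
    return -sum(p * log2(p) for p in pr.values() if p > 0)

# A fair coin carries exactly one bit of entropy.
assert abs(entropy({"heads": 0.5, "tails": 0.5}) - 1.0) < 1e-9
```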
Entropy
(Figure: ./img/neglog.png, a plot of the negative logarithm)
Entropy – Fair Coin
\begin{align*}
H(Pr_1)
&= -\left[ Pr_1(\mathtt{heads})\log_2 Pr_1(\mathtt{heads}) +
Pr_1(\mathtt{tails})\log_2 Pr_1(\mathtt{tails}) \right] \\
&= -\left[ \frac{1}{2}\log_2 \frac{1}{2} + \frac{1}{2}\log_2 \frac{1}{2} \right] \\
&= -2⋅\frac{1}{2} \log_2 \frac{1}{2} \\
&= -\log_2 \frac{1}{2} \\
&= 1
\end{align*}
Entropy – Two Fair Coins
\begin{align*}
S_2 = \left\{\right. & (\mathtt{heads}, \mathtt{heads}), (\mathtt{heads}, \mathtt{tails}), \\
& \left.(\mathtt{tails}, \mathtt{heads}), (\mathtt{tails}, \mathtt{tails}) \right\}
\end{align*}
\begin{align*} Pr_2(x) &= \frac{1}{4}, \qquad\forall x ∈ S_2 \end{align*}
Entropy – Two Fair Coins
\begin{align*}
H(Pr_2)
&= -\left[ Pr_2(\mathtt{HH})\log_2 Pr_2(\mathtt{HH}) +
Pr_2(\mathtt{HT})\log_2 Pr_2(\mathtt{HT}) + \right. \\
&\qquad\,\left.
Pr_2(\mathtt{TH})\log_2 Pr_2(\mathtt{TH}) +
Pr_2(\mathtt{TT})\log_2 Pr_2(\mathtt{TT}) \right] \\
&= -4⋅\frac{1}{4} \log_2 \frac{1}{4} \\
&= -\log_2 \frac{1}{4} \\
&= 2
\end{align*}
Entropy – Fair Dice
$$ H(Pr_\text{fair}) = -\log_2 \frac{1}{6} ≈ 2.58\ldots $$
Entropy – Unfair Dice
\begin{align*}
Pr_\text{unfair}(S) &= \left\{Pr_\text{unfair}(1), Pr_\text{unfair}(2), Pr_\text{unfair}(3), Pr_\text{unfair}(4), Pr_\text{unfair}(5), Pr_\text{unfair}(6) \right\} \\
&= \left\{\frac{1}{5}, \frac{1}{10}, \frac{1}{10}, \frac{1}{10}, \frac{1}{10}, \frac{2}{5} \right\}
\end{align*}
\begin{align*}
H(Pr_\text{unfair}) &= -\left[
\frac{1}{5}\log_2 \frac{1}{5} + 4⋅\frac{1}{10}\log_2 \frac{1}{10} + \frac{2}{5}\log_2 \frac{2}{5}
\right] \\
&≈ 2.32\ldots
\end{align*}
Entropy – Really Unfair Dice
What if…
$$
Pr(x) = \begin{cases}
1 & x = 1 \\
0 & \text{otherwise}
\end{cases}
$$
\begin{align*}
H(Pr) &= -1⋅\log_2 1 - 5⋅0\log_2 0 \\
&= 0
\end{align*}
Normalised Entropy
$$ \bar{H} = \frac{\text{entropy of a distribution}}{\text{entropy of the uniform distribution}} $$
Recall that the entropy of the uniform distribution is just $-\log_2 \frac{1}{|S|}$.
Notation: $|S|$ is the number of elements in the support.
Normalised Entropy
For our unfair vs fair (uniform) dice,
$$ \bar{H}(Pr_\text{unfair}) = \frac{H(Pr_\text{unfair})}{-\log_2\frac{1}{|S_{d6}|}} = \frac{2.32\ldots}{2.58\ldots} = 0.898\ldots $$
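In code, again as a sketch under the same assumptions as the earlier entropy function:

```python
from math import log2

def normalised_entropy(pr):
    """Entropy divided by the entropy of the uniform distribution
    on the same support: -log2(1/|S|) = log2(|S|)."""
    h = -sum(p * log2(p) for p in pr.values() if p > 0)
    return h / log2(len(pr))

unfair_d6 = {1: 1/5, 2: 1/10, 3: 1/10, 4: 1/10, 5: 1/10, 6: 2/5}
print(normalised_entropy(unfair_d6))  # ≈ 0.898
```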
Normalised Entropy
There is a better concept of the relative entropy of two distributions, the Kullback-Leibler divergence. You would learn about this in a course on Information Theory.
For our purposes, the normalised entropy will do.
So what about text? Choose an alphabet
$$ S_α = \mathtt{'abcdefghijklmnopqrstuvwxyz\_'} $$
(by ‘_’ we mean a space)
Some light pre-processing:
- make all letters lower case
- ignore punctuation, etc.
So what about text? Probability distribution
Choose a letter at random from a text.
What is the chance you pick e, or q, or ‘ ’ (a space)?
- Support
- all (ASCII) letters + space
- Mapping
- $$ Pr_α(x) = \frac{\mathrm{count}(x)}{\mathrm{count}(\text{all letters})} $$
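A sketch of how this distribution might be computed (letters.py in the course materials does this properly; the function name here is ours):

```python
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz "

def letter_distribution(filename):
    """Pr(x) = count(x) / count(all letters), after lower-casing the
    text and ignoring anything outside the alphabet."""
    with open(filename) as f:
        text = f.read().lower()
    counts = Counter(c for c in text if c in ALPHABET)
    total = sum(counts.values())
    return {c: counts[c] / total for c in ALPHABET}
```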
Letter probabilities
| letter | $Pr_α$ | letter | $Pr_α$ | letter | $Pr_α$ |
|--------|--------|--------|--------|--------|--------|
| a | 0.0654 | j | 0.0014 | s | 0.0537 |
| b | 0.0124 | k | 0.0043 | t | 0.0783 |
| c | 0.0214 | l | 0.0316 | u | 0.0236 |
| d | 0.0311 | m | 0.0196 | v | 0.0078 |
| e | 0.1061 | n | 0.0586 | w | 0.0177 |
| f | 0.0195 | o | 0.0633 | x | 0.0013 |
| g | 0.0144 | p | 0.0145 | y | 0.0164 |
| h | 0.0547 | q | 0.0009 | z | 0.0004 |
| i | 0.0604 | r | 0.0483 | _ | 0.1727 |
Normalised entropy of letter probabilities
$$ H(Pr_α) = 4.095\ldots $$
Maximum entropy: $-\log_2 \frac{1}{27} ≈ 4.754\ldots$
$$ \bar{H}(Pr_α) = \frac{4.095\ldots}{4.754\ldots} = 0.861\ldots $$
Pair and conditional probability
\begin{align*}
Pr(y | x) &\qquad\text{probability of }y\text{ given }x \\
Pr(x,y) &\qquad\text{probability of seeing the pair }(x,y)
\end{align*}
\begin{align*}
Pr(x,y) &= Pr(y|x) Pr(x) \\
Pr(y) &= ∑_x Pr(y|x) Pr(x)
\end{align*}
Assignment 2a - Probability and Text (10%)
Write a program named `assignment-2a` that takes a file and prints out a letter according to the distribution of letters in that file, e.g.

    xgb21195@cafe:~/cs101$ assignment-2a filename
    e
    xgb21195@cafe:~/cs101$
Assignment 2a - Probability and Text (10%)
Hint: use the Python 3 built-in `random.choices` function.

Note: `random.choices(population, weights=None, ...)`

- `population` $→$ support
- `weights` $→$ probabilities
https://docs.python.org/3/library/random.html
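For example, one way to sample a single letter from a distribution stored as a dict (a sketch; substitute the full distribution computed by your own program):

```python
import random

# A truncated example distribution; random.choices normalises the
# weights, so they need not sum to 1 here.
pr = {"e": 0.1061, "t": 0.0783, " ": 0.1727}

# choices returns a list of k=1 samples; take the single element.
letter = random.choices(list(pr), weights=list(pr.values()))[0]
print(letter)
```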
Assignment 2b - Probability and Text (10%)
Write a program named `assignment-2b` that takes a file and a letter and prints out a following letter according to the conditional distribution of letters given the previous letter in that file, e.g.

    xgb21195@cafe:~/cs101$ ./assignment-2b filename e
    r
    xgb21195@cafe:~/cs101$
Assignment 2c - Probability and Text (10%)
Write a program named `assignment-2c` that takes a filename and a number and prints out a sequence of that many characters according to the conditional distribution from 2b, e.g.

    xgb21195@cafe:~/cs101$ ./assignment-2c filename 25
    end usanve n imemas hely
    xgb21195@cafe:~/cs101$
Assignment 2
- Your programs must be named `assignment-2a`, `assignment-2b`, and `assignment-2c` in your git repository
- You can write your programs in whatever language you wish. If you use a compiled language, you must include a binary that runs on the CIS linux machines, together with the source code and a Makefile to build it.
- You must make sure the programs run on the CIS linux machines; if they do not, you will receive no marks, no exceptions.
Assignment 2 bonus (10%)
Write a program like 2c that, instead of using the probability conditional on one previous letter, conditions on the previous two letters.
Call this program `assignment-2d`.
Materials
https://gitlab.cis.strath.ac.uk/xgb21195/cs101-csai
Look in the `lec2` subdirectory.
- `letters.py`: definition of “letters” and functions to get distributions of letters from a file
- `entropy.py`: implementation of the entropy function and an example of using it on a file (for letters)
- `example.py`: examples of using this function
- `republic.txt`: Plato’s Republic from Project Gutenberg
Reading for next week
The Game, Anatoly Dneprov, 1961
http://q-bits.org/images/Dneprov.pdf