
CS101 - Cognitive Science and Artificial Intelligence - Lecture 2


Today

  • Recap of assignment 1 - git
  • Probability and text

Assignment 1

  • 74 students have attempted to submit the assignment
  • of these, 72 have actually submitted the assignment

This is only slightly more than half the class!

Marking will begin this week.

If you are having trouble, ask a demonstrator for help.

If you do not do this assignment, you will not be able to submit any of the other assignments and you will get 0 for this module.

Probability Distributions: Definition

Support
a set $S$, $$ S = \left\{ x_1, x_2, \ldots, x_n \right\} $$
Probability map
a mapping $$ Pr: S → \left[0,1\right] $$
Condition
$$ ∑_{x ∈ S} Pr(x) = 1 $$
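In Python 3, for instance, a finite distribution can be represented as a dict from outcomes to probabilities (a minimal sketch, using the fair coin of the next slide; the representation is our choice, not part of the definition):

# A distribution: support -> probability.
pr_coin = {"heads": 1/2, "tails": 1/2}

# The condition: the probabilities must sum to 1.
assert abs(sum(pr_coin.values()) - 1.0) < 1e-9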

Probability Distributions: a Fair Coin

Consider a fair coin$\ldots$

$$ S_1 = \left\{ \mathtt{heads}, \mathtt{tails} \right\} $$

$$ Pr_1(x) = \begin{cases} \frac{1}{2} & x \text{ is } \mathtt{heads} \\ \frac{1}{2} & x \text{ is } \mathtt{tails} \end{cases} $$

$$ Pr_1(\mathtt{heads}) + Pr_1(\mathtt{tails}) = \frac{1}{2} + \frac{1}{2} = 1 $$

Probability Distributions: Two Fair Coins

\begin{align*} S_2 = S_1 × S_1 = \{& (\mathtt{heads}, \mathtt{heads}), (\mathtt{heads}, \mathtt{tails}), \\ &(\mathtt{tails}, \mathtt{heads}), (\mathtt{tails}, \mathtt{tails}) \} \end{align*}

\begin{align*} Pr_2(x,y) &= Pr_1(x)Pr_1(y), \qquad\forall (x,y) ∈ S_2 \\ &= \frac{1}{2}⋅\frac{1}{2} = \frac{1}{4} \end{align*}

Assumption: these two coin tosses are independent

Probability Distributions: Fair Dice

Consider a (fair) six-sided die, $$ S_{d6} = \left\{1,2,3,4,5,6\right\} $$ $$ Pr_\text{fair}(x) = \frac{1}{6}, \qquad\forall x ∈ S_{d6} $$

Probability Distributions: Unfair Dice

Now consider an unfair six-sided die,

$$ Pr_\text{unfair}(x) = \begin{cases} \frac{1}{5} & x = 1 \\ \frac{1}{10} & x ∈ \left\{2,3,4,5\right\} \\ \frac{2}{5} & x = 6 \end{cases} $$
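As a quick check that $Pr_\text{unfair}$ satisfies the condition:

$$ ∑_{x ∈ S_{d6}} Pr_\text{unfair}(x) = \frac{1}{5} + 4 ⋅ \frac{1}{10} + \frac{2}{5} = \frac{2 + 4 + 4}{10} = 1 $$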

Questions

  • Can we quantify fairness?
  • Can we distinguish information from random noise?

Entropy

\begin{align*} H(Pr) &= -∑_{x ∈ S} Pr(x) \log_2 Pr(x) \\ Pr &: S → \left[0,1\right] \end{align*}

Define: $ 0 \log_2 0 ≡ 0 $
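entropy.py in the course materials implements this; a minimal sketch, assuming the distribution is a dict of probabilities as above, might be:

from math import log2

def entropy(pr):
    """Entropy in bits of a distribution {outcome: probability}.
    Skipping zero-probability outcomes realises the 0 log2 0 = 0 convention."""
    return -sum(p * log2(p) for p in pr.values() if p > 0)

entropy({"heads": 1/2, "tails": 1/2})  # 1.0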

Entropy

[Figure: plot of $-\log_2 x$ (img/neglog.png)]

Entropy – Fair Coin

\begin{align*} H(Pr_1) &= -\left[ Pr_1(\mathtt{heads})\log_2 Pr_1(\mathtt{heads}) + Pr_1(\mathtt{tails})\log_2 Pr_1(\mathtt{tails}) \right] \\ &= -\left[ \frac{1}{2}\log_2 \frac{1}{2} + \frac{1}{2}\log_2 \frac{1}{2}\right] \\ &= -2 ⋅ \frac{1}{2} \log_2 \frac{1}{2} \\ &= -\log_2 \frac{1}{2} \\ &= 1 \end{align*}

Entropy – Two Fair Coins

\begin{align*} S_2 = \{& (\mathtt{heads}, \mathtt{heads}), (\mathtt{heads}, \mathtt{tails}), \\ & (\mathtt{tails}, \mathtt{heads}), (\mathtt{tails}, \mathtt{tails}) \} \end{align*}

\begin{align*} Pr_2(x) &= \frac{1}{4}, \qquad\forall x ∈ S_2 \end{align*}

Entropy – Two Fair Coins

\begin{align*} H(Pr_2) &= -\left[ Pr_2(\mathtt{HH})\log_2 Pr_2(\mathtt{HH}) + Pr_2(\mathtt{HT})\log_2 Pr_2(\mathtt{HT}) +\right. \\ &\qquad\,\left. Pr_2(\mathtt{TH})\log_2 Pr_2(\mathtt{TH}) + Pr_2(\mathtt{TT})\log_2 Pr_2(\mathtt{TT})\right] \\ &= -4 ⋅ \frac{1}{4} \log_2 \frac{1}{4} \\ &= -\log_2 \frac{1}{4} \\ &= 2 \end{align*}

Entropy – Fair Dice

$$ H(Pr_\text{fair}) = -\log_2 \frac{1}{6} ≈ 2.58\ldots $$

Entropy – Unfair Dice

\begin{align*} Pr_\text{unfair}(S) &= \left\{Pr_\text{unfair}(1), Pr_\text{unfair}(2), Pr_\text{unfair}(3), Pr_\text{unfair}(4), Pr_\text{unfair}(5), Pr_\text{unfair}(6) \right\} \\ &= \left\{\frac{1}{5}, \frac{1}{10}, \frac{1}{10}, \frac{1}{10}, \frac{1}{10}, \frac{2}{5} \right\} \end{align*}

\begin{align*} H(Pr_\text{unfair}) &= -\left[ \frac{1}{5}\log_2 \frac{1}{5} + 4 ⋅ \frac{1}{10}\log_2 \frac{1}{10} + \frac{2}{5}\log_2 \frac{2}{5} \right] \\ &≈ 2.32\ldots \end{align*}

Entropy – Really Unfair Dice

What if… $$ Pr(x) = \begin{cases} 1 & x = 1 \\ 0 & \text{otherwise} \end{cases} $$

\begin{align*} H(Pr) &= -1 ⋅ \log_2 1 - 5 ⋅ 0 \log_2 0 \\ &= 0 \end{align*}

Normalised Entropy

$$ \bar{H} = \frac{\text{entropy of a distribution}}{\text{entropy of the uniform distribution}} $$

Recall that the entropy of the uniform distribution is just $-\log_2 \frac{1}{|S|}$, since each of the $|S|$ outcomes has probability $\frac{1}{|S|}$

Notation – $|S|$ is the number of elements in the support
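In code, reusing the entropy sketch from earlier (names are illustrative; the dict's keys are assumed to be exactly the support):

from math import log2

def normalised_entropy(pr):
    """Entropy divided by the entropy of the uniform distribution
    on the same support; note log2(|S|) = -log2(1/|S|)."""
    return entropy(pr) / log2(len(pr))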

Normalised Entropy

For our unfair die versus the fair (uniform) die,

$$ \bar{H}(Pr_\text{unfair}) = \frac{H(Pr_\text{unfair})}{-\log_2\frac{1}{|S_{d6}|}} = \frac{2.32\ldots}{2.58\ldots} = 0.898\ldots $$

Normalised Entropy

There is a better notion of the relative entropy of two distributions, the Kullback–Leibler divergence; you would learn about it in a course on Information Theory.

For our purposes, the normalised entropy will do.

So what about text? Choose an alphabet

$$ S_α = \mathtt{‘abcdefghijklmnopqrstuvwxyz\_’} $$

(by ‘_’ we mean a space)

Some light pre-processing:

  • make all letters lower case
  • ignore punctuation, etc.

So what about text? Probability distribution

Choose a letter at random from a text.

What is the chance you pick e, or q, or ' ' (a space)?

Support
all (ASCII) letters plus space
Mapping
$$ Pr_α(x) = \frac{\mathrm{count}(x)}{\mathrm{count}(\text{all letters})} $$
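letters.py in the course materials does this properly; a rough sketch of the idea (the function name here is illustrative) might be:

from collections import Counter

LETTERS = "abcdefghijklmnopqrstuvwxyz "

def letter_distribution(filename):
    """Estimate Pr_alpha from a text file: lower-case the text,
    keep only letters and space, and turn counts into probabilities."""
    with open(filename) as f:
        text = f.read().lower()
    counts = Counter(c for c in text if c in LETTERS)
    total = sum(counts.values())
    return {c: counts[c] / total for c in LETTERS}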

Letter probabilities

a 0.0654   b 0.0124   c 0.0214   d 0.0311   e 0.1061
f 0.0195   g 0.0144   h 0.0547   i 0.0604   j 0.0014
k 0.0043   l 0.0316   m 0.0196   n 0.0586   o 0.0633
p 0.0145   q 0.0009   r 0.0483   s 0.0537   t 0.0783
u 0.0236   v 0.0078   w 0.0177   x 0.0013   y 0.0164
z 0.0004   _ 0.1727

Normalised entropy of letter probabilities

$$ H(Pr_α) = 4.095\ldots $$

Maximum entropy, $-\log_2 \frac{1}{27} ≈ 4.754\ldots$

$$ \bar{H}(Pr_α) = \frac{4.095\ldots}{4.754\ldots} = 0.861\ldots $$

Pair and conditional probability

\begin{align*} Pr(y | x) &\qquad\text{probability of }y\text{ given }x \\ Pr(x,y) &\qquad\text{probability of seeing the pair }(x,y) \end{align*}

\begin{align*} Pr(x,y) &= Pr(y|x) Pr(x) \\ Pr(y) &= ∑_x Pr(y|x) Pr(x) \end{align*}
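A sketch of estimating $Pr(y|x)$ from a pre-processed text, by counting which characters follow $x$ (illustrative names; assumes $x$ occurs at least once):

from collections import Counter

def conditional_distribution(text, x):
    """Estimate Pr(y | x): among adjacent pairs (x, y) in the text,
    the fraction whose second character is y."""
    followers = Counter(b for a, b in zip(text, text[1:]) if a == x)
    total = sum(followers.values())  # occurrences of x followed by anything
    return {y: n / total for y, n in followers.items()}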

Assignment 2a - Probability and Text (10%)

Write a program named assignment-2a that takes a filename and prints out a letter according to the distribution of letters in that file, e.g.

xgb21195@cafe:~/cs101$ assignment-2a filename
e
xgb21195@cafe:~/cs101$

Assignment 2a - Probability and Text (10%)

Hint: use the Python 3 built-in random.choices function.

Note:

random.choices(population, weights=None, ...)
  • population $→$ support
  • weights $→$ probabilities

https://docs.python.org/3/library/random.html
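For instance, combined with the letter-distribution sketch from earlier (illustrative names; republic.txt is in the lec2 materials):

import random

pr = letter_distribution("republic.txt")  # sketch from earlier
letter = random.choices(list(pr), weights=list(pr.values()))[0]
print(letter)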

Assignment 2b - Probability and Text (10%)

Write a program named assignment-2b that takes a filename and a letter, and prints out a following letter according to the conditional distribution of letters given the previous letter in that file, e.g.

xgb21195@cafe:~/cs101$ ./assignment-2b filename e
r
xgb21195@cafe:~/cs101$

Assignment 2c - Probability and Text (10%)

Write a program named assignment-2c that takes a filename and a number, and prints out a sequence of characters of that length according to the conditional distribution from 2b, e.g.

xgb21195@cafe:~/cs101$ ./assignment-2c filename 25
end usanve n imemas hely
xgb21195@cafe:~/cs101$

Assignment 2

  • Your programs must be named assignment-2a, assignment-2b, and assignment-2c in your git repository
  • You can write your programs in whatever language you wish. If you use a compiled language, you must include a binary that runs on the CIS Linux machines, along with the source code and a Makefile to build it.
  • You must make sure the programs run on the CIS Linux machines; if they do not, you will receive no marks, no exceptions.

Assignment 2 bonus (10%)

Write a program like 2c that, instead of using a probability conditional on one previous letter, conditions on the previous two letters.

Call this program assignment-2d

Materials

https://gitlab.cis.strath.ac.uk/xgb21195/cs101-csai

Look in the lec2 subdirectory.

letters.py
definition of "letters" and functions to get distributions of letters from a file
entropy.py
implementation of the entropy function and an example of using it on a file (for letters)
example.py
examples of using these functions
republic.txt
Plato’s Republic from Project Gutenberg

Reading for next week

The Game, Anatoly Dneprov, 1961

http://q-bits.org/images/Dneprov.pdf