#+title: CS101 - Cognitive Science and Artificial Intelligence - Lecture 2
#+startup: beamer
#+latex_class: beamer
#+latex_class_options: [14pt,aspectratio=169,hyperref={pdfpagelabels=false},svgnames]
#+latex_header_extra: \usetheme{strath}
#+latex_header_extra: \usepackage{tikz}
#+latex_header_extra: \usepackage{pgfplots}
#+latex_header_extra: \usepackage{filecontents}
#+options: toc:nil

* CS101 - Cog. Sci. and AI

** Today

- Recap of assignment 1 - git
- Probability and text

* Assignment 1

- 74 students have attempted to submit the assignment
- of these, 72 have actually submitted the assignment

*This is only slightly more than half the class!*

Marking will begin this week.

If you are having trouble, *ask a demonstrator for help*.

If you do not do this assignment, you will not be able to submit any
of the other assignments and you will get *0* for this module.

* Probability Distributions: Definition

- Support :: a set $S$,
  $$
  S = \left\{ x_1, x_2, \ldots, x_n \right\}
  $$
- Probability map :: a mapping
  $$
  \Pr: S \rightarrow \left[0,1\right]
  $$
- Condition ::
  $$
  \sum_{x \in S} \Pr(x) = 1
  $$

* Probability Distributions: a Fair Coin

Consider a fair coin$\ldots$

$$
S_1 = \left\{ \mathtt{heads}, \mathtt{tails} \right\}
$$

$$
\Pr_1(x) = \begin{cases}
\frac{1}{2} & x = \mathtt{heads}\\
\frac{1}{2} & x = \mathtt{tails}
\end{cases}
$$

$$
\Pr_1(\mathtt{heads}) + \Pr_1(\mathtt{tails}) = \frac{1}{2} + \frac{1}{2} = 1
$$
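
A minimal Python sketch of representing such a distribution in code (the dict-of-probabilities representation is just an illustrative choice):

#+begin_src python
from math import isclose

# A distribution as a dict from outcomes (the support S) to probabilities.
pr1 = {"heads": 0.5, "tails": 0.5}

assert all(0.0 <= p <= 1.0 for p in pr1.values())  # Pr : S -> [0, 1]
assert isclose(sum(pr1.values()), 1.0)             # condition: probabilities sum to 1
#+end_src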

* Probability Distributions: Two Fair Coins
\begin{align*}
S_2 = S_1 \times S_1 = \left\{&
(\mathtt{heads}, \mathtt{heads}), (\mathtt{heads}, \mathtt{tails}),\right.\\
&\left.(\mathtt{tails}, \mathtt{heads}), (\mathtt{tails}, \mathtt{tails})
\right\}
\end{align*}

\begin{align*}
\Pr_2(x,y) &= \Pr_1(x)\Pr_1(y),\qquad\forall (x,y) \in S_2\\
&= \frac{1}{2}\cdot\frac{1}{2} = \frac{1}{4}
\end{align*}

*Assumption:* these two coin tosses are /independent/

* Probability Distributions: Fair Dice

Consider a (fair) six-sided die,
$$
S_{d6} = \left\{1,2,3,4,5,6\right\}
$$
$$
\Pr_\text{fair}(x) = \frac{1}{6}, \qquad\forall x \in S_{d6}
$$

* Probability Distributions: Unfair Dice

Now consider an /unfair/ six-sided die,

$$
\Pr_\text{unfair}(x) = \begin{cases}
\frac{1}{5} & x = 1\\
\frac{1}{10} & x \in \left\{2,3,4,5\right\}\\
\frac{2}{5} & x = 6
\end{cases}
$$

* Questions

- Can we quantify /fairness/?

- Can we distinguish /information/ from random noise?
  
* Entropy

\begin{align*}
H(\Pr) &= -\sum_{x\in S} \Pr(x) \log_2 \Pr(x)\\
\Pr &: S \rightarrow \left[0,1\right]
\end{align*}

Define: $0 \log_2 0 \equiv 0$
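
A minimal Python sketch of this definition (=entropy.py= in the lec2 materials has the course's own version; this one just assumes the distribution is given as a dict):

#+begin_src python
from math import log2

def entropy(pr):
    """Shannon entropy in bits of a distribution {outcome: probability},
    using the convention 0 * log2(0) = 0 by skipping zero-probability outcomes."""
    return -sum(p * log2(p) for p in pr.values() if p > 0)

print(entropy({"heads": 0.5, "tails": 0.5}))  # 1.0
#+end_src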

* Entropy

\begin{center}
\vspace{-2\baselineskip}
\hspace{2ex}\includegraphics[height=\textheight]{./img/neglog.png}
\end{center}

* Entropy -- Fair Coin

\begin{align*}
H(\Pr_1)
&= -\left[ \Pr_1(\mathtt{heads})\log_2 \Pr_1(\mathtt{heads}) +
           \Pr_1(\mathtt{tails})\log_2 \Pr_1(\mathtt{tails}) \right]\\
&= -\left[ \frac{1}{2}\log_2 \frac{1}{2} + \frac{1}{2}\log_2 \frac{1}{2}\right]\\
&= -2 \cdot \frac{1}{2} \log_2 \frac{1}{2}\\
&= -\log_2 \frac{1}{2}\\
&= 1
\end{align*}

* Entropy -- Two Fair Coins
\begin{align*}
S_2 = \left\{&
(\mathtt{heads}, \mathtt{heads}), (\mathtt{heads}, \mathtt{tails}),\right.\\
&\left. (\mathtt{tails}, \mathtt{heads}), (\mathtt{tails}, \mathtt{tails})
\right\}
\end{align*}

\begin{align*}
\Pr_2(x) &= \frac{1}{4}, \qquad\forall x \in S_2
\end{align*}

* Entropy -- Two Fair Coins
\begin{align*}
H(\Pr_2)
&= -\left[ \Pr_2(\mathtt{HH})\log_2 \Pr_2(\mathtt{HH}) +
           \Pr_2(\mathtt{HT})\log_2 \Pr_2(\mathtt{HT}) +\right.\\
&\qquad\,\left.
           \Pr_2(\mathtt{TH})\log_2 \Pr_2(\mathtt{TH}) +
           \Pr_2(\mathtt{TT})\log_2 \Pr_2(\mathtt{TT})\right]\\
&= -4 \cdot \frac{1}{4} \log_2 \frac{1}{4}\\
&= -\log_2 \frac{1}{4}\\
&= 2
\end{align*}

* Entropy -- Fair Dice

$$
H(\Pr_\text{fair}) = -\log_2 \frac{1}{6} \approx 2.58\ldots
$$

* Entropy -- Unfair Dice
\begin{align*}
\Pr_\text{unfair}(S) &= \left\{\Pr_\text{unfair}(1), \Pr_\text{unfair}(2), \Pr_\text{unfair}(3), \Pr_\text{unfair}(4), \Pr_\text{unfair}(5), \Pr_\text{unfair}(6) \right\}\\
      &= \left\{\frac{1}{5}, \frac{1}{10}, \frac{1}{10}, \frac{1}{10}, \frac{1}{10}, \frac{2}{5} \right\}
\end{align*}

* Entropy -- Unfair Dice
\begin{align*}
\Pr_\text{unfair}(S) &= \left\{\Pr_\text{unfair}(1), \Pr_\text{unfair}(2), \Pr_\text{unfair}(3), \Pr_\text{unfair}(4), \Pr_\text{unfair}(5), \Pr_\text{unfair}(6) \right\}\\
      &= \left\{\frac{1}{5}, \frac{1}{10}, \frac{1}{10}, \frac{1}{10}, \frac{1}{10}, \frac{2}{5} \right\}
\end{align*}

\begin{align*}
H(\Pr_\text{unfair}) &= -\left[
\frac{1}{5}\log_2 \frac{1}{5} + 4 \cdot \frac{1}{10}\log_2 \frac{1}{10} + \frac{2}{5}\log_2 \frac{2}{5}
\right]\\
&\approx 2.32\ldots
\end{align*}

* Entropy -- Really Unfair Dice
What if...
$$
\Pr(x) = \begin{cases}
1 & x = 1\\
0 & \text{otherwise}
\end{cases}
$$

\begin{align*}
H(\Pr) &= -1 \log_2 1 - 5 \cdot 0 \log_2 0\\
      &= 0
\end{align*}

* Normalised Entropy

$$
\bar{H} = \frac{\text{entropy of a distribution}}{\text{entropy of the uniform distribution}}
$$


* Normalised Entropy

$$
\bar{H} = \frac{\text{entropy of a distribution}}{\text{entropy of the uniform distribution}}
$$

Recall that the entropy of the uniform distribution is just $-\log_2 \frac{1}{|S|}$

Notation -- $|S|$ is the number of elements in the support
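
A minimal Python sketch (note that $-\log_2 \frac{1}{|S|} = \log_2 |S|$; the value printed below matches the unfair die on the next slides):

#+begin_src python
from math import log2

def normalised_entropy(pr):
    """Entropy of pr divided by the entropy of the uniform distribution
    on the same support, i.e. log2(|S|); assumes pr lists every outcome."""
    h = -sum(p * log2(p) for p in pr.values() if p > 0)
    return h / log2(len(pr))

print(normalised_entropy({1: 1/5, 2: 1/10, 3: 1/10, 4: 1/10, 5: 1/10, 6: 2/5}))  # ~0.898
#+end_src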

* Normalised Entropy

For our unfair vs fair (uniform) dice,

$$
\bar{H}(\Pr_\text{unfair}) = \frac{H(\Pr_\text{unfair})}{-\log_2\frac{1}{|S_{d6}|}} = \frac{2.32\ldots}{2.58\ldots} = 0.898\ldots
$$

* Normalised Entropy

There is a better notion of the relative entropy of two distributions, the
*Kullback-Leibler divergence*. You would learn about this in a course on
Information Theory.

For our purposes, the normalised entropy will do.

* So what about text? Choose an alphabet

$$
S_\alpha = \mathtt{'abcdefghijklmnopqrstuvwxyz\_'}
$$

(by '_' we mean a space)

Some light pre-processing:
- make all letters lower case
- ignore punctuation, etc.

* So what about text? Probability distribution

Choose a letter at random from a text.

What is the chance you pick =e=, =q=, or =' '= (space)?

- Support :: all (ascii) letters + space
- Mapping ::
  $$
  \Pr_\alpha(x) = \frac{\mathrm{count}(x)}{\mathrm{count}(\text{all letters})}
  $$
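
A minimal sketch of estimating this distribution from a file (=letters.py= in the lec2 materials has the course's own version; the names here are only illustrative):

#+begin_src python
from collections import Counter

LETTERS = "abcdefghijklmnopqrstuvwxyz "  # the 27-symbol support; '_' written as a real space

def letter_distribution(filename):
    """Lower-case the text, keep only the 27 symbols above, normalise the counts."""
    with open(filename) as f:
        text = f.read().lower()
    counts = Counter(c for c in text if c in LETTERS)
    total = sum(counts.values())
    return {c: counts[c] / total for c in LETTERS}
#+end_src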


* Letter probabilities

#+begin_src
a 0.0654   b 0.0124   c 0.0214   d 0.0311   e 0.1061
f 0.0195   g 0.0144   h 0.0547   i 0.0604   j 0.0014
k 0.0043   l 0.0316   m 0.0196   n 0.0586   o 0.0633
p 0.0145   q 0.0009   r 0.0483   s 0.0537   t 0.0783
u 0.0236   v 0.0078   w 0.0177   x 0.0013   y 0.0164
z 0.0004   _ 0.1727
#+end_src

* Normalised entropy of letter probabilities

$$
H(\Pr_\alpha) = 4.095\ldots
$$

Maximum entropy, $-\log_2 \frac{1}{27} = \log_2 27 \approx 4.754\ldots$

$$
\bar{H}(\Pr_\alpha) = \frac{4.095\ldots}{4.754\ldots} = 0.861\ldots
$$

* Pair and conditional probability

\begin{align*}
\Pr(y | x) &\qquad\text{probability of }y\text{ given }x\\
\Pr(x,y) &\qquad\text{probability of seeing the pair }(x,y)
\end{align*}

\begin{align*}
\Pr(x,y) &= \Pr(y|x) \Pr(x)\\
\Pr(y) &= \sum_x \Pr(y|x) \Pr(x)
\end{align*}
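
A minimal sketch checking both identities for the two independent fair coins from earlier (independence means $\Pr(y|x) = \Pr_1(y)$):

#+begin_src python
pr1 = {"heads": 0.5, "tails": 0.5}

# Independence: Pr(y | x) = Pr1(y) for every previous outcome x.
pr_cond = {x: dict(pr1) for x in pr1}

# Pr(x, y) = Pr(y | x) Pr(x): every pair gets probability 0.25.
pr2 = {(x, y): pr_cond[x][y] * pr1[x] for x in pr1 for y in pr1}

# Pr(y) = sum over x of Pr(y | x) Pr(x): recovers the single-coin distribution.
pr_y = {y: sum(pr_cond[x][y] * pr1[x] for x in pr1) for y in pr1}

print(pr2)   # every pair -> 0.25
print(pr_y)  # {'heads': 0.5, 'tails': 0.5}
#+end_src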

* Assignment 2a - Probability and Text (10%)

Write a program named =assignment-2a= that takes a filename and
prints out a letter according to the distribution of letters in
that file, e.g.

#+begin_src
xgb21195@cafe:~/cs101$ assignment-2a filename
e
xgb21195@cafe:~/cs101$
#+end_src

* Assignment 2a - Probability and Text (10%)

/hint: use the Python 3 built-in/ =random.choices= /function/

Note:

#+begin_src
random.choices(population, weights=None, ...)
#+end_src

- =population= $\rightarrow$ support
- =weights= $\rightarrow$ probabilities
  
https://docs.python.org/3/library/random.html
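
A minimal usage sketch (the support and weights here are placeholders, not the real letter distribution):

#+begin_src python
import random

support = ["a", "b", "c"]  # placeholder support
probs = [0.5, 0.3, 0.2]    # placeholder probabilities, aligned with the support

# random.choices returns a list, even for a single draw (k=1).
letter = random.choices(support, weights=probs, k=1)[0]
print(letter)
#+end_src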

* Assignment 2b - Probability and Text (10%)

Write a program named =assignment-2b= that takes a filename and a
letter and prints out a following letter according to the
/conditional/ distribution of letters given the previous letter in
that file, e.g.

#+begin_src
xgb21195@cafe:~/cs101$ ./assignment-2b filename e
r
xgb21195@cafe:~/cs101$
#+end_src

* Assignment 2c - Probability and Text (10%)

Write a program named =assignment-2c= that takes a filename and a
number and prints out a sequence of characters according to the
conditional distribution from 2b, e.g.

#+begin_src
xgb21195@cafe:~/cs101$ ./assignment-2c filename 25
end usanve n imemas hely
xgb21195@cafe:~/cs101$
#+end_src

* Assignment 2

- Your programs *must* be named =assignment-2a=, =assignment-2b=, and
  =assignment-2c= in your git repository
- You can write your programs in whatever language you wish. If you
  use a compiled language, you must include a binary that runs on the
  CIS linux machines, along with the source code and a Makefile to
  build it.
- You *must* make sure the programs run on the CIS linux machines; if
  they do not, you will receive no marks, no exceptions.

* Assignment 2 bonus (10%)

Write a program like 2c that, instead of using the probability
conditional on /one/ previous letter, conditions on the previous /two/
letters.

Call this program =assignment-2d=

* Materials

https://gitlab.cis.strath.ac.uk/xgb21195/cs101-csai

look in the =lec2= subdirectory.

- letters.py :: definition of "letters" and functions to get
  distributions of letters from a file
- entropy.py :: implementation of entropy function and example of
  using it on a file (for letters)
- example.py :: examples of using these functions
- republic.txt :: Plato's Republic from Project Gutenberg
  
* Reading for next week

The Game, Anatoly Dneprov, 1961

http://q-bits.org/images/Dneprov.pdf