## Belief nets

Continued from previous class

Event diagrams must always be arranged so that they have final nodes and contain no loops. A probability table is recorded for each event; the tables are filled in by repeating the experiment and tallying how often each event occurs, which gives an estimate of its probability.

### Bayesian inference

Several models can be drawn for a given set of events. To find out which model is right, the Bayesian probability formulas can be used to check whether events are independent, to simplify the computations, and to choose the more appropriate model.

• P(a/b) = P(a,b) / P(b)
• P(a/b) P(b) = P(a,b) = P(b/a) P(a)
• P(a/b) = P(b/a) P(a) / P(b)

Defining a as a class and b as the evidence, the probability of the class given the evidence can be obtained through these formulas.

P(class/evidence) = P(evidence/class) P(class) / P(evidence)
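As a small illustration of the formula above, here is a minimal Python sketch. The function name `posterior` and all the numbers are hypothetical, chosen only to show the arithmetic; P(evidence) is computed by summing over the class and its complement.

```python
def posterior(p_evidence_given_class, p_class, p_evidence):
    """Bayes' rule: P(class/evidence) = P(evidence/class) P(class) / P(evidence)."""
    if p_evidence <= 0:
        raise ValueError("P(evidence) must be positive")
    return p_evidence_given_class * p_class / p_evidence


# Hypothetical numbers, for illustration only.
p_class = 0.01            # P(class): prior probability of the class
p_e_given_c = 0.9         # P(evidence/class)
p_e_given_not_c = 0.05    # P(evidence/not class)

# P(evidence), summed over the two cases "class" and "not class".
p_evidence = p_e_given_c * p_class + p_e_given_not_c * (1 - p_class)

p_class_given_e = posterior(p_e_given_c, p_class, p_evidence)
print(f"P(evidence) = {p_evidence:.4f}")
print(f"P(class/evidence) = {p_class_given_e:.4f}")
```

With these made-up numbers the class remains fairly unlikely even after the evidence is seen, because the prior P(class) is small.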

Using the evidence from experience, classes can be inferred by analyzing the results and the corresponding probabilities.

### Structure discovery

Given the data from experience or simulation, the right model can be selected as the one that better corresponds to the observed probabilities. This makes it possible to choose between two existing models.

However, if many models can be created, the volume of data makes it impossible to compare them all. The solution is to keep two models and compare them iteratively: at each trial, the losing model is modified in an attempt to improve it, until a model meets the chosen criteria for success.

A trick is to use the sum of the logarithms of the probabilities rather than their product, as a large number of trials makes the product too small to compute properly.

To avoid local maxima, a radical rearrangement of the structure is triggered after a certain number of trials.

### Applications

This Bayesian structure discovery works quite well in situations where a diagnosis must be made: medical diagnosis, lie detection, symptoms of a malfunctioning aircraft or program…

## Probabilities in Artificial Intelligence

With a joint probability table, recording the tally of joint event occurrences allows us to measure the probability of each event, conditional and unconditional probabilities, the independence of events, etc.

The problem with such a table is that as the number of variables increases, the number of rows grows exponentially.
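A minimal sketch of such a tally, assuming two binary events a and b and a handful of made-up observations. The helper names `p` and `joint_table_rows` are hypothetical; the second one only illustrates the exponential growth for binary variables.

```python
from collections import Counter

# Hypothetical observations of two binary events (a, b).
observations = [
    (True, True), (True, False), (False, False), (True, True),
    (False, True), (False, False), (True, True), (False, False),
]

# Joint table: tally of each combination of outcomes.
tally = Counter(observations)
total = sum(tally.values())


def p(predicate):
    """Probability of the outcomes (a, b) for which predicate(a, b) is true."""
    return sum(n for outcome, n in tally.items() if predicate(*outcome)) / total


p_a = p(lambda a, b: a)            # unconditional P(a)
p_b = p(lambda a, b: b)            # unconditional P(b)
p_ab = p(lambda a, b: a and b)     # joint P(a,b)
p_a_given_b = p_ab / p_b           # conditional P(a/b) = P(a,b) / P(b)

# a and b are independent only if P(a,b) equals P(a) P(b).
independent = abs(p_ab - p_a * p_b) < 1e-9


def joint_table_rows(num_binary_variables):
    """Number of rows in a joint table over binary variables."""
    return 2 ** num_binary_variables


print(p_a, p_b, p_ab, p_a_given_b, independent)
print([joint_table_rows(n) for n in (2, 10, 20)])
```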

## Reminder of probability formulas

### Basic axioms of probability

• 0 ≤ P(a) ≤ 1
• P(True) = 1 ; P(False) = 0
• P(a+b) = P(a) + P(b) – P(a,b)

### Basic definitions of probability

• P(a/b) = P(a,b) / P(b)
• P(a,b) = P(a/b) P(b)
• P(a,b,c) = P(a/b,c) P(b,c) = P(a/b,c) P(b/c) P(c)

### Chain rule of probability

By generalizing the previous formula, we obtain the following chain rule:

• P(x1, x2, …, xn) = P(x1/x2, …, xn) P(x2/x3, …, xn) … P(xn-1/xn) P(xn)

### Independence

#### Independent events

P(a/b) = P(a) if a and b are independent

#### Conditional independence

If a and b are conditionally independent given z

• P(a/b,z) = P(a/z)
• P(a,b/z) = P(a/z) P(b/z)

## Belief nets

Causal relations between events can be represented in nets. These models highlight that, given its parents, any event is independent of every event that is not one of its descendants. Because probabilities are recorded at each node, the number of tables and rows is significantly smaller than in a general table tallying all events.
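To make the size comparison concrete, here is a rough sketch that counts table rows for a small net of binary events. The net itself (node names and parent lists) is hypothetical and only serves as an example; each node's table needs one row per combination of its parents' values.

```python
# Hypothetical belief net over binary events: each node maps to its parents.
parents = {
    "burglar": [],
    "raccoon": [],
    "dog_barks": ["burglar", "raccoon"],
    "police_called": ["dog_barks"],
    "trash_can": ["raccoon"],
}


def rows_per_node(net):
    """Rows in each node's table: one per combination of parent values."""
    return {node: 2 ** len(node_parents) for node, node_parents in net.items()}


node_rows = rows_per_node(parents)
net_rows = sum(node_rows.values())   # total rows across all node tables
joint_rows = 2 ** len(parents)       # rows in the full joint table

print(node_rows)
print("belief net rows:", net_rows, "joint table rows:", joint_rows)
```

For this five-node example the net needs 10 rows in total against 32 for the joint table, and the gap widens quickly as more variables are added, provided each node keeps only a few parents.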