This section describes how data (i.e., a set of cases) can be used to "learn" the graphical structure of a Bayesian network. This is known as structural learning.
A case is an assignment of values to some or all of the nodes of a domain. If values have been assigned to all nodes, the case is said to be complete; otherwise, it is said to be incomplete. The data used by the learning algorithm consists of a set of cases. The cases are numbered sequentially from 0 to N-1, where N is the total number of cases.
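As an illustration, here is a minimal C sketch of entering a small data set, assuming the HUGIN C API case-data functions h_domain_new_case, h_node_set_case_state, and h_node_unset_case behave as documented (the data values themselves are hypothetical):

```c
#include "hugin.h"

/* Enter three cases for two discrete nodes, a and b (hypothetical
   data).  h_node_unset_case marks a value as missing, which makes
   the case incomplete. */
static void enter_cases (h_domain_t d, h_node_t a, h_node_t b)
{
    h_index_t c;

    c = h_domain_new_case (d);        /* case 0: complete */
    h_node_set_case_state (a, c, 0);
    h_node_set_case_state (b, c, 1);

    c = h_domain_new_case (d);        /* case 1: complete */
    h_node_set_case_state (a, c, 1);
    h_node_set_case_state (b, c, 1);

    c = h_domain_new_case (d);        /* case 2: incomplete (b missing) */
    h_node_set_case_state (a, c, 0);
    h_node_unset_case (b, c);
}
```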
The algorithm used by HUGIN for learning the network structure is the PC algorithm, developed by Spirtes and Glymour. A similar algorithm (the IC algorithm) was independently developed by Verma and Pearl. The current implementation is limited to domains of discrete chance nodes, and no expert knowledge (e.g., which edges should be included/excluded, directions of edges, etc.) is taken into account.
An outline of the algorithm is as follows:

(1) Statistical tests for conditional independence are performed on the case data to determine the dependencies between pairs of variables.
(2) From these dependencies, the skeleton (the undirected graph underlying the network structure) is constructed.
(3) Colliders (v-structures) are identified, and the edges involved are directed accordingly.
(4) Directions that follow from the skeleton and the colliders already found are enforced.
(5) The remaining undirected edges are directed arbitrarily (at random).
The PC algorithm works by applying a set of rules to create directed links between the nodes of the domain, which is assumed to contain only discrete chance nodes and no edges. Data must have been entered (using the functions described in case data), and the number of states of each node must have been set appropriately.
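As a sketch, assuming the HUGIN C API entry point h_domain_learn_structure and the standard error-reporting functions h_error_code and h_error_description, a learning run might look like this once the nodes and case data are in place:

```c
#include <stdio.h>
#include "hugin.h"

/* Run the PC algorithm on a domain containing only discrete chance
   nodes, no edges, and previously entered case data. */
static int learn_structure (h_domain_t d)
{
    if (h_domain_learn_structure (d) != 0)
    {
        fprintf (stderr, "structural learning failed: %s\n",
                 h_error_description (h_error_code ()));
        return -1;
    }
    return 0;
}
```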
The PC algorithm only determines the structure of the network; it does not calculate the conditional probability tables. Those can subsequently be estimated from the same data using the EM algorithm.
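A hedged sketch of that follow-up step, assuming the HUGIN C API functions h_domain_compile and h_domain_learn_tables (EM uses inference internally, so the domain must be compiled first):

```c
#include "hugin.h"

/* After structural learning: estimate the conditional probability
   tables from the same case data using the EM algorithm. */
static int estimate_tables (h_domain_t d)
{
    if (h_domain_compile (d) != 0)    /* EM needs a compiled domain */
        return -1;
    return h_domain_learn_tables (d);
}
```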
If a log file has been specified, a log of the actions taken by the PC algorithm is produced. Such a log is useful for debugging and validation purposes (e.g., to determine which edge directions were derived from data and which were selected at random).
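Assuming logging follows the usual HUGIN convention of attaching a stdio stream with h_domain_set_log_file, a log of the learning run could be captured as follows (the file name is hypothetical):

```c
#include <stdio.h>
#include "hugin.h"

/* Capture a log of the PC algorithm's actions, e.g., which edge
   directions were derived from data and which were chosen at random. */
static int learn_with_log (h_domain_t d)
{
    FILE *log_file = fopen ("pc-learning.log", "w");  /* hypothetical name */
    int status;

    if (log_file == NULL)
        return -1;

    h_domain_set_log_file (d, log_file);
    status = h_domain_learn_structure (d);
    h_domain_set_log_file (d, NULL);   /* detach the log before closing */
    fclose (log_file);

    return status;
}
```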
The dependency tests calculate a test statistic that is (approximately) chi^2-distributed under the hypothesis of (conditional) independence. If the test statistic is large, we reject the independence hypothesis; otherwise, we accept it. The probability of rejecting a true independence hypothesis (the significance level) is set using the following function. The significance level is a value between 0 and 1; the default value is 0.05.
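In the HUGIN C API, the function in question is, to our reading, h_domain_set_significance_level; a minimal sketch of tightening the test before learning:

```c
#include "hugin.h"

/* Use a significance level of 0.01 instead of the default 0.05:
   true independence hypotheses are rejected less often, giving a
   sparser structure and a faster learning run. */
static int learn_sparser (h_domain_t d)
{
    if (h_domain_set_significance_level (d, 0.01) != 0)
        return -1;
    return h_domain_learn_structure (d);
}
```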
In general, increasing the significance level will result in more edges, whereas reducing it will result in fewer edges. With fewer edges, the number of arbitrarily directed edges will decrease. Reducing the significance level will also reduce the running time of structural learning.