weka.classifiers.bayes
Class BayesNet

java.lang.Object
  extended by weka.classifiers.Classifier
      extended by weka.classifiers.bayes.BayesNet
All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, AdditionalMeasureProducer, CapabilitiesHandler, Drawable, OptionHandler, RevisionHandler, WeightedInstancesHandler
Direct Known Subclasses:
BIFReader, EditableBayesNet

public class BayesNet
extends Classifier
implements OptionHandler, WeightedInstancesHandler, Drawable, AdditionalMeasureProducer

Bayes Network learning using various search algorithms and quality measures.
Base class for a Bayes Network classifier. Provides data structures (network structure, conditional probability distributions, etc.) and facilities common to Bayes Network learning algorithms such as K2 and B.

For more information see:

http://www.cs.waikato.ac.nz/~remco/weka.pdf

Valid options are:

 -D
  Do not use ADTree data structure
 
 -B <BIF file>
  BIF file to compare with
 
 -Q weka.classifiers.bayes.net.search.SearchAlgorithm
  Search algorithm
 
 -E weka.classifiers.bayes.net.estimate.SimpleEstimator
  Estimator algorithm
 

Version:
$Revision: 1.33 $
Author:
Remco Bouckaert (rrb@xm.co.nz)
See Also:
Serialized Form

Field Summary
 Estimator[][] m_Distributions
          The attribute estimators containing CPTs.
 Instances m_Instances
          The dataset header for the purposes of printing out a semi-intelligible model
 
Fields inherited from interface weka.core.Drawable
BayesNet, NOT_DRAWABLE, TREE
 
Constructor Summary
BayesNet()
           
 
Method Summary
 java.lang.String BIFFileTipText()
           
 void buildClassifier(Instances instances)
          Generates the classifier.
 void buildStructure()
          buildStructure determines the network structure/graph of the network.
 double[] countsForInstance(Instance instance)
          Calculates the counts of the Dirichlet distribution over the class membership probabilities for the given test instance.
 double[] distributionForInstance(Instance instance)
          Calculates the class membership probabilities for the given test instance.
 java.util.Enumeration enumerateMeasures()
          Returns an enumeration of the measure names.
 void estimateCPTs()
          estimateCPTs estimates the conditional probability tables for the Bayes Net using the network structure.
 java.lang.String estimatorTipText()
          This will return a string describing the BayesNetEstimator.
 ADNode getADTree()
          get ADTree structure containing efficient representation of counts.
 java.lang.String getBIFFile()
          Get name of network in BIF file to compare with
 java.lang.String getBIFHeader()
           
 Capabilities getCapabilities()
          Returns default capabilities of the classifier.
 int getCardinality(int iNode)
          get number of values a node can take
 Estimator[][] getDistributions()
          Get full set of estimators.
 BayesNetEstimator getEstimator()
          Get the BayesNetEstimator used for calculating the CPTs
 double getMeasure(java.lang.String measureName)
          Returns the value of the named measure
 java.lang.String getName()
          get name of the Bayes network
 java.lang.String getNodeName(int iNode)
          get name of a node in the Bayes network
 java.lang.String getNodeValue(int iNode, int iValue)
          get name of a particular value of a node
 int getNrOfNodes()
          get number of nodes in the Bayes network
 int getNrOfParents(int iNode)
          get number of parents of a node in the network structure
 java.lang.String[] getOptions()
          Gets the current settings of the classifier.
 int getParent(int iNode, int iParent)
          get node index of a parent of a node in the network structure
 int getParentCardinality(int iNode)
          get number of values the collection of parents of a node can take
 ParentSet getParentSet(int iNode)
          get the parent set of a node
 ParentSet[] getParentSets()
          Get full set of parent sets.
 double getProbability(int iNode, int iParent, int iValue)
          get particular probability of the conditional probability distribution of a node given its parents.
 java.lang.String getRevision()
          Returns the revision string.
 SearchAlgorithm getSearchAlgorithm()
          Get the SearchAlgorithm used as the search algorithm
 boolean getUseADTree()
          Get whether the ADTree structure is used or not
 java.lang.String globalInfo()
          This will return a string describing the classifier.
 java.lang.String graph()
          Returns a BayesNet graph in XMLBIF ver 0.3 format.
 int graphType()
          Returns the type of graph this classifier represents.
 void initCPTs()
          initializes the conditional probabilities
 void initStructure()
          Init structure initializes the structure to an empty graph or a Naive Bayes graph (depending on the -N flag).
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options
static void main(java.lang.String[] argv)
          Main method for testing this class.
 double measureAICScore()
           
 double measureBayesScore()
           
 double measureBDeuScore()
           
 double measureDivergence()
           
 double measureEntropyScore()
           
 double measureExtraArcs()
           
 double measureMDLScore()
           
 double measureMissingArcs()
           
 double measureReversedArcs()
           
static java.lang.String[] partitionOptions(java.lang.String[] options)
          Returns the secondary set of options (if any) contained in the supplied options array.
 java.lang.String searchAlgorithmTipText()
           
 void setBIFFile(java.lang.String sBIFFile)
          Set name of network in BIF file to compare with
 void setEstimator(BayesNetEstimator newBayesNetEstimator)
          Set the Estimator Algorithm used in calculating the CPTs
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setSearchAlgorithm(SearchAlgorithm newSearchAlgorithm)
          Set the SearchAlgorithm used in searching for network structures.
 void setUseADTree(boolean bUseADTree)
          Set whether ADTree structure is used or not
 java.lang.String toString()
          Returns a description of the classifier.
 java.lang.String toXMLBIF03()
          Returns a description of the classifier in XML BIF 0.3 format.
 void updateClassifier(Instance instance)
          Updates the classifier with the given instance.
 java.lang.String useADTreeTipText()
           
 
Methods inherited from class weka.classifiers.Classifier
classifyInstance, debugTipText, forName, getDebug, makeCopies, makeCopy, setDebug
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_Distributions

public Estimator[][] m_Distributions
The attribute estimators containing CPTs.


m_Instances

public Instances m_Instances
The dataset header for the purposes of printing out a semi-intelligible model

Constructor Detail

BayesNet

public BayesNet()
Method Detail

getCapabilities

public Capabilities getCapabilities()
Returns default capabilities of the classifier.

Specified by:
getCapabilities in interface CapabilitiesHandler
Overrides:
getCapabilities in class Classifier
Returns:
the capabilities of this classifier
See Also:
Capabilities

buildClassifier

public void buildClassifier(Instances instances)
                     throws java.lang.Exception
Generates the classifier.

Specified by:
buildClassifier in class Classifier
Parameters:
instances - set of instances serving as training data
Throws:
java.lang.Exception - if the classifier has not been generated successfully

initStructure

public void initStructure()
                   throws java.lang.Exception
Init structure initializes the structure to an empty graph or a Naive Bayes graph (depending on the -N flag).

Throws:
java.lang.Exception - in case of an error

buildStructure

public void buildStructure()
                    throws java.lang.Exception
buildStructure determines the network structure/graph of the network. The default behavior is to create a network in which every node other than the first has the first node as its parent (i.e., a BayesNet that behaves like a naive Bayes classifier). This method can be overridden by derived classes to restrict the class of network structures that are acceptable.

Throws:
java.lang.Exception - in case of an error
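
The default structure described above can be pictured with a minimal, standalone sketch. It does not use Weka classes; NaiveStructureSketch and its list-of-parents representation are hypothetical and only illustrate the shape of the graph, assuming the "first node" has index 0.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the Weka API.
final class NaiveStructureSketch {
    // Returns, for each node, the list of its parent node indices.
    // Node 0 has no parents; every other node has node 0 as its only parent.
    static List<List<Integer>> build(int nrOfNodes) {
        List<List<Integer>> parents = new ArrayList<>();
        for (int node = 0; node < nrOfNodes; node++) {
            List<Integer> parentsOfNode = new ArrayList<>();
            if (node != 0) {
                parentsOfNode.add(0);
            }
            parents.add(parentsOfNode);
        }
        return parents;
    }
}
```

A derived class that restricts the admissible structures would replace this default with its own parent assignment.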

estimateCPTs

public void estimateCPTs()
                  throws java.lang.Exception
estimateCPTs estimates the conditional probability tables for the Bayes Net using the network structure.

Throws:
java.lang.Exception - in case of an error
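
As a rough illustration of what estimating a conditional probability table from data can involve, the sketch below turns a table of counts into probabilities using a smoothing constant. This is an assumption-laden example, not the Weka implementation: the actual estimator is whichever BayesNetEstimator is selected with -E, and the class CptEstimateSketch, the counts layout and the alpha parameter are all hypothetical.

```java
// Hypothetical illustration only; not part of the Weka API.
final class CptEstimateSketch {
    // counts[p][v] = number of training instances with parent configuration p
    // and node value v. alpha is an assumed initial count added to every cell
    // (must be > 0 so that rows without data still normalize).
    static double[][] estimate(int[][] counts, double alpha) {
        double[][] cpt = new double[counts.length][];
        for (int p = 0; p < counts.length; p++) {
            int nrOfValues = counts[p].length;
            double total = alpha * nrOfValues;
            for (int v = 0; v < nrOfValues; v++) {
                total += counts[p][v];
            }
            cpt[p] = new double[nrOfValues];
            for (int v = 0; v < nrOfValues; v++) {
                cpt[p][v] = (counts[p][v] + alpha) / total;
            }
        }
        return cpt;
    }
}
```

Each row of the result sums to 1, and a parent configuration that never occurs in the data falls back to a uniform distribution.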

initCPTs

public void initCPTs()
              throws java.lang.Exception
initializes the conditional probabilities

Throws:
java.lang.Exception - in case of an error

updateClassifier

public void updateClassifier(Instance instance)
                      throws java.lang.Exception
Updates the classifier with the given instance.

Parameters:
instance - the new training instance to include in the model
Throws:
java.lang.Exception - if the instance could not be incorporated in the model.

distributionForInstance

public double[] distributionForInstance(Instance instance)
                                 throws java.lang.Exception
Calculates the class membership probabilities for the given test instance.

Overrides:
distributionForInstance in class Classifier
Parameters:
instance - the instance to be classified
Returns:
predicted class probability distribution
Throws:
java.lang.Exception - if there is a problem generating the prediction
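
Conceptually, class membership probabilities in a Bayes network are obtained by evaluating the joint probability for each candidate class value and normalizing. The sketch below shows this for the simplest case, a naive-Bayes-shaped network in which the class is the only parent of every attribute. It is a standalone illustration under that assumption; ClassPosteriorSketch and its array layout are hypothetical and do not reflect how Weka stores its estimators.

```java
// Hypothetical illustration only; not part of the Weka API.
final class ClassPosteriorSketch {
    // classPrior[c]      = P(class = c)
    // cpts[a][c][v]      = P(attribute a = v | class = c)
    // attributeValues[a] = observed value index of attribute a
    // Assumes at least one class value has a non-zero joint probability.
    static double[] posterior(double[] classPrior, double[][][] cpts, int[] attributeValues) {
        double[] result = new double[classPrior.length];
        double sum = 0.0;
        for (int c = 0; c < classPrior.length; c++) {
            double joint = classPrior[c];
            for (int a = 0; a < attributeValues.length; a++) {
                joint *= cpts[a][c][attributeValues[a]];
            }
            result[c] = joint;
            sum += joint;
        }
        // Normalize so that the returned values sum to 1.
        for (int c = 0; c < result.length; c++) {
            result[c] /= sum;
        }
        return result;
    }
}
```

With a uniform prior over two classes and one attribute whose observed value has probability 0.8 under the first class and 0.4 under the second, the result is 2/3 and 1/3.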

countsForInstance

public double[] countsForInstance(Instance instance)
                           throws java.lang.Exception
Calculates the counts of the Dirichlet distribution over the class membership probabilities for the given test instance.

Parameters:
instance - the instance to be classified
Returns:
counts of the Dirichlet distribution over the class probabilities
Throws:
java.lang.Exception - if there is a problem generating the prediction

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options

Specified by:
listOptions in interface OptionHandler
Overrides:
listOptions in class Classifier
Returns:
an enumeration of all the available options

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options.

Valid options are:

 -D
  Do not use ADTree data structure
 
 -B <BIF file>
  BIF file to compare with
 
 -Q weka.classifiers.bayes.net.search.SearchAlgorithm
  Search algorithm
 
 -E weka.classifiers.bayes.net.estimate.SimpleEstimator
  Estimator algorithm
 

Specified by:
setOptions in interface OptionHandler
Overrides:
setOptions in class Classifier
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported

partitionOptions

public static java.lang.String[] partitionOptions(java.lang.String[] options)
Returns the secondary set of options (if any) contained in the supplied options array. The secondary set is defined to be any options after the first "--" but before the "-E". These options are removed from the original options array.

Parameters:
options - the input array of options
Returns:
the array of secondary options
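
The partitioning rule described above can be sketched in plain Java as follows. This is a reading of the description, not the Weka source: PartitionOptionsSketch is a hypothetical class, and blanking consumed entries to empty strings is an assumption about what "removed from the original options array" means.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the Weka API.
final class PartitionOptionsSketch {
    // Collects the options after the first "--" up to (not including) "-E".
    // Consumed entries, including the "--" marker, are blanked in place
    // (assumption: "removed" means replaced by an empty string).
    static String[] partition(String[] options) {
        int start = -1;
        for (int i = 0; i < options.length; i++) {
            if ("--".equals(options[i])) {
                start = i;
                break;
            }
        }
        if (start < 0) {
            return new String[0]; // no secondary options present
        }
        options[start] = "";
        List<String> secondary = new ArrayList<>();
        for (int i = start + 1; i < options.length && !"-E".equals(options[i]); i++) {
            secondary.add(options[i]);
            options[i] = "";
        }
        return secondary.toArray(new String[0]);
    }
}
```

Given {"-D", "--", "-P", "2", "-E", "x"}, the sketch returns {"-P", "2"} and leaves "-D", "-E" and "x" in the original array.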

getOptions

public java.lang.String[] getOptions()
Gets the current settings of the classifier.

Specified by:
getOptions in interface OptionHandler
Overrides:
getOptions in class Classifier
Returns:
an array of strings suitable for passing to setOptions

setSearchAlgorithm

public void setSearchAlgorithm(SearchAlgorithm newSearchAlgorithm)
Set the SearchAlgorithm used in searching for network structures.

Parameters:
newSearchAlgorithm - the SearchAlgorithm to use.

getSearchAlgorithm

public SearchAlgorithm getSearchAlgorithm()
Get the SearchAlgorithm used as the search algorithm

Returns:
the SearchAlgorithm used as the search algorithm

setEstimator

public void setEstimator(BayesNetEstimator newBayesNetEstimator)
Set the Estimator Algorithm used in calculating the CPTs

Parameters:
newBayesNetEstimator - the Estimator to use.

getEstimator

public BayesNetEstimator getEstimator()
Get the BayesNetEstimator used for calculating the CPTs

Returns:
the BayesNetEstimator used.

setUseADTree

public void setUseADTree(boolean bUseADTree)
Set whether ADTree structure is used or not

Parameters:
bUseADTree - true if an ADTree structure is used

getUseADTree

public boolean getUseADTree()
Get whether the ADTree structure is used or not

Returns:
whether ADTree structure is used or not

setBIFFile

public void setBIFFile(java.lang.String sBIFFile)
Set name of network in BIF file to compare with

Parameters:
sBIFFile - the name of the BIF file

getBIFFile

public java.lang.String getBIFFile()
Get name of network in BIF file to compare with

Returns:
BIF file name

toString

public java.lang.String toString()
Returns a description of the classifier.

Overrides:
toString in class java.lang.Object
Returns:
a description of the classifier as a string.

graphType

public int graphType()
Returns the type of graph this classifier represents.

Specified by:
graphType in interface Drawable
Returns:
Drawable.BayesNet

graph

public java.lang.String graph()
                       throws java.lang.Exception
Returns a BayesNet graph in XMLBIF ver 0.3 format.

Specified by:
graph in interface Drawable
Returns:
String representing this BayesNet in XMLBIF ver 0.3
Throws:
java.lang.Exception - in case BIF generation fails

getBIFHeader

public java.lang.String getBIFHeader()

toXMLBIF03

public java.lang.String toXMLBIF03()
Returns a description of the classifier in XML BIF 0.3 format. See http://www-2.cs.cmu.edu/~fgcozman/Research/InterchangeFormat/ for details on XML BIF.

Returns:
an XML BIF 0.3 description of the classifier as a string.

useADTreeTipText

public java.lang.String useADTreeTipText()
Returns:
a string to describe the UseADTree option.

searchAlgorithmTipText

public java.lang.String searchAlgorithmTipText()
Returns:
a string to describe the SearchAlgorithm.

estimatorTipText

public java.lang.String estimatorTipText()
This will return a string describing the BayesNetEstimator.

Returns:
The string.

BIFFileTipText

public java.lang.String BIFFileTipText()
Returns:
a string to describe the BIFFile.

globalInfo

public java.lang.String globalInfo()
This will return a string describing the classifier.

Returns:
The string.

main

public static void main(java.lang.String[] argv)
Main method for testing this class.

Parameters:
argv - the options

getName

public java.lang.String getName()
get name of the Bayes network

Returns:
name of the Bayes net

getNrOfNodes

public int getNrOfNodes()
get number of nodes in the Bayes network

Returns:
number of nodes

getNodeName

public java.lang.String getNodeName(int iNode)
get name of a node in the Bayes network

Parameters:
iNode - index of the node
Returns:
name of the specified node

getCardinality

public int getCardinality(int iNode)
get number of values a node can take

Parameters:
iNode - index of the node
Returns:
cardinality of the specified node

getNodeValue

public java.lang.String getNodeValue(int iNode,
                                     int iValue)
get name of a particular value of a node

Parameters:
iNode - index of the node
iValue - index of the value
Returns:
name of the specified value of the node

getNrOfParents

public int getNrOfParents(int iNode)
get number of parents of a node in the network structure

Parameters:
iNode - index of the node
Returns:
number of parents of the specified node

getParent

public int getParent(int iNode,
                     int iParent)
get node index of a parent of a node in the network structure

Parameters:
iNode - index of the node
iParent - index of the parent within the node's parent set, e.g., 0 is the first parent, 1 the second parent, etc.
Returns:
node index of the iParent-th parent of the specified node
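
The accessors getNrOfNodes, getNrOfParents and getParent are enough to enumerate every arc of the network. The sketch below shows the loop against a hypothetical NetworkView interface that copies only those three signatures, so that it compiles without Weka on the classpath; with a real BayesNet the same loop body would apply.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the Weka API.
final class ArcListingSketch {
    // Stand-in exposing the three accessors with the signatures documented above.
    interface NetworkView {
        int getNrOfNodes();
        int getNrOfParents(int iNode);
        int getParent(int iNode, int iParent);
    }

    // Lists every arc as "parent->child", using node indices.
    static List<String> arcs(NetworkView net) {
        List<String> result = new ArrayList<>();
        for (int node = 0; node < net.getNrOfNodes(); node++) {
            for (int k = 0; k < net.getNrOfParents(node); k++) {
                result.add(net.getParent(node, k) + "->" + node);
            }
        }
        return result;
    }

    // Builds a view from an array where parents[i] holds the parent indices of node i.
    static NetworkView fromArrays(final int[][] parents) {
        return new NetworkView() {
            public int getNrOfNodes() { return parents.length; }
            public int getNrOfParents(int iNode) { return parents[iNode].length; }
            public int getParent(int iNode, int iParent) { return parents[iNode][iParent]; }
        };
    }
}
```

Replacing the index in the output with getNodeName(iNode) would give a more readable listing.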

getParentSets

public ParentSet[] getParentSets()
Get full set of parent sets.

Returns:
parent sets

getDistributions

public Estimator[][] getDistributions()
Get full set of estimators.

Returns:
estimators

getParentCardinality

public int getParentCardinality(int iNode)
get number of values the collection of parents of a node can take

Parameters:
iNode - index of the node
Returns:
cardinality of the parent set of the specified node

getProbability

public double getProbability(int iNode,
                             int iParent,
                             int iValue)
get particular probability of the conditional probability distribution of a node given its parents.

Parameters:
iNode - index of the node
iParent - index of the parent set configuration, 0 <= iParent < getParentCardinality(iNode)
iValue - index of the value, 0 <= iValue < getCardinality(iNode)
Returns:
probability
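
To make the relationship between getParentCardinality and the iParent argument concrete: the number of values the collection of parents can take is the product of the individual parent cardinalities, and each joint assignment of parent values maps to one index in that range. The sketch below is standalone and hypothetical. In particular, the ordering of parent configurations (first parent most significant) is an assumption and may not match the ordering Weka uses internally.

```java
// Hypothetical illustration only; not part of the Weka API.
final class CptIndexSketch {
    // Number of joint value assignments of a set of parents.
    // A node without parents has exactly one (empty) configuration.
    static int parentCardinality(int[] parentCardinalities) {
        int product = 1;
        for (int cardinality : parentCardinalities) {
            product *= cardinality;
        }
        return product;
    }

    // Maps one value per parent to a single configuration index using
    // mixed-radix encoding (assumed ordering: first parent most significant).
    static int parentConfigIndex(int[] parentValues, int[] parentCardinalities) {
        int index = 0;
        for (int i = 0; i < parentValues.length; i++) {
            index = index * parentCardinalities[i] + parentValues[i];
        }
        return index;
    }
}
```

For two parents with 2 and 3 values, there are 6 configurations with indices 0 to 5; the assignment (1, 2) maps to index 5 under the assumed ordering.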

getParentSet

public ParentSet getParentSet(int iNode)
get the parent set of a node

Parameters:
iNode - index of the node
Returns:
Parent set of the specified node.

getADTree

public ADNode getADTree()
get ADTree structure containing efficient representation of counts.

Returns:
ADTree structure

enumerateMeasures

public java.util.Enumeration enumerateMeasures()
Returns an enumeration of the measure names. Additional measures must follow the naming convention of starting with "measure", e.g., double measureBlah()

Specified by:
enumerateMeasures in interface AdditionalMeasureProducer
Returns:
an enumeration of the measure names
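
The naming convention above can be illustrated with a small reflection sketch that collects all public, no-argument methods returning double whose names start with "measure". This is only a demonstration of the convention: MeasureNamesSketch and its Demo class are hypothetical, and nothing in this page says that Weka discovers measures by reflection.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not part of the Weka API.
final class MeasureNamesSketch {
    // Example producer following the "measure" prefix convention.
    static final class Demo {
        public double measureBlah() { return 1.5; }
        public double measureOther() { return 2.0; }
        public double helper() { return 0.0; } // ignored: wrong prefix
    }

    // Returns the sorted names of all public no-argument methods
    // that return double and whose name starts with "measure".
    static List<String> measureNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method method : type.getMethods()) {
            if (method.getName().startsWith("measure")
                    && method.getParameterCount() == 0
                    && method.getReturnType() == double.class) {
                names.add(method.getName());
            }
        }
        Collections.sort(names);
        return names;
    }
}
```

Calling measureNames(MeasureNamesSketch.Demo.class) yields measureBlah and measureOther, while helper is skipped.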

measureExtraArcs

public double measureExtraArcs()

measureMissingArcs

public double measureMissingArcs()

measureReversedArcs

public double measureReversedArcs()

measureDivergence

public double measureDivergence()

measureBayesScore

public double measureBayesScore()

measureBDeuScore

public double measureBDeuScore()

measureMDLScore

public double measureMDLScore()

measureAICScore

public double measureAICScore()

measureEntropyScore

public double measureEntropyScore()

getMeasure

public double getMeasure(java.lang.String measureName)
Returns the value of the named measure

Specified by:
getMeasure in interface AdditionalMeasureProducer
Parameters:
measureName - the name of the measure to query for its value
Returns:
the value of the named measure
Throws:
java.lang.IllegalArgumentException - if the named measure is not supported

getRevision

public java.lang.String getRevision()
Returns the revision string.

Specified by:
getRevision in interface RevisionHandler
Returns:
the revision