weka.classifiers.misc
Class OLM

java.lang.Object
  extended by weka.classifiers.Classifier
      extended by weka.classifiers.RandomizableClassifier
          extended by weka.classifiers.misc.OLM
All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, CapabilitiesHandler, OptionHandler, Randomizable, RevisionHandler, TechnicalInformationHandler

public class OLM
extends RandomizableClassifier
implements TechnicalInformationHandler

This class is an implementation of the Ordinal Learning Method (OLM).
Further information regarding the algorithm and variants can be found in:

Arie Ben-David (1992). Automatic Generation of Symbolic Multiattribute Ordinal Knowledge-Based DSSs: Methodology and Applications. Decision Sciences. 23:1357-1372.

Lievens, Stijn (2003-2004). Studie en implementatie van instantie-gebaseerde algoritmen voor gesuperviseerd rangschikken.

BibTeX:

 @article{Ben-David1992,
    author = {Arie Ben-David},
    journal = {Decision Sciences},
    pages = {1357-1372},
    title = {Automatic Generation of Symbolic Multiattribute Ordinal Knowledge-Based DSSs: Methodology and Applications},
    volume = {23},
    year = {1992}
 }
 
 @mastersthesis{Lievens2003-2004,
    author = {Lievens, Stijn},
    school = {Ghent University},
    title = {Studie en implementatie van instantie-gebaseerde algoritmen voor gesuperviseerd rangschikken.},
    year = {2003-2004}
 }
 

Valid options are:

 -S <num>
  Random number seed.
  (default 1)
 -D
  If set, classifier is run in debug mode and
  may output additional info to the console
 -C <CL|REG>
  Sets the classification type to be used.
  (Default: REG)
 -A <MEAN|MED|MAX>
  Sets the averaging type used in phase 1 of the classifier.
  (Default: MEAN)
 -N <NONE|EUCL|HAM>
  If different from NONE, a nearest neighbour rule is fired when the
  rule base doesn't contain an example smaller than the instance
  to be classified
  (Default: NONE).
 -E <MIN|MAX|BOTH>
  Sets the extension type, i.e. the rule base to use.
  (Default: MIN)
 -sort
  If set, the instances are also sorted within the same class
  before building the rule bases
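
The option list above can be read as a small grammar: two switches without an argument (-D, -sort), one numeric option (-S), and four options that each accept a fixed set of values. The following is a minimal stand-alone sketch of a check against that grammar. The class OlmOptionsSketch and its method isValid are hypothetical names introduced for illustration; this is not the parser that WEKA itself uses, and WEKA may treat unknown or malformed options differently.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OlmOptionsSketch {
    // Allowed values, copied from the option list documented above.
    static final Map<String, List<String>> ALLOWED = new LinkedHashMap<>();
    static {
        ALLOWED.put("-C", Arrays.asList("CL", "REG"));
        ALLOWED.put("-A", Arrays.asList("MEAN", "MED", "MAX"));
        ALLOWED.put("-N", Arrays.asList("NONE", "EUCL", "HAM"));
        ALLOWED.put("-E", Arrays.asList("MIN", "MAX", "BOTH"));
    }

    // Hypothetical helper: returns true if every option matches the documented grammar.
    static boolean isValid(String[] options) {
        for (int i = 0; i < options.length; i++) {
            String flag = options[i];
            if (flag.equals("-D") || flag.equals("-sort")) {
                continue; // switches that take no argument
            }
            if (i + 1 >= options.length) {
                return false; // every other flag needs a value
            }
            String value = options[++i];
            if (flag.equals("-S")) {
                if (!value.matches("-?\\d+")) {
                    return false; // the seed must be an integer
                }
            } else if (!ALLOWED.containsKey(flag) || !ALLOWED.get(flag).contains(value)) {
                return false; // unknown flag or value outside the documented set
            }
        }
        return true;
    }
}
```

For instance, the array obtained by splitting "-S 42 -C CL -A MED -N HAM -E BOTH -sort" on spaces passes this check, while "-A MODE" does not.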

Version:
$Revision: 1.2 $
Author:
Stijn Lievens (stijn.lievens@ugent.be)
See Also:
Serialized Form

Field Summary
static int AT_MAXPROB
          Use the mode for averaging in phase 1.
static int AT_MEAN
          Use the mean for averaging in phase 1.
static int AT_MEDIAN
          Use the median for averaging in phase 1.
static int CT_REAL
          No rounding is performed during classification; that is, classification is done in a regression-like way.
static int CT_ROUNDED
          Round the real value that is returned by the original algorithm to the nearest label.
static int DT_EUCLID
          Use the Euclidean distance whenever a nearest neighbour rule is fired.
static int DT_HAMMING
          Use the Hamming distance, that is, the number of positions in which the instances differ, whenever a nearest neighbour rule is fired.
static int DT_NONE
          No nearest neighbour rule will be fired when classifying an instance for which there is no smaller rule in the rule base.
static int ET_BOTH
          Combine both the minimal and maximal extension, and use the midpoint of the resulting interval as prediction.
static int ET_MAX
          Use only the maximal extension.
static int ET_MIN
          Use only the minimal extension, as in the original algorithm of Ben-David.
static Tag[] TAGS_AVERAGINGTYPES
          the averaging types
static Tag[] TAGS_CLASSIFICATIONTYPES
          the classification types
static Tag[] TAGS_DISTANCETYPES
          the distance types
static Tag[] TAGS_EXTENSIONTYPES
          the extension types
 
Constructor Summary
OLM()
           
 
Method Summary
 java.lang.String averagingTypeTipText()
          Returns the tip text for this property.
 void buildClassifier(Instances instances)
          Build the OLM classifier, meaning that the rule bases are built.
 java.lang.String classificationTypeTipText()
          Returns the tip text for this property.
 double classifyInstance(Instance instance)
          Classifies a given instance according to the current settings of the classifier.
 java.lang.String distanceTypeTipText()
          Returns the tip text for this property.
 java.lang.String extensionTypeTipText()
          Returns the tip text for this property.
 SelectedTag getAveragingType()
          Gets the averaging type.
 Capabilities getCapabilities()
          Returns default capabilities of the classifier.
 SelectedTag getClassificationType()
          Gets the classification type.
 SelectedTag getDistanceType()
          Gets the distance type used by a nearest neighbour rule (if any).
 SelectedTag getExtensionType()
          Gets the extension type.
 java.lang.String[] getOptions()
          Gets an array of strings with the current options of the classifier.
 java.lang.String getRevision()
          Returns the revision string.
 int getSizeRuleBaseMax()
          Return the number of examples in the maximal rule base.
 int getSizeRuleBaseMin()
          Return the number of examples in the minimal rule base.
 boolean getSort()
          Returns whether the instances are sorted prior to building the rule bases.
 TechnicalInformation getTechnicalInformation()
          Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
 java.lang.String globalInfo()
          Returns a string describing the classifier.
 java.util.Enumeration listOptions()
          Get an enumeration of all available options for this classifier.
static void main(java.lang.String[] args)
          Main method for testing this class.
 java.lang.String seedTipText()
          Returns the tip text for this property.
 void setAveragingType(SelectedTag value)
          Sets the averaging type to use in phase 1 of the algorithm.
 void setClassificationType(SelectedTag value)
          Sets the classification type.
 void setDistanceType(SelectedTag value)
          Sets the distance type to be used by a nearest neighbour rule (if any).
 void setExtensionType(SelectedTag value)
          Sets the extension type to use.
 void setOptions(java.lang.String[] options)
          Parses the options for this object.
 void setSort(boolean sort)
          Sets whether the instances are to be sorted prior to building the rule bases.
 java.lang.String sortTipText()
          Returns the tip text for this property.
 java.lang.String toString()
          Returns a string description of the classifier.
 
Methods inherited from class weka.classifiers.RandomizableClassifier
getSeed, setSeed
 
Methods inherited from class weka.classifiers.Classifier
debugTipText, distributionForInstance, forName, getDebug, makeCopies, makeCopy, setDebug
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

CT_ROUNDED

public static final int CT_ROUNDED
Round the real value that is returned by the original algorithm to the nearest label.

See Also:
Constant Field Values

CT_REAL

public static final int CT_REAL
No rounding is performed during classification; that is, classification is done in a regression-like way.

See Also:
Constant Field Values
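
The difference between CT_ROUNDED and CT_REAL can be pictured with a small stand-alone sketch. The class ClassificationTypeSketch and its method finish are hypothetical and not part of the OLM API. The sketch also assumes that ties are rounded upwards, as Math.round does; this page does not say how OLM resolves ties.

```java
public class ClassificationTypeSketch {
    // Hypothetical helper, not part of the OLM API.
    // rounded == true mimics CT_ROUNDED, rounded == false mimics CT_REAL.
    static double finish(double rawPrediction, boolean rounded) {
        // Assumption: Math.round (ties go up) stands in for "nearest label".
        return rounded ? Math.round(rawPrediction) : rawPrediction;
    }
}
```

With a raw value of 1.6, which lies between the labels with internal values 1 and 2, the rounded variant yields the label value 2.0 and the real variant passes 1.6 through unchanged.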

TAGS_CLASSIFICATIONTYPES

public static final Tag[] TAGS_CLASSIFICATIONTYPES
the classification types


AT_MEAN

public static final int AT_MEAN
Use the mean for averaging in phase 1. This is in fact a non-ordinal procedure. The scores used for averaging are the internal values that WEKA assigns to the labels.

See Also:
Constant Field Values

AT_MEDIAN

public static final int AT_MEDIAN
Use the median for averaging in phase 1. The possible values lie in the extended set of labels; that is, values in between the original labels are possible.

See Also:
Constant Field Values

AT_MAXPROB

public static final int AT_MAXPROB
Use the mode for averaging in phase 1. The label that has maximum frequency is used. If more than one label has maximum frequency, the lowest one is preferred.

See Also:
Constant Field Values
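
The three averaging types can be illustrated on plain arrays of internal label values. This is a minimal sketch, not the code used by OLM: the class AveragingSketch and its methods are hypothetical, and taking the midpoint of the two middle values for an even-sized sample is an assumption about how an in-between label arises for the median.

```java
import java.util.Arrays;

public class AveragingSketch {
    // Mean of the internal label values (a non-ordinal procedure).
    static double mean(double[] labels) {
        double sum = 0.0;
        for (double v : labels) {
            sum += v;
        }
        return sum / labels.length;
    }

    // Median; for an even count the midpoint of the two middle values is
    // returned (assumption), which may fall between two original labels.
    static double median(double[] labels) {
        double[] sorted = labels.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return (n % 2 == 1) ? sorted[n / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    // Mode; the strict comparison keeps the lowest label when frequencies tie.
    static int mode(int[] labels, int numLabels) {
        int[] counts = new int[numLabels];
        for (int label : labels) {
            counts[label]++;
        }
        int best = 0;
        for (int label = 1; label < numLabels; label++) {
            if (counts[label] > counts[best]) {
                best = label;
            }
        }
        return best;
    }
}
```

For the label values 0, 1, 2, 2 the mean is 1.25 and the median is 1.5, both of which lie between original labels. For the labels 0, 2, 2, 0, 1 the labels 0 and 2 are equally frequent, so the mode sketch returns 0.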

TAGS_AVERAGINGTYPES

public static final Tag[] TAGS_AVERAGINGTYPES
the averaging types


DT_NONE

public static final int DT_NONE
No nearest neighbour rule will be fired when classifying an instance for which there is no smaller rule in the rule base.

See Also:
Constant Field Values
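
What "no smaller rule in the rule base" means can be sketched as follows, under the assumption that "smaller" refers to the componentwise order on attribute values (a rule is smaller than or equal to an instance when none of its attribute values exceeds the corresponding value of the instance). This page does not spell out the order, and the class OrderSketch with its methods is hypothetical.

```java
public class OrderSketch {
    // Assumption: a is "smaller or equal" than b if a[i] <= b[i] for every attribute i.
    static boolean smallerOrEqual(double[] a, double[] b) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] > b[i]) {
                return false;
            }
        }
        return true;
    }

    // True if at least one rule in the rule base lies below the instance.
    // When this is false, a distance type other than DT_NONE would matter.
    static boolean hasSmallerRule(double[][] ruleBase, double[] instance) {
        for (double[] rule : ruleBase) {
            if (smallerOrEqual(rule, instance)) {
                return true;
            }
        }
        return false;
    }
}
```

With the rules (1, 2) and (3, 1), the instance (2, 2) has a smaller rule, namely (1, 2), whereas the instance (0, 5) has none.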

DT_EUCLID

public static final int DT_EUCLID
Use the Euclidean distance whenever a nearest neighbour rule is fired.

See Also:
Constant Field Values

DT_HAMMING

public static final int DT_HAMMING
Use the Hamming distance, that is, the number of positions in which the instances differ, whenever a nearest neighbour rule is fired.

See Also:
Constant Field Values
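
The two distances named by DT_EUCLID and DT_HAMMING can be written down for plain attribute vectors. This is a minimal sketch of the textbook definitions, assuming both vectors have the same length; the class DistanceSketch is hypothetical, and the OLM implementation may compute the distances differently, for example over WEKA's internal attribute values.

```java
public class DistanceSketch {
    // Euclidean distance: square root of the summed squared differences.
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Hamming distance: number of positions in which the vectors differ.
    static int hamming(double[] a, double[] b) {
        int differing = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] != b[i]) {
                differing++;
            }
        }
        return differing;
    }
}
```

For the vectors (1, 2, 3) and (1, 5, 7) the Euclidean distance is 5.0 and the Hamming distance is 2, since only the last two positions differ.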

TAGS_DISTANCETYPES

public static final Tag[] TAGS_DISTANCETYPES
the distance types


ET_MIN

public static final int ET_MIN
Use only the minimal extension, as in the original algorithm of Ben-David.

See Also:
Constant Field Values

ET_MAX

public static final int ET_MAX
Use only the maximal extension. In this case an algorithm dual to the original one is performed.

See Also:
Constant Field Values

ET_BOTH

public static final int ET_BOTH
Combine both the minimal and maximal extension, and use the midpoint of the resulting interval as prediction.

See Also:
Constant Field Values
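
How the three extension types relate can be sketched for a single instance, given the value predicted by the minimal extension and the value predicted by the maximal extension. The class ExtensionSketch, its enum and the method combine are hypothetical names; how the two extension values themselves are computed is not shown here.

```java
public class ExtensionSketch {
    enum Extension { MIN, MAX, BOTH }

    // Hypothetical helper: picks or combines the two extension values.
    static double combine(double minExtension, double maxExtension, Extension type) {
        switch (type) {
            case MIN:
                return minExtension;  // as in the original algorithm of Ben-David
            case MAX:
                return maxExtension;  // the dual variant
            default:
                // BOTH: midpoint of the interval spanned by the two extensions
                return (minExtension + maxExtension) / 2.0;
        }
    }
}
```

If the minimal extension yields 1.0 and the maximal extension yields 3.0, the combined prediction under BOTH is 2.0.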

TAGS_EXTENSIONTYPES

public static final Tag[] TAGS_EXTENSIONTYPES
the extension types

Constructor Detail

OLM

public OLM()
Method Detail

globalInfo

public java.lang.String globalInfo()
Returns a string describing the classifier.

Returns:
a description suitable for displaying in the explorer/experimenter gui

getCapabilities

public Capabilities getCapabilities()
Returns default capabilities of the classifier.

Specified by:
getCapabilities in interface CapabilitiesHandler
Overrides:
getCapabilities in class Classifier
Returns:
the capabilities of this classifier
See Also:
Capabilities

getTechnicalInformation

public TechnicalInformation getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.

Specified by:
getTechnicalInformation in interface TechnicalInformationHandler
Returns:
the technical information about this class

classificationTypeTipText

public java.lang.String classificationTypeTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setClassificationType

public void setClassificationType(SelectedTag value)
Sets the classification type.

Parameters:
value - the classification type to be set.

getClassificationType

public SelectedTag getClassificationType()
Gets the classification type.

Returns:
the classification type

averagingTypeTipText

public java.lang.String averagingTypeTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setAveragingType

public void setAveragingType(SelectedTag value)
Sets the averaging type to use in phase 1 of the algorithm.

Parameters:
value - the averaging type to use

getAveragingType

public SelectedTag getAveragingType()
Gets the averaging type.

Returns:
the averaging type

distanceTypeTipText

public java.lang.String distanceTypeTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setDistanceType

public void setDistanceType(SelectedTag value)
Sets the distance type to be used by a nearest neighbour rule (if any).

Parameters:
value - the distance type to use

getDistanceType

public SelectedTag getDistanceType()
Gets the distance type used by a nearest neighbour rule (if any).

Returns:
the distance type

extensionTypeTipText

public java.lang.String extensionTypeTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setExtensionType

public void setExtensionType(SelectedTag value)
Sets the extension type to use. The minimal extension is the one used by Ben-David in the original algorithm. The maximal extension is a completely dual variant of the minimal extension. When both are used, the midpoint of the interval determined by the two extensions is returned.

Parameters:
value - the extension type to use

getExtensionType

public SelectedTag getExtensionType()
Gets the extension type.

Returns:
the extension type

sortTipText

public java.lang.String sortTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setSort

public void setSort(boolean sort)
Sets whether the instances are to be sorted prior to building the rule bases.

Parameters:
sort - if true the instances will be sorted

getSort

public boolean getSort()
Returns whether the instances are sorted prior to building the rule bases.

Returns:
true if instances are sorted prior to building the rule bases, false otherwise.

seedTipText

public java.lang.String seedTipText()
Returns the tip text for this property.

Overrides:
seedTipText in class RandomizableClassifier
Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getSizeRuleBaseMin

public int getSizeRuleBaseMin()
Return the number of examples in the minimal rule base. The minimal rule base is the one that corresponds to the rule base of Ben-David.

Returns:
the number of examples in the minimal rule base

getSizeRuleBaseMax

public int getSizeRuleBaseMax()
Return the number of examples in the maximal rule base. The maximal rule base is built using an algorithm dual to that for building the minimal rule base.

Returns:
the number of examples in the maximal rule base

classifyInstance

public double classifyInstance(Instance instance)
Classifies a given instance according to the current settings of the classifier.

Overrides:
classifyInstance in class Classifier
Parameters:
instance - the instance to be classified
Returns:
a double that represents the classification; this is either the internal value of a label, when rounding is on, or a real number.
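
The returned double can be turned into something readable along the following lines. This is a stand-alone sketch: the class PredictionSketch and its method describe are hypothetical, and the label names are passed in as a plain array whose order is assumed to match the order of the class attribute's values, so that the internal value of a label is its index.

```java
public class PredictionSketch {
    // Hypothetical helper: maps a prediction to a label name when it is a
    // whole number inside the label range, and reports it as a real value otherwise.
    static String describe(double prediction, String[] labels) {
        int index = (int) prediction;
        if (prediction == index && index >= 0 && index < labels.length) {
            return labels[index];
        }
        return "between labels: " + prediction;
    }
}
```

With the labels low, medium and high, a prediction of 2.0 maps to high, while a prediction of 1.5, as can occur when no rounding is performed, is reported as lying between labels.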

buildClassifier

public void buildClassifier(Instances instances)
                     throws java.lang.Exception
Build the OLM classifier, meaning that the rule bases are built.

Specified by:
buildClassifier in class Classifier
Parameters:
instances - the instances to use for building the rule base
Throws:
java.lang.Exception - if instances cannot be handled by the classifier.

toString

public java.lang.String toString()
Returns a string description of the classifier. In debug mode, the rule bases are added to the string representation as well. This means that the description can become rather lengthy.

Overrides:
toString in class java.lang.Object
Returns:
a String describing the classifier.

listOptions

public java.util.Enumeration listOptions()
Get an enumeration of all available options for this classifier.

Specified by:
listOptions in interface OptionHandler
Overrides:
listOptions in class RandomizableClassifier
Returns:
an enumeration of available options

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses the options for this object.

Valid options are:

 -S <num>
  Random number seed.
  (default 1)
 -D
  If set, classifier is run in debug mode and
  may output additional info to the console
 -C <CL|REG>
  Sets the classification type to be used.
  (Default: REG)
 -A <MEAN|MED|MAX>
  Sets the averaging type used in phase 1 of the classifier.
  (Default: MEAN)
 -N <NONE|EUCL|HAM>
  If different from NONE, a nearest neighbour rule is fired when the
  rule base doesn't contain an example smaller than the instance
  to be classified
  (Default: NONE).
 -E <MIN|MAX|BOTH>
  Sets the extension type, i.e. the rule base to use.
  (Default: MIN)
 -sort
  If set, the instances are also sorted within the same class
  before building the rule bases

Specified by:
setOptions in interface OptionHandler
Overrides:
setOptions in class RandomizableClassifier
Parameters:
options - an array of strings containing the options
Throws:
java.lang.Exception - if there are options that have invalid arguments.

getOptions

public java.lang.String[] getOptions()
Gets an array of strings with the current options of the classifier.

Specified by:
getOptions in interface OptionHandler
Overrides:
getOptions in class RandomizableClassifier
Returns:
an array suitable as argument for setOptions

getRevision

public java.lang.String getRevision()
Returns the revision string.

Specified by:
getRevision in interface RevisionHandler
Returns:
the revision

main

public static void main(java.lang.String[] args)
Main method for testing this class.

Parameters:
args - the command line arguments