java.lang.Object
  weka.filters.Filter
    weka.filters.unsupervised.attribute.RandomProjection
public class RandomProjection
Reduces the dimensionality of the data by projecting it onto a lower-dimensional subspace using a random matrix with columns of unit length. Like PCA, it reduces the number of attributes while preserving much of the variation in the data, but at a much lower computational cost.
It first applies the NominalToBinary filter to convert all attributes to numeric before reducing the dimension. It preserves the class attribute.
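As a rough illustration of the projection described above, the following sketch builds a Gaussian random matrix, scales each column to unit length, and multiplies one numeric instance by it. It is not Weka's implementation: the names `RandomProjectionSketch`, `randomMatrix`, and `project` are hypothetical, and the NominalToBinary step and class-attribute handling are left out.

```java
import java.util.Random;

// Hypothetical illustration only; not the Weka source.
final class RandomProjectionSketch {

    /** Builds an n x k Gaussian matrix and scales every column to unit length. */
    static double[][] randomMatrix(int n, int k, long seed) {
        Random rng = new Random(seed);
        double[][] r = new double[n][k];
        for (int j = 0; j < k; j++) {
            double sumSquares = 0.0;
            for (int i = 0; i < n; i++) {
                r[i][j] = rng.nextGaussian();
                sumSquares += r[i][j] * r[i][j];
            }
            double norm = Math.sqrt(sumSquares);
            for (int i = 0; i < n; i++) {
                r[i][j] /= norm; // column j now has Euclidean length 1
            }
        }
        return r;
    }

    /** Projects one n-dimensional instance onto k dimensions: y = x * R. */
    static double[] project(double[] x, double[][] r) {
        int k = r[0].length;
        double[] y = new double[k];
        for (int j = 0; j < k; j++) {
            for (int i = 0; i < x.length; i++) {
                y[j] += x[i] * r[i][j];
            }
        }
        return y;
    }

    public static void main(String[] args) {
        double[][] r = randomMatrix(6, 2, 42L);
        double[] y = project(new double[] {1, 0, 2, 0, 3, 1}, r);
        System.out.println("Reduced 6 attributes to " + y.length + ": " + y[0] + ", " + y[1]);
    }
}
```

The cost is one matrix multiplication per instance, with no eigendecomposition of the data, which is where the saving over PCA comes from.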
For more information, see:
Dmitriy Fradkin, David Madigan: Experiments with random projections for machine learning. In: KDD '03: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, New York, NY, USA, 517-522, 2003.
@inproceedings{Fradkin2003,
  address = {New York, NY, USA},
  author = {Dmitriy Fradkin and David Madigan},
  booktitle = {KDD '03: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining},
  pages = {517-522},
  publisher = {ACM Press},
  title = {Experiments with random projections for machine learning},
  year = {2003}
}
Valid options are:
-N <number> The number of dimensions (attributes) the data should be reduced to (default 10; exclusive of the class attribute, if it is set).
-D [SPARSE1|SPARSE2|GAUSSIAN] The distribution to use for calculating the random matrix. Sparse1 is: sqrt(3)*{-1 with prob(1/6), 0 with prob(2/3), +1 with prob(1/6)}. Sparse2 is: {-1 with prob(1/2), +1 with prob(1/2)}.
-P <percent> The percentage of dimensions (attributes) the data should be reduced to (exclusive of the class attribute, if it is set). The -N option is ignored if this option is present or is greater than zero.
-M Replace missing values using the ReplaceMissingValues filter
-R <num> The random seed for the random number generator used for calculating the random matrix (default 42).
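The two sparse distributions listed for -D can be sampled with a few lines of code. The sketch below is an illustration under assumptions, not the filter's own code: the class `ProjectionDistributions` and its methods are hypothetical names, and the Gaussian case is assumed to be a standard normal draw.

```java
import java.util.Random;

// Hypothetical illustration only; not the Weka source.
final class ProjectionDistributions {

    /** Sparse1: sqrt(3) * {-1 with prob 1/6, 0 with prob 2/3, +1 with prob 1/6}. */
    static double sparse1(Random rng) {
        double u = rng.nextDouble();
        if (u < 1.0 / 6.0) {
            return -Math.sqrt(3.0);
        }
        if (u < 5.0 / 6.0) {
            return 0.0; // the middle 2/3 of the unit interval
        }
        return Math.sqrt(3.0);
    }

    /** Sparse2: {-1 with prob 1/2, +1 with prob 1/2}. */
    static double sparse2(Random rng) {
        return rng.nextBoolean() ? 1.0 : -1.0;
    }

    /** Gaussian: assumed here to be a standard normal draw. */
    static double gaussian(Random rng) {
        return rng.nextGaussian();
    }

    public static void main(String[] args) {
        Random rng = new Random(42L);
        for (int i = 0; i < 5; i++) {
            System.out.println(sparse1(rng) + "\t" + sparse2(rng) + "\t" + gaussian(rng));
        }
    }
}
```

Both sparse distributions have zero mean and unit variance, like the standard normal, but they need no floating-point sampling beyond a uniform draw, and about two thirds of the Sparse1 entries are zero.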
Field Summary | |
---|---|
static int | GAUSSIAN: distribution type Gaussian. |
static int | SPARSE1: distribution type Sparse1. |
static int | SPARSE2: distribution type Sparse2. |
static Tag[] | TAGS_DSTRS_TYPE: the types of distributions that can be used for calculating the random matrix. |
Constructor Summary | |
---|---|
RandomProjection() | |
Method Summary | |
---|---|
boolean | batchFinished(): Signifies that this batch of input to the filter is finished. |
java.lang.String | distributionTipText(): Returns the tip text for this property. |
Capabilities | getCapabilities(): Returns the Capabilities of this filter. |
SelectedTag | getDistribution(): Returns the current distribution that will be used for calculating the random matrix. |
int | getNumberOfAttributes(): Gets the number of attributes (dimensionality) to which the data will be reduced. |
java.lang.String[] | getOptions(): Gets the current settings of the filter. |
double | getPercent(): Gets the percentage of attributes (dimensions) to which the data will be reduced. |
long | getRandomSeed(): Gets the random seed of the random number generator. |
boolean | getReplaceMissingValues(): Gets the current setting for using the ReplaceMissingValues filter. |
java.lang.String | getRevision(): Returns the revision string. |
TechnicalInformation | getTechnicalInformation(): Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on. |
java.lang.String | globalInfo(): Returns a string describing this filter. |
boolean | input(Instance instance): Inputs an instance for filtering. |
java.util.Enumeration | listOptions(): Returns an enumeration describing the available options. |
static void | main(java.lang.String[] argv): Main method for testing this class. |
java.lang.String | numberOfAttributesTipText(): Returns the tip text for this property. |
java.lang.String | percentTipText(): Returns the tip text for this property. |
java.lang.String | randomSeedTipText(): Returns the tip text for this property. |
java.lang.String | replaceMissingValuesTipText(): Returns the tip text for this property. |
void | setDistribution(SelectedTag newDstr): Sets the distribution to use for calculating the random matrix. |
boolean | setInputFormat(Instances instanceInfo): Sets the format of the input instances. |
void | setNumberOfAttributes(int newAttNum): Sets the number of attributes (dimensions) to which the data should be reduced. |
void | setOptions(java.lang.String[] options): Parses a given list of options. |
void | setPercent(double newPercent): Sets the percentage of attributes (dimensions) to which the data should be reduced. |
void | setRandomSeed(long seed): Sets the random seed of the random number generator. |
void | setReplaceMissingValues(boolean t): Sets whether to use the ReplaceMissingValues filter. |
Methods inherited from class weka.filters.Filter |
---|
batchFilterFile, filterFile, getCapabilities, getOutputFormat, isFirstBatchDone, isNewBatch, isOutputFormatDefined, makeCopies, makeCopy, numPendingOutput, output, outputPeek, toString, useFilter, wekaStaticWrapper |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
public static final int SPARSE1
public static final int SPARSE2
public static final int GAUSSIAN
public static final Tag[] TAGS_DSTRS_TYPE
Constructor Detail |
---|
public RandomProjection()
Method Detail |
---|
public java.util.Enumeration listOptions()
Specified by:
  listOptions in interface OptionHandler
public void setOptions(java.lang.String[] options) throws java.lang.Exception
-N <number> The number of dimensions (attributes) the data should be reduced to (default 10; exclusive of the class attribute, if it is set).
-D [SPARSE1|SPARSE2|GAUSSIAN] The distribution to use for calculating the random matrix. Sparse1 is: sqrt(3)*{-1 with prob(1/6), 0 with prob(2/3), +1 with prob(1/6)}. Sparse2 is: {-1 with prob(1/2), +1 with prob(1/2)}.
-P <percent> The percentage of dimensions (attributes) the data should be reduced to (exclusive of the class attribute, if it is set). The -N option is ignored if this option is present or is greater than zero.
-M Replace missing values using the ReplaceMissingValues filter
-R <num> The random seed for the random number generator used for calculating the random matrix (default 42).
Specified by:
  setOptions in interface OptionHandler
Parameters:
  options - the list of options as an array of strings
Throws:
  java.lang.Exception - if an option is not supported
public java.lang.String[] getOptions()
Specified by:
  getOptions in interface OptionHandler
public java.lang.String globalInfo()
public TechnicalInformation getTechnicalInformation()
Specified by:
  getTechnicalInformation in interface TechnicalInformationHandler
public java.lang.String numberOfAttributesTipText()
public void setNumberOfAttributes(int newAttNum)
Parameters:
  newAttNum - the target number of dimensions
public int getNumberOfAttributes()
public java.lang.String percentTipText()
public void setPercent(double newPercent)
Parameters:
  newPercent - the percentage of attributes
public double getPercent()
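The interplay between the percentage and the absolute number of attributes can be sketched as follows. This is an assumption-laden illustration rather than the filter's actual logic: the helper `targetDimensions` is hypothetical, and both the truncation of fractional results and the clamping to the range from 1 to the number of input attributes are assumed behaviours that the documentation does not spell out.

```java
// Hypothetical illustration only; not the Weka source.
final class TargetDimensionSketch {

    /**
     * @param numInputAttributes attribute count excluding the class attribute
     * @param n                  value of -N (absolute number of dimensions)
     * @param percent            value of -P, or 0 when it is not set
     * @return the assumed number of output dimensions
     */
    static int targetDimensions(int numInputAttributes, int n, double percent) {
        // A percentage greater than zero takes precedence over -N.
        int k = (percent > 0)
                ? (int) (numInputAttributes * percent / 100.0) // assumed: truncate
                : n;
        // Assumed: keep at least one dimension and never more than the input has.
        return Math.max(1, Math.min(k, numInputAttributes));
    }

    public static void main(String[] args) {
        System.out.println(targetDimensions(200, 10, 0.0));  // -N only
        System.out.println(targetDimensions(200, 10, 25.0)); // -P overrides -N
    }
}
```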
public java.lang.String randomSeedTipText()
public void setRandomSeed(long seed)
Parameters:
  seed - the random seed value
public long getRandomSeed()
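A fixed seed matters because it makes the random matrix, and therefore the projected data, reproducible between runs. The sketch below shows the general principle with `java.util.Random`. Whether the filter draws its matrix in exactly this way is an assumption, and `SeedReproducibilitySketch` and `draw` are hypothetical names.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration only; not the Weka source.
final class SeedReproducibilitySketch {

    /** Draws `count` Gaussian values from a generator initialised with `seed`. */
    static double[] draw(long seed, int count) {
        Random rng = new Random(seed);
        double[] values = new double[count];
        for (int i = 0; i < count; i++) {
            values[i] = rng.nextGaussian();
        }
        return values;
    }

    public static void main(String[] args) {
        // The same seed yields an identical sequence of draws.
        System.out.println(Arrays.equals(draw(42L, 5), draw(42L, 5))); // prints "true"
    }
}
```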
public java.lang.String distributionTipText()
public void setDistribution(SelectedTag newDstr)
Parameters:
  newDstr - the distribution to use
public SelectedTag getDistribution()
public java.lang.String replaceMissingValuesTipText()
public void setReplaceMissingValues(boolean t)
Parameters:
  t - if true, the ReplaceMissingValues filter is used
public boolean getReplaceMissingValues()
public Capabilities getCapabilities()
Specified by:
  getCapabilities in interface CapabilitiesHandler
Overrides:
  getCapabilities in class Filter
Capabilities
public boolean setInputFormat(Instances instanceInfo) throws java.lang.Exception
Overrides:
  setInputFormat in class Filter
Parameters:
  instanceInfo - an Instances object containing the input instance structure (any instances contained in the object are ignored; only the structure is required).
Throws:
  java.lang.Exception - if the input format can't be set successfully
public boolean input(Instance instance) throws java.lang.Exception
Overrides:
  input in class Filter
Parameters:
  instance - the input instance
Throws:
  java.lang.IllegalStateException - if no input format has been set
  java.lang.NullPointerException - if the input format has not been defined.
  java.lang.Exception - if the input instance was not of the correct format or if there was a problem with the filtering.
public boolean batchFinished() throws java.lang.Exception
Overrides:
  batchFinished in class Filter
Throws:
  java.lang.NullPointerException - if no input structure has been defined.
  java.lang.Exception - if there was a problem finishing the batch.
public java.lang.String getRevision()
Specified by:
  getRevision in interface RevisionHandler
public static void main(java.lang.String[] argv)
Parameters:
  argv - should contain arguments to the filter; use -h for help