cc.mallet.types
Class GainRatio
java.lang.Object
   cc.mallet.types.SparseVector
       cc.mallet.types.FeatureVector
           cc.mallet.types.RankedFeatureVector
               cc.mallet.types.GainRatio
- All Implemented Interfaces: 
- AlphabetCarrying, ConstantMatrix, Vector, java.io.Serializable
- public class GainRatio 
- extends RankedFeatureVector
List of features along with their thresholds sorted in descending order of 
 the ratio of (1) information gained by splitting instances on the 
 feature at its associated threshold value, to (2) the split information.
 The calculations do not take instance weights into account.
 To create a GainRatio from an InstanceList, do the following:
 InstanceList ilist = ... 
         ...
 GainRatio gr = GainRatio.createGainRatio(ilist);
 
 J. R. Quinlan
 "Improved Use of Continuous Attributes in C4.5" 
 ftp://ftp.cs.cmu.edu/project/jair/volume4/quinlan96a.ps
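
 The ratio described above can be illustrated with a minimal, self-contained sketch. It is not MALLET's implementation: the class name GainRatioSketch, its method names, and the convention that values less than or equal to the threshold go to the left branch are all assumptions made for illustration. Like the class itself, the sketch ignores instance weights.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of MALLET.
class GainRatioSketch {
    private static final double LOG2 = Math.log(2);

    static double log2(double x) {
        return Math.log(x) / LOG2;
    }

    // Shannon entropy, in bits, of a list of class labels.
    static double entropy(List<Integer> labels) {
        if (labels.isEmpty()) return 0.0;
        Map<Integer, Integer> counts = new HashMap<>();
        for (int y : labels) counts.merge(y, 1, Integer::sum);
        double h = 0.0;
        for (int c : counts.values()) {
            double p = (double) c / labels.size();
            h -= p * log2(p);
        }
        return h;
    }

    // Gain ratio of splitting one feature at one threshold:
    // (information gain) / (split information).
    // Assumption: values <= threshold go left, all others go right.
    static double gainRatio(double[] values, int[] labels, double threshold) {
        List<Integer> all = new ArrayList<>();
        List<Integer> left = new ArrayList<>();
        List<Integer> right = new ArrayList<>();
        for (int i = 0; i < values.length; i++) {
            all.add(labels[i]);
            if (values[i] <= threshold) left.add(labels[i]);
            else right.add(labels[i]);
        }
        // A split that leaves one side empty carries no information.
        if (left.isEmpty() || right.isEmpty()) return 0.0;

        double n = all.size();
        double pLeft = left.size() / n;
        double pRight = right.size() / n;
        double gain = entropy(all) - pLeft * entropy(left) - pRight * entropy(right);
        double splitInfo = -(pLeft * log2(pLeft) + pRight * log2(pRight));
        return gain / splitInfo;
    }
}
```

 For feature values {1, 2, 3, 4} with labels {0, 0, 1, 1}, a threshold of 2.5 separates the classes perfectly and evenly, so both the gain and the split information are 1 bit and the ratio is 1.0. A threshold of 1.5 yields a lower ratio because the split is both less pure and less balanced.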
- Author:
- Gary Huang ghuang@cs.umass.edu
- See Also:
- Serialized Form
 
 
| Field Summary | 
| static double | log2 |
 
 
| Constructor Summary | 
| protected | GainRatio(Alphabet dataAlphabet, double[] gainRatios, double[] splitPoints, double baseEntropy, LabelVector baseLabelDistribution, int numSplitPointsForBestFeature, int minNumInsts) |
 
 
| Methods inherited from class cc.mallet.types.RankedFeatureVector | 
| getIndexAtRank, getMaxValue, getMaxValuedIndex, getMaxValuedIndexIn, getMaxValuedObject, getMaxValuedObjectIn, getMaxValueIn, getObjectAtRank, getRank, getRank, getValueAtRank, printByRank, printByRank, printLowerK, printTopK, set, setRankOrder, setRankOrder, setRankOrder, setReverseRankOrder | 
 
| Methods inherited from class cc.mallet.types.FeatureVector | 
| alphabetsMatch, cloneMatrix, cloneMatrixZeroed, contains, getAlphabet, getAlphabets, getObjectIndices, location, newFeatureVector, toSimpFile, toString, toString, value | 
 
| Methods inherited from class cc.mallet.types.SparseVector | 
| absNorm, addTo, addTo, arrayCopyFrom, arrayCopyFrom, arrayCopyInto, dotProduct, dotProduct, dotProduct, dotProduct, extendedDotProduct, extendedDotProduct, getDimensions, getIndices, getNumDimensions, getValues, incrementValue, indexAtLocation, infinityNorm, isBinary, isInfinite, isNaN, isNaNOrInfinite, location, makeBinary, makeNonBinary, map, numLocations, oneNorm, plusEqualsSparse, plusEqualsSparse, print, removeDuplicates, setAll, setValue, setValueAtLocation, singleIndex, singleSize, singleToIndices, singleValue, sortIndices, timesEquals, timesEqualsSparse, timesEqualsSparse, timesEqualsSparseZero, twoNorm, value, value, valueAtLocation, vectorAdd | 
 
| Methods inherited from class java.lang.Object | 
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait | 
 
log2
public static final double log2
GainRatio
protected GainRatio(Alphabet dataAlphabet,
                    double[] gainRatios,
                    double[] splitPoints,
                    double baseEntropy,
                    LabelVector baseLabelDistribution,
                    int numSplitPointsForBestFeature,
                    int minNumInsts)
calcGainRatios
protected static java.lang.Object[] calcGainRatios(InstanceList ilist,
                                                   int[] instIndices,
                                                   int minNumInsts)
- Calculates gain ratios for all (feature, split point) pairs 
 and returns an array of:
   1.  gain ratios (each element is the max gain ratio of a feature 
 for those split points with at least average gain)
   2.  the optimal split point for each feature
   3.  the overall entropy 
   4.  the overall label distribution of the given instances
   5.  the number of split points of the split feature.
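
 Item 1 above restricts the choice to split points with at least average gain. The following sketch shows one way to read that rule for a single feature. It is an illustration only, not MALLET's code: the class AverageGainFilterSketch, the method bestSplit, and the choice to average over exactly the candidate split points passed in are assumptions. A small tolerance is used in the comparison so that floating-point rounding of the mean does not exclude candidates whose gain equals the average.

```java
// Hypothetical helper, not part of MALLET.
class AverageGainFilterSketch {
    // Tolerance for comparing a gain against the rounded mean gain.
    private static final double EPS = 1e-12;

    // gains[i] and gainRatios[i] describe the i-th candidate split point
    // of one feature. Returns the index of the candidate with the highest
    // gain ratio among those whose gain is at least the mean gain,
    // or -1 if there are no candidates.
    static int bestSplit(double[] gains, double[] gainRatios) {
        if (gains.length == 0) return -1;

        double sum = 0.0;
        for (double g : gains) sum += g;
        double avgGain = sum / gains.length;

        int best = -1;
        for (int i = 0; i < gains.length; i++) {
            if (gains[i] < avgGain - EPS) continue; // below-average gain: skip
            if (best == -1 || gainRatios[i] > gainRatios[best]) best = i;
        }
        return best;
    }
}
```

 With gains {0.1, 0.5, 0.6} and gain ratios {0.9, 0.4, 0.3}, the first candidate has the highest ratio but below-average gain, so it is skipped and index 1 is chosen. This is the reason for the filter: a split with very little split information can have an inflated ratio even when its gain is small.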
    
 
 
sortInstances
public static int[] sortInstances(InstanceList ilist,
                                  int[] instIndices,
                                  int featureIndex)
 
createGainRatio
public static GainRatio createGainRatio(InstanceList ilist)
- Constructs a GainRatio object.
 
 
createGainRatio
public static GainRatio createGainRatio(InstanceList ilist,
                                        int[] instIndices,
                                        int minNumInsts)
- Constructs a GainRatio object.
 
 
getMaxValuedThreshold
public double getMaxValuedThreshold()
- Returns:
- the threshold of the (feature, threshold) 
 pair with the maximum gain ratio
 
getThresholdAtRank
public double getThresholdAtRank(int rank)
- Returns:
- the threshold of the (feature, threshold)
 pair with the given rank
 
getBaseEntropy
public double getBaseEntropy()
 
getBaseLabelDistribution
public LabelVector getBaseLabelDistribution()
 
getNumSplitPointsForBestFeature
public int getNumSplitPointsForBestFeature()