cc.mallet.types
Class FeatureSequence

java.lang.Object
  extended by cc.mallet.types.FeatureSequence
All Implemented Interfaces:
AlphabetCarrying, Sequence, java.io.Serializable
Direct Known Subclasses:
FeatureSequenceWithBigrams, LabelSequence

public class FeatureSequence
extends java.lang.Object
implements Sequence, java.io.Serializable, AlphabetCarrying

An implementation of Sequence that ensures that every Object in the sequence has the same class. Feature sequences are mutable and expand as new objects are added.
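
To illustrate what "expand as new objects are added" can look like, here is a minimal stand-alone sketch of a growable index buffer. It is not MALLET's source code: the class name GrowableIndexSequence, the doubling policy, and the capacity clamp are all assumptions made for illustration. Only the method names add, size, getIndexAtPosition, growIfNecessary and toFeatureIndexSequence are borrowed from the summary on this page.

```java
import java.util.Arrays;

// Hypothetical stand-in, not MALLET code: an int buffer that grows on demand.
class GrowableIndexSequence {
    private int[] features; // backing storage; may be longer than the logical length
    private int length;     // number of slots actually in use

    GrowableIndexSequence(int capacity) {
        // Assumption: clamp to at least 1 so that doubling always makes progress.
        features = new int[Math.max(1, capacity)];
    }

    // Assumed policy: double the backing array once it is full.
    private void growIfNecessary() {
        if (length == features.length) {
            features = Arrays.copyOf(features, features.length * 2);
        }
    }

    void add(int featureIndex) {
        growIfNecessary();
        features[length++] = featureIndex;
    }

    int size() {
        return length;
    }

    int getIndexAtPosition(int pos) {
        if (pos < 0 || pos >= length) {
            throw new IndexOutOfBoundsException("pos=" + pos + ", length=" + length);
        }
        return features[pos];
    }

    // Returns a copy of the in-use prefix, so unused capacity never leaks out.
    int[] toFeatureIndexSequence() {
        return Arrays.copyOf(features, length);
    }
}
```

In this sketch the logical length is tracked separately from the array length, which is why the copy-out method trims to the in-use prefix.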

Author:
Andrew McCallum mccallum@cs.umass.edu
See Also:
Serialized Form

Constructor Summary
FeatureSequence(Alphabet dict)
           
FeatureSequence(Alphabet dict, int capacity)
           
FeatureSequence(Alphabet dict, int[] features)
          Creates a FeatureSequence given all of the objects in the sequence.
FeatureSequence(Alphabet dict, int[] features, int len)
           
 
Method Summary
 void add(int featureIndex)
           
 void add(java.lang.Object key)
           
 void addFeatureWeightsTo(double[] weights)
           
 void addFeatureWeightsTo(double[] weights, double scale)
           
 boolean alphabetsMatch(AlphabetCarrying object)
           
 java.lang.Object get(int pos)
           
 Alphabet getAlphabet()
           
 Alphabet[] getAlphabets()
           
 int[] getFeatures()
           
 int getIndexAtPosition(int pos)
           
 int getLength()
           
 java.lang.Object getObjectAtPosition(int pos)
           
protected  void growIfNecessary()
           
 void prune(double[] counts, Alphabet newAlphabet, int cutoff)
          Removes features from the sequence that occur fewer than cutoff times in the corpus, as indicated by the provided counts.
 int size()
           
 int[] toFeatureIndexSequence()
           
 int[] toSortedFeatureIndexSequence()
           
 java.lang.String toString()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

FeatureSequence

public FeatureSequence(Alphabet dict,
                       int[] features)
Creates a FeatureSequence given all of the objects in the sequence.

Parameters:
dict - A dictionary that maps objects in the sequence to numeric indices.
features - An array where features[i] gives the index in dict of the ith element of the sequence.
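
The relationship between dict and features can be sketched without MALLET on the classpath. The code below is a stand-alone illustration under stated assumptions: TinyAlphabet and IndexEncodingDemo are hypothetical names, entries are restricted to String, and indices are assigned in order of first appearance. It shows only the idea that features[i] is the dictionary index of the ith element.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a dictionary; MALLET's Alphabet is not used here.
class TinyAlphabet {
    private final Map<String, Integer> indexOf = new HashMap<>();
    private final List<String> entries = new ArrayList<>();

    // Returns the index of entry, assigning the next free index on first sight.
    int lookupIndex(String entry) {
        Integer idx = indexOf.get(entry);
        if (idx == null) {
            idx = entries.size();
            indexOf.put(entry, idx);
            entries.add(entry);
        }
        return idx;
    }

    // Inverse mapping: index back to the stored entry.
    String lookupObject(int index) {
        return entries.get(index);
    }
}

class IndexEncodingDemo {
    // Builds the index array: features[i] is the index in dict of tokens[i].
    static int[] encode(TinyAlphabet dict, String[] tokens) {
        int[] features = new int[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            features[i] = dict.lookupIndex(tokens[i]);
        }
        return features;
    }

    // Recovers the ith element of the sequence from features[i].
    static String[] decode(TinyAlphabet dict, int[] features) {
        String[] tokens = new String[features.length];
        for (int i = 0; i < features.length; i++) {
            tokens[i] = dict.lookupObject(features[i]);
        }
        return tokens;
    }
}
```

With this layout, repeated elements share one dictionary entry, so the sequence itself only stores small integers.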

FeatureSequence

public FeatureSequence(Alphabet dict,
                       int[] features,
                       int len)

FeatureSequence

public FeatureSequence(Alphabet dict,
                       int capacity)

FeatureSequence

public FeatureSequence(Alphabet dict)

Method Detail

getFeatures

public int[] getFeatures()

getAlphabet

public Alphabet getAlphabet()
Specified by:
getAlphabet in interface AlphabetCarrying

getAlphabets

public Alphabet[] getAlphabets()
Specified by:
getAlphabets in interface AlphabetCarrying

alphabetsMatch

public boolean alphabetsMatch(AlphabetCarrying object)

getLength

public final int getLength()

size

public final int size()
Specified by:
size in interface Sequence

getIndexAtPosition

public final int getIndexAtPosition(int pos)

getObjectAtPosition

public java.lang.Object getObjectAtPosition(int pos)

get

public java.lang.Object get(int pos)
Specified by:
get in interface Sequence

toString

public java.lang.String toString()
Overrides:
toString in class java.lang.Object

growIfNecessary

protected void growIfNecessary()

add

public void add(int featureIndex)

add

public void add(java.lang.Object key)

addFeatureWeightsTo

public void addFeatureWeightsTo(double[] weights)

addFeatureWeightsTo

public void addFeatureWeightsTo(double[] weights,
                                double scale)

toFeatureIndexSequence

public int[] toFeatureIndexSequence()

toSortedFeatureIndexSequence

public int[] toSortedFeatureIndexSequence()

prune

public void prune(double[] counts,
                  Alphabet newAlphabet,
                  int cutoff)
Removes features from the sequence that occur fewer than cutoff times in the corpus, as indicated by the provided counts. Also swaps in the new, reduced alphabet. This method alters the instance in place, so it is not appropriate if the original instance is still needed.
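
The pruning step described above can be sketched in plain Java. This is an illustration under stated assumptions, not MALLET's implementation: PruneSketch and pruneFeatures are hypothetical names, the alphabets are modeled as a List (old index to entry) and a Map (entry to new index), counts is assumed to be indexed by old feature index, and a feature is kept when its count is at least cutoff. Unlike prune, the sketch returns a new array instead of modifying an instance in place.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, not MALLET code: count-based pruning with index remapping.
class PruneSketch {
    static int[] pruneFeatures(int[] features, double[] counts,
                               List<String> oldEntries,
                               Map<String, Integer> newIndexOf,
                               int cutoff) {
        // First pass: count survivors so the result array can be sized exactly.
        int newLength = 0;
        for (int f : features) {
            if (counts[f] >= cutoff) {
                newLength++;
            }
        }

        // Second pass: copy survivors in their original order, translating each
        // old index to its index in the reduced alphabet.
        int[] pruned = new int[newLength];
        int next = 0;
        for (int f : features) {
            if (counts[f] >= cutoff) {
                String entry = oldEntries.get(f);
                Integer newIndex = newIndexOf.get(entry);
                if (newIndex == null) {
                    // Assumption: the reduced alphabet must contain every survivor.
                    throw new IllegalArgumentException("missing from new alphabet: " + entry);
                }
                pruned[next++] = newIndex;
            }
        }
        return pruned;
    }
}
```

Because surviving features are re-indexed against the reduced alphabet, indices stored before pruning are no longer comparable with indices stored after it.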