java.lang.Object cc.mallet.pipe.Pipe
public abstract class Pipe
The abstract superclass of all Pipes, which transform one data type to another. Pipes are most often used for feature extraction.
Although Pipe does not have any "abstract methods", in order to use a Pipe subclass
you must override either the pipe method or the newIteratorFrom method.
The former is appropriate when the pipe's processing of an Instance is strictly
one-to-one: for every Instance coming in, there is exactly one Instance coming out.
The latter is appropriate when the pipe's processing may result in more or fewer
Instances than arrive through its source iterator.
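The choice between the two override points can be illustrated with a minimal, self-contained sketch. It does not use the real MALLET classes: SketchInstance, SketchPipe, LowercasePipe and DropEmptyPipe are hypothetical stand-ins invented for this example, and the eager filtering in DropEmptyPipe is a simplification of what a lazy iterator would do.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for an Instance: only a mutable data field.
class SketchInstance {
    Object data;
    SketchInstance(Object data) { this.data = data; }
}

// Hypothetical stand-in for Pipe: subclasses override pipe() or newIteratorFrom().
abstract class SketchPipe {
    // One-to-one hook; unsupported unless a subclass overrides it.
    SketchInstance pipe(SketchInstance inst) {
        throw new UnsupportedOperationException("override pipe or newIteratorFrom");
    }

    // Default iterator: applies pipe() to each incoming instance.
    Iterator<SketchInstance> newIteratorFrom(final Iterator<SketchInstance> source) {
        return new Iterator<SketchInstance>() {
            public boolean hasNext() { return source.hasNext(); }
            public SketchInstance next() { return pipe(source.next()); }
        };
    }
}

// One-to-one: exactly one instance out for each instance in.
class LowercasePipe extends SketchPipe {
    @Override
    SketchInstance pipe(SketchInstance inst) {
        inst.data = inst.data.toString().toLowerCase();
        return inst;
    }
}

// Not one-to-one: instances with empty data are dropped, so fewer come out.
class DropEmptyPipe extends SketchPipe {
    @Override
    Iterator<SketchInstance> newIteratorFrom(Iterator<SketchInstance> source) {
        List<SketchInstance> kept = new ArrayList<>();  // eager, for brevity
        while (source.hasNext()) {
            SketchInstance inst = source.next();
            if (!inst.data.toString().isEmpty()) kept.add(inst);
        }
        return kept.iterator();
    }
}

class PipeOverrideSketch {
    // Pulls every input through the pipe and collects the resulting data fields.
    static List<Object> run(SketchPipe p, String... inputs) {
        List<SketchInstance> in = new ArrayList<>();
        for (String s : inputs) in.add(new SketchInstance(s));
        List<Object> out = new ArrayList<>();
        Iterator<SketchInstance> it = p.newIteratorFrom(in.iterator());
        while (it.hasNext()) out.add(it.next().data);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run(new LowercasePipe(), "Foo", "BAR"));
        System.out.println(run(new DropEmptyPipe(), "a", "", "b"));
    }
}
```

In this sketch the one-to-one subclass never touches the iterator, while the filtering subclass never defines pipe at all.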
A pipe operates on an Instance, which is a carrier of data.
A pipe reads from and writes to fields in the Instance when it is requested
to process the instance. It is up to the pipe which fields in the Instance it
reads from and writes to, but usually a pipe will read its input from and write
its output to the "data" field of an instance.
A pipe doesn't have any direct notion of input or output; it merely modifies instances
that are handed to it. A set of helper classes, which implement the interface Iterator,
iterate over commonly encountered input data structures and feed the elements of these
data structures to a pipe as instances.
A pipe is frequently used in conjunction with an InstanceList. As instances are added
to the list, they are processed by the pipe associated with the instance list, and
the processed Instance is kept in the list.
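The "process on add" behaviour described above can be sketched with a tiny generic container. PipedListSketch is a hypothetical class written for this illustration, with a UnaryOperator standing in for the pipe; it is not the InstanceList API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical miniature of a piped list: every added item is first passed
// through the associated pipe, and only the processed result is kept.
class PipedListSketch<T> {
    private final UnaryOperator<T> pipe;                  // stands in for a Pipe
    private final List<T> processed = new ArrayList<>();

    PipedListSketch(UnaryOperator<T> pipe) { this.pipe = pipe; }

    // Processes one raw item and stores the result.
    void add(T raw) { processed.add(pipe.apply(raw)); }

    // Drains an iterator of raw items through the pipe.
    void addAll(Iterator<T> source) {
        while (source.hasNext()) add(source.next());
    }

    List<T> contents() { return processed; }
}

class PipedListDemo {
    public static void main(String[] args) {
        PipedListSketch<String> list = new PipedListSketch<>(String::trim);
        list.add("  a ");
        list.addAll(Arrays.asList("b ", " c").iterator());
        System.out.println(list.contents());
    }
}
```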
In one common usage, a FileIterator is given a list of directories to operate over.
The FileIterator walks through each directory, creating an instance for each
file and putting the data from the file in the data field of the instance.
The directory of the file is stored in the target field of the instance. The
FileIterator feeds instances to an InstanceList, which processes the instances through
its associated pipe and keeps the results.
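The directory walk described above might look roughly like the following, using only java.nio.file. FileInstanceSketch and DirectoryWalkSketch are hypothetical names for this example, and reading each file into a String is an assumption about what "the data from the file" means here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical carrier with the two fields the text mentions.
class FileInstanceSketch {
    final String data;    // contents of the file
    final String target;  // name of the directory that held the file
    FileInstanceSketch(String data, String target) { this.data = data; this.target = target; }
}

class DirectoryWalkSketch {
    // Creates one instance per regular file found directly inside each directory.
    static List<FileInstanceSketch> instancesFrom(List<Path> directories) throws IOException {
        List<FileInstanceSketch> out = new ArrayList<>();
        for (Path dir : directories) {
            try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
                for (Path file : files) {
                    if (!Files.isRegularFile(file)) continue;
                    String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
                    out.add(new FileInstanceSketch(text, dir.getFileName().toString()));
                }
            }
        }
        return out;
    }

    // Builds two one-file directories in a temp folder and walks them.
    static List<String> demo() {
        try {
            Path root = Files.createTempDirectory("pipe-sketch");
            Path spam = Files.createDirectory(root.resolve("spam"));
            Path ham = Files.createDirectory(root.resolve("ham"));
            Files.write(spam.resolve("1.txt"), "buy now".getBytes(StandardCharsets.UTF_8));
            Files.write(ham.resolve("1.txt"), "see you at noon".getBytes(StandardCharsets.UTF_8));
            List<String> lines = new ArrayList<>();
            for (FileInstanceSketch inst : instancesFrom(Arrays.asList(spam, ham))) {
                lines.add(inst.target + " <- " + inst.data);
            }
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String line : demo()) System.out.println(line);
    }
}
```

Using the directory name as the target is what lets a folder-per-class layout double as labelled training data.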
Pipes can be hierarchically composed. In a typical usage, a SerialPipes is created, which holds other pipes in an ordered list. Piping an instance through a SerialPipes means piping the instance through each of the child pipes in sequence.
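As a rough illustration of hierarchical composition, the sketch below chains stages in order and nests one chain inside another. SerialSketch is a hypothetical class built on java.util.function.UnaryOperator and is not the SerialPipes implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical composite stage: holds child stages in an ordered list and
// applies them one after another. Being a stage itself, it can be nested.
class SerialSketch implements UnaryOperator<String> {
    private final List<UnaryOperator<String>> children = new ArrayList<>();

    // Appends a child stage and returns this composite for chaining.
    SerialSketch then(UnaryOperator<String> child) {
        children.add(child);
        return this;
    }

    @Override
    public String apply(String input) {
        String current = input;
        for (UnaryOperator<String> child : children) {
            current = child.apply(current);
        }
        return current;
    }
}

class SerialSketchDemo {
    // Outer chain: trim, then an inner chain (lowercase, then underscores).
    static SerialSketch build() {
        SerialSketch normalize = new SerialSketch()
                .then(String::toLowerCase)
                .then(s -> s.replace(' ', '_'));
        return new SerialSketch().then(String::trim).then(normalize);
    }

    public static void main(String[] args) {
        System.out.println(build().apply("  Hello World "));
    }
}
```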
A pipe holds two separate Alphabets: one for the symbols (feature names) encountered in the data fields of the instances processed through the pipe, and one for the symbols (e.g. class labels) encountered in the target fields.
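To make the two-alphabet idea concrete, here is a small sketch in which feature names and class labels are indexed by separate symbol tables. SymbolTableSketch and TwoAlphabetSketch are hypothetical helpers for this example and only approximate what an Alphabet does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical symbol table: assigns each new symbol the next free index.
class SymbolTableSketch {
    private final Map<String, Integer> indexBySymbol = new HashMap<>();
    private final List<String> symbols = new ArrayList<>();

    // Returns the symbol's index, adding the symbol if it has not been seen.
    int lookupIndex(String symbol) {
        Integer idx = indexBySymbol.get(symbol);
        if (idx == null) {
            idx = symbols.size();
            indexBySymbol.put(symbol, idx);
            symbols.add(symbol);
        }
        return idx;
    }

    int size() { return symbols.size(); }
}

// Keeps feature names and class labels in separate tables, so that
// index 0 in one table is unrelated to index 0 in the other.
class TwoAlphabetSketch {
    final SymbolTableSketch dataAlphabet = new SymbolTableSketch();
    final SymbolTableSketch targetAlphabet = new SymbolTableSketch();

    // Returns the feature indices of the tokens, followed by the label index
    // as the last element.
    int[] encode(String[] tokens, String label) {
        int[] out = new int[tokens.length + 1];
        for (int i = 0; i < tokens.length; i++) {
            out[i] = dataAlphabet.lookupIndex(tokens[i]);
        }
        out[tokens.length] = targetAlphabet.lookupIndex(label);
        return out;
    }

    public static void main(String[] args) {
        TwoAlphabetSketch p = new TwoAlphabetSketch();
        System.out.println(java.util.Arrays.toString(p.encode(new String[] {"buy", "now", "buy"}, "spam")));
        System.out.println(java.util.Arrays.toString(p.encode(new String[] {"see", "now"}, "ham")));
    }
}
```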
Constructor Summary | |
---|---|
Pipe() | Construct a pipe with no data and target dictionaries. |
Pipe(Alphabet dataDict, Alphabet targetDict) | Construct pipe with data and target dictionaries. |
Method Summary | |
---|---|
boolean | alphabetsMatch(AlphabetCarrying object) |
Alphabet | getAlphabet() |
Alphabet[] | getAlphabets() |
Alphabet | getDataAlphabet() |
java.rmi.dgc.VMID | getInstanceId() |
Alphabet | getTargetAlphabet() |
Instance | instanceFrom(Instance inst) |
Instance[] | instancesFrom(Instance inst) |
Instance[] | instancesFrom(java.util.Iterator<Instance> source) - A convenience method that will pull all instances from source through this pipe, and return the results as an array. |
boolean | isDataAlphabetSet() |
boolean | isTargetProcessing() - Return true iff this pipe expects and processes information in the target slot. |
java.util.Iterator<Instance> | newIteratorFrom(java.util.Iterator<Instance> source) - Given an InstanceIterator, return a new InstanceIterator whose instances have also been processed by this pipe. |
Instance | pipe(Instance inst) - Really this should be 'protected', but isn't for historical reasons. |
protected void | preceedingPipeDataAlphabetNotification(Alphabet a) |
protected void | preceedingPipeTargetAlphabetNotification(Alphabet a) |
boolean | precondition(Instance inst) - Each instance processed is tested by this method. |
java.lang.Object | readResolve() - This gets called after readObject; it lets the object decide whether to return itself or return a previously read in version. |
void | setDataAlphabet(Alphabet dDict) |
void | setOrCheckDataAlphabet(Alphabet a) |
void | setOrCheckTargetAlphabet(Alphabet a) |
void | setTargetAlphabet(Alphabet tDict) |
void | setTargetProcessing(boolean lookForAndProcessTarget) - Set whether input is taken from target field of instance during processing. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public Pipe()
public Pipe(Alphabet dataDict, Alphabet targetDict)
dataDict - Alphabet that will be used as the data dictionary.
targetDict - Alphabet that will be used as the target dictionary.
Method Detail |
---|
public boolean precondition(Instance inst)
SerialPipes sp = new SerialPipes (new Pipe[] {
    new CharSequence2TokenSequence() {
        public boolean precondition (Instance inst) { return inst.getData() instanceof CharSequence; }
    },
    new TokenSequence2FeatureSequence(),
});
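The idea of guarding a stage with a per-instance test can be sketched independently of MALLET. GuardedStageSketch below is a hypothetical helper, and it rests on an assumption the text here does not state: that an input failing the test is handed on unchanged rather than dropped.

```java
import java.util.Arrays;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

class GuardedStageSketch {
    // Wraps a stage so that it only runs when the guard accepts the input.
    // ASSUMPTION: inputs rejected by the guard are returned unchanged.
    static UnaryOperator<Object> guarded(Predicate<Object> precondition,
                                         UnaryOperator<Object> stage) {
        return input -> precondition.test(input) ? stage.apply(input) : input;
    }

    // Splits text on whitespace, but leaves non-text data alone.
    static final UnaryOperator<Object> TOKENIZE_TEXT_ONLY = guarded(
            data -> data instanceof CharSequence,
            data -> Arrays.asList(data.toString().split("\\s+")));

    public static void main(String[] args) {
        System.out.println(TOKENIZE_TEXT_ONLY.apply("a b"));  // tokenized
        System.out.println(TOKENIZE_TEXT_ONLY.apply(42));     // passed through
    }
}
```

Putting the test next to the stage it protects keeps a mixed stream of inputs from failing inside a stage that only understands one data type.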
public Instance pipe(Instance inst)
public java.util.Iterator<Instance> newIteratorFrom(java.util.Iterator<Instance> source)
skipIfFalse(Instance) method.
public Instance[] instancesFrom(java.util.Iterator<Instance> source)
public Instance[] instancesFrom(Instance inst)
public Instance instanceFrom(Instance inst)
public void setTargetProcessing(boolean lookForAndProcessTarget)
public boolean isTargetProcessing()
public Alphabet getDataAlphabet()
public Alphabet getTargetAlphabet()
public Alphabet getAlphabet()
Specified by: getAlphabet in interface AlphabetCarrying
public Alphabet[] getAlphabets()
Specified by: getAlphabets in interface AlphabetCarrying
public boolean alphabetsMatch(AlphabetCarrying object)
public void setDataAlphabet(Alphabet dDict)
public boolean isDataAlphabetSet()
public void setOrCheckDataAlphabet(Alphabet a)
public void setTargetAlphabet(Alphabet tDict)
public void setOrCheckTargetAlphabet(Alphabet a)
protected void preceedingPipeDataAlphabetNotification(Alphabet a)
protected void preceedingPipeTargetAlphabetNotification(Alphabet a)
public java.rmi.dgc.VMID getInstanceId()
public java.lang.Object readResolve() throws java.io.ObjectStreamException
Throws: java.io.ObjectStreamException
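The general readResolve technique, returning a previously known object instead of a freshly deserialized copy, can be sketched as follows. CanonicalStageSketch is a hypothetical class; it uses java.util.UUID as the identifier in place of the java.rmi.dgc.VMID returned by getInstanceId, and the static registry is an assumption of this example rather than a description of how Pipe stores its instances.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

class CanonicalStageSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    // Every instance known to this JVM, keyed by id (hypothetical registry).
    private static final Map<UUID, CanonicalStageSketch> KNOWN = new ConcurrentHashMap<>();

    // UUID swapped in for VMID purely to keep the sketch small.
    private final UUID id = UUID.randomUUID();

    CanonicalStageSketch() { KNOWN.put(id, this); }

    // Called by Java serialization after the object's fields have been read.
    // Returns the already-known object with the same id, if there is one.
    private Object readResolve() throws ObjectStreamException {
        CanonicalStageSketch previous = KNOWN.putIfAbsent(id, this);
        return previous != null ? previous : this;
    }

    // Serializes an object to memory and reads it straight back.
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        CanonicalStageSketch original = new CanonicalStageSketch();
        System.out.println(roundTrip(original) == original);
    }
}
```

With this pattern, two serialized objects that both referred to the same stage end up sharing one stage again after they are read back.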