cc.mallet.pipe.iterator
Class CsvIterator
java.lang.Object
cc.mallet.pipe.iterator.CsvIterator
- All Implemented Interfaces:
- java.util.Iterator<Instance>
public class CsvIterator
- extends java.lang.Object
- implements java.util.Iterator<Instance>
This iterator, perhaps more properly called a Line Pattern Iterator,
reads through a file and returns one instance per line,
based on a regular expression.
If you have data of the form
[name] [label] [data]
and a Pipe instancePipe, you could read instances using this code:
InstanceList instances = new InstanceList(instancePipe);
instances.addThruPipe(new CsvIterator(new FileReader(dataFile),
                                      "(\\w+)\\s+(\\w+)\\s+(.*)",
                                      3, 2, 1)  // (data, target, name) field indices
                     );
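To make the group indices concrete, the following sketch applies the same regular expression to a single line using only java.util.regex. It is an illustration, not the CsvIterator implementation: the class name LinePatternDemo and the method split are hypothetical, and requiring the whole line to match (Matcher.matches) is an assumption made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of MALLET.
class LinePatternDemo {
    // Same line regex as in the example above: name, label, then the rest of the line.
    static final Pattern LINE = Pattern.compile("(\\w+)\\s+(\\w+)\\s+(.*)");

    /**
     * Returns {data, target, name} taken from the given capture groups,
     * or null if the line does not match the pattern.
     * Assumption: the whole line must match.
     */
    static String[] split(String line, int dataGroup, int targetGroup, int uriGroup) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(dataGroup), m.group(targetGroup), m.group(uriGroup) };
    }

    public static void main(String[] args) {
        // Group 3 is the data, group 2 the target (label), group 1 the name.
        String[] fields = split("doc1 spam buy cheap pills now", 3, 2, 1);
        System.out.println("name=" + fields[2] + " target=" + fields[1] + " data=" + fields[0]);
    }
}
```

With the arguments 3, 2, 1, the third capture group supplies the data, the second the target, and the first the name, which matches the [name] [label] [data] layout described above.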
Constructor Summary
CsvIterator(java.io.Reader input, java.util.regex.Pattern lineRegex, int dataGroup, int targetGroup, int uriGroup)
CsvIterator(java.io.Reader input, java.lang.String lineRegex, int dataGroup, int targetGroup, int uriGroup)
CsvIterator(java.lang.String filename, java.lang.String lineRegex, int dataGroup, int targetGroup, int uriGroup)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
CsvIterator
public CsvIterator(java.io.Reader input,
java.util.regex.Pattern lineRegex,
int dataGroup,
int targetGroup,
int uriGroup)
CsvIterator
public CsvIterator(java.io.Reader input,
java.lang.String lineRegex,
int dataGroup,
int targetGroup,
int uriGroup)
CsvIterator
public CsvIterator(java.lang.String filename,
java.lang.String lineRegex,
int dataGroup,
int targetGroup,
int uriGroup)
throws java.io.FileNotFoundException
- Throws: java.io.FileNotFoundException
next
public Instance next()
- Specified by: next in interface java.util.Iterator<Instance>
hasNext
public boolean hasNext()
- Specified by: hasNext in interface java.util.Iterator<Instance>
remove
public void remove()
- Specified by: remove in interface java.util.Iterator<Instance>