cc.mallet.pipe.iterator
Class CsvIterator

java.lang.Object
  extended by cc.mallet.pipe.iterator.CsvIterator
All Implemented Interfaces:
java.util.Iterator<Instance>

public class CsvIterator
extends java.lang.Object
implements java.util.Iterator<Instance>

This iterator, perhaps more properly called a Line Pattern Iterator, reads its input line by line and returns one Instance per line, using the capturing groups of a regular expression to pick out each instance's data, target, and name.

If each line of your data has the form

[name]  [label]  [data]

and you have a Pipe instancePipe, you can read instances with this code:
    InstanceList instances = new InstanceList(instancePipe);

    // Group indices are passed in the order (data, target, name).
    instances.addThruPipe(new CsvIterator(new FileReader(dataFile),
                                          "(\\w+)\\s+(\\w+)\\s+(.*)",
                                          3, 2, 1));
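
To see which part of a line each group index selects, the pattern from the example can be tried on its own with java.util.regex. This is only a sketch: the class LinePatternDemo, its split helper, and the sample line are hypothetical, and the code does not call MALLET at all.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinePatternDemo {
    // Same pattern as in the CsvIterator example: name, label, rest of line.
    static final Pattern LINE = Pattern.compile("(\\w+)\\s+(\\w+)\\s+(.*)");

    /**
     * Returns {data, target, name} for one line, selected by group index,
     * or null if the pattern is not found in the line.
     */
    static String[] split(String line, int dataGroup, int targetGroup, int uriGroup) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] {
            m.group(dataGroup), m.group(targetGroup), m.group(uriGroup)
        };
    }

    public static void main(String[] args) {
        // Hypothetical input line in [name] [label] [data] form.
        String[] fields = split("doc1 spam buy cheap pills now", 3, 2, 1);
        System.out.println("data   = " + fields[0]);
        System.out.println("target = " + fields[1]);
        System.out.println("name   = " + fields[2]);
    }
}
```

With the indices 3, 2, 1, the third group (everything after the label) becomes the data, the second group the target, and the first group the name.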


Constructor Summary
CsvIterator(java.io.Reader input, java.util.regex.Pattern lineRegex, int dataGroup, int targetGroup, int uriGroup)
           
CsvIterator(java.io.Reader input, java.lang.String lineRegex, int dataGroup, int targetGroup, int uriGroup)
           
CsvIterator(java.lang.String filename, java.lang.String lineRegex, int dataGroup, int targetGroup, int uriGroup)
           
 
Method Summary
 boolean hasNext()
           
 Instance next()
           
 void remove()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

CsvIterator

public CsvIterator(java.io.Reader input,
                   java.util.regex.Pattern lineRegex,
                   int dataGroup,
                   int targetGroup,
                   int uriGroup)

CsvIterator

public CsvIterator(java.io.Reader input,
                   java.lang.String lineRegex,
                   int dataGroup,
                   int targetGroup,
                   int uriGroup)

CsvIterator

public CsvIterator(java.lang.String filename,
                   java.lang.String lineRegex,
                   int dataGroup,
                   int targetGroup,
                   int uriGroup)
            throws java.io.FileNotFoundException
Throws:
java.io.FileNotFoundException
Method Detail

next

public Instance next()
Specified by:
next in interface java.util.Iterator<Instance>

hasNext

public boolean hasNext()
Specified by:
hasNext in interface java.util.Iterator<Instance>

remove

public void remove()
Specified by:
remove in interface java.util.Iterator<Instance>
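
The three methods above follow the usual java.util.Iterator contract. As an illustration of that contract for line-oriented input, here is a minimal stand-in that yields one String[] per line. It is a hypothetical class (LineGroupIterator), not MALLET's implementation: the error handling for non-matching lines and the read-only remove() are assumptions of this sketch.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical line pattern iterator; yields {data, target, name} per line. */
public class LineGroupIterator implements Iterator<String[]> {
    private final BufferedReader reader;
    private final Pattern lineRegex;
    private final int dataGroup, targetGroup, uriGroup;
    private String currentLine; // next unread line, or null at end of input

    public LineGroupIterator(Reader input, String lineRegex,
                             int dataGroup, int targetGroup, int uriGroup) {
        this.reader = new BufferedReader(input);
        this.lineRegex = Pattern.compile(lineRegex);
        this.dataGroup = dataGroup;
        this.targetGroup = targetGroup;
        this.uriGroup = uriGroup;
        advance(); // read ahead so hasNext() can answer without I/O
    }

    private void advance() {
        try {
            currentLine = reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public boolean hasNext() {
        return currentLine != null;
    }

    @Override
    public String[] next() {
        if (currentLine == null) {
            throw new NoSuchElementException();
        }
        Matcher m = lineRegex.matcher(currentLine);
        if (!m.find()) {
            // Assumption of this sketch: a non-matching line is an error.
            throw new IllegalStateException("Line does not match: " + currentLine);
        }
        String[] fields = { m.group(dataGroup), m.group(targetGroup), m.group(uriGroup) };
        advance();
        return fields;
    }

    @Override
    public void remove() {
        // Assumption of this sketch: the iterator is read-only.
        throw new UnsupportedOperationException("read-only iterator");
    }

    public static void main(String[] args) {
        Iterator<String[]> it = new LineGroupIterator(
                new StringReader("a x hello world\nb y bye\n"),
                "(\\w+)\\s+(\\w+)\\s+(.*)", 3, 2, 1);
        while (it.hasNext()) {
            String[] f = it.next();
            System.out.println(f[2] + " / " + f[1] + " / " + f[0]);
        }
    }
}
```

Reading one line ahead in the constructor and in next() keeps hasNext() free of side effects, so it can be called repeatedly without consuming input.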