cc.mallet.pipe.iterator
Class UnlabeledFileIterator

java.lang.Object
  extended by cc.mallet.pipe.iterator.UnlabeledFileIterator
All Implemented Interfaces:
java.util.Iterator<Instance>

public class UnlabeledFileIterator
extends java.lang.Object
implements java.util.Iterator<Instance>

An iterator that generates instances from an initial directory or set of directories. The iterator will recurse through sub-directories. Each filename becomes the data field of an instance, and the targets are set to null. To set the target values to the directory name, use FileIterator instead.
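The recursive traversal described above can be illustrated with a minimal, JDK-only sketch. It is not MALLET's implementation: the class name `RecursiveFileWalkSketch` and the methods `collect` and `demo` are hypothetical. It also assumes the filter is applied to regular files only, which this page does not state.

```java
import java.io.File;
import java.io.FileFilter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not part of MALLET.
class RecursiveFileWalkSketch {

    // Adds every regular file under dir to out, descending into sub-directories.
    // A null filter accepts every file (assumption: the filter is not applied to directories).
    static void collect(File dir, FileFilter filter, List<File> out) {
        File[] entries = dir.listFiles();
        if (entries == null) {
            return; // dir is not a directory, or it could not be read
        }
        for (File entry : entries) {
            if (entry.isDirectory()) {
                collect(entry, filter, out);
            } else if (filter == null || filter.accept(entry)) {
                out.add(entry);
            }
        }
    }

    // Builds a small temporary tree, walks it, and returns the sorted file names.
    static List<String> demo() {
        try {
            File root = Files.createTempDirectory("walk-sketch").toFile();
            File sub = new File(root, "sub");
            sub.mkdir();
            new File(root, "a.txt").createNewFile();
            new File(sub, "b.txt").createNewFile();

            List<File> files = new ArrayList<>();
            collect(root, null, files);

            List<String> names = new ArrayList<>();
            for (File f : files) {
                names.add(f.getName());
            }
            Collections.sort(names); // listFiles() order is unspecified
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [a.txt, b.txt]
    }
}
```

In the sketch, each collected file corresponds to what this class would wrap as the data field of an instance, with no target attached.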

Author:
Andrew McCallum mccallum@cs.umass.edu, Gregory Druck gdruck@cs.umass.edu

Field Summary
static java.util.regex.Pattern ALL_DIRECTORIES
          Use as label names all the directory names in the filename.
static java.util.regex.Pattern FIRST_DIRECTORY
          Use as label names the first directory in the filename.
static java.util.regex.Pattern LAST_DIRECTORY
          Use as label name the last directory in the filename.
static java.util.regex.Pattern STARTING_DIRECTORIES
          Use as label names the directories specified in the constructor, optionally removing common prefix of all starting directories
 
Constructor Summary
  UnlabeledFileIterator(java.io.File directory)
           
  UnlabeledFileIterator(java.io.File[] directories)
           
protected UnlabeledFileIterator(java.io.File[] directories, java.io.FileFilter fileFilter)
          Constructs an UnlabeledFileIterator that supplies the filenames within the initial directories as instances.
  UnlabeledFileIterator(java.io.File directory, java.io.FileFilter fileFilter)
           
  UnlabeledFileIterator(java.lang.String directory)
           
  UnlabeledFileIterator(java.lang.String[] directories, java.io.FileFilter ff)
           
  UnlabeledFileIterator(java.lang.String directory, java.io.FileFilter filter)
           
 
Method Summary
 java.util.ArrayList<java.io.File> getFileArray()
           
 boolean hasNext()
           
 Instance next()
           
 java.io.File nextFile()
           
 void remove()
           
static java.io.File[] stringArray2FileArray(java.lang.String[] sa)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

STARTING_DIRECTORIES

public static final java.util.regex.Pattern STARTING_DIRECTORIES
Use as label names the directories specified in the constructor, optionally removing common prefix of all starting directories


FIRST_DIRECTORY

public static final java.util.regex.Pattern FIRST_DIRECTORY
Use as label names the first directory in the filename.


LAST_DIRECTORY

public static final java.util.regex.Pattern LAST_DIRECTORY
Use as label name the last directory in the filename.


ALL_DIRECTORIES

public static final java.util.regex.Pattern ALL_DIRECTORIES
Use as label names all the directory names in the filename.

Constructor Detail

UnlabeledFileIterator

protected UnlabeledFileIterator(java.io.File[] directories,
                                java.io.FileFilter fileFilter)
Constructs an UnlabeledFileIterator that supplies the filenames within the initial directories as instances.

Parameters:
directories - Array of directories to collect files from.
fileFilter - Object implementing the java.io.FileFilter interface that decides which files to accept. May be null.
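As a sketch of what a `fileFilter` argument might look like, the following defines a `java.io.FileFilter` that accepts only names ending in `.txt`. The class name `TxtFilterSketch` and the constant `TXT_ONLY` are hypothetical, and the example uses only the JDK; it does not show how MALLET consults the filter internally.

```java
import java.io.File;
import java.io.FileFilter;
import java.util.Locale;

// Hypothetical illustration only; not part of MALLET.
class TxtFilterSketch {

    // java.io.FileFilter has a single method, accept(File), so a lambda can implement it.
    // The check is case-insensitive and looks at the name only, not the file contents.
    static final FileFilter TXT_ONLY =
        file -> file.getName().toLowerCase(Locale.ROOT).endsWith(".txt");

    public static void main(String[] args) {
        System.out.println(TXT_ONLY.accept(new File("notes.TXT"))); // prints true
        System.out.println(TXT_ONLY.accept(new File("image.png"))); // prints false
    }
}
```

A filter like this could be passed wherever the constructors above take a `java.io.FileFilter`; passing null is documented as allowed.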

UnlabeledFileIterator

public UnlabeledFileIterator(java.lang.String[] directories,
                             java.io.FileFilter ff)

UnlabeledFileIterator

public UnlabeledFileIterator(java.io.File directory,
                             java.io.FileFilter fileFilter)

UnlabeledFileIterator

public UnlabeledFileIterator(java.io.File directory)

UnlabeledFileIterator

public UnlabeledFileIterator(java.io.File[] directories)

UnlabeledFileIterator

public UnlabeledFileIterator(java.lang.String directory)

UnlabeledFileIterator

public UnlabeledFileIterator(java.lang.String directory,
                             java.io.FileFilter filter)

Method Detail

getFileArray

public java.util.ArrayList<java.io.File> getFileArray()

stringArray2FileArray

public static java.io.File[] stringArray2FileArray(java.lang.String[] sa)
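This page gives no description for `stringArray2FileArray`. Its name and signature suggest an element-by-element conversion from path strings to `File` objects. A minimal sketch under that assumption follows; the class `PathArraySketch` and method `toFiles` are hypothetical names and are not MALLET code.

```java
import java.io.File;
import java.util.Arrays;

// Hypothetical illustration only; not part of MALLET.
class PathArraySketch {

    // Wraps each path string in a java.io.File, preserving order.
    // No file-system access happens here; the paths need not exist.
    static File[] toFiles(String[] paths) {
        File[] files = new File[paths.length];
        for (int i = 0; i < paths.length; i++) {
            files[i] = new File(paths[i]);
        }
        return files;
    }

    public static void main(String[] args) {
        File[] dirs = toFiles(new String[] {"train", "test"});
        System.out.println(Arrays.toString(dirs)); // prints [train, test]
    }
}
```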

next

public Instance next()
Specified by:
next in interface java.util.Iterator<Instance>

remove

public void remove()
Specified by:
remove in interface java.util.Iterator<Instance>

nextFile

public java.io.File nextFile()

hasNext

public boolean hasNext()
Specified by:
hasNext in interface java.util.Iterator<Instance>