cc.mallet.pipe.iterator
Class FileListIterator

java.lang.Object
  extended by cc.mallet.pipe.iterator.FileListIterator
All Implemented Interfaces:
java.util.Iterator<Instance>

public class FileListIterator
extends java.lang.Object
implements java.util.Iterator<Instance>

An iterator that generates instances for a pipe from a list of filenames. Each file is treated as a text file whose target is determined by a user-specified regular expression pattern applied to the filename.

Author:
Gary Huang ghuang@cs.umass.edu

Field Summary
static java.util.regex.Pattern ALL_DIRECTORIES
          Use all the directory names in the filename as label names.
static java.util.regex.Pattern FIRST_DIRECTORY
          Use the first directory in the filename as the label name.
static java.util.regex.Pattern LAST_DIRECTORY
          Use the last directory in the filename as the label name.
static java.util.regex.Pattern STARTING_DIRECTORIES
          Use the directories of the given files as label names, optionally removing the common prefix of all starting directories.
 
Constructor Summary
FileListIterator(java.io.File[] files, java.io.FileFilter fileFilter, java.util.regex.Pattern targetPattern, boolean removeCommonPrefix)
          Construct an iterator over the given array of Files. The instances constructed from the files are returned in the same order as they appear in the given array.
FileListIterator(java.io.File filelist, java.io.File baseDirectory, java.io.FileFilter fileFilter, java.util.regex.Pattern targetPattern, boolean removeCommonPrefix)
          Construct a FileListIterator with the file containing the list of files of RELATIVE pathnames, one filename per line.
FileListIterator(java.io.File filelist, java.io.FileFilter fileFilter, java.util.regex.Pattern targetPattern, boolean removeCommonPrefix)
          Construct a FileListIterator with the file containing the list of files, which contains one filename per line.
FileListIterator(java.lang.String[] filenames, java.io.FileFilter fileFilter, java.util.regex.Pattern targetPattern, boolean removeCommonPrefix)
           
FileListIterator(java.lang.String filelistName, java.io.FileFilter fileFilter, java.util.regex.Pattern targetPattern, boolean removeCommonPrefix)
           
FileListIterator(java.lang.String filelistName, java.util.regex.Pattern targetPattern)
           
 
Method Summary
 java.util.ArrayList getFileArray()
           
 boolean hasNext()
           
 Instance next()
           
 java.io.File nextFile()
           
 void remove()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

STARTING_DIRECTORIES

public static final java.util.regex.Pattern STARTING_DIRECTORIES
Use the directories of the given files as label names, optionally removing the common prefix of all starting directories.
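
The common-prefix removal described above can be pictured with a small, library-independent sketch. It assumes a character-by-character comparison of the path strings; the class name `CommonPrefixSketch` and both helper methods are hypothetical and are not part of MALLET, whose actual implementation may differ.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of MALLET: illustrates stripping a shared prefix.
class CommonPrefixSketch {

    // Length of the longest prefix shared by every string in the list
    // (compared character by character, an assumption of this sketch).
    static int commonPrefixLength(List<String> paths) {
        if (paths.isEmpty()) {
            return 0;
        }
        String first = paths.get(0);
        int length = first.length();
        for (String path : paths) {
            int max = Math.min(length, path.length());
            int i = 0;
            while (i < max && first.charAt(i) == path.charAt(i)) {
                i++;
            }
            length = i;
        }
        return length;
    }

    // Label for one path: whatever remains once the shared prefix is dropped.
    static String targetFor(String path, int prefixLength) {
        return path.substring(prefixLength);
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList("/data/news/sports", "/data/news/politics");
        int n = commonPrefixLength(dirs);
        for (String dir : dirs) {
            System.out.println(dir + " -> " + targetFor(dir, n));
        }
    }
}
```

Because the comparison is per character, two directories such as `sports` and `spam` would also lose their shared `sp`; a sketch that compared whole path components would avoid that, at the cost of a little more code.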


FIRST_DIRECTORY

public static final java.util.regex.Pattern FIRST_DIRECTORY
Use the first directory in the filename as the label name.


LAST_DIRECTORY

public static final java.util.regex.Pattern LAST_DIRECTORY
Use the last directory in the filename as the label name.


ALL_DIRECTORIES

public static final java.util.regex.Pattern ALL_DIRECTORIES
Use all the directory names in the filename as label names.
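
To make the difference between the directory-based labels concrete, here is a minimal sketch using only `java.util.regex`. The three regexes and the class `DirectoryLabelSketch` are illustrative assumptions; they are not claimed to be the expressions behind the constants documented above, and they assume `/` as the path separator.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-ins for the directory-label patterns; not MALLET's definitions.
class DirectoryLabelSketch {

    // First path component, ignoring an optional leading slash.
    static final Pattern FIRST_DIR = Pattern.compile("^/?([^/]+)/");

    // Directory that immediately contains the file.
    static final Pattern LAST_DIR = Pattern.compile("([^/]+)/[^/]+$");

    // Everything before the final path component.
    static final Pattern ALL_DIRS = Pattern.compile("^(.*)/[^/]+$");

    // Returns the first capture group, or null when the pattern does not match.
    static String label(Pattern pattern, String filename) {
        Matcher m = pattern.matcher(filename);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String filename = "corpus/news/sports/a.txt";
        System.out.println("first: " + label(FIRST_DIR, filename));
        System.out.println("last:  " + label(LAST_DIR, filename));
        System.out.println("all:   " + label(ALL_DIRS, filename));
    }
}
```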

Constructor Detail

FileListIterator

public FileListIterator(java.io.File[] files,
                        java.io.FileFilter fileFilter,
                        java.util.regex.Pattern targetPattern,
                        boolean removeCommonPrefix)
Construct an iterator over the given array of Files. The instances constructed from the files are returned in the same order as they appear in the given array.

Parameters:
files - Array of files from which to construct instances
fileFilter - class implementing interface FileFilter that will decide which names to accept. May be null.
targetPattern - regex Pattern applied to the filename. On a match, its first parenthesized group is taken to be the target value of the generated instance. The pattern is applied to the filename with the matcher.find() method, so it may match any part of the filename.
removeCommonPrefix - boolean that modifies the behavior of the STARTING_DIRECTORIES pattern, removing the common prefix of all initially specified directories, leaving the remainder of each filename as the target value.
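
The roles of fileFilter and targetPattern can be sketched without the library itself. The example below assumes a null filter accepts every file and that the target is the first capture group found by `Matcher.find()`, as the parameter descriptions state; the class `TargetPatternSketch`, its methods, and the null return on a failed match are hypothetical choices of this sketch.

```java
import java.io.File;
import java.io.FileFilter;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of filtering files and extracting targets; not MALLET code.
class TargetPatternSketch {

    // Keeps files in their original order; a null filter accepts everything.
    static List<File> accepted(File[] files, FileFilter fileFilter) {
        List<File> kept = new ArrayList<>();
        for (File file : files) {
            if (fileFilter == null || fileFilter.accept(file)) {
                kept.add(file);
            }
        }
        return kept;
    }

    // find() matches anywhere in the filename; group(1) is the first parenthesized group.
    // The pattern must contain at least one group, otherwise group(1) throws.
    static String targetOf(String filename, Pattern targetPattern) {
        Matcher m = targetPattern.matcher(filename);
        if (m.find()) {
            return m.group(1);
        }
        return null; // behaviour of the real class on a failed match is not stated here
    }

    public static void main(String[] args) {
        File[] files = {
            new File("reviews/pos_0001.txt"),
            new File("reviews/notes.md"),
            new File("reviews/neg_0002.txt")
        };
        Pattern polarity = Pattern.compile("(pos|neg)_\\d+");
        FileFilter textOnly = candidate -> candidate.getName().endsWith(".txt");
        for (File file : accepted(files, textOnly)) {
            System.out.println(file.getName() + " -> " + targetOf(file.getName(), polarity));
        }
    }
}
```

Since `find()` does not anchor the match, the pattern only needs to describe the fragment of the filename that carries the label, not the whole path.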

FileListIterator

public FileListIterator(java.lang.String[] filenames,
                        java.io.FileFilter fileFilter,
                        java.util.regex.Pattern targetPattern,
                        boolean removeCommonPrefix)

FileListIterator

public FileListIterator(java.io.File filelist,
                        java.io.FileFilter fileFilter,
                        java.util.regex.Pattern targetPattern,
                        boolean removeCommonPrefix)
                 throws java.io.FileNotFoundException,
                        java.io.IOException
Construct a FileListIterator with the file containing the list of files, which contains one filename per line. The instances constructed from the filelist are returned in the same order as listed.

Throws:
java.io.FileNotFoundException
java.io.IOException

FileListIterator

public FileListIterator(java.io.File filelist,
                        java.io.File baseDirectory,
                        java.io.FileFilter fileFilter,
                        java.util.regex.Pattern targetPattern,
                        boolean removeCommonPrefix)
                 throws java.io.FileNotFoundException,
                        java.io.IOException
Construct a FileListIterator with the file containing the list of files of RELATIVE pathnames, one filename per line.

The instances constructed from the filelist are returned in the same order as listed.

Parameters:
filelist - List of relative file names.
baseDirectory - Base directory for relative file names.
Throws:
java.io.FileNotFoundException
java.io.IOException
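
As a rough picture of how a list of relative pathnames combines with a base directory, the sketch below resolves each listed name against the base using `java.io.File`. It works on lines that have already been read (for example with `java.nio.file.Files.readAllLines`); the class `FileListSketch`, the trimming of whitespace, and the skipping of blank lines are assumptions of this sketch rather than documented behaviour of the constructor.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of resolving a file list against a base directory; not MALLET code.
class FileListSketch {

    // Resolves one relative pathname per line, preserving the listed order.
    static List<File> resolve(List<String> lines, File baseDirectory) {
        List<File> files = new ArrayList<>();
        for (String line : lines) {
            String name = line.trim();
            if (name.isEmpty()) {
                continue; // skipping blank lines is a choice of this sketch
            }
            files.add(new File(baseDirectory, name));
        }
        return files;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("sports/a.txt", "politics/b.txt");
        for (File file : resolve(lines, new File("corpus"))) {
            System.out.println(file.getPath());
        }
    }
}
```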

FileListIterator

public FileListIterator(java.lang.String filelistName,
                        java.io.FileFilter fileFilter,
                        java.util.regex.Pattern targetPattern,
                        boolean removeCommonPrefix)
                 throws java.io.FileNotFoundException,
                        java.io.IOException
Throws:
java.io.FileNotFoundException
java.io.IOException

FileListIterator

public FileListIterator(java.lang.String filelistName,
                        java.util.regex.Pattern targetPattern)
                 throws java.io.FileNotFoundException,
                        java.io.IOException
Throws:
java.io.FileNotFoundException
java.io.IOException
Method Detail

next

public Instance next()
Specified by:
next in interface java.util.Iterator<Instance>

nextFile

public java.io.File nextFile()

hasNext

public boolean hasNext()
Specified by:
hasNext in interface java.util.Iterator<Instance>

remove

public void remove()
Specified by:
remove in interface java.util.Iterator<Instance>

getFileArray

public java.util.ArrayList getFileArray()