cc.mallet.types
Class StringKernel

java.lang.Object
  extended by java.util.AbstractMap<K,V>
      extended by java.util.HashMap<K,V>
          extended by java.util.LinkedHashMap
              extended by cc.mallet.types.StringKernel
All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, java.util.Map

public class StringKernel
extends java.util.LinkedHashMap

Computes a similarity metric between two strings, based on counts of common character subsequences. See Lodhi et al., "Text Classification using String Kernels." Optionally caches previous kernel computations.
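
To make the description above concrete, here is a minimal standalone sketch of the gap-weighted subsequence kernel from Lodhi et al., written with the usual dynamic-programming recursion. It is not MALLET's source: the class name SubsequenceKernelSketch, the method name kernel, and the choice to score subsequences of exactly length n (rather than up to a maximum length) are assumptions made for illustration. The value returned is unnormalized.

```java
final class SubsequenceKernelSketch {

    /**
     * Unnormalized gap-weighted subsequence kernel K_n(s, t).
     * Hypothetical helper, not part of MALLET.
     *
     * @param n      subsequence length (assumed fixed here)
     * @param lambda gap penalty in (0, 1]
     */
    static double kernel(String s, String t, int n, double lambda) {
        int ls = s.length();
        int lt = t.length();
        if (n <= 0 || Math.min(ls, lt) < n) {
            return 0.0; // no common subsequence of length n can exist
        }
        double l2 = lambda * lambda;

        // kp[i][a][b] holds K'_i over the prefixes s[0..a) and t[0..b).
        double[][][] kp = new double[n][ls + 1][lt + 1];
        for (int a = 0; a <= ls; a++) {
            for (int b = 0; b <= lt; b++) {
                kp[0][a][b] = 1.0; // K'_0 is 1 for every pair of prefixes
            }
        }
        for (int i = 1; i < n; i++) {
            for (int a = 1; a <= ls; a++) {
                double kpp = 0.0; // running K''_i for the current row
                for (int b = 1; b <= lt; b++) {
                    boolean match = s.charAt(a - 1) == t.charAt(b - 1);
                    kpp = lambda * kpp + (match ? l2 * kp[i - 1][a - 1][b - 1] : 0.0);
                    kp[i][a][b] = lambda * kp[i][a - 1][b] + kpp;
                }
            }
        }

        // Each matching character pair closes subsequences of length n.
        double k = 0.0;
        for (int a = 1; a <= ls; a++) {
            for (int b = 1; b <= lt; b++) {
                if (s.charAt(a - 1) == t.charAt(b - 1)) {
                    k += l2 * kp[n - 1][a - 1][b - 1];
                }
            }
        }
        return k;
    }

    public static void main(String[] args) {
        // "car" and "cat" share only the length-2 subsequence "ca".
        System.out.println(kernel("car", "cat", 2, 0.5));
    }
}
```

For "car" and "cat" with n = 2 the only shared subsequence is "ca", which is contiguous in both strings, so the sketch yields lambda^4, which is 0.0625 at lambda = 0.5.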

See Also:
Serialized Form

Nested Class Summary
 
Nested classes/interfaces inherited from class java.util.AbstractMap
java.util.AbstractMap.SimpleEntry<K,V>, java.util.AbstractMap.SimpleImmutableEntry<K,V>
 
Constructor Summary
StringKernel()
           
StringKernel(boolean norm, double lam, int length)
           
StringKernel(boolean norm, double lam, int length, boolean cache)
           
 
Method Summary
 double K(java.lang.String s, java.lang.String t)
          Computes the normalized string kernel between two strings.
static void main(java.lang.String[] args)
          Command-line entry point; computes the string kernel between two strings.
 
Methods inherited from class java.util.LinkedHashMap
clear, containsValue, get, removeEldestEntry
 
Methods inherited from class java.util.HashMap
clone, containsKey, entrySet, isEmpty, keySet, put, putAll, remove, size, values
 
Methods inherited from class java.util.AbstractMap
equals, hashCode, toString
 
Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface java.util.Map
containsKey, entrySet, equals, hashCode, isEmpty, keySet, put, putAll, remove, size, values
 

Constructor Detail

StringKernel

public StringKernel(boolean norm,
                    double lam,
                    int length,
                    boolean cache)
Parameters:
norm - true to lowercase all strings
lam - penalty for gaps between matches, in the range 0 to 1
length - maximum length of subsequences to compare
cache - true to cache previous kernel computations (recommended)
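
The norm and cache flags can be pictured as a thin wrapper around the raw kernel computation. The sketch below is an assumption about how such a wrapper might look, not MALLET's implementation: CachedKernelSketch, its k method, the evaluations counter, and the symmetric cache key are all hypothetical, and the raw kernel is passed in as a function so the example stays self-contained.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.ToDoubleBiFunction;

final class CachedKernelSketch {
    private final boolean norm;   // lowercase inputs before comparing
    private final boolean cache;  // memoize results per string pair
    private final ToDoubleBiFunction<String, String> rawKernel;
    private final Map<List<String>, Double> memo = new LinkedHashMap<>();

    /** Number of times the raw kernel was actually evaluated. */
    int evaluations = 0;

    CachedKernelSketch(boolean norm, boolean cache,
                       ToDoubleBiFunction<String, String> rawKernel) {
        this.norm = norm;
        this.cache = cache;
        this.rawKernel = rawKernel;
    }

    double k(String s, String t) {
        if (norm) {
            s = s.toLowerCase(Locale.ROOT);
            t = t.toLowerCase(Locale.ROOT);
        }
        if (!cache) {
            evaluations++;
            return rawKernel.applyAsDouble(s, t);
        }
        // Assumes a symmetric kernel, so (s, t) and (t, s) share one entry.
        List<String> key = s.compareTo(t) <= 0 ? Arrays.asList(s, t) : Arrays.asList(t, s);
        Double hit = memo.get(key);
        if (hit != null) {
            return hit;
        }
        evaluations++;
        double value = rawKernel.applyAsDouble(s, t);
        memo.put(key, value);
        return value;
    }

    public static void main(String[] args) {
        // Stand-in kernel for the demo: 1 for equal strings, 0 otherwise.
        CachedKernelSketch sk =
                new CachedKernelSketch(true, true, (a, b) -> a.equals(b) ? 1.0 : 0.0);
        System.out.println(sk.k("Cat", "cat"));
        System.out.println(sk.k("CAT", "cat"));
        System.out.println(sk.evaluations);
    }
}
```

Lowercasing before the cache lookup means differently cased spellings of the same pair share a cache entry, which is one reason to apply that step first.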

StringKernel

public StringKernel()

StringKernel

public StringKernel(boolean norm,
                    double lam,
                    int length)
Method Detail

K

public double K(java.lang.String s,
                java.lang.String t)
Computes the normalized string kernel between two strings.

Parameters:
s - string 1
t - string 2
Returns:
a value between 0 and 1, where 1 indicates an exact match.
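
A return range of 0 to 1, with 1 for identical strings, is what the normalization in Lodhi et al. produces. Whether this method applies exactly that formula is an assumption, since this page does not say. Writing K_n for the unnormalized kernel over subsequences of length n and \lambda for the gap penalty, the normalized form and a small worked case would be:

```latex
\hat{K}(s,t) = \frac{K_n(s,t)}{\sqrt{K_n(s,s)\,K_n(t,t)}},
\qquad \hat{K}(s,s) = 1 \quad \text{whenever } K_n(s,s) > 0.

% Worked case: s = "car", t = "cat", n = 2
K_2(s,t) = \lambda^4, \qquad
K_2(s,s) = K_2(t,t) = 2\lambda^4 + \lambda^6, \qquad
\hat{K}(s,t) = \frac{\lambda^4}{2\lambda^4 + \lambda^6} = \frac{1}{2 + \lambda^2}.
```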

main

public static void main(java.lang.String[] args)
                 throws java.lang.Exception
Command-line entry point; computes the string kernel between two strings.

Throws:
java.lang.Exception