java.lang.Object
java.util.Dictionary<K,V>
java.util.Hashtable<String,Integer>
com.jalios.jcms.search.NGramFingerPrint
public class NGramFingerPrint extends Hashtable<String,Integer>
A FingerPrint maps so-called NGrams to their number of occurrences in the corresponding text. It can categorize itself by comparing its FingerPrint with the FingerPrints of a collection of categories. See sdair-94-bc.pdf in the doc directory of the jar file for more information.
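The mapping from NGrams to occurrence counts can be illustrated with a small standalone sketch. It does not use this class and is not its implementation: the helper name `countNGrams`, the `maxN` parameter, the lower-casing, and the `_` padding around words are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class NGramCountSketch {

    /**
     * Counts every character n-gram of length 1..maxN in the text.
     * Hypothetical helper; the real create(String) may tokenize differently.
     */
    static Map<String, Integer> countNGrams(String text, int maxN) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        // Assumption: lower-case the text and replace whitespace runs by a single '_',
        // with one '_' added at each end as word-boundary padding.
        String normalized = "_" + text.toLowerCase(Locale.ROOT).trim().replaceAll("\\s+", "_") + "_";
        for (int n = 1; n <= maxN; n++) {
            for (int i = 0; i + n <= normalized.length(); i++) {
                // merge() inserts 1 for a new n-gram or adds 1 to the existing count
                counts.merge(normalized.substring(i, i + n), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countNGrams("to be or not to be", 3));
    }
}
```

The result has the same shape as the `Hashtable<String,Integer>` this class extends: each key is an NGram and each value is how often it occurred.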
Constructor Summary | |
---|---|
NGramFingerPrint()
|
Method Summary | |
---|---|
Map<String,Integer> |
categorize(Collection<NGramFingerPrint> categories)
categorizes the FingerPrint by computing its distance to each FingerPrint in the passed Collection. The category of the FingerPrint with the lowest distance is assigned to this FingerPrint. |
void |
create(String text)
fills the FingerPrint with all the NGrams and their number of occurrences in the passed text. |
String |
getCategory()
returns the category of the FingerPrint, or "unknown" if the FingerPrint has not been categorized yet. |
Map<String,Integer> |
getCategoryDistances()
|
int |
getPosition(String key)
gets the position of the passed NGram in the FingerPrint. The NGrams are in descending order of their number of occurrences in the text that was used to create the FingerPrint. |
void |
load(String ngram)
|
protected void |
setCategory(String category)
sets the category of the FingerPrint |
String |
toString()
returns the FingerPrint as a String in the FingerPrint file-format |
Methods inherited from class java.util.Hashtable |
---|
clear, clone, contains, containsKey, containsValue, elements, entrySet, equals, get, hashCode, isEmpty, keys, keySet, put, putAll, rehash, remove, size, values |
Methods inherited from class java.lang.Object |
---|
finalize, getClass, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
public NGramFingerPrint()
Method Detail |
---|
public void load(String ngram)
public void create(String text)
text
- text to be analysed
public Map<String,Integer> categorize(Collection<NGramFingerPrint> categories)
categories
-
public Map<String,Integer> getCategoryDistances()
public int getPosition(String key)
key
- the NGram
public String getCategory()
public String toString()
Overrides:
toString in class Hashtable<String,Integer>
protected void setCategory(String category)
category
- the category
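The ideas behind getPosition and categorize, ranking NGrams by descending count and picking the category with the lowest distance, can be sketched independently of this class. The distance used here is the out-of-place rank measure from the n-gram text categorization literature; whether NGramFingerPrint uses exactly this measure is an assumption. The names `rank`, `distance`, and `nearest`, the alphabetical tie-break, and the penalty for missing NGrams are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RankDistanceSketch {

    /** Orders n-grams by descending count; ties are broken alphabetically (assumption). */
    static List<String> rank(Map<String, Integer> counts) {
        List<String> ranked = new ArrayList<>(counts.keySet());
        ranked.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a)); // descending count
            return byCount != 0 ? byCount : a.compareTo(b);              // alphabetical tie-break
        });
        return ranked;
    }

    /** Out-of-place distance: sum of rank differences between two ranked profiles. */
    static int distance(List<String> document, List<String> category) {
        Map<String, Integer> categoryRank = new HashMap<>();
        for (int i = 0; i < category.size(); i++) {
            categoryRank.put(category.get(i), i);
        }
        // Assumption: an n-gram missing from the category profile costs the profile's size.
        int missingPenalty = category.size();
        int total = 0;
        for (int i = 0; i < document.size(); i++) {
            Integer j = categoryRank.get(document.get(i));
            total += (j == null) ? missingPenalty : Math.abs(i - j);
        }
        return total;
    }

    /** Returns the name of the category with the lowest distance, or "unknown" if there is none. */
    static String nearest(List<String> document, Map<String, List<String>> categories) {
        String best = "unknown";
        int bestDistance = Integer.MAX_VALUE;
        for (Map.Entry<String, List<String>> entry : categories.entrySet()) {
            int d = distance(document, entry.getValue());
            if (d < bestDistance) {
                bestDistance = d;
                best = entry.getKey();
            }
        }
        return best;
    }
}
```

In this sketch the index of an NGram in the list returned by `rank` plays the role of the value returned by getPosition, and the per-category results of `distance` correspond to the kind of map returned by getCategoryDistances.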