Jacson

Package de.spieleck.app.cngram

Implementation of character based ngrams.

Interface Summary
NGram A (character) NGram is a special CharSequence.
NGramMetric A way to measure the distance between two ngram profiles.
NGramProfile A device to keep a bunch of ngram statistics.
NGramProfiles.Ranker  
NGramProfiles.RankResult  
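
The summary above says that an NGram is a special CharSequence. As a rough illustration of that idea only, the sketch below implements CharSequence as a view onto a shared char array, so that an n-gram can be handed around without copying its characters. The class name CharSliceSketch and its constructor are hypothetical and are not part of this package; the actual NGram and LightCharSequence types may be designed quite differently.

```java
/**
 * Hypothetical sketch: a CharSequence that is a window onto a shared
 * char array. Not the package's NGram or LightCharSequence.
 */
final class CharSliceSketch implements CharSequence {
    private final char[] buf;  // shared backing array, never copied
    private final int start;   // offset of the first character of the view
    private final int len;     // number of characters in the view

    CharSliceSketch(char[] buf, int start, int len) {
        if (start < 0 || len < 0 || len > buf.length - start) {
            throw new IndexOutOfBoundsException("invalid slice");
        }
        this.buf = buf;
        this.start = start;
        this.len = len;
    }

    @Override
    public int length() {
        return len;
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= len) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return buf[start + index];
    }

    @Override
    public CharSequence subSequence(int from, int to) {
        if (from < 0 || to > len || from > to) {
            throw new IndexOutOfBoundsException("invalid range");
        }
        // The sub-view shares the same backing array.
        return new CharSliceSketch(buf, start + from, to - from);
    }

    @Override
    public String toString() {
        return new String(buf, start, len);
    }

    public static void main(String[] args) {
        char[] text = "banana".toCharArray();
        CharSequence trigram = new CharSliceSketch(text, 1, 3);
        System.out.println(trigram); // prints "ana"
    }
}
```

Because the view shares its backing array, it is only safe while that array is not modified; a real implementation would have to decide whether to copy or to document that restriction.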
 

Class Summary
C2aMetric This metric is based on a loose interpretation of the Chi^2 formula.
C2Metric Chi^2 metric without rectification of classes.
C2xMetric ALPHA: modified Chi^2 metric without rectification of classes and with centering between the two profiles.
CosMetric Cosine metric. Its value always lies between zero and one.
LightCharSequence A very light (and therefore fast, efficient) implementation of a CharSequence.
NGramImpl Stores an NGram.
NGramProfileImpl Actual implementation of an NGramProfile. Methods are provided to build new profiles.
NGramProfiles Manage a set of profiles and determine "most similar" ones to a given profile.
RawMetric Raw (Delta-count) based difference between profiles.
RunNGram Command-line interface that runs an ngram analysis over submitted text. The results can be used for automatic language identification.
SqMetric Squared raw metric for distance between profiles.
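
To make the metric descriptions above more concrete, here is a minimal sketch of three textbook distance and similarity measures over ngram count maps: an absolute count difference, a squared count difference, and the cosine of the two count vectors. The class ProfileMetricSketch, its method names, and the use of plain Map<String, Integer> as a profile are assumptions made for illustration. The formulas actually used by RawMetric, SqMetric and CosMetric are not given on this page and may differ, for example in normalization.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; not the package's NGramMetric implementations. */
final class ProfileMetricSketch {

    /** All ngrams that occur in at least one of the two profiles. */
    private static Set<String> union(Map<String, Integer> a, Map<String, Integer> b) {
        Set<String> keys = new HashSet<>(a.keySet());
        keys.addAll(b.keySet());
        return keys;
    }

    /** Sum of absolute count differences (assumed reading of "delta-count"). */
    static long rawDistance(Map<String, Integer> a, Map<String, Integer> b) {
        long sum = 0;
        for (String gram : union(a, b)) {
            int x = a.getOrDefault(gram, 0);
            int y = b.getOrDefault(gram, 0);
            sum += Math.abs((long) x - y);
        }
        return sum;
    }

    /** Sum of squared count differences. */
    static long squaredDistance(Map<String, Integer> a, Map<String, Integer> b) {
        long sum = 0;
        for (String gram : union(a, b)) {
            long d = (long) a.getOrDefault(gram, 0) - b.getOrDefault(gram, 0);
            sum += d * d;
        }
        return sum;
    }

    /** Cosine of the two count vectors; in [0, 1] because counts are non-negative. */
    static double cosineSimilarity(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            double x = e.getValue();
            normA += x * x;
            dot += x * b.getOrDefault(e.getKey(), 0);
        }
        for (int y : b.values()) {
            normB += (double) y * y;
        }
        if (normA == 0 || normB == 0) {
            return 0.0; // an empty profile is treated as having no similarity
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        Map<String, Integer> p = Map.of("th", 3, "he", 2);
        Map<String, Integer> q = Map.of("th", 1, "en", 2);
        System.out.println(rawDistance(p, q));      // prints 6
        System.out.println(squaredDistance(p, q));  // prints 12
        System.out.println(cosineSimilarity(p, q));
    }
}
```

A similarity such as the cosine can be turned into a distance by taking one minus its value, which keeps the result between zero and one. Whether CosMetric does this is not stated here.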
 

Package de.spieleck.app.cngram Description

Implementation of character based ngrams.
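
For readers unfamiliar with the term, a character ngram is simply a run of n consecutive characters taken from a text. The following sketch counts all ngrams of one fixed length in a string. It is illustrative only: CharNGramSketch and countNGrams are hypothetical names, and the way NGramProfileImpl actually collects its statistics (for instance, which ngram lengths it uses or how it treats word boundaries) is not described on this page.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch; not the package's NGramProfile implementation. */
final class CharNGramSketch {

    /** Counts every run of exactly n consecutive characters in the text. */
    static Map<String, Integer> countNGrams(CharSequence text, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        // LinkedHashMap keeps ngrams in order of first occurrence.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int i = 0; i + n <= text.length(); i++) {
            String gram = text.subSequence(i, i + n).toString();
            counts.merge(gram, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countNGrams("banana", 2)); // prints {ba=1, an=2, na=2}
    }
}
```

A text shorter than n yields an empty map, since it contains no complete ngram of that length.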

Version $Revision: 2 $ $Date: 2006-03-27 23:00:21 +0200 (Mon, 27 Mar 2006) $ $Author: nestefan $


spieleck.de