Jacson

de.spieleck.app.cngram
Class NGramProfileImpl

java.lang.Object
  extended by de.spieleck.app.cngram.NGramProfileImpl
All Implemented Interfaces:
NGramProfile

public class NGramProfileImpl
extends java.lang.Object
implements NGramProfile

Actual implementation of an NGramProfile. Methods are provided to build new NGramProfile instances.

Version:
$Revision: 2 $ $Date: 2006-03-27 23:00:21 +0200 (Mo, 27 Mrz 2006) $ $Author: nestefan $
Author:
Frank Nestel, $Author: nestefan $

Field Summary
static int DEFAULT_MAX_NGRAM_LENGTH
          default max length of ngram
static int DEFAULT_MIN_NGRAM_LENGTH
          default min length of ngram.
static char SEPARATOR
          separator char
 
Fields inherited from interface de.spieleck.app.cngram.NGramProfile
CHAR_SEQ_COMPARATOR, FINISHREAD_STR, NGRAM_PROFILE_EXTENSION, NO_NGRAM, NORMALIZATION_STR
 
Constructor Summary
NGramProfileImpl(java.lang.String name)
          Create a new ngram profile with default lengths.
NGramProfileImpl(java.lang.String name, int minlen, int maxlen)
          Create a new ngram profile
 
Method Summary
 void addNGrams(java.lang.CharSequence word)
          Add ngrams from a single word to this profile
 void analyze(java.lang.CharSequence text)
          Analyze a piece of text
 void clear()
           
static NGramProfileImpl createProfile(java.lang.String name, java.io.InputStream is, java.lang.String encoding)
          Create a new language profile from a (preferably quite large) text file.
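 NGram get(java.lang.CharSequence seq)
          Returns the NGram corresponding to seq, or null if not found.
 int getCount()
          Returns the number of ngrams.
 java.lang.String getName()
          Returns the name.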
 int getNormalization()
          Get the normalization of all NGrams contained.
 java.util.Iterator getSorted()
          Return sorted ngrams
 void load(java.io.InputStream is)
          Loads an ngram profile from an InputStream (assumes UTF-8 encoded content).
 void save(java.io.OutputStream os)
          Writes the NGramProfile content to an OutputStream; the content is written in UTF-8 encoding.
 void setName(java.lang.String name)
          Sets the name.
 void setRestricted(java.util.Set restricted)
           
 java.lang.String toString()
          Return ngramprofile as text
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

SEPARATOR

public static final char SEPARATOR
separator char

See Also:
Constant Field Values

DEFAULT_MIN_NGRAM_LENGTH

public static final int DEFAULT_MIN_NGRAM_LENGTH
default min length of ngram.

See Also:
Constant Field Values

DEFAULT_MAX_NGRAM_LENGTH

public static final int DEFAULT_MAX_NGRAM_LENGTH
default max length of ngram

See Also:
Constant Field Values
Constructor Detail

NGramProfileImpl

public NGramProfileImpl(java.lang.String name)
Create a new ngram profile with default lengths.

Parameters:
name - Name of profile

NGramProfileImpl

public NGramProfileImpl(java.lang.String name,
                        int minlen,
                        int maxlen)
Create a new ngram profile

Parameters:
name - Name of profile
minlen - min length of ngram sequences
maxlen - max length of ngram sequences
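
To illustrate what minlen and maxlen bound, here is a minimal standalone sketch that does not use this library. The class NGramLengthSketch and its method ngrams are hypothetical names introduced for illustration only. The sketch assumes minlen is at least 1, and the real implementation may differ, for example by padding words with SEPARATOR or by normalizing the input first.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of de.spieleck.app.cngram.
final class NGramLengthSketch {

    // Collect every substring of word whose length lies in [minlen, maxlen],
    // shortest lengths first, left to right within each length.
    static List<String> ngrams(CharSequence word, int minlen, int maxlen) {
        List<String> result = new ArrayList<>();
        for (int len = minlen; len <= maxlen; len++) {
            for (int start = 0; start + len <= word.length(); start++) {
                result.add(word.subSequence(start, start + len).toString());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // With minlen = 1 and maxlen = 2, "abc" yields a, b, c, ab, bc.
        System.out.println(ngrams("abc", 1, 2));
    }
}
```

A word shorter than minlen contributes no ngrams in this sketch.
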
Method Detail

setRestricted

public void setRestricted(java.util.Set restricted)

analyze

public void analyze(java.lang.CharSequence text)
Analyze a piece of text

Parameters:
text - the text to be analyzed

clear

public void clear()

getCount

public int getCount()
Specified by:
getCount in interface NGramProfile
Returns:
Returns the number of ngrams.

getNormalization

public int getNormalization()
Description copied from interface: NGramProfile
Get the normalization of all NGrams contained.

Specified by:
getNormalization in interface NGramProfile

addNGrams

public void addNGrams(java.lang.CharSequence word)
Add ngrams from a single word to this profile

Parameters:
word - the word whose ngrams are added

getSorted

public java.util.Iterator getSorted()
Description copied from interface: NGramProfile
Return sorted ngrams

Specified by:
getSorted in interface NGramProfile
Returns:
sorted ngrams

get

public NGram get(java.lang.CharSequence seq)
Specified by:
get in interface NGramProfile
Returns:
NGram corresponding to seq, null if not found.

toString

public java.lang.String toString()
Return ngramprofile as text

Overrides:
toString in class java.lang.Object
Returns:
ngramprofile as text

load

public void load(java.io.InputStream is)
          throws java.io.IOException
Loads an ngram profile from an InputStream (assumes UTF-8 encoded content).

Throws:
java.io.IOException

createProfile

public static NGramProfileImpl createProfile(java.lang.String name,
                                             java.io.InputStream is,
                                             java.lang.String encoding)
                                      throws java.io.IOException
Create a new language profile from a (preferably quite large) text file.

Parameters:
name - name of profile
is - input stream providing the text
encoding - encoding of stream
Throws:
java.io.IOException
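
The encoding parameter matters because the bytes of the stream only become the intended characters when decoded with the matching charset. The following standalone sketch shows this using only java.io. The class EncodingSketch and its method readAll are hypothetical and are not part of this library. How createProfile reads its stream internally is not documented here, so this is an assumption about the general technique, not a description of the implementation.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

// Hypothetical helper, not part of de.spieleck.app.cngram.
final class EncodingSketch {

    // Read the whole stream as text, decoding its bytes with the named encoding.
    static String readAll(InputStream is, String encoding) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(is, encoding))) {
            int c;
            while ((c = reader.read()) != -1) {
                text.append((char) c);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // "Gr\u00f6\u00dfe" stored as ISO-8859-1 bytes is recovered intact
        // only when the same encoding name is passed to the reader.
        byte[] latin1 = "Gr\u00f6\u00dfe".getBytes("ISO-8859-1");
        String decoded = readAll(new ByteArrayInputStream(latin1), "ISO-8859-1");
        System.out.println(decoded.equals("Gr\u00f6\u00dfe"));
    }
}
```

Passing a different encoding name, such as "UTF-8", for the same bytes would produce replacement characters instead of the original umlauts, which would distort any ngram statistics built from the text.
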

save

public void save(java.io.OutputStream os)
          throws java.io.IOException
Writes the NGramProfile content to an OutputStream; the content is written in UTF-8 encoding.

Parameters:
os - Stream to output to
Throws:
java.io.IOException

getName

public java.lang.String getName()
Specified by:
getName in interface NGramProfile
Returns:
Returns the name.

setName

public void setName(java.lang.String name)
Parameters:
name - The name to set.

spieleck.de