2007-03-19 09:47:52 v1.0
NGramJ, smart scanning for document properties.

Decide Which Type



Run either the byte-based NGramJ or the character-based CNgram. Here are some common cases.

You have files with text of unknown encoding.
Use byte NGramJ to determine both encoding and language.
You have files with text of known encoding.
Use CNgram to determine the language, also for mixed-language documents.
You don't have files, but Strings within your application.
Use CNgram to determine the language, also for mixed-language Strings.
You have structured files in XML/HTML.
Usually encoding is not the problem, but you need to strip the markup with a parser first, then use CNgram to determine the language, also for mixed-language documents. Note: the parser has to be started separately.
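
For the XML/HTML case, the markup-stripping step could look roughly like the sketch below. It is not part of NGramJ: the class name MarkupStripper and the method extractText are hypothetical, and the sketch assumes well-formed XML or XHTML, since it uses the SAX parser from the Java standard library. Real-world HTML is often not well-formed and would need a lenient HTML parser instead. The returned plain text is what you would then hand to CNgram.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of NGramJ: collects the character data of a
// well-formed XML/XHTML document and drops all tags.
class MarkupStripper {

    static String extractText(String xml) throws Exception {
        final StringBuilder text = new StringBuilder();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // Text between tags; may arrive in several chunks.
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                // Separate the text of adjacent elements with a space.
                text.append(' ');
            }
        });
        // Collapse runs of whitespace so only the running text remains.
        return text.toString().replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) throws Exception {
        String page = "<html><body><h1>Hello</h1>"
                + "<p>This is <b>plain</b> text.</p></body></html>";
        // The result is the input for character-based language detection.
        System.out.println(extractText(page));
    }
}
```

Inserting a space at every closing tag keeps words from adjacent elements apart, at the cost of an occasional extra space before punctuation that directly follows an inline element.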
