Class TokenInfoDictionaryBuilder


  • class TokenInfoDictionaryBuilder
    extends java.lang.Object
    • Field Summary

      Fields 
      Modifier and Type                  Field       Description
      private java.lang.String           encoding
      private java.text.Normalizer.Form  normalForm
      private int                        offset      Internal word id - incrementally assigned as entries are read and added.
    • Constructor Summary

      Constructors 
      Constructor Description
      TokenInfoDictionaryBuilder(java.lang.String encoding, boolean normalizeEntries)
    • Field Detail

      • offset

        private int offset
        Internal word id, incrementally assigned as entries are read and added. It will be the byte offset into the dictionary file.
      • encoding

        private java.lang.String encoding
      • normalForm

        private java.text.Normalizer.Form normalForm
    • Constructor Detail

      • TokenInfoDictionaryBuilder

        TokenInfoDictionaryBuilder(java.lang.String encoding,
                                   boolean normalizeEntries)
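        The constructor takes a boolean normalizeEntries, while the class stores a java.text.Normalizer.Form in normalForm. One plausible way to connect the two is sketched below. The choice of NFKC, the use of null to mean "no normalization", and the names EntryNormalizationSketch, chooseForm, and normalizeEntry are assumptions for illustration; this page does not state which form the class uses.

```java
import java.text.Normalizer;

// Hypothetical illustration of mapping a boolean flag to a Normalizer.Form
// and applying it to dictionary entry text.
public class EntryNormalizationSketch {

    /** Assumption: NFKC when normalization is requested, null otherwise. */
    static Normalizer.Form chooseForm(boolean normalizeEntries) {
        return normalizeEntries ? Normalizer.Form.NFKC : null;
    }

    /** Returns the entry unchanged when no form is set, else its normalized text. */
    static String normalizeEntry(String entry, Normalizer.Form form) {
        return form == null ? entry : Normalizer.normalize(entry, form);
    }

    public static void main(String[] args) {
        String fullWidthA = "\uFF21"; // FULLWIDTH LATIN CAPITAL LETTER A
        // NFKC folds the full-width letter to its ASCII counterpart.
        System.out.println(normalizeEntry(fullWidthA, chooseForm(true)));  // prints A
        // Without normalization the text passes through untouched.
        System.out.println(normalizeEntry(fullWidthA, chooseForm(false)).equals(fullWidthA)); // prints true
    }
}
```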
    • Method Detail

      • build

        public TokenInfoDictionaryWriter build(java.nio.file.Path dir)
                                        throws java.io.IOException
        Throws:
        java.io.IOException
      • buildDictionary

        private TokenInfoDictionaryWriter buildDictionary(java.util.List<java.nio.file.Path> csvFiles)
                                                   throws java.io.IOException
        Throws:
        java.io.IOException