Package org.freebsd.file
Class FileEncoding
- java.lang.Object
-
- org.freebsd.file.FileEncoding
-
public class FileEncoding extends java.lang.Object
Tries to guess the encoding of a byte sequence. Original code taken from https://github.com/file/file/blob/master/src/encoding.c
-
-
Field Summary
Fields
Modifier and Type Field Description
private java.lang.String
code
private java.lang.String
codeMime
private static char[]
EBCDIC_1047_TO_8859
private static char[]
EBCDIC_TO_ASCII
private static byte
F
private static byte
I
private static byte
T
private byte[]
text_chars
private java.lang.String
type
private static byte
X
-
Constructor Summary
Constructors
Constructor Description
FileEncoding()
-
Method Summary
Modifier and Type Method Description
private byte[]
fromEbcdic(byte[] buf, int nbytes)
java.lang.String
getCode()
java.lang.String
getCodeMime()
java.lang.String
getType()
boolean
guessFileEncoding(byte[] buf)
Try to determine whether text is in some character code we can identify.
private boolean
looksAscii(byte[] buf, int nbytes)
private boolean
looksExtended(byte[] buf, int nbytes)
private boolean
looksLatin1(byte[] buf, int nbytes)
private int
looksUcs16(byte[] buf, int nbytes)
private boolean
looksUtf7(byte[] buf, int nbytes)
protected int
looksUtf8(byte[] buf, int nbytes)
private boolean
looksUtf8WithBOM(byte[] buf, int nbytes)
private int
unsignedByte(byte value)
-
-
-
Field Detail
-
type
private java.lang.String type
-
code
private java.lang.String code
-
codeMime
private java.lang.String codeMime
-
F
private static final byte F
- See Also:
- Constant Field Values
-
T
private static final byte T
- See Also:
- Constant Field Values
-
I
private static final byte I
- See Also:
- Constant Field Values
-
X
private static final byte X
- See Also:
- Constant Field Values
-
text_chars
private byte[] text_chars
-
EBCDIC_TO_ASCII
private static final char[] EBCDIC_TO_ASCII
-
EBCDIC_1047_TO_8859
private static final char[] EBCDIC_1047_TO_8859
-
-
Method Detail
-
getCodeMime
public java.lang.String getCodeMime()
-
getType
public java.lang.String getType()
-
getCode
public java.lang.String getCode()
-
guessFileEncoding
public boolean guessFileEncoding(byte[] buf)
Try to determine whether text is in some character code we can identify. It also identifies EBCDIC by converting it to ISO-8859-1.
- Returns:
- true if it could guess an encoding.
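The kind of check such a guess can start with is sketched below, assuming detection begins with byte order marks and falls back to a printable 7-bit ASCII test. The class EncodingGuessSketch and its guess method are hypothetical and not part of FileEncoding. The real class follows encoding.c and handles more cases (Latin-1, UTF-7, EBCDIC), so this is only an illustration of the idea.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not the FileEncoding implementation.
public class EncodingGuessSketch {

    /** Returns a charset name for the buffer, or null if no guess is possible. */
    public static String guess(byte[] buf) {
        if (buf == null || buf.length == 0) {
            return null;
        }
        // UTF-8 byte order mark: EF BB BF
        if (buf.length >= 3
                && (buf[0] & 0xFF) == 0xEF
                && (buf[1] & 0xFF) == 0xBB
                && (buf[2] & 0xFF) == 0xBF) {
            return "utf-8";
        }
        // UTF-16 byte order marks: FF FE (little endian), FE FF (big endian)
        if (buf.length >= 2) {
            int b0 = buf[0] & 0xFF;
            int b1 = buf[1] & 0xFF;
            if (b0 == 0xFF && b1 == 0xFE) {
                return "utf-16le";
            }
            if (b0 == 0xFE && b1 == 0xFF) {
                return "utf-16be";
            }
        }
        // Accept printable 7-bit ASCII plus tab, line feed and carriage return
        for (byte b : buf) {
            int c = b & 0xFF;
            boolean whitespace = c == '\t' || c == '\n' || c == '\r';
            if (!whitespace && (c < 0x20 || c > 0x7E)) {
                return null;
            }
        }
        return "us-ascii";
    }

    public static void main(String[] args) {
        byte[] plain = "hello\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(guess(plain)); // prints "us-ascii"
    }
}
```

Returning null for "no guess" mirrors the boolean result of guessFileEncoding, where false signals that no encoding could be identified.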
-
looksAscii
private boolean looksAscii(byte[] buf, int nbytes)
-
looksLatin1
private boolean looksLatin1(byte[] buf, int nbytes)
-
looksExtended
private boolean looksExtended(byte[] buf, int nbytes)
-
looksUtf8
protected int looksUtf8(byte[] buf, int nbytes)
-
looksUtf8WithBOM
private boolean looksUtf8WithBOM(byte[] buf, int nbytes)
-
looksUtf7
private boolean looksUtf7(byte[] buf, int nbytes)
-
looksUcs16
private int looksUcs16(byte[] buf, int nbytes)
-
fromEbcdic
private byte[] fromEbcdic(byte[] buf, int nbytes)
-
unsignedByte
private int unsignedByte(byte value)
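The method body is not shown on this page. A helper with this signature is commonly written as a mask with 0xFF, because Java bytes are signed and byte-level comparisons against values such as 0xEF need the unsigned value. The sketch below assumes that behaviour; the class name UnsignedByteSketch is hypothetical.

```java
// Hypothetical illustration; assumes unsignedByte masks the value with 0xFF.
public class UnsignedByteSketch {

    /** Maps a signed Java byte (-128..127) to its unsigned value (0..255). */
    public static int unsignedByte(byte value) {
        return value & 0xFF;
    }

    public static void main(String[] args) {
        System.out.println(unsignedByte((byte) 0xEF)); // prints 239
    }
}
```

The standard library method Byte.toUnsignedInt(byte) performs the same conversion.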
-
-