- java.lang.Object
-
- jakarta.mail.internet.MimeUtility
-
public class MimeUtility extends java.lang.Object
This is a utility class that provides various MIME related functionality.
There is a set of methods to encode and decode MIME headers as per RFC 2047. Note that, in general, these methods are not needed when using methods such as setSubject and setRecipients; Jakarta Mail will automatically encode and decode data when using these "higher level" methods. The methods below are only needed when manipulating raw MIME headers using the setHeader and getHeader methods. A brief description of handling such headers is given below:
RFC 822 mail headers must contain only US-ASCII characters. Headers that contain non US-ASCII characters must be encoded so that they contain only US-ASCII characters. Basically, this process involves using either BASE64 or QP to encode certain characters. RFC 2047 describes this in detail.
In Java, Strings contain (16 bit) Unicode characters. ASCII is a subset of Unicode (and occupies the range 0 - 127). A String that contains only ASCII characters is already mail-safe. If the String contains non US-ASCII characters, it must be encoded. An additional complexity in this step is that since Unicode is not yet a widely used charset, one might want to first charset-encode the String into another charset and then do the transfer-encoding.
Note that to get the actual bytes of a mail-safe String (say, for sending over SMTP), one must do
byte[] bytes = string.getBytes("iso-8859-1");
The setHeader and addHeader methods on MimeMessage and MimeBodyPart assume that the given header values are Unicode strings that contain only US-ASCII characters. Hence the callers of those methods must ensure that the values they pass do not contain non US-ASCII characters. The methods in this class help do this.
The getHeader family of methods on MimeMessage and MimeBodyPart return the raw header value. These might be encoded as per RFC 2047, and if so, must be decoded into Unicode Strings. The methods in this class help to do this.
Several System properties control strict conformance to the MIME spec. Note that these are not session properties but must be set globally as System properties.
The mail.mime.decodetext.strict property controls decoding of MIME encoded words. The MIME spec requires that encoded words start at the beginning of a whitespace separated word. Some mailers incorrectly include encoded words in the middle of a word. If the mail.mime.decodetext.strict System property is set to "false", an attempt will be made to decode these illegal encoded words. The default is true.
The mail.mime.encodeeol.strict property controls the choice of Content-Transfer-Encoding for MIME parts that are not of type "text". Often such parts will contain textual data for which an encoding that allows normal end of line conventions is appropriate. In rare cases, such a part will appear to contain entirely textual data, but will require an encoding that preserves CR and LF characters without change. If the mail.mime.encodeeol.strict System property is set to "true", such an encoding will be used when necessary. The default is false.
In addition, the mail.mime.charset System property can be used to specify the default MIME charset to use for encoded words and text parts that don't otherwise specify a charset. Normally, the default MIME charset is derived from the default Java charset, as specified in the file.encoding System property. Most applications will have no need to explicitly set the default MIME charset. In cases where the default MIME charset to be used for mail messages is different than the charset used for files stored on the system, this property should be set.
The current implementation also supports the following System property.
The mail.mime.ignoreunknownencoding property controls whether unknown values in the Content-Transfer-Encoding header, as passed to the decode method, cause an exception. If set to "true", unknown values are ignored and 8bit encoding is assumed. Otherwise, unknown values cause a MessagingException to be thrown.
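Because these are System properties, they can be supplied with -D on the java command line or set programmatically. The sketch below uses only java.lang.System; the class name MimeSystemPropertiesSketch is hypothetical. The backing fields (decodeStrict, encodeEolStrict, ignoreUnknownEncoding) are listed as static final in the Field Detail, which suggests the values are read once at class initialization, so the sketch assumes configure() runs before any mail class is used.

```java
// Hypothetical bootstrap helper; it touches only java.lang.System.
final class MimeSystemPropertiesSketch {

    /** Sets the global MIME properties described above. */
    static void configure() {
        // Try to decode encoded words that start in the middle of a word (default: true).
        System.setProperty("mail.mime.decodetext.strict", "false");
        // Use a CR/LF-preserving encoding for non-text parts when necessary (default: false).
        System.setProperty("mail.mime.encodeeol.strict", "true");
        // Default MIME charset for encoded words and text parts that name no charset.
        System.setProperty("mail.mime.charset", "UTF-8");
        // Ignore unknown Content-Transfer-Encoding values instead of throwing.
        System.setProperty("mail.mime.ignoreunknownencoding", "true");
    }

    public static void main(String[] args) {
        configure();
        System.out.println(System.getProperty("mail.mime.charset"));
    }
}
```

The command-line equivalent of one of these settings would be -Dmail.mime.charset=UTF-8.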
-
-
Field Summary
Fields
Modifier and Type Field
static int ALL
(package private) static int ALL_ASCII
private static boolean allowUtf8
private static boolean decodeStrict
private static java.lang.String defaultJavaCharset
private static java.lang.String defaultMIMECharset
private static boolean encodeEolStrict
private static boolean foldEncodedWords
private static boolean foldText
private static boolean ignoreUnknownEncoding
private static java.util.Map<java.lang.String,java.lang.String> java2mime
private static java.util.Map<java.lang.String,java.lang.String> mime2java
(package private) static int MOSTLY_ASCII
(package private) static int MOSTLY_NONASCII
private static java.util.Map<java.lang.String,java.lang.Boolean> nonAsciiCharsetMap
private static java.lang.String TEXT_SPECIALS
private static java.lang.String WORD_SPECIALS
-
Constructor Summary
Constructors
Modifier Constructor
private MimeUtility()
-
Method Summary
Modifier and Type, Method, Description
private static int bEncodedLength(byte[] b)
Returns the length of the encoded version of this byte array.
(package private) static int checkAscii(byte[] b)
Check if the given byte array contains non US-ASCII characters.
(package private) static int checkAscii(java.io.InputStream is, int max, boolean breakOnNonAscii)
Check if the given input stream contains non US-ASCII characters.
(package private) static int checkAscii(java.lang.String s)
Check if the given string contains non US-ASCII characters.
static java.io.InputStream decode(java.io.InputStream is, java.lang.String encoding)
Decode the given input stream.
private static java.lang.String decodeInnerWords(java.lang.String word)
Look for encoded words within a word.
static java.lang.String decodeText(java.lang.String etext)
Decode "unstructured" headers, that is, headers that are defined as '*text' as per RFC 822.
static java.lang.String decodeWord(java.lang.String eword)
The string is parsed using the rules in RFC 2047 and RFC 2231 for parsing an "encoded-word".
private static void doEncode(java.lang.String string, boolean b64, java.lang.String jcharset, int avail, java.lang.String prefix, boolean first, boolean encodingWord, java.lang.StringBuilder buf)
static java.io.OutputStream encode(java.io.OutputStream os, java.lang.String encoding)
Wrap an encoder around the given output stream.
static java.io.OutputStream encode(java.io.OutputStream os, java.lang.String encoding, java.lang.String filename)
Wrap an encoder around the given output stream.
static java.lang.String encodeText(java.lang.String text)
Encode a RFC 822 "text" token into mail-safe form as per RFC 2047.
static java.lang.String encodeText(java.lang.String text, java.lang.String charset, java.lang.String encoding)
Encode a RFC 822 "text" token into mail-safe form as per RFC 2047.
static java.lang.String encodeWord(java.lang.String word)
Encode a RFC 822 "word" token into mail-safe form as per RFC 2047.
static java.lang.String encodeWord(java.lang.String word, java.lang.String charset, java.lang.String encoding)
Encode a RFC 822 "word" token into mail-safe form as per RFC 2047.
private static java.lang.String encodeWord(java.lang.String string, java.lang.String charset, java.lang.String encoding, boolean encodingWord)
static java.lang.String fold(int used, java.lang.String s)
Fold a string at linear whitespace so that each line is no longer than 76 characters, if possible.
private static boolean getBoolean(java.lang.Object value, boolean def)
Interpret the value object as a boolean, returning def if unable.
(package private) static boolean getBooleanProperty(java.util.Properties props, java.lang.String name, boolean def)
Get a boolean valued property.
(package private) static boolean getBooleanSystemProperty(java.lang.String name, boolean def)
Get a boolean valued System property.
static byte[] getBytes(java.io.InputStream is)
static byte[] getBytes(java.lang.String s)
static java.lang.String getDefaultJavaCharset()
Get the default charset corresponding to the system's current default locale.
(package private) static java.lang.String getDefaultMIMECharset()
static java.lang.String getEncoding(jakarta.activation.DataHandler dh)
Same as getEncoding(DataSource) except that instead of reading the data from an InputStream it uses the writeTo method to examine the data.
static java.lang.String getEncoding(jakarta.activation.DataSource ds)
Get the Content-Transfer-Encoding that should be applied to the input stream of this DataSource, to make it mail-safe.
private static java.lang.Object getProp(java.util.Properties props, java.lang.String name)
Get the value of the specified property.
private static int indexOfAny(java.lang.String s, java.lang.String any)
Return the first index of any of the characters in "any" in "s", or -1 if none are found.
private static int indexOfAny(java.lang.String s, java.lang.String any, int start)
static java.lang.String javaCharset(java.lang.String charset)
Convert a MIME charset name into a valid Java charset name.
private static void loadMappings(LineInputStream is, java.util.Map<java.lang.String,java.lang.String> table)
private static java.lang.String makesafe(java.lang.CharSequence s)
If the String or StringBuilder has any embedded newlines, make sure they're followed by whitespace, to prevent header injection errors.
static java.lang.String mimeCharset(java.lang.String charset)
Convert a java charset into its MIME charset name.
(package private) static boolean nonascii(int b)
private static boolean nonAsciiCharset(ContentType ct)
Determine whether the charset in the Content-Type is compatible with ASCII or not.
private static int qEncodedLength(byte[] b, boolean encodingWord)
Returns the length of the encoded version of this byte array.
static java.lang.String quote(java.lang.String word, java.lang.String specials)
A utility method to quote a word, if the word contains any characters from the specified 'specials' list.
static java.lang.String unfold(java.lang.String s)
Unfold a folded header.
-
-
-
Field Detail
-
ALL
public static final int ALL
- See Also:
- Constant Field Values
-
nonAsciiCharsetMap
private static final java.util.Map<java.lang.String,java.lang.Boolean> nonAsciiCharsetMap
-
WORD_SPECIALS
private static final java.lang.String WORD_SPECIALS
- See Also:
- Constant Field Values
-
TEXT_SPECIALS
private static final java.lang.String TEXT_SPECIALS
- See Also:
- Constant Field Values
-
decodeStrict
private static final boolean decodeStrict
-
encodeEolStrict
private static final boolean encodeEolStrict
-
ignoreUnknownEncoding
private static final boolean ignoreUnknownEncoding
-
allowUtf8
private static final boolean allowUtf8
-
foldEncodedWords
private static final boolean foldEncodedWords
-
foldText
private static final boolean foldText
-
defaultJavaCharset
private static java.lang.String defaultJavaCharset
-
defaultMIMECharset
private static java.lang.String defaultMIMECharset
-
mime2java
private static java.util.Map<java.lang.String,java.lang.String> mime2java
-
java2mime
private static java.util.Map<java.lang.String,java.lang.String> java2mime
-
ALL_ASCII
static final int ALL_ASCII
- See Also:
- Constant Field Values
-
MOSTLY_ASCII
static final int MOSTLY_ASCII
- See Also:
- Constant Field Values
-
MOSTLY_NONASCII
static final int MOSTLY_NONASCII
- See Also:
- Constant Field Values
-
-
Method Detail
-
getEncoding
public static java.lang.String getEncoding(jakarta.activation.DataSource ds)
Get the Content-Transfer-Encoding that should be applied to the input stream of this DataSource, to make it mail-safe.
The algorithm used here is:
- If the DataSource implements EncodingAware, ask it what encoding to use. If it returns non-null, return that value.
- If the primary type of this datasource is "text" and if all the bytes in its input stream are US-ASCII, then the encoding is StreamProvider.BIT7_ENCODER. If more than half of the bytes are non-US-ASCII, then the encoding is StreamProvider.BASE_64_ENCODER. If less than half of the bytes are non-US-ASCII, then the encoding is StreamProvider.QUOTED_PRINTABLE_ENCODER.
- If the primary type of this datasource is not "text", then if all the bytes of its input stream are US-ASCII, the encoding is StreamProvider.BIT7_ENCODER. If there is even one non-US-ASCII character, the encoding is StreamProvider.BASE_64_ENCODER.
- Parameters:
ds
- the DataSource
- Returns:
- the encoding. This is either StreamProvider.BIT7_ENCODER, StreamProvider.QUOTED_PRINTABLE_ENCODER or StreamProvider.BASE_64_ENCODER
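The byte-counting part of the algorithm above can be sketched with the JDK alone. This is an illustration, not the library's code: the class EncodingChoiceSketch is hypothetical, the EncodingAware step is omitted, and the returned strings are the RFC 2045 token names, which are assumed here to correspond to the StreamProvider constants. The exactly-half case is not specified above; the sketch treats it as mostly non-ASCII, following the return description of checkAscii.

```java
// Hypothetical helper showing only the ASCII-ratio decision.
final class EncodingChoiceSketch {

    /** Picks a Content-Transfer-Encoding name for the data; isText means primary type "text". */
    static String chooseEncoding(byte[] data, boolean isText) {
        int ascii = 0;
        int nonAscii = 0;
        for (byte b : data) {
            if ((b & 0xff) >= 0x80) {
                nonAscii++;
            } else {
                ascii++;
            }
        }
        if (nonAscii == 0) {
            return "7bit";              // all bytes are US-ASCII
        }
        if (!isText) {
            return "base64";            // even one non-ASCII byte in a non-text part
        }
        // Text part: mostly ASCII stays readable with quoted-printable.
        return ascii > nonAscii ? "quoted-printable" : "base64";
    }

    public static void main(String[] args) {
        byte[] sample = "caf\u00e9".getBytes(java.nio.charset.StandardCharsets.UTF_8);
        System.out.println(chooseEncoding(sample, true));
    }
}
```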
-
nonAsciiCharset
private static boolean nonAsciiCharset(ContentType ct)
Determine whether the charset in the Content-Type is compatible with ASCII or not. A charset is compatible with ASCII if the encoded byte stream representing the Unicode string "\r\n" is the ASCII characters CR and LF. For example, the utf-16be charset is not compatible with ASCII. For performance, we keep a static map that caches the results.
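The compatibility test described above can be sketched with java.nio.charset. The class AsciiCompatSketch and its method are hypothetical; it takes a charset name rather than a ContentType, and treating unknown charset names as incompatible is an assumption of this sketch, not documented behavior.

```java
import java.nio.charset.Charset;
import java.util.Arrays;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: a charset is ASCII-compatible if "\r\n" encodes to the bytes CR, LF.
final class AsciiCompatSketch {

    // Cache of earlier answers, keyed by lower-cased charset name.
    private static final Map<String, Boolean> CACHE = new ConcurrentHashMap<>();

    static boolean isAsciiCompatible(String charsetName) {
        return CACHE.computeIfAbsent(charsetName.toLowerCase(Locale.ROOT), name -> {
            try {
                byte[] encoded = "\r\n".getBytes(Charset.forName(name));
                return Arrays.equals(encoded, new byte[] {0x0D, 0x0A});
            } catch (IllegalArgumentException | UnsupportedOperationException e) {
                // Unknown, illegal, or decode-only charset: assumed incompatible here.
                return false;
            }
        });
    }

    public static void main(String[] args) {
        System.out.println(isAsciiCompatible("UTF-8"));
        System.out.println(isAsciiCompatible("utf-16be"));
    }
}
```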
-
getEncoding
public static java.lang.String getEncoding(jakarta.activation.DataHandler dh)
Same as getEncoding(DataSource) except that instead of reading the data from an InputStream it uses the writeTo method to examine the data. This is more efficient in the common case of a DataHandler created with an object and a MIME type (for example, a "text/plain" String) because all the I/O is done in this thread. In the case requiring an InputStream the DataHandler uses a thread, a pair of pipe streams, and the writeTo method to produce the data.
- Parameters:
dh
- the DataHandler
- Returns:
- the Content-Transfer-Encoding
- Since:
- JavaMail 1.2
-
decode
public static java.io.InputStream decode(java.io.InputStream is, java.lang.String encoding) throws MessagingException
Decode the given input stream. The InputStream returned is the decoded input stream. All the encodings defined in RFC 2045 are supported here. They include StreamProvider.BASE_64_ENCODER, StreamProvider.QUOTED_PRINTABLE_ENCODER, StreamProvider.BIT7_ENCODER, StreamProvider.BIT8_ENCODER, and StreamProvider.BINARY_ENCODER. In addition, StreamProvider.UU_ENCODER is also supported.
In the current implementation, if the mail.mime.ignoreunknownencoding system property is set to "true", unknown encoding values are ignored and the original InputStream is returned.
- Parameters:
is
- input stream
encoding
- the encoding of the stream.
- Returns:
- decoded input stream.
- Throws:
MessagingException
- if the encoding is unknown
-
encode
public static java.io.OutputStream encode(java.io.OutputStream os, java.lang.String encoding) throws MessagingException
Wrap an encoder around the given output stream. All the encodings defined in RFC 2045 are supported here. They include StreamProvider.BASE_64_ENCODER, StreamProvider.QUOTED_PRINTABLE_ENCODER, StreamProvider.BIT7_ENCODER, StreamProvider.BIT8_ENCODER and StreamProvider.BINARY_ENCODER. In addition, StreamProvider.UU_ENCODER is also supported.
- Parameters:
os
- output stream
encoding
- the encoding of the stream.
- Returns:
- output stream that applies the specified encoding.
- Throws:
MessagingException
- if the encoding is unknown
-
encode
public static java.io.OutputStream encode(java.io.OutputStream os, java.lang.String encoding, java.lang.String filename) throws MessagingException
Wrap an encoder around the given output stream. All the encodings defined in RFC 2045 are supported here. They include StreamProvider.BASE_64_ENCODER, StreamProvider.QUOTED_PRINTABLE_ENCODER, StreamProvider.BIT7_ENCODER, StreamProvider.BIT8_ENCODER and StreamProvider.BINARY_ENCODER. In addition, StreamProvider.UU_ENCODER is also supported. The filename parameter is used with the StreamProvider.UU_ENCODER encoding and is included in the encoded output.
- Parameters:
os
- output stream
encoding
- the encoding of the stream.
filename
- name for the file being encoded (only used with uuencode)
- Returns:
- output stream that applies the specified encoding.
- Throws:
MessagingException
- for unknown encodings
- Since:
- JavaMail 1.2
-
encodeText
public static java.lang.String encodeText(java.lang.String text) throws java.io.UnsupportedEncodingException
Encode a RFC 822 "text" token into mail-safe form as per RFC 2047.
The given Unicode string is examined for non US-ASCII characters. If the string contains only US-ASCII characters, it is returned as-is. If the string contains non US-ASCII characters, it is first character-encoded using the platform's default charset, then transfer-encoded using either the B or Q encoding. The resulting bytes are then returned as a Unicode string containing only ASCII characters.
Note that this method should be used to encode only "unstructured" RFC 822 headers.
Example of usage:
    MimePart part = ...
    String rawvalue = "FooBar Mailer, Japanese version 1.1";
    try {
        // If we know for sure that rawvalue contains only US-ASCII
        // characters, we can skip the encoding part
        part.setHeader("X-mailer", MimeUtility.encodeText(rawvalue));
    } catch (UnsupportedEncodingException e) {
        // encoding failure
    } catch (MessagingException me) {
        // setHeader() failure
    }
- Parameters:
text
- Unicode string
- Returns:
- Unicode string containing only US-ASCII characters
- Throws:
java.io.UnsupportedEncodingException
- if the encoding fails
-
encodeText
public static java.lang.String encodeText(java.lang.String text, java.lang.String charset, java.lang.String encoding) throws java.io.UnsupportedEncodingException
Encode a RFC 822 "text" token into mail-safe form as per RFC 2047.
The given Unicode string is examined for non US-ASCII characters. If the string contains only US-ASCII characters, it is returned as-is. If the string contains non US-ASCII characters, it is first character-encoded using the specified charset, then transfer-encoded using either the B or Q encoding. The resulting bytes are then returned as a Unicode string containing only ASCII characters.
Note that this method should be used to encode only "unstructured" RFC 822 headers.
- Parameters:
text
- the header value
charset
- the charset. If this parameter is null, the platform's default charset is used.
encoding
- the encoding to be used. Currently supported values are "B" and "Q". If this parameter is null, then the "Q" encoding is used if most of the characters to be encoded are in the ASCII charset, otherwise "B" encoding is used.
- Returns:
- Unicode string containing only US-ASCII characters
- Throws:
java.io.UnsupportedEncodingException
- if the charset conversion failed.
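To make the two steps above concrete (charset-encode, then transfer-encode), here is a JDK-only sketch that builds a single RFC 2047 encoded-word of the form =?charset?encoding?encoded-text?=. The class EncodedWordSketch is hypothetical and is not how this class is implemented. It requires a non-null charset and encoding, and it does not split long input to respect the 75-character limit that RFC 2047 places on an encoded-word.

```java
import java.nio.charset.Charset;
import java.util.Base64;
import java.util.Locale;

// Hypothetical illustration of the "B" and "Q" encodings from RFC 2047.
final class EncodedWordSketch {

    static String encode(String text, String charset, String encoding) {
        if (text.chars().allMatch(c -> c < 128)) {
            return text; // only US-ASCII: already mail-safe, returned as-is
        }
        String enc = encoding.toUpperCase(Locale.ROOT);
        if (!enc.equals("B") && !enc.equals("Q")) {
            throw new IllegalArgumentException("unsupported encoding: " + encoding);
        }
        byte[] bytes = text.getBytes(Charset.forName(charset)); // character-encode
        StringBuilder sb = new StringBuilder("=?")
                .append(charset).append('?').append(enc).append('?');
        if (enc.equals("B")) {
            sb.append(Base64.getEncoder().encodeToString(bytes));
        } else {
            for (byte b : bytes) {
                int c = b & 0xff;
                if (c == ' ') {
                    sb.append('_');                       // space is written as underscore
                } else if (isQSafe(c)) {
                    sb.append((char) c);                  // safe characters pass through
                } else {
                    sb.append('=').append(String.format("%02X", c)); // hex escape
                }
            }
        }
        return sb.append("?=").toString();
    }

    // A conservative safe set: letters, digits and "!*+-/".
    private static boolean isQSafe(int c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9') || "!*+-/".indexOf(c) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(encode("caf\u00e9", "UTF-8", "B")); // =?UTF-8?B?Y2Fmw6k=?=
    }
}
```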
-
decodeText
public static java.lang.String decodeText(java.lang.String etext) throws java.io.UnsupportedEncodingException
Decode "unstructured" headers, that is, headers that are defined as '*text' as per RFC 822.
The string is decoded using the algorithm specified in RFC 2047, Section 6.1. If the charset-conversion fails for any sequence, an UnsupportedEncodingException is thrown. If the String is not an RFC 2047 style encoded header, it is returned as-is.
Example of usage:
    MimePart part = ...
    String rawvalue = null;
    String value = null;
    try {
        // getHeader returns null when the header is absent
        String[] headers = part.getHeader("X-mailer");
        if (headers != null && headers.length > 0) {
            rawvalue = headers[0];
            value = MimeUtility.decodeText(rawvalue);
        }
    } catch (UnsupportedEncodingException e) {
        // Don't care; fall back to the raw value
        value = rawvalue;
    } catch (MessagingException me) {
        // getHeader() failure
    }
    return value;
- Parameters:
etext
- the possibly encoded value
- Returns:
- the decoded text
- Throws:
java.io.UnsupportedEncodingException
- if the charset conversion failed.
-
encodeWord
public static java.lang.String encodeWord(java.lang.String word) throws java.io.UnsupportedEncodingException
Encode a RFC 822 "word" token into mail-safe form as per RFC 2047.
The given Unicode string is examined for non US-ASCII characters. If the string contains only US-ASCII characters, it is returned as-is. If the string contains non US-ASCII characters, it is first character-encoded using the platform's default charset, then transfer-encoded using either the B or Q encoding. The resulting bytes are then returned as a Unicode string containing only ASCII characters.
This method is meant to be used when creating RFC 822 "phrases". The InternetAddress class, for example, uses this to encode its 'phrase' component.
- Parameters:
word
- Unicode string
- Returns:
- Unicode string containing only US-ASCII characters.
- Throws:
java.io.UnsupportedEncodingException
- if the encoding fails
-
encodeWord
public static java.lang.String encodeWord(java.lang.String word, java.lang.String charset, java.lang.String encoding) throws java.io.UnsupportedEncodingException
Encode a RFC 822 "word" token into mail-safe form as per RFC 2047.
The given Unicode string is examined for non US-ASCII characters. If the string contains only US-ASCII characters, it is returned as-is. If the string contains non US-ASCII characters, it is first character-encoded using the specified charset, then transfer-encoded using either the B or Q encoding. The resulting bytes are then returned as a Unicode string containing only ASCII characters.
- Parameters:
word
- Unicode string
charset
- the MIME charset
encoding
- the encoding to be used. Currently supported values are "B" and "Q". If this parameter is null, then the "Q" encoding is used if most of the characters to be encoded are in the ASCII charset, otherwise "B" encoding is used.
- Returns:
- Unicode string containing only US-ASCII characters
- Throws:
java.io.UnsupportedEncodingException
- if the encoding fails
-
encodeWord
private static java.lang.String encodeWord(java.lang.String string, java.lang.String charset, java.lang.String encoding, boolean encodingWord) throws java.io.UnsupportedEncodingException
- Throws:
java.io.UnsupportedEncodingException
-
bEncodedLength
private static int bEncodedLength(byte[] b)
Returns the length of the encoded version of this byte array.
- Parameters:
b
- the byte array
- Returns:
- the length
-
qEncodedLength
private static int qEncodedLength(byte[] b, boolean encodingWord)
Returns the length of the encoded version of this byte array.
- Parameters:
b
- the byte array
encodingWord
- true if encoding words, false if encoding text
- Returns:
- the length
-
doEncode
private static void doEncode(java.lang.String string, boolean b64, java.lang.String jcharset, int avail, java.lang.String prefix, boolean first, boolean encodingWord, java.lang.StringBuilder buf) throws java.io.UnsupportedEncodingException
- Throws:
java.io.UnsupportedEncodingException
-
decodeWord
public static java.lang.String decodeWord(java.lang.String eword) throws ParseException, java.io.UnsupportedEncodingException
The string is parsed using the rules in RFC 2047 and RFC 2231 for parsing an "encoded-word". If the parse fails, a ParseException is thrown. Otherwise, it is transfer-decoded, and then charset-converted into Unicode. If the charset-conversion fails, an UnsupportedEncodingException is thrown.
- Parameters:
eword
- the encoded value
- Returns:
- the decoded word
- Throws:
ParseException
- if the string is not an encoded-word as per RFC 2047 and RFC 2231.
java.io.UnsupportedEncodingException
- if the charset conversion failed.
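The parse, transfer-decode, charset-convert sequence described above can be sketched with the JDK alone. The class DecodeWordSketch is hypothetical and deliberately simplified: it throws IllegalArgumentException where this class documents ParseException and UnsupportedEncodingException, and it skips an RFC 2231 "*language" suffix on the charset rather than interpreting it.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical decoder for a single encoded-word: =?charset?encoding?encoded-text?=
final class DecodeWordSketch {

    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?([^?*]+)(?:\\*[^?]*)?\\?([BbQq])\\?([^?]*)\\?=");

    static String decode(String eword) {
        Matcher m = ENCODED_WORD.matcher(eword);
        if (!m.matches()) {
            throw new IllegalArgumentException("not an encoded-word: " + eword);
        }
        Charset charset = Charset.forName(m.group(1));
        String payload = m.group(3);
        byte[] bytes;
        if (m.group(2).equalsIgnoreCase("B")) {
            bytes = Base64.getDecoder().decode(payload);           // transfer-decode (B)
        } else {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            for (int i = 0; i < payload.length(); i++) {
                char c = payload.charAt(i);
                if (c == '_') {
                    out.write(' ');                                // underscore means space
                } else if (c == '=' && i + 2 < payload.length()) {
                    out.write(Integer.parseInt(payload.substring(i + 1, i + 3), 16));
                    i += 2;                                        // skip the two hex digits
                } else {
                    out.write(c);
                }
            }
            bytes = out.toByteArray();                             // transfer-decode (Q)
        }
        return new String(bytes, charset);                         // charset-convert
    }

    public static void main(String[] args) {
        System.out.println(decode("=?ISO-8859-1?Q?caf=E9_au_lait?="));
    }
}
```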
-
decodeInnerWords
private static java.lang.String decodeInnerWords(java.lang.String word) throws java.io.UnsupportedEncodingException
Look for encoded words within a word. The MIME spec doesn't allow this, but many broken mailers, especially Japanese mailers, produce such incorrect encodings.
- Throws:
java.io.UnsupportedEncodingException
-
quote
public static java.lang.String quote(java.lang.String word, java.lang.String specials)
A utility method to quote a word, if the word contains any characters from the specified 'specials' list.
The HeaderTokenizer class defines two special sets of delimiters - MIME and RFC 822.
This method is typically used during the generation of RFC 822 and MIME header fields.
- Parameters:
word
- word to be quoted
specials
- the set of special characters
- Returns:
- the possibly quoted word
- See Also:
HeaderTokenizer.MIME, HeaderTokenizer.RFC822
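A rough sketch of conditional quoting, using only the JDK, may help show the intent. The class QuoteSketch is hypothetical, and several details are assumptions of the sketch rather than documented behavior: an empty word is quoted, control and non-ASCII characters force quoting, and a double quote or backslash inside the word is escaped with a backslash. The specials string in main is illustrative and is not the value of either HeaderTokenizer constant.

```java
// Hypothetical illustration: quote only when the word needs it.
final class QuoteSketch {

    static String quote(String word, String specials) {
        boolean needsQuoting = word.isEmpty();
        StringBuilder sb = new StringBuilder(word.length() + 2);
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            if (c == '"' || c == '\\') {
                sb.append('\\');          // escape characters that are special inside quotes
                needsQuoting = true;
            } else if (c < 0x20 || c >= 0x7f || specials.indexOf(c) >= 0) {
                needsQuoting = true;      // control, non-ASCII, or listed special
            }
            sb.append(c);
        }
        return needsQuoting ? "\"" + sb + "\"" : word;
    }

    public static void main(String[] args) {
        String specials = " ()<>@,;:";    // illustrative set only
        System.out.println(quote("report.pdf", specials));
        System.out.println(quote("my report.pdf", specials));
    }
}
```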
-
fold
public static java.lang.String fold(int used, java.lang.String s)
Fold a string at linear whitespace so that each line is no longer than 76 characters, if possible. If there are more than 76 non-whitespace characters consecutively, the string is folded at the first whitespace after that sequence. The parameter used indicates how many characters have been used in the current line; it is usually the length of the header name.
Note that line breaks in the string aren't escaped; they probably should be.
- Parameters:
used
- characters used in line so far
s
- the string to fold
- Returns:
- the folded string
- Since:
- JavaMail 1.4
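The folding rule can be approximated with a greedy word-wrapping sketch. The class FoldSketch is hypothetical and simplifies the behavior described above: it normalizes every run of whitespace to a single space, and it always continues a folded line with CRLF followed by one space. A word longer than the limit is left unbroken, so the fold happens at the first whitespace after it.

```java
// Hypothetical greedy folder; "used" is the number of characters already on the first line.
final class FoldSketch {

    private static final int MAX_LINE = 76;

    static String fold(int used, String s) {
        StringBuilder out = new StringBuilder();
        int lineLen = used;
        for (String word : s.trim().split("\\s+")) {
            if (out.length() > 0) {
                if (lineLen + 1 + word.length() > MAX_LINE) {
                    out.append("\r\n ");   // fold: CRLF plus a leading space on the new line
                    lineLen = 1;
                } else {
                    out.append(' ');
                    lineLen++;
                }
            }
            out.append(word);
            lineLen += word.length();
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String value = String.join(" ", java.util.Collections.nCopies(12, "abcdefghi"));
        // "Subject: " occupies 9 characters before the value starts.
        System.out.println("Subject: " + fold(9, value));
    }
}
```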
-
makesafe
private static java.lang.String makesafe(java.lang.CharSequence s)
If the String or StringBuilder has any embedded newlines, make sure they're followed by whitespace, to prevent header injection errors.
-
unfold
public static java.lang.String unfold(java.lang.String s)
Unfold a folded header. Any line breaks that aren't escaped and are followed by whitespace are removed.
- Parameters:
s
- the string to unfold
- Returns:
- the unfolded string
- Since:
- JavaMail 1.4
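The core of unfolding fits in one regular expression. The class UnfoldSketch is hypothetical, and the sketch ignores escaped line breaks entirely, which is a simplification of the rule stated above. It removes a CRLF, bare CR, or bare LF only when the next character is a space or tab, and leaves that whitespace in place.

```java
// Hypothetical unfolder: drops line breaks that begin a continuation line.
final class UnfoldSketch {

    static String unfold(String s) {
        // The lookahead keeps the space or tab that follows the line break.
        return s.replaceAll("(?:\\r\\n|\\r|\\n)(?=[ \\t])", "");
    }

    public static void main(String[] args) {
        System.out.println(unfold("a long\r\n header value"));
    }
}
```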
-
indexOfAny
private static int indexOfAny(java.lang.String s, java.lang.String any)
Return the first index of any of the characters in "any" in "s", or -1 if none are found. This should be a method on String.
-
indexOfAny
private static int indexOfAny(java.lang.String s, java.lang.String any, int start)
-
javaCharset
public static java.lang.String javaCharset(java.lang.String charset)
Convert a MIME charset name into a valid Java charset name.
- Parameters:
charset
- the MIME charset name
- Returns:
- the Java charset equivalent. If a suitable mapping is not available, the passed in charset is itself returned.
-
mimeCharset
public static java.lang.String mimeCharset(java.lang.String charset)
Convert a java charset into its MIME charset name.
Note that a future version of JDK (post 1.2) might provide this functionality, in which case, we may deprecate this method then.
- Parameters:
charset
- the JDK charset
- Returns:
- the MIME/IANA equivalent. If a mapping is not possible, the passed in charset itself is returned.
- Since:
- JavaMail 1.1
-
getDefaultJavaCharset
public static java.lang.String getDefaultJavaCharset()
Get the default charset corresponding to the system's current default locale. If the System property mail.mime.charset is set, a system charset corresponding to this MIME charset will be returned.
- Returns:
- the default charset of the system's default locale, as a Java charset. (NOT a MIME charset)
- Since:
- JavaMail 1.1
-
getDefaultMIMECharset
static java.lang.String getDefaultMIMECharset()
-
loadMappings
private static void loadMappings(LineInputStream is, java.util.Map<java.lang.String,java.lang.String> table)
-
checkAscii
static int checkAscii(java.lang.String s)
Check if the given string contains non US-ASCII characters.
- Parameters:
s
- string
- Returns:
- ALL_ASCII if all characters in the string belong to the US-ASCII charset. MOSTLY_ASCII if more than half of the available characters are US-ASCII characters. Else MOSTLY_NONASCII.
-
checkAscii
static int checkAscii(byte[] b)
Check if the given byte array contains non US-ASCII characters.
- Parameters:
b
- byte array
- Returns:
- ALL_ASCII if all characters in the byte array belong to the US-ASCII charset. MOSTLY_ASCII if more than half of the available characters are US-ASCII characters. Else MOSTLY_NONASCII.
XXX - this method is no longer used
-
checkAscii
static int checkAscii(java.io.InputStream is, int max, boolean breakOnNonAscii)
Check if the given input stream contains non US-ASCII characters. Up to max bytes are checked. If max is set to ALL, then all the bytes available in this input stream are checked. If breakOnNonAscii is true the check terminates when the first non-US-ASCII character is found and MOSTLY_NONASCII is returned. Else, the check continues till max bytes or till the end of stream.
- Parameters:
is
- the input stream
max
- maximum bytes to check for. The special value ALL indicates that all the bytes in this input stream must be checked.
breakOnNonAscii
- if true, then terminate the check when the first non-US-ASCII character is found.
- Returns:
- ALL_ASCII if all characters in the stream belong to the US-ASCII charset. MOSTLY_ASCII if more than half of the available characters are US-ASCII characters. Else MOSTLY_NONASCII.
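The interplay of max and breakOnNonAscii can be sketched as follows. The class CheckAsciiSketch, its Result enum, and the sentinel value -1 for ALL are all hypothetical stand-ins for the int constants of this class. The sketch counts only bytes with the high bit set as non-ASCII and wraps I/O failures in UncheckedIOException to keep the example short.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical stream classifier following the description above.
final class CheckAsciiSketch {

    static final int ALL = -1;   // assumed sentinel meaning "read the whole stream"

    enum Result { ALL_ASCII, MOSTLY_ASCII, MOSTLY_NONASCII }

    static Result check(InputStream in, int max, boolean breakOnNonAscii) {
        int ascii = 0;
        int nonAscii = 0;
        try {
            int b;
            // Stop at end of stream, or once max bytes have been examined.
            while ((max == ALL || ascii + nonAscii < max) && (b = in.read()) != -1) {
                if (b >= 0x80) {
                    if (breakOnNonAscii) {
                        return Result.MOSTLY_NONASCII;   // early exit on first non-ASCII byte
                    }
                    nonAscii++;
                } else {
                    ascii++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (nonAscii == 0) {
            return Result.ALL_ASCII;
        }
        return ascii > nonAscii ? Result.MOSTLY_ASCII : Result.MOSTLY_NONASCII;
    }

    public static void main(String[] args) {
        byte[] data = {0x41, 0x42, 0x43, (byte) 0xE9};
        System.out.println(check(new java.io.ByteArrayInputStream(data), ALL, false));
    }
}
```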
-
nonascii
static final boolean nonascii(int b)
-
getBytes
public static byte[] getBytes(java.lang.String s)
-
getBytes
public static byte[] getBytes(java.io.InputStream is) throws java.io.IOException
- Throws:
java.io.IOException
-
getBooleanProperty
static boolean getBooleanProperty(java.util.Properties props, java.lang.String name, boolean def)
Get a boolean valued property.
- Parameters:
props
- the properties
name
- the property name
def
- default value if property not found
- Returns:
- the property value
-
getBooleanSystemProperty
static boolean getBooleanSystemProperty(java.lang.String name, boolean def)
Get a boolean valued System property.
- Parameters:
name
- the property name
def
- default value if property not found
- Returns:
- the property value
-
getProp
private static java.lang.Object getProp(java.util.Properties props, java.lang.String name)
Get the value of the specified property. If the "get" method returns null, use the getProperty method, which might cascade to a default Properties object.
-
getBoolean
private static boolean getBoolean(java.lang.Object value, boolean def)
Interpret the value object as a boolean, returning def if unable.
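The lookup order described for getProp, and one plausible reading of getBoolean, can be sketched with java.util.Properties. The class BooleanPropSketch is hypothetical. In particular, the string rule used here (only an explicit "false" overrides a true default, and only an explicit "true" overrides a false default) is an assumption of the sketch, since the description above does not spell it out.

```java
import java.util.Properties;

// Hypothetical helpers mirroring the lookup described above.
final class BooleanPropSketch {

    static Object getProp(Properties props, String name) {
        Object value = props.get(name);            // sees non-String values, ignores defaults
        return value != null ? value : props.getProperty(name); // may cascade to defaults
    }

    static boolean getBoolean(Object value, boolean def) {
        if (value instanceof Boolean) {
            return (Boolean) value;
        }
        if (value instanceof String) {
            String s = (String) value;
            // Assumed rule: only the explicit opposite value overrides the default.
            return def ? !s.equalsIgnoreCase("false") : s.equalsIgnoreCase("true");
        }
        return def;                                // null or an unrecognized type
    }

    static boolean getBooleanProperty(Properties props, String name, boolean def) {
        return getBoolean(getProp(props, name), def);
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();
        defaults.setProperty("example.flag", "true");   // hypothetical property name
        Properties props = new Properties(defaults);
        System.out.println(getBooleanProperty(props, "example.flag", false));
    }
}
```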
-
-