
Saturday, May 4, 2019

Guide on Java String getBytes

1. Overview:

getBytes() encodes the current String into a sequence of bytes using the platform's default charset and stores the result in a new byte array.

The String class has four overloads of this method:

public byte[] getBytes()
public void getBytes(int srcBegin, int srcEnd, byte[] dst, int dstBegin) // @Deprecated
public byte[] getBytes(String charsetName) throws UnsupportedEncodingException
public byte[] getBytes(Charset charset)
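The deprecated overload is the only one that performs no charset conversion: according to its Javadoc, it copies the low-order 8 bits of each character into dst. A minimal sketch of that behavior follows; the class and helper names (DeprecatedGetBytesDemo, lowOrderBytes) are made up for illustration.

```java
import java.util.Arrays;

class DeprecatedGetBytesDemo {

    // Hypothetical helper: copies every char of s into a new array using the
    // deprecated overload, which keeps only the low-order 8 bits of each char.
    @SuppressWarnings("deprecation")
    static byte[] lowOrderBytes(String s) {
        byte[] dst = new byte[s.length()];
        s.getBytes(0, s.length(), dst, 0);
        return dst;
    }

    public static void main(String[] args) {
        // ASCII characters survive unchanged.
        System.out.println(Arrays.toString(lowOrderBytes("w3schools")));
        // prints [119, 51, 115, 99, 104, 111, 111, 108, 115]

        // The euro sign is U+20AC; the high byte 0x20 is dropped and only 0xAC remains.
        System.out.println(Arrays.toString(lowOrderBytes("\u20AC")));
        // prints [-84]
    }
}
```

Because the high byte is silently discarded, this overload loses data for any character above U+00FF, which is why the charset-aware overloads are preferred.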




2. Supported Charset Names:

Every implementation of the Java platform is required to support the following standard charsets. Consult the release documentation for your implementation to see if any other charsets are supported. The behavior of such optional charsets may differ between implementations.

US-ASCII: Seven-bit ASCII, a.k.a. ISO646-US, a.k.a. the Basic Latin block of the Unicode character set
ISO-8859-1: ISO Latin Alphabet No. 1, a.k.a. ISO-LATIN-1
UTF-8: Eight-bit UCS Transformation Format
UTF-16BE: Sixteen-bit UCS Transformation Format, big-endian byte order
UTF-16LE: Sixteen-bit UCS Transformation Format, little-endian byte order
UTF-16: Sixteen-bit UCS Transformation Format, byte order identified by an optional byte-order mark
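Since Java 7, these six charsets are also available as constants in java.nio.charset.StandardCharsets, which avoids spelling the names by hand. The sketch below prints how many bytes the sample string occupies in each of them; the class name StandardCharsetsDemo and the helper encodedLength are illustrative and not part of the original article.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class StandardCharsetsDemo {

    // Hypothetical helper: number of bytes s occupies when encoded with cs.
    static int encodedLength(String s, Charset cs) {
        return s.getBytes(cs).length;
    }

    public static void main(String[] args) {
        String string = "w3schools"; // 9 ASCII characters
        Charset[] standard = {
            StandardCharsets.US_ASCII,
            StandardCharsets.ISO_8859_1,
            StandardCharsets.UTF_8,
            StandardCharsets.UTF_16BE,
            StandardCharsets.UTF_16LE,
            StandardCharsets.UTF_16
        };
        for (Charset cs : standard) {
            System.out.println(cs.name() + ": " + encodedLength(string, cs) + " bytes");
        }
    }
}
```

For this all-ASCII string, the single-byte charsets and UTF-8 give 9 bytes, UTF-16BE and UTF-16LE give 18, and UTF-16 gives 20 because Java's UTF-16 encoder writes a two-byte byte-order mark first.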

3. Examples:

The examples below cover the getBytes overloads that return a byte array.

All of them use the String object declared below.

String string = "w3schools";

3.1 getBytes()

Encodes this String into a sequence of bytes using the platform's default charset, storing the result into a new byte array.

The behavior of this method when this string cannot be encoded in the default charset is unspecified. The CharsetEncoder class should be used when more control over the encoding process is required.

byte[] bytes = string.getBytes();

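Output (the decimal value of each byte, printed without separators):

1195111599104111111108115

Read byte by byte, this is 119 51 115 99 104 111 111 108 115. The default charset is used here, so the result can differ between platforms when the string contains non-ASCII characters.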
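One way such output can be produced, next to the more readable Arrays.toString form, is sketched below. The class name PrintBytesDemo and the helper joinDecimal are assumptions for illustration; the original article does not show its printing code.

```java
import java.util.Arrays;

class PrintBytesDemo {

    // Hypothetical helper: concatenates the decimal value of every byte
    // with no separator in between.
    static String joinDecimal(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(b); // widened to int, so the decimal value is appended
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String string = "w3schools";
        byte[] bytes = string.getBytes(); // platform default charset

        System.out.println(joinDecimal(bytes));
        System.out.println(Arrays.toString(bytes));
    }
}
```

On a platform with an ASCII-compatible default charset, the second line reads [119, 51, 115, 99, 104, 111, 111, 108, 115], which is easier to map back to the individual characters.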

3.2 getBytes(String charsetName)

This overload encodes the string into a sequence of bytes using the charset named by charsetName and returns the result in a new byte array.

bytes = string.getBytes("UTF-16BE"); // 0 119 0 51 0 115 0 99 0 104 0 111 0 111 0 108 0 115
bytes = string.getBytes("UTF-16LE"); // 119 0 51 0 115 0 99 0 104 0 111 0 111 0 108 0 115 0

If the named charset is not supported, the method throws java.io.UnsupportedEncodingException. This is a checked exception, not a runtime exception, so the caller must catch or declare it. Always use a valid charset name, such as those listed in section 2, "Supported Charset Names".
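Because UnsupportedEncodingException is a checked exception (a subclass of IOException), a call to getBytes(String) has to sit in a try block or in a method that declares the exception. One possible way to wrap this is shown below, using a hypothetical helper named encode that returns an empty Optional for unknown charset names.

```java
import java.io.UnsupportedEncodingException;
import java.util.Arrays;
import java.util.Optional;

class CharsetNameDemo {

    // Hypothetical helper: encodes s with the named charset, or returns
    // Optional.empty() if the JVM does not support that charset name.
    static Optional<byte[]> encode(String s, String charsetName) {
        try {
            return Optional.of(s.getBytes(charsetName));
        } catch (UnsupportedEncodingException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String string = "w3schools";

        encode(string, "UTF-16BE")
            .ifPresent(b -> System.out.println(Arrays.toString(b)));

        // "NO-SUCH-CHARSET" is a made-up name used only to trigger the exception path.
        System.out.println(encode(string, "NO-SUCH-CHARSET").isPresent()); // prints false
    }
}
```

Whether to return an Optional, rethrow, or fall back to another charset is a design choice that depends on the caller; the sketch only shows where the exception has to be handled.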

3.3 getBytes(Charset charset)

This overload encodes the string into a sequence of bytes using the provided Charset object and returns the result in a new byte array. Unlike the charsetName variant, it does not throw a checked exception.

bytes = string.getBytes(Charset.defaultCharset()); // 119 51 115 99 104 111 111 108 115 (with an ASCII-compatible default charset)

This method always replaces malformed-input and unmappable-character sequences with this charset's default replacement byte array. The CharsetEncoder class should be used when more control over the encoding process is required.
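The difference between the silent replacement done by getBytes(Charset) and the stricter CharsetEncoder can be sketched as follows. The sample string containing a euro sign and the names StrictEncodingDemo, strictEncode and isEncodable are assumptions added for illustration; US-ASCII is used because it cannot represent the euro sign.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class StrictEncodingDemo {

    // Hypothetical helper: encodes s with cs, but fails instead of substituting
    // a replacement byte when a character is malformed or unmappable.
    static byte[] strictEncode(String s, Charset cs) throws CharacterCodingException {
        ByteBuffer buf = cs.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .encode(CharBuffer.wrap(s));
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }

    // Hypothetical helper: true if every character of s can be encoded with cs.
    static boolean isEncodable(String s, Charset cs) {
        try {
            strictEncode(s, cs);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String withEuro = "a\u20ACb"; // 'a', euro sign, 'b'

        // getBytes(Charset) substitutes the replacement byte, '?' (63) for US-ASCII.
        System.out.println(Arrays.toString(withEuro.getBytes(StandardCharsets.US_ASCII)));
        // prints [97, 63, 98]

        // The strict encoder reports the problem instead of hiding it.
        System.out.println(isEncodable(withEuro, StandardCharsets.US_ASCII));    // prints false
        System.out.println(isEncodable("w3schools", StandardCharsets.US_ASCII)); // prints true
    }
}
```

The strict variant is useful when silently writing '?' would corrupt data, for example before storing text in a system that only accepts a legacy encoding.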

4. Conclusion

In this article, we looked at the getBytes method of the String class and its overloads, and walked through examples of how they are used.

All code examples can be found at GitHub.
