If you work on web applications, you need to know how to encode a String before placing it in a URL so that it reaches a servlet or a CGI program intact. An encoded string contains only characters that are safe in a URL, and the receiver can decode it back to the original value. Java provides the encode() method of the URLEncoder class for this.
URL encoding to handle special characters
In a GET request, parameter values are passed as strings in the URL after the question mark (?). Multiple parameters are separated by the ampersand (&) symbol. This is easy when a single simple string is passed, and an HTML form appends it automatically.
However, special characters in a string must be handled properly, and encoding is the best way to do it. The most common character encoding used for this is UTF-8. Let us say you want to pass the parameters ‘joe april’ and ‘@math!\science\myfav90%’. The white space and characters like @, !, \, and % cannot be sent as they are. The URLEncoder class defines the rules for this. As per the docs, white space is replaced by ‘+’, and every other unsafe character is first converted to one or more bytes using an encoding scheme (like UTF-8), and then each byte is represented by the 3-character string ‘%ab’, where ab is the two-digit hexadecimal representation of the byte. For more information, check the Javadocs.
String param1 = "joe april";
String param1Encoded = URLEncoder.encode(param1, "UTF-8");
System.out.println("param1 after encoding:" + param1Encoded);
String param2 = "@math!\\science\\myfav90%";
String param2Encoded = URLEncoder.encode(param2, "UTF-8");
System.out.println("param2 after encoding:" + param2Encoded);
The output will be:
param1 after encoding:joe+april
param2 after encoding:%40math%21%5Cscience%5Cmyfav90%25
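The byte-by-byte rule is easier to see with characters that need more than one byte in UTF-8. The following is a minimal sketch; the class name PercentBytesDemo and the helper encodeUtf8 are made up for illustration, and the non-ASCII characters are written as Unicode escapes so the result does not depend on the source file's encoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class PercentBytesDemo {
    // Hypothetical helper: encodes a value with UTF-8.
    // "UTF-8" is always supported, so the checked exception is rethrown unchecked.
    public static String encodeUtf8(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) is two UTF-8 bytes: C3 A9
        System.out.println(encodeUtf8("caf\u00e9"));   // caf%C3%A9
        // U+20AC (euro sign) is three UTF-8 bytes: E2 82 AC
        System.out.println(encodeUtf8("\u20ac100"));   // %E2%82%AC100
        // Letters, digits, ".", "-", "*" and "_" pass through unchanged
        System.out.println(encodeUtf8("a-b_c.d*e"));   // a-b_c.d*e
    }
}
```

Each byte of the encoded character becomes its own %ab group, which is why one character can expand to six or nine characters in the URL.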
Note that you have to add the import java.net.URLEncoder; and either catch or declare the checked exception UnsupportedEncodingException, which this two-argument encode() method can throw.
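If you are on Java 10 or later, URLEncoder also has an encode() overload that takes a Charset instead of an encoding name, and that overload does not throw UnsupportedEncodingException. The sketch below uses it to assemble a query string from the two parameters above; the class name QueryStringDemo, the parameter names name and tag, and the example.com address are placeholders, not part of any real application.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class QueryStringDemo {
    // Joins two encoded values into a query string.
    // The parameter names "name" and "tag" are hypothetical.
    public static String buildQuery(String name, String tag) {
        return "name=" + URLEncoder.encode(name, StandardCharsets.UTF_8)
             + "&tag=" + URLEncoder.encode(tag, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String query = buildQuery("joe april", "@math!\\science\\myfav90%");
        // example.com is a placeholder host
        System.out.println("https://example.com/search?" + query);
    }
}
```

Only the values are encoded here. The ?, = and & that structure the URL are added afterwards, so they keep their special meaning.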
Characters other than the alphanumerics (a-z, A-Z, 0-9) and the special characters “.”, “-”, “*”, and “_” are considered unsafe and are encoded. The encode() method takes two arguments: the string to be encoded and the name of the encoding scheme, which is almost always UTF-8. A companion class, URLDecoder, provides a decode() method that reverses the process and returns the original string.
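A round trip through both classes can be sketched as follows. The class name RoundTripDemo and its helper methods are illustrative only; the same encoding scheme is assumed on both the encoding and the decoding side.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class RoundTripDemo {
    // Hypothetical helper: decodes a URL-encoded value with UTF-8.
    public static String decodeUtf8(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // "UTF-8" is always supported, so this should not happen
            throw new IllegalStateException(e);
        }
    }

    // Encodes and then decodes a value, returning the decoded result.
    public static String roundTrip(String original) {
        try {
            return decodeUtf8(URLEncoder.encode(original, "UTF-8"));
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String original = "@math!\\science\\myfav90%";
        System.out.println(decodeUtf8("joe+april"));               // joe april
        System.out.println(original.equals(roundTrip(original)));  // true
    }
}
```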