Thursday, November 12, 2015

StringTokenizer to split on multiple delimiters in Java

When a string has to be split on more than one delimiter, String.split is usually the first thing that comes to mind. We split once, then split the resulting pieces again, and if those pieces need to be broken up further, we split yet again.

Let me explain with an example. Say we have a URL that needs to be parsed.

URL: http://localhost/system/config/file/action/updateLogo?fileName=largetc.jpg&text=Company%20Logo

So if we have to read all the path parameters and then also read the arguments, we first have to split on the '/' character as below:

String[] urlTokens = urlFullPath.split("/");
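
To make the result of that first split concrete, here is a small runnable sketch using the URL from above. The class name SplitOnSlash and the helper parts are made up for illustration. Note the empty string that split produces between the two slashes of "http://".

```java
import java.util.Arrays;
import java.util.List;

class SplitOnSlash {
    // Hypothetical helper: splits on '/' and returns the pieces as a list.
    static List<String> parts(String urlFullPath) {
        return Arrays.asList(urlFullPath.split("/"));
    }

    public static void main(String[] args) {
        String urlFullPath =
            "http://localhost/system/config/file/action/updateLogo?fileName=largetc.jpg&text=Company%20Logo";
        // Brackets make the empty piece visible in the output.
        for (String part : parts(urlFullPath)) {
            System.out.println("[" + part + "]");
        }
        // prints: [http:] [] [localhost] [system] [config] [file] [action]
        //         [updateLogo?fileName=largetc.jpg&text=Company%20Logo]
        // (one piece per line)
    }
}
```

The last piece still holds the whole query string, which is why more splits are needed to get at the individual arguments.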

Now let us say that we want to read the query arguments that are passed in the URL. We then have to split the path string several more times to reach the desired values. Below is one way to do it.

String[] urlTokens = urlFullPath.split("/");
for (String urlPath : urlTokens) {
    // only the last path segment carries the query string
    if (urlPath.contains("?")) {
        // argTokens[0] is the path segment, argTokens[1] is the query string
        String[] argTokens = urlPath.split("\\?");
        String[] argsParts = argTokens[1].split("&");
        for (String args : argsParts) {
            System.out.println("Args: " + args);
        }
    }
}

The output of the above piece of Java code would look something like:
Args: fileName=largetc.jpg
Args: text=Company%20Logo

That, as you can see, is a lot of splits. Is there a better way? Probably a dozen of them. Here is one that uses StringTokenizer for easier manipulation.

There are two ways to specify a delimiter for a StringTokenizer object.

  1. Pass the delimiters to the StringTokenizer constructor when creating the object
  2. Pass the delimiters you are looking for next to the nextToken() method at runtime

Using StringTokenizer constructor

The constructor accepts a string containing all the characters on which the input has to be split. Each character in that string is treated as a separate delimiter:
StringTokenizer stringTokenizer = new StringTokenizer(theStringToParse, "/?&");
And then iterate over the tokens via a while loop as below:
while (stringTokenizer.hasMoreTokens())
    System.out.println(stringTokenizer.nextToken());
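
As a rough, self-contained sketch of the constructor approach, the snippet below runs the loop over the URL from the beginning of the post and collects the tokens. The class name ConstructorDelimiters and the helper tokens are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class ConstructorDelimiters {
    // Hypothetical helper: collects every token produced with the delimiter set "/?&".
    static List<String> tokens(String theStringToParse) {
        StringTokenizer stringTokenizer = new StringTokenizer(theStringToParse, "/?&");
        List<String> result = new ArrayList<>();
        while (stringTokenizer.hasMoreTokens()) {
            result.add(stringTokenizer.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        String url =
            "http://localhost/system/config/file/action/updateLogo?fileName=largetc.jpg&text=Company%20Logo";
        for (String token : tokens(url)) {
            System.out.println(token);
        }
        // prints, one per line: http: localhost system config file action
        //                       updateLogo fileName=largetc.jpg text=Company%20Logo
    }
}
```

Unlike String.split, StringTokenizer never returns an empty token, so the two consecutive slashes in "http://" do not produce an empty entry.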

Using nextToken method

To supply the delimiter at runtime, pass it as a parameter to the nextToken method. Below is one way:
while (stringTokenizer.hasMoreTokens())
    System.out.println(stringTokenizer.nextToken("/"));
The delimiter above can be changed as needed, and the new delimiter set stays in effect for later calls until it is changed again. One thing to keep in mind is that once nextToken has moved on, the previous token is lost and there is no way to go back.
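
Changing the delimiter between calls has one catch worth seeing in code. After a token is returned, the tokenizer is left sitting on the delimiter that ended it, so if the next delimiter set does not include that character, it becomes part of the next token. The sketch below shows both the catch and one way around it. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class RuntimeDelimiters {
    // Reads the path up to '?', then the arguments separated by '&'.
    // "?&" is used for the later calls so the leftover '?' is skipped.
    static List<String> pathThenArgs(String theStringToParse) {
        StringTokenizer stringTokenizer = new StringTokenizer(theStringToParse);
        List<String> result = new ArrayList<>();
        result.add(stringTokenizer.nextToken("?"));
        while (stringTokenizer.hasMoreTokens()) {
            result.add(stringTokenizer.nextToken("?&"));
        }
        return result;
    }

    // Shows the catch: with "&" alone, the '?' stays attached to the token.
    static String secondTokenWithAmpersandOnly(String theStringToParse) {
        StringTokenizer stringTokenizer = new StringTokenizer(theStringToParse);
        stringTokenizer.nextToken("?");
        return stringTokenizer.nextToken("&");
    }

    public static void main(String[] args) {
        String url =
            "http://localhost/system/config/file/action/updateLogo?fileName=largetc.jpg&text=Company%20Logo";
        System.out.println(pathThenArgs(url));
        System.out.println(secondTokenWithAmpersandOnly(url)); // prints "?fileName=largetc.jpg"
    }
}
```

For the URL used in this post, pathThenArgs returns the path up to updateLogo followed by the two arguments fileName=largetc.jpg and text=Company%20Logo.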

Split that URL

Combining all of the above, here's a piece of code that can be used to extract the arguments passed in a URL as key-value pairs.

// no delimiters given here; they are supplied to nextToken below
StringTokenizer stringTokenizer = new StringTokenizer(theStringToParse);
// iterate over the parts on either side of '?'
while (stringTokenizer.hasMoreTokens()) {
    String partOfToken = stringTokenizer.nextToken("?");
    // only the query string contains '='
    if (partOfToken.contains("=")) {
        StringTokenizer tokenizeAgain = new StringTokenizer(partOfToken, "&");
        while (tokenizeAgain.hasMoreTokens()) {
            String argument = tokenizeAgain.nextToken();
            // a limit of 2 keeps any further '=' inside the value
            String[] keyValueOfArgument = argument.split("=", 2);
            System.out.println("Key: " + keyValueOfArgument[0] + " and Value: " + keyValueOfArgument[1]);
        }
    }
}

Once wrapped in the class and main method needed to run it, and assuming the URL given at the beginning is the string to be parsed, the above piece of code prints:
Key: fileName and Value: largetc.jpg
Key: text and Value: Company%20Logo
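
If the pairs are needed later rather than just printed, they can be collected into a map. The following is only a sketch under a few assumptions of mine: the class QueryArguments and its parse method are made-up names, an argument with no '=' is stored with an empty value, and the values are left percent-encoded exactly as they appear in the URL.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringTokenizer;

class QueryArguments {
    // Hypothetical helper: returns the query arguments in the order they appear.
    static Map<String, String> parse(String theStringToParse) {
        Map<String, String> arguments = new LinkedHashMap<>();
        StringTokenizer stringTokenizer = new StringTokenizer(theStringToParse, "?");
        // Fewer than two tokens means there is no query string to read.
        if (stringTokenizer.countTokens() < 2) {
            return arguments;
        }
        stringTokenizer.nextToken(); // skip the path part
        StringTokenizer tokenizeAgain = new StringTokenizer(stringTokenizer.nextToken(), "&");
        while (tokenizeAgain.hasMoreTokens()) {
            String argument = tokenizeAgain.nextToken();
            int equalsAt = argument.indexOf('=');
            if (equalsAt < 0) {
                arguments.put(argument, ""); // assumption: no '=' means empty value
            } else {
                arguments.put(argument.substring(0, equalsAt), argument.substring(equalsAt + 1));
            }
        }
        return arguments;
    }

    public static void main(String[] args) {
        String url =
            "http://localhost/system/config/file/action/updateLogo?fileName=largetc.jpg&text=Company%20Logo";
        System.out.println(parse(url)); // prints {fileName=largetc.jpg, text=Company%20Logo}
    }
}
```

A LinkedHashMap is used here so that the arguments come back in the same order as in the URL.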