Tuesday 31 August 2010

Index-of, replace and key-value parsing in XSLT

Lately I had to change a few BPEL Processes in a way that I had to pass multiple parameters in one field. Instead of a complete base64 encoded data-object I had to pass a key to that object in a way that I could determine that the object was in fact a key-value pair. The field contains in that case in fact two key-value pairs. I thought that if I would code the key-value pair with something like $KEY="AABBCCDDEEFF" I could search for $KEY=" and then the value after the double-quote and before the next would then be the value.

The problem with XSLT is that you have functions like substring-before(), substring-after() and positional substring, where the from-position and length can be passed as numeric values. But for the latter function you need to determine the start and end position of the sub-string to extract from the input string. But apparently XSLT does not provide something like the Pl/Sql instr or Java index-of(). Also XSLT lacks a replace function in which you can replace a string within a string with a replacement string. The xslt-replace() function does a character-by-character replace.

Fortunately you can build these functions quite easilily yourself as xslt-templates using the substring-before(), substring-after(), and string-length() functions. I found the examples somewhere and adapted them a little for my purpose. Mainly to get them case-insensitive.

Index-of-ci
Here is my Case insensitive version of the Index-of template:
<!-- index-of: find position of search within string
2010-08-31, by Martien van den Akker -->
<xsl:template name="index-of-ci">
<xsl:param name="string"/>
<xsl:param name="search"/>
<xsl:param name="startPos"/>
<xsl:variable name="searchLwr" select="xp20:lower-case($search)"/>
<xsl:variable name="work">
<xsl:choose>
<xsl:when test="string-length($startPos)&gt;0">
<xsl:value-of select="substring($string,$startPos)"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$string"/>
</xsl:otherwise>
</xsl:choose>
</xsl:variable>
<xsl:variable name="stringLwr" select="xp20:lower-case($work)"/>
<xsl:variable name="result">
<xsl:choose>
<xsl:when test="contains($stringLwr,$searchLwr)">
<xsl:variable name="stringBefore">
<xsl:value-of select="substring-before($stringLwr,$searchLwr)"/>
</xsl:variable>
<xsl:choose>
<xsl:when test="string-length($startPos)&gt;0">
<xsl:value-of select="$startPos +string-length($stringBefore)"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="1 + string-length($stringBefore)"/>
</xsl:otherwise>
</xsl:choose>
</xsl:when>
<xsl:otherwise>-1</xsl:otherwise>
</xsl:choose>
</xsl:variable>
<xsl:copy-of select="$result"/>
</xsl:template>

The template expects 3 parameters:
  • string: the string in which is searched
  • search: the substring that has to be searched
  • startPos: position in string where the search is started
The first thing the template does is declaring a work-variable. If startPos is not given, the variable work will contain the complete input string. But if startPos is given work will contain the substring of string from startPos to the end.

Then the input and the search strings are converted to lower case into new variables: andinputLwr and searchLwr. These variables are to test case-insensitively if the search string is in the input. If that is the case then the part of the input string before the search string is determined with the substring-before() function. The string-length() of the result denotes in fact the numeric position of the search string within the input string. This is incremented by 1 or by startPos depending on startPos being filled.

The template can be called without the startPos parameter:
<xsl:call-template name="index-of-ci">
<xsl:with-param name="string" select="$input"/>
<xsl:with-param name="search" select="$keyStr"/>
</xsl:call-template>

Or with the parameter:
<xsl:call-template name="index-of-ci">
<xsl:with-param name="string" select="$input"/>
<xsl:with-param name="search" select="string('&quot;')"/>
<xsl:with-param name="startPos" select="$startIdx"/>
</xsl:call-template>


Replace
The next template replaces the fromStr in input to toStr.
<!-- replace-ci: case insensitive replace based on strings 
2010-08-31, by Martien van den Akker -->
<xsl:template name="replace-ci">
<xsl:param name="input"/>
<xsl:param name="fromStr"/>
<xsl:param name="toStr"/>
<xsl:param name="startStr"/>
<xsl:if test="string-length( $input ) &gt; 0">
<xsl:variable name="posStartStr">
<xsl:call-template name="index-of-ci">
<xsl:with-param name="string"
select="$input"/>
<xsl:with-param name="search"
select="$startStr" />
</xsl:call-template>
</xsl:variable>
<xsl:variable name="startPos">
<xsl:call-template name="index-of-ci">
<xsl:with-param name="string"
select="$input"/>
<xsl:with-param name="search"
select="$fromStr" />
<xsl:with-param name="startPos"
select="$posStartStr" />
</xsl:call-template>
</xsl:variable>
<xsl:variable name="inputLwr" select="xp20:lower-case($input)"/>
<xsl:variable name="startStrLwr" select="xp20:lower-case($startStr)"/>
<xsl:choose>
<xsl:when test="contains( $input, $startStrLwr ) and contains( $inputLwr, $fromStr )">
<xsl:variable name="stringBefore" select="substring($input,1,$startPos - 1)"/>   
<xsl:variable name="stringAfter" select="substring($input,$startPos + string-length($fromStr))"/>   
<xsl:value-of select="concat($stringBefore,$toStr)"/>         
<xsl:call-template name="replace-ci">
<xsl:with-param name="input"
select="$stringAfter"/>
<xsl:with-param name="fromStr"
select="$fromStr" />
<xsl:with-param name="toStr"
select="$toStr" />
<xsl:with-param name="startStr"
select="$startStr" />
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$input"/>
</xsl:otherwise>
</xsl:choose>
</xsl:if>
</xsl:template>

Here the pos of the from string is determined using the index-of-ci template described earlier.
But this is done from the position of the startStr that marks the start of the replacement search. For example, if you want to replace domain names in email-adresses, you want to start the search of the domain after the at-sign ( '@' ).
Having the start position of the 'from'-string the part of the input before and after the 'from'-string is taken using the string-before() and strina-after() functions.
The 'to'-string is concatenated to the result of the string-before(). The result of the string-after() is used to call the template recursively to search the remainder of the input-string.


Parsing keys
The following template parses a key value like $KEY="AABBCCDDEEFF"
<!-- Parse a KeyValue
2010-08-31, By Martien van den Akker -->
<xsl:template name="getKeyValue">
<xsl:param name="input"/>
<xsl:param name="key"/>
<xsl:param name="default"/>
<!-- Init variables -->
<xsl:variable name="keyStr" select="concat('$',$key,'=&quot;')"/>
<xsl:if test="string-length( $input ) &gt; 0">
<xsl:variable name="startIdxKey">
<xsl:call-template name="index-of-ci">
<xsl:with-param name="string" select="$input"/>
<xsl:with-param name="search" select="$keyStr"/>
</xsl:call-template>
</xsl:variable>
<xsl:variable name="keyLength" select="string-length($keyStr)"/>
<xsl:variable name="startIdx" select="$startIdxKey+$keyLength"/>
<xsl:variable name="endIdx">
<xsl:call-template name="index-of-ci">
<xsl:with-param name="string" select="$input"/>
<xsl:with-param name="search" select="string('&quot;')"/>
<xsl:with-param name="startPos" select="$startIdx"/>
</xsl:call-template>
</xsl:variable>
<!-- Determine value -->
<xsl:choose>
<xsl:when test="$startIdxKey&gt;=0 and $endIdx&gt;=0">
<xsl:value-of select="substring($input,$startIdx, $endIdx - $startIdx)"/>
</xsl:when>
<xsl:when test="$startIdxKey>0 and $endIdx&lt;0">
<xsl:value-of select="substring($input,$startIdx)"/>
</xsl:when>
<xsl:otherwise>
<xsl:if test="$default='Y'">
<xsl:value-of select="$input"/>
</xsl:if>
</xsl:otherwise>
</xsl:choose>
</xsl:if>
</xsl:template>
The working is quite similar to the templates above. If you understand the replace-template, you wouldn't have trouble with this one. Basically the value is search using the index-of-ci template with a concatenation of '$', the key and '="'. The string after that is the value, with a double quote as the end delimiter.
Having this template you can search for the key any where in the string, even if there are multiple key-value pairs.
The template can be called like:
<ns1:Id>
<xsl:call-template name="getKeyValue">
<xsl:with-param name="input" select="/ns2:aap/ns2:noot"/>
<xsl:with-param name="key" select="'KEY'"/>
<xsl:with-param name="default" select="'Y'"/>
</xsl:call-template>
</ns1:Id>


The parameter 'default' is optional: if it is set to 'Y' and the KEY is not found in the input string, then the value of the input is returned as result. Otherwise nothing is returned.

Conclusion
I found these templates very helpful. Using them you can do almost any string replacements in XSLT. ALso to me they function as example to cope with more advanced XSLT-challenges.