Thursday 21 December 2017

Run SQLcl from ANT

I think it was about a year ago that Oracle released SQLcl, which can be seen as the command-line variant of SQL Developer. But even better: it is a replacement for SQL*Plus.

A few years ago I created what I called an InfraPatch framework, to do preparations on an infrastructure as a prerequisite for the deployment of services and/or applications. It can run WLST scripts for creating datasources, JMS queues, etc. It also supported running database scripts, but that required a SQL*Plus installation, for instance via the instant client. Since it was part of a release/deploy toolset, where the created release is to be deployed by an IT admin on a test, acceptance or production environment, I had to rely on a correct Oracle/instant client installation in an agreed location.

I'm in the process of revamping that framework and renamed it to InfraPrep, since preparing an infrastructural environment makes it clearer what it does. (It does not patch a system with Oracle patches...).

Now I'm at the point that I have to implement support for running database scripts. The framework uses ANT, which in fact is Java. And SQLcl has two big advantages that make it ideal to use in my InfraPrep framework:
  • It is incredibly small: it's only 19MB! And that includes the ojdbc and xmlparser jars. Since I used ANT from a Fusion Middleware home, I could make it even smaller!
  • It is Java, so I can leverage the ANT java task.
 So, how to call SQLcl from ANT? I need a few ingredients:
  • Download and unzip SQLcl into my Ant project and add a sqlcl.home property:
    sqlcl.home=${basedir}/sqlcl
  • The actual SQLcl jar file, for which I add the sqlcl.jar property:
    sqlcl.jar=oracle.sqldeveloper.sqlcl.jar
  • The main class file = oracle.dbtools.raptor.scriptrunner.cmdline.SqlCli
These ingredients can be found in the sql.bat in the bin folder of the SQLcl download.
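
In essence (a sketch of what sql.bat boils down to, with a made-up script name), calling SQLcl is just a plain java invocation of that main class with the SQLcl jar on the classpath:

    java -cp sqlcl/lib/oracle.sqldeveloper.sqlcl.jar oracle.dbtools.raptor.scriptrunner.cmdline.SqlCli <user>/<password>@"<db-url>" @myScript.sql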

Then of course in my environment property file I need the user name, password and database url.
Something like:
DWN.dbUrl=(description=(address=(host=darlin-vce-db.org.darwinit.local)(protocol=tcp)(port=1521))(connect_data=(service_name=orcl)))
DWN.dbUserName=dwn_owner
DWN.dbPassword=dwn_owner

I used a TNS-style database URL, since it is the same as the one used in the creation of the corresponding DataSource, and it can be reused to connect with SQLcl.
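
For reference, the connect argument that the macro below builds up from these properties looks something like this:

    dwn_owner/dwn_owner@"(description=(address=(host=darlin-vce-db.org.darwinit.local)(protocol=tcp)(port=1521))(connect_data=(service_name=orcl)))"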

Now, to make it easier to use and to abstract the plumbing into a sort of ANT task, I created a macrodef:


  <!-- Macro to run a database script using SQLcl -->
  <macrodef name="runDbScript">
    <attribute name="dbuser"/>
    <attribute name="dbpassword"/>
    <attribute name="dburl"/>
    <attribute name="dbscript"/>
    <sequential>
      <logMessage message="DatabaseUrl: @{dburl}" level="info"/>
      <logMessage message="DatabaseUser: @{dbuser}" level="info"/>
      <logMessage message="DatabasePassword: ****" level="info"/>
      <property name="dbConnectStr" value='@{dbuser}/@{dbpassword}@"@{dburl}"'/>
      <property name="dbScript.absPath" location="@{dbscript}"/>
      <property name="dbScriptArg" value="@${dbScript.absPath}"/>
      <logMessage message="Run Database script: ${dbScriptArg}" level="info"/>
      <record name="${log.file}" action="start" append="true"/>
      <java classname="oracle.dbtools.raptor.scriptrunner.cmdline.SqlCli" failonerror="true" fork="true">
        <arg value="${dbConnectStr}"/>
        <arg value="${dbScriptArg}"/>
        <classpath>
          <pathelement location="${sqlcl.home}/lib/${sqlcl.jar}"/>
        </classpath>
      </java>
      <record name="${log.file}" action="stop"/>
    </sequential>
  </macrodef>

In this macro definition, I first build up a database connect string using the username, password and database URL:
      <property name="dbConnectStr" value='@{dbuser}/@{dbpassword}@"@{dburl}"'/>
Then I use a little trick to create an absolute path of the dbscript path:
      <property name="dbScript.absPath" location="@{dbscript}"/>
The trick is in the location attribute of the property.
And since that now is a property instead of an attribute, I circumvented the need for escaping the @ character:
      <property name="dbScriptArg" value="@${dbScript.absPath}"/>
The logmessage task you see is another macrodef I use:
      <macrodef name="logMessage">
            <attribute name="message" default=""/>
            <attribute name="level" default="debug"/>
            <sequential>
                  <echo message="@{message}" level="@{level}"></echo>
                  <echo file="${log.file}" append="true"
                        message="@{message}${line.separator}" level="@{level}"></echo>
            </sequential>
      </macrodef>

It echoes the message both to the console and to a log file.
Since I want the output of the java task in the same log file as well, I enclosed the java task with record tasks that start and stop appending the output stream to the log file.

The java task is pretty simple, referencing the jar file in the classpath and providing the connect string and the script run argument as two separate arguments.
There are however two important properties:
  • failonerror="true": I want to quit my ANT scripting when the database script fails.
  • fork="true": when providing the exit statement in the SQL script, SQLcl tries to quit the JVM. This is not allowed when it runs, as it does by default, in the same JVM as ANT. But not providing the exit statement in the script will keep the thread hanging in SQLcl, which is not acceptable either. So, forking the JVM allows SQLcl to quit properly.
Now, the macro can be called as follows:
    <propertycopy name="dbUser" from="${database}.dbUserName"/>
    <propertycopy name="dbUrl" from="${database}.dbUrl"/>
    <propertycopy name="dbPassword" from="${database}.dbPassword"/>
    <runDbScript dbuser="${dbUser}" dbpassword="${dbPassword}" dburl="${dbUrl}" dbscript="${prep.folder}/${dbScript}"/>

Where these properties are used:
database=DWN
dbScript=sample.sql

And the sample.sql file:
select * from global_name;
exit;

And this works like a charm:
runPrep:
     [echo] Script voor uitvoeren van database script.
     [echo] Environment:
     [echo] Prep folder: ../../infraPreps/BpmDbS0004
     [echo] Load prep property file ../../infraPreps/BpmDbS0004/BpmDbS0004.properties
     [echo] Run Script
     [echo] DatabaseUrl: (description=(address=(host=darlin-vce-db.org.darwinit.local)(protocol=tcp)(port=1521))(connect_data=(service_name=orcl)))
     [echo] DatabaseUser: dwn_owner
     [echo] DatabasePassword: ****
     [echo] Run Database script: @c:\temp\FMWReleaseAll\DWN\1.0.0\infraPreps\BpmDbS0004\sample.sql
     [java]
     [java] SQLcl: Release 17.3.0 Production on do dec 21 11:18:50 2017
     [java]
     [java] Copyright (c) 1982, 2017, Oracle.  All rights reserved.
     [java]
     [java] Connected to:
     [java] Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
     [java] With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
     [java] Data Mining and Real Application Testing options
     [java]
     [java]
     [java] GLOBAL_NAME
     [java] --------------------------------------------------------------------------------
     [java] ORCL
     [java]
     [java]
     [java] Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
     [java] With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
     [java] Data Mining and Real Application Testing options
     [echo] Done running preperations.

BUILD SUCCESSFUL
Total time: 12 seconds

One thing still to be arranged is fetching the username/password from the command line, instead of from properties. This can be done as follows:
    <input message="Enter database user for environment ${database}: " addproperty="db.user"/>
    <input message="Enter password for user ${db.user}: " addproperty="db.password">
      <handler classname="org.apache.tools.ant.input.SecureInputHandler"/>
    </input>
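
Those input properties can then be passed straight into the macro, along these lines (a sketch, reusing the macro and properties from above):

    <runDbScript dbuser="${db.user}" dbpassword="${db.password}" dburl="${dbUrl}" dbscript="${prep.folder}/${dbScript}"/>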

Conclusion

SQLcl is great, since it is small and in Java. So it turns out to be incredibly easy to distribute it within your own framework.

Wednesday 20 December 2017

OSB 12c Customization in WLST, some new insights: use the right jar for the job!

Problem setting and investigation

Years ago I created a Release & Deploy framework for Fusion Middleware, also supporting Oracle Service Bus. Recently I revamped it to use 12c. It uses WLST to import the OSB service to the Service Bus, including the execution of the customization file.

There are lots of examples to do this, but I want to zoom in on the execution of the customization file.

The WLST function I use to do this is as follows:
#=======================================================================================
# Function to execute the customization file.
#=======================================================================================
def executeCustomization(ALSBConfigurationMBean, createdRefList, customizationFile):
    if customizationFile!=None:
      print 'Loading customization File', customizationFile
      inputStream = FileInputStream(customizationFile)
      if inputStream != None:
        customizationList = Customization.fromXML(inputStream)
        if customizationList != None:
          filteredCustomizationList = ArrayList()
          setRef = HashSet(createdRefList)
          print 'Filter to remove None customizations' 
          print "-----"
          # Apply a filter to all the customizations to narrow the target to the created resources
          print 'Number of customizations in list: ', customizationList.size()
          for customization in customizationList:
            print "Add customization to list: "
            if customization != None:
              print 'Customization: ', customization, " - ", customization.getDescription()
              newCustomization = customization.clone(setRef)
              filteredCustomizationList.add(newCustomization)
            else:
              print "Customization is None!"
            print "-----"
          print 'Number of resulting customizations in list: ', filteredCustomizationList.size()
          ALSBConfigurationMBean.customize(filteredCustomizationList)
        else:
          print 'CustomizationList is null!'
      else:
        print 'Input Stream for customization file is null!'
    else:
      print 'No customization File provided, skip customization.'
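
This function (and the import-plan snippet below) relies on a few Java imports; roughly these (your script probably declares them already at the top):

from java.io import FileInputStream
from java.util import ArrayList, HashMap, HashSet
from com.bea.wli.config.customization import Customization
from com.bea.wli.sb.util import Refs
from com.bea.wli.sb.management.importexport import ALSBImportOperation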

The parameter ALSBConfigurationMBean can be fetched with:
...
        sessionName = createSessionName()
        print 'Created session', sessionName
        SessionMBean = getSessionManagementMBean(sessionName)
        print 'SessionMBean started session'
        ALSBConfigurationMBean = findService(String("ALSBConfiguration.").concat(sessionName), "com.bea.wli.sb.management.configuration.ALSBConfigurationMBean")
...
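
The createSessionName and getSessionManagementMBean helpers are the usual ones from the Oracle sample scripts; roughly like this (a sketch, to be used after connect() and domainRuntime()):

from java.lang import String, Long, System
from com.bea.wli.sb.management.configuration import SessionManagementMBean

def createSessionName():
    # generate a unique session name
    return String("WLSTSession_").concat(Long(System.currentTimeMillis()).toString())

def getSessionManagementMBean(sessionName):
    # create a new Service Bus session and return its management MBean
    SessionMBean = findService(SessionManagementMBean.NAME, SessionManagementMBean.TYPE)
    SessionMBean.createSession(sessionName)
    return SessionMBean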

The other parameter is the createdRefList, which is built up from the default ImportPlan during the import of the config jar:
...
            print 'OSB project', project, 'will get updated'
            osbJarInfo = ALSBConfigurationMBean.getImportJarInfo()
            osbImportPlan = osbJarInfo.getDefaultImportPlan()
            osbImportPlan.setPassphrase(passphrase)
            operationMap=HashMap()
            operationMap = osbImportPlan.getOperations()
            print
            print 'Default importPlan'
            printOpMap(operationMap)
            set = operationMap.entrySet()

            osbImportPlan.setPreserveExistingEnvValues(true)

            #boolean
            abort = false
            #list of created artifact refences
            createdRefList = ArrayList()
            for entry in set:
                ref = entry.getKey()
                op = entry.getValue()
                #set different logic based on the resource type
                type = ref.getTypeId()
                if type == Refs.SERVICE_ACCOUNT_TYPE or type == Refs.SERVICE_PROVIDER_TYPE:
                    if op.getOperation() == ALSBImportOperation.Operation.Create:
                        print 'Unable to import a service account or a service provider on a target system', ref
                        abort = true
                else:
                    #keep the list of created resources
                    print 'ref: ',ref
                    createdRefList.add(ref)
            if abort == true :
                print 'This jar must be imported manually to resolve the service account and service provider dependencies'
                SessionMBean.discardSession(sessionName)
                raise
            print
            print 'Modified importPlan'
            printOpMap(operationMap)
            importResult = ALSBConfigurationMBean.importUploaded(osbImportPlan)
            printDiagMap(importResult.getImportDiagnostics())              
            if importResult.getFailed().isEmpty() == false:
                print 'One or more resources could not be imported properly'
                raise
...

The point is to build up a set of references to the created artefacts, to narrow down the customizations so that they are only executed on the artefacts that are actually imported.

Now, back to the executeCustomization function. It first creates an InputStream on the customization file:
inputStream = FileInputStream(customizationFile)

on which it builds a list of customizations using the .fromXML method of the Customization object:
        customizationList = Customization.fromXML(inputStream)

These customizations are interpreted from the customization file. If you open that file, you can find several customization elements:
 <cus:customization xsi:type="cus:EnvValueActionsCustomizationType">
        <cus:description/>
...
    <cus:customization xsi:type="cus:FindAndReplaceCustomizationType">
        <cus:description/>
...
    <cus:customization xsi:type="cus:ReferenceCustomizationType">
        <cus:description/>


These all map to subclasses of Customization. And now the reason that I write this blog post is that I ran into a problem with my import tooling. The EnvValueActionsCustomizationType is where the endpoint replacements for the target environments are done. And those weren't executed. In fact these customizations were in the customizationList, but as a None/Null object. Thus, executing this complete list using ALSBConfigurationMBean.customize(filteredCustomizationList) would run into an exception, referring to a null object in the customization list. That's why they're filtered out. But why weren't these interpreted by the .fromXML() method?

Strangely enough, in the Java API docs of 12.2.1 the EnvValueActionsCustomization does not exist, but the EnvValueCustomization does. But searching My Oracle Support shows in Note 1679528.2: 'A new customization type EnvValueActionsCustomizationType is available in 12c which is used when creating a configuration plan file.' And here in the Java API doc (click on com.bea.wli.config.customization) it is stated that EnvValueCustomization is deprecated and EnvValueActionsCustomization should be used instead.
Apparently the docs are not updated completely...
It also seemed that I used a wrong jar file: the customization file was created using the console, and executing it via the console did perform the endpoint replacements. So I figured that I must be using a wrong version of the jar file.
So I searched on my BPM quickstart installation (12.2.1.2) for the class EnvValueCustomization:
Jar files containing EnvValueCustomization
  • C:\Oracle\JDeveloper\12210_BPMQS\osb\lib\modules\oracle.servicebus.configfwk.jar/com\bea\wli\config\customization\EnvValueCustomization.class
  • C:\Oracle\JDeveloper\12210_BPMQS\oep\spark\lib\spark-osa.jar/com\bea\wli\config\customization\EnvValueCustomization.class
  • C:\Oracle\JDeveloper\12210_BPMQS\oep\common\modules\com.bea.common.configfwk_1.3.0.0.jar/com\bea\wli\config\customization\EnvValueCustomization.class
And then I did a search with EnvValueActionsCustomization.
Jar files containing EnvValueActionsCustomization:
  • C:\Oracle\JDeveloper\12210_BPMQS\osb\lib\modules\oracle.servicebus.configfwk.jar/com\bea\wli\config\customization\EnvValueActionsCustomization.class

Solution

It turns out that in my ANT script I used:
<path id="library.osb">
  <fileset dir="${fmw.home}/oep/common/modules">
     <include name="com.bea.common.configfwk_1.3.0.0.jar"/>
  </fileset> 
  <fileset dir="${weblogic.home}/server/lib">
    <include name="weblogic.jar"/>
    <include name="wls-api.jar"/>
  </fileset>
  <fileset dir="${osb.home}/lib">
    <include name="alsb.jar"/>
  </fileset>
</path>

Where I should use:
<path id="library.osb">
  <fileset dir="${fmw.home}/osb/lib/modules">
    <include name="oracle.servicebus.configfwk.jar"/>
  </fileset>
  <fileset dir="${weblogic.home}/server/lib">
    <include name="weblogic.jar"/>
    <include name="wls-api.jar"/>
  </fileset>
  <fileset dir="${osb.home}/lib">
    <include name="alsb.jar"/>
  </fileset>
</path>
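
To double-check at runtime which jar actually serves the class, you can print its code source from WLST; a minimal sketch:

from java.lang import Class

# prints the location of the jar the Customization class was loaded from
customizationClass = Class.forName("com.bea.wli.config.customization.Customization")
print customizationClass.getProtectionDomain().getCodeSource().getLocation()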

Conclusion

It took me quite some time to debug this, but I learned how the customization works. I found quite some examples that use com.bea.common.configfwk_1.X.0.0.jar. And apparently during my revamping I updated this classpath (actually I had 1.7, and found only 1.3 in my environment). But somehow Oracle found it sensible to replace it with oracle.servicebus.configfwk.jar, while keeping the old jar files around.
So use the right Jar for the job!

Monday 18 December 2017

Create the SOA/BPM Demo User Community, with just WLST.

As said in my previous post (I've learned somewhere you should not post twice on the same day, but spread it out over time), I'm delivering a BPM 12c training, and I based it on the BPM QuickStart. Although nice for unit tests and development, the integrated WebLogic lacks a sufficient set of users to test your task definitions.

Oracle has a demo community and a set of ANT and Servlet based scripts to provision your SOA or BPM Suite environment with a set of American literature writers, to be used in demos and trainings. I somehow found this years ago and had it debugged so it could be used in 12.1.3. However, I did not know where I got it and whether I was free to distribute it.

Apparently it is, and you can find it on My Oracle Support. Our friends at Avio Consulting also did a good job in making it available and working with 12c. However, I could not make it work smoothly end-to-end. I got it seeded, but figured that I would not need ANT and a Servlet.

Last year, in 2016, I created a bit of WLST scripting to create users for OSB and have them assigned to OSB Application roles. You can read about that here for the user creation, and here for the app-role assignment.

One thing that's missing in those scripts is the setting of the user attributes. So I googled around and found a means to add those too.

First, I had to transform the demo community seeding XML file into a property file. Like this:

#
cdickens.password=welcome1
cdickens.description=Demo User
cdickens.email=cdickens@emailExample.com
cdickens.title=CEO
cdickens.firstName=Charles
cdickens.lastName=Dickens
cdickens.timeZone=America/Los_Angeles
cdickens.languagePreference=en-US
cdickens.workPhone=100000001
cdickens.homePhone=200000001
cdickens.mobile=300000001
cdickens.im=jabber|cdickens@exampleIM.com

The complete usersAndGroups.properties file is available here.
In an earlier blog I wrote about how to read a property file. But my preferred method does not allow me to determine the property to be fetched dynamically. That's why I split off a basic createDemoUsers.properties file, which refers to the usersAndGroups.properties file and contains the properties referring to the Oracle/JDeveloper home and the connection details for the AdminServer. This property file also contains comma-separated lists of users, groups and AppRoles to be created or granted.
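
For illustration, such a createDemoUsers.properties file could look roughly like this (the property names and values are my own, just to sketch the idea):

adminUrl=t3://localhost:7101
adminUser=weblogic
usersAndGroupsPropertyFile=usersAndGroups.properties
userList=cdickens,jcooper,jstein,wfaulk
groupList=Supervisor,LoanAnalyticGroup
appRoleList=SOAOperator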

The actual createDemoUsers.py file loops over the three lists and creates the particular users and groups, and grants the AppRoles.
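
In essence, the user-creation part of that loop boils down to something like this (a simplified sketch, assuming the property layout shown above and a connected WLST session):

from java.io import FileInputStream
from java.util import Properties

# load the script properties and the user/group definitions
scriptProps = Properties()
scriptProps.load(FileInputStream("createDemoUsers.properties"))
userProps = Properties()
userProps.load(FileInputStream("usersAndGroups.properties"))

# lookup the DefaultAuthenticator of the default realm (after connect())
authenticator = cmo.getSecurityConfiguration().getDefaultRealm().lookupAuthenticationProvider("DefaultAuthenticator")

for userName in scriptProps.getProperty("userList").split(","):
    password = userProps.getProperty(userName + ".password")
    description = userProps.getProperty(userName + ".description")
    authenticator.createUser(userName, password, description)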

To set the attributes, the setUserAttributeValue method of the authenticator MBean can be used as follows:
    #Set Properties
    firstName=userProps.getProperty(userName+".firstName")
    lastName=userProps.getProperty(userName+".lastName")
    displayName=nvl(firstName, " ")+" "+nvl(lastName, " ")
    authenticator.setUserAttributeValue(userName,"displayName",displayName.strip())
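
The nvl() used there is just a small null-coalescing helper, for instance:

def nvl(value, default):
    # return default when value is None, otherwise the value itself
    if value is None:
        return default
    return value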

I published the complete set of scripts on the GitHub repo I shared with my colleague.
You can download them all and adapt createDemoUsers.sh to refer to the correct MW_HOME of your JDeveloper environment. For Windows you might translate it to a .bat/.cmd file.

And of course you can use it for your own set of users.

I think I covered nearly all of the Demo User community. Except for management chains: I could not find how to register a manager for a user in WebLogic, neither in the console nor in WLST. So, for now I conclude it cannot be done. But if you have a tip, please be so good to leave a comment. I would highly appreciate it.

2018-08-22, Update: I found this article referencing the BPM roles, from my appreciated former Whitehorses co-worker. I should be able to integrate this into my scripts.




BPM 12.2.1.3: Exception when deploying BPM project with Human tasks

This week I'm delivering a BPM 12c workshop, which I based on the 12.2.1.3 BPM QuickStart. When the students worked on the lab on Human Workflow, they hit an error deploying the composite, where in the log you can find something like:

Caused By: oracle.fabric.common.FabricException: Error occurred during deployment of component: RequestHolidayTask to service engine: implementation.workflow, for composite: HolidayRequestProcess: ORABPEL-30257
 
exception.code:30257
exception.type: ERROR
exception.severity: 2
exception.name: Error while Querying workflow task metadata.
exception.description: Error while Querying workflow task metadata.
exception.fix: Check the underlying exception and the database connection information.  If the error persists, contact Oracle Support Services.
: exception.code:30257
exception.type: ERROR
exception.severity: 2
exception.name: Error while Querying workflow task metadata.
exception.description: Error while Querying workflow task metadata.
exception.fix: Check the underlying exception and the database connection information.  If the error persists, contact Oracle Support Services.
 
Caused By: oracle.fabric.common.FabricDeploymentException: ORABPEL-30257
 
 
Caused By: java.sql.SQLSyntaxErrorException: Column 'WFTM.PACKAGENAME' is either not in any table in the FROM list or appears within a join specification and is outside the scope of the join specification or appears in a HAVING clause and is not in the GROUP BY list. If this is a CREATE or ALTER TABLE  statement then 'WFTM.PACKAGENAME' is not a column in the target table.

Apparently in the repository of 12.2.1.3 a column is missing in the Workflow Metadata table.

Luckily, I stumbled upon a question on the community.oracle.com forum that hit this 'bug' as well, and which provided a solution. You need to do an alter table to resolve this:
ALTER TABLE SOAINFRA.WFTASKMETADATA ADD PACKAGENAME varchar (200);

The smart guy that provided the answer used a separate database UI tool. But fortunately, JDeveloper is perfectly capable of providing you the means as well.

First open the Resource Palette in JDeveloper. Make sure that you have already started your Integrated WebLogic (since that runs the Derby DB).

Then in the Resource Palette, create a new Database Connection:


Provide the following details:
Give it a name, like soainfraDB, and as a Connection Type select 'Java DB / Apache Derby'. You can leave Username and Password empty. Then as a Driver Class, choose 'org.apache.derby.jdbc.ClientDriver' (not the default). Then as a Host Name provide localhost, as a JDBC Port enter 1527 and as a Database Name enter soainfra.

You can Test Connection and then, if successful, hit OK.

Then from the IDE Connections palette, right-click your newly created database connection and choose 'Open in Databases Window':



And from there, right-click the database connection and choose 'Open SQL Worksheet':

There you can enter and execute the alter statement:
After this, deployment should succeed. Since the change is persisted in the Derby DB, it will survive restarts.
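
If you prefer to script it instead of using the SQL Worksheet, the same statement could also be executed from WLST/Jython against the Derby network server; a minimal sketch, assuming derbyclient.jar is on the classpath:

from java.lang import Class
from java.sql import DriverManager

# load the Derby client driver and connect to the integrated server's soainfra database
Class.forName("org.apache.derby.jdbc.ClientDriver")
conn = DriverManager.getConnection("jdbc:derby://localhost:1527/soainfra")
stmt = conn.createStatement()
stmt.execute("ALTER TABLE SOAINFRA.WFTASKMETADATA ADD PACKAGENAME varchar (200)")
stmt.close()
conn.close()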

This might apply to the SOA QuickStart as well (I did not try).

Friday 1 December 2017

OSB: Disable Chunked Streaming Mode recommendation

Intro

These past weeks I got involved in a document generation performance issue. It had been running for several months, maybe even years, but it stayed quite unclear what the actual issue was.

Often we got complaints that document generation from the front-end application (based on Siebel) was taking very long. End users often hit the button several times, but with no luck. Asking further, it turned out that no document appeared in the content management system (Oracle UCM/WCC) either. So we concluded that it wasn't so much a performance issue, but an exception along the document generation process. Since we had upgraded BI Publisher to 12c, it was figured that it might have something to do with that. But we did not find any problems with BI Publisher itself. There was also an issue with Siebel itself, but that's out of the scope of this article.

The investigation

First, on OSB the retry interval of the particular Business Service was decreased from 60 seconds to 10, and the performance increased. Since the retry interval was shorter, OSB did its retries on shorter notice. But of course this did not solve the problem.

As service developers we are often quite casual about retries. We make up some settings; quite common is an interval of 30 seconds and a retry count of 3. But we should actually think about this and figure out what the possible failures could be and what a sensible retry setting would be. For instance: is it likely that the remote system is out of order? What are the SLAs for getting it back up again? If the system startup takes 10 minutes, then a retry count of 3 with an interval of 30 seconds does not make sense: the retries are done long before the system is up again. But of course, in our case settings sensible for a system outage would cause delays that are too long. We apparently needed to cater for network issues.

Last week our sysadmins encountered network failures, so they changed the load balancer in front of BI Publisher, to get the chunks/packets of one request routed to the same BI Publisher node. I found SocketReadTimeouts in the log files. And on the Siebel database a query was done and plotted out in Excel, showing lots of requests in the 1-15 seconds range, but also some in ranges around 40 seconds and 80 seconds. We wondered where those came from.

The Connection and Read Timeout settings on the Business Service were set to 30 seconds. So I figured the 40 and 80 seconds ranges could have something to do with a retry interval of 10 seconds added to a timeout of 30 seconds.

I soon found out that on the OSB Business Service the Chunked Streaming Mode was enabled. This is a setting we struggled with a lot; several issues we encountered were blamed on this one. Just as a helpdesk employee would ask you whether you have restarted your system, for OSB questions I would ask you about this setting first... Actually, I did so for this case, long before I got actively involved.

Chunked Streaming Mode explained

Let's start with a diagram:

In this diagram you'll see that the OSB is fronted by a load balancer. But since 12c the Oracle HTTP Server is part of the WebLogic infrastructure, and following the Enterprise Deployment Guide we added an OHS to the WebLogic Infrastructure domain as a co-located OHS instance. And since both the OSB and the Service Provider (in our case BI Publisher) are clustered, the OHS will load balance the requests.

Now, chunked transfer encoding is an HTTP 1.1 specification. It is an improvement that allows clients to process the data in chunks right after each chunk is read. But in most (of our) cases a chunk on its own is meaningless, since a SOAP request/XML document needs to be parsed as a whole.
The load balancer also processes the chunks as separate entities. So, by default, it will route the first one to the first endpoint, and the next one to the next endpoint. And thus each SP managed server gets an incomplete message and therefore a so-called Bad Request. This happens with big requests, where for instance a report is requested together with its complete content. Then chances are that the request is split up into chunks.

But although the sysadmins adapted the SP load balancer, and although I was involved in the BI Publisher 12c setup, even I forgot about the BIP 12c OHS! So even when the load balancer tries to keep the chunks together, the OHS will mess with them again. Actually, if the load balancer did not keep them together, the OHS instances could reroute them to the correct end node again.

The Solution

So for all those Service Bus developers amongst you, I'd like you to memorize two concepts: "Chunked Streaming Mode" and "disable", the latter in combination with the first, of course.
In short: remember to disable Chunked Streaming Mode in every SOAP/HTTP based Business Service, especially for services that send potentially large requests, for instance document check-in services on content/document management systems.

The proof of the pudding

After some discussion, and not being able to test it on the acceptance test environment due to rebuilds, we decided to change this in production (which I would not recommend, at least not right away).

And this was the result:


This picture shows that in the first half of the day plenty of requests were retried at least once, and several even twice. Notice the request durations around 40 seconds (30 seconds read timeout + 10 seconds retry interval) and 80 seconds. But since 12:45, when we disabled the Chunked Streaming Mode, we don't see any timeout exceptions anymore. I hope the end users are happy now.

It shows how a simple setting can throw a spanner in the works, and how difficult it is to get such a simple change into production. Personally I think it's a pity that the Chunked Streaming Mode is enabled by default, since in most cases it causes problems, while only in rare cases it might provide some performance improvement. I think you should rationalize enabling it, instead of having to actively disable it.