Thursday 6 April 2017

BPM BAC Subversion Server refusing connections

These days I work on setting up several development lifecycle environments for BPM, SOA and OSB. What means that we setup servers for Development and test, culminating eventually in supporting the setup of the acceptance and production servers.

Since we want to have development resemble production as much as possible, we installed a dual-node clustered BPM environment, including BAM. However, since it is development, we have the two Managed Servers per cluster on the same phsyical (actually virtual) host.

In 12c BPM introduced a new component the Process Asset Manager, as described here. But installing BPM on a dual node cluster on a single host, will give problems with the underlying BAC/Subversion Component.

Errors you could encounter in the logs are for example:

Status of Start Up operation on target /Domain_o-bpm-1_domain/o-bpm-1_domain/BPMComposer is STATE_FAILED
[Deployer:149193]Operation "start" on application "BPMComposer" has failed on "bpm_server2".
java.lang.RuntimeException: weblogic.osgi.OSGiException: Could not find bundle with Bundle Symbolic Name of:org.tmatesoft.svn.core
               at weblogic.osgi.internal.OSGiAppDeploymentExtension.prepare(OSGiAppDeploymentExtension.java:273)
               at weblogic.application.internal.flow.AppDeploymentExtensionFlow.prepare(AppDeploymentExtensionFlow.java:25)
               at weblogic.application.internal.BaseDeployment$1.next(BaseDeployment.java:727)
               at weblogic.application.utils.StateMachineDriver.nextState(StateMachineDriver.java:45)
               at weblogic.application.internal.BaseDeployment.prepare(BaseDeployment.java:239)
               at weblogic.application.internal.EarDeployment.prepare(EarDeployment.java:66)
               at weblogic.application.internal.DeploymentStateChecker.prepare(DeploymentStateChecker.java:158)
               at weblogic.deploy.internal.targetserver.AppContainerInvoker.prepare(AppContainerInvoker.java:65)
               at weblogic.deploy.internal.targetserver.operations.ActivateOperation.createAndPrepareContainer(ActivateOperation.java:229)
               at weblogic.deploy.internal.targetserver.operations.StartOperation.createAndPrepareContainer(StartOperation.java:95)
               at weblogic.deploy.internal.targetserver.operations.StartOperation.doPrepare(StartOperation.java:108)
               at weblogic.deploy.internal.targetserver.operations.AbstractOperation.prepare(AbstractOperation.java:241)
               at weblogic.deploy.internal.targetserver.DeploymentManager.handleDeploymentPrepare(DeploymentManager.java:794)
               at weblogic.deploy.internal.targetserver.DeploymentManager.prepareDeploymentList(DeploymentManager.java:1340)
               at weblogic.deploy.internal.targetserver.DeploymentManager.handlePrepare(DeploymentManager.java:267)
               at weblogic.deploy.internal.targetserver.DeploymentServiceDispatcher.prepare(DeploymentServiceDispatcher.java:177)
               at weblogic.deploy.service.internal.targetserver.DeploymentReceiverCallbackDeliverer.doPrepareCallback(DeploymentReceiverCallbackDeliverer.java:186)
               at weblogic.deploy.service.internal.targetserver.DeploymentReceiverCallbackDeliverer.access$000(DeploymentReceiverCallbackDeliverer.java:14)
               at weblogic.deploy.service.internal.targetserver.DeploymentReceiverCallbackDeliverer$1.run(DeploymentReceiverCallbackDeliverer.java:47)
               at weblogic.work.SelfTuningWorkManagerImpl$WorkAdapterImpl.run(SelfTuningWorkManagerImpl.java:666)
               at weblogic.invocation.ComponentInvocationContextManager._runAs(ComponentInvocationContextManager.java:348)
               at weblogic.invocation.ComponentInvocationContextManager.runAs(ComponentInvocationContextManager.java:333)
               at weblogic.work.LivePartitionUtility.doRunWorkUnderContext(LivePartitionUtility.java:54)
               at weblogic.work.PartitionUtility.runWorkUnderContext(PartitionUtility.java:41)
               at weblogic.work.SelfTuningWorkManagerImpl.runWorkUnderContext(SelfTuningWorkManagerImpl.java:640)
               at weblogic.work.ExecuteThread.execute(ExecuteThread.java:406)
               at weblogic.work.ExecuteThread.run(ExecuteThread.java:346)
Caused by: java.lang.RuntimeException: Could not find bundle with Bundle Symbolic Name of:org.tmatesoft.svn.core
               at weblogic.osgi.internal.OSGiAppDeploymentExtension.attachToApplicationBundleClassloader(OSGiAppDeploymentExtension.java:161)
               at weblogic.osgi.internal.OSGiAppDeploymentExtension.prepare(OSGiAppDeploymentExtension.java:249)
               ... 26 more
 
[Deployer:149034]An exception occurred for task [Deployer:149026]start application BPMComposer on bpm_cluster.: weblogic.osgi.OSGiException: Could not find bundle with Bundle Symbolic Name of:org.tmatesoft.svn.core.

Or during deployment of OracleBPMBACServerApp on one of the managed servers:
[2015-02-04T07:04:05.848+00:00] [BPM_Ms1_2] [ERROR] [] [oracle.bpm.bac.svnserver] [tid: [ACTIVE].ExecuteThread: '29' for queue: 'weblogic.kernel.Default (self-tuning)'] [userId: ] [ecid: 88957d95-82b2-4365-aaea-ac834a54599d-00000003,0] [APP: OracleBPMBACServerApp] Unexpected error starting BAC SVN Server.[[
oracle.bpm.bac.subversion.server.exception.ServerException: java.net.BindException: Address already in use

WLS version: 12.1.3.0.0
authentication-provider: default
Servers: AdminServer+ local_bpel_ms1 + local_bpel_ms2
Cluster: local_bpel_cluster, cluster-messaging-mode: unicast

Configuration manager:
Protocol: PLAIN
Topology: ACTIVE_ACTIVE_CLUSTER


As described in Doc ID 1987120.1 on Oracle Suppport, this is due to the fact that by default in oracle.bpm.bac.server.mbeans.metadata MBean the Bac Admin node and the BacNode use the same ports. (I think the cause is a bit mis-phrased in the note....)

To solve this:
  1. Login to Enterprise Manager console : http://host:port/em
  2. Expand WebLogic Domain
  3. Right click on the domain and click on System MBean Browser
  4. In System MBean Browser search for: oracle.bpm.bac.server.mbeans.metadata and expand it
  5. Expand the first BacNode and check that the AdminPort and also the BacNode port are different:
  6. Do the same thing for the second BacNode:
  7. Make sure that all 4 ports are different.
  8. Do a netstat on the machine and check also that the ports used by BAC servers are not already in use. Change them if necessary.
Port check can be done by:
[oracle@darlin-vce-bpm logs]$ netstat -tulpn |grep 8323
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 192.168.1.239:8323      0.0.0.0:*               LISTEN      23940/java
tcp        0      0 192.168.1.240:8323      0.0.0.0:*               LISTEN      23988/java
[oracle@darlin-vce-bpm logs]$ netstat -tulpn |grep 8424
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 192.168.1.240:8424      0.0.0.0:*               LISTEN      23988/java
[oracle@darlin-vce-bpm logs]$ netstat -tulpn |grep 8423
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 0.0.0.0:8423            0.0.0.0:*               LISTEN      23940/java
[oracle@darlin-vce-bpm logs]$
You see that the latest port, the AdminPort belonging to the empty AdminAddress on the bpm_server1 is listening on 0.0.0.0, in affect on every address. Apparently, it is not possible to set an explicit address. Since it is on one actual host, the bpm servers are attached to different virtual IP addresses.

Now, this enabled us to start both the OracleBPMBACServerApp  as well as the BPMComposer.
But when we started using the BPM Composer and opening a project and process, there were issues. We found that when you got in bpm_server2, all went well. But when the load balancer directed you to bpm_server1, you could get a Internal server error

We encountered the following errors in the diagnostic log on bpm_server1:
[2017-04-05T03:25:27.270+02:00] [bpm_server1] [NOTIFICATION] [] [oracle.bpm.bac.svnserver.configuration] [tid: Active Sync Thread [/b91abb78-6b3d-4448-af6a-e82125f261f0/]] [userId: <anonymous>] [ecid: 41a5a471-1881-4527-8638-c344af778e7c-0000000a,0:497] [APP: OracleBPMBACServerApp] [partition-name: DOMAIN] [tenant-name: GLOBAL] Default repository path: /u02/oracle/products/fmw/user_projects/domains/o-bpm-1_domain/bpm/bac/bpm_server2/repositories
[2017-04-05T03:25:27.273+02:00] [bpm_server1] [NOTIFICATION] [] [oracle.bpm.bac.svnserver.configuration] [tid: Active Sync Thread [/b91abb78-6b3d-4448-af6a-e82125f261f0/]] [userId: <anonymous>] [ecid: 41a5a471-1881-4527-8638-c344af778e7c-0000000a,0:497] [APP: OracleBPMBACServerApp] [partition-name: DOMAIN] [tenant-name: GLOBAL] Default repository path: /u02/oracle/products/fmw/user_projects/domains/o-bpm-1_domain/bpm/bac/bpm_server1/repositories
[2017-04-05T03:25:27.277+02:00] [bpm_server1] [ERROR] [] [oracle.bpm.bac.svnserver.replication] [tid: Active Sync Thread [/b91abb78-6b3d-4448-af6a-e82125f261f0/]] [userId: <anonymous>] [ecid: 41a5a471-1881-4527-8638-c344af778e7c-0000000a,0:497] [APP: OracleBPMBACServerApp] [partition-name: DOMAIN] [tenant-name: GLOBAL] org.tmatesoft.svn.core.SVNException: svn: E210003: connection refused by the server[[
oracle.bpm.bac.subversion.server.repository.exceptions.RepositoryException: org.tmatesoft.svn.core.SVNException: svn: E210003: connection refused by the server
        at oracle.bpm.bac.subversion.server.repository.exceptions.RepositoryException.wrap(RepositoryException.java:56)
        at oracle.bpm.bac.subversion.server.repository.SVNKitRepositorySession.getRepositoryUUID(SVNKitRepositorySession.java:98)
        at oracle.bpm.bac.subversion.server.repository.RepositorySVNSync.sync(RepositorySVNSync.java:74)
        at oracle.bpm.bac.subversion.server.repository.RepositorySVNSync.sync(RepositorySVNSync.java:59)
        at oracle.bpm.bac.subversion.server.repository.ha.aa.ActiveAARepository$Synchronizer.runImpl(ActiveAARepository.java:360)
        at oracle.bpm.bac.subversion.server.repository.ha.aa.ActiveAARepository$Synchronizer.run(ActiveAARepository.java:304)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.tmatesoft.svn.core.SVNException: svn: E210003: connection refused by the server
        at org.tmatesoft.svn.core.internal.wc.SVNErrorManager.error(SVNErrorManager.java:85)
        at org.tmatesoft.svn.core.internal.wc.SVNErrorManager.error(SVNErrorManager.java:69)
        at org.tmatesoft.svn.core.internal.io.svn.SVNPlainConnector.open(SVNPlainConnector.java:62)
        at org.tmatesoft.svn.core.internal.io.svn.SVNConnection.open(SVNConnection.java:77)
        at org.tmatesoft.svn.core.internal.io.svn.SVNRepositoryImpl.openConnection(SVNRepositoryImpl.java:1252)
        at org.tmatesoft.svn.core.internal.io.svn.SVNRepositoryImpl.testConnection(SVNRepositoryImpl.java:95)
        at org.tmatesoft.svn.core.io.SVNRepository.getRepositoryUUID(SVNRepository.java:280)
        at oracle.bpm.bac.subversion.server.repository.SVNKitRepositorySession.getRepositoryUUID(SVNKitRepositorySession.java:95)
        ... 5 more
Caused by: java.net.ConnectException: Connection refused (Connection refused)
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.tmatesoft.svn.core.internal.util.SVNSocketFactory.connect(SVNSocketFactory.java:112)
        at org.tmatesoft.svn.core.internal.util.SVNSocketFactory.createPlainSocket(SVNSocketFactory.java:68)
        at org.tmatesoft.svn.core.internal.io.svn.SVNPlainConnector.open(SVNPlainConnector.java:53)
        ... 10 more

]]

This occurs just on one of the two servers, in our case bpm_server1. Apparently, the Bac server can't run more than once on the same host. Or, in an active-active configuration, one of the bpm servers does not know how to connect. I don't know yet, how it is supposed to work internally.

Toggling the configuration to Single node won't work. Restarting the servers will cause the bpm composer unable to start at all. But switching to active passive in the BacConfigurationManager helped:

Then restarted the bpm servers and the BPM Composer and the underlying PAM/Bac/Subversion were working on both servers.


1 comment :

Martien van den Akker said...

Please refer to http://blog.darwin-it.nl/2017/11/bpm-bac-subversion-server-refusing.html for a patch on this problem.