Thursday, 1 December 2016

Note to myself: when handling large payloads

Today I stumbled on a question in the communities about handling large payloads in BPEL/XSLT. Although I know that SOA Suite from 11g onwards can page XML to disk, I never had the need for it. However, you could need it from time to time, and it's good to know how to do it.

It is described in My Oracle Support Doc ID 1327970.1, which refers to the 11g documentation on Managing Large Documents and Large Numbers of Instances.

Learning all the time....


Wednesday, 2 November 2016

XMind 8 is published

I have been an XMind user for several years already. It is a very rich mind-mapping tool, and I find the free version very useful.

Today I found out that the new XMind 8 is published. You can get it on xmind.net.

Enjoy.

Monday, 31 October 2016

OSB Thread handling recommendations

Over the years I have been asked about OSB performance quite a few times. A few years ago, on a project, I came across a set of recommendations on workmanagers for OSB. Many developers know that, for instance, Service Call Outs are blocking activities, and that you should use workmanagers to solve performance problems resulting from the use of those blocking activities.

If you do nothing about dispatch policies in OSB proxy or business services, everything runs in the default workmanager. But since some constructions, not only service call-outs, need other threads to finish the job, you can get stuck threads: the workmanager's thread pool runs empty with all or nearly all threads waiting, leaving no threads to pick up the work that would free the others.

More on this in the terrific blog of Anthony Reynolds on the subject: Following the thread.

By the way, I've seen that some people by default use service call-outs for almost everything. But the default should be the routing node with a route activity. Even if a service needs to gather information from several sources, while you can only have one route node, pick a 'driving service' to use in the routing node, just like choosing a 'driving table' when creating a query on several tables. Then use service call-outs only to do the extra enrichment.

From that earlier project I got the following recommendations, based on the blog of Anthony Reynolds. Since I refer back to them regularly, I think it is good to share them.

For OSB to work optimally, and to prevent flooding WebLogic's thread pool with hogged/stuck threads, you should create 3 FairShareRequest classes in a ratio of 33/33/33, to distinguish different “kinds of thread pools”.

Then create 4 workmanagers:

  • FTPPollingWorkManager: file based inbound OSB proxy services, polling a filesystem (or FTP). Uses FairShareReqClass-1, and ignores stuck threads.
  • InboundWorkManager: inbound OSB proxy services that are not file based polling. Also uses FairShareReqClass-1, but does not ignore stuck threads.
  • CallOutWorkManager: Service Call Out operations in an OSB proxy. Uses FairShareReqClass-2.
  • DeliveryWorkManager: outbound business services in OSB. Uses FairShareReqClass-3.

Use these as a dispatch policy in the particular Proxy/Business Services. In OSB 11g this is:

I don't have screenshots of 12c at hand, but the idea would be the same there. As far as I know, the thread model in 12c is not architecturally different.

Wednesday, 19 October 2016

Get the hostname of the executing server in BPEL

This week I got involved in a question on the Oracle Forums on getting the hostname of the server executing the BPEL process. In itself this is not possible in BPEL. Also, if you have a long running async process, the process gets dehydrated at several points (at a receive, wait, etc.). After an incoming signal, another server could process it further, so you can't be sure that one server will process it to the end.

However, using Java, you can get the hostname of an executing server quite easily. @AnatoliAtanasov suggested this question on Stack Overflow. I thought that it would be fun to try this out.

Although you can opt for creating an embedded java activity, I used my earlier article on SOA and Spring Contexts to have it in a separate bean. By the way, in contrast to my suggestions in the article, you don't have to create a separate spring context for every bean you use.

My java bean looks like:
package nl.darwinit.soasuite;
import java.net.InetAddress;
import java.net.UnknownHostException;


public class ServerHostBeanImpl implements IServerHostBean {
    public ServerHostBeanImpl() {
        super();
    }
    
    public  String getHostName(String hostNameDefault){
        String hostName;
        try
        {
            InetAddress addr;
            addr = InetAddress.getLocalHost();
            hostName = addr.getHostName();
        }
        catch (UnknownHostException ex)
        {
            System.out.println("Hostname can not be resolved");
            hostName = hostNameDefault;
        }
        return hostName;
    }
    
}

The interface class I generated is:
package nl.darwinit.soasuite;

public interface IServerHostBean {
    String getHostName(String hostNameDefault);
}

Then I defined a Spring Context, getHostNameContext, with the following content
<?xml version="1.0" encoding="UTF-8" ?>
<beans xmlns="http://www.springframework.org/schema/beans" xmlns:util="http://www.springframework.org/schema/util"
       xmlns:jee="http://www.springframework.org/schema/jee" xmlns:lang="http://www.springframework.org/schema/lang"
       xmlns:aop="http://www.springframework.org/schema/aop" xmlns:tx="http://www.springframework.org/schema/tx"
       xmlns:sca="http://xmlns.oracle.com/weblogic/weblogic-sca" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd http://www.springframework.org/schema/tool http://www.springframework.org/schema/tool/spring-tool.xsd http://www.springframework.org/schema/util http://www.springframework.org/schema/util/spring-util.xsd http://www.springframework.org/schema/aop http://www.springframework.org/schema/aop/spring-aop.xsd http://www.springframework.org/schema/cache http://www.springframework.org/schema/cache/spring-cache.xsd http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd http://www.springframework.org/schema/task http://www.springframework.org/schema/task/spring-task.xsd http://www.springframework.org/schema/jee http://www.springframework.org/schema/jee/spring-jee.xsd http://www.springframework.org/schema/lang http://www.springframework.org/schema/lang/spring-lang.xsd http://www.springframework.org/schema/tx http://www.springframework.org/schema/tx/spring-tx.xsd http://www.springframework.org/schema/jdbc http://www.springframework.org/schema/jdbc/spring-jdbc.xsd http://www.springframework.org/schema/jms http://www.springframework.org/schema/jms/spring-jms.xsd http://www.springframework.org/schema/oxm http://www.springframework.org/schema/oxm/spring-oxm.xsd http://www.springframework.org/schema/mvc http://www.springframework.org/schema/mvc/spring-mvc.xsd http://xmlns.oracle.com/weblogic/weblogic-sca META-INF/weblogic-sca.xsd">
    <!--Spring Bean definitions go here-->
    <sca:service name="GetHostService" target="ServerHostBeanImpl" type="nl.darwinit.soasuite.IServerHostBean"/>
    <bean id="ServerHostBeanImpl" class="nl.darwinit.soasuite.ServerHostBeanImpl"/>
</beans>

After wiring the context to my BPEL the composite looks like:


Then, deploying and running it, gives the following output:


Nice, isn't it?

Tuesday, 18 October 2016

Weblogic 11g to 12c: strictness in listen address

Let's say you have a virtual machine with two network adapters, both set on 'HostOnly'.
I used to do that and set the first one of those to a fixed IP address, say 10.0.0.1. To this one I coupled the hostname, for instance darlin-vce-db, using the /etc/hosts file. That way I had a fixed, always existing network address for the database.

Together with the database, you install WebLogic, for instance to serve SOASuite, or OSB, or whatever custom application you want to serve. Now, wouldn't it be nice to be able to use WebLogic from a browser outside the virtual machine? Of course, because this is what you do nowadays: almost everywhere I come, servers are hosted on virtualized computing environments, like VMWare VSE or Oracle VM. So that's where the second adapter comes in, dynamically coupled to an address in the form of 192.168.56.101, for instance. Externally, using the hosts file on your host OS (in my case Windows), you couple that address also to darlin-vce-db.

So you have two hosts file settings for the hostname darlin-vce-db.
Internally in the VM:
10.0.0.1        darlin-vce-db     darlin-vce-db.darwin-it.local
And externally on your host OS:
192.168.56.101  darlin-vce-db     darlin-vce-db.darwin-it.local

Nothing special, right? Well, WebLogic 11g apparently just listens to the hostname 'darlin-vce-db', if that is entered as a listen address. It seems not to care if the request for 'darlin-vce-db' comes in via 192.168.56.101 instead of the 10.0.0.1 to which the hostname actually is bound.

However, in this particular case WebLogic 12c seems to behave differently. If you provide 'darlin-vce-db' as listen address, and that is bound to a network adapter that has 10.0.0.1, it expects that requests also come in via that network adapter. It seems to ignore requests that come in via other adapters (in my case 192.168.56.101).

You can solve this partially using Channels: in the Weblogic Console, navigate to the particular managed server, click Protocols, Channels.
Create a new channel:

Give it a name like 'Extern-Intern' or something else properly denoting the purpose of it and choose 'http' as a protocol:

Then provide the internal address, for instance 'darlin-vce-db', and the external listen address:

Leave the ports to the default listen-port, in this case. Then finish the wizard.
Although this helps in connecting to the WebLogic console, EM, or with the same method on the SOA Server, to the SOA Composer (soaserver:port/soa/composer), BPM Workspace (soaserver:port/bpm/composer) etc., this will not work for JDeveloper.

When trying to deploy a SOA composite from JDeveloper, you define/choose an ApplicationServer with a connection to the AdminServer. But in case of deploying a composite, the AdminServer figures out which running SOA Servers there are, and lets JDeveloper provide the composite to those servers. But then the soaserver(s) refuse(s) the connections from JDeveloper. Testing the ApplicationServer connection will show success for the HTTP connection to the AdminServer, but fails for all the other components.

The solution is then to denote a particular Network Adapter/ip address and make sure that internally and externally the particular hostname is coupled to that same particular ip-address.
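
To check that coupling, it can help to look up which address the hostname resolves to, once inside the VM and once on the host OS, and compare the two. A minimal sketch in Python, assuming a Python interpreter is available on both sides; the helper name resolve_ipv4 is made up for this illustration:

```python
import socket


def resolve_ipv4(hostname):
    """Return the IPv4 address the local resolver gives for hostname, or None."""
    try:
        # Uses the OS resolver, so hosts file entries are normally honoured.
        return socket.gethostbyname(hostname)
    except socket.error:
        # Covers socket.gaierror: the name could not be resolved on this machine.
        return None
```

Calling print(resolve_ipv4('darlin-vce-db')) in the VM and on the host OS should then print the same address on both sides; with the two hosts file settings shown earlier it would differ.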

Wednesday, 12 October 2016

OTN Appreciation Day: - the day after - BPEL, SOASuite and SCA (in that order...)

Unfortunately I noticed this nice initiative only yesterday: OTN Appreciation Day. I did not have a chance to cook something up, but I do like to add some mustard after the meal, as we say in Dutch.

In the titles of the 'OTN Appreciation Day' blogs I miss BPEL. In 2004, when Oracle acquired Collaxa, I worked at Oracle Consulting in the Netherlands. I worked with Oracle Workflow and Interconnect. Oracle wasn't really into SOA yet. But with BPEL PM they acquired the tool that stood at the base of SOA Suite, together with Webservices Manager and what we now know as the Mediator. And it changed my professional life, really. It's extremely powerful, especially with the added JCA Adapters, the XSLT mapper, and since 11g the SCA architecture, which enables you to assemble composite applications with Adapters, Business Rules, Human Workflow, Mediator, Spring Components and of course ... BPEL 2.0.

Sorry, Tim, for being late.

SQL Developer: Live and Let Live my db-connection

At my current customer I have SQL Developer open the whole day, and regularly I come back to it to query my throttle table to see if my requests have been picked up. But regularly my database connection has been broken because of being idle, probably because of a nasty firewall between my remote development desktop and the database.

Googling on it I found an article by That Jeff Smith on busy connections. That blog is really one to follow. Feed it to your Feedly if you're a recent SQL Developer user.

But in my search I found the 'keepAlive-4' extension for SQL Developer 4. Download the KeepAlive zip, go to Help->Check for Updates in SQLDeveloper and install it via the 'Install From Local File' option. Then look for the Keep Alive icon in the toolbar:
Click on it and enter a check frequency, of at least 60 seconds. I try 180 seconds.

You can disable it by clicking it again. Another click and it asks for a new interval again.

Wednesday, 5 October 2016

Keep your Service Directory valid

Today I ran into a trap that trapped me before...

I tried to edit a file adapter service in SOA/BPM QuickStart 12.2.1 and got the error 'Service Directory is not valid' in a pop-up dialog, which prevented me from editing.

The problem is described in this forum thread.

The problem is a space in the path to the JDeveloper project. Now spaces are a drag in filenames and paths (I hate it that Microsoft Windows introduced "Program Files" as the default folder for installing applications).

The solution in Windows is actually quite simple: create a substitute drive like:
echo define drive S: referring to current folder
subst s: /d
subst s: "d:\Data\svn\Applicatie Integration"

Put this in a .bat file and run it before starting JDeveloper and load the application workspaces from the S: drive.

I had this already in place because of my first run into this trap. Why did I trip again? It turned out I had accidentally opened the application from the d: drive again...

Tuesday, 4 October 2016

Use WLST to test all your datasources

Today I had some problems with deploying my composite to the server of my current customer. Apparently the server had some problems with datasources. But since there are many, I did not feel like checking them one by one with the console. Using Google I got the following examples

Thus I had to combine those two, and sauce it with my own way of WLST scripting (see my other blog entries). Also I wanted a tabular form, which got me into trouble with printing the result of the testPool() method. But I came up with the following script, testDS.py:
#############################################################################
# Test Datasources on a domain
#
# @author Martien van den Akker, Darwin-IT Professionals
# @version 2.1, 2016-10-04
#
#############################################################################
# Modify these values as necessary
import sys, traceback
scriptName = sys.argv[0]
#
#
lineSeperator='__________________________________________________________________________________'
#
#
def usage():
  print 'Call script as: '
  print 'Windows: wlst.cmd '+scriptName+' -loadProperties localhost.properties'
  print 'Linux: wlst.sh '+scriptName+' -loadProperties environment.properties'
  print 'Property file should contain the following properties: '
  print "adminUrl=darlin-vce:7001"
  print "adminUser=weblogic"
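  print "adminPwd=welcome1"
  # adminServerName is used in main(); the value shown here is only an example
  print "adminServerName=AdminServer"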
#
#
def connectToadminServer(adminUrl, adminServerName):
  print(lineSeperator)
  print('Try to connect to the AdminServer')
  try:
    connect(userConfigFile=usrCfgFile, userKeyFile=usrKeyFile, url=adminUrl)
  except NameError, e:
    print('Apparently user config properties usrCfgFile and usrKeyFile not set.')
    print('Try to connect to the AdminServer adminUser and adminPwd properties')
    connect(adminUser, adminPwd, adminUrl)
#
#
def main():
  try:
    pad='                                                                               '
    print(lineSeperator)
    print('Check datasources for domain')
    print(lineSeperator)
    print ('Connect to the AdminServer: '+adminServerName)
    connectToadminServer(adminUrl, adminServerName)
    print(lineSeperator)
    allServers=domainRuntimeService.getServerRuntimes();
    if (len(allServers) > 0):
      for tempServer in allServers:
        jdbcServiceRT = tempServer.getJDBCServiceRuntime();
        dataSources = jdbcServiceRT.getJDBCDataSourceRuntimeMBeans();
        print('\nServer '+tempServer.getName())
        if (len(dataSources) > 0):
          print('Datasource                                                  '[:30]+'\tState\tTest')
          for dataSource in dataSources:
            testPool = dataSource.testPool()
            dataSourceName = dataSource.getName()+pad
            dataSourceNamePad=dataSourceName[:30]
            if (testPool == None):
              print dataSourceNamePad+'\t'+dataSource.getState()+'\tOK'
            else:
              print dataSourceNamePad+'\t'+dataSource.getState()+'\tFailure: '
              print testPool
    print(lineSeperator)
    print('Done...')
    print(lineSeperator)
  except NameError, e:
    print('Apparently properties not set.')
    print "Please check the property: ", sys.exc_info()[0], sys.exc_info()[1]
    usage()
  except:
    apply(traceback.print_exception, sys.exc_info())
    exit(exitcode=1)
#
main();

Of course it's easy to extend the table with properties from the monitor script in WebLogic DataSource Monitoring Using WLST.
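
The padding-and-slicing that testDS.py uses for the first column can be isolated in a small helper, which makes adding columns a bit easier. This is only a sketch: the function name format_row and the sample values are made up for illustration, and it sticks to constructs that should work both in plain Python and in WLST's Jython.

```python
def format_row(name, state, result, width=30):
    """Build one tab-separated table row with a fixed-width first column."""
    # Truncate names longer than `width`, pad shorter ones with spaces.
    name_col = name[:width].ljust(width)
    return name_col + '\t' + state + '\t' + result


# Made-up sample values, only to show the layout.
print(format_row('Datasource', 'State', 'Test'))
print(format_row('SOADataSource', 'Running', 'OK'))
```

In the script itself, the two print statements in the datasource loop could then call such a helper with dataSource.getName() and dataSource.getState().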

The script runs pretty neatly. Start it with a shell script like the following testDS.sh:
#!/bin/bash
#############################################################################
# Test DataSources  using wlst.
#
# @author Martien van den Akker, Darwin-IT Professionals
# @version 2.1, 2016-06-27
#
#############################################################################
#
. fmw12c_env.sh
echo
echo Test Datasources
wlst.sh ./testDS.py -loadProperties fmw.properties

For the fmw12c_env.sh and fmw.properties files look here.

Sunday, 25 September 2016

Replacement of environment variables or properties in Bash

Earlier I wrote about the automatic installation of Fusion Middleware components using response files. A thing that lacked in my scripts was that although I had a FMW_HOME variable set in my environment shell script, the response files had the location hard coded in them. At the time I hadn't had the chance to figure out how to do property/variable replacement in shell. I do know how to do it with ANT. But I figured that installing ANT for only this was a bit too much, since with the installation of FMW you already get ANT as a module.

For an upgrade of my scripts to FMW 12.2.1.1, I did a Google-search on it and found: http://stackoverflow.com/questions/415677/how-to-replace-placeholders-in-a-text-file. The top 2 suggestions were:

  1. sed -e "s/\${i}/1/" -e "s/\${word}/dog/" template.txt
  2. i=32 word=foo envsubst < template.txt

Although the first option was favoured by many and considered the answer to the question, I personally favour the second. With sed, every placeholder needs its own explicit replacement expression, which makes the replacements hard coded again. envsubst takes the values from the environment variables themselves. Actually, if the variable referenced in the template file is already present in the environment, no replacement assignment has to be provided at all.

So let's say my response file template looks like:
[ENGINE]

#DO NOT CHANGE THIS.
Response File Version=1.0.0.0.0

[GENERIC]

#Set this to true if you wish to skip software updates
DECLINE_AUTO_UPDATES=true

#
MOS_USERNAME=

#
MOS_PASSWORD=<SECURE VALUE>

#If the Software updates are already downloaded and available on your local system, then specify the path to the directory where these patches are available and set SPECIFY_DOWNLOAD_LOCATION to true
AUTO_UPDATES_LOCATION=

#
SOFTWARE_UPDATES_PROXY_SERVER=

#
SOFTWARE_UPDATES_PROXY_PORT=

#
SOFTWARE_UPDATES_PROXY_USER=

#
SOFTWARE_UPDATES_PROXY_PASSWORD=<SECURE VALUE>

#The oracle home location. This can be an existing Oracle Home or a new Oracle Home
ORACLE_HOME=${FMW_HOME}

#Set this variable value to the Installation Type selected. e.g. Fusion Middleware Infrastructure, Fusion Middleware Infrastructure With Examples.
INSTALL_TYPE=Fusion Middleware Infrastructure

#Provide the My Oracle Support Username. If you wish to ignore Oracle Configuration Manager configuration provide empty string for user name.
MYORACLESUPPORT_USERNAME=

#Provide the My Oracle Support Password
MYORACLESUPPORT_PASSWORD=<SECURE VALUE>

#Set this to true if you wish to decline the security updates. Setting this to true and providing empty string for My Oracle Support username will ignore the Oracle Configuration Manager configuration
DECLINE_SECURITY_UPDATES=true

#Set this to true if My Oracle Support Password is specified
SECURITY_UPDATES_VIA_MYORACLESUPPORT=false

#Provide the Proxy Host
PROXY_HOST=

#Provide the Proxy Port
PROXY_PORT=

#Provide the Proxy Username
PROXY_USER=

#Provide the Proxy Password
PROXY_PWD=<SECURE VALUE>

#Type String (URL format) Indicates the OCM Repeater URL which should be of the format [scheme[Http/Https]]://[repeater host]:[repeater port]
COLLECTOR_SUPPORTHUB_URL=



Saved as 'fmw_12.2.1.1.0_infrastructure.rsp.tpl'; note the reference ORACLE_HOME=${FMW_HOME}. And I have set FMW_HOME with an fmw12c_env.sh script, as described in former posts. Then I only have to do:
envsubst < fmw_12.2.1.1.0_infrastructure.rsp.tpl > fmw_12.2.1.1.0_infrastructure.rsp
To have the file copied to fmw_12.2.1.1.0_infrastructure.rsp with a replaced FMW_HOME variable:
...
#
SOFTWARE_UPDATES_PROXY_PASSWORD=<SECURE VALUE>

#The oracle home location. This can be an existing Oracle Home or a new Oracle Home
ORACLE_HOME=/u01/app/oracle/FMW12211

#Set this variable value to the Installation Type selected. e.g. Fusion Middleware Infrastructure, Fusion Middleware Infrastructure With Examples.
INSTALL_TYPE=Fusion Middleware Infrastructure
...

Couldn't be simpler, I'd say. Nice thing is that this enables me to do more directives. So, learned something again, from a question dated 7.5 years ago...
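
On a machine without envsubst, roughly the same replacement can be sketched in Python with string.Template, which happens to use the same ${NAME} placeholder syntax. This is a swapped-in technique, not what my scripts use; the function name substitute_env and the sample value are assumptions for illustration only.

```python
import os
from string import Template


def substitute_env(text, env=None):
    """Replace ${NAME} placeholders in text with values from env."""
    if env is None:
        env = os.environ
    # safe_substitute leaves placeholders without a matching variable untouched.
    return Template(text).safe_substitute(env)


# Sample value only; normally FMW_HOME would come from fmw12c_env.sh.
sample_env = {'FMW_HOME': '/u01/app/oracle/FMW12211'}
print(substitute_env('ORACLE_HOME=${FMW_HOME}', sample_env))
```

One difference to keep in mind: envsubst replaces unset variables with an empty string, while safe_substitute leaves such placeholders as they are.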

Wednesday, 7 September 2016

How to solve JSchException Algorithm negotiation fail. And get logging out of JSch in SoapUI.

I was so glad with my SoapUI solution to SFTP files to a server. But so disappointed I couldn't get it working at my customer.

After I changed the log entries to log with e.message, I got the line:
Wed Sep 07 11:17:43 CEST 2016:INFO:JSchException Algorithm negotiation fail

Now I needed more information than that. But the hint is at least that there is a mismatch in the available ciphers client side versus server side.

So first I wanted to get more logging out of JSch. It turns out that it has its own Logger framework, but the bright side of that is that you can easily wrap your own logging mechanism. In the case of SoapUI it is log4j. So create a Java project with the libraries jsch-0.1.54.jar and, from the SoapUI libs, log4j-1.2.14.jar. Then I created the following class file from an example from the answer in this Stack Overflow question.

My version is:
package nl.darwin.jsch.log;


import com.jcraft.jsch.Logger; 

/**
 * Class to route log messages generated by JSch to Apache Log4J Logging.
 *
 * @author mrabbitt
 * @see com.jcraft.jsch.Logger
 */
public class JSchLog4JLogger implements Logger {
    private org.apache.log4j.Logger logger;
    
    /**
     * Constructor with custom category name 
     * 
     * @param logger the logger from Apache Log4J.
     */
    public JSchLog4JLogger(org.apache.log4j.Logger logger) {
        this.logger = logger;
    }
    
    /**
     * Default constructor
     */
    public JSchLog4JLogger() {
        this(org.apache.log4j.Logger.getLogger(Logger.class.getName()));
    }

    /* (non-Javadoc)
     * @see com.jcraft.jsch.Logger#isEnabled(int)
     */
    public boolean isEnabled(int level) {
        switch (level) {
        case DEBUG:
            return logger.isDebugEnabled();
        case INFO:
            return logger.isInfoEnabled();
        case WARN:
            return logger.isEnabledFor(org.apache.log4j.Level.WARN);
        case ERROR:
            return logger.isEnabledFor(org.apache.log4j.Level.ERROR);
        case FATAL:
            return logger.isEnabledFor(org.apache.log4j.Level.FATAL);
        }
        return false;
    }

    /* (non-Javadoc)
     * @see com.jcraft.jsch.Logger#log(int, java.lang.String)
     */
    public void log(int level, String message) {
        switch (level) {
        case DEBUG:
            logger.debug(message);
            break;
        case INFO:
            logger.info(message);
            break;
        case WARN:
            logger.warn(message);
            break;
        case ERROR:
            logger.error(message);
            break;
        case FATAL:
            logger.fatal(message);
            break;
        }
    }
}

Then jar it and add it to the bin/ext folder of SoapUI (like in the previous blog post).
Now a simple extra line and an import are needed in your Groovy script:
import nl.darwin.jsch.log.JSchLog4JLogger
...
  JSch.setLogger(new JSchLog4JLogger(log))
  JSch ssh = new JSch()

So simply set the logger on the JSch class, before the instantiation. Then the logging of JSch appears in the SoapUI logging, as easy as that.
It turned out that the server required the use of aes256-ctr, while the JRE of SoapUI (which is Java 7 in SoapUI 5.2.1) has a limited JCE policy, as is suggested here.

You can download the unlimited JCE policies here:
JDK      Unlimited JCE Download
JDK 1.6  http://www.oracle.com/technetwork/java/javase/downloads/jce-6-download-429243.html
JDK 1.7  http://www.oracle.com/technetwork/java/javase/downloads/jce-7-download-432124.html
JDK 1.8  http://www.oracle.com/technetwork/java/javase/downloads/jce8-download-2133166.html

For SoapUI, download the JDK 1.7 policy. Go to your SoapUI Home folder, and navigate to the security library folder within the JRE. For instance: c:\Program Files\SmartBear\SoapUI-5.2.1\jre\lib\security.

Unzip the JCE to a new folder UnlimitedJCEPolicy within the security folder. Create another backup folder like LimitedJCEPolicy and copy the jars US_export_policy.jar and local_policy.jar to the LimitedJCEPolicy folder. Then copy the corresponding files from UnlimitedJCEPolicy to the security folder, replacing the original ones.

Restart SoapUI and you're good to go.


Use SoapUI to test your SFTP based services

SoapUI is my favorite tool to do unit tests. I try to stick to test-based development and build up tests along with the service under development. For SOAP or REST based services this goes quite intuitively using SoapUI. For database-driven services it is a little harder, but SoapUI has a nice JDBC activity that supports DML as well as callable statements such as stored procedures.

But for files and especially SFTP it's a little more complicated. For a while now I've been working on a file-based integration with SAP as source system.

I configured and defined the SOASuite FTP adapter to use my SSH user (oracle) on my VirtualBox VM. Until now I tested it using the SFTP/SCP client from MobaXTerm (enthusiastically recommended: download here). But that is not so handy for unit tests.

I wanted to automate this using SoapUI. With some searching I found that JCraft Java Secure Channel library was the best and easiest option. I did take a look at Apache Commons Net. But couldn't get it to work so easily. Download the jsch-0.1.54.jar (or newer) file and copy it to the ./bin/ext folder in your SoapUI home:


And restart SoapUI.

Create a new empty SoapUI project, create a TestSuite called something like 'Utilities' and a TestCase called 'TC-FTP'. Add the following properties to the TestCase:

Property       Value
ftpHost        darlin-vce-db
ftpPort        22
ftpUsername    oracle
ftpPassword    welcome1
localFilePath  d:/Projects/2016MySapProject/ExampleFiles/SAP-MESSAGE.XML
remoteFileDir  /home/oracle/SapHR/in

In the TestCase add a Groovy Script step called FTP with the script below. I took the example from snip2code.com (also found elsewhere) and refactored it to:
//Send Files to SSH Location
//
//Download jsch-0.1.54.jar from http://www.jcraft.com/jsch/ and copy it to  SoapUI-Home/bin/ext location
//Example from: https://www.snip2code.com/Snippet/413499/SoapUI-Groovy-Script-compatible-SFTP-fil

//import org.apache.commons.net.ftp.FTPSClient
import com.jcraft.jsch.*
//
// Properties
// 
def testCase = testRunner.testCase;


def String ftpHost = testCase.getPropertyValue("ftpHost")
def int ftpPort = testCase.getPropertyValue("ftpPort").toInteger()
def String ftpUsername = testCase.getPropertyValue("ftpUsername")
def String ftpPassword = testCase.getPropertyValue("ftpPassword")
def String localFilePath = testCase.getPropertyValue("localFilePath")
def String remoteFileDir = testCase.getPropertyValue("remoteFileDir")
//
//
Session session = null
Channel channel = null
try {
  log.info("Starting sftp upload process")
  JSch ssh = new JSch()
      
  session = ssh.getSession(ftpUsername, ftpHost, ftpPort)
  session.setConfig("StrictHostKeyChecking", "no"); //auto accept secure host
  session.setPassword(ftpPassword)
  session.connect()
  log.info("Connected to session")
      
  channel = session.openChannel("sftp")
  channel.connect()
  log.info("Connected to channel")
      
  ChannelSftp sftp = (ChannelSftp) channel;
  sftp.put(localFilePath, remoteFileDir);
  log.info("File Uploaded " + localFilePath + " TO " + remoteFileDir)
    
} catch (JSchException e) {
  e.printStackTrace()
  log.info("JSchException " + e.message)
} catch (SftpException e) {
  e.printStackTrace()
  log.info("SftpException " + e.message)
} finally {
  if (channel != null) {
    channel.disconnect()
    log.info("Disconnected from channel")
  }
  if (session != null) {
    session.disconnect()
    log.info("Disconnected from session")
  }
  log.info("sftp upload process complete")
}

The changes I made were to define the input values based on the properties from the test case, and to move the session and channel variable declarations out of the try, to make them accessible from the finally branch. I also changed the logging to use e.message instead of the result of e.printStackTrace(), to get a proper message in the log (e.printStackTrace() returns null).

The reason that I suggest having it in a separate test case is to enable it to be called from actual test cases with parameters. To do so, add a call-test case activity to your test case:

Set the following properties:

Choose 'Run primary TestCase (wait for running to finish, Thread-Safe)' option as Run Mode.

And provide the property values.

This script copies a file from a file location and uploads it. But I want to be able to insert some runtime-specific values to refer to in asserts and later JDBC calls, to check on specific running instances. So I want to be able to adapt the content while running my test case. Actually, I want to upload a string fetched from a property, maybe with expanded properties.

So I copied the testcase and groovy activity and adapted the script to:
//Send Files to SSH Location
//
//Download jsch-0.1.54.jar from http://www.jcraft.com/jsch/ and copy it to  SoapUI-Home/bin/ext location
//Example from: https://www.snip2code.com/Snippet/413499/SoapUI-Groovy-Script-compatible-SFTP-fil

//import org.apache.commons.net.ftp.FTPSClient
import com.jcraft.jsch.*
import java.nio.charset.StandardCharsets
//
// Properties
// 
def testCase = testRunner.testCase;
//
def String ftpHost = testCase.getPropertyValue("ftpHost")
def int ftpPort = testCase.getPropertyValue("ftpPort").toInteger()
def String ftpUsername = testCase.getPropertyValue("ftpUsername")
def String ftpPassword = testCase.getPropertyValue("ftpPassword")
def String fileContent = testCase.getPropertyValue("fileContent")
def String remoteFile = testCase.getPropertyValue("remoteFile")
//
Channel channel = null
Session session = null
try {
  log.info("Starting sftp upload process")  
  JSch ssh = new JSch()
      
  session = ssh.getSession(ftpUsername, ftpHost, ftpPort)
  session.setConfig("StrictHostKeyChecking", "no"); //auto accept secure host
  session.setPassword(ftpPassword)
  session.connect()
  log.info("Connected to session")
      
  channel = session.openChannel("sftp")
  channel.connect()
  log.info("Connected to channel")
      
  ChannelSftp sftp = (ChannelSftp) channel; 

  byte[] fileContentBytes =   fileContent.getBytes(StandardCharsets.UTF_8)
  InputStream fileInputStream = new ByteArrayInputStream(fileContentBytes);
  log.info("Start upload to " + remoteFile)
  sftp.put(fileInputStream, remoteFile);
  log.info("File Content uploaded to " + remoteFile)
    
} catch (JSchException e) {
  e.printStackTrace()
  log.info("JSchException " + e.message)
} catch (SftpException e) {
  e.printStackTrace()
  log.info("SftpException " + e.message)
} catch (Exception e) {
  e.printStackTrace()
  log.info("Exception " + e.message)
} finally {
  if (channel != null) {
    channel.disconnect()
    log.info("Disconnected from channel")
  }
  if (session != null) {
    session.disconnect()
    log.info("Disconnected from session")
  }
  log.info("sftp upload process complete")
}

Here the lines and related properties:
def String localFilePath = testCase.getPropertyValue("localFilePath")
def String remoteFileDir = testCase.getPropertyValue("remoteFileDir")

are changed to:
def String fileContent = testCase.getPropertyValue("fileContent")
def String remoteFile = testCase.getPropertyValue("remoteFile")
Then the lines:
  byte[] fileContentBytes =   fileContent.getBytes(StandardCharsets.UTF_8)
  InputStream fileInputStream = new ByteArrayInputStream(fileContentBytes);

convert the fileContent property value to an InputStream, which is given as input to the statement sftp.put(fileInputStream, remoteFile);. Notice that since we upload file content, we need to provide a remoteFile path, including the file name, instead of a remote directory. And that we need an extra import: java.nio.charset.StandardCharsets.

It would be nice if the guys from SmartBear added both put and get as separate activities.

Friday, 2 September 2016

.... this is only a psuedo object?

Yesterday I was working on a BPEL project that I created before the summer holidays. I wanted to implement it further. But on first redeployment I ran into:
[12:18:01 PM] ----  Deployment started.  ----
[12:18:01 PM] Target platform is  (Weblogic 12.x).
[12:18:01 PM] Running dependency analysis...
[12:18:01 PM] Building...
[12:18:08 PM] Deploying profile...
[12:18:09 PM] Wrote Archive Module to D:\Projects\2016DWN\SOASuite\HRApplication\DWN_CdmHR\trunk\SOA\DWN_CdmHR\CDMHRDomainService\deploy\sca_CDMHRDomainService.jar
[12:18:18 PM] Deploying sca_CDMHRDomainService.jar to partition "default" on server SoaServer1 [http://darlin-vce-db.darwin-it.local:8001]
[12:18:18 PM] Processing sar=/D:/Projects/2016DWN/SOASuite/HRApplication/DWN_CdmHR/trunk/SOA/DWN_CdmHR/CDMHRDomainService/deploy/sca_CDMHRDomainService.jar
[12:18:18 PM] Adding sar file - D:\Projects\2016DWN\SOASuite\HRApplication\DWN_CdmHR\trunk\SOA\DWN_CdmHR\CDMHRDomainService\deploy\sca_CDMHRDomainService.jar
[12:18:18 PM] Preparing to send HTTP request for deployment
[12:18:18 PM] Creating HTTP connection to host:darlin-vce-db.darwin-it.local, port:8001
[12:18:18 PM] Sending internal deployment descriptor
[12:18:19 PM] Sending archive - sca_CDMHRDomainService.jar
[12:18:19 PM] Received HTTP response from the server, response code=500
[12:18:19 PM] Error deploying archive sca_CDMHRDomainService.jar to partition "default" on server SoaServer1 [http://darlin-vce-db.darwin-it.local:8001] 
[12:18:19 PM] HTTP error code returned [500]
[12:18:19 PM] Error message from server:
There was an error deploying the composite on SoaServer1: Deployment Failed: Error occurred during deployment of component: HREmployeeProcess to service engine: implementation.bpel, for composite: CDMHRDomainService: ORABPEL-05215

Error while loading process.
The process domain is encountering the following errors while loading the process "HREmployeeProcess" (composite "default/CDMHRDomainService!1.0*soa_6e4206b5-3297-4f53-9944-734349aed8ab"): this is only a psuedo object.
This error contained an exception thrown by the underlying process loader module.
Check the exception trace in the log (with logging level set to debug mode). If there is a patch installed on the server, verify that the bpelcClasspath domain property includes the patch classes.
.
 
[12:18:19 PM] Check server log for more details.
[12:18:19 PM] Error deploying archive sca_CDMHRDomainService.jar to partition "default" on server SoaServer1 [http://darlin-vce-db.darwin-it.local:8001] 
[12:18:19 PM] Deployment cancelled.
[12:18:19 PM] ----  Deployment incomplete  ----.
[12:18:19 PM] Error deploying archive file:/D:/Projects/2016DWN/SOASuite/HRApplication/DWN_CdmHR/trunk/SOA/DWN_CdmHR/CDMHRDomainService/deploy/sca_CDMHRDomainService.jar 
 (oracle.tip.tools.ide.fabric.deploy.common.SOARemoteDeployer)

So I was googling around, and found this blog entry. It suggested a mismatch between the project and the referenced WSDLs/XSDs in the MDS.

So I refreshed the MDS, restarted the whole SOA Server, but no luck.

On the verge of removing the lot of components and references, I decided to take a last closer look at the composite.xml.

The BPEL process component HREmployeeProcess had a reference to the service HREmployeeProcessSubscriber. The latter was based on a wsdl in the mds:
  <reference name="HREmployeeProcessSubscriber"
             ui:wsdlLocation="oramds:/apps/CDM/services/domain/operations/hrm/v2/EmployeeDomainEntityEventService.wsdl">
    <interface.wsdl interface="http://darwin-it.nl/services/domain/operations/hrm/v2/#wsdl.interface(EmployeeDomainEntityEventServicePortType)"/>
    <binding.ws port="http://darwin-it.nl/services/domain/operations/hrm/v2/#wsdl.endpoint(hremployeeprocessa_client/EmployeeDomainEntityEventServicePort)"
                location="http://darlin-vce-db:8001/soa-infra/services/default/HRSubscriberA/HREmployeeEventServiceA?WSDL"
                soapVersion="1.1"/>
  </reference>
But the reference in the BPEL component referred to the BPEL process on the server:
<reference name="HREmployeeProcessSubscriber"
                 ui:wsdlLocation="http://darlin-vce-db:8001/soa-infra/services/default/HRSubscriberA/HREmployeeEventServiceA?WSDL">
        <interface.wsdl interface="http://darwin-it.nl/services/domain/operations/hrm/v2/#wsdl.interface(EmployeeDomainEntityEventServicePortType)"/>
      </reference>

Since the WSDL defined in the ui:wsdlLocation attribute needs to be available when the component is compiled and loaded by the component engine, it's recommended to use a reference to an abstract WSDL in the MDS. In this case I had replaced the ui:wsdlLocation in the service reference with the MDS location, but apparently I forgot the BPEL component. To replace that one, you should change the partnerlink definition in the BPEL process, because the composite.xml is then updated automatically. Because the abstract WSDL lacks the partnerlink types, as you might know, JDeveloper suggests to create a wrapper WSDL for you.

Now, because of the synchronizations between bpel and the composite, you might need to hack the composite and the bpel process, to get things consistent again (at least I had to). But then, having it resolved, the composite was deployable again... And the BPEL process wasn't so pseudo anymore.
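A mismatch like this is easy to overlook in a large composite.xml. As a minimal sketch (plain Python, not part of the original project), the check below lists reference names that occur with more than one ui:wsdlLocation value. It assumes that the composite-level reference and the component-level reference share the same name, as they do in my case. It matches elements and attributes by local name only, so it makes no assumption about the namespace URIs. The sample document and its urn:example namespaces are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Made-up sample: the same reference name with two different wsdlLocation values.
SAMPLE = """
<composite xmlns="urn:example:sca" xmlns:ui="urn:example:ui" name="Demo">
  <component name="HREmployeeProcess">
    <reference name="HREmployeeProcessSubscriber"
               ui:wsdlLocation="http://host:8001/soa-infra/services/default/Demo/Service?WSDL"/>
  </component>
  <reference name="HREmployeeProcessSubscriber"
             ui:wsdlLocation="oramds:/apps/demo/Service.wsdl"/>
</composite>
"""


def local_name(qualified):
    """Strip the '{namespace}' prefix that ElementTree puts on tags and attributes."""
    return qualified.rsplit('}', 1)[-1]


def wsdl_locations_by_reference(composite_xml):
    """Collect all wsdlLocation values per reference name, at any nesting level."""
    root = ET.fromstring(composite_xml)
    found = defaultdict(set)
    for elem in root.iter():
        if local_name(elem.tag) != 'reference':
            continue
        for attr, value in elem.attrib.items():
            if local_name(attr) == 'wsdlLocation':
                found[elem.get('name')].add(value)
    return found


def inconsistent_references(composite_xml):
    """Return only the reference names that have more than one distinct location."""
    result = {}
    for name, locations in wsdl_locations_by_reference(composite_xml).items():
        if len(locations) > 1:
            result[name] = sorted(locations)
    return result


if __name__ == '__main__':
    for name, locations in inconsistent_references(SAMPLE).items():
        print(name + ' has differing wsdlLocations:')
        for location in locations:
            print('  ' + location)
```

To use it on a real project, read the composite.xml into a string and pass that to inconsistent_references(). An empty result only means that the locations agree, not that the composite is otherwise fine.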

Thursday, 1 September 2016

The third one on Creating weblogic user, now for SOA Suite

A few months ago I figured out how to create specific users with restricted access to Service Bus components. I blogged about it in part 2 of creating WebLogic users for ServiceBus 12c. But the series lacks an explanation on restricted user access on SOASuite.

Today in a question about Roles on Oracle Community Forums, the reference to this elaborate blog entry was given: Restricted View, by Antony Reynolds.

I think that blog explains it well. Unfortunately the link to 7.2 Partition Roles that Antony mentioned did not work. What I found is 7.3 Securing Access to Partitions (12.1.3) and 7.3 Securing Access to SOA Folders (12.2.1). (Apparently from 12.2.1 onwards, partitions are called SOA folders...)




Friday, 26 August 2016

Mount NTFS disk in Linux 7

Today I wanted to pass an old disk in a USB case to my son. It was from an old Windows laptop and, even though I'm an administrator, I wasn't able to read the documents in another user's folder.

So I thought, let's do it from an Oracle Linux 7 VM, as root. But it turns out that Oracle Linux does not support NTFS by default.

But with the trick in this link I managed to do it.

To sum up, especially for myself:

Add the EPEL-7 repository from Fedora:
# wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

# rpm -ivh epel-release-latest-7.noarch.rpm

This worked for me, although I had to get over the fact that I was adding a Fedora repo to Oracle Linux... Another option could have been (probably under Red Hat Linux or CentOS):
# yum install epel-release
...
# yum clean all
...
# yum update
...

Install the NTFS-3g package:
# yum install ntfs-3g -y

And enable NTFS support for FileManagers:
# yum install ntfsprogs -y

Then I got to the FileManager and I could browse the disk. It was mounted at '/run/media/oracle/803638DB3638D3BE'.

Wednesday, 24 August 2016

Generate Admin Channels to improve Weblogic Admin Console performance (and of FMW-Control)

At one of my customers we have quite an impressive domain configuration. It's a FMW domain with SOA, OSB, BAM, WSM, MFT in clusters of 4 nodes. The thing is that when having started all the servers, the console becomes slooooooowwwww. Not to speak of FMW Control (em).

One suggestion is to set the 'Invocation Timeout Seconds' under MyDomain->Configuration->General->Advanced to a value like 2. And 'Management Operation Timeout' under Preferences->Shared Preferences to a value like 5:

This surely makes the console responsive again. But it actually means that the console gives up right away when querying for the (health) state of the servers. So instead of a health of 'OK', you get a 'server unreachable' message.

When you have a lot of servers in the domain, they all share the same Admin Channel, and this seems to get flooded. The AdminServer does not get the responses in time. Sometimes a new request leads to a proper response, but in fact it takes a lot of time.

To reduce the load on the default channel, we created an admin channel per managed server. Since there are a lot of servers, and we need to do it on several environments, I created a wlst script for it.
The script createAdminChannels.py:
#############################################################################
# Create AdminChannels for WebLogic Servers
#
# @author Martien van den Akker, Darwin-IT Professionals
# @version 1.1, 2016-08-24
#
#############################################################################
# Modify these values as necessary
import sys, traceback
scriptName = sys.argv[0]
#
#
lineSeperator='__________________________________________________________________________________'
#
#
def usage():
  print 'Call script as: '
  print 'Windows: wlst.cmd '+scriptName+' -loadProperties localhost.properties'
  print 'Linux: wlst.sh '+scriptName+' -loadProperties environment.properties'
  print 'Property file should contain the following properties: '
  print "adminUrl=localhost:7001"
  print "adminUser=weblogic"
  print "adminPwd=welcome1"
#
#
def connectToadminServer(adminUrl, adminServerName):
  try:
    print(lineSeperator)
    print('Try to connect to the AdminServer')
    try:
      connect(userConfigFile=usrCfgFile, userKeyFile=usrKeyFile, url=adminUrl)
    except NameError, e:
      print('Apparently user config properties usrCfgFile and usrKeyFile not set.')
      print('Try to connect to the AdminServer adminUser and adminPwd properties')
      connect(adminUser, adminPwd, adminUrl)
  except WLSTException:
    message='Apparently AdminServer not Started!'
    print (message)
    raise Exception(message)
#
#
def createAdminChannel(serverName, adminListenPort):
  print(lineSeperator)
  channelName = serverName+'-AdminChannel'
  try:
    cd('/Servers/'+serverName+'/NetworkAccessPoints/'+channelName)
    print('Channel '+channelName +' for '+serverName+' already exists.')
  except WLSTException: 
    try:
      print('Create Admin Channel for server: '+serverName+', with port: '+adminListenPort)
      cd('/Servers/'+serverName)
      cmo.createNetworkAccessPoint(channelName)
      cd('/Servers/'+serverName+'/NetworkAccessPoints/'+channelName)
      cmo.setProtocol('admin')
      cmo.setListenPort(int(adminListenPort))
      cmo.setEnabled(true)
      cmo.setHttpEnabledForThisProtocol(true)
      cmo.setTunnelingEnabled(false)
      cmo.setOutboundEnabled(false)
      cmo.setTwoWaySSLEnabled(false)
      cmo.setClientCertificateEnforced(false)
      print ('Successfully created channel: '+channelName)
    except WLSTException:
      apply(traceback.print_exception, sys.exc_info())
      message='Failed to create channel '+channelName+'!'
      print (message)
      raise Exception(message)
#
#
def main():
  try:
    print (lineSeperator)
    print ('Create Admin Channels')
    print (lineSeperator)
    print('\nConnect to AdminServer ')
    connectToadminServer(adminUrl, adminServerName)
    print(lineSeperator)
    print('Start Edit Session')
    edit()
    startEdit()
    #
    #Create Admin Channels
    # Administrators
    print('\nCreate Admin Channels')
    serverNameList=serverNames.split(',')
    serverAdminPortList=serverAdminPorts.split(',')
    #
    idx=0
    for serverName in serverNameList:
      adminPort=serverAdminPortList[idx]
      createAdminChannel(serverName, adminPort)
      idx=idx+1
    # Activate changes
    print(lineSeperator)
    print('Activate Changes')
    save()
    activate(block='true')
    #
    print('\nExiting...')
    exit()
  except NameError, e:
    print('Apparently properties not set.')
    print "Please check the property: ", sys.exc_info()[0], sys.exc_info()[1]
    usage()
  except:
    apply(traceback.print_exception, sys.exc_info())
    stopEdit('y')
    exit(exitcode=1)
#call main()
main()
exit()


The shell script to call it, createAdminChannels.sh:
#!/bin/bash
#############################################################################
# Create AdminChannels
#
# @author Martien van den Akker, Darwin-IT Professionals
# @version 2.1, 2016-08-24
#
#############################################################################
#  
. fmw12c_env.sh
export PROPERTY_FILE=$1
echo
echo Create Admin Channels
wlst.sh ./createAdminChannels.py -loadProperties $PROPERTY_FILE

And the example property file, darlin-vce-db.properties:
#############################################################################
# Properties for creating the SOADomain
#
# @author Martien van den Akker, Darwin-IT Professionals
# @version 1.0, 2016-04-15
#
#############################################################################
#
# Properties for AdminServer
adminServerName=Adminserver
adminUrl=darlin-vce-db:7001
# AdminUser
adminUser=weblogic
adminPwd=welcome1
#
serverNames=AdminServer,OsbServer1,SoaServer1
serverAdminPorts=7101,8111,8101
#

Call the script as $> createAdminChannels.sh darlin-vce-db.properties

In the property file you'll need to name every server in the property serverNames. And for each server the particular Admin Listen Port in serverAdminPorts, in the exact same order. Start with the AdminServer.
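Because the two properties are matched purely by position, a missing or extra entry silently shifts the ports (or ends the script with an index error halfway through an edit session). A minimal sketch of a check you could do up front is shown below. It is plain Python and not part of the script above. The function name pair_servers_with_ports is my own invention, and since WLST runs on Jython the syntax may need a little adapting before you paste it into a wlst script.

```python
def pair_servers_with_ports(server_names, server_admin_ports):
    """Split the two comma-separated property values and pair them by position."""
    names = [n.strip() for n in server_names.split(',') if n.strip()]
    ports = [p.strip() for p in server_admin_ports.split(',') if p.strip()]
    if len(names) != len(ports):
        raise ValueError('serverNames has %d entries but serverAdminPorts has %d'
                         % (len(names), len(ports)))
    pairs = []
    for name, port in zip(names, ports):
        # int() raises a ValueError when a port is not numeric
        pairs.append((name, int(port)))
    return pairs


if __name__ == '__main__':
    # Values taken from the example property file
    for name, port in pair_servers_with_ports('AdminServer,OsbServer1,SoaServer1',
                                              '7101,8111,8101'):
        print(name + '-AdminChannel -> ' + str(port))
```

Doing this before startEdit() means a typo in the property file is reported before anything in the domain is touched.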


At the end of the script the changes are activated and then the AdminServer listens over https on the changed port.

Important: the servers need to be down, except for the AdminServer.

Unfortunately the infrastructure database was apparently down. So I haven't been able to start SOA, BAM, etc. to see if it is performant now. But I have high hopes...

Update wlst.sh
When you update the AdminServer as above, it will use the DemoTrust keystore by default. And the listen address will not necessarily match the host name that is used in the connect URL.

So by default you might run into errors when trying to connect to the AdminServer using wlst.
First you need to adapt the Admin URL to something like 'tls://darlin-vce-db:7101'. Explicitly prefix with 'tls://' and adapt the port to the new admin-listen-port.

Then the wlst script needs to be adapted. The following parameters need to be added to the wlst command:
  • -Dweblogic.security.SSL.ignoreHostnameVerification=true
  • -Dweblogic.security.TrustKeyStore=DemoTrust
To do so, find out where the wlst.sh/cmd script is located. Under linux you can perform:
[oracle@darlin-vce-db bin]$ which wlst.sh
/u01/app/oracle/FMW12210/oracle_common/common/bin/wlst.sh

Of course after setting the weblogic environment (see one of my earlier blogs describing the fmw12c_env.sh script).

Edit the wlst.sh file and go to the bottom of the file, and add the properties to the JVM_ARGS variable:
...
JVM_ARGS="${WLST_PROPERTIES} ${JVM_D64} ${UTILS_MEM_ARGS} ${CONFIG_JVM_ARGS} -Dweblogic.security.SSL.ignoreHostnameVerification=true -Dweblogic.security.TrustKeyStore=DemoTrust"
if [ -d "${JAVA_HOME}" ]; then
 eval '"${JAVA_HOME}/bin/java"' ${JVM_ARGS} weblogic.WLST '"$@"'
else
 exit 1
fi
But setting the CONFIG_JVM_ARGS in a script like fmw12c_env.sh might be a better idea:
...
export CONFIG_JVM_ARGS='-Dweblogic.security.SSL.ignoreHostnameVerification=true -Dweblogic.security.TrustKeyStore=DemoTrust'
...



Unable to login with a SQL Authenticator

For a project, we are migrating Forms to ADF.
There is also a number of reports which are not to be migrated yet.
Therefore, we need to keep the users in the database.
As we do not want to maintain two user stores, we thought it to be a good idea to create an authenticator in WebLogic to authenticate to the database.
There are loads of blog posts / support notes on how to configure a SQL Authenticator, so I won't repeat this procedure.
Take a look at Oracle Support Document 1342157.1 or a post from Edwin Biemond.


However, we noticed a flaw in these (old) references.


In WebLogic 12c, when we create a SQL Authenticator, there is a field named Identity Domain.
From the Oracle documentation, we learn:
All Authentication providers included in WebLogic Server support identity domains. If the identity domain attribute is set on an Authentication provider, that Authentication provider can authenticate only users who are defined in that identity domain.
...
An identity domain is a logical namespace for users and groups, typically representing a discrete set of users and groups in the physical datastore. Identity domains are used to identify the users associated with particular partitions.


As we do not use partitions in our domain, there is no use for an Identity domain.
But we did not know that when we set up the authenticator (and who reads the entire manual, right?)


So following the previously mentioned resources, we created the SQL authenticator and entered the domain name in the Identity Domain field.
This resulted in an authenticator that did not work.
Symptoms:
  • When a user tried to log in using database credentials, the authentication always failed
  • No error message in the log
  • No activity in the database (no query was executed to check the credentials)
To further analyse the issue, we added some Java options to the Admin server, using the setDomainEnv script:
JAVA_OPTIONS="${JAVA_OPTIONS} -Dweblogic.kernel.debug=true -Dweblogic.log.StdoutSeverity=Debug -Dweblogic.log.LogSeverity=Debug -Dweblogic.StdoutDebugEnabled=true -Dweblogic.log.LoggerSeverity=Debug -Dweblogic.debug.DebugSecurityAtn=true" 
 export JAVA_OPTIONS 

This gave us more insight into the issue. The log file now revealed:
####<23-aug-2016 15:13:02 uur CEST> <Debug> <SecurityAtn> <BSCC6112> <DefaultServer> <[ACTIVE] ExecuteThread: '5' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <d489addb-c960-41f1-babb-f912aec329bc-00000036> <1471957982959> <[severity-value: 128] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <weblogic.security.providers.authentication.shared.DBMSAtnLoginModuleImpl.login exception: 
java.security.PrivilegedActionException: javax.security.auth.login.LoginException: javax.security.auth.callback.UnsupportedCallbackException: Unrecognized Callback: class weblogic.security.auth.callback.IdentityDomainUserCallback 
weblogic.security.auth.callback.IdentityDomainUserCallback@31789514
 at java.security.AccessController.doPrivileged(Native Method)
 at com.bea.common.security.internal.service.LoginModuleWrapper.login(LoginModuleWrapper.java:114)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:497)
 at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
...


Unfortunately, we still had no clue.
As you have read from the beginning of the post, you might think that setting the Identity Domain field to DOMAIN might solve the issue (the log states partition-name: DOMAIN) but no, that's no solution either.

The trick is to leave the Identity Domain field blank. After that, the authenticator worked like a charm.

Thursday, 14 July 2016

Set connection retry frequency on DataSource in WebLogic 12c.

I have encountered several times in WebLogic 12c that when the connection pool of a DataSource could not be initialized, due to a connection error or an invalid username or password, the server could not be started.

I can't remember having encountered this problem in 11g, and this week I struggled with it with one of my customers. The perception of the DBA there was that in 11g the server did start up, but the DataSource would go into Suspended state. Half a year ago, one of the admins just removed a datasource because it made the 12c WebLogic server unstartable. "Taking a short turn", we would say in Dutch.

Now, having a database that is down is a reality at my customers, at least in development and test environments. At some of my customers it is also a reality that databases get refreshed with an earlier clone, causing, for instance, database passwords to become invalid. It's quite inconvenient not being able to start the servers, especially when the AdminServer can't be started because of it. And with SOASuite and OSB this is a reality, since some of the consoles and composers are targeted to the AdminServer.

This week I found there were 2 options to get the server started:
  • Temporarily untarget the datasource
  • Set the initial and minimum connections to 0. 
The first was acceptable for one of my customers because they have just one cluster to which the problematic datasources were targeted. Untargeting leaves all the other settings intact. And retargeting doesn't raise questions because there's only one target option.

But in a complex domain with several clusters it might not be too obvious to which cluster(s) or server(s) the datasource should be retargeted.

Changing the initial/minimum connections is not ideal either. Because you need to remember what the preferred settings were. These are important when using them for web applications or services used by web applications where performance is key.

But today I stumbled upon a third option: 'Connection Creation Retry Frequency' which can be found under DataSource -> Configuration -> Connection Pool -> Advanced:

Set this to 300 to have a retry every 5 minutes. It is described as:
The number of seconds between attempts to establish connections to the database.

If you do not set this value, data source creation fails if the database is unavailable. If set and if the database is unavailable when the data source is created, WebLogic Server will attempt to create connections in the pool again after the number of seconds you specify, and will continue to attempt to create the connections until it succeeds.

When set to 0, connection retry is disabled.
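If you'd rather script this than click through the console, something like the sketch below might work from WLST. Take it as an assumption on my part rather than a tested recipe: the attribute name ConnectionCreationRetryFrequencySeconds and the MBean path follow the JDBCConnectionPoolParams bean as I know it, and the datasource name MyDataSource is hypothetical. The WLST commands cd and set are passed in as arguments, so that the helper can also be tried with stand-ins in plain Python.

```python
def pool_params_path(ds_name):
    """Edit-tree path of the connection pool parameters of a datasource (assumed layout)."""
    return ('/JDBCSystemResources/' + ds_name +
            '/JDBCResource/' + ds_name +
            '/JDBCConnectionPoolParams/' + ds_name)


def set_retry_frequency(ds_name, seconds, cd, set_attr):
    """Navigate to the pool parameters and set the retry frequency in seconds.

    cd and set_attr are expected to behave like the WLST commands cd(path)
    and set(attrName, value).
    """
    if seconds < 0:
        raise ValueError('seconds must be 0 (retry disabled) or positive')
    cd(pool_params_path(ds_name))
    set_attr('ConnectionCreationRetryFrequencySeconds', seconds)


# Intended use inside a WLST session (not runnable in plain Python):
#   connect(...)
#   edit()
#   startEdit()
#   set_retry_frequency('MyDataSource', 300, cd, set)
#   save()
#   activate(block='true')

if __name__ == '__main__':
    # Stand-ins that only print what would be done
    def fake_cd(path):
        print('cd ' + path)

    def fake_set(name, value):
        print('set ' + name + ' = ' + str(value))

    set_retry_frequency('MyDataSource', 300, fake_cd, fake_set)
```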

Learning all the time...



Start and stop a WebLogic (SOA/OSB) Domain

So let's start the day with a blog. In the past few months I created scripts to install FMW products and build a WebLogic domain for it. For most of my findings I did a blog already:
And that doesn't even include the posts about the scripts to set up and patch the QuickStarts. Nice to mention is that 12.2.1.1 was released this week. So I can adapt my scripts to the new software (I'm not going to post that, you should be able to figure that out yourself; maybe I'll provide the changed snippets when I come to it).

But the important missing part in this series is starting and stopping the domain. So let me provide and describe my scripts.

Oracle does provide scripts to start and stop your NodeManager and WebLogic servers. When you configure the NodeManager and Servers correctly you can use those to add the NodeManager to the init.d services on Linux or to your Windows Services. Then the NodeManager would start the AdminServer and other Servers automatically.

However, on my demo/training/development systems I'd like to be able to start/stop the parts myself.  So here we go.

Settings

I'm working with Linux and create shell scripts to kick off the wlst scripts that I use to start the particular components. Therefore I use a settings script called fmw12c_env.sh to call the setWLSEnv.sh from the FMW home. I take it that you're able to translate them to Windows command files if applicable. I provided it a couple of times already, but for completeness here it is:
#!/bin/bash
echo set Fusion MiddleWare 12cR2 environment
export FMW_HOME=/u01/app/oracle/FMW12210
export NODEMGR_HOME=/u01/app/work/domains/osb_domain/nodemanager

export SOA_HOME=$FMW_HOME/soa
export OSB_HOME=$FMW_HOME/osb
export MFT_HOME=$FMW_HOME/mft
#
echo call setWLSEnv.sh
. $FMW_HOME/wlserver/server/bin/setWLSEnv.sh
export PATH=$FMW_HOME/oracle_common/common/bin:$WL_HOME/common/bin/:$WL_HOME/server/bin:$PATH

Then the wlst scripts need a property file, fmw.properties:
#############################################################################
# Set SOABPM Domain properties
#
# @author Martien van den Akker, Darwin-IT Professionals
# @version 1.0, 2016-06-27
#
#############################################################################
#
#Per domain nodemanager...
#nmHome=/u01/app/work/domains/osb_domain/nodemanager
nmHome=/u01/app/work/domains/soabpm_domain/nodemanager
#nmHome=/u01/app/work/domains/soabpm12c_dev/nodemanager
nmHost=darlin-vce-db
nmPort=5555
nmType=plain
#
#Domain
#wlsDomainName=osb_domain
#wlsDomainName=soabpm12c_dev
wlsDomainName=soabpm_domain
wlsDomainsHome=/u01/app/work/domains
#
#AdminServer
adminServerName=AdminServer
#adminServerName=OsbAdmin
adminUrl=darlin-vce-db:7001
adminUser=weblogic
adminPwd=welcome1
usrCfgFile=wlsCfgFile
usrKeyFile=wlsKeyFile
#Clusters
osbClr=OsbCluster
soaClr=SoaCluster

Create User Config and Key files

In the property file above you'll see the adminUser and adminPwd (password). On a production environment you probably don't want to have those in plain sight. WebLogic provides a means to encrypt those into a user config and a key file. When you generate those with the default names in your home folder, you can even connect to the AdminServer without providing the username/password. In my scripts I use named files as defined in the usrCfgFile and usrKeyFile properties. But first we need to generate those, and for that we need to connect to the AdminServer using the adminUser and adminPwd properties.

I created the script createUserConfigFiles.py to generate those files:
#############################################################################
# Create user config files
#
# @author Martien van den Akker, Darwin-IT Professionals
# @version 2.1, 2016-06-27
#
#############################################################################
# Modify these values as necessary
import sys, traceback
scriptName = sys.argv[0]
#
#
lineSeperator='__________________________________________________________________________________'
#
#
def usage():
  print 'Call script as: '
  print 'Windows: wlst.cmd '+scriptName+' -loadProperties localhost.properties'
  print 'Linux: wlst.sh '+scriptName+' -loadProperties environment.properties'
  print 'Property file should contain the following properties: '
  print "adminUrl=localhost:7101"
  print "adminUser=weblogic"
  print "adminPwd=welcome1"
#
#
def main():
  try:
    print(lineSeperator)
    print('Create user config files')
    print(lineSeperator)
    print('\nConnect to the AdminServer: '+adminServerName)
    connect(adminUser, adminPwd, adminUrl)
    #
    print('\nStore Config files')
    storeUserConfig(usrCfgFile,usrKeyFile)
    #   
    print('\nExiting...')
  except NameError, e:
    print('Apparently properties not set.')
    print "Please check the property: ", sys.exc_info()[0], sys.exc_info()[1]
    usage()
  except:
    apply(traceback.print_exception, sys.exc_info())
    exit(exitcode=1)
#
main();

After connecting to the AdminServer, it is the storeUserConfig(usrCfgFile,usrKeyFile) call that creates the files. If you don't supply the parameters it will create the default files in your home folder (for instance /home/oracle); then you can connect just by issuing connect('localhost:7001'). For more on this, read this blog.

To call the script above you can use the script createUserConfigFiles.sh or the contents of it:
#!/bin/bash
#############################################################################
# Create user config files using wlst.sh
#
# @author Martien van den Akker, Darwin-IT Professionals
# @version 2.1, 2016-06-27
#
#############################################################################
#  
. fmw12c_env.sh
echo
echo Create User Config files
wlst.sh ./createUserConfigFiles.py -loadProperties fmw.properties

After having created the files you can (and should!) remove the adminUser and adminPwd properties from the fmw.properties file.
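Once the plain-text credentials are gone, the scripts have to pick the right way to connect. The sketch below shows one way to decide that: use the config and key files when both exist, and fall back to username and password otherwise. It is plain Python, the function name connect_arguments is hypothetical, and I'm assuming that WLST's connect() accepts the keyword arguments userConfigFile, userKeyFile, username, password and url, so verify that against your WLST version.

```python
import os


def connect_arguments(admin_url, cfg_file=None, key_file=None, user=None, password=None):
    """Return keyword arguments for connect(), preferring the encrypted files."""
    if cfg_file and key_file and os.path.isfile(cfg_file) and os.path.isfile(key_file):
        return {'userConfigFile': cfg_file, 'userKeyFile': key_file, 'url': admin_url}
    if user and password:
        return {'username': user, 'password': password, 'url': admin_url}
    raise ValueError('Neither usable user config/key files nor adminUser/adminPwd available')


# Intended use inside a WLST script (not runnable in plain Python):
#   connect(**connect_arguments(adminUrl, usrCfgFile, usrKeyFile))

if __name__ == '__main__':
    # The file names do not exist here, so this falls back to username/password
    args = connect_arguments('darlin-vce-db:7001', 'wlsCfgFile', 'wlsKeyFile',
                             'weblogic', 'welcome1')
    print(sorted(args.keys()))
```

Note that the relative file names from the property file are resolved against the directory the script is started from, so run it from the folder that holds the files.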

Start the NodeManager

In the domain home/bin folder (for instance /u01/app/work/domains/soabpm_domain/bin) you'll find scripts to start and stop the NodeManager: startNodeManager.sh and stopNodeManager.sh. These are fine to use in the service configurations in Linux or Windows.

However, running them from the shell/command line will cause your session to be taken for the NodeManager. Stopping the session will cause the NodeManager to be stopped. Not so handy if you want to start the NodeManager in an overall script that starts the complete domain.

But you can start the NodeManager from wlst as well. And then it spawns the NodeManager into a separate process in the background.
The script startNodeManager.py is quite simple:
#############################################################################
# Start nodemanager
#
# @author Martien van den Akker, Darwin-IT Professionals
# @version 2.1, 2016-06-27
#
#############################################################################
# Modify these values as necessary
import sys, traceback
scriptName = sys.argv[0]
#
#
lineSeperator='__________________________________________________________________________________'
#
#
def usage():
  print 'Call script as: '
  print 'Windows: wlst.cmd '+scriptName+' -loadProperties localhost.properties'
  print 'Linux: wlst.sh '+scriptName+' -loadProperties environment.properties'
  print 'Property file should contain the following properties: '
  print "adminUrl=localhost:7101"
  print "adminUser=weblogic"
  print "adminPwd=welcome1"
# 
# 
def startNM(nmHost, nmPort, nmHome, nmType):
  print(lineSeperator)
  print ('Start NodeManager')
  startNodeManager(verbose='true', NodeManagerHome=nmHome, ListenPort=nmPort, ListenAddress=nmHost);
  print('Finished starting NodeManager')
#
def main():
  try:
    startNM(nmHost, nmPort, nmHome,  nmType);
  except NameError, e:
    print('Apparently properties not set.')
    print "Please check the property: ", sys.exc_info()[0], sys.exc_info()[1]
    usage()
  except:
    apply(traceback.print_exception, sys.exc_info())
    # No edit session is opened in this script, so there is nothing to stopEdit().
    exit(exitcode=1)
#
main()

It's mainly about the startNodeManager(verbose='true', NodeManagerHome=nmHome, ListenPort=nmPort, ListenAddress=nmHost) statement in the startNM() function, which is quite self-explanatory, I think. No username and password are required to start the NodeManager, but it does need to know what its home, listen port and listen address should be.

As with all the scripts, I declare a scriptName variable and assign it the 0-th argument: sys.argv[0]. This is used in the usage() function that is called in case of a NameError exception. This exception is raised when a variable or function is referenced but not found. Normally this means that a certain property is not set in the property file.
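The NameError mechanism can be tried outside WLST. The following is a small sketch in plain Python: describe_node_manager and safe_describe are hypothetical names, and nmHost and nmPort are deliberately left undefined to mimic a property file that lacks them:

```python
import sys


def describe_node_manager():
    # nmHost and nmPort are deliberately not defined here; under WLST they
    # would come from the property file passed with -loadProperties.
    return nmHost + ':' + str(nmPort)


def safe_describe():
    try:
        return describe_node_manager()
    except NameError:
        # sys.exc_info()[1] is the exception instance; its text names the variable.
        return 'Please check the property: ' + str(sys.exc_info()[1])


message = safe_describe()
print(message)
```

The printed message mentions nmHost, the first name that could not be resolved, which is the hint the scripts pass on to the user.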

To run it from the shell you can use the startNodeManager.sh script:

#!/bin/bash
#############################################################################
# Start nodemanager using wlst
#
# @author Martien van den Akker, Darwin-IT Professionals
# @version 2.1, 2016-06-27
#
#############################################################################
#  
. fmw12c_env.sh
echo
echo Start NodeManager
wlst.sh ./startNodeManager.py -loadProperties fmw.properties
For every Python script in this article I have an accompanying shell script like this, so I won't provide those for the other scripts.

Stopping the NodeManager is a little more complicated:
#############################################################################
# Stop nodemanager
#
# @author Martien van den Akker, Darwin-IT Professionals
# @version 2.1, 2016-06-27
#
#############################################################################
# Modify these values as necessary
import sys, traceback
scriptName = sys.argv[0]
#
#
lineSeperator='__________________________________________________________________________________'
#
#
def usage():
  print 'Call script as: '
  print 'Windows: wlst.cmd '+scriptName+' -loadProperties localhost.properties'
  print 'Linux: wlst.sh '+scriptName+' -loadProperties environment.properties'
  print 'Property file should contain the following properties: '
  print "adminUrl=localhost:7101"
  print "adminUser=weblogic"
  print "adminPwd=welcome1"
# 
#
def connectToNM(nmHost, nmPort, nmHome, wlsDomainName, wlsDomainHome, nmType):
  try:
    print(lineSeperator)
    print('Try to connect to the Node Manager')
    try:
      nmConnect(userConfigFile=usrCfgFile, userKeyFile=usrKeyFile, host=nmHost, port=nmPort, domainName=wlsDomainName, domainDir=wlsDomainHome, nmType=nmType)
    except NameError, e:
      print('Apparently user config properties usrCfgFile and usrKeyFile not set.')
      print('Try to connect to the NodeManager adminUser and adminPwd properties')
      nmConnect(username=adminUser, password=adminPwd, host=nmHost, port=nmPort, domainName=wlsDomainName, domainDir=wlsDomainHome, nmType=nmType)
    print('Connected to the Node Manager')
  except WLSTException:
    message='Apparently NodeManager not Started!'
    print (message)
    raise Exception(message)
# 
# 
def stopNM(nmHost, nmPort, nmHome, wlsDomainName, wlsDomainHome, nmType):
  try:
    print(lineSeperator)
    print ('Try to connect to the Node Manager')
    connectToNM(nmHost, nmPort, nmHome, wlsDomainName, wlsDomainHome, nmType);
    print ('Connected to the Node Manager')
    print ('Stop the NodeManager')
    stopNodeManager();
    print('Finished stopping NodeManager')
  except WLSTException:
    print ('Apparently NodeManager not Started!')
#
def main():
  try:
    wlsDomainHome = wlsDomainsHome+'/'+wlsDomainName
    stopNM(nmHost, nmPort, nmHome, wlsDomainName, wlsDomainHome, nmType);
  except NameError, e:
    print('Apparently properties not set.')
    print "Please check the property: ", sys.exc_info()[0], sys.exc_info()[1]
    usage()
  except:
    apply(traceback.print_exception, sys.exc_info())
    # No edit session is opened in this script, so there is nothing to stopEdit().
    exit(exitcode=1)
#
main()

You first need to connect to the NodeManager. This is done in connectToNM(), which illustrates the use of the user config and key files with the adminUser/adminPwd properties as a fallback. First it tries to connect using the userConfigFile/userKeyFile combination. If that leads to a NameError, then you probably did not provide those, so it retries with the adminUser/adminPwd properties.
This construct is reused when connecting to the AdminServer to start the rest of the domain.
Once connected, the NodeManager can be stopped simply with the stopNodeManager() command.
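The two-layer approach can also be written without relying on NameError. The following is a sketch and not the code used above: connect_with_fallback, the props dictionary and fake_connect are hypothetical names, and fake_connect only stands in for nmConnect() or connect() so the logic can be tried outside WLST:

```python
def connect_with_fallback(props, connect_fn):
    """Prefer user config/key files; fall back to username/password."""
    if props.get('usrCfgFile') and props.get('usrKeyFile'):
        return connect_fn(userConfigFile=props['usrCfgFile'],
                          userKeyFile=props['usrKeyFile'])
    return connect_fn(username=props['adminUser'], password=props['adminPwd'])


def fake_connect(**kwargs):
    # Stand-in for a WLST connect call: report which arguments were used.
    return sorted(kwargs.keys())


# File names and credentials below are sample values only.
with_files = connect_with_fallback(
    {'usrCfgFile': 'wlsCfg', 'usrKeyFile': 'wlsKey'}, fake_connect)
with_password = connect_with_fallback(
    {'adminUser': 'weblogic', 'adminPwd': 'welcome1'}, fake_connect)
print(with_files)
print(with_password)
```

The explicit check makes the choice visible in the code, at the cost of first collecting the properties in a dictionary instead of using the globals that -loadProperties provides.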

Start/Stop the AdminServer

The startAdmin.py script provided below is used to start the AdminServer. Again, there are startWeblogic.sh and stopWeblogic.sh scripts in the bin folder of the domain. But those start the server in the foreground (although you can move it to the background, of course). More importantly, servers started that way aren't under control of the NodeManager, and having them under its control has its advantages. So here is the startAdmin.py script:
#############################################################################
# Start AdminServer via NodeManager and try to connect to it.
# Result of this script should be that you are connected to the AdminServer.
#
# @author Martien van den Akker, Darwin-IT Professionals
# @version 2.1, 2016-06-27
#
#############################################################################
# Modify these values as necessary
import sys, traceback
scriptName = sys.argv[0]
#
#
lineSeperator='__________________________________________________________________________________'
#
#
def usage():
  print 'Call script as: '
  print 'Windows: wlst.cmd '+scriptName+' -loadProperties localhost.properties'
  print 'Linux: wlst.sh '+scriptName+' -loadProperties environment.properties'
  print 'Property file should contain the following properties: '
  print "adminUrl=localhost:7101"
  print "adminUser=weblogic"
  print "adminPwd=welcome1"
# 
#
def connectToNM(nmHost, nmPort, nmHome, wlsDomainName, wlsDomainHome, nmType):
  try:
    print(lineSeperator)
    print('Try to connect to the Node Manager')
    try:
      nmConnect(userConfigFile=usrCfgFile, userKeyFile=usrKeyFile, host=nmHost, port=nmPort, domainName=wlsDomainName, domainDir=wlsDomainHome, nmType=nmType)
    except NameError, e:
      print('Apparently user config properties usrCfgFile and usrKeyFile not set.')
      print('Try to connect to the NodeManager adminUser and adminPwd properties')
      nmConnect(username=adminUser, password=adminPwd, host=nmHost, port=nmPort, domainName=wlsDomainName, domainDir=wlsDomainHome, nmType=nmType)
    print('Connected to the Node Manager')
  except WLSTException:
    message='Apparently NodeManager not Started!'
    print (message)
    raise Exception(message)
    #print 'Start Nodemanager'
    #startNodeManager(verbose='true', NodeManagerHome=nmHome, ListenPort=nmPort, ListenAddress=nmHost);
    #print 'Retry to connect to the Node Manager';
    #nmConnect(username=adminUser, password=adminPwd, host=nmHost, port=nmPort, domainName=wlsDomainName, domainDir=wlsDomainHome, nmType=nmType);
#
#
def connectToadminServer(adminUrl, adminServerName):
  try:
    print(lineSeperator)
    print('Try to connect to the AdminServer')
    try:
      connect(userConfigFile=usrCfgFile, userKeyFile=usrKeyFile, url=adminUrl)
    except NameError, e:
      print('Apparently user config properties usrCfgFile and usrKeyFile not set.')
      print('Try to connect to the AdminServer adminUser and adminPwd properties')
      connect(adminUser, adminPwd, adminUrl)
  except WLSTException:
    print('Apparently AdminServer not Started!')
    print('Start AdminServer')
    nmStart(adminServerName)
    print('Retry to connect to the AdminServer')
    try:
      connect(userConfigFile=usrCfgFile, userKeyFile=usrKeyFile, url=adminUrl)
    except NameError, e:
      print('Apparently user config properties usrCfgFile and usrKeyFile not set.')
      print('Try to connect to the AdminServer adminUser and adminPwd properties')
      connect(adminUser, adminPwd, adminUrl)
#
#
def main():
  try:
    wlsDomainHome = wlsDomainsHome+'/'+wlsDomainName
    print(lineSeperator)
    print('Start '+adminServerName+' for domain in : '+wlsDomainHome)
    print(lineSeperator)
    print ('Connect to the Node Manager')
    connectToNM(nmHost, nmPort, nmHome, wlsDomainName, wlsDomainHome, nmType)
    print ('Connect to the AdminServer: '+adminServerName)
    connectToadminServer(adminUrl, adminServerName)
  except NameError, e:
    print('Apparently properties not set.')
    print "Please check the property: ", sys.exc_info()[0], sys.exc_info()[1]
    usage()
  except:
    apply(traceback.print_exception, sys.exc_info())
    exit(exitcode=1)
#
main();

The script first uses connectToNM(nmHost, nmPort, nmHome, wlsDomainName, wlsDomainHome, nmType) to connect to the NodeManager, in the same way as in the stopNodeManager.py script.
Then it tries to connect to the AdminServer using connectToadminServer(adminUrl, adminServerName). Connecting to the NodeManager first might seem a little overdone, but it has a purpose. The connectToadminServer() function tries to connect to the AdminServer the same way as to the NodeManager: first with the userConfigFile/userKeyFile combination and, when those properties aren't set, with the username/password combination. If the connection to the AdminServer fails, the function concludes that it isn't started yet. Since the script is still connected to the NodeManager, it can start the AdminServer using nmStart(adminServerName). So it is important that the adminServerName property in fmw.properties holds the correct name of the AdminServer.
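The "connect, and if that fails start and retry" flow can be isolated in a few lines. This is a sketch under assumptions: ensure_connected and FakeServer are hypothetical names, RuntimeError stands in for WLSTException, and the fake server replaces connect() and nmStart() so the flow can run outside WLST:

```python
def ensure_connected(connect_fn, start_fn):
    """Try to connect; when that fails, start the server and connect once more."""
    try:
        return connect_fn()
    except RuntimeError:
        # RuntimeError stands in for WLSTException in this sketch.
        start_fn()
        return connect_fn()


class FakeServer:
    """Minimal stand-in for an AdminServer that is initially down."""

    def __init__(self):
        self.running = False
        self.start_calls = 0

    def start(self):
        self.start_calls += 1
        self.running = True

    def connect(self):
        if not self.running:
            raise RuntimeError('AdminServer not reachable')
        return 'connected'


server = FakeServer()
result = ensure_connected(server.connect, server.start)
print(result)
```

A second call to ensure_connected() with the same server connects right away and does not start it again.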

To stop the admin server I have the stopAdmin.py script:
#############################################################################
# Stop AdminServer 
#
# @author Martien van den Akker, Darwin-IT Professionals
# @version 2.1, 2016-06-27
#
#############################################################################
# Modify these values as necessary
import sys, traceback
scriptName = sys.argv[0]
#
#
lineSeperator='__________________________________________________________________________________'
#
#
def usage():
  print 'Call script as: '
  print 'Windows: wlst.cmd '+scriptName+' -loadProperties localhost.properties'
  print 'Linux: wlst.sh '+scriptName+' -loadProperties environment.properties'
  print 'Property file should contain the following properties: '
  print "adminUrl=localhost:7101"
  print "adminUser=weblogic"
  print "adminPwd=welcome1"
#
#
def connectToadminServer( adminUrl, adminServerName):
  try:
    print(lineSeperator)
    print('Try to connect to the AdminServer using user config')
    connect(userConfigFile=usrCfgFile, userKeyFile=usrKeyFile, url=adminUrl)
  except NameError, e:
    print('Apparently user config properties usrCfgFile and usrKeyFile not set.')
    print('Try to connect to the AdminServer adminUser and adminPwd properties')
    connect(adminUser, adminPwd, adminUrl)
  except WLSTException:
    message='Apparently AdminServer not Started!'
    print (message)
    raise Exception(message)
#
#
def main():
  try:
    wlsDomainHome = wlsDomainsHome+'/'+wlsDomainName
    print(lineSeperator)
    print('Stop '+adminServerName+' for domain in : '+wlsDomainHome)
    print(lineSeperator)
    print('\nConnect to the AdminServer: '+adminServerName)
    connectToadminServer(adminUrl, adminServerName)
    #
    print('\nShutdown the AdminServer: '+adminServerName)
    shutdown(force='true')
    #
    print('\nFinished stopping AdminServer: '+adminServerName)
  except NameError, e:
    print('Apparently properties not set.')
    print "Please check the property: ", sys.exc_info()[0], sys.exc_info()[1]
    usage()
  except:
    apply(traceback.print_exception, sys.exc_info())
    exit(exitcode=1)
#
main();

This one connects to the AdminServer (in the same two-layer approach), and when connected it performs a shutdown(force='true'). This is equivalent to the 'Force Shutdown Now' option in the console. Since no server or cluster name is provided, the AdminServer is shut down.

Start&Stop Domain

In my create domain scripts I create clusters for every component. Even when I have only one managed server (for instance in a development environment) I add the servers to a cluster. This makes it easier to expand the domain with multiple nodes. And it makes the management the same for every environment.

So to start and stop the domain I work per cluster. The nice thing then is that, once connected, you can just issue a start cluster command and the AdminServer uses the NodeManagers on the different hosts to start the servers simultaneously. This reduces the startup time significantly compared to starting the servers of a cluster one by one.

Now, it might be that one of the servers can't be started right away, and that makes the start cluster command fail. So I added a little code to try to start the different servers one by one. I think it is somewhat overdone in most cases, but it does illustrate some functions to figure out which servers to start and to determine their state. So here is my startDomain.py script:
#############################################################################
# Start SOA and OSB domain
#
# @author Martien van den Akker, Darwin-IT Professionals
# @version 1.1, 2016-06-27
#
#############################################################################
# Modify these values as necessary
import sys, traceback
scriptName = sys.argv[0]
#
#
lineSeperator='__________________________________________________________________________________'
#
#
def usage():
  print 'Call script as: '
  print 'Windows: wlst.cmd '+scriptName+' -loadProperties localhost.properties'
  print 'Linux: wlst.sh '+scriptName+' -loadProperties environment.properties'
  print 'Property file should contain the following properties: '
  print "adminUrl=localhost:7001"
  print "adminUser=weblogic"
  print "adminPwd=welcome1"
# 
#
def connectToNM(nmHost, nmPort, nmHome, wlsDomainName, wlsDomainHome, nmType):
  try:
    print(lineSeperator)
    print('Try to connect to the Node Manager')
    try:
      nmConnect(userConfigFile=usrCfgFile, userKeyFile=usrKeyFile, host=nmHost, port=nmPort, domainName=wlsDomainName, domainDir=wlsDomainHome, nmType=nmType)
    except NameError, e:
      print('Apparently user config properties usrCfgFile and usrKeyFile not set.')
      print('Try to connect to the NodeManager adminUser and adminPwd properties')
      nmConnect(username=adminUser, password=adminPwd, host=nmHost, port=nmPort, domainName=wlsDomainName, domainDir=wlsDomainHome, nmType=nmType)
    print('Connected to the Node Manager')
  except WLSTException:
    message='Apparently NodeManager not Started!'
    print (message)
    raise Exception(message)
    #print 'Start Nodemanager'
    #startNodeManager(verbose='true', NodeManagerHome=nmHome, ListenPort=nmPort, ListenAddress=nmHost);
    #print 'Retry to connect to the Node Manager';
    #nmConnect(username=adminUser, password=adminPwd, host=nmHost, port=nmPort, domainName=wlsDomainName, domainDir=wlsDomainHome, nmType=nmType);
#
#
def connectToadminServer(adminUrl, adminServerName):
  try:
    print(lineSeperator)
    print('Try to connect to the AdminServer')
    try:
      connect(userConfigFile=usrCfgFile, userKeyFile=usrKeyFile, url=adminUrl)
    except NameError, e:
      print('Apparently user config properties usrCfgFile and usrKeyFile not set.')
      print('Try to connect to the AdminServer adminUser and adminPwd properties')
      connect(adminUser, adminPwd, adminUrl)
  except WLSTException:
    print('Apparently AdminServer not Started!')
    print('Start AdminServer')
    nmStart(adminServerName)
    print('Retry to connect to the AdminServer')
    try:
      connect(userConfigFile=usrCfgFile, userKeyFile=usrKeyFile, url=adminUrl)
    except NameError, e:
      print('Apparently user config properties usrCfgFile and usrKeyFile not set.')
      print('Try to connect to the AdminServer adminUser and adminPwd properties')
      connect(adminUser, adminPwd, adminUrl)
#
# Get the Servers of Cluster 
def getClusterServers(clustername):
  #Cluster config to be fetched from ServerConfig
  print(lineSeperator)
  print('\nGet Servers from cluster '+clustername)
  serverConfig()
  cluster = getMBean("/Clusters/" + clustername)
  if cluster is None:
    errorMsg= "Cluster " + clustername + " does not appear to exist!"
    print errorMsg
    raise(Exception(errorMsg))
  print "Found cluster "+ clustername+ "."
  servers = cluster.getServers()
  return servers
#
#
def serverStatus(serverName):
  serverRuntime=getMBean('/ServerRuntimes/'+serverName)
  if serverRuntime is None:
    print("Server Runtime for  " + serverName + " not found, server apparently SHUTDOWN")
    serverState="SHUTDOWN"
  else:
    print "Found Server Runtime for "+ serverName+ "."
    serverState = serverRuntime.getState()
  return serverState
#
#
def startClusterServersOneByOne(clusterName):
  print(lineSeperator)
  print ('Start servers for cluster: '+clusterName)
  servers=getClusterServers(clusterName)
  # Need to go to domainRuntime to get to the serverRuntimes.
  domainRuntime()
  #
  for server in servers:
    print(lineSeperator)
    serverName = server.getName()
    print('ServerName: '+serverName)
    serverState = serverStatus(serverName)
    print('Server '+serverName+': '+serverState)
    if serverState=="SHUTDOWN":
      print ('Server '+serverName+' is not running so start it.')
      start(serverName)
    elif serverState=="RUNNING":
      print ('Server '+serverName+' is already running')
    else:
      print ('Server '+serverName+' in state '+serverState+', not startable!')
  #
  print ('\nFinished starting servers.')
#
#
def startClusterServers(clusterName):
  print(lineSeperator)
  print ('Start servers for cluster: '+clusterName)
  #
  try:
    start(clusterName,'Cluster')
  except WLSTException:
    print "Apparently Cluster in incompatible state!", sys.exc_info()[0], sys.exc_info()[1]
    startClusterServersOneByOne(clusterName)
  state(clusterName,'Cluster')
  #
  print ('\nFinished starting servers.')
#
#
def main():
  try:
    wlsDomainHome = wlsDomainsHome+'/'+wlsDomainName
    print (lineSeperator)
    print ('Start Osb Cluster')
    print('\nConnect to AdminServer ')
    print (lineSeperator)
    print ('Connect to the Node Manager')
    connectToNM(nmHost, nmPort, nmHome, wlsDomainName, wlsDomainHome, nmType)
    print ('Connect to the AdminServer: '+adminServerName)
    connectToadminServer(adminUrl, adminServerName)
    #
    print('Start servers for cluster: '+osbClr)
    startClusterServers(osbClr)
    #
    print('Start servers for cluster: '+soaClr)
    startClusterServers(soaClr)
    #
    print('\nExiting...')
    exit()
  except NameError, e:
    print('Apparently properties not set.')
    print "Please check the property: ", sys.exc_info()[0], sys.exc_info()[1]
    usage()
  except:
    apply(traceback.print_exception, sys.exc_info())
    exit(exitcode=1)
#call main()
main()
exit()

It first connects to the NodeManager and the AdminServer in the familiar way; if the AdminServer is not running, it tries to start it. So far it is copied from the startAdmin.py script. Then it calls the startClusterServers(clusterName) function for each cluster. You need to add each cluster here by hand. I could loop over the cluster names, but you might need to have them started in a certain order.
This function first issues the start(clusterName,'Cluster') command. By providing 'Cluster' as the start type, a complete cluster can be started. If that fails, then apparently one or more servers failed to start, and it tries to start the servers one by one in the startClusterServersOneByOne(clusterName) function. That function gets the servers attached to the cluster using the getClusterServers(clustername) function, which calls the getServers() method of the cluster. This needs to be done in the serverConfig() tree.
After that, for each server the status is fetched by querying the ServerRuntime of the server in the serverStatus() function. ServerRuntimes are available in the domainRuntime() tree, so the script needs to switch to it using that function. If the ServerRuntime is not available, the state is considered to be 'SHUTDOWN'. If the server is down, a start(serverName) is issued. Lastly, in the startClusterServers() function, state(clusterName,'Cluster') is called to print the state of the servers in the cluster.
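If the start order matters, the loop over the clusters can still be driven from a property. Below is a sketch, assuming a hypothetical comma-separated property value (such a property is not part of the fmw.properties used in this article) and a start_fn that stands in for startClusterServers(). The cluster names are examples only:

```python
def start_clusters_in_order(cluster_order, start_fn):
    """Call start_fn for every cluster name, in the order given by the property."""
    started = []
    for name in cluster_order.split(','):
        name = name.strip()
        if not name:
            continue  # tolerate stray commas
        start_fn(name)
        started.append(name)
    return started


log = []
# log.append stands in for startClusterServers(); it just records the calls.
order = start_clusters_in_order('OsbCluster, SoaCluster', log.append)
print(order)
```

Because the names are processed left to right, the order in the property value is the order in which the clusters are started.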

To stop the domain I created the stopDomain.py script, which mainly works in reverse:
#############################################################################
# Stop OSB Domain
#
# @author Martien van den Akker, Darwin-IT Professionals
# @version 1.1, 2016-06-27
#
#############################################################################
# Modify these values as necessary
import sys, traceback
scriptName = sys.argv[0]
#
#
lineSeperator='__________________________________________________________________________________'
#
#
def usage():
  print 'Call script as: '
  print 'Windows: wlst.cmd '+scriptName+' -loadProperties localhost.properties'
  print 'Linux: wlst.sh '+scriptName+' -loadProperties environment.properties'
  print 'Property file should contain the following properties: '
  print "adminUrl=localhost:7001"
  print "adminUser=weblogic"
  print "adminPwd=welcome1"
#
#
def connectToadminServer(adminUrl, adminServerName):
  try:
    print(lineSeperator)
    print('Try to connect to the AdminServer')
    try:
      connect(userConfigFile=usrCfgFile, userKeyFile=usrKeyFile, url=adminUrl)
    except NameError, e:
      print('Apparently user config properties usrCfgFile and usrKeyFile not set.')
      print('Try to connect to the AdminServer adminUser and adminPwd properties')
      connect(adminUser, adminPwd, adminUrl)
  except WLSTException:
    message='Apparently AdminServer not Started!'
    print (message)
    raise Exception(message)
#
# Get the Servers of Cluster 
def getClusterServers(clustername):
  #Cluster config to be fetched from ServerConfig
  print(lineSeperator)
  print('\nGet Servers from cluster '+clustername)
  serverConfig()
  cluster = getMBean("/Clusters/" + clustername)
  if cluster is None:
    errorMsg= "Cluster " + clustername + " does not appear to exist!"
    print errorMsg
    raise(Exception(errorMsg))
  print "Found cluster "+ clustername+ "."
  servers = cluster.getServers()
  return servers
#
#
def serverStatus(serverName):
  serverRuntime=getMBean('/ServerRuntimes/'+serverName)
  if serverRuntime is None:
    print("Server Runtime for  " + serverName + " not found, server apparently SHUTDOWN")
    serverState="SHUTDOWN"
  else:
    print "Found Server Runtime for "+ serverName+ "."
    serverState = serverRuntime.getState()
  return serverState
#
#
def stopClusterServers(clusterName):
  print(lineSeperator)
  print ('Stop servers for cluster: '+clusterName)
  #
  try:
    shutdown(clusterName,'Cluster')
  except WLSTException:
    print "Apparently Cluster in incompatible state!", sys.exc_info()[0], sys.exc_info()[1]
    state(clusterName,'Cluster')
    print ('Try to stop servers for cluster: '+clusterName+', one by one')
    servers=getClusterServers(clusterName)
    # Need to go to domainRuntime to get to the serverRuntimes.
    domainRuntime()
    #
    for server in servers:
      print(lineSeperator)
      serverName = server.getName()
      print('ServerName: '+serverName)
      serverState = serverStatus(serverName)
      print('Server '+serverName+': '+serverState)
      if serverState=="RUNNING":
        print ('Server '+serverName+' is running so shut it down.')
        shutdown(name=serverName, force='true')
      elif serverState=="SHUTDOWN":
        print ('Server '+serverName+' is already down.')
      else:
        print ('Server '+serverName+' in state '+serverState+', not stoppable!')
  #
  print ('\nFinished stopping servers.')
#
#
def main():
  try:
    wlsDomainHome = wlsDomainsHome+'/'+wlsDomainName
    print (lineSeperator)
    print ('Stop Osb Domain')
    print (lineSeperator)
    print('\nConnect to the AdminServer: '+adminServerName)
    connectToadminServer(adminUrl, adminServerName)
    print(lineSeperator)
    print('First stop servers from cluster '+osbClr)
    stopClusterServers(osbClr)
    #
    print(lineSeperator)
    print('\nShutdown the AdminServer: '+adminServerName)
    shutdown(force='true')
    print ('\nFinished stopping servers.')
    #
    print('\nExiting...')
    exit()
  except NameError, e:
    print('Apparently properties not set.')
    print "Please check the property: ", sys.exc_info()[0], sys.exc_info()[1]
    usage()
  except:
    apply(traceback.print_exception, sys.exc_info())
    exit(exitcode=1)
#call main()
main()
exit()

In the same way as when starting the servers, this script first tries to shut down the cluster as a whole. The AdminServer takes advantage of the NodeManager to stop the servers simultaneously. The NodeManager first tries to connect to the particular server to hand over a suicide pill. But if it can't reach the server directly, it releases it from suffering by doing an explicit OS-level kill of the process...
After the shutdown of the servers the AdminServer is shut down.
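The one-by-one fallback in stopClusterServers() boils down to a decision per server state. Here is a sketch of just that decision in plain Python: stop_action is a hypothetical helper, and the server names and states are sample data, not output from a real domain:

```python
def stop_action(server_state):
    """Map a server state to the action the fallback loop takes."""
    if server_state == 'RUNNING':
        return 'shutdown'
    if server_state == 'SHUTDOWN':
        return 'skip'
    return 'not stoppable'


# Sample states only; in the script they come from serverStatus().
states = {'osb_server1': 'RUNNING', 'osb_server2': 'SHUTDOWN', 'osb_server3': 'STARTING'}
plan = {}
for name in states:
    plan[name] = stop_action(states[name])
print(plan)
```

Keeping the decision in a small function like this makes it easy to extend for other states, should you want to handle those differently.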

Conclusion

This concludes this article on starting and stopping your domain. I hope it was enlightening and entertaining. One improvement would be to reuse the scripts for starting and stopping the AdminServer in the scripts that start and stop the domain; you can see that there is some code overlap. But the way I call them, with an explicit property file, prevents doing an execfile(): the loaded properties apparently aren't passed on to the script run by execfile(), and it needs a path to the other file. So that's why, for now, I combined the code into one script.
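A possible direction, offered as an untested idea rather than something used in these scripts, is to let shared code read the property file itself and pass the values around explicitly. Below is a minimal sketch in plain Python; parse_properties is a hypothetical helper that only handles simple key=value lines:

```python
def parse_properties(text):
    """Parse 'key=value' lines into a dictionary, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#') or '=' not in line:
            continue
        key, value = line.split('=', 1)
        props[key.strip()] = value.strip()
    return props


# Sample content only; the real fmw.properties holds more properties.
props = parse_properties('# sample\nadminUrl=localhost:7001\nadminServerName = AdminServer\n')
print(props['adminUrl'])
```

With the properties in a dictionary, a shared function could receive them as a parameter instead of depending on globals set by -loadProperties.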