In this blog I will describe how to translate a native-format input stream (e.g. one received by the file adapter), containing multiple instances that vary in size and content, into an XML payload.
I will do this in three steps. First I will give a sample data stream and explain its structure. Then I will show and explain the required schema definition, and I will end with the resulting XML payload.
As a start I will give an example of the received data stream:
10099990001Name Change 2009999000120141116 30099990001Marcel van de Glind 35099990001Marcel van der Glind @@@10099990002Address Change 2009999000220141116 30199990002Dorpstraat 6 3720AB35199990002Dorpsplein 6 3720BC@@@10099990003Name/Address Change 2009999000320141115 30099990003Marcel van de Glind 30199990003Dorpstraat 6 3720AB35099990003Marcel van der Glind 35199990003Dorpsplein 6 3720BC@@@10099990004Historical details 20099990004 2014111530099990004Marcel van de Glind 30199990004Dorpstraat 6 3720AB30199990004Around the corner AB200030199990004Big City AB201435099990004Marcel van der Glind 35199990004Dorpsplein 6 3720BC35199990004At the corner AB200135199990004Small Town AB2013@@@
The following definition is used for this data stream:
- Mutations are separated by @@@.
- A mutation consists of multiple records (with a minimum of 2 and a maximum of, let's say, 20).
- The records in a mutation can be of different types.
- Records of the same type have the same length.
- Records of different types can have different lengths.
- The first three characters of a record indicate the record type.
- Records and record fields have a fixed-length format.
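As an illustration of these rules, the first record in the sample stream breaks down as follows:

```
Record:  10099990001Name Change
         ├─ 100          record type   (characters 1-3)
         ├─ 99990001     Mutation ID   (characters 4-11)
         └─ Name Change  record label  (characters 12-31, right-padded with spaces)
```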
The following table shows the record definition of the different record types:

| Record type | Record definition | Length |
|---|---|---|
| 100 | Mutation ID | 8 numbers |
| | Record label | 20 characters |
| 200 | Mutation ID | 8 numbers |
| | Start date | 8 numbers |
| | End date | 8 numbers |
| 300 | Mutation ID | 8 numbers |
| | First name | 20 characters |
| | Last name | 40 characters |
| 301 | Mutation ID | 8 numbers |
| | Address | 30 characters |
| | Postal Code | 6 characters |
| 350 | Mutation ID | 8 numbers |
| | Historical First name | 20 characters |
| | Historical Last name | 40 characters |
| 351 | Mutation ID | 8 numbers |
| | Historical Address | 30 characters |
| | Historical Postal Code | 6 characters |
Based on this information the sample input stream can be divided into the following four mutations.

Mutation 1:
10099990001Name Change
2009999000120141116
30099990001Marcel van de Glind
35099990001Marcel van der Glind

Mutation 2:
10099990002Address Change
2009999000220141116
30199990002Dorpstraat 6 3720AB
35199990002Dorpsplein 6 3720BC

Mutation 3:
10099990003Name/Address Change
2009999000320141115
30099990003Marcel van de Glind
30199990003Dorpstraat 6 3720AB
35099990003Marcel van der Glind
35199990003Dorpsplein 6 3720BC

Mutation 4:
10099990004Historical details
20099990004 20141115
30099990004Marcel van de Glind
30199990004Dorpstraat 6 3720AB
30199990004Around the corner AB2000
30199990004Big City AB2014
35099990004Marcel van der Glind
35199990004Dorpsplein 6 3720BC
35199990004At the corner AB2001
35199990004Small Town AB2013
Comments: the first mutation consists of a 100, a 200, a 300 and a 350 record. The first record in the mutation, the 100 record, has mutation ID “99990001” and record label “Name Change”. The content of all the other records can be determined in a similar way.
Now that the sample data stream has been explained, I will give the schema definition that translates the data stream into an XML payload.
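A minimal sketch of such an nXSD schema definition could look like this. The target namespace and the element and type names are illustrative assumptions on my part; the `nxsd` attributes are Oracle's native-format annotations:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:nxsd="http://xmlns.oracle.com/pcbpel/nxsd"
            xmlns:tns="http://example.com/ns/mutations"
            targetNamespace="http://example.com/ns/mutations"
            elementFormDefault="qualified"
            nxsd:version="NXSD"
            nxsd:stream="chars"
            nxsd:uniqueMessageSeparator="@@@">

  <xsd:element name="Mutation">
    <xsd:complexType>
      <!-- The first three characters of each record decide which
           branch of the choice is taken; maxOccurs caps every
           mutation at 10 records. -->
      <xsd:choice minOccurs="1" maxOccurs="10"
                  nxsd:choiceCondition="fixedLength" nxsd:length="3">
        <xsd:element name="Record100" type="tns:Record100Type"
                     nxsd:conditionValue="100"/>
        <xsd:element name="Record200" type="tns:Record200Type"
                     nxsd:conditionValue="200"/>
        <xsd:element name="Record300" type="tns:Record300Type"
                     nxsd:conditionValue="300"/>
        <xsd:element name="Record301" type="tns:Record301Type"
                     nxsd:conditionValue="301"/>
        <xsd:element name="Record350" type="tns:Record350Type"
                     nxsd:conditionValue="350"/>
        <xsd:element name="Record351" type="tns:Record351Type"
                     nxsd:conditionValue="351"/>
        <!-- Zero-length filler record, matched on the mutation
             separator '@@@'; it pads a mutation up to 10 records. -->
        <xsd:element name="Filler" type="xsd:string"
                     nxsd:conditionValue="@@@"
                     nxsd:style="fixedLength" nxsd:length="0"/>
      </xsd:choice>
    </xsd:complexType>
  </xsd:element>

  <!-- One complexType per record type; Record100 shown here, the
       others follow the same pattern with the lengths from the table. -->
  <xsd:complexType name="Record100Type">
    <xsd:sequence>
      <xsd:element name="MutationId" type="xsd:string"
                   nxsd:style="fixedLength" nxsd:length="8"/>
      <xsd:element name="RecordLabel" type="xsd:string"
                   nxsd:style="fixedLength" nxsd:length="20"/>
    </xsd:sequence>
  </xsd:complexType>

</xsd:schema>
```

Whether the three identifying characters are consumed by the choice condition or have to be modelled as a separate field in each record type depends on the adapter version; the sketch assumes they are consumed by the condition.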
There are a number of interesting things in this schema definition.
- First you can see the uniqueMessageSeparator, specified as ‘@@@’ in the schema definition. This separator marks the boundary between the mutations in the data stream, which will eventually result in the creation of multiple instances in the SOA/BPM Suite. Without this separator only one instance would be created (with the whole data stream as its payload).
- Next we have the choice operation. This operation contains six parts, one for each record type.
Without going into all the details: the choice checks the first three characters of a record to determine the record type. The maxOccurs setting means in this case that every mutation always consists of 10 records: a number of ‘real’ records, filled up with filler records.
- That brings me to the last interesting line: the filler. This is the record identified by ‘@@@’. Notice that this is also the marker for the end of a mutation. The filler is a zero-length record, which means that the data stream pointer stays at the same position. As a consequence the next record is still identified by ‘@@@’. Without the maxOccurs setting we would have an infinite loop; now the loop ends after 10 iterations.
The definition of the other subtypes is specified below:
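As an illustration, a sketch of one such subtype for the 300 record, using assumed element names; the other record types follow the same pattern with the field lengths from the table above:

```xml
<xsd:complexType name="Record300Type">
  <xsd:sequence>
    <xsd:element name="MutationId" type="xsd:string"
                 nxsd:style="fixedLength" nxsd:length="8"/>
    <xsd:element name="FirstName" type="xsd:string"
                 nxsd:style="fixedLength" nxsd:length="20"/>
    <xsd:element name="LastName" type="xsd:string"
                 nxsd:style="fixedLength" nxsd:length="40"/>
  </xsd:sequence>
</xsd:complexType>
```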
Below is the resulting payload for the third mutation. As you can see, it contains six normal records (all of different types) and four filler records.
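Assuming illustrative element names, the payload for the third mutation could look roughly like this. The values are shown trimmed; the exact split between first and last name depends on the fixed-length padding in the original stream:

```xml
<Mutation>
  <Record100>
    <MutationId>99990003</MutationId>
    <RecordLabel>Name/Address Change</RecordLabel>
  </Record100>
  <Record200>
    <MutationId>99990003</MutationId>
    <StartDate>20141115</StartDate>
    <EndDate></EndDate>
  </Record200>
  <Record300>
    <MutationId>99990003</MutationId>
    <FirstName>Marcel</FirstName>
    <LastName>van de Glind</LastName>
  </Record300>
  <Record301>
    <MutationId>99990003</MutationId>
    <Address>Dorpstraat 6</Address>
    <PostalCode>3720AB</PostalCode>
  </Record301>
  <Record350>
    <MutationId>99990003</MutationId>
    <HistoricalFirstName>Marcel</HistoricalFirstName>
    <HistoricalLastName>van der Glind</HistoricalLastName>
  </Record350>
  <Record351>
    <MutationId>99990003</MutationId>
    <HistoricalAddress>Dorpsplein 6</HistoricalAddress>
    <HistoricalPostalCode>3720BC</HistoricalPostalCode>
  </Record351>
  <Filler/>
  <Filler/>
  <Filler/>
  <Filler/>
</Mutation>
```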