Friday, December 17, 2004

Mass Copy (or not)

We have a number of 'Alert' scenarios where we want to send out user alerts, and also retain the body of a message (or part thereof) for resubmission, to replay the orchestration once appropriate remedial action has been taken. I thought we could be clever, and define an envelope schema that would contain details of the alert, in human-readable form, together with an <any> element that could contain the message, for resubmission. By defining the parent of this element as the Body XPath of the envelope, the alert message itself could be resubmitted, at which point the inner message would be stripped out and re-processed.

Alert
 |
 + - Message
 |
 + - Resubmit
      |
      + - <Any>

It was the first time I'd used the Mass Copy functoid, and I was annoyed to find that it copies the children and attributes of the node selected, rather than the node itself - meaning that you lose the root node name if it isn't already in the output schema. The XSL generated is thus:

<Resubmit>
  <xsl:copy-of select="./@*" />
  <xsl:copy-of select="./*" />
</Resubmit>

(where the <Resubmit> element is the parent of the <any>, and our Body XPath node.)

This didn't work for us, as we needed the name of the root element copied over, so I replaced the Mass Copy functoid with a Scripting functoid, selected Inline XSLT as the script type, and used the following:

<Resubmit>
  <xsl:copy-of select="." />
</Resubmit>
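
To make the difference concrete outside the mapper, here is a minimal Python sketch (not XSLT, and not anything BizTalk generates) that mimics the two behaviours using xml.etree.ElementTree. The function names mass_copy and copy_node, and the sample <Order> message, are made up for illustration.

```python
import copy
import xml.etree.ElementTree as ET


def mass_copy(source, wrapper_name):
    """Mimic copy-of "./@*" plus "./*": attributes and child elements only."""
    wrapper = ET.Element(wrapper_name, dict(source.attrib))
    wrapper.extend([copy.deepcopy(child) for child in source])
    return wrapper


def copy_node(source, wrapper_name):
    """Mimic copy-of ".": the selected node itself, root element name included."""
    wrapper = ET.Element(wrapper_name)
    wrapper.append(copy.deepcopy(source))
    return wrapper


# Hypothetical inner message that we would want to resubmit later.
order = ET.fromstring('<Order id="7"><Item>1</Item></Order>')

print(ET.tostring(mass_copy(order, "Resubmit"), encoding="unicode"))
print(ET.tostring(copy_node(order, "Resubmit"), encoding="unicode"))
```

The first line printed has lost the <Order> element name (its attribute and child end up directly under <Resubmit>), whereas the second keeps the whole <Order> node intact, which is what the resubmission scenario needs.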

I'm surprised that there isn't a property of the Mass Copy functoid that allows you to do this, as it seems relatively obvious (and easy to implement). Has anyone else had to do this?

Debatching follow-up

Following on from Stephen's debatching sample, I thought I'd mention a side-effect of envelope de-batching that can cause problems if it isn't appreciated in the design. One of the strengths of the envelope over xpath debatching is the fact that messages are subsequently processed in parallel, which can lead to performance gains. If an envelope containing 10 messages is sent through a debatching pipeline (e.g. XMLReceive), 10 messages are delivered to the messagebox, and 10 orchestration instances are created.

If the debatched messages are picked up by an orchestration, which in turn consumes a web service, all 10 requests are made in parallel. Which is fine if the envelope contained ten messages. Not so fine if it contains 1000. This approach can turn BizTalk into a very effective Denial-of-Service launch pad, as IIS on our (recovering) development server can testify.

We still favour this approach, but decided to mitigate the burst load by spacing the requests out over an arbitrary period of time. This is achieved by applying a random delay to the orchestration prior to the request, using a configurable period (stored in BTSNTSvc.exe.config), e.g. we want the 1000 requests to be spread out over 5 minutes, so the orchestration picks a random number between 0 and 300 (seconds), and applies the relevant delay.

This works extremely well, and has allowed us to fine-tune our implementation to fit the environment, combining the power of parallel message processing with a more even spread of the load.

A couple of additional points worth noting:
1. The messages are delivered in random order in this case, as with all envelope debatching scenarios.
2. System.Random's default constructor uses a time-based seed, which seems to rely on millisecond values, meaning that messages arriving within the same millisecond (does happen, apparently), each creating their own Random, will get the same delay. This can be mitigated by sharing a single Random instance via a singleton, if it's really necessary.
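
As a rough illustration of both the delay and the seeding gotcha, here is a small Python sketch (the real thing lives in an orchestration expression, so treat this as an analogy only). SPREAD_SECONDS, pick_delay and delays_from_identical_seeds are hypothetical names, and the identical seed is passed in explicitly to mimic two generators created within the same clock tick.

```python
import random

# Hypothetical setting; in the post the value is read from BTSNTSvc.exe.config.
SPREAD_SECONDS = 300

# One generator shared by every caller (the "singleton" idea).
_shared_rng = random.Random()


def pick_delay(spread_seconds=SPREAD_SECONDS, rng=_shared_rng):
    """Return a whole number of seconds between 0 and spread_seconds inclusive."""
    return rng.randint(0, spread_seconds)


def delays_from_identical_seeds(seed, count, spread_seconds=SPREAD_SECONDS):
    """Mimic instances that each build their own generator from the same seed."""
    return [random.Random(seed).randint(0, spread_seconds) for _ in range(count)]


# Shared generator: 1000 requests get a spread of delays across the window.
delays = [pick_delay() for _ in range(1000)]
print(min(delays), max(delays))

# Same seed per instance: every one of the ten delays is identical.
print(set(delays_from_identical_seeds(1234, 10)))
```

The point of the shared generator is simply that successive calls advance the same sequence, so two requests arriving together still draw different values.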

Consuming heterogeneous web services

When consuming a web service, you need to create a message that maps to the web service request. Most people's first instinct (having created the message variable) is to head for the Transform shape within their message Construct. Select a source message, select the request message as the map destination, and then get mapping.

Something that has repeatedly cropped up in the newsgroups is the request message not appearing in the drop-down of available messages in the Transform shape wizard. This usually happens because the available parts in the request message (i.e. the request parameters) are primitive types (string, int, etc.), not complex types. If this is the case you need to use the Message Assignment shape instead, and use the standard "." property notation within the shape expression editor:

MyRequestMsg.Param1 = "X";
MyRequestMsg.Param2 = 1;

So - primitive types use the Assignment shape, complex types use the Transform:

void MyMethod1(string param1); // primitive parts only - use the Assignment shape
void MyMethod2(MyType param1); // complex parts only - use the Transform shape

There is, however, a third, intermediate case, when the method signature contains both:

void MyMethod3(string param1, MyType param2);

In this case, you'll notice that both parts appear in the Assignment expression IntelliSense, suggesting that you need to use this shape, but you need to supply an object of type MyType as the second parameter, which is where it starts to go "off-road".

Turns out that MyType is defined within one of the Reference*.xsd schemas created when the web reference is added / updated.

In this mixed scenario, in addition to declaring a message of type MyMethod_request, you will need to declare an additional message of type MyType, by selecting it from the relevant Reference*.xsd (you may have to hunt around a bit if the web service is complicated.)

Within the message Construct shape, you can then use a Transform to create the MyType message with a map, and then an Assignment to assign a value to the string parameter, and the MyType message to the second.

(There may well be other ways to achieve this, but it works for me :-))

BizTalk 2006 roadmap available

Get it here.

Once you've had a look can I recommend that you all visit the Fiorano site, and download the following whitepaper - "Fiorano ESB Versus the Competition". You'll have to fill in your details in order to get it, and about an hour later their (insanely hyper) UK sales guy will call. Tell him you're a BizTalk expert, and then sit back and listen whilst he gives you his view of the BizTalk roadmap (it's dead in the water, the dodo of software.)

You might then like to point him in the direction of David Chappell's latest posting, which makes pretty grim reading for the likes of Fiorano.

Monday, December 13, 2004

Envelopes pt 3

Stephen Thomas has posted an extremely comprehensive guide to de-batching of collections, which should be compulsory reading for anyone attempting this. I'm very pleased to see that his hard work has confirmed my own gut-feeling, which suggested that XPath debatching would be more suitable for small collections, whilst pipeline envelope processing would be the most performant.

This is exactly what we've already gone with: XPath within orchestrations where batch sizes are no more than 10 records, and envelopes for the rest.

Re. the envelope-mapping issue, well we're lucky, as our source batches are collections returned as web service responses, so having to apply a map within the orchestration is a no-brainer.

Our current implementation, for large batches, is therefore:
1. Consume web service which returns large collection response.
2. Map response to an envelope schema within the orchestration.
3. Send the envelope message out through a passthru pipeline.
4. Receive the envelope through an Xml pipeline, et voila - you have individual messages.

(Remember - the specific issue that the map solves is the injection of single instance 'header' data into each record. The mapping / orchestration are not required for simple record de-batching.)

Send port filename placeholders ("macros")

I have no idea why they're called macros, and it explains why I could never find them in the documentation, but here's the updated list of %xyz% placeholders that you can use for filenames within FILE send ports.

(Original documentation link is ms-help://BTS_2004/Operations/htm/ebiz_ops_adapt_file_zdax.htm)

Bloggers Guide To BizTalk (December)

December edition of the Bloggers Guide is out now; thanks to Alan for doing all the work collating it (perhaps we should all offer to chip in and take turns to do it - or would that be claiming some of Alan's glory for ourselves?)
Good also to see Matt in it, as I believe he was parachuted into the gap I created when I left Conchango earlier in the year. Sorry about that Matt, but hopefully I'm now forgiven?

Something's up...

Lots of postings about a couple of announcements from the mothership today. First is MOOL, which is a subscription service that allows you to synch Hotmail with Outlook. This seems to explain why my Hotmail mail account in Outlook stopped working a couple of months ago!

The second is more of a mystery - could be a desktop search engine, could be some toolbar thing (aargh). I downloaded something called Lookout a few weeks ago, which is a desktop search tool that was bought recently by M$ - it didn't really work for me, and kept crashing, but its fans rave about it?

Sunday, December 12, 2004

Envelopes pt 2

Duncan Millard has posted some more info on the whole envelope issue here. He's done a lot more investigative work than I have, so keep an eye on his blog for more details.

RSS feed added

I've added an RSS feed to the Atom one that's already available, thanks to Feedburner.

Friday, December 10, 2004

Jawbreaker

It's not something I'd normally admit to, but I've become mildly addicted to the game Jawbreaker on my Orange mobile phone. Thought I'd mention this just to see if anyone can beat last night's record score of 676, including a single "Big Burst" of 552? I'm using the "Standard" style, though I'm not really sure what that means?

Pass-phrases

Interesting article on password use - something that vexes me on a daily basis.
http://blogs.msdn.com/robert_hensing/archive/2004/07/28/199610.aspx

Thursday, December 09, 2004

Envelopes, maps and pipelines

I was in the process of posting to one of the newsgroups this afternoon when I saw that someone else had beaten me to it, and posted on the exact same topic, Envelope schemas and mapping.

It's a fairly common scenario - a message needs to be split, with common 'header' information being included in each message. I had thought that it would be possible to do this in a pipeline, and I've found various posts referring to this situation (see end of post for references), none of which really answered the question.

To demonstrate this in action I'll use the classic customer / order example. I have a starting message which contains header and body information, and an orchestration that uses the individual order messages, with the customer data being inserted into each message, as shown below:

Original message:
<CustomerOrdersResponse>
 <Customer>...</Customer>
 <Orders>
  <Order>...</Order>
  <Order>...</Order>
 </Orders>
</CustomerOrdersResponse>

Messages as required by Orchestration:
<CustomerOrder>
 <Customer>...</Customer>
 <Order>...</Order>
</CustomerOrder>

I have defined three schemas to accommodate this - Customer, Order and CustomerOrder:
Customer:
<Customer>
 <Firstname>John</Firstname>
 <Lastname>Doe</Lastname>
 <Age>30</Age>
</Customer>

Order:
<Order>
 <Item>1</Item>
 <Qty>2</Qty>
</Order>

CustomerOrder:
<CustomerOrder>
 <Customer>...</Customer>
 <Order>...</Order>
</CustomerOrder>

The first thing to try is a simple splitting of the messages using an envelope schema. Create a new schema, set its Envelope property to true, and the Body XPath of the root node to the xpath that points to the root of the collection - in this case to the <Orders/> node. The sample xml message below shows the relationship between envelope and body parts:
Envelope
<CustomerOrdersResponse> (Set Body XPath property of this node to xpath of Orders node below)
 <Customer>...</Customer>
- - - - - - - - - - - - - - - - -
Body
 <Orders>
  <Order>...</Order>
  <Order>...</Order>
 </Orders>
- - - - - - - - - - - - - - - - -
</CustomerOrdersResponse>

When a message that conforms to an envelope is processed in an (XML) pipeline, the envelope is discarded, and the repeating body records are split out into separate messages. The message above would be converted into two <Order/> messages. So far, so simple.
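
For anyone who prefers to see the idea in code, here is a minimal Python sketch of what the split amounts to. It is only an analogy for the disassembler's behaviour, not how BizTalk implements it; the debatch function and the simple "Orders" path standing in for the Body XPath are my own assumptions.

```python
import xml.etree.ElementTree as ET


def debatch(envelope_xml, body_path):
    """Discard the envelope and return each child of the body node as its own message."""
    root = ET.fromstring(envelope_xml)
    body = root.find(body_path)
    if body is None:
        return []
    # Each child element of the body node becomes a separate message.
    return [ET.tostring(child, encoding="unicode").strip() for child in body]


envelope = (
    "<CustomerOrdersResponse>"
    "<Customer>John Doe</Customer>"
    "<Orders><Order>1</Order><Order>2</Order></Orders>"
    "</CustomerOrdersResponse>"
)

for message in debatch(envelope, "Orders"):
    print(message)
```

Note that the <Customer> element sits outside the body node, so it never makes it into the individual messages, which is precisely the problem described next.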

The problem now is how to get a CustomerOrder message out of the above, rather than the Order alone.

My first idea was to use a map to convert the original message to a collection of CustomerOrder messages, and redefine the envelope schema to use the new message:
<CustomerOrders>
 <CustomerOrder>
  <Customer>...</Customer>
  <Order>...</Order>
 </CustomerOrder>
 <CustomerOrder>
  <Customer>...</Customer>
  <Order>...</Order>
 </CustomerOrder>
</CustomerOrders>

I figured that if I put this map into a send pipeline, then I could send the original CustomerOrdersResponse message out through this pipeline, converting it into a CustomerOrders message, which would then be picked up by a receive location which would extract the individual CustomerOrder messages.

Which is where it all goes wrong.

If you try and do this, you'll get an exception thrown by the send pipeline saying that "Document type "CustomerOrder" does not match any of the given schemas." I looked around, and found that someone else had had a similar issue, and concluded that mapping to an Envelope schema is not possible within a pipeline. Which brings me back to where I started, as this is exactly the issue that Duncan Millard and I collided on whilst posting to microsoft.public.biztalk.general.

You can map to Envelope schemas within an orchestration, and it turns out that this is how we are getting around the problem - the original CustomerOrdersResponse message is mapped to CustomerOrders within an orchestration, sent out through a send pipeline (without further mapping), then picked up by a receive location that splits out the CustomerOrder messages.
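
The two-step approach (map first, split second) can be sketched in Python as follows. This is an illustration of the data flow only, under the assumption that the map simply repeats the <Customer> element inside each record; map_to_customer_orders and split_customer_orders are hypothetical names, and none of this is BizTalk code.

```python
import copy
import xml.etree.ElementTree as ET


def _clone(element):
    """Deep-copy an element and drop any trailing whitespace carried with it."""
    duplicate = copy.deepcopy(element)
    duplicate.tail = None
    return duplicate


def map_to_customer_orders(response_xml):
    """Step 1 (the orchestration map): inject the single Customer into every record."""
    response = ET.fromstring(response_xml)
    customer = response.find("Customer")
    if customer is None:
        raise ValueError("response has no Customer element")
    batch = ET.Element("CustomerOrders")
    for order in response.findall("Orders/Order"):
        record = ET.SubElement(batch, "CustomerOrder")
        record.append(_clone(customer))
        record.append(_clone(order))
    return batch


def split_customer_orders(batch):
    """Step 2 (the receive pipeline): one message per CustomerOrder record."""
    return [ET.tostring(record, encoding="unicode") for record in batch]


response = (
    "<CustomerOrdersResponse>"
    "<Customer>John Doe</Customer>"
    "<Orders><Order>1</Order><Order>2</Order></Orders>"
    "</CustomerOrdersResponse>"
)

for message in split_customer_orders(map_to_customer_orders(response)):
    print(message)
```

Each printed message is a self-contained <CustomerOrder> carrying its own copy of the customer data, which is the shape the consuming orchestration expects.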

There is an alternative method - using XPath within a loop / expression combination. This may well be better in certain situations, but it seems a shame not to use the inherent envelope functionality. I believe it may also be possible using some custom pipeline magic, and Stephen Thomas' blog posting (see below) has some interesting stuff on property promotion / demotion tricks, but none of it seems very simple.

Does anyone know of an easier way to do all of this in one pass?

References:

How to split an XML message in BizTalk 2004 using Document & Envelope Schemas (Jan Tielens)


BizTalk Server will split up your documents for you. (Scott Woodgate)


Property Promotion and Demotion in BizTalk 2004 (Stephen Thomas)


Looping around Message Elements (Darren Jefford)

Monday, December 06, 2004

XSD pt 2

I've recently been struggling with yet another xsd issue - importing schemas.

I've found that when importing schemas, multiple imports of the same 'base' schema cause duplicate type declarations, which VS.NET prevents.

e.g.
1. Define a datatype "MyDT" in a schema called CommonSchema.xsd
2. Create a second schema, which imports CommonSchema, called SchemaA.xsd.
3. Create a new schema, SchemaB.xsd, which itself imports CommonSchema.xsd.
4. Try to import SchemaA.xsd into SchemaB.xsd. VS.NET will throw an exception:

"1. The simpleType 'http://xyz.com/schemas/CommonSchema:MyDT' has already been declared. An error occurred at ..."

This makes some sort of sense, but I think VS.NET should be able to resolve this issue?

Interestingly, if I do the same operation in XMLSpy, it can validate the schema without errors - which suggests that this import scenario is valid within an XSD, just not within BizTalk schemas?

Has anyone come across the same issue, and if so, did you manage to solve it? (I've tried playing with RootReferences, namespaces etc., with no luck.)

Blogger

I don't know about anyone else using Blogger, but I find it extremely unreliable (albeit free, so I shouldn't really complain.)

I often compose postings only to click on Publish, and find that my session has timed out in the background (?), and my posting (some of which can be quite long) is fired into a black hole, never to return. A particular problem is the Ctrl+S save option, which seems to do nothing else but consume my nascent wisdom.

The only solution seems to be to write postings in notepad, then copy and paste into the create box until it finally agrees to publish?