Friday, December 17, 2004

Mass Copy (or not)

We have a number of 'Alert' scenarios where we want to send out user alerts, and also retain the body of a message (or part thereof) for resubmission, to replay the orchestration once appropriate remedial action has been taken. I thought we could be clever, and define an envelope schema that would contain details of the alert, in human-readable form, together with an <any> element that could contain the message, for resubmission. By defining the parent of this element as the BodyXPath of the envelope, the alert message itself could be resubmitted, at which point the inner message would be stripped out and re-processed.

Alert
 |
 + - Message
 |
 + - Resubmit
      |
      + - <Any>

It was the first time I'd used the Mass Copy functoid, and I was annoyed to find that it copies the children and attributes of the selected node, rather than the node itself - meaning that you lose the root node name if it isn't already in the output schema. The XSLT generated is thus:

<Resubmit>
  <xsl:copy-of select="./@*" />
  <xsl:copy-of select="./*" />
</Resubmit>

(where the <Resubmit> element is the parent of the <any>, and our Body XPath node.)

This didn't work for us, as we needed the name of the root element copied over, so I replaced the Mass Copy functoid with a Scripting functoid, selected Inline XSLT as the script type, and used the following:

<Resubmit>
  <xsl:copy-of select="." />
</Resubmit>

I'm surprised that there isn't a property of the Mass Copy functoid that allows you to do this, as it seems relatively obvious (and easy to implement). Has anyone else had to do this?

Debatching follow-up

Following on from Stephen's debatching sample, I thought I'd mention a side-effect of envelope de-batching that can cause problems if it isn't appreciated in the design. One of the strengths of the envelope over xpath debatching is the fact that messages are subsequently processed in parallel, which can lead to performance gains. If an envelope containing 10 messages is sent through a debatching pipeline (e.g. XMLReceive), 10 messages are delivered to the messagebox, and 10 orchestration instances are created.

If the debatched messages are picked up by an orchestration, which in turn consumes a web service, all 10 requests are made in parallel. Which is fine if the envelope contained ten messages. Not so fine if it contains 1000. This approach can turn BizTalk into a very effective Denial-of-Service launch pad, as IIS on our (recovering) development server can testify.

We still favour this approach, but decided to mitigate the burst load by spacing the requests out over an arbitrary period of time. This is achieved by applying a random delay to the orchestration prior to the request, using a configurable period (stored in BTSNTSvc.exe.config). For example, if we want the 1000 requests spread out over 5 minutes, the orchestration picks a random number between 0 and 300 (seconds) and applies the relevant delay.
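For illustration, a minimal sketch of the Expression shape logic (XLANG/s expressions are C#-like; the variable declarations live in the orchestration, and the "DelayWindowSeconds" appSettings key is our own convention, not a BizTalk one):

// Read the spread window from BTSNTSvc.exe.config, then pick a random delay within it.
maxSeconds = System.Convert.ToInt32(
    System.Configuration.ConfigurationSettings.AppSettings.Get("DelayWindowSeconds"));
rand = new System.Random();
delay = new System.TimeSpan(0, 0, rand.Next(0, maxSeconds));
// The Delay shape that follows is configured to use the 'delay' variable.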

This works extremely well, and has allowed us to fine-tune our implementation to fit the environment, combining the power of parallel message processing with a more even spread of the load.

A couple of additional points worth noting:
1. The messages are delivered in random order in this case, as with all envelope debatching scenarios.
2. System.Random uses a time-based seed, which seems to rely on millisecond values, meaning that messages arriving within the same millisecond (which does happen, apparently) will get the same delay. This can be mitigated by implementing a singleton, if it's really necessary - a sketch follows below.
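A minimal sketch of the singleton idea, assuming a hypothetical helper class in an external assembly, called from the Expression shape in place of new System.Random():

// One shared Random instance: consecutive callers get different values even
// when they arrive within the same millisecond.
public sealed class DelayGenerator
{
    private static readonly System.Random random = new System.Random();
    private static readonly object padlock = new object();

    public static int NextDelaySeconds(int maxSeconds)
    {
        lock (padlock) // System.Random isn't thread-safe
        {
            return random.Next(0, maxSeconds);
        }
    }
}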

Consuming heterogeneous web services

When consuming a web service, you need to create a message that maps to the web service request. Most people's first instinct (having created the message variable) is to head for the Transform shape within their message Construct. Select a source message, select the request message as the map destination, and then get mapping.

Something that has repeatedly cropped up in the newsgroups is the request message not appearing in the drop-down of available messages in the Transform shape wizard. This usually happens because the available parts in the request message (i.e. the request parameters) are primitive types (string, int, etc.), not complex types. If this is the case you need to use the Message Assignment shape instead, and use the standard "." property notation within the shape expression editor:

MyRequestMsg.Param1 = "X";
MyRequestMsg.Param2 = 1;

So - primitive types use the Assignment shape, complex types use the Transform:

void MyMethod1(string param1); // primitive parts only - use the Assignment shape
void MyMethod2(MyType param1); // complex parts only - use the Transform shape

There is, however, a third, intermediate case, when the method signature contains both:

void MyMethod3(string param1, MyType param2);

In this case, you'll notice that both parts appear in the Assignment expression IntelliSense, suggesting that you need to use this shape, but you need to supply an object of type MyType as the second parameter, which is where it starts to go "off-road".

Turns out that MyType is defined within one of the Reference*.xsd schemas created when the web reference is added / updated.

In this mixed scenario, in addition to declaring a message of type MyMethod_request, you will need to declare an additional message of type MyType, by selecting it from the relevant Reference*.xsd (you may have to hunt around a bit if the web service is complicated.)

Within the message Construct shape, you can then use a Transform to create the MyType message with a map, followed by an Assignment that assigns a value to the string parameter and the MyType message to the second.
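By way of illustration, the Assignment shape then amounts to something like this (message and parameter names are hypothetical):

// Message Assignment shape - MyTypeMsg was built by the preceding Transform
MyRequestMsg.param1 = "X";        // primitive part, assigned directly
MyRequestMsg.param2 = MyTypeMsg;  // complex part, assigned from the mapped message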

(There may well be other ways to achieve this, but it works for me :-))

BizTalk 2006 roadmap available

Get it here.

Once you've had a look can I recommend that you all visit the Fiorano site, and download the following whitepaper - "Fiorano ESB Versus the Competition". You'll have to fill in your details in order to get it, and about an hour later their (insanely hyper) UK sales guy will call. Tell him you're a BizTalk expert, and then sit back and listen whilst he gives you his view of the BizTalk roadmap (it's dead in the water, the dodo of software.)

You might then like to point him in the direction of David Chappell's latest posting, which makes pretty grim reading for the likes of Fiorano.

Monday, December 13, 2004

Envelopes pt 3

Stephen Thomas has posted an extremely comprehensive guide to de-batching of collections, which should be compulsory reading for anyone attempting this. I'm very pleased to see that his hard work has confirmed my own gut feeling, which suggested that XPath debatching would be more suitable for small collections, whilst pipeline envelope processing would be the most performant.

This is exactly what we've gone with: XPath within orchestrations where batch sizes are no more than 10 records, and envelopes for the rest.

Re. the envelope-mapping issue, well we're lucky, as our source batches are collections returned as web service responses, so having to apply a map within the orchestration is a no-brainer.

Our current implementation, for large batches, is therefore:
1. Consume web service which returns large collection response.
2. Map response to an envelope schema within the orchestration.
3. Send the envelope message out through a passthru pipeline.
4. Receive the envelope through an Xml pipeline, et voila - you have individual messages.

(Remember - the specific issue that the map solves is the injection of single instance 'header' data into each record. The mapping / orchestration are not required for simple record de-batching.)

Send port filename placeholders ("macros")

I have no idea why they're called macros, and it explains why I could never find them in the documentation, but here's the updated list of %xyz% placeholders that you can use for filenames within FILE send ports.

(Original documentation link is ms-help://BTS_2004/Operations/htm/ebiz_ops_adapt_file_zdax.htm)

Bloggers Guide To BizTalk (December)

December edition of the Bloggers Guide is out now; thanks to Alan for doing all the work collating it (perhaps we should all offer to chip in and take turns to do it - or would that be claiming some of Alan's glory for ourselves?)
Good also to see Matt in it, as I believe he was parachuted into the gap I created when I left Conchango earlier in the year. Sorry about that Matt, but hopefully I'm now forgiven?

Something's up...

Lots of postings about a couple of announcements from the mothership today. First is MOOL, a subscription service that allows you to synch Hotmail with Outlook. This seems to explain why my Hotmail account in Outlook stopped working a couple of months ago!

The second is more of a mystery - could be a desktop search engine, could be some toolbar thing (aargh). I downloaded something called Lookout a few weeks ago, which is a desktop search tool that was recently bought by M$ - it didn't really work for me, and kept crashing, but its fans rave about it?

Sunday, December 12, 2004

Envelopes pt 2

Duncan Millard has posted some more info on the whole envelope issue here. He's done a lot more investigative work than I have, so keep an eye on his blog for more details.

RSS feed added

I've added an RSS feed to the Atom one that's already available, thanks to FeedBurner.

Friday, December 10, 2004

Jawbreaker

It's not something I'd normally admit to, but I've become mildly addicted to the game Jawbreaker on my Orange mobile phone. Thought I'd mention this just to see if anyone can beat last night's record score of 676, including a single "Big Burst" of 552? I'm using the "Standard" style, though I'm not really sure what that means?

Pass-phrases

Interesting article on password use - something that vexes me on a daily basis.
http://blogs.msdn.com/robert_hensing/archive/2004/07/28/199610.aspx

Thursday, December 09, 2004

Envelopes, maps and pipelines

I was in the process of posting to one of the newsgroups this afternoon when I saw that someone else had beaten me to it, and posted on the exact same topic, Envelope schemas and mapping.

It's a fairly common scenario - a message needs to be split, with common 'header' information being included in each message. I had thought that it would be possible to do this in a pipeline, and I've found various posts referring to this situation (see end of post for references), none of which really answered the question.

To demonstrate this in action I'll use the classic customer / order example. I have a starting message which contains header and body information, and an orchestration that uses the individual order messages, with the customer data being inserted into each message, as shown below:

Original message:
<CustomerOrdersResponse>
 <Customer>...</Customer>
 <Orders>
  <Order>...</Order>
  <Order>...</Order>
 </Orders>
</CustomerOrdersResponse>

Messages as required by Orchestration:
<CustomerOrder>
 <Customer>...</Customer>
 <Order>...</Order>
</CustomerOrder>

I have defined three schemas to accommodate this - Customer, Order and CustomerOrder:
Customer:
<Customer>
 <Firstname>John</Firstname>
 <Lastname>Doe</Lastname>
 <Age>30</Age>
</Customer>

Order:
<Order>
 <Item>1</Item>
 <Qty>2</Qty>
</Order>

CustomerOrder:
<CustomerOrder>
 <Customer>...</Customer>
 <Order>...</Order>
</CustomerOrder>

The first thing to try is a simple splitting of the messages using an envelope schema. Create a new schema, set its Envelope property to true, and set the Body XPath of the root node to the XPath that points to the root of the collection - in this case the <Orders/> node. The sample xml message below shows the relationship between envelope and body parts:
Envelope
<CustomerOrdersResponse> (Set Body XPath property of this node to xpath of Orders node below)
 <Customer>...</Customer>
- - - - - - - - - - - - - - - - -
Body
 <Orders>
  <Order>...</Order>
  <Order>...</Order>
 </Orders>
- - - - - - - - - - - - - - - - -
</CustomerOrdersResponse>

When a message that conforms to an envelope is processed in an (XML) pipeline, the envelope is discarded, and the repeating body records are split out into separate messages. The message above would be converted into two <Order/> messages. So far, so simple.

The problem now is how to get a CustomerOrder message out of the above, rather than the Order alone.

My first idea was to use a map to convert the original message to a collection of CustomerOrder messages, and redefine the envelope schema to use the new message:
<CustomerOrders>
 <CustomerOrder>
  <Customer>...</Customer>
  <Order>...</Order>
 </CustomerOrder>
 <CustomerOrder>
  <Customer>...</Customer>
  <Order>...</Order>
 </CustomerOrder>
</CustomerOrders>

I figured that if I put this map into a send pipeline, then I could send the original CustomerOrdersResponse message out through this pipeline, converting it into a CustomerOrders message, which would then be picked up by a receive location which would extract the individual CustomerOrder messages.

Which is where it all goes wrong.

If you try and do this, you'll get an exception thrown by the send pipeline saying that "Document type "CustomerOrder" does not match any of the given schemas." I looked around, and found that someone else had had a similar issue, and concluded that mapping to an Envelope schema is not possible within a pipeline. Which brings me back to where I started, as this is exactly the issue that Duncan Millard and I collided on whilst posting to microsoft.public.biztalk.general.

You can map to Envelope schemas within an orchestration, and it turns out that this is how we are getting around the problem - the original CustomerOrdersResponse message is mapped to CustomerOrders within an orchestration, sent out through a send pipeline (without further mapping), then picked up by a receive location that splits out the CustomerOrder messages.

There is an alternative method - using XPath within a loop / expression combination. This may well be better in certain situations, but it seems a shame not to use the inherent envelope functionality. I believe it may also be possible using some custom pipeline magic, and Stephen Thomas' blog posting (see below) has some interesting stuff on property promotion / demotion tricks, but none of it seems very simple.
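As a rough sketch of the XPath alternative, using the CustomerOrdersResponse example above (message and variable names are hypothetical, and their declarations are omitted):

// Expression shape: count the Order records in the received message
orderCount = System.Convert.ToInt32(xpath(InMsg,
    "count(/*[local-name()='CustomerOrdersResponse']/*[local-name()='Orders']/*[local-name()='Order'])"));
index = 1;

// Loop shape condition: index <= orderCount
// Message Assignment shape inside the loop - extract the nth Order:
OrderMsg = xpath(InMsg,
    "/*[local-name()='CustomerOrdersResponse']/*[local-name()='Orders']/*[local-name()='Order'][" +
    System.Convert.ToString(index) + "]");
index = index + 1;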

Does anyone know of an easier way to do all of this in one pass?

References:

How to split an XML message in BizTalk 2004 using Document & Envelope Schemas (Jan Tielens)
BizTalk Server will split up your documents for you. (Scott Woodgate)
Property Promotion and Demotion in BizTalk 2004 (Stephen Thomas)
Looping around Message Elements (Darren Jefford)

Monday, December 06, 2004

XSD pt 2

I've recently been struggling with yet another xsd issue - importing schemas.

I've found that when importing schemas, multiple imports of the same 'base' schema cause duplicate type declarations, which VS.NET prevents.

e.g.
1. Define a datatype "MyDT" in a schema called CommonSchema.xsd
2. Create a second schema, which imports CommonSchema, called SchemaA.xsd.
3. Create a new schema, SchemaB.xsd, which itself imports CommonSchema.xsd.
4. Try to import SchemaA.xsd into SchemaB.xsd. VS.NET will throw an exception:

"1. The simpleType 'http://xyz.com/schemas/CommonSchema:MyDT' has already been declared. An error occurred at ..."

This makes some sort of sense, but I think VS.NET should be able to resolve this issue?

Interestingly, if I do the same operation in XMLSpy, it can validate the schema without errors - which suggests that this import scenario is valid within an XSD, just not within BizTalk schemas?

Has anyone come across the same issue, and if so, did you manage to solve it? (I've tried playing with RootReferences, namespaces etc., with no luck.)

Blogger

I don't know about anyone else using Blogger, but I find it extremely unreliable (albeit free, so I shouldn't really complain.)

I often compose postings only to click on Publish, and find that my session has timed out in the background (?), and my posting (some of which can be quite long) is fired into a black hole, never to return. A particular problem is the Ctrl+S save option, which seems to do nothing else but consume my nascent wisdom.

The only solution seems to be to write postings in notepad, then copy and paste into the create box until it finally agrees to publish?

Wednesday, November 24, 2004

XSD.exe and namespaces

I've spent many hours this afternoon tracking down a schema issue - the classic "The part or fragment may not exist in the database." exception that means that a message cannot be resolved to a known schema. This is usually caused by having deployed two identical schema, or by not having deployed the schema at all.

A common fix is to undeploy and redeploy all schemas. This wouldn't work in my case, and I even went as far as reconfiguring BizTalk (and deleting all the databases).

Unfortunately this still didn't fix the issue, which was really confusing, as I only had two assemblies deployed, so it shouldn't have been too hard to work through all the available schemas!

I eventually tracked it down to a local web service that I was calling from within an orchestration. I'd used the BizTalk xsd schema and the xsd.exe tool to create the .cs file containing all of the classes behind my test web service. The web service worked fine through WebServiceStudio, so I knew that it was working.

The problem is that xsd.exe applies the same schema target namespaces to the System.Xml.Serialization.Xml*Attribute attributes, which meant that I had a namespace clash with the schemas imported via Reference.xsd in the orchestration.
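For example, the generated classes carry attributes along these lines (the namespace value here is hypothetical - the point is that it's copied straight from the schema's targetNamespace):

// Generated by xsd.exe - note the Namespace argument, which will clash if the
// same target namespace is already deployed via the orchestration's Reference.xsd
[System.Xml.Serialization.XmlTypeAttribute(Namespace="http://myorg.com/schemas/customer")]
public class Customer
{
    public string Firstname;
    public string Lastname;
    public int Age;
}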

So - when using xsd.exe to create classes, always check the namespace attributes.

Tuesday, November 23, 2004

Convoy processing

Those of you who subscribe to the various microsoft.public.biztalk.* newsgroups may have seen a number of posts from myself re. convoy processing, and the perils of correlation set subscriptions. The end result of all of this has been quite dramatic, in terms of our understanding of convoys, and has resulted in us removing them from our design. This is obviously quite a serious step, and so I thought it might be worth passing on some of our recent experience.

First, a quick recap on the messaging architecture. The pub-sub model is described pretty well here, amongst others, so I won't repeat it; suffice to say that messages are delivered to the messagebox through receive locations, and then matched to subscriptions using a combination of message type (as defined by the schema), and any applied filters. Subscribers include send ports (for content-based routing), orchestrations, and orchestration instances (in the case of correlation, and therefore convoys.)

If a subscription exists, then any matching message that arrives in the messagebox will be consumed.

So - how are subscriptions created, and when? The easiest way to check this is to use the BTSSubscriptionViewer utility, found in the SDK\Utilities directory. Using this you can see that in the case of send ports and orchestrations, the subscriptions exist all the time that the artefact is enlisted. For correlation sets things are a little more complex, and this is where it started to unravel for us.

Correlation subscriptions are created when the correlation set is initialised - through a Receive or Send shape within an orchestration. In a common sequential convoy scenario, the set is initialised, and then a Listen shape is used to pick up the correlated messages (see Alan Smith's sample here.) Without thinking about it, it's easy to assume that before the Listen shape is reached, messages will be discarded, regardless of correlation matches - however this is not so! Once the subscription is created, messages will be consumed by the orchestration regardless of the Listen shape.

There are two scenarios that we have come across where this causes problems:

1. When there is a delay of some sort after the initialisation, during which new messages might reasonably be expected to be discarded, and not consumed.
2. When receiving correlated messages at a faster rate than the orchestration is capable of processing them.

We hit the first scenario when using an orchestration to manage a publication schedule. We were receiving an initial message, sleeping until a given date, then sending the first message on and entering a Listen-Loop, which was being used to hoover up updates to the initial message. We found that updates received during the initial delay were being published rather than discarded. There is a workaround for this: use a fake Send shape just before the Listen-Loop to initialise the correlation at the last minute. It's ugly, but it works.

The second scenario is much more serious, and has proved a show-stopper for us. Consider the situation where an orchestration is not only using a convoy to batch up messages, but processing the messages as well. As in Alan's sample, the batch limits ("completeness conditions") are set by one of two parameters - a batch size and a timeout value. Either the number of messages processed reaches the set limit, at which point the batch is delivered and the orchestration dies, or there is a sufficiently long delay between individual incoming messages for the orchestration to deliver the batch as it currently stands and then die. (e.g. if the convoy picks up 10 messages, output them; if it picks up 5 messages and then sits for 10 minutes waiting for the next one, output the batch of 5 only.)
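Schematically, the convoy loop in such an orchestration looks something like this (an outline only - the shapes are configured in the designer, and the names are ours):

// Loop shape condition: msgCount < batchSize
// Inside the loop, a Listen shape with two branches:
//   Receive branch (correlated): process the message; msgCount = msgCount + 1;
//   Delay branch (timeout):      exit the loop and deliver the partial batch
// After the loop the batch is sent and the orchestration ends - and, crucially,
// its correlation subscription dies with it.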

In our test orchestration we had the following setup:

- A receive location delivering messages at a rate of 2/second.
- An orchestration picking up the messages in a convoy, and processing each message.
- The processing of each message takes 2 seconds (simulated using a delay.)
- A batch limit of 10 messages, output to a flat file, after which the orchestration dies.

We then sent 100 messages in to the receive location, expecting to see 10 flat files appear.

What we actually saw was 3 files appearing, with no sign of the missing messages. The explanation appears to be as follows:

The correlation set is initialised when the first message is received, at which point a subscription is created for all further messages (all 100 messages matched the correlation.)
The orchestration takes 20 seconds to process the ten messages it requires. However, as the messages are being received at a rate of 2/sec, 40 messages have been delivered to the messagebox in this time, and all of them match the subscription of the correlation set. They are therefore consumed, and not discarded (yet); neither is a new orchestration instance created. This means that a net 30 messages are consumed but NOT processed. At the end of the 20 seconds the orchestration dies, and the outstanding 30 messages are discarded.

If you then look at the services report in HAT, you should see that the orchestration is marked as "Completed with discarded messages". The missing messages should be visible in the messages report, again with the status "Suspended", "Completed with discarded messages". You could, of course, save these messages and manually resubmit them, but obviously in a production environment this is not an option.

The lesson from all of this seems to be that you should always think of convoys in light of the subscriptions that they use to consume messages, and understand when these subscriptions are created, and what might happen to the messages that fall in between the gaps.

Caveat convoy, as they say.

UPDATE: see this for a very informative posting on the background to this problem.

MSMQT configuration and testing

I've been struggling along with MSMQT this morning, trying to get to grips with its installation and configuration, and how to get messages delivered through an MSMQT adapter. A couple of pointers for others who are trying this for the first time, and having problems:

1. Once MSMQT is installed (add a new adapter through the Admin console - see the documentation for details), the easiest way to test it, and the only way when developing on a single machine, is to set up an MSMQT send adapter to point to an MSMQT receive location, as below:

a.) Set up a receive location that uses the FILE adapter.
b.) Set up a send port that subscribes to the above port (using BTS.ReceivePortName filter) to consume the file, and uses the MSMQT send adapter to send it to a queue.
c.) Set up a second receive location (in a different receive port - if it's in the same one as above you'll get a recursive message loop), that uses MSMQT to receive the message sent by b.) above.
d.) Set up a second send port that subscribes to the MSMQT receive location, and outputs the message to a file.

Stick a file into a.) and it should pop out of d.)

This demonstrates FILE -> MSMQT -> FILE transfer of a message, and confirms that MSMQT is correctly installed.

The more interesting issue is how to send a message to the receive queue using the System.Messaging API, which is what you'll want to be doing most of the time.

The easiest way to do this is to use the SDK\Samples\Adapters\SendMSMQMessage sample. Just open the solution and run it as is.

The gotcha is that you can't use this sample from the same machine - it must be running on a separate machine, that has MSMQ (not MSMQT) installed.
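Under the covers the sample amounts to something like the following (a sketch: the queue path is hypothetical - MSMQT queues are addressed by DIRECT format name, so adjust the server and queue names to match your receive location):

using System.IO;
using System.Messaging;

class SendToMsmqt
{
    static void Main()
    {
        // Open the MSMQT queue by format name and post a test message to it
        MessageQueue queue = new MessageQueue("FormatName:DIRECT=OS:btsserver\\myqueue");
        Message msg = new Message();
        msg.BodyStream = new FileStream(@"C:\temp\test.xml", FileMode.Open);
        queue.Send(msg, "test message");
        queue.Close();
    }
}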

Plaxwoe

Well, the experiment has come to an end, and it's time to report back. I never got round the duplication issue, and ended up with several contacts / appointments duplicated numerous times, which proved a real pain.

The key seems to be not to have both ActiveSync and Plaxo sync'ing with more than one machine. The following steps should help you keep everything in order, without duplication.

1. Update Plaxo.
2. Clear all PIM from computers, delete ActiveSync partnerships, uninstall Plaxo.
3. Reinstall Plaxo on a single 'master' computer, and sync data.
4. Attach Smartphone, create a new partnership, and Sync data to phone.
5. Create further partnership with second computer, but do NOT install Plaxo, and sync from phone.

I don't know what you do if you want more than two computers in sync, as ActiveSync is limited to two partnerships (I believe.)

An alternative might be to synchronize the computers using Plaxo, and then only create a single ActiveSync partnership.

Good luck.

Friday, November 12, 2004

Firefox 1.0

It's finally arrived - get it here.

When you've installed it, get the excellent SAGE rss extension here.

Monday, November 08, 2004

BPM whitepaper

Excellent whitepaper from David Chappell here, that puts BizTalk into context, alongside other "Business Process Management" servers.

Smartphone update

I've installed the Jeyo Mobile Extender software so that I can now send / receive SMS from my desktop when my Smartphone is connected. I've been using the Mobile Companion, which is a great bit of software, so hopefully it'll prove its worth (it's only $14.95, which at today's exchange rate makes it practically free for UK users :-)) - I've also downloaded the Personalizer software, which I'll report back on when I get a chance to play.

Whilst on the Smartphone subject, time for a few gripes. First and foremost - I can't find a way to send contact details via SMS (only to "beam" them via IrDa / BT), which is related to the second gripe - it appears as if the SMS stack doesn't understand vCards. This lack of standard phone functionality is a huge hole IMHO, and so I'm still giving M$ the benefit of the doubt, and assuming that it's just my stupidity in not understanding how to do it.

Parallel convoy sample

Stephen Thomas has posted a new sample, this time of a concurrent (or parallel) convoy.

Sunday, November 07, 2004

Windows Train Edition

Sitting on the train on the way home on Friday evening, when it slowed down for an unplanned stop. The driver then explained that he had to "reboot the train". The lights went out, then gradually everything started up again and we were on our way.

Sell Panasonic

It's easy to see why Apple is so revered amongst the design community, when digital home products like this are still being produced.

Technically it may well be unsurpassed, but it looks like a video from the 1980's, and is called the "DMRE55EBS".

Jonathan Ive can't be the only designer in the world interested in working with technology. It's so depressing.

Monday, November 01, 2004

Next Blog

One of the great features of Blogger is the "Next Blog" button, which goes to someone else's public blog. This can be any blog on Blogger, and so not necessarily a technical blog.

My first go brought up The Shelter 1/2.

Plaxo-n-and-on

Stacy Martin, from Plaxo, has actually replied to my earlier post about my sync'ing problems... I'm astounded. Not only do Plaxo produce the best-looking software around, they're also surfing the darkest recesses of the web to bring aid to the undeserving.

I'm glad to say that I spent some time over the weekend singing Plaxo's praises to some of my unbelieving friends.

Something from nothing

Richard Veryard has posted an interesting reply to my "acknowledgement of non-arrival of messages" quandary, pointing out the value of 'nothing' in certain circumstances...

Thursday, October 28, 2004

Orchestration debug info

Wouldn't it be nice to be able to use the Console.Write() function within an orchestration shape to view internal message and variable values, without going through the Orchestration Debugger?

Turns out you can... using DebugView from SysInternals.

If you use the System.Diagnostics.Trace.Write function within an Expression shape, the output is picked up by DebugView.
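For example (message and field names here are hypothetical):

// Expression shape - output appears in DebugView as the orchestration runs
System.Diagnostics.Trace.WriteLine("Processing order item: " +
    System.Convert.ToString(xpath(OrderMsg, "string(/*[local-name()='Order']/*[local-name()='Item'])")));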

Tuesday, October 26, 2004

Business intelligence

One of the business processes I'm modelling at the moment includes the following scenario:
A message, type MessageType1, is received by an orchestration, which publishes it via a send port, then listens for a message, type MessageType2, which correlates with the first message using some common attribute. Pretty simple stuff.

The only fly in the ointment is that occasionally messages of MessageType2 will arrive at the receive location without a corresponding MessageType1 match. These messages would normally be suspended, as messages "with no matching subscription", but in our scenario we need to process these 'unmatched' messages using a separate orchestration. This new orchestration therefore has an activating Receive shape that subscribes to MessageType2 messages. However, this would pick up every MessageType2 message, including those that are also picked up by the convoy, which we don't want.

A potential solution is to mark the unmatched messages in advance, using some attribute that we can filter by. The problem with this is that the business wasn't making any distinction between the matched and unmatched messages at the point of creation.

The answer was to go back to the business to better understand why some messages will be correlated, and others won't, which in turn forced the business to look at the process in more depth. This proved to be a very useful lesson for all concerned, and coincidentally helped to clarify a number of other issues with the overall solution.

Another example of the need for a close relationship between BAs and designers at the earliest stages of solution design.

Sunday, October 24, 2004

Friday, October 22, 2004

Client relations

Another gem from TheDailyWTF (abridged):

(Developer) So how do we determine the status on an order?


(Client) Look at the field called "status" on the order.

Does every order have a status field?

Yes, every order has a status field.

Absolutely, positively, every order? So if one is missing it is a user error and we do not have to process it?

That's right, you should always have a status field.

- 3 months later -

Hey, the system isn’t processing some orders. Go fix it.

- 2 hours later -

There is no status field on the order. You said that would never happen.

Oh, but this is remote call forwarding, that has no status field.

But you said that every order will have a status, we wrote this down, now you are telling us it does not?

Every order DOES have a status, just not the remote call forwarding ones.

You keep using that word, I don’t think it means what you think it does...

Service instance lifecycles

Found this in an article about BPEL (albeit from 2002):

"Web services implemented as BPEL4WS processes have an instanced life cycle model. That is, a client of these services always interacts with a specific instance of the service (process). So how does the client create an instance of the service?

Unlike traditional distributed object systems, in BPELWS instances are not created via a factory pattern. Instead, instances in BPEL4WS are created implicitly when messages arrive for the service. That is, instances are identified not by an explicit "instance ID" concept, but by some key fields within data messages. For example, if the process represents an order fulfillment system, the invoice number could be the "key" field to identify the specific instance involved with the interaction. Thus, if a matching instance isn't available when a message arrives at a "startable" point in the process, a new instance is automatically created and associated with the key data found in the message. Messages can only be accepted at non-startable points in a process after a suitable instance has been located; that is, in these cases the messages are in fact always delivered to specific instances. In BPEL4WS, the process of finding a suitable instance or creating one if necessary is called message correlation."

Snippet Compiler

One of my biggest annoyances with designing orchestrations in BizTalk is how you test expressions. As an example, this morning I've been working with TimeSpans, and wanted to know:
  • whether you could have a negative TimeSpan (by subtracting the current datetime from a date in the past)
  • what happens if you put a negative TimeSpan into a Delay shape.
I decided I needed to run a few lines to sanity-check what I was trying to do, using the following code:
// To generate a negative TimeSpan, subtract a later date from an earlier one.
System.DateTime deadline = System.DateTime.Now;
System.Threading.Thread.Sleep(1000);
// 'deadline' is now a second in the past, so this yields roughly -1 second.
System.TimeSpan timespan = deadline.Subtract(System.DateTime.Now);
// And the answer to "what happens now": Thread.Sleep throws an
// ArgumentOutOfRangeException for negative timeouts (other than the
// special -1ms "infinite" value).
System.Threading.Thread.Sleep(timespan);

I didn't want to open a new console project to work this out, so I dug around and came across SnippetCompiler (a colleague pointed me at it). This allows you to test small snippets without going through the whole "new...project" routine, and looks like becoming invaluable.
I do have a couple of gripes with the AutoComplete / IntelliSense functionality, but it's probably unreasonable to complain when it's free ;-) .

Synchronicity part 2

The synchronisation experiment has not been a great success. As many might have predicted, events and contacts are uploaded / downloaded in some apparently random manner, causing everyone's birthdays to be duplicated every time I sync Plaxo, and contacts on my phone to be deleted by ActiveSync. It's infuriating that this problem still plagues PIM software.

Anyone I've met in the last three months has been wiped, including the electrician who's currently rewiring my flat, in my absence!

I do however, have a fairly complete set of info on Plaxo, sync'd to my home and work desktops. Adding in the Smartphone causes a few headaches, but things are definitely better than they were.

Thursday, October 21, 2004

C500 update

The first ROM upgrade for the C500 is here already. Don't forget to save photos onto the storage card, or computer, otherwise they'll get zapped by the hard reset.

(BTW if you want something a little more permanent, send your pics to Stickpix via MMS [07746197446], and they'll print them out and send them to you. FREE for a limited period.)

Alice in Wonderland

One of my main concerns with our current design is the requirement for an alert to be triggered if no message of a given type is received before some given deadline (with the deadline being message-instance specific). This is difficult to accomplish on a message-event basis, as orchestration instances are instantiated by the arrival of a message. An instance cannot be aware of the non-existence of a message before it arrives, as it doesn't exist itself! (I'm sure something similar appeared as a plotline in The Hitchhiker's Guide to the Galaxy.)

This means that the orchestration instance, which would ordinarily activate upon receipt of a message, must already exist before the message arrives. This is achieved by seeding an orchestration instance with a "message-expected" message that includes the deadline date, and having it then listen for the expected message (even worse - it might actually have to poll for the message itself).

This seems completely counter-intuitive to the message-driven architecture that BTS espouses. An added complication is that once the instance is running, the deadline to which it is bound is fixed - you can't go in and tweak it, unless the orchestration is designed in such a way that it can receive further updates to its own deadline (a la "sequential uniform convoy"), by which point the orchestration is so complex that it would have been easier to shift the scheduling to a separate external application. This is particularly relevant in the current situation, as the deadline might be 4 months from the time of the activation message - more than enough time for real-world events to require a change in the schedule.

So, we're now in a situation where BizTalk itself is managing the receipt of messages - it has to find out what messages are expected, create a new orchestration instance to listen for each one, and raise an alert if they don't arrive on time.

Et voila. Like some great illusion, we've managed to turn BizTalk inside-out, and make it the application.

Daily WTF

From today's Daily WTF:

///TODO: add so that it actually does something with orderPlanWeekId
///TODO: Maybe I don't need to, try to understand what the above TODO was for


http://thedailywtf.com/archive/2004/10/20/2763.aspx

It's exactly the sort of comment I find myself writing :-(

Spirit of "Martian"

I don't really understand what these guys are selling, but I like it anyway. Anyone who actively promotes a reduction in "the impact of visual mediocrity on the quality of life of those who use computers" gets my vote.

If only it was wireless...

Wednesday, October 20, 2004

Google go-slow

Don't know about anyone else, but the Google desktop search engine causes standard Google searches to grind to a halt, even when preferences are set to ignore desktop matches?

Tuesday, October 19, 2004

Date-dependent convoys

The project I'm currently working on is very heavily driven by various external system dates, and we've been having exhaustive discussions on how to implement the schedule - using BizTalk itself, using an external "scheduler" application, or relying on the source data systems to provide data at the requisite time. (My argument has always been that BTS is the wrong place to have any scheduling, as BTS is message-driven, and should simply react to what it's been given; however I'm fighting a one-man war on that front!)


There are two types of scheduled process involved:
1. Data must be pulled (itself something I don't like) from systems at a given date.
2. Data will be pushed into BTS when it's made available, but cannot then be sent on to the consumers until some nominal "start date" has been passed. Furthermore, if a nominal "end date" has also been passed then the message should be killed off entirely. The final issue is that of updates. If a message is submitted before the start date, updates to the data contained in the original message could also be submitted via the same channel, in which case the final update is the one that the consumer is interested in (i.e. managing duplicates internally).

I may well post more about this scheduling business, as it's something I haven't directly come across before, and it raises some fairly fundamental issues re. message-driven "real-time" architectures (and how appropriate they are in such circumstances).

In the meantime I've done a simple demo that might be of interest. It models the second scenario, where messages are submitted and then held, using a sequential uniform convoy to suck up all messages and correlate them; I thought I might as well post it here in case anyone else finds it useful.

The zip file contains the following BTS artefacts:
1. A message schema, containing the aforementioned start and end dates, together with a field to use for correlation, and a data field that can be used to verify that the latest message is the one that comes out of the orchestration.
2. A property schema used to promote the correlation data.
3. The orchestration. This has three possible outputs:
  • If the end date of the initial message has passed, the message is sent to a port marked as "timedout".
  • If the start date has passed already, the message will simply shoot out the end, unchanged.
  • If the start date has not yet passed the orchestration will sit and listen for further messages of the same type (and correlation set). New messages are used to overwrite the original data. When the start date arrives the last message to be received will be the one that is sent.
In reality it looks as if our messages won't contain the start and end dates, and that the orchestration will have to call a web service to get these dates (the dates are held in a separate system), but this demo works pretty well, and gives a quick insight into convoys.
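For reference, the decision logic amounts to something like this (an outline only - in the orchestration it is spread across Decide, Listen and Delay shapes, and the names are ours):

// Decide shape, evaluated when the initial message arrives:
if (System.DateTime.Now > endDate)
{
    // send the message to the "timedout" port
}
else if (System.DateTime.Now >= startDate)
{
    // send the message straight out, unchanged
}
else
{
    // Listen loop until startDate:
    //   Receive branch - a correlated update arrives: overwrite the held message
    //   Delay branch   - startDate reached: send the most recent message
}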

Enjoy.

STOP PRESS: As Blogger doesn't host files, I'll have to host it from home, which I can't set up from here. I'll sort it ASAP.

Friday, October 15, 2004

Synchronicity

Ever since I first connected a mobile phone to a computer, c. 1998, I've been looking for synchro-nirvana, with a single view of all my contact and calendar info from computer to mobile to web. This is actually much more important to me than email, which I'm not that bothered about. My various experiences with sync'ing mobile phones have been very frustrating, and the lack of investment in sync software from Nokia (my preferred phone supplier ;-) ) has been very disappointing, to say the least.

Well, I've now gone and done it - bought myself a Smartphone. Sync'ing with the computer is great, as you'd expect. ActiveSync has been the bane of my life for a long time, but the latest version seems very stable, though that might be the USB cable, which is a little more reliable than IrDA!

The more complex sync is from desktop to desktop. If I add up both client-site and home computers I've probably had five or six different 'main' sources of PIM this year alone. Sync'ing these is a lot more complicated, and has until now involved a CD with a 500MB .pst being burnt / imported at various intervals.

I've decided to end this nonsense, and have consequently settled on Plaxo as my central reference for all calendar and contact info (task and notes too, though these aren't so important). I now simply synch Plaxo with my work computer, home computer, parents' computer, laptop, AND smartphone.

Am I now the most synchronised man on the planet?

Autocomplete closure

Many of my friends and ex-colleagues are aware that I once disgraced myself by sending a humorous (and genuine I might add - it was NOT a photoshop-job) picture to my entire company as the result of an AutoComplete trauma; "XYZ Developers" turned out as "XYZ Global". (I'll post the picture at a later date when I'm back at home.)

I took a couple of things away from this experience:
1. Beware AutoComplete.
2. When you recall a message, the message isn't recalled silently, rather Exchange sends a message to the intended recipient asking if they agree to have the message recalled, which is a bit like adding a "READ ME" notice in size 48 font (bold) to the original message.

I survived, but I've been wary of AutoComplete ever since, especially as it also has a habit of storing invalid email addresses. It essentially scoops up any old cr*p that you type in the To/cc/bcc box and adds it to the list.

Imagine my delight, therefore, when I hit delete whilst highlighting an invalid old email address in the a/c list, whereupon it vanished. I did a quick google, and it turns out that this is a supported behaviour (see KB289975); however this can't be done with single entries.
A more radical suggestion is just to blitz the entire a/c list (KB287623) - take your pick.

Google desktop

No, not the annoying search toolbar that searches the internet, but a personal Googling of your local machine. Read some techie background on it here. Whilst we're on the subject, Picasa is excellent for managing images, and is owned by Google, as, I believe, is Blogger?

IPO or no-IPO, they do seem to be buying / producing good software. (I also love the fact that it works with Firefox - whose time has surely come.)

Wednesday, October 13, 2004

Grumpy old men, and computers

Whilst in Cambridge (see previous post) I'm staying with my parents in Suffolk. Last night I had to watch "Grumpy Old Men" on the tv - my father's new favourite programme, and of course the subject of computers came up, and how superfluous they are to daily life. I *think* I was expected to put up a spirited defence of them at some point, but instead I started thinking about the different ways in which people use them.

I hardly ever use my computer recreationally. I don't play games, I don't edit photos, I don't download music, I don't do my own accounts, write letters to the local paper, or study for a correspondence course at the Open University. In fact, when I'm not working with it, my home computer is really only used to store stuff (photos (unedited) off my digital camera, music ripped off my own CDs, contacts, calendar etc., etc.) It's basically a large virtual filing cabinet, which makes the demise of the Martian Netdrive all the more tragic.

For those of you who never saw this, for a brief moment in time a couple of years ago a company called Martian was selling a wireless, 'silent', hard drive, that you just plug in to the mains, and leave in a cupboard somewhere. Unfortunately it never really took off, and I believe they now work with OEMs rather than selling direct. Surely its time has come?

Mapping (old school)

I'm currently working in Cambridge (UK), and was looking up an address on the various mapping sites in the UK - primarily streetmap, and multimap. These two always give inconsistent results, and searching them effectively is something of a black art. (e.g. "East Road" gives only half a dozen matches with multimap, all in London, but "east road cambridge" gives me the correct match - even with "GB" selected? Yesterday multimap told me that my current postcode doesn't exist, whilst streetmap found it immediately - surely these guys use the same data???)

Anyway, whenever I find what I'm looking for I always go for the "aerial" button to gawp at the aerial photo, and this time I discovered the excellent map overlay that multimap have done. If I were American I'd tell you how cool this is, but I'm not, so I'll just let you figure it out for yourself here.

Monday, October 11, 2004

SOAP receive pipelines and missing messages.

I always thought that SOAP send and receive ports had to use the PassThruxxx pipelines, but I've found out today that not only is that not the case, the fact that I wasn't using an XmlReceive pipeline has been the cause of all my problems over the last couple of days.

If you use a PassThru pipeline, promoted properties are not promoted (fairly obvious when you think about it), so receive shapes that have a correlation set attached are never activated, and messages fail with the "no matching subscription" error.

Aaargh.

Thursday, October 07, 2004

Flat file schemas and xs:date anomalies.

I've been struggling with a flat file schema for the past couple of days, and have been having some very inconsistent results with the xs:date datatype. I'm using a custom pipeline with flatfile disassembler to convert a pipe-delimited file to xml, then applying a map in the receive port to convert the output to a canonical form.

One of the fields is a date, in the form "dd MMM yyyy", e.g. "10 nov 1980". So, I cast this to the xs:date datatype in the schema, and set the "Custom date/time format" property to "dd MMM yyyy". This works very well when the date is there, but blows up when the field is missing, even though I've set the schema attribute "/schema/annotation/appinfo/@suppress_empty_nodes" to "true", which should cope with missing values.
The exception thrown is a pipeline exception, complaining about the format of the date. It appears as if the pipeline is attempting to cast the non-existent value to the date format before doing the null-value test. I *think* this is a bug?
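My guess is that the custom format boils down to a ParseExact-style conversion, which fails on an empty field before any empty-node suppression can kick in - something equivalent to this (my illustration, not the pipeline's actual code):

// An empty (missing) field throws before any null/empty check can suppress it
System.DateTime d = System.DateTime.ParseExact(
    "", "dd MMM yyyy", System.Globalization.CultureInfo.InvariantCulture);
// throws System.FormatException - as does "10 nnv 1980"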

My first work-around was to use an xs:string instead, in which case the suppress_empty_nodes attribute works, and no output element is produced for the missing value.
This morning, however, I've just used a standard xs:date datatype (without custom format), which should only accept the format "yyyy-mm-dd", rerun the test using "xxx" as the date in question, and it passed!!

So... custom date formats definitely cause a problem with missing values, but they do at least attempt to resolve the value to a date ("10 nnv 1980" threw an expected exception.)
Standard date formats seem (this morning) to accept anything you give them, but they do at least obey the suppress_empty_nodes instruction.

SP1 anyone (yes - I have installed the Rollup Package 1)?

(I originally posted the issue to the newsgroups here, thanks to those who replied.)


Tuesday, October 05, 2004

Car-crash IT

Seeing a tv programme described as car-crash tv the other day made me think about its IT equivalent. We've all seen it - executive gets sold on a BIG IDEA after a couple of conferences in California / Nice / New York (delete as appropriate), doesn't really understand it, and starts down the road to disaster, lip-licking consultants in tow.
You know it's going to go wrong from the very beginning, but are powerless to do anything (other than stand to the side and watch, transfixed as it disintegrates, hoping you can write it out of your CV!)

Those working in Government departments must have a particularly good view of this...

Web services everywhere == SOA ?

This is something that has been vexing me on my current project. My client has made a strategic decision to embrace web services, and thereby achieve SOA nirvana. Trouble is, no one here seems to know why an SOA is important, or how slapping web services on the front of all their legacy systems will help. The answer, of course, is that it won't, and that sticking a SOAP interface on a batch job is simply missing the point.

Drift back through the years, and you may remember the fanfare surrounding Bill G's "Business at the Speed of Thought". It foresaw the frictionless interchange of data within an organisation and beyond, bringing companies ever closer to their partners and customers. Executives would demand real-time business critical information, and companies would react to change in a fraction of the time... etc, etc. The idea got lost for a while, but is back with a vengeance along with the SOA, ESB and "real-time enterprise" (e.g. IBM's relentless "on-demand" adverts!)

The RTE redefines the way data is exchanged - it'll be on-demand, event-driven and message-based (you can add in loosely-coupled and asynchronous if you like that sort of thing). Item-level data changes can be broadcast to anyone who needs to know, and lengthy, error-prone, batch jobs will no longer be necessary* (is there such a thing as "downtime" in the global economy?)

This obviously requires rethinking the way your business operates.

So - you can't just SOAP your legacy systems and claim SOA victory. A well-designed SOA must include deep changes in your business processes, and if you find that after the consultants have left you're still working in exactly the same way, only with SOAP requests replacing flat-files, then ask for your money back!

-------
* I realise that batch jobs will still prevail in many areas, where large volumes of data are being exchanged - the important thing is to recognize where they still have value, and to learn to say NO when someone suggests replacing it with a web service!

Friday, October 01, 2004

Thursday, September 30, 2004

When is a web service not a web service?

I've long had a bit of an issue with the definition of the term "web service". An old colleague of mine and I wrote a fantastic mobile Windows CE application a few years ago that had to have MSMQ surgically removed at the 11th hour owing to a bug in the TCP stack on CE (or something like that.) We replaced it with XML over HTTP, and as far as I'm aware it's still going strong.

I've always described the project as using web services, but we didn't use SOAP, and we were really just posting text to an old-fashioned ASP page. Web services do now seem to presuppose SOAP, which I don't really agree with, but the official definition (or the nearest thing I could find) is rather less precise on the protocol. This is what the W3C has to say on the subject:

"There are many things that might be called "Web services" in the world at large. However, for the purpose of this Working Group and this architecture, and without prejudice toward other definitions, we will use the following definition:
A Web service is a software system designed to support interoperable machine-to-machine interaction over a network. It has an interface described in a machine-processable format (specifically WSDL). Other systems interact with the Web service in a manner prescribed by its description using SOAP-messages, typically conveyed using HTTP with an XML serialization in conjunction with other Web-related standards."

The interesting bit here (for me) is the last sentence, and the use of the words "typically" and "other Web-related standards".
Does this mean that SOAP messages exchanged over FTP or SMTP could be defined as web services (though I'm not sure how WSDL would fit into this)?
Does a web service have to be request-response, or can it be simply push / pull?

Enterprise Integration Patterns

Just started Enterprise Integration Patterns by Hohpe & Woolf. I'm only 50 pages in, but it's already proved valuable in communicating some core concepts of the whole message/event-driven architecture to my client. An invaluable reference for anyone working in this 'space'.

Tuesday, September 28, 2004

Is anything safe?

Following on from the "J2EE is dead" comment below, now the internet itself is under threat: "The end of the internet is nigh"

Monday, September 27, 2004

Naming collision

OK - turns out (unsurprisingly) that they know each other already - http://www.oreillynet.com/pub/wlg/2874

Kevlar Suit

Nice provocative article from David [.NET] Chappell on the demise of the J2EE community process. Should get a few people hot under the collar.

http://www.adtmag.com/article.asp?id=9779

Sunday, September 26, 2004

Let the blogging begin

I've been reading "Enterprise Service Bus" over the last week, to try and put BizTalk into context, alongside integration, messaging and SOA solutions from the non-M$ side of the fence.
Unfortunately, the author (about whom more later), dismisses the entire M$ effort in a single sentence in chapter 1, "...its integration capabilities are locked into BizTalk, which is a hub-and-spoke integration server... to qualify as an ESB, both a distributed message bus and distributed integration capabilities need to exist."
Hmmm. I'm not sure I agree with this - one of the strengths of BTS and its "host" architecture is that it can be distributed, even if it all ultimately sits on top of the message box 'hub'?
Well, I persevered with the book, which is excellent btw, and am convinced more than ever that BTS satisfies almost all of the author's requirements for an ESB.
The author is David A Chappell, who many might think they know as the David Chappell of the eponymous consultancy, and BizTalk evangelist. Which makes his comments all the stranger.
This David Chappell, however, is "Chief Technology Evangelist" of Sonic Software, a J2EE shop. What are the chances of that - two leading exponents of integration software, working from opposite sides of the fence, with the same name?