Thursday, October 07, 2004

Flat file schemas and xs:date anomalies.

I've been struggling with a flat file schema for the past couple of days, and have been getting some very inconsistent results with the xs:date datatype. I'm using a custom pipeline with the flat file disassembler to convert a pipe-delimited file to XML, then applying a map in the receive port to convert the output to a canonical form.

One of the fields is a date, in the form "dd MMM yyyy", e.g. "10 nov 1980". So, I set the field's datatype to xs:date in the schema, and set the "Custom date/time format" property to "dd MMM yyyy". This works very well when the date is there, but blows up when the field is missing, even though I've set the schema attribute "/schema/annotation/appinfo/@suppress_empty_nodes" to "true", which should cope with missing values.
The exception thrown is a pipeline exception, complaining about the format of the date. It appears as if the pipeline is attempting to convert the non-existent value to the date format before doing the empty-value test. I *think* this is a bug?
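
To make the suspected ordering problem concrete, here is a minimal Python sketch. It is not BizTalk code and says nothing certain about how the disassembler is actually implemented; the function names are hypothetical, and "%d %b %Y" is only a rough stand-in for "dd MMM yyyy" (it assumes English month abbreviations). The first function converts before checking for an empty value, which matches the failure I'm seeing; the second checks first, which is the behaviour I expected.

```python
from datetime import date, datetime
from typing import Optional

# Rough Python stand-in for the custom format "dd MMM yyyy" (assumption:
# English month abbreviations, matched case-insensitively by strptime).
CUSTOM_FORMAT = "%d %b %Y"


def parse_then_suppress(raw: str) -> Optional[date]:
    """Hypothetical ordering that matches the observed failure.

    The value is converted first, so an empty field raises ValueError
    before the empty-value check is ever reached.
    """
    parsed = datetime.strptime(raw, CUSTOM_FORMAT).date()
    return parsed if raw else None


def suppress_then_parse(raw: str) -> Optional[date]:
    """Hypothetical ordering that matches the expected behaviour.

    Empty fields are dropped (None) before any conversion is attempted;
    non-empty fields are still validated against the custom format.
    """
    value = raw.strip()
    if not value:
        return None
    return datetime.strptime(value, CUSTOM_FORMAT).date()
```

With the second ordering, a missing field simply yields no value, while a malformed one such as "10 nnv 1980" still raises an error.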

My first work-around was to use an xs:string instead, in which case the suppress_empty_nodes attribute works, and no output element is produced for the missing value.
This morning, however, I used a standard xs:date datatype (without a custom format), which should only accept the format "yyyy-MM-dd", reran the test using "xxx" as the date in question, and it passed!!

So... custom date formats definitely cause a problem with missing values, but they do at least attempt to resolve the value to a date ("10 nnv 1980" threw an exception, as expected).
Standard date formats seem (this morning) to accept anything you give them, but they do at least obey the suppress_empty_nodes instruction.
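
For comparison, this is roughly the check I expected a plain xs:date field to perform on "xxx". Again, it's a minimal Python sketch with hypothetical names, not anything taken from BizTalk, and it deliberately ignores the optional timezone suffix and negative years that the full xs:date lexical form allows.

```python
import re
from datetime import date

# Simplified xs:date lexical form: four-digit year, two-digit month and day.
# Assumption: no timezone suffix and no leading minus sign.
XS_DATE_PATTERN = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def is_plain_xs_date(text: str) -> bool:
    """Return True only if text looks like yyyy-MM-dd and is a real calendar date."""
    match = XS_DATE_PATTERN.fullmatch(text)
    if match is None:
        return False
    year, month, day = (int(group) for group in match.groups())
    try:
        date(year, month, day)  # rejects impossible dates such as 30 February
    except ValueError:
        return False
    return True
```

Under a check like this, "xxx" would be rejected outright, which is why the test passing surprised me.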

SP1, anyone? (And yes, I have installed Rollup Package 1.)

(I originally posted the issue to the newsgroups here, thanks to those who replied.)
