Friday, 18 September 2009

The things every IT professional should know about XML Schema design, Part 1

In my everyday work I see a lot of XML pass by. Usually the design quality of the validating XML Schema is not good enough for it to be constructive and reusable. I started to wonder: why?


There are some basic rules a good XML Schema complies with. It all starts with choosing and documenting the general schema design pattern (the five are Russian Doll, Venetian Blind, Salami Slice, Garden of Eden and Bologna). It is a simple choice to make, and there is plenty of information available online about the pros and cons of each. My personal favorite is Garden of Eden, and I recommend it to everyone who is considering building a corporate, domain or canonical data model based on XML technologies.
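
To make the Garden of Eden choice concrete, here is a minimal sketch of what such a schema could look like: every element and every type is declared globally, and content models refer to the global elements with ref. The names Customer, Name and City are made up for this illustration; only the namespace comes from the examples in this post.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Garden of Eden sketch: every element and every type is declared globally.
     Customer, Name and City are hypothetical names chosen for illustration. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:data="urn:nl:free-your-energy:schema:data:1"
           targetNamespace="urn:nl:free-your-energy:schema:data:1"
           elementFormDefault="qualified">

  <!-- Global element declarations, each bound to a named global type -->
  <xs:element name="Customer" type="data:CustomerType"/>
  <xs:element name="Name" type="data:NameType"/>
  <xs:element name="City" type="data:CityType"/>

  <!-- Global type definitions; the content model points at global elements with ref -->
  <xs:complexType name="CustomerType">
    <xs:sequence>
      <xs:element ref="data:Name"/>
      <xs:element ref="data:City" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

  <xs:simpleType name="NameType">
    <xs:restriction base="xs:string">
      <xs:minLength value="1"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:simpleType name="CityType">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
</xs:schema>
```

Because everything is global, both the elements and the types can be reused from other schemas. The trade-off is that any global element, not just Customer, can appear as the root of a valid document.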


The second thing to really think about is namespace design. Good namespace design is practical and helps you understand the domain you are talking about. Some guidelines I use when designing namespaces:
1) Make it URN based, not protocol based;
Rationale: A protocol-based namespace (for example one starting with http://) suggests that the XML Schema can be found at that address, while this is usually not the case. URN-based namespaces are more expressive.
2) Put the major version in your namespace, so no dates but major versions;
  • Not OK: urn:nl:free-your-energy:schema:data
  • Not OK: urn:nl:free-your-energy:schema:data:20090918
  • OK: urn:nl:free-your-energy:schema:data:1
Rationale: A major version in the namespace guarantees compatibility throughout that major version. A namespace without a version will cause problems as soon as you make a new version or release. Dates tell you nothing about compatibility.
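
As a small sketch of both guidelines together, the schema below uses the URN from the OK example as its target namespace. The version attribute on the schema element is standard XML Schema, but using it for a minor release number such as 1.3 is my own assumption here, and the Customer element is a hypothetical placeholder.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- version="1.3" is a hypothetical minor release; the namespace only carries major version 1 -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:data="urn:nl:free-your-energy:schema:data:1"
           targetNamespace="urn:nl:free-your-energy:schema:data:1"
           elementFormDefault="qualified"
           version="1.3">
  <!-- Placeholder element, only here to make the schema complete -->
  <xs:element name="Customer" type="xs:string"/>
</xs:schema>
```

An instance document then only ever refers to the major version, so it does not need to change when a compatible minor release of the schema is published:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<data:Customer xmlns:data="urn:nl:free-your-energy:schema:data:1">Example customer</data:Customer>
```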
In the coming weeks I will elaborate on XML Schema design with more tips.