This article describes how W3C Schema SimpleType definitions can be transformed, through code generation, into strong types that can be used directly in development projects.
INTRODUCTION
The previous article in this series set out several challenges in translating
W3C Schema-based data definitions into concrete strong types.
Here we will look at the euphemistically named SimpleType (which can, as we saw in
the previous article, be anything but simple) and its relative, the W3C Primitive.
Our examples use the Open Travel (www.opentravel.org) Schema set in its
second release for 2014, version 2014B.
THE SIMPLETYPE
The W3C SimpleType has the general definition:
<simpleType
id=xs:ID
name=NCName
any attributes>
(annotation?,(restriction|list|union))
</simpleType>
So, using Open Travel as our example Schema set, we see such definitions as:
<xs:simpleType name="StateProvCodeType">
<xs:annotation>
<xs:documentation xml:lang="en">The standard code or abbreviation for the state,
province, or region.</xs:documentation>
</xs:annotation>
<xs:restriction base="StringLength1to8">
<xs:minLength value="2"/>
</xs:restriction>
</xs:simpleType>
where the value of StateProvCodeType has a restriction applied which constrains it to
be a StringLength1to8 having a minimum length as specified by minLength. So, a
valid value would be “CA”, “Ny” or “nb” (StringLength1to8 does not enforce an
upper-case constraint). StringLength1to8 is defined elsewhere in the Open Travel
Schema set and restricts a string value to have between 1 and 8 characters,
inclusive.
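As an illustration only, the way a derived type inherits and tightens the restrictions of its base can be sketched in Python (not the C# that the generator targets). The class names mirror the Schema types; the “_check” method, the class constants and the use of ValueError are assumptions made for this sketch.

```python
class StringLength1to8:
    """Hypothetical strong type for the OTA StringLength1to8 restriction."""

    MIN_LENGTH = 1
    MAX_LENGTH = 8

    def __init__(self, value: str):
        self._check(value)
        self.value = value

    @classmethod
    def _check(cls, value: str) -> None:
        if value is None:
            raise ValueError("value must not be null")
        # The bounds are looked up on the concrete class, so a derived type
        # only has to override the facet it tightens.
        if not cls.MIN_LENGTH <= len(value) <= cls.MAX_LENGTH:
            raise ValueError(
                f"{cls.__name__}: length {len(value)} is outside "
                f"{cls.MIN_LENGTH}..{cls.MAX_LENGTH}"
            )


class StateProvCodeType(StringLength1to8):
    """Tightens minLength to 2; maxLength 8 is inherited from the base type."""

    MIN_LENGTH = 2


print(StateProvCodeType("CA").value)  # prints CA
```

With this arrangement “C” is accepted by StringLength1to8 but rejected by StateProvCodeType, while a nine-character value is rejected by both.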
Also:
<xs:simpleType name="ListOfISO3166">
<xs:annotation>
<xs:documentation xml:lang="en">List of country codes in ISO 3166
format.</xs:documentation>
</xs:annotation>
<xs:list itemType="ISO3166"/>
</xs:simpleType>
Here the value is a whitespace-separated list of items, each of which is the string
value of an ISO3166 type (a two-letter ISO 3166-1 alpha-2 country code), defined
elsewhere in the Open Travel Schema set. ISO3166 is defined as:
<xs:simpleType name="ISO3166">
<xs:annotation>
<xs:documentation xml:lang="en">Specifies a 2 character country code as defined in
ISO3166.</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:string">
<xs:pattern value="[a-zA-Z]{2}"/>
</xs:restriction>
</xs:simpleType>
Here the restriction is defined by a Regex pattern applied to the W3C
Primitive “xs:string”. Hence a valid value for ListOfISO3166 would be:
“UK DE IT FR”
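A minimal Python sketch of how such a list value might be checked is given below. The function name parse_list_of_iso3166 is hypothetical, and Python's regular expression engine stands in for the XSD one, which is adequate for this simple pattern but not for XSD patterns in general.

```python
import re
from typing import List

# The pattern from the ISO3166 SimpleType definition.
ISO3166_PATTERN = re.compile(r"[a-zA-Z]{2}")


def parse_list_of_iso3166(candidate: str) -> List[str]:
    """Split the candidate on whitespace and validate each item."""
    # str.split() with no argument splits on runs of whitespace; it accepts a
    # few more characters than XML whitespace, which is fine for a sketch.
    items = candidate.split()
    for item in items:
        # XSD patterns must match the whole value, hence fullmatch.
        if ISO3166_PATTERN.fullmatch(item) is None:
            raise ValueError(f"'{item}' is not a valid ISO3166 value")
    return items


print(parse_list_of_iso3166("UK DE IT FR"))  # prints ['UK', 'DE', 'IT', 'FR']
```

Note that the pattern only checks the shape of each item: any two letters pass, whether or not they are an assigned country code.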
In addition, in the Open Travel set, we can find a SimpleType involving a Union,
TimeOrDateTimeType:
<xs:simpleType name="TimeOrDateTimeType">
<xs:annotation>
<xs:documentation xml:lang="en">Allows for the specification of a date time or just
time.</xs:documentation>
</xs:annotation>
<xs:union memberTypes="xs:dateTime xs:time"/>
</xs:simpleType>
Here we see a pair of types from the W3C Primitive family, “xs:dateTime” and
“xs:time”. A TimeOrDateTimeType value may hold either of these, but only one at
any one time.
As noted in the previous article, the Choice and Union W3C Schema idioms have no
direct counterpart in modern strongly typed languages such as C#, Java or Kotlin.
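One way to approximate a Union in a strong type is to give the class one slot per member type and clear the other slot whenever one is set. The Python sketch below illustrates the idea under that assumption; the setter names and the “value” property are hypothetical and are not taken from the generated code.

```python
from datetime import datetime, time
from typing import Optional, Union


class TimeOrDateTimeType:
    """Hypothetical holder for the union of xs:dateTime and xs:time."""

    def __init__(self) -> None:
        self._date_time: Optional[datetime] = None
        self._time: Optional[time] = None

    def set_date_time(self, value: datetime) -> None:
        # Setting one member clears the other, so only one is ever held.
        self._date_time, self._time = value, None

    def set_time(self, value: time) -> None:
        self._date_time, self._time = None, value

    @property
    def value(self) -> Union[datetime, time, None]:
        """Return whichever member is currently set, or None if neither is."""
        return self._date_time if self._date_time is not None else self._time


holder = TimeOrDateTimeType()
holder.set_time(time(10, 30))
holder.set_date_time(datetime(2014, 6, 1, 10, 30))  # replaces the time value
```

The cost of this design is that the caller must ask which member is present; the benefit is that each setter can apply the validation appropriate to its own member type.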
RESTRICTIONS
In the previous section, we encountered the SimpleType component, “restriction”.
In particular, we highlighted the “minLength” and “pattern” forms.
These are only two of the possible restriction forms, restriction facets, which can be
applied to specify the range of valid values for a SimpleType. The table below shows
the extent of such restriction facets:
Restriction form Example Comment
EnumerationFacet
Example:
<xs:simpleType>
<xs:annotation>
<xs:documentation xml:lang="en">The value
is constrained to be one of the enumerates
</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:string">
<xs:enumeration value="Audi"/>
<xs:enumeration value="Golf"/>
<xs:enumeration value="BMW"/>
</xs:restriction>
</xs:simpleType>
Comment: The value of the SimpleType is constrained to be one of the specified
enumerates. The restriction is based on the W3C Primitive “xs:string”.
FractionDigitsFacet
Example:
<xs:simpleType name="Money">
<xs:annotation>
<xs:documentation xml:lang="en">Specifies an
amount, max 3 decimals.
</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:decimal">
<xs:fractionDigits value="3"/>
</xs:restriction>
</xs:simpleType>
Comment: Specifies the maximum number of decimal places allowed. Must be equal to
or greater than zero. The restriction is based on the W3C Primitive “xs:decimal”.
LengthFacet
Example:
<xs:simpleType>
<xs:annotation>
<xs:documentation xml:lang="en">
Specifies the exact number of characters or
list items allowed
</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:string">
<xs:length value="8"/>
</xs:restriction>
</xs:simpleType>
Comment: Specifies the exact number of characters or list items allowed. Must be
equal to or greater than zero. The restriction is based on the W3C Primitive
“xs:string”.
MaxExclusiveFacet
Example:
<xs:simpleType name="ExMaxExclusive">
<xs:annotation>
<xs:documentation xml:lang="en">
Specifies the upper bounds for numeric values
</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:integer">
<xs:maxExclusive value="3"/>
</xs:restriction>
</xs:simpleType>
Comment: Specifies the upper bounds for numeric values (the value must be less
than this value). The restriction is based on the W3C Primitive “xs:integer”.
MaxInclusiveFacet
Example:
<xs:simpleType name="WaitQueueLengthType">
<xs:annotation>
<xs:documentation xml:lang="en">
Specifies the inclusive upper bounds for a
numeric value
</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:integer">
<xs:maxInclusive value="5"/>
</xs:restriction>
</xs:simpleType>
Comment: Specifies the upper inclusive bounds for numeric values (the value must
be less than or equal to this value). The restriction is based on the W3C
Primitive “xs:integer”.
MaxLengthFacet
Example:
<xs:simpleType name="ExMaxLength">
<xs:annotation>
<xs:documentation xml:lang="en">
Specifies the maximum length of a string
value
</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:string">
<xs:maxLength value="12"/>
</xs:restriction>
</xs:simpleType>
Comment: Specifies the maximum length of a string value. The restriction is based
on the W3C Primitive “xs:string”.
MinExclusiveFacet
Example:
<xs:simpleType name="ExMinExclusive">
<xs:annotation>
<xs:documentation xml:lang="en">
Specifies the minimum exclusive bound for a
numeric value
</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:integer">
<xs:minExclusive value="1" />
</xs:restriction>
</xs:simpleType>
Comment: Specifies the lower bounds for numeric values (the value must be greater
than this value). The restriction is based on the W3C Primitive “xs:integer”.
MinInclusiveFacet
Example:
<xs:simpleType name="ExMinInclusive">
<xs:annotation>
<xs:documentation xml:lang="en">
Specifies the lower inclusive bounds for a
numeric value
</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:integer">
<xs:minInclusive value="1"/>
</xs:restriction>
</xs:simpleType>
Comment: Specifies the lower bounds for numeric values (the value must be greater
than or equal to this value). The restriction is based on the W3C Primitive
“xs:integer”.
MinLengthFacet
Example:
<xs:simpleType name="StateProvCodeType">
<xs:annotation>
<xs:documentation xml:lang="en">The standard
code or abbreviation for the state, province, or
region.</xs:documentation>
</xs:annotation>
<xs:restriction base="StringLength1to8">
<xs:minLength value="2"/>
</xs:restriction>
</xs:simpleType>
Comment: Specifies the minimum number of characters or list items allowed. Must
be equal to or greater than zero. The restriction is based on the OTA type
“StringLength1to8”.
MinMaxLengthFacet
Example:
<xs:simpleType name="StringLength0to255">
<xs:annotation>
<xs:documentation xml:lang="en">Used for
Character Strings, length 0 to
255.</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:string">
<xs:minLength value="0"/>
<xs:maxLength value="255"/>
</xs:restriction>
</xs:simpleType>
Comment: Specifies the minimum inclusive and maximum inclusive number of
characters or list items allowed. The restriction is based on the W3C Primitive
“xs:string”.
PatternFacet
Example:
<xs:simpleType>
<xs:annotation>
<xs:documentation xml:lang="en">
Defines, via a Regex expression, the exact
sequence of characters that are acceptable
</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:string">
<xs:pattern value="[a-zA-Z0-9]{8}"/>
</xs:restriction>
</xs:simpleType>
Comment: Defines the exact sequence of characters that are acceptable. The
restriction is based on the W3C Primitive “xs:string”.
TotalDigitsFacet
Example:
<xs:simpleType name="ExTotalDigits">
<xs:annotation>
<xs:documentation xml:lang="en">
Defines the total number of digits of a
numeric value
</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:integer">
<xs:totalDigits value="5"/>
</xs:restriction>
</xs:simpleType>
Comment: Defines the maximum total number of digits of a numeric value. The
restriction is based on the W3C Primitive “xs:integer”.
WhitespaceFacet
Example:
<xs:simpleType>
<xs:annotation>
<xs:documentation xml:lang="en">
Specifies how whitespace in a candidate
value is to be handled (preserve, replace, collapse)
</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:string">
<xs:whiteSpace value="collapse"/>
</xs:restriction>
</xs:simpleType>
Comment: Specifies how white space (line feeds, tabs, spaces, and carriage
returns) is handled. The value of the restriction can be “preserve”, to retain
the white space unchanged; “replace”, to replace each tab, line feed and carriage
return with a space; or “collapse”, to apply “replace” and then reduce runs of
spaces to a single space and strip leading and trailing spaces. The restriction
is based on the W3C Primitive “xs:string”.
These restriction facets need to be handled fluently and consistently in our
generation process. The resultant types need to apply both the direct type
restrictions as well as any inherited from a base class.
As can be seen, some are relevant only for string representations and others are
specific for numeric representations. This aspect also needs to be reflected in our
handling of restrictions.
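To make the string/numeric distinction concrete, the Python sketch below models one facet of each kind. The class names follow the table above, but the “validate” method and the constructor signature are assumptions for illustration, not the generator's API.

```python
from decimal import Decimal


class MinLengthFacet:
    """String (or list) facet: the value needs at least `limit` characters."""

    def __init__(self, limit: int):
        self.limit = limit

    def validate(self, value: str) -> None:
        if len(value) < self.limit:
            raise ValueError(f"minLength {self.limit} violated by '{value}'")


class FractionDigitsFacet:
    """Numeric facet: at most `limit` decimal places in the value."""

    def __init__(self, limit: int):
        self.limit = limit

    def validate(self, value: Decimal) -> None:
        if not value.is_finite():
            raise ValueError("xs:decimal has no NaN or infinite values")
        # normalize() strips trailing zeros, so 12.3400 counts as two places:
        # the check is made on the value rather than on how it was written.
        places = max(0, -value.normalize().as_tuple().exponent)
        if places > self.limit:
            raise ValueError(f"fractionDigits {self.limit} violated by {value}")


MinLengthFacet(2).validate("CA")
FractionDigitsFacet(3).validate(Decimal("12.345"))
```

Keeping the candidate's type in the “validate” signature means that a string facet cannot be applied to a numeric type by accident, at least where a type checker is in use.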
APPLYING RESTRICTIONS
The restrictions need to be applied when we set the value of, in this case, a
SimpleType. The general structure is as follows:
Ensure a non-null candidate value string
If ( list )
    Split candidate on any whitespace character
    For each ( element )
        For each ( restriction facet specified for type )
            If ( restriction is Whitespace restriction facet )
                Apply whitespace restriction
            Else
                Validate element against facet-specific restriction
Else
    For each ( restriction facet specified for type )
        If ( restriction is Whitespace restriction facet )
            Apply whitespace restriction
        Else
            Validate candidate against facet-specific restriction
When the validation fails, we might have our type throw an exception.
In the case where the SimpleType is a Union, then we might generate type-specific
value setters which follow the above validation flow.
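The validation flow above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function set_value and the two facet classes are hypothetical, Python's regular expressions stand in for XSD patterns, and the whitespace facet is assumed to be listed before the facets that depend on its result.

```python
import re


class WhitespaceFacet:
    """Transforms the candidate rather than validating it."""

    def __init__(self, mode: str):
        self.mode = mode  # "preserve", "replace" or "collapse"

    def apply(self, value: str) -> str:
        if self.mode == "collapse":
            return " ".join(value.split())
        if self.mode == "replace":
            return re.sub(r"[\t\n\r]", " ", value)
        return value


class PatternFacet:
    def __init__(self, pattern: str):
        self.regex = re.compile(pattern)

    def validate(self, value: str) -> None:
        if self.regex.fullmatch(value) is None:
            raise ValueError(f"'{value}' does not match {self.regex.pattern}")


def set_value(candidate, facets, is_list=False):
    """Apply every facet to the candidate, or to each item of a list value."""
    if candidate is None:
        raise ValueError("candidate value must not be null")
    elements = candidate.split() if is_list else [candidate]
    accepted = []
    for element in elements:
        for facet in facets:
            if isinstance(facet, WhitespaceFacet):
                element = facet.apply(element)  # normalise
            else:
                facet.validate(element)  # raises ValueError on failure
        accepted.append(element)
    return accepted if is_list else accepted[0]


facets = [WhitespaceFacet("collapse"), PatternFacet("[a-zA-Z]{2}")]
print(set_value("  UK ", facets))  # prints UK
```

Treating a single value as a one-element list lets both branches of the flow share the same inner loop.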
THE GENERATION PROCESS
Our generator consumes a set of one or more XSD Schemas and emits target
code, following the pattern set out below:
The individual phases of the generation process will now be described.
The Parser Phase
To have a consistent view of any input schema, our generator transforms the
standard XSD form into an intermediate form, much as a source code compiler
does. This intermediate form is essentially a tree with generator- and
element-specific leaves. A simple example of such an intermediate
representation is shown below for the SimpleType StringLength0to128:
XML
<xs:simpleType name="StringLength0to128">
<xs:annotation>
<xs:documentation xml:lang="en">Used for Character Strings, length 0 to 128.
</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:string">
<xs:minLength value="0"/>
<xs:maxLength value="128"/>
</xs:restriction>
</xs:simpleType>
Tree
+--------------------------
SimpleType:- StringLength0to128
Documentation:- [en] Used for Character Strings, length 0 to 128.
SimpleTypeRestriction:- baseType: xs:string
MinLengthRestriction: baseType: xs:string minLength: 0
MaxLengthRestriction: baseType: xs:string maxLength: 128
+--------------------------
The tree node textual representation given here is quite simple. Things get
somewhat more complex when we look at ComplexType and Element types where
various content models come into play.
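For readers who want to experiment, a comparable tree can be built with Python's standard xml.etree.ElementTree module. The sketch below is an approximation only: the tuple-based node shape and the function names are assumptions, and the labels it prints are simpler than those of the generator's own representation.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:simpleType name="StringLength0to128">
<xs:annotation>
<xs:documentation xml:lang="en">Used for Character Strings, length 0 to 128.
</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:string">
<xs:minLength value="0"/>
<xs:maxLength value="128"/>
</xs:restriction>
</xs:simpleType>
</xs:schema>"""


def simple_type_tree(node):
    """Build a nested (label, children) tuple for one xs:simpleType element."""
    children = []
    doc = node.find(f"{XS}annotation/{XS}documentation")
    if doc is not None:
        children.append((f"Documentation: {(doc.text or '').strip()}", []))
    restriction = node.find(f"{XS}restriction")
    if restriction is not None:
        # Each child of xs:restriction is a facet; strip the namespace prefix.
        facets = [
            (f"{facet.tag[len(XS):]}: {facet.get('value')}", [])
            for facet in restriction
        ]
        label = f"SimpleTypeRestriction: baseType {restriction.get('base')}"
        children.append((label, facets))
    return (f"SimpleType: {node.get('name')}", children)


def render(tree, depth=0):
    """Flatten the tree into indented text lines."""
    label, children = tree
    lines = ["  " * depth + label]
    for child in children:
        lines.extend(render(child, depth + 1))
    return lines


root = ET.fromstring(XSD)
tree = simple_type_tree(root.find(f"{XS}simpleType"))
print("\n".join(render(tree)))
```

The sketch handles only the restriction content model; list and union definitions would need their own branches.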
The Generation Phase
In the generation phase, the structure element tree is scanned, and the elements
are passed to the generator, which uses a target language-specific extension part to
emit the final form.
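The separation between the scan and the language-specific extension part might look like the Python sketch below. Everything in it is hypothetical: SimpleTypeNode stands in for a structure element, PythonEmitter for an extension part, and the emitted class is far simpler than a real generated type.

```python
from dataclasses import dataclass


@dataclass
class SimpleTypeNode:
    """Hypothetical structure element for a length-restricted string type."""

    name: str
    min_length: int
    max_length: int


class PythonEmitter:
    """Hypothetical language-specific extension that emits Python source."""

    def emit(self, node: SimpleTypeNode) -> str:
        return (
            f"class {node.name}:\n"
            f"    def __init__(self, value):\n"
            f"        if not {node.min_length} <= len(value) <= {node.max_length}:\n"
            f"            raise ValueError('length out of range')\n"
            f"        self.value = value\n"
        )


def generate(nodes, emitter):
    """Scan the structure elements and hand each one to the emitter."""
    return {node.name: emitter.emit(node) for node in nodes}


sources = generate([SimpleTypeNode("StringLength0to128", 0, 128)], PythonEmitter())
print(sources["StringLength0to128"])
```

Because generate only depends on the emitter having an emit method, supporting another target language amounts to supplying another emitter.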
Now that we have a broad overview of a generation process, we need to get specific:
hello to Schema Studio…
SCHEMA STUDIO
Schema Studio is the desktop application which currently hosts the W3C Schema
processing code base.
At start-up it looks as shown below:
The main view comprises a split panel. The left side, at this stage, displays
the root node of a Tree View which will show the silos of a Schema set yet to be
loaded, whilst the right side is a general output console window. On the far left of
the view is a set of currently docked Tab Windows which provide statistics on
the silo collection (yet to be displayed), the selected silo and the
selected Schema. Above the split panel is a menu bar giving access to several
operational menu items.
It can easily cater for large Schema sets, for example, the ISO20022 set which, in
the version V2022, contains 31 silos and a total of 1582 Schemas, as shown below:
The Schema collection provided by Open Travel in version V2014B, for example,
is organised into 9 silos which contain 275 Schemas. This structure is
described in a so-called Build Organisation file. Assuming we have already constructed
one (Schema Studio can do this task) and previously opened it in the
application, we can open it again via the “most recently used” menu item, as
indicated below:
Once the Build Organisation file is successfully read, the structure (silos and
Schemas) is displayed in the left-hand tool window, grouped by silo.
Selecting the root of the schema set in the tool window and opening the docked Silo
Collection Properties tool window, we see both the silo collection statistics as well as
the current configuration data, in this case as read from the Build Organisation file.
The configuration data is editable in the tool window directly, but can also be
changed in the Configuration Options panel, accessed via the menu selection
(File>Configuration Options…):
In both cases any changes are persisted back to the Build Organisation file.
As can be seen, the configuration data comprises:
1. the location where generated code will be written
2. the location of the application extension parts library which provides the
concrete generation code for the various Schema elements and target
development languages
3. the correct link to the W3cPrimitives library so that references in the
generated code to W3C primitives are correct
4. the correct version number of the W3C primitives
Hovering on the Schema element (air/OTA_SimpleTypes.xsd), we are shown the
top-level documentation as well as a count of the Schemas contained in the silo.
Expanding the (air/OTA_SimpleTypes.xsd) tree view element, we see the content as
well as a range of other related collections, e.g., “Include”, “Import” etc.
In this view the individual types defined in the Schema are shown along with their
corresponding XSD type, e.g., “Element”, “SimpleType”, “AttributeGroup” etc.
Scrolling down and selecting and then hovering on the SimpleType element
DateOrDateTimeType, we see the top-level documentation for the type definition:
And now, if we right-click on the tree view node, we see the context menu as shown
below:
Here the two contextual options are:
1. Show – if selected causes the selected item to be displayed, in XSD context,
in a separate tab view in the right-hand window
2. Generate – if selected causes the selected XSD item to be generated as code
in a form and in a location appropriate to the current configuration settings
If the “Generate” option is selected for the SimpleType “DateOrDateTimeType”, which
has a Union content model, the corresponding strong type class is generated, in the
location as specified in the configuration data, with the progress of the generation
process logged in the right-hand “Console” tab view where we see:
Apart from logging the basic statistics such as the appropriate silo and Schema, the
generation process signals where the emitted code will be written and the target file
name.
The logging output here also hints at the first pass which is involved in generation
when it shows “Building Structure Element Repository…”.
For information it also signals the time taken to perform the generation.
REMARKS
In this article, we have looked at several SimpleType definitions and their
corresponding C# classes as expressed in documentation. We have also set out an
example set of unit tests for these classes. SimpleType types are a fundamental
component of the type ecosystem related to W3C Schema definitions, so having a
reliable platform to generate appropriate strong types is a very positive step.
As noted in the Schema Studio section, the configuration of this application allows
us to specify if the generated classes have appropriate methods to
serialize/deserialize data. The topic of serialization/deserialization is deferred to a
later article.
A range of type generations are described in the Class Documentation Appendix, for
which corresponding tests are provided in the Unit Test Appendix.
APPENDIX: CLASS DOCUMENTATION
One of the key assets for a developer when using a class library is documentation.
This appendix gives an overview of the documentation available from the generated
types we saw in the above sections and Part 1 of this series.
It should be noted that the types we consider here are but a small subset of all the
types that can be generated from the OpenTravel/OTA/V2014B Schema set.
SimpleType Name            Content-Type                XSD Comment
DateOrDateTimeType         Union                       A construct to validate either a date or a dateTime value.
ListOfISO3166              List                        List of country codes in ISO 3166 format.
AlphaNumericStringLength1  Restriction/Pattern         Used for Alpha-Numeric Strings, length 1.
ActionType                 Restriction/Enumeration     Identifies an action to take place.
StateProvCodeType          Restriction/MinLength       The standard code or abbreviation for the state, province, or region.
ShortDescriptionType       Restriction/MinMaxLength    A textual description.
Money                      Restriction/FractionDigits  Specifies an amount, max 3 decimals.
APPENDIX: UNIT TESTS
As part of our generation process, we need to assert that the emitted classes meet
their specification. This we do by means of unit testing.
In this section we illustrate the tests that are applied to the classes noted in the
previous section.
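By way of illustration, a test of this kind might look like the following Python sketch, written with the standard unittest module. The Money class here is a hypothetical stand-in for the generated type (the real classes are C#), built only from the Schema definition of Money: an xs:decimal with fractionDigits 3.

```python
import unittest
from decimal import Decimal


class Money:
    """Hypothetical stand-in for a generated Money type."""

    MAX_FRACTION_DIGITS = 3

    def __init__(self, text: str):
        value = Decimal(text)
        if not value.is_finite():
            raise ValueError("xs:decimal has no NaN or infinite values")
        # Trailing zeros are ignored: the facet constrains the value itself.
        if -value.normalize().as_tuple().exponent > self.MAX_FRACTION_DIGITS:
            raise ValueError(f"more than 3 decimal places: {text}")
        self.value = value


class MoneyTests(unittest.TestCase):
    def test_accepts_up_to_three_decimals(self):
        self.assertEqual(Money("12.345").value, Decimal("12.345"))

    def test_rejects_four_decimals(self):
        with self.assertRaises(ValueError):
            Money("12.3456")
```

Each restriction facet gets at least one accepting and one rejecting case, so that a regression in the generator shows up as a failed test rather than as a silently weaker type.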
Index
DateOrDateTimeType
ListOfISO3166
AlphaNumericStringLength1
ActionType
StateProvCodeType
ShortDescriptionType
Money
AlphaNumericStringLength1
ActionType
Here we use a helper method to ensure that the Custom Attribute value for the
ActionType enumeration, “and”, is as expected.
This base value will be important when we come to look at the
serialization/deserialization process.