Simple and complex types in XML Schema
Posted in Programming on Mar 16, 2006
If you’ve worked much with XML Schema, or tried to read a schema, you’ve probably run into markup that refers to simple and complex types. These terms can be confusing. In this article I’ll explain what they mean in simple terminology, and point you to resources that can help you learn more.
Suppose I’m writing code to talk to a web service, and I’ve been getting error messages complaining about something called “email” being malformed when I try to invoke the getPreferences operation. I want to figure out exactly what the service expects me to send it. I open up the WSDL and search until I find the relevant definitions:
<element name="email">
  <simpleType>
    <restriction base="xsd:string" />
  </simpleType>
</element>
<element name="getPreferences">
  <complexType>
    <sequence>
      <element name="email" type="email" />
    </sequence>
  </complexType>
</element>
A WSDL file describes its data types with XML Schema, so I’m looking at an XML Schema fragment. But I’ve forgotten exactly what it means. What are simpleType and complexType again? Unless I work with schemas fairly regularly, I get confused about this, even though I’ve worked with them for many years. I always have to refresh my memory.
The most succinct answer is as follows:
In XML Schema, there is a basic difference between complex types which allow elements in their content and may carry attributes, and simple types which cannot have element content and cannot carry attributes.
That’s from the XML Schema Primer, which I highly recommend. I think it’s probably the best introduction to XML Schema.
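To make the distinction concrete, here is a minimal sketch of one simple and one complex element declaration. The names zipCode, address, street and country are made up for illustration and are not part of the web service discussed in this article.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Simple type: text only, no child elements, no attributes -->
  <xsd:element name="zipCode">
    <xsd:simpleType>
      <xsd:restriction base="xsd:string">
        <xsd:pattern value="[0-9]{5}"/>
      </xsd:restriction>
    </xsd:simpleType>
  </xsd:element>

  <!-- Complex type: may contain child elements and carry attributes -->
  <xsd:element name="address">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="street" type="xsd:string"/>
        <xsd:element ref="zipCode"/>
      </xsd:sequence>
      <xsd:attribute name="country" type="xsd:string"/>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```

An instance such as <address country="US"><street>1 Main St</street><zipCode>12345</zipCode></address> would fit this sketch: address needs a complex type because it has children and an attribute, while zipCode holds nothing but text.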
Now I know the web service is expecting an element that looks like the following:
<getPreferences>
  <email>firstname.lastname@example.org</email>
</getPreferences>
This is pretty simple. So why do I have to refresh my memory whenever I haven’t worked with schemas for a few months? Because the simple/complex distinction above describes an element’s type. There’s also something called content type, and some of the names are similar (and therefore confusing). Each element’s content is one of several content types:
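Simple content: the element contains only character data and no child elements. The text can be a single value, a whitespace-separated list such as blue green red, or a list that mixes kinds of values, such as blue #000 red for a set of colors.
Element-only content: the element contains child elements, with no character data other than whitespace between them.
Mixed content: the element contains character data interleaved with child elements.
Empty content: the element is empty (<foo/>) and either conveys information by just existing, or has attributes but no content.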
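The content types can be sketched in schema form as follows. This is only an illustration: the element names price, name, note and flag are invented for the example, and each declaration shows just one of several ways to get that content type.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Simple content: text only. The attribute is what forces a complexType here. -->
  <xsd:element name="price">
    <xsd:complexType>
      <xsd:simpleContent>
        <xsd:extension base="xsd:decimal">
          <xsd:attribute name="currency" type="xsd:string"/>
        </xsd:extension>
      </xsd:simpleContent>
    </xsd:complexType>
  </xsd:element>

  <!-- Element-only content: child elements, no text between them -->
  <xsd:element name="name">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="first" type="xsd:string"/>
        <xsd:element name="last" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

  <!-- Mixed content: text interleaved with child elements -->
  <xsd:element name="note">
    <xsd:complexType mixed="true">
      <xsd:sequence>
        <xsd:element name="em" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

  <!-- Empty content: no text and no children, only an optional attribute -->
  <xsd:element name="flag">
    <xsd:complexType>
      <xsd:attribute name="enabled" type="xsd:boolean"/>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```

Matching instances would be <price currency="USD">9.99</price>, <name><first>Ada</first><last>Lovelace</last></name>, <note>Call <em>today</em> please</note>, and <flag enabled="true"/>. Note that price has a complex type even though its content is simple, which is exactly where the two sets of terms start to blur.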
Just to clarify: elements have an element type, and their content has a content type. By the way, attributes can only have simple types, because they cannot themselves have attributes or children.
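Here is a sketch of an attribute whose value is more than a single string yet still has a simple type. The type names colorName, hexColor, color and colorList, and the palette element, are hypothetical. The attribute's type is a list whose items come from a union of two restricted string types.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Atomic simple type: one of a few color names -->
  <xsd:simpleType name="colorName">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="red"/>
      <xsd:enumeration value="green"/>
      <xsd:enumeration value="blue"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Atomic simple type: a three-digit hex color such as #000 -->
  <xsd:simpleType name="hexColor">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="#[0-9a-fA-F]{3}"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Union: a value may be either a color name or a hex color -->
  <xsd:simpleType name="color">
    <xsd:union memberTypes="colorName hexColor"/>
  </xsd:simpleType>

  <!-- List: whitespace-separated color values -->
  <xsd:simpleType name="colorList">
    <xsd:list itemType="color"/>
  </xsd:simpleType>

  <!-- The attribute uses the list type; the element itself has empty content -->
  <xsd:element name="palette">
    <xsd:complexType>
      <xsd:attribute name="colors" type="colorList"/>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```

With these definitions, <palette colors="blue #000 red"/> would be acceptable. However elaborate the list and union become, the attribute value is still plain text with no markup inside it, so its type remains simple.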
For further reading, I again heartily recommend the Primer linked above. Another good resource is Priscilla Walmsley’s Definitive XML Schema. She not only knows her stuff (she’s part of the W3C XML Schema Working Group), but she writes very well.