XML Best Practices

When setting up your XML, keep the following points in mind. They help you create documents with better structure and compatibility. In short:

  • Elements vs. attributes: element text is searchable; attributes are for meta-information.
  • Inline elements whose content is defined elsewhere make a text harder to search and translate.
  • Elements that belong together are best grouped in a parent element.
  • To ensure compatibility, encode your XML in UTF-8.
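As a minimal sketch of the encoding advice, a document can state its encoding in the XML declaration on the first line. The root element and its content here are hypothetical and only serve as illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical root element, for illustration only -->
<document>
  <paragraph>Café, naïve, 日本語</paragraph>
</document>
```

The declaration only announces the encoding; the file itself must also be saved as UTF-8 for the two to agree.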

Inline elements

Inline elements whose content is specified elsewhere make a text difficult to translate, because the translator is left with an inline element that cannot be translated.

Some text about something important that contains a word that is not actually
placed here, but elsewhere, either to ensure correct spelling or some other
value, like the <directors_name/>, or the <datethatthecompanywasfounded/>
in different translations.

For example, if you use <this_product_name/> here and define it somewhere else, translating this text will not always render the product name correctly, because in some languages nouns change form depending on their position in the sentence.

Perhaps it is what you want, but this is something to be aware of.

Searchable text

Text that needs to be searchable is best placed in an element rather than in an attribute. Attributes are best used for meta-information.
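The distinction could look like the sketch below. All element and attribute names are hypothetical: the text a reader might search for sits in elements, while the identifier, status and language are carried as attributes.

```xml
<!-- Hypothetical element and attribute names, for illustration only -->
<product id="p-1042" status="published" xml:lang="en">
  <!-- Searchable text lives in element content -->
  <name>Garden hose</name>
  <description>A flexible hose for watering the garden.</description>
</product>
```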

Grouping elements

When elements belong together, it is best to create an element that represents the group. For example, the HTML definition list element contains definition terms and definition data. These can occur in any order and in any number: you can start a definition list with a data field, followed by a number of terms and then some more data fields. An example of something that requires more structure is the following:

<header>my first header</header>
<paragraph>some text to explain what I meant with my header</paragraph>
<header>And another header</header>
<paragraph>here I go again</paragraph>

This is not very strict. We advise you to group your terms and data fields. As with items in a list, you would get a definition item, specified as a sequence of one or more terms followed by one or more definitions. This works much better for Xopus: when the format is unclear, the interface is unclear too, and the document is not easy to edit. The example above is more manageable in the following form:

<section>
  <header>my first header</header>
  <paragraph>some text to explain what I meant with my header</paragraph>
</section>
<section>
  <header>And another header</header>
  <paragraph>here I go again</paragraph>
</section>
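
The same grouping can be applied to the definition list mentioned earlier. The following is only a sketch: the element names are hypothetical and are not taken from HTML or from any particular schema. Each definition item holds one or more terms followed by one or more definitions.

```xml
<!-- Hypothetical element names, for illustration only -->
<definition_list>
  <definition_item>
    <term>XML</term>
    <term>Extensible Markup Language</term>
    <definition>A markup language for structured documents.</definition>
  </definition_item>
  <definition_item>
    <term>UTF-8</term>
    <definition>A variable-width encoding of Unicode.</definition>
  </definition_item>
</definition_list>
```

If the document type is described with a DTD, the content model for such an item could be written as (term+, definition+), which enforces both the order and the minimum number of each child.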