# XML

---

# XML

XML is a means of sharing data between computers. Humans can read and understand it, but its primary purpose is exchanging information between computers or systems. Because it does this in a structured, standardized way, data can pass from one system to another without ambiguity.

Think of the game "Telephone", where people pass a message along by whispering: the original message is easily distorted somewhere along the way. XML is a way to prevent data distortion.

---

# XML

XML is a relative of HTML

- XML is **extensible**: it does not consist of a fixed (limited) set of tags
- XML must be **well-formed** according to a defined specification
- XML documents can be **validated** against a specific schema (not just one global schema)
- XML is about the data (its contents) and not about how that data is presented

---

# XML structure

- XML should start with a **declaration** (optional in XML 1.0, but recommended)
- XML can have **namespaces**
- XML has **elements**
- XML can have **attributes**

---

# Namespaces

.center[![](/img/ns.png)]

---

# Document declaration

`<?xml version="1.0" encoding="UTF-8"?>`

This lets the software reading the file know that you are about to present it with XML. Computers don't always trust the file extension or other indicators that a document is what it says it is. The declaration says "I have prepared this XML, and this line at the top of the file lets you know that XML is coming through next."

---

# Simple Example

```
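<book> <!-- element names are assumed for illustration -->
  <title>My Father's Dragon</title>
  <author>Ruth Stiles Gannett</author>
  <genre>Children's</genre>
</book>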
```

---

# Example with attributes

```
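<film> <!-- element and attribute names are assumed for illustration -->
  <title lang="fr">Jeanne Dielman, 23, Quai du Commerce, 1080 Bruxelles</title>
  <director>Chantal Akerman</director>
</film>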
```

---

# How to expand?

```
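<book> <!-- element names are assumed for illustration -->
  <title>My Father's Dragon</title>
  <author>Ruth Stiles Gannett</author>
  <genre>Children's</genre>
</book>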
```

---

# Simple Example

```
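<book> <!-- element and attribute names are assumed for illustration -->
  <title lang="en">My Father's Dragon</title>
  <title lang="es">El Dragon de papa</title>
  <title lang="ja">エルマーの冒険</title>
  <author>Ruth Stiles Gannett</author>
  <illustrator>Ruth Chrisman Gannett</illustrator>
  ...
</book>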
```

---

# XML has rules

You must follow these rules when you create XML syntax:

- All XML elements must have a closing tag.
- XML tags are case sensitive.
- All XML elements must be properly nested.
- All XML documents must have a root element.
- Attribute values must always be quoted.

---

# All XML elements must have a closing tag

Not OK:

```
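<!-- Hypothetical illustration; the element names are assumptions.
     The title element is opened but never closed. -->
<book>
  <title>My Father's Dragon
</book>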
```

---

# XML tags are case sensitive

Not OK:

```
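<!-- Hypothetical illustration; the element names are assumptions.
     The opening tag is <Title> but the closing tag is </title>. -->
<Title>My Father's Dragon</title>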
```

---

# All XML elements must be properly nested.

Not OK:

```
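<!-- Hypothetical illustration; the element names are assumptions.
     title is opened inside book, so it has to close before book does. -->
<book><title>My Father's Dragon</book></title>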
```

---

# All XML documents must have a root element

```
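<!-- Hypothetical illustration; the element names are assumptions.
     A single root element, here catalog, wraps everything else. -->
<catalog>
  <book>
    <title>My Father's Dragon</title>
  </book>
  <film>
    <title>Jeanne Dielman, 23, Quai du Commerce, 1080 Bruxelles</title>
  </film>
</catalog>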
```

---

# Attribute values must always be quoted

Not OK:

```
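<!-- Hypothetical illustration; the element and attribute names are assumptions.
     The lang value is missing its quotes; it should be lang="en". -->
<title lang=en>My Father's Dragon</title>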
```

---

# What if I want to use characters that are already being used?

You will have to "escape" them by using other characters that are designed to represent what you want. Some examples:

- type `&amp;` to display the `&` character
- type `&lt;` to display the `<` character
- type `&gt;` to display the `>` character
- type `&quot;` to display the `"` character
- type `&apos;` to display the `'` character

(Type them without spaces, and include the semicolon!)

---

# Where else can you find XML?

XML as a way to exchange data is the foundation of, or is embedded in, several well-known file formats:

- SVG files
- PDF files (as embedded XMP metadata; the PDF format itself is not XML)
- Microsoft Office files (.docx, .xlsx, and .pptx)

---

# A note on validation

XML can be checked against the XML specification to make sure it is well-formed structured data, with none of the errors described above.

XML can also be validated against other XML files known as XSDs. XSD stands for XML Schema Definition. An XSD file describes the intended structure of a document, so validation checks not only that the document is well-formed but also that it adheres to more rigid requirements, such as the presence of specific fields or attributes, or a specific order.

---

# Also, DTDs

DTD stands for "Document Type Definition". It is another way to define a desired XML structure, but its syntax is different from XML itself (although it looks similar).

A document type declaration, the line that traditionally points to a DTD, can be seen at the top of every valid HTML page:

`<!DOCTYPE html>`

Note this is different from the XML declaration (`<?xml version="1.0" encoding="UTF-8"?>`) at the top of XML documents, and note how the structure is different.

---

# Additional Resources

- [Getting started with XML: A workshop](http://infomotions.com/musings/getting-started/getting-started.pdf)
- [xmllint](https://linux.die.net/man/1/xmllint)

---

# Learning more

- [Computers](/presentations/computers.html)
- [Data wrangling](/presentations/data-wrangling.html)
- [Metadata](/presentations/metadata.html)
- [JSON](/presentations/json.html)
- [XSLT](/presentations/xslt.html)

[Home](/)