Markup languages structure, annotate, and format textual information for electronic devices. From early GML to HTML and XML, they use tags for presentation and data exchange, evolving to meet diverse digital needs like web content, configuration, and data encoding.
Markup Languages: What They Are and Why They Matter
In the mid-1990s, as the Internet and the World Wide Web (WWW) became available for general use, web browsers became the essential launchpad for accessing documents on the web. Because browsers rendered documents written in Hypertext Markup Language (HTML), HTML came into wide use. However, HTML was not the first markup language. Markup languages have been used ever since textual information needed to be structured, annotated, and formatted so that it could be understood across electronic tools and devices.
Why Markup Language?
When electronic devices such as computers, printers, and display monitors became prevalent in presenting, processing, and describing textual documents and information, there was a need to develop an annotation language to mark up text.
To meet these new challenges, developers grappled with questions like:
- How do you present a textual document on a monitor, formatted with title/headers, etc.?
- How do you send the formatting information to a printer?
- How do you exchange textual information over the web while preserving the structure of the information and conveying the description of the information?
A markup language was the answer to all of these questions. A markup language serves one or more of three functions:
- Presentation Language
- Description Language
- Processing Instructions Language
First of a Kind
The IBM Generalized Markup Language (GML) was one of the first markup languages to be used widely to structure and format textual information. It became the one-markup-fits-all that could be used to format the same document for various devices: screen, dot-matrix printer, and laser printer. IBM used GML to develop its Information Structure Identification Language (ISIL) publishing tool for generating documentation. GML later became the basis for the Standard Generalized Markup Language (SGML), which the ISO published as a standard (ISO 8879) in 1986.
Let’s Use Tags
Most markup languages, from the early GML and SGML to the later HTML and XML, use tags to describe, annotate, structure, and format textual information on electronic devices. SGML, the standard language for defining markup languages, provides a syntax based on elements. An element typically consists of a start tag <some-element>, its content, and an end tag </some-element>. HTML was originally defined as an application of SGML (HTML5 defines its own syntax) and uses tags to describe a document’s title, paragraphs, and headings. A web browser uses these tags to display a document. Extensible Markup Language (XML), a simplified subset of SGML, is widely used to exchange information over the web using documents consisting of tags.
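As a minimal sketch of the element syntax described above, the following Python snippet parses a small XML document with the standard library's xml.etree.ElementTree module. The element names (note, title, body) and the helper function describe are made up for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: each element has a start tag, content, and an end tag.
DOCUMENT = "<note><title>Reminder</title><body>Review the draft.</body></note>"


def describe(xml_text):
    """Return (tag, text) pairs for the direct children of the root element."""
    root = ET.fromstring(xml_text)
    return [(child.tag, child.text) for child in root]


if __name__ == "__main__":
    for tag, text in describe(DOCUMENT):
        print(f"{tag}: {text}")
```

The tags carry the structure (which text is the title, which is the body), while the program decides what to do with each piece.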
Evolving Standards
Markup languages can serve different functions, and the specific standard for each markup language evolves accordingly. The World Wide Web Consortium (W3C) develops and maintains standards for markup languages used on the Web, including XML; HTML, long a W3C standard, is now maintained as a living standard by the WHATWG. To make HTML easier to write, its standard allows certain tags to be omitted. For example, the html element’s start tag <html> may be omitted if the first thing inside the html element is not a comment, and its end tag </html> may be omitted if the element is not immediately followed by a comment. A p element, used to add a text paragraph, may omit the end tag </p> in certain contexts, for example when it is immediately followed by another p element. XML, however, has no provision for omitting end tags: every start tag must have a matching end tag. Both XML and HTML support empty elements, with slightly different syntax. In XML, an empty element can be written with the self-closing syntax <some-tag />. In HTML5, void elements such as <br> don't need the slash; a trailing slash is permitted (e.g., <img ... />) but has no effect.
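The difference in strictness can be sketched with Python's standard library, under the assumption that html.parser and xml.etree.ElementTree are reasonable stand-ins for an HTML tokenizer and an XML parser. The fragment and the helper names (TagCollector, collect_tags, is_well_formed_xml) are hypothetical. Note that html.parser only reports the tags it sees; it does not build a document tree or infer the omitted end tags the way a browser does.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record the start and end tags encountered in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.start_tags = []
        self.end_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)

    def handle_endtag(self, tag):
        self.end_tags.append(tag)


def collect_tags(html_text):
    """Tokenize html_text and return (start_tags, end_tags)."""
    collector = TagCollector()
    collector.feed(html_text)
    collector.close()
    return collector.start_tags, collector.end_tags


def is_well_formed_xml(text):
    """Return True if text parses as XML, False on a parse error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical fragment: both </p> end tags are omitted, which HTML permits here.
FRAGMENT = "<div><p>First paragraph<p>Second paragraph</div>"

if __name__ == "__main__":
    print(collect_tags(FRAGMENT))        # HTML tokenizer accepts the input
    print(is_well_formed_xml(FRAGMENT))  # XML parser rejects the same input
```

The same text that an HTML tool accepts is rejected by the XML parser until every p element is explicitly closed.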
Yet Another Markup Language
Markup languages provide a predictable structure and format for textual information. Schemas can be used to require a document to follow a specific structure; an XML Schema is one way of specifying such structural conformity. Because of this versatility, markup languages are used in all types of applications, both web and non-web, including the exchange of structured information, web services, and instructions for programs (e.g., Program Call Markup Language). Some modern-day languages, such as YAML (YAML Ain't Markup Language, originally Yet Another Markup Language), differ significantly from the SGML-style tag syntax but are nevertheless designed for information and data encoding. YAML is used for creating configuration files and for storing and transmitting data.
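The idea of requiring a document to follow a specific structure can be illustrated with a deliberately simplified check. This is not XML Schema validation, which Python's standard library does not provide; it is only a sketch in which the required root and child element names (note, title, body) and the function conforms are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical structural rule: a <note> root with exactly these children, in order.
REQUIRED_ROOT = "note"
REQUIRED_CHILDREN = ["title", "body"]


def conforms(xml_text):
    """Return True if xml_text is well-formed and matches the rule above."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False  # not even well-formed XML
    child_tags = [child.tag for child in root]
    return root.tag == REQUIRED_ROOT and child_tags == REQUIRED_CHILDREN


if __name__ == "__main__":
    print(conforms("<note><title>Hi</title><body>Text</body></note>"))
    print(conforms("<note><body>Text</body></note>"))
```

A real schema language expresses rules like these declaratively, so the same rules can be shared between the producer and the consumer of a document.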