D-Type Text Layout Extension

A simple extension for complex Unicode scripts.

Commonly used scripts such as Latin, Greek or Cyrillic are easy to display. All you need to do is render their characters in a simple linear progression from left to right, and the resulting text is correctly displayed. Unfortunately, not all of the world’s scripts are that simple. Many scripts, just to be displayed correctly, require special processing such as character reordering, contextual shaping, ligatures, positioning adjustments, etc. These scripts are known as complex Unicode scripts. Arabic, Indic and Thai are among those scripts. Even Latin scripts can be considered “complex” in some contexts, as they often use ligatures and various types of positioning adjustments (e.g. kerning) to enhance the appearance of displayed text.

The Unicode Standard alone does not help software developers with the task of laying out text. Unicode deals with the units of textual content (characters) and provides a good solution for the computer representation, storage and interchange of text. However, Unicode does not address the units of textual display (glyphs) and does not provide a solution to the problem of actual text layout, shaping, and advanced typography. Obviously, a global, efficient and portable Unicode-based text layout/shaping engine is necessary to help developers with this quite challenging task.[1]

Challenges

To better understand the problems that layout and shaping engines must overcome, here are some of the complications associated with the display of various world scripts:

Directionality

Arabic and Hebrew are read from right to left. Consequently, the order of characters in presentation differs from that in storage. Character positioning, cursor movement and text selection in a bidirectional context (one in which left-to-right and right-to-left text runs coexist) are typically the biggest challenges to overcome. The characters are not laid out in a simple linear progression from left to right. In other words, the logical order of characters (the order in which the user enters text as a sequence of keystrokes) can be different from the visual order (the order in which glyphs are presented to the user).

FIG 1 In bidirectional text the trailing edge of one character is not necessarily adjacent to the leading edge of the next character. The above example shows one logically contiguous selection of characters (but visually disjointed).
FIG 2 Another bidirectional text example. This text is read right-to-left except the date, which is read left-to-right.
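
The difference between logical and visual order can be made concrete with a small sketch. The Python code below is a deliberately simplified illustration, not the Unicode Bidirectional Algorithm and not what D-Type or HarfBuzz does internally. It assumes a left-to-right paragraph, handles only letters and spaces, and ignores digits, punctuation, mirroring and explicit embedding controls. The function name visual_order is hypothetical.

```python
import unicodedata


def visual_order(logical: str) -> list[int]:
    """Return logical character indices in left-to-right display order.

    Simplified sketch: left-to-right paragraph, letters and spaces only.
    """
    # Step 1: classify each character as RTL (True), LTR (False) or neutral (None).
    kinds = []
    for ch in logical:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):
            kinds.append(True)
        elif bidi_class == "L":
            kinds.append(False)
        else:
            kinds.append(None)

    # Step 2: a neutral joins an RTL run only when the nearest strong
    # characters on both sides are RTL; otherwise it follows the LTR paragraph.
    n = len(kinds)
    resolved = []
    for i, kind in enumerate(kinds):
        if kind is None:
            before = next((kinds[j] for j in range(i - 1, -1, -1) if kinds[j] is not None), False)
            after = next((kinds[j] for j in range(i + 1, n) if kinds[j] is not None), False)
            kind = before and after
        resolved.append(kind)

    # Step 3: emit each maximal run; RTL runs are emitted in reverse.
    order = []
    i = 0
    while i < n:
        j = i
        while j < n and resolved[j] == resolved[i]:
            j += 1
        run = list(range(i, j))
        order.extend(reversed(run) if resolved[i] else run)
        i = j
    return order


# "abc", three Hebrew letters, "def": the Hebrew run is displayed reversed.
sample = "abc \u05D0\u05D1\u05D2 def"
print(visual_order(sample))
```

In the returned list, neighbouring entries are visually adjacent but not always logically adjacent, which is why a logically contiguous selection can appear visually disjointed.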

Contextual Forms

Arabic scripts are not only read from right to left; they also require special processing to display contextual forms properly. For example, the visual appearance of a character in Arabic can change significantly depending on its position within a word and the characters that surround it. Most (but not all) characters have four different visual forms: isolated (when the character stands alone), initial (at the beginning of a word), middle (within the word) and final (at the end of a word).[2] This means that layout and shaping engines must not only shape those forms properly but also detect word boundaries within a given run of text.[3]

FIG 3 The above example shows an Arabic text sample without any special processing (in which characters are in their isolated form) and then the same text sample again with contextual shaping enabled (in which characters take their proper form depending on whether they are at the beginning, in the middle or at the end of the word).
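
The four visual forms can be inspected directly, because Unicode keeps compatibility code points for them in the Arabic Presentation Forms-B block. The sketch below uses only Python's standard unicodedata module to list the forms recorded for a given letter. It is an illustration only: shaping engines normally select these forms through font layout tables rather than through these compatibility code points, and the helper name contextual_forms is hypothetical.

```python
import unicodedata


def contextual_forms(base: str) -> dict[str, str]:
    """Map form name (isolated, final, initial, medial) to the presentation
    form character that Unicode records for a single Arabic letter."""
    forms = {}
    # Arabic Presentation Forms-B: U+FE70..U+FEFC
    for cp in range(0xFE70, 0xFEFD):
        # Decomposition looks like "<initial> 0628"; unassigned code points give "".
        parts = unicodedata.decomposition(chr(cp)).split()
        if len(parts) == 2 and int(parts[1], 16) == ord(base):
            forms[parts[0].strip("<>")] = chr(cp)
    return forms


# BEH (U+0628) has all four forms; ALEF (U+0627) has only two.
for letter in ("\u0628", "\u0627"):
    found = contextual_forms(letter)
    print(unicodedata.name(letter), sorted(found))
```

The ALEF result shows why the text says "most (but not all)": some letters never connect to the following letter and therefore have no initial or medial form.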

Ligatures

With Latin, Greek, Cyrillic and even Chinese, Korean, Japanese (CJK) scripts, there is often a direct one-to-one mapping between a character and its glyph. However, in Arabic, Indic and other complex scripts, several characters can combine to form a whole new glyph. These special glyphs are called ligatures. Although Latin scripts can also utilize ligatures, most Latin ligatures are optional and designed to improve the aesthetic appearance of certain character combinations. In contrast, in Arabic and many other complex scripts, certain ligatures are mandatory. In those cases, it is unacceptable to present certain character combinations without using the appropriate ligature.

FIG 4 Ligatures are not only used in complex scripts such as Arabic but sometimes in Latin scripts too. The first ligature in the above illustration is the Arabic Lam-Alef ligature which is mandatory for Arabic scripts. The remaining ligatures are some of the Latin standard and discretionary ligatures.
FIG 5 Ligatures in Latin scripts are usually optional.
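
Ligature formation can be pictured as a longest-match substitution over the character sequence. The following Python sketch is a toy model under stated assumptions: the ligature table and glyph names (f_f_i, lam_alef and so on) are invented for illustration, whereas a real shaping engine reads its substitutions from the font's layout tables. Each output glyph also carries the index of the first character it came from, so that a glyph can be mapped back to the input text.

```python
# Hypothetical ligature table: character sequence -> made-up glyph name.
LIGATURES = {
    ("f", "f", "i"): "f_f_i",
    ("f", "i"): "f_i",
    ("f", "l"): "f_l",
    ("\u0644", "\u0627"): "lam_alef",  # Arabic LAM followed by ALEF
}
MAX_LEN = max(len(key) for key in LIGATURES)


def substitute(chars: str) -> list[tuple[str, int]]:
    """Return (glyph name, index of first source character) pairs,
    preferring the longest ligature that matches at each position."""
    glyphs = []
    i = 0
    while i < len(chars):
        # Try the longest candidate first, down to two characters.
        for n in range(min(MAX_LEN, len(chars) - i), 1, -1):
            key = tuple(chars[i:i + n])
            if key in LIGATURES:
                glyphs.append((LIGATURES[key], i))
                i += n
                break
        else:
            # No ligature starts here: one character maps to one glyph.
            glyphs.append((chars[i], i))
            i += 1
    return glyphs


print(substitute("office"))
print(substitute("\u0644\u0627"))
```

Six characters in "office" become four glyphs, and the two Arabic characters become a single glyph, so character counts and glyph counts cannot be assumed equal.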

Glyph Reordering

The South Asian family of scripts (Indic) exhibits rendering complications that are not found in any other script. Letters are drawn in a different order from that in which they are typed or stored in memory; glyphs are inserted or rearranged; and complex ligatures are formed. The amount of pre-processing necessary to convert a series of Unicode Devanagari characters into a series of glyphs is extensive. It should therefore come as no surprise that the Unicode Standard had to dedicate more than one hundred pages[4] just to describing the proper processing of Devanagari characters.

FIG 6 Contextual shaping for Indic scripts must deal with complex glyph rearrangements and ligatures.
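
One small, well-known case of reordering is the Devanagari vowel sign I (U+093F): it is stored after its consonant but drawn to the left of it. The Python sketch below handles only that one sign, plus simple consonant clusters joined by the virama (U+094D). It is a minimal illustration under those assumptions and covers only a tiny fraction of what a real Indic shaper does (no reph, no conjunct ligatures, no other scripts). The function name visual_order is hypothetical.

```python
VIRAMA = "\u094D"          # DEVANAGARI SIGN VIRAMA
PRE_BASE_MATRA = "\u093F"  # DEVANAGARI VOWEL SIGN I, drawn before the cluster


def is_consonant(ch: str) -> bool:
    # Core Devanagari consonants KA..HA
    return "\u0915" <= ch <= "\u0939"


def visual_order(text: str) -> list[str]:
    """Move each vowel sign I in front of the consonant cluster it follows."""
    out = []
    for ch in text:
        if ch == PRE_BASE_MATRA and out:
            i = len(out) - 1  # position of the consonant just before the sign
            # Walk back over "consonant + virama" pairs belonging to the same cluster.
            while i >= 2 and out[i - 1] == VIRAMA and is_consonant(out[i - 2]):
                i -= 2
            out.insert(i, ch)
        else:
            out.append(ch)
    return out


# KA + vowel sign I: stored as [KA, I], drawn as [I, KA]
print([f"U+{ord(c):04X}" for c in visual_order("\u0915\u093F")])
```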

Multiple Code Points

The challenge with contextual shaping is that a given character, for all its various glyph forms, usually has only one defined code point in the Unicode Standard. Similarly, ligatures often do not have a Unicode code point.[5] It is the responsibility of the layout and shaping engine to determine, at run time and depending on the context, the appropriate visual form of each character in the text.

Kerning

Kerning is a typographic adjustment that modifies the spacing between specific pairs of characters to enhance the overall appearance and readability of displayed text. While optional, kerning is commonly applied in Latin, Cyrillic and Greek scripts and is a standard practice in professional typesetting. Proper kerning can significantly improve the visual harmony of text, making it more aesthetically pleasing and easier to read.

FIG 7 Text layout with kerning.
FIG 8 Text layout without kerning.
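
The effect of kerning on glyph positions can be shown with simple arithmetic. In the Python sketch below, the advance widths and kerning values are made-up numbers in arbitrary font units, not data from any real font; a real engine reads them from the font's layout tables. The function name layout is hypothetical.

```python
# Hypothetical metrics in font units (illustrative values only).
ADVANCE = {"A": 600, "V": 600, "T": 550, "o": 500}
KERN = {("A", "V"): -80, ("V", "A"): -80, ("T", "o"): -70}


def layout(text: str, kerning: bool = True) -> tuple[list[int], int]:
    """Return the x position of each glyph and the total line width."""
    x = 0
    positions = []
    prev = None
    for ch in text:
        if kerning and prev is not None:
            # Adjust the pen position for this specific pair, if listed.
            x += KERN.get((prev, ch), 0)
        positions.append(x)
        x += ADVANCE[ch]
        prev = ch
    return positions, x


print(layout("AVA", kerning=True))   # ([0, 520, 1040], 1640)
print(layout("AVA", kerning=False))  # ([0, 600, 1200], 1800)
```

With these sample values, kerning pulls each glyph of "AVA" 80 units closer to its neighbour, shortening the line from 1800 to 1640 units.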

Notes

Solution

D-Type Text Layout Extension, thanks to the underlying HarfBuzz text shaping engine, solves all of these problems in a simple and straightforward way. All complex script rendering is performed in a uniform and consistent manner. The application is responsible for supplying the Text Layout Extension with an array of Unicode character codes in reading or logical order, while the extension returns an array of glyphs to display in the correct visual order, along with the coordinates necessary to properly position those glyphs. Additionally, it provides character indices to map each glyph back to the input text array. These positioned glyphs can then be easily rendered using D-Type Font Engine.
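
To illustrate why the returned character indices matter, here is a minimal Python sketch of how an application might use such output for hit testing (finding which input character lies under a given x coordinate). The ShapedGlyph record, its field names and the sample values are assumptions made for this example; they are not the D-Type API or its actual data layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ShapedGlyph:
    """Hypothetical record for one positioned glyph (not the D-Type API)."""
    glyph_id: int     # glyph index in the font
    x: float          # horizontal position of the glyph origin
    y: float          # vertical position of the glyph origin
    advance: float    # horizontal space the glyph occupies
    char_index: int   # index of the source character in the input array


def char_index_at(glyphs: list[ShapedGlyph], x: float) -> Optional[int]:
    """Return the input character index under coordinate x, or None."""
    for g in glyphs:
        if g.x <= x < g.x + g.advance:
            return g.char_index
    return None


# Made-up result for a four-character input: two left-to-right characters
# (indices 0, 1) followed by two right-to-left characters (indices 2, 3).
# In visual order the right-to-left pair appears reversed.
shaped = [
    ShapedGlyph(glyph_id=36, x=0, y=0, advance=10, char_index=0),
    ShapedGlyph(glyph_id=37, x=10, y=0, advance=10, char_index=1),
    ShapedGlyph(glyph_id=91, x=20, y=0, advance=10, char_index=3),
    ShapedGlyph(glyph_id=90, x=30, y=0, advance=10, char_index=2),
]
print(char_index_at(shaped, 25))  # 3
```

Without the character index attached to each glyph, a click at x = 25 could not be traced back to the fourth character of the input, because the third glyph on screen is not the third character in memory.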

The benefit of this approach is that software developers do not have to be familiar with the various complex scripts or any of the shaping rules that might be applicable to each script. Regardless of the script, the Text Layout Extension is always utilized in the same consistent way. It is only important to be aware of the following basic concepts:

As mentioned above, D-Type Text Layout Extension internally relies on the HarfBuzz text shaping engine, a popular open-source, portable and platform-independent layout engine capable of shaping many complex Unicode scripts, including Arabic, Bengali, Devanagari, Gujarati, Gurmukhi, Han, Hebrew, Kannada, Malayalam, Oriya, Tamil, Telugu and Thai. The HarfBuzz text shaping engine uses layout tables found in font files and the knowledge of generic script shaping rules to lay out complex scripts.

D-Type Text Layout Extension is an extension of D-Type Font Engine that takes care of all the font-specific tasks and all interaction with the HarfBuzz text shaping engine. Software developers can use this one simple extension to display all supported complex scripts without writing their own font access interfaces and without interfacing with the HarfBuzz text shaping engine directly.

Benefits

For software developers who use or plan to use D-Type rendering technology, D-Type Text Layout Extension brings the following benefits:

  • No need to access fonts. Developers don’t have to manage or access the font files themselves. D-Type Text Layout Extension uses the same font IDs as D-Type Font Engine.

  • Caching of font layout tables. D-Type Text Layout Extension caches frequently used layout tables found in font files, ensuring that subsequent access to the same tables is efficient and quick.

  • Caching of layout instances. D-Type Text Layout Extension caches layout instances for various complex scripts, allowing the same shaping rules to be applied to different text runs quickly and efficiently.

  • Small, compact, portable. The entire D-Type Text Layout Extension, which includes the latest HarfBuzz text shaping engine, font access interfaces and the caching sub-system, fits in approximately 600 to 800 KB of machine code, depending on the platform.

  • Easy, single package solution. All you need to render the world’s complex scripts is D-Type Font Engine and D-Type Text Layout Extension. Together, these two libraries function as a single library.
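
The idea behind caching font layout tables, mentioned in the list above, can be sketched in a few lines. The Python example below is a generic illustration using the standard functools.lru_cache decorator, keyed by font ID and table tag. The function get_layout_table, the cache size and the fake table contents are all assumptions made for this example and say nothing about how D-Type implements its cache.

```python
from functools import lru_cache

# Records every simulated "expensive" load so cache hits can be observed.
loads: list[tuple[int, str]] = []


@lru_cache(maxsize=64)
def get_layout_table(font_id: int, tag: str) -> bytes:
    """Hypothetical loader: pretend to read a layout table from a font."""
    loads.append((font_id, tag))  # only runs on a cache miss
    return f"<{tag} table of font {font_id}>".encode()
```

Calling get_layout_table(1, "GSUB") twice triggers only one load; the second call is answered from the cache, which is the behaviour the bullet on layout table caching describes.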

The most recent D-Type Text Layout Extension includes HarfBuzz text shaping engine 11.2.1, which was released in July 2025. As new HarfBuzz text shaping engine releases become available, the Text Layout Extension will be updated to support the most recent version.

Key Specifications

Text Shaping Engine

  • HarfBuzz

Supported Font Layout Tables

  • OpenType
  • AAT (Apple Advanced Typography)

Algorithms

  • Unicode Bidirectional Algorithm (BiDi)
  • Line Breaker (for some scripts)
  • Hyphenator (for Latin-based languages)
  • Caching of Font Layout Tables
  • Caching of Layout Instances

Input Text Encoding Schemes

  • ANSI
  • UTF-8
  • UCS-2 Big Endian
  • UCS-2 Little Endian
  • UCS-4 Little Endian
  • UCS-4 Big Endian
  • UTF-16 Little Endian
  • UTF-16 Big Endian
  • UTF-32 Little Endian
  • UTF-32 Big Endian
  • Auto-detect
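
How the extension's auto-detect mode works is not described here. One common technique for telling these encodings apart is to look for a byte order mark (BOM) at the start of the data, and the Python sketch below shows that technique only. The function names and the UTF-8 fallback are assumptions made for this example.

```python
import codecs

# Longer BOMs first: the UTF-32 LE mark begins with the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_encoding(data: bytes, default: str = "utf-8") -> tuple[str, int]:
    """Return (encoding name, BOM length) based on a leading byte order mark."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return default, 0  # no BOM found: fall back to the assumed default


def decode(data: bytes) -> str:
    """Strip the BOM, if any, and decode to a string of code points."""
    name, skip = detect_encoding(data)
    return data[skip:].decode(name)


raw = codecs.BOM_UTF16_LE + "\u0645\u0631\u062D\u0628\u0627".encode("utf-16-le")
print(detect_encoding(raw))
```

Data without a BOM cannot be classified this way, so a fallback encoding (or an explicit choice from the list above) is still needed in practice.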

Supported Unicode Scripts

  • Arabic
    • Arabic
    • N’Ko
    • Syriac
  • Indic
    • Devanagari
    • Bengali
    • Gujarati
    • Gurmukhi
    • Kannada
    • Malayalam
    • Oriya
    • Tamil
    • Telugu
  • CJK (Chinese, Japanese, Korean)
    • Han (CJK Unified Ideographs)
    • Hiragana
    • Katakana
    • Hangul
  • Thai and Lao
  • Khmer
  • Myanmar
  • Tibetan
  • Hebrew
  • Georgian
  • Armenian
  • Greek
  • Cyrillic
  • Latin
  • Universal Shaping Engine (USE): Covers complex scripts not explicitly listed above.
  • Default Shaping Model: Includes Tifinagh and many other non-complex scripts.
  • Emoji: Supports emoji modifier sequences, flag sequences, and Zero Width Joiner (ZWJ) sequences.

This is not an exhaustive list; it is also possible that future versions of the HarfBuzz text shaping engine may add support for additional Unicode scripts.

Output Text Direction

  • Left-to-Right
  • Right-to-Left
  • Top-to-Bottom
  • Bottom-to-Top

Dependencies

D-Type Font Engine

Availability

Static or shared (dynamically linked) library for:

  • Microsoft Windows (all versions, both Intel and ARM based)
  • macOS (all versions, both Intel and ARM based)
  • Linux (all modern distributions, both Intel and ARM based)
  • BSD (FreeBSD, NetBSD, OpenBSD)
  • Raspberry Pi
  • Android
  • iOS
  • Xbox
  • Custom builds (32-bit and 64-bit architectures)

See Platforms and Portability for details.


Screenshots

Here are a few screenshots that show D-Type Text Layout Extension in action.

FIG 9 Unicode scripts with ligatures
FIG 10 Bidirectional scripts

Need More Information?

If you have a question about D-Type technology that you can’t find the answer to, please use our Obtain Additional Information form. We will publish your question along with our response within a few days and notify you once the answer is available on our website.

Additionally, you may find it helpful to explore the history of D-Type releases and review the D-Type News page.

Get Started Now Using D-Type

Available in binary, object, and/or source code format for any hardware or operating system environment, D-Type technology is an excellent choice for software developers seeking a rendering solution that is affordable, mature, reliable, secure, well-maintained, well-supported, super-fast and packed with features.


Copyright © 1996-2025 D-Type Solutions. Last updated on August 22, 2025.