
CTS 5 - Fallback fonts (font glyph substitution)

*

Offline sciurius

  • Jr. Member
  • **
  • 67
    • View Profile
    • Website
CTS 5 - Fallback fonts (font glyph substitution)
« March 17, 2017, 04:40:39 AM »
I'm not sure if this is the right thread, but anyway...

In this modern Unicode era we run into the problem that there are many more symbols to be shown than are present in a particular font. For example, when dealing with music I want to show a sharp ♯, flat ♭, delta Δ, and so on. I notice that many applications, like the web browser I'm typing this in, use font glyph substitution using fallback fonts if necessary. The glyphs are borrowed from a different font. Google has developed free fonts (the Noto-fonts) to complement its standard font (Roboto) with virtually all glyphs defined in Unicode.

I think supporting fallback fonts is a must for a future-proof PDF::API2.

Update: added CTS 5 tracking number
« Last Edit: March 17, 2017, 09:25:14 AM by Phil »

*

Offline Phil

  • Global Moderator
  • Hero Member
  • *****
  • 619
    • View Profile
Re: CTS 5 - Fallback fonts (font glyph substitution)
« Reply #1: March 17, 2017, 09:24:13 AM »
Indeed, this would be the right place to request this feature, and it's a good one. Browsers have done this for quite a while, first looking for a glyph in the first requested font-family, then the second, etc. until the default fallback. Actually, come to think of it, I'm not sure what happens if you explicitly list one or more font-families in CSS and the glyph is not found in any of them — does a browser try its own list of fonts in a desperate attempt to find something instead of a tofu? For PDF generation, that might be something configurable.
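The lookup order described above (first requested family, then the second, and so on, with an optional configurable last resort) can be sketched roughly as follows. This is only an illustration in Python, not PDF::API2 or PDF::Builder code: the font names, the `COVERAGE` table, and the `pick_font` helper are all hypothetical, and a real implementation would read coverage from the font's character map instead of a hand-written set.

```python
from typing import Dict, List, Optional, Set

# Hypothetical coverage tables: font name -> code points it has glyphs for.
# A real implementation would build these from each font's cmap.
COVERAGE: Dict[str, Set[int]] = {
    "BodyFont": set(range(0x20, 0x7F)),        # printable ASCII only
    "SymbolFont": {0x266D, 0x266F},            # flat and sharp signs
    "GreekFont": set(range(0x0391, 0x03CA)),   # basic Greek letters
}


def pick_font(ch: str, requested: List[str],
              last_resort: Optional[str] = None) -> Optional[str]:
    """Return the first font in `requested` that covers `ch`.

    If none does, try the optional `last_resort` font. Returning None
    means the caller has to draw a placeholder (a "tofu") or give up.
    """
    cp = ord(ch)
    for name in requested:
        if cp in COVERAGE.get(name, set()):
            return name
    if last_resort is not None and cp in COVERAGE.get(last_resort, set()):
        return last_resort
    return None


if __name__ == "__main__":
    for ch in "A\u266f\u0394":
        print(repr(ch), pick_font(ch, ["BodyFont", "SymbolFont"]),
              pick_font(ch, ["BodyFont", "SymbolFont"], last_resort="GreekFont"))
```

Making `last_resort` an explicit argument is one way to keep the "try something else before showing a tofu" behaviour configurable rather than built in.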

What is the industry standard for mapping a requested (but missing) glyph to something else, especially if they don't happen to have the same CId or Unicode code point or a standard name? The original encoding would say what the character is, but in a fallback font it could be at a different code point or under a different name or ID.

I see you're having some fun with the entity code buttons/BBCodes. They're not standard SMF, but a "mod".

*

Offline sciurius

Re: CTS 5 - Fallback fonts (font glyph substitution)
« Reply #2: March 17, 2017, 03:00:43 PM »
Industry standard:

HarfBuzz is an OpenType text shaping engine.

The current HarfBuzz codebase [...] is stable and under active maintenance. This is what is used in latest versions of Firefox, GNOME, ChromeOS, Chrome, LibreOffice, XeTeX, Android, and KDE, among other places.

[...] fun with the entity code buttons/BBCodes:

Do I? I just typed the characters using my compose key (Compose + # + # → ♯, Compose + # + b → ♭, Compose + Compose + g + D → Δ; all right, I created the entry for the last one myself in my private .XCompose).

*

Offline Phil

Re: CTS 5 - Fallback fonts (font glyph substitution)
« Reply #3: March 18, 2017, 11:26:14 AM »
That compose key sounds pretty powerful! Presumably it's configured to put out certain Unicode sequences, and doesn't handle everything. I don't have a compose key, but in this forum I can do ♯ ([entn]9839[/entn]), ♭ ([entn]9837[/entn]), and Δ ([ent]Delta[/ent]). Of course, that doesn't help me if I want to enter these characters into a file… but I can copy and paste from the forum into a UTF-8 file.

*

Offline sciurius

Re: CTS 5 - Fallback fonts (font glyph substitution)
« Reply #4: March 19, 2017, 11:24:00 AM »
Depending on your desktop, you can designate one of the keys to be Compose.
This is from the keyboard preferences of the MATE desktop:

*

Offline Phil

Re: CTS 5 - Fallback fonts (font glyph substitution)
« Reply #5: December 25, 2017, 02:41:17 PM »
 PhilterPaper commented on Oct 25

See also https://www.catskilltech.com/forum/pdf-builder-general-discussions/font-families/. This is a somewhat related discussion for PDF::Builder knowing what font variants are available for each supported font, and being able to specify font variants (bold, italic, small caps, etc.) in a simple and consistent way. It might even involve going to another font, or building a "synthetic font", to get unsupported variants. Someone would have to specify a list of fallback fonts to look at, before either crudely synthesizing a variant (e.g., font size change for small caps, or overprinting (with offset) for bold) or just giving up.

 PhilterPaper commented on Oct 25

One thing to keep in mind is that a requested glyph is going to be known only by its code point and encoding (xFF in Latin-1 is different than xFF in CP1253, and isn't valid in UTF-8). At that point (knowing the output encoding of a text string), each character could be checked against a list of supported glyphs for a font, and a request to generate an alternate font could be made. Unfortunately, this sounds like it could be quite costly in time and/or space, but that may be the price to pay, especially if not all operating systems offer utilities to do this. Another complication would be that for some glyphs, you might want one particular fallback font, while for others, you might desire a different fallback font -- a global list of fonts may not produce the best results.
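To make the "code point plus encoding" remark concrete, here is a small Python sketch using only the standard codecs. It uses byte 0xE1 for the three-way comparison, since that byte maps to a defined character in both Latin-1 and CP1253; the `decode_bytes` helper is just an illustrative name.

```python
from typing import Optional


def decode_bytes(raw: bytes, encoding: str) -> Optional[str]:
    """Decode `raw` in `encoding`, or return None if it is not valid there."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


if __name__ == "__main__":
    # The same byte names different characters depending on the encoding,
    # so a glyph lookup has to start from the decoded Unicode code point.
    for enc in ("latin-1", "cp1253", "utf-8"):
        ch = decode_bytes(b"\xe1", enc)
        if ch is None:
            print(f"{enc}: not a valid sequence")
        else:
            print(f"{enc}: U+{ord(ch):04X} {ch}")
```

Only after this decoding step does it make sense to ask whether a given font has a glyph for the character.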

Anyway, before proceeding with anything, we need to give this matter a lot of careful thought, to make sure we're covering as many wishlist items as possible with the minimum effort and code.

*

Offline Phil

Re: CTS 5 - Fallback fonts (font glyph substitution)
« Reply #6: January 05, 2020, 11:24:31 AM »
@terefang commented

have you looked into unifont.pm ? this is what this module is for.

@PhilterPaper commented

I don't think this is what unifont() is intended for. As far as I can tell, unifont specifies that a certain opened font is to be used for a certain range of Unicode points. The request made by Johan is for font fallback, if a requested Unicode point does not have a glyph available in the current font. A browser does this, looking down a list of desired typefaces (font families) until a glyph is found.

I suppose that if you know in advance that a desired glyph will be unavailable in your chosen font, you could select (via unifont) an alternate font to use for that Unicode point, but that's not quite what was requested (and involves a lot more manual work).
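The difference can be illustrated with a range table that is fixed in advance. The Python sketch below is not the unifont() API; the `RANGES` table, the font names, and `font_by_range` are hypothetical stand-ins for "this font is used for this range of Unicode points".

```python
from typing import List, Tuple

# Hypothetical, hand-maintained table: (first code point, last code point, font).
RANGES: List[Tuple[int, int, str]] = [
    (0x0000, 0x024F, "BodyFont"),
    (0x0370, 0x03FF, "GreekFont"),
    (0x2600, 0x26FF, "SymbolFont"),
]


def font_by_range(ch: str, default: str = "BodyFont") -> str:
    """Pick a font purely from the code point's range.

    Nothing here checks whether the chosen font really has the glyph,
    which is why the ranges have to be known and correct beforehand.
    """
    cp = ord(ch)
    for first, last, name in RANGES:
        if first <= cp <= last:
            return name
    return default


if __name__ == "__main__":
    for ch in "A\u0394\u266f\u20ac":
        print(f"U+{ord(ch):04X} -> {font_by_range(ch)}")
```

Note that U+20AC falls outside every listed range and is simply sent to the default font, whether or not that font can draw it. A fallback mechanism would instead test each candidate font for the glyph.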

@terefang thanks for joining in the discussion, and I hope to see more input from you!

*

Offline Phil

Re: CTS 5 - Fallback fonts (font glyph substitution)
« Reply #7: January 07, 2020, 10:49:05 AM »
terefang commented 2020-01-06

@PhilterPaper

yes, you are right that (with plain unifont.pm) you have to know in advance the codepoints to fall back to.

nonetheless, the introspection capabilities of the font objects are good enough for this, so the code to be written won't get too complicated.

the real question would be which font to fall back to.

PhilterPaper commented 2020-01-06

The original request (from Johan) is for HTML/CSS style to go through a list of fonts (let's assume italic and bold are dealt with properly), in the indicated order, until a font is found that has a glyph for the desired Unicode point. In other words, give a list of most desirable to least desirable typefaces/fonts and the Unicode text to set, and hopefully most of the time you'll get glyphs from the most desired typeface, with occasional fallbacks to the less desired typefaces.

Anyway, it shouldn't be terribly hard to do such a thing. The code just needs to detect that there is no glyph (CId) for the Unicode character, and (rather than outputting a blank or an empty box) go down the list of alternate fonts, opening up the appropriate ones and checking if the desired glyph exists. This might be a good extension to Johan's Text::Layout, which already concerns itself with keeping track of the desired typeface and whether it's bold and/or italic (among other things). IIRC you have to pre-open all the fonts you'll use, so worst case you'd have to open some more for alternates.
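As a rough sketch of that procedure, the Python below walks the text, picks the first font in the preference list that has a glyph for each character, and groups consecutive characters into runs so that a writer would only need to switch fonts at run boundaries. Everything named here (`FONT_COVERAGE`, `split_into_runs`, the font names) is hypothetical; it assumes the fonts are already opened and their glyph coverage is available as a set of code points, and it is not Text::Layout or PDF::Builder code.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical coverage of two already-opened fonts.
FONT_COVERAGE: Dict[str, Set[int]] = {
    "Body": set(range(0x20, 0x7F)),   # printable ASCII
    "Music": {0x266D, 0x266F},        # flat and sharp signs
}


def split_into_runs(text: str, fonts: List[str],
                    coverage: Dict[str, Set[int]]) -> List[Tuple[Optional[str], str]]:
    """Split `text` into (font, substring) runs, most desired font first.

    A run whose font is None holds characters that no listed font covers;
    the caller decides whether to draw a placeholder box or skip them.
    """
    runs: List[Tuple[Optional[str], str]] = []
    for ch in text:
        chosen = next((f for f in fonts if ord(ch) in coverage[f]), None)
        if runs and runs[-1][0] == chosen:
            # Same font as the previous character: extend the current run.
            runs[-1] = (chosen, runs[-1][1] + ch)
        else:
            runs.append((chosen, ch))
    return runs


if __name__ == "__main__":
    for font, chunk in split_into_runs("F\u266f and B\u266d", ["Body", "Music"], FONT_COVERAGE):
        print(font, repr(chunk))
```

Grouping into runs keeps the number of font changes in the output small, and the None runs make the "no glyph anywhere" case explicit instead of silently emitting a blank.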

While we're at it, the unifont() method of specifying different lists of specific opened fonts for ranges of Unicode characters (single-byte encodings too?) might be blended in.