One thing good text handling should do is kerning of letters: closing up (usually) pairs of letters to prevent excessive white space between them. The classic example is A and V (or V and A), where the sloped sides match up nicely. Unfortunately, I can't (yet) demonstrate that in this forum, as I can't set margins on individual characters. Some day maybe I'll add that capability to the forum software. In the meantime, look at an article on kerning (e.g., https://en.wikipedia.org/wiki/Kerning) to see it illustrated. Here it is as rendered by your web browser: WAVE (it may or may not be kerned for you).
Fonts normally include kerning tables: given a pair of candidate letters to be output, the text handler looks up in the table how far to the left the second character should be moved (or, in some rare cases, to the right). The distance is scaled by the font size currently in use. The intent is to close up large gaps. This works fairly well for a given typeface and attributes (e.g., Times-Roman normal weight, at a given size). However, it can break down in a number of cases, leaving the output with no kerning:
- Unusual or rare accented characters are being used that the font designer didn't account for. This is especially likely if the character has to be built up (composited) from a base letter and one or more combining accent marks. There may simply not be a table entry for that pair.
- The font characteristics change between the two letters. This could be the font size, its weight (boldness/lightness), or even its slant or obliqueness (not to mention italics). The tables assume that both characters share the same characteristics. A common example is "faking" small caps by scaling down regular-sized capitals. It's not a great way to get small caps, but if the font doesn't include them, that may be the best you can do. Say you want to print Time (the weekly magazine) in small caps: TIME. Note that the "I" is probably not tucked under the "T", as it should be. The word might even get split between "T" and "I", due to the HTML tags!
- The font face (family) changes (e.g., from Times-Roman to Helvetica). I've never seen a kerning table that handles such a case.
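To make the table lookup and its failure cases concrete, here is a minimal sketch. Everything in it is made up for illustration: the `Style` record, the `KERN_TABLES` layout, the units (1/1000 em), and the kern values themselves are assumptions, not taken from any real font or font format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Style:
    """Hypothetical description of how one character is set."""
    family: str     # e.g. "Times-Roman"
    weight: str     # e.g. "normal", "bold"
    slant: str      # e.g. "upright", "italic"
    size_pt: float  # font size in points


# Hypothetical kerning tables, one per face. Values are in 1/1000 em;
# negative means "move the second character to the left".
KERN_TABLES = {
    ("Times-Roman", "normal", "upright"): {
        ("A", "V"): -80,
        ("V", "A"): -80,
        ("T", "i"): -35,
    },
}


def table_kern(left, left_style, right, right_style):
    """Return the kern in points, or 0.0 when the table cannot help."""
    # The table assumes both characters share family, weight, slant and
    # size. If anything differs (e.g. faked small caps), give up.
    if left_style != right_style:
        return 0.0
    face = (left_style.family, left_style.weight, left_style.slant)
    # A missing face or a missing pair (e.g. a rare accented letter)
    # both fall through to "no kerning".
    units = KERN_TABLES.get(face, {}).get((left, right), 0)
    # Scale the table value by the font size currently in use.
    return units * left_style.size_pt / 1000
```

With these made-up numbers, "AV" at 12 pt is pulled together by about 0.96 pt, while a "T" at 12 pt followed by an "I" at 9 pt (faked small caps), or a pair with no table entry, gets no kerning at all.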
What could be done to improve this situation? Anything algorithmic should probably leave current cases alone (where a kerning table entry is provided), as the amount of kern has (usually) been very carefully adjusted by a typographer to look its best. Those cases aside, what could be done? Has anyone looked at drawing a convex polygon (hull) around each character in a pair, and adjusting the start of the second character so that the closest "approach" of the two is no more than a certain maximum distance (scaled by the smaller font size in use)? Creating a convex polygon is going to be a fairly expensive process, so it should be done once and included with the font data. Creating one on the fly may still be needed if the character in question has already had an accent mark added to it.
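Here is a rough sketch of the hull idea, under some loud assumptions. Glyph outlines are given as plain lists of (x, y) points in font units, the hull is built with the standard monotone-chain method, and the "closest approach" is measured as horizontal clearance only (the narrowest left-to-right gap at any shared height) rather than true point-to-point distance. That simplification is my own choice: it makes the required shift a simple subtraction. All function names are hypothetical.

```python
def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def x_extent_at(hull, y):
    """(min_x, max_x) of a convex polygon along the horizontal line at y."""
    xs = []
    n = len(hull)
    for i in range(n):
        (x1, y1), (x2, y2) = hull[i], hull[(i + 1) % n]
        if min(y1, y2) <= y <= max(y1, y2):
            if y1 == y2:
                xs += [x1, x2]  # horizontal edge: both ends count
            else:
                xs.append(x1 + (x2 - x1) * (y - y1) / (y2 - y1))
    return min(xs), max(xs)


def horizontal_gap(left_hull, right_hull):
    """Narrowest horizontal clearance, or None if no height is shared."""
    lo = max(min(y for _, y in left_hull), min(y for _, y in right_hull))
    hi = min(max(y for _, y in left_hull), max(y for _, y in right_hull))
    if lo > hi:
        return None
    # For convex shapes the clearance is piecewise linear in y, so its
    # minimum lies at a vertex height or at an end of the shared range.
    ys = {lo, hi} | {y for _, y in left_hull + right_hull if lo <= y <= hi}
    return min(x_extent_at(right_hull, y)[0] - x_extent_at(left_hull, y)[1]
               for y in ys)


def hull_kern(left_outline, right_outline, advance, max_gap):
    """Shift (<= 0) for the right glyph so the clearance is at most max_gap."""
    left_hull = convex_hull(left_outline)
    right_hull = convex_hull([(x + advance, y) for x, y in right_outline])
    gap = horizontal_gap(left_hull, right_hull)
    if gap is None or gap <= max_gap:
        return 0.0  # nothing to close up
    return float(max_gap - gap)
```

As a toy check, take "A" as the triangle (0,0), (600,0), (300,700) and "V" as (0,700), (600,700), (300,0), with an advance of 600 units. Unkerned, the facing slopes are 300 units apart at every height, so `hull_kern(A, V, 600, 50)` asks for a shift of -250 units. In practice the hulls would be precomputed and stored with the font data rather than rebuilt on every call, as this sketch does.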
There is also the issue that a pure convex hull or polygon may not be appropriate. For instance, a vertical "stack" of accent marks may not come anywhere near the neighboring character, but a convex hull enlarged to include the stack would force the letters further apart than necessary. There also may be cases where an "inlet" in a (non-convex) polygon could admit part of another letter.
Anything involving a fair amount of geometry (measuring the closest approach of two polygons) is going to be computationally expensive, and may be worth it only for the finest quality typesetting intended for printing. For web pages, it would almost certainly not be worth the effort. However, once a pair's kerning has been calculated, the value might be stored in a database or lookup table for future use.
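The "calculate once, then look it up" idea could be sketched like this. The cache key (both characters plus some identifier for each one's style) and the `KernCache` class are assumptions of mine. A plain in-memory dictionary stands in for whatever database or lookup table would really be used.

```python
from typing import Callable, Dict, Tuple

# (left char, left style id, right char, right style id); the style ids
# are hypothetical strings such as "Times-Roman/normal/12".
PairKey = Tuple[str, str, str, str]


class KernCache:
    """Remember computed kerns so the geometry runs once per pair."""

    def __init__(self, compute: Callable[[str, str, str, str], float]):
        self._compute = compute            # the expensive geometric routine
        self._table: Dict[PairKey, float] = {}
        self.misses = 0                    # how often we had to compute

    def kern(self, left: str, left_style: str,
             right: str, right_style: str) -> float:
        key = (left, left_style, right, right_style)
        if key not in self._table:
            self.misses += 1
            self._table[key] = self._compute(*key)
        return self._table[key]
```

The first request for a given pair pays for the geometry, and every later request is a dictionary lookup. Persisting `_table` between runs (to a file or a real database) would extend the saving across documents.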