Truly Smart Quotes

  1 Replies

March 01, 2017, 02:26:18 PM
When writing extensive amounts of text, such as long-winded postings to this forum, it would be nice to get some assistance from the computer in adding typographically correct (and visually pleasing) markup to the text, in the correct form for the character encoding used and the processing to be done on the text (HTML, BBCode, straight UTF-8 characters, etc.). It can always be done manually, either as you go or as after-the-fact cleanup, but it's far more convenient to have this done for you while you concentrate on the wordsmithing.

The big ones are quotation marks and apostrophes. The ASCII straight quotes " and ' are just not satisfactory for real text that you want to be proud of. Proper “quotation marks” and ‘single quotes’ (apostrophes) are nice looking (aren't they?), but a pain to manually enter as you type. En (–) and em (—) dashes are a lot better looking than single and double hyphens - and -- in properly formatted text. Special punctuation such as ellipses (…) and various forms of trademark signs (®, ™, ℗, and ℠) are surprisingly common.

Microsoft enables "Smart Quotes" by default in a number of its products, such as Word. In its versions of the common single-byte encodings (e.g., Western/CP 1252/Windows 1252, cf. Latin-1/ISO-8859-1), Microsoft takes the very rarely used control characters assigned to 0x80 through 0x9F and replaces most of them with commonly used non-ASCII punctuation and some accented letters. In most cases, the result looks something like this (CP 1252 shown; 0x81, 0x8D, 0x8F, 0x90, and 0x9D are left unassigned):

Hex  Char  Unicode  HTML entity         Name                                    Reserved use
80   €     U+20AC   &euro;              Euro                                    reserved control
82   ‚     U+201A   &sbquo;             Low-"9" opening quotation mark          Break Permitted Here
83   ƒ     U+0192   &fnof; or &#402;    Florin/script f/folder                  No Break Here
84   „     U+201E   &bdquo;             Low-"99" opening quotation mark         Index
85   …     U+2026   &hellip;            Ellipsis                                Next Line
86   †     U+2020   &dagger;            Single dagger                           Start of Selected Area
87   ‡     U+2021   &Dagger;            Double dagger                           End of Selected Area
88   ˆ     U+02C6   &circ;              Circumflex ^ accent (combining?)        Character Tabulation Set
89   ‰     U+2030   &permil;            o/oo per mille                          Character Tabulation with Justification
8A   Š     U+0160   &Scaron; or &#352;  S + caron accent                        Line Tabulation Set
8B   ‹     U+2039   &lsaquo;            Single left angle quote < (guillemet)   Partial Line Down
8C   Œ     U+0152   &OElig;             OE ligature                             Partial Line Up
8E   Ž     U+017D   &Zcaron; or &#381;  Z + caron accent                        Single Shift Two
91   ‘     U+2018   &lsquo;             "6" opening quotation mark              Private Use One
92   ’     U+2019   &rsquo;             "9" closing quotation mark/apostrophe   Private Use Two
93   “     U+201C   &ldquo;             "66" opening quotation mark             Set Transmit State
94   ”     U+201D   &rdquo;             "99" closing quotation mark             Cancel Character
95   •     U+2022   &bull;              Solid bullet                            Message Waiting
96   –     U+2013   &ndash;             En-dash                                 Start of Guarded Area
97   —     U+2014   &mdash;             Em-dash                                 End of Guarded Area
98   ˜     U+02DC   &tilde;             Tilde ~ accent (combining?)             Start of String
99   ™     U+2122   &trade;             Trademark TM                            reserved control
9A   š     U+0161   &scaron; or &#353;  s + caron accent                        Single Character Introducer
9B   ›     U+203A   &rsaquo;            Single right angle quote > (guillemet)  Control Sequence Introducer
9C   œ     U+0153   &oelig;             oe ligature                             String Terminator
9E   ž     U+017E   &zcaron; or &#382;  z + caron accent                        Privacy Message
9F   Ÿ     U+0178   &Yuml;              Y + diaeresis/umlaut accent             Application Program Command

Most of these characters are now well supported by current browsers, although some older browsers may have trouble with a few of them. Note that the double angle quotes « and » are not included here, although the single versions are; they did not need a slot because Latin-1 already assigns them to 0xAB and 0xBB.
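
To see the difference between the two encodings for yourself, here is a small Python sketch. It assumes nothing beyond the standard "cp1252" and "latin-1" codecs that ship with Python; the sample bytes and the helper name is_assigned are made up for illustration.

```python
# The same bytes decoded two ways: CP 1252 yields punctuation,
# Latin-1 yields the old control characters at the same positions.
raw = bytes([0x93, 0x48, 0x69, 0x94, 0x20, 0x97])  # quote, H, i, quote, space, dash

as_cp1252 = raw.decode("cp1252")
as_latin1 = raw.decode("latin-1")

print(as_cp1252)
print([f"U+{ord(c):04X}" for c in as_latin1])


def is_assigned(byte_value: int) -> bool:
    """Return True if Python's cp1252 codec maps this byte to a character."""
    try:
        bytes([byte_value]).decode("cp1252")
        return True
    except UnicodeDecodeError:
        return False


# Positions in 0x80-0x9F that the codec leaves without a character.
unassigned = [b for b in range(0x80, 0xA0) if not is_assigned(b)]
print([f"{b:02X}" for b in unassigned])
```

Text that was saved as CP 1252 but read back as Latin-1 is one common way for curly quotes to turn into invisible control characters.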

So, when working with some sort of editor or word processor, how does it know which quotation mark (opening or closing) to use when I type "? How about which single quote (apostrophe) when I type '? Single and double quotes can come unpaired — some publishing styles may put an opening double quote at the beginning of a paragraph, when the whole thing is a quote, but omit the closing quote. When I type ', is that an opening single quote, or an apostrophe used in a contraction? When I type $i--, I don't want it thinking I want an em-dash there in place of the post-decrement operator! Word and similar products generally do a fairly good job of guessing what I mean, but they can be very insistent on what they think I mean, and refuse to let me override their Smart Quote entries! That is very frustrating. The editor or word processor should learn to just stay out of the way when I override their ruling.
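
To make the guessing concrete, here is a minimal sketch of the sort of context rules an editor might apply. This is my own simplified heuristic, not what Word or any other product actually does, and the function name smarten is made up. A quote counts as "opening" when it starts the text or follows whitespace, an opening bracket, or a dash; anything else is treated as closing (or as an apostrophe). A double hyphen becomes an em dash only when it sits between two word characters or between two spaces.

```python
import re


def smarten(text: str) -> str:
    """Replace straight quotes and double hyphens using simple context rules."""
    # Em dash: only word--word or " -- ", so code such as "$i--;" is left alone.
    text = re.sub(r"(?<=\w)--(?=\w)|(?<=\s)--(?=\s)", "\u2014", text)

    # Opening double quote: nothing before it, or whitespace/bracket/dash before it.
    # The negative lookbehind rejects any preceding character outside that set.
    text = re.sub(r'(?<![^\s(\[{\u2014])"', "\u201c", text)
    # Every double quote still left is taken to be a closing one.
    text = text.replace('"', "\u201d")

    # Same rule for single quotes; an opening double quote may also precede one.
    text = re.sub(r"(?<![^\s(\[{\u2014\u201c])'", "\u2018", text)
    # Whatever is left is a closing single quote or an apostrophe (same character).
    text = text.replace("'", "\u2019")
    return text


print(smarten('"Hello," she said. "It\'s fine."'))
print(smarten("$i--;"))
print(smarten("wait -- what"))
```

Rules this simple still guess wrong on a leading apostrophe ('tis, '90s), which comes out as an opening single quote. That is exactly the kind of case where the user needs to be able to override the result and have the override stick.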

Perhaps a happy medium would be to have editor "buttons" near the text entry window, as this forum uses for BBCode tags, to insert this special punctuation only when the user calls for it. The downside to this is that every quote and dash means I have to pause my typing and move the mouse to the button and click it (although accelerator keys could help with the more commonly used special characters). Also keep in mind that different intended uses of typed text can call for different ways of indicating that character, from simply inserting a UTF-8 character to inserting an HTML entity or BBCode markup, whether it was automatically determined or manually inserted.
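
The "different intended uses" point could be handled by deciding on the character first and choosing its written form last. Below is a rough sketch under that assumption; the function name render and the target names "utf8", "html", and "html-numeric" are invented for this example. It uses Python's standard html.entities.codepoint2name table, which covers the HTML 4 named entities only, so characters without an HTML 4 name fall back to a numeric reference. BBCode is left out because any markup for special characters would be specific to the forum software.

```python
from html.entities import codepoint2name


def render(text: str, target: str = "utf8") -> str:
    """Write non-ASCII characters in the form the destination expects."""
    if target == "utf8":
        return text  # characters are inserted directly
    if target not in ("html", "html-numeric"):
        raise ValueError(f"unknown target: {target}")

    parts = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            # ASCII passes through untouched (no escaping of & or < here).
            parts.append(ch)
        elif target == "html" and cp in codepoint2name:
            parts.append(f"&{codepoint2name[cp]};")  # named entity
        else:
            parts.append(f"&#{cp};")  # numeric character reference
    return "".join(parts)


print(render("\u201cHi\u201d \u2014 ok", "html"))
print(render("\u017e", "html"))
print(render("\u2019", "html-numeric"))
```

The second call shows the fallback: z with caron has no HTML 4 name, so it is written as a numeric reference, which matches the numeric alternative listed for that row of the table.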


Re: Truly Smart Quotes
Reply #1: March 02, 2017, 04:54:07 AM
I've been using a ‘compose key’ on my keyboards since the 80s. VT2xx keyboards at the time had a real compose key; nowadays, on a standard keyboard, I use ‘Right Ctrl’ for this purpose.
Many systems let you define one of the keys to function as a compose key.
Adding fancy quotes and diacritical characters is as easy as [Compose] + < + " → “ (left double), [Compose] + > + " → ” (right double), [Compose] + u + " → ü, and so on. Many symbols are available as well, like the → arrows.
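
Conceptually, a compose key is just a lookup from a short key sequence to one character. The toy Python sketch below models only the three sequences mentioned above; the names COMPOSE and compose are made up, and real systems keep these rules in their own configuration files rather than in code like this.

```python
# Illustrative compose table: (first key, second key) -> resulting character.
COMPOSE = {
    ("<", '"'): "\u201c",  # left double quotation mark
    (">", '"'): "\u201d",  # right double quotation mark
    ("u", '"'): "\u00fc",  # u with diaeresis
}


def compose(first: str, second: str) -> str:
    """Return the composed character, or the two keys unchanged if no rule matches."""
    return COMPOSE.get((first, second), first + second)


print(compose("<", '"'))
print(compose("u", '"'))
print(compose("x", "y"))
```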

It has been under your fingertips for ages…