8-bit character encodings for PostScript

Joachim Wuttke, 2011-


Extend PostScript fonts from their default 7-bit ASCII encoding to 8 bits so that additional characters and symbols are supported without resorting to octal codes like \374 = ü.


The following macro transforms a standard font into an extended one, using an arbitrary encoding array:

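/ReEncode { % inFont outFont encoding | -
   /MyEncoding exch def
   exch findfont
   dup length dict
   begin
      % copy every entry except FID, which definefont must set itself
      {1 index /FID ne {def} {pop pop} ifelse} forall
      /Encoding MyEncoding def
      currentdict
   end
   definefont pop % register the copy under the name outFont
} def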

The additional characters will appear correctly only if the .ps file is saved in the matching 8-bit encoding. PostScript interpreters do not decode UTF-8: each byte of a multi-byte sequence is shown as a separate glyph, so a single ü comes out as two wrong symbols. The file encoding is usually set through a text editor option.

§1: ¿Äße Søren Crême brulée?

This is easiest for the Western European latin-1 (ISO/IEC 8859-1) encoding, which PostScript (Level 2 and later) provides as the predefined array ISOLatin1Encoding. It contains German umlauts ÄÖÜäöü and ß, French Çç, accented letters áâà…, Spanish Ññ, inverted marks ¡¿, and some more symbols like ±°§.

Define extended fonts:

/Helvetica             /HelveticaLatin1             ISOLatin1Encoding ReEncode
/Helvetica-Oblique     /HelveticaLatin1-Oblique     ISOLatin1Encoding ReEncode
/Helvetica-Bold        /HelveticaLatin1-Bold        ISOLatin1Encoding ReEncode
/Helvetica-BoldOblique /HelveticaLatin1-BoldOblique ISOLatin1Encoding ReEncode

Use them to show text:

/HelveticaLatin1 findfont 1.2 scalefont setfont
(¿Äße Søren Möhren?) show
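
As a cross-check, the same string can be written in pure ASCII with octal escapes. This is a minimal sketch, assuming the fonts were defined as above and a current point has been set with moveto. If this line prints correctly while the 8-bit version does not, the file was probably not saved as latin-1:

(\277\304\337e S\370ren M\366hren?) show

The encoding of the new font can also be inspected directly. Code 252 (octal 374) should map to the glyph name of ü:

/HelveticaLatin1 findfont /Encoding get 252 get ==   % prints /udieresis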

§2: Hors d'œuvre für 1€

To support the Euro symbol € and the French o-e ligatures Œ and œ, the latin-9 (ISO/IEC 8859-15) encoding is needed. Since ISOLatin9Encoding is not predefined, it must be defined explicitly:

/ISOLatin9Encoding [
 /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
 /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
 /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
 /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
 /space /exclam /quotedbl /numbersign /dollar /percent /ampersand
 /quoteright /parenleft /parenright /asterisk /plus /comma /minus
 /period /slash /zero /one /two /three /four /five /six /seven /eight
 /nine /colon /semicolon /less /equal /greater /question /at /A /B /C /D
 /E /F /G /H /I /J /K /L /M /N /O /P /Q /R /S /T /U /V /W /X /Y /Z
 /bracketleft /backslash /bracketright /asciicircum /underscore
 /quoteleft /a /b /c /d /e /f /g /h /i /j /k /l /m /n /o /p /q /r /s /t
 /u /v /w /x /y /z /braceleft /bar /braceright /asciitilde /.notdef
 /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
 /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
 /dotlessi /grave /acute /circumflex /tilde /macron /breve /dotaccent
 /dieresis /.notdef /ring /cedilla /.notdef /hungarumlaut /ogonek /caron
 /space /exclamdown /cent /sterling /Euro /yen /Scaron /section /scaron
 /copyright /ordfeminine /guillemotleft /logicalnot /hyphen /registered
 /macron /degree /plusminus /twosuperior /threesuperior /Zcaron /mu
 /paragraph /periodcentered /zcaron /onesuperior /ordmasculine
 /guillemotright /OE /oe /Ydieresis /questiondown /Agrave /Aacute
 /Acircumflex /Atilde /Adieresis /Aring /AE /Ccedilla /Egrave /Eacute
 /Ecircumflex /Edieresis /Igrave /Iacute /Icircumflex /Idieresis /Eth
 /Ntilde /Ograve /Oacute /Ocircumflex /Otilde /Odieresis /multiply
 /Oslash /Ugrave /Uacute /Ucircumflex /Udieresis /Yacute /Thorn
 /germandbls /agrave /aacute /acircumflex /atilde /adieresis /aring /ae
 /ccedilla /egrave /eacute /ecircumflex /edieresis /igrave /iacute
 /icircumflex /idieresis /eth /ntilde /ograve /oacute /ocircumflex
 /otilde /odieresis /divide /oslash /ugrave /uacute /ucircumflex
 /udieresis /yacute /thorn /ydieresis
] def
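
With the array in place, latin-9 fonts can be defined and used just as in §1. The following is a sketch under stated assumptions: the font name /HelveticaLatin9 is an arbitrary choice, the file is assumed to be saved as ISO/IEC 8859-15, and a current point is assumed to be set:

/Helvetica /HelveticaLatin9 ISOLatin9Encoding ReEncode
/HelveticaLatin9 findfont 1.2 scalefont setfont
(Hors d'œuvre für 1€) show

The pure-ASCII equivalent with octal escapes (œ = \275, ü = \374, € = \244) is:

(Hors d'\275uvre f\374r 1\244) show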

Printer support for the new symbols is only guaranteed for PostScript Level ≥ 3.



08may2011 first published.