Cyrillic in LaTeX, PostScript, and Unicode






Cyrillic in LaTeX

The following produces Cyrillic PostScript output for me. There are other ways of doing this; see http://www.tug.org/TeXnik/mainFAQ.cgi?file=language/cyrillic for a starting point.

1 — Use the cyrillic package
Include this line in your latex preamble:
\usepackage{cyrillic}
If that fails with an error about being unable to find the cyrillic package, and you cannot find the right software package to install, you could try my cyrillic.sty file. Save it somewhere and load it by its full path, like this:
\usepackage{/home/cromwell/.latex/cyrillic}
Note that you do not include the ".sty" part of the file name.

2 — Define some Cyrillic fonts
Include these lines in your latex preamble, right after the above \usepackage line:

\newcommand{\cyrrm}{\fontencoding{OT2}\selectfont\textcyrup}
\newcommand{\cyrit}{\fontencoding{OT2}\selectfont\textcyrit}
\newcommand{\cyrsl}{\fontencoding{OT2}\selectfont\textcyrsl}
\newcommand{\cyrsf}{\fontencoding{OT2}\selectfont\textcyrsf}
\newcommand{\cyrbf}{\fontencoding{OT2}\selectfont\textcyrbf}
\newcommand{\cyrsc}{\fontencoding{OT2}\selectfont\textcyrsc}
%%%% cyrrm = "Roman", or really upright, normal font
%%%% cyrit = Italic (cursive forms of letters)
%%%% cyrsl = Slanted (non-cursive forms of letters)
%%%% cyrsf = Sans-serif
%%%% cyrbf = Bold-face
%%%% cyrsc = Small caps

3 — Use transliteration
For the most part, LaTeX will "do the right thing" and turn your ASCII typing into Russian, if you are careful. Examine your output and adjust as needed. I have no idea about transliteration of other Slavic languages that use Cyrillic. Ukrainian and Belarusian are probably close enough, but for Serbian, Macedonian, Bulgarian, and other South Slavic languages, look at the web site linked above.

Some special characters:

\cprime      ь     "soft sign"
\cdprime     ъ     "hard sign"
\u{i}        й     "i-kratkoye"
\"{e}        ё     "yoh"
\`{e}        э     "e-oborotnoye"
\`{E}        Э     "E-oborotnoye"

This page is very helpful: http://www.bitjungle.com/files/isoent-ref.pdf

Here is an example, the same silly text in several fonts:

%% Start the document
\documentclass[letterpaper,12pt]{letter}
\usepackage[dvips]{color}
%% Cyrillic font definitions
\usepackage{/home/cromwell/.latex/cyrillic}
\newcommand{\cyrrm}{\fontencoding{OT2}\selectfont\textcyrup}
\newcommand{\cyrit}{\fontencoding{OT2}\selectfont\textcyrit}
\newcommand{\cyrsl}{\fontencoding{OT2}\selectfont\textcyrsl}
\newcommand{\cyrsf}{\fontencoding{OT2}\selectfont\textcyrsf}
\newcommand{\cyrbf}{\fontencoding{OT2}\selectfont\textcyrbf}
\newcommand{\cyrsc}{\fontencoding{OT2}\selectfont\textcyrsc}
\newcommand{\lat}{\fontencoding{OT1}\selectfont}

%%% Support for "\begin{alltt}...\end{alltt}"
\usepackage{alltt}
%%% Support \euro for Euro symbol
\usepackage{textcomp}
\newcommand{\euro}{\textsf{\texteuro}}

\begin{document}

{\cyrrm{Zdravstvu\u{i}te! \\
Krasivaya sobaka ili krasivye sobaki. \\
Ob{\cdprime}ekha\u{i}te Rossii! \\
Kreml\cprime -- doma Khrushcheva i Gorbach\"{e}va.}}

{\cyrsl{Zdravstvu\u{i}te! \\
Krasivaya sobaka ili krasivye sobaki. \\
Ob{\cdprime}ekha\u{i}te Rossii! \\
Kreml\cprime -- doma Khrushcheva i Gorbach\"{e}va.}}

{\cyrit{Zdravstvu\u{i}te! \\
Krasivaya sobaka ili krasivye sobaki. \\
Ob{\cdprime}ekha\u{i}te Rossii! \\
Kreml\cprime -- doma Khrushcheva i Gorbach\"{e}va.}}

{\cyrsf{Zdravstvu\u{i}te! \\
Krasivaya sobaka ili krasivye sobaki. \\
Ob{\cdprime}ekha\u{i}te Rossii! \\
Kreml\cprime -- doma Khrushcheva i Gorbach\"{e}va.}}

{\cyrbf{Zdravstvu\u{i}te! \\
Krasivaya sobaka ili krasivye sobaki. \\
Ob{\cdprime}ekha\u{i}te Rossii! \\
Kreml\cprime -- doma Khrushcheva i Gorbach\"{e}va.}}

{\cyrsc{Zdravstvu\u{i}te! \\
Krasivaya sobaka ili krasivye sobaki. \\
Ob{\cdprime}ekha\u{i}te Rossii! \\
Kreml\cprime -- doma Khrushcheva i Gorbach\"{e}va.}}

\end{document} 

Here is the result, after generating PostScript with LaTeX, then converting that to PNG and cropping it with convert from the ImageMagick suite:

[Image: Cyrillic PostScript output]

Cyrillic in PostScript

The theory is that you can do something like the following and get Postscript that renders Cyrillic:

%!
%%Creator: Your Name Here
%%BoundingBox: 0 0 792 611
%%
%% Postscript Cyrillic demo
%%
%% Define measurements in millimeters, 1 mm = 2.834645 Postscript points
/mm { 2.834645 mul } def
%% midshow is not a built-in operator. This definition is an assumption:
%% it shows a string centered horizontally on the current point.
/midshow { dup stringwidth pop 2 div neg 0 rmoveto show } def
%% Use the Cyrillic-Italic font.  Could be just Cyrillic, etc:
/Cyrillic-Italic findfont 12 scalefont setfont
%% Move to the location (50mm, 50mm) and Russify my name:
50 mm 50 mm moveto (Robert Vilhelmoviq Kromvell) midshow
showpage
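
The factor 2.834645 in the /mm definition comes from the default PostScript unit of 1/72 inch and 25.4 mm per inch. A minimal Python sketch of the same arithmetic follows; the function name mm_to_points is hypothetical and only illustrates the conversion.

```python
# Default PostScript user space measures in points, 72 to the inch.
POINTS_PER_INCH = 72.0
MM_PER_INCH = 25.4


def mm_to_points(mm):
    """Convert millimeters to PostScript points (hypothetical helper)."""
    return mm * POINTS_PER_INCH / MM_PER_INCH


print(round(mm_to_points(1), 6))   # 2.834646
print(round(mm_to_points(50), 3))  # 141.732
```

So the "50 mm 50 mm moveto" in the listing lands roughly 141.7 points from the left and bottom edges of the page.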

You have to figure out the font's quirky character-to-character mapping. Some letters are obvious: you type the ASCII letter whose sound in a Roman-alphabet language is closest to that of the Cyrillic letter in a Slavic language. Others are not, like these:

-/_  for  "eh/EH"
j/J  for  "zh/ZH"
y/Y  for  "i-kratkoye/I-KRATKOYE"
[/{  for  "yery/YERY"
]/}  for  "yu/YU"
h/H  for  "kh/KH"
q/Q  for  "ch/CH"
w/W  for  "sh/SH"
x/X  for  "shch/SHCH"
c/C  for  "ts/TS"
+/#  for  "YAT/yat"
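
The letter rows of the table above amount to a digraph-to-key substitution. A minimal Python sketch of that idea follows. The names DIGRAPH_TO_KEY and to_font_keys are hypothetical. It assumes the input is already a Latin transliteration and that every listed digraph stands for one Cyrillic letter, which will not always hold (for example, "ts" can also be т followed by с).

```python
# Digraph -> single key, taken from the letter rows of the table above.
# Hypothetical helper names; punctuation-key rows are left out.
DIGRAPH_TO_KEY = {
    "shch": "x",
    "zh": "j",
    "kh": "h",
    "ch": "q",
    "sh": "w",
    "ts": "c",
}


def to_font_keys(text):
    """Replace transliteration digraphs with the single keys the font expects."""
    # Try longer digraphs first so "shch" is not read as "sh" + "ch".
    candidates = sorted(DIGRAPH_TO_KEY, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for digraph in candidates:
            chunk = text[i:i + len(digraph)]
            if chunk.lower() == digraph:
                key = DIGRAPH_TO_KEY[digraph]
                # Keep the case of the first letter of the digraph.
                out.append(key.upper() if chunk[0].isupper() else key)
                i += len(digraph)
                break
        else:
            # No digraph starts here: copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(to_font_keys("Robert Vilhelmovich Kromvell"))  # Robert Vilhelmoviq Kromvell
```

The printed string is the one passed to midshow in the PostScript listing.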

The one that I cannot figure out is the Cyrillic character "ya" or я — if you know how to do this with the ASCII encoding, without remapping your keyboard to a Cyrillic character set, please let me know!


Cyrillic in Unicode

The real answer is what you find at the Unicode organization's site. I have this table for my own purposes — I have a copy on my laptop, and I don't have to bother with PDF rendering. Plus, you can see how well your browser renders Unicode...

Both Firefox and Konqueror do fine for codes 0400-045f, which cover the most common characters and everything I would care to do. Unicode describes the codes as:
0400-040f — Cyrillic extensions
0410-044f — Basic Russian alphabet
0450-045f — Cyrillic extensions
0460-0481 — Historic letters
0482-0489 — Historic miscellaneous
048a-04f9 — Cyrillic extensions
04fa-04ff — Additions for Nivkh
0500-050f — Komi letters
0510-0513 — Cyrillic extensions
Codes 048a-04ff are mostly for Cyrillic representation of non-Slavic languages like Sami, Azerbaijani, Yakut, Tatar, and so on. 0500-0513 are entirely for Cyrillic representation of Komi, Enets, Khanty, Chukchi, etc. Read the Unicode pages to see how arcane some of these are.
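
The range list above can be turned into a small lookup. This is a minimal Python sketch; the names CYRILLIC_RANGES and describe are hypothetical, and the labels are copied from the list on this page rather than from the current Unicode charts, which may name or extend the ranges differently.

```python
# Ranges and labels copied from the list above (hex code points, inclusive).
CYRILLIC_RANGES = [
    (0x0400, 0x040F, "Cyrillic extensions"),
    (0x0410, 0x044F, "Basic Russian alphabet"),
    (0x0450, 0x045F, "Cyrillic extensions"),
    (0x0460, 0x0481, "Historic letters"),
    (0x0482, 0x0489, "Historic miscellaneous"),
    (0x048A, 0x04F9, "Cyrillic extensions"),
    (0x04FA, 0x04FF, "Additions for Nivkh"),
    (0x0500, 0x050F, "Komi letters"),
    (0x0510, 0x0513, "Cyrillic extensions"),
]


def describe(ch):
    """Return the label of the range holding ch, or None outside 0400-0513."""
    code = ord(ch)
    for first, last, label in CYRILLIC_RANGES:
        if first <= code <= last:
            return label
    return None


for ch in "дёѣ":
    print(f"{ord(ch):04x} {ch} {describe(ch)}")
```

For the three sample letters this prints the basic-alphabet label for д (0434), the extensions label for ё (0451), and the historic-letters label for ѣ (0463).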

On OpenBSD, coverage of codes 0460-04ff (historic and non-Slavic) is spotty in Konqueror and better in Firefox, even with all three packages xcyrillic, kde-i18n-ru, and firefox-i18n-ru installed. On Linux, both browsers render the full range well.

To use this table: Place the code between &#x and ;. So, the Russian word да is created with:
&#x0434;&#x0430;
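
The rule of placing the hex code between &#x and ; is easy to check mechanically. A minimal Python sketch follows; the function name to_hex_entities is hypothetical, and html.unescape from the standard library is used only to confirm the round trip.

```python
import html


def to_hex_entities(text):
    """Write every character as a 4-digit hexadecimal character reference."""
    return "".join(f"&#x{ord(ch):04x};" for ch in text)


encoded = to_hex_entities("да")
print(encoded)                 # &#x0434;&#x0430;
print(html.unescape(encoded))  # да
```

The codes 0434 and 0430 match the entries for д and а in the table.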

Cyrillic code chart, 0400-0513 (basic Russian alphabet: 0410-044f)
Ѐ 0400 А 0410 Р 0420 а 0430 р 0440 ѐ 0450 Ѡ 0460 Ѱ 0470 Ҁ 0480 Ґ 0490 Ҡ 04a0 Ұ 04b0 Ӏ 04c0 Ӑ 04d0 Ӡ 04e0 Ӱ 04f0 Ԁ 0500 Ԑ 0510
Ё 0401 Б 0411 С 0421 б 0431 с 0441 ё 0451 ѡ 0461 ѱ 0471 ҁ 0481 ґ 0491 ҡ 04a1 ұ 04b1 Ӂ 04c1 ӑ 04d1 ӡ 04e1 ӱ 04f1 ԁ 0501 ԑ 0511
Ђ 0402 В 0412 Т 0422 в 0432 т 0442 ђ 0452 Ѣ 0462 Ѳ 0472 ҂ 0482 Ғ 0492 Ң 04a2 Ҳ 04b2 ӂ 04c2 Ӓ 04d2 Ӣ 04e2 Ӳ 04f2 Ԃ 0502 Ԓ 0512
Ѓ 0403 Г 0413 У 0423 г 0433 у 0443 ѓ 0453 ѣ 0463 ѳ 0473 ҃ 0483 ғ 0493 ң 04a3 ҳ 04b3 Ӄ 04c3 ӓ 04d3 ӣ 04e3 ӳ 04f3 ԃ 0503 ԓ 0513
Є 0404 Д 0414 Ф 0424 д 0434 ф 0444 є 0454 Ѥ 0464 Ѵ 0474 ҄ 0484 Ҕ 0494 Ҥ 04a4 Ҵ 04b4 ӄ 04c4 Ӕ 04d4 Ӥ 04e4 Ӵ 04f4 Ԅ 0504
Ѕ 0405 Е 0415 Х 0425 е 0435 х 0445 ѕ 0455 ѥ 0465 ѵ 0475 ҅ 0485 ҕ 0495 ҥ 04a5 ҵ 04b5 Ӆ 04c5 ӕ 04d5 ӥ 04e5 ӵ 04f5 ԅ 0505
І 0406 Ж 0416 Ц 0426 ж 0436 ц 0446 і 0456 Ѧ 0466 Ѷ 0476 ҆ 0486 Җ 0496 Ҧ 04a6 Ҷ 04b6 ӆ 04c6 Ӗ 04d6 Ӧ 04e6 Ӷ 04f6 Ԇ 0506
Ї 0407 З 0417 Ч 0427 з 0437 ч 0447 ї 0457 ѧ 0467 ѷ 0477 ҇ 0487 җ 0497 ҧ 04a7 ҷ 04b7 Ӈ 04c7 ӗ 04d7 ӧ 04e7 ӷ 04f7 ԇ 0507
Ј 0408 И 0418 Ш 0428 и 0438 ш 0448 ј 0458 Ѩ 0468 Ѹ 0478 ҈ 0488 Ҙ 0498 Ҩ 04a8 Ҹ 04b8 ӈ 04c8 Ә 04d8 Ө 04e8 Ӹ 04f8 Ԉ 0508
Љ 0409 Й 0419 Щ 0429 й 0439 щ 0449 љ 0459 ѩ 0469 ѹ 0479 ҉ 0489 ҙ 0499 ҩ 04a9 ҹ 04b9 Ӊ 04c9 ә 04d9 ө 04e9 ӹ 04f9 ԉ 0509
Њ 040a К 041a Ъ 042a к 043a ъ 044a њ 045a Ѫ 046a Ѻ 047a Ҋ 048a Қ 049a Ҫ 04aa Һ 04ba ӊ 04ca Ӛ 04da Ӫ 04ea Ӻ 04fa Ԋ 050a
Ћ 040b Л 041b Ы 042b л 043b ы 044b ћ 045b ѫ 046b ѻ 047b ҋ 048b қ 049b ҫ 04ab һ 04bb Ӌ 04cb ӛ 04db ӫ 04eb ӻ 04fb ԋ 050b
Ќ 040c М 041c Ь 042c м 043c ь 044c ќ 045c Ѭ 046c Ѽ 047c Ҍ 048c Ҝ 049c Ҭ 04ac Ҽ 04bc ӌ 04cc Ӝ 04dc Ӭ 04ec Ӽ 04fc Ԍ 050c
Ѝ 040d Н 041d Э 042d н 043d э 044d ѝ 045d ѭ 046d ѽ 047d ҍ 048d ҝ 049d ҭ 04ad ҽ 04bd Ӎ 04cd ӝ 04dd ӭ 04ed ӽ 04fd ԍ 050d
Ў 040e О 041e Ю 042e о 043e ю 044e ў 045e Ѯ 046e Ѿ 047e Ҏ 048e Ҟ 049e Ү 04ae Ҿ 04be ӎ 04ce Ӟ 04de Ӯ 04ee Ӿ 04fe Ԏ 050e
Џ 040f П 041f Я 042f п 043f я 044f џ 045f ѯ 046f ѿ 047f ҏ 048f ҟ 049f ү 04af ҿ 04bf ӏ 04cf ӟ 04df ӯ 04ef ӿ 04ff ԏ 050f

© Bob Cromwell Nov 2008.