Transliteration of Sanskrit

Drew.Bond
Posts: 2
Joined: Sat Aug 30, 2008 2:31 am

Transliteration of Sanskrit

Post by Drew.Bond »

I prepare some documents that use Sanskrit words. These are sometimes represented in TrueType fonts and sometimes transliterated into Roman characters. Some of these can be represented using special characters, but not all. In the past I have used MS Word field codes which, for example, look like this:

d with a dot under it = {eq \O(d,\s\do2(.))}
h with a dot under it = {eq \O(h,\s\do2(.))} and
l with a dot under it = {eq \O(l,\s\do2(.))}

The full range of character transpositions may be seen here http://homepages.ihug.co.nz/~drew.bond/ ... trokes.pdf
These codes were established in Word before Writer was available. I now want to use OOo for all my work. Can I achieve the various transpositions I used in Word documents in new documents prepared with Writer?

Cheers

Drew Bond
OOo 2.4.X on Ms Windows XP + ubuntu linux
Bhikkhu Pesala
Posts: 1253
Joined: Mon Oct 08, 2007 1:27 am

Re: Transliteration of sanskrit

Post by Bhikkhu Pesala »

I haven't used that method before, but it looks unnecessarily complex. It looks like you're using Overstrike, which may not position the dots accurately in all cases. It is best to use a font with the required accented glyphs.

Try my Custom Keyboards or use OpenOffice macros to insert the special characters required for Sanskrit. Several fonts support these glyphs from Latin Extended Additional — ḥ ṣ ṛ ṝ ḷ ḹ etc. If you need more fonts, see my fonts page.
Idiot Compassion
LibreOffice 6.0.4 on Windows 10
Drew.Bond
Posts: 2
Joined: Sat Aug 30, 2008 2:31 am

Re: Transliteration of sanskrit

Post by Drew.Bond »

Thank you for your advice. I agree the method is clunky, and it is beginning to show its years (more than a decade), but it worked in MS Word!!
Robert Tucker
Volunteer
Posts: 1250
Joined: Mon Oct 08, 2007 1:34 am
Location: Manchester UK

Re: Transliteration of sanskrit

Post by Robert Tucker »

On Linux there is no problem with this. Hit the compose key and then typing !d produces ḍ, !h > ḥ, !l > ḷ and so on. !D > Ḍ, !L > Ḷ. -a > ā, -e > ē, -A > Ā, -E > Ē. 'c > ć, 's > ś, 'C > Ć, 'S > Ś, ~n > ñ, ~N > Ñ.
LibreOffice 7.x.x on Arch and Fedora.
Omkarnath
Posts: 12
Joined: Sun Mar 14, 2010 3:20 pm

Re: Transliteration of sanskrit

Post by Omkarnath »

Robert Tucker wrote:On Linux there is no problem with this. Hit the compose key and then typing !d produces ḍ, !h > ḥ, !l > ḷ and so on. !D > Ḍ, !L > Ḷ. -a > ā, -e > ē, -A > Ā, -E > Ē. 'c > ć, 's > ś, 'C > Ć, 'S > Ś, ~n > ñ, ~N > Ñ.
Hi - this looks fantastic! I activated the compose key and tested it:
Vowels work great: ã ē ī õ ū (though only ã ī and ū are used in Sanskrit)
But I have trouble with consonants:
these work great: ṭ ḍ Ṭ Ḍ - but the s-subdot does not. I get "§" when I type [compose]+!s
Also, I have trouble with the semi-vowels: ṛ ḷ Ṛ Ḷ work well, but how do I make them with the line above?

Best would be to have a transcription keyboard layout in the standard distribution. I created one (in 2006, I think) and submitted it to the IndLinux project, but it was never included in the distribution.
OpenOffice 3.1 on Ubuntu 9.10
Robert Tucker
Volunteer
Posts: 1250
Joined: Mon Oct 08, 2007 1:34 am
Location: Manchester UK

Re: Transliteration of sanskrit

Post by Robert Tucker »

If you switch to the US International (AltGr dead keys) keyboard you can get ṣ with AltGr+<shift>+<hyphen> followed by s.

The problem with a macron over r's and l's is, I believe, that there isn't a font that contains them. You will probably need to type the letter then use the combining macron U+0304 (Ctrl+Shift+U then 0304). The result isn't always too neat: r̄, R̄, l̄, L̄.

Also have a look at rfc1345:

http://113street.wordpress.com/2008/01/ ... ntu-linux/

You need to install the m17n package for iBus or SCIM and then the generic tables for m17n.
LibreOffice 7.x.x on Arch and Fedora.
Omkarnath
Posts: 12
Joined: Sun Mar 14, 2010 3:20 pm

Re: Transliteration of sanskrit

Post by Omkarnath »

Robert Tucker wrote:If you switch to the US International (AltGr dead keys) keyboard you can get ṣ with AltGr+<shift>+<hyphen> followed by s.
I tried - right now I have US intl AltGr dead keys, and I do not get s-subdot. Actually I just get a normal s.
The problem with a macron over r's and l's is, I believe, that there isn't a font that contains them.
No, it is not a problem with a font - there exist Unicode code points for these glyphs, defined in the Unicode Latin Extended Additional block, and the glyphs are increasingly being included in many fonts, such as Verdana, Arial Unicode and URW Palladio ITU, as well as Gentium, as suggested by somebody in another thread. They are now also included in the latest version of Times New Roman. Here is the sample: ṝ ḹ
You will probably need to type the letter then use the combining macron U+0304 (Ctrl+Shift+U then 0304). The result isn't always too neat: r̄, R̄, l̄, L̄.
Well, that actually results in the wrong letter, making it impossible to convert the typed text into Devanagari script with an automatic converter. It is important to produce technically correct Unicode text.
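
The difference described here can be checked by listing the code points involved. The following is a minimal Python sketch (not from the original post) that uses only the standard `unicodedata` module; the helper name `describe` is made up for illustration.

```python
import unicodedata


def describe(text):
    """Return the Unicode name of every code point in text."""
    return [unicodedata.name(ch) for ch in text]


# 'r' followed by COMBINING MACRON (U+0304): two code points, no dot below
r_macron = "r\u0304"
# the precomposed letter shown in the sample above (U+1E5D)
r_dot_macron = "\u1e5d"

print(describe(r_macron))
print(describe(r_dot_macron))
print(r_macron == r_dot_macron)  # the two strings are not equal
```

An automatic converter that looks for U+1E5D would not recognise the first sequence, which is presumably why the combining-macron workaround gives text that cannot be converted.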

I looked into your article and am reading it now, though I would prefer using a standard keyboard layout rather than SCIM. I tried SCIM earlier and didn't like it. It also messed up my system pretty nicely, so I removed it.
OpenOffice 3.1 on Ubuntu 9.10
Omkarnath
Posts: 12
Joined: Sun Mar 14, 2010 3:20 pm

Re: Transliteration of sanskrit

Post by Omkarnath »

Oh Robert, I can see we are old acquaintances - we discussed this earlier in the following thread, where I wrote under the nick Arjuna:
http://www.oooforum.org/forum/viewtopic ... 629#176629
OpenOffice 3.1 on Ubuntu 9.10
Omkarnath
Posts: 12
Joined: Sun Mar 14, 2010 3:20 pm

Re: Transliteration of sanskrit

Post by Omkarnath »

I found my old posting - and as it is useful in this context, I am posting it here for reference. It provides a system-level implementation of Indic transliteration, so that OpenOffice users can write Indic languages in Latin script.

I have created a transliteration keymap for GNOME / Ubuntu Linux and I have included the map in the latest release of the Indic keymap with the Hindi Bolnagri layout included.

Indic languages share many features that are reflected in the various Indic scripts. When working on a document or publication written in English or any other language that uses Latin letters, there is often a need for transliteration to represent the original Indic text correctly and to allow readers to pronounce the written words correctly.

Over time, various transliteration schemes have evolved, at first because of the lack of a standard and the technical limitations of the writing and typesetting tools of the time. However, a standard and lossless method of representing the Indic languages in Roman script exists. This method is called National Library at Calcutta Romanization, which is an extension of the method called International Alphabet of Sanskrit Transliteration (IAST). IAST was introduced at the Congress of Orientalists in Athens in 1912 and allows Sanskrit text to be published in Roman script. The National Library at Calcutta Romanization simply extends it to enable romanization of all Indic languages. It is a great method and is widely used by Indian and Western publishers of Sanskrit and other Indic literature.

Examples:
Krishna, कृष्ण IAST: kṛṣṇa
Shiva, शिव IAST: Śiva
Devanagari, देवनागरी IAST: Devanāgarī
And Patanjali yoga sūtra 1.1: yogaścittavṛttinirodhaḥ

I had long been struggling with the unavailability of a proper Indic transliteration keyboard layout capable of producing these letters with diacritical marks. For this purpose I have now developed a keyboard layout called the "IAST/NLCR transliteration" layout, or simply "translit".
This gives all Linux users the possibility of writing quotations from Indic literature in Western script. It also enables Indian people to write their mother tongue using a lossless transliteration scheme that allows proper pronunciation of all words.

I have added the translit keymap to the latest Indic keymap file that includes the Hindi Bolnagri layout. Please have a look at the attached files and test them. I am open to any suggestions and corrections.

References:
http://en.wikipedia.org/wiki/National_L ... manization
http://en.wikipedia.org/wiki/IAST

// README
Translit is a keymap for transliterating Indic scripts into Latin, using the standard transliteration schemes IAST, NLCR and ISO 15919. Earlier there was no ready-made keymap available, making it very difficult to use the standard transliteration schemes.

It can now be used through keyboard layout switchers by adding the option
India - IAST/NLCR transliteration / translit

Note: The normal Western European QWERTY layout is the default.
To get long vowels (āĀ, īĪ, ūŪ), use:
Right Alt + a => ā,
Right Alt + A => Ā etc.

To get retroflex consonants (ḍḌ, ṭṬ, ṇṆ), use:
Right Alt + d => ḍ
Right Alt + D => Ḍ

To get ṇṆ, use Right Alt + n/N,
To get ṅṄ, use Right Alt + g/G

To get ॐ, use Right Alt + X, to get ̐ use Right Alt + Shift + X

Installing it

Unzip it
# unzip translit-in.zip

and copy the files to following locations:
/usr/share/X11/xkb/symbols/in
/usr/share/X11/xkb/rules/base.lst
/usr/share/X11/xkb/rules/base.xml

or to (depending on your distribution):

/etc/X11/xkb/symbols/in
/etc/X11/xkb/rules/base.lst
/etc/X11/xkb/rules/base.xml
Attachments
translit-in.zip
(111.12 KiB) Downloaded 319 times
OpenOffice 3.1 on Ubuntu 9.10
Robert Tucker
Volunteer
Posts: 1250
Joined: Mon Oct 08, 2007 1:34 am
Location: Manchester UK

Re: Transliteration of sanskrit

Post by Robert Tucker »

Omkarnath wrote:
Robert Tucker wrote:If you switch to the US International (AltGr dead keys) keyboard you can get ṣ with AltGr+<shift>+<hyphen> followed by s.
I tried - right now I have US intl AltGr dead keys, and I do not get s-subdot. Actually I just get a normal s.
Definitely working here with Gnome on Fedora 12.
Omkarnath wrote: Here is the sample: ṝ ḹ
Oh! I see, a dot below with a macron above (I thought just a macron above). <compose key> followed by <underscore> followed by <exclamation mark> followed by letter.
LibreOffice 7.x.x on Arch and Fedora.
Omkarnath
Posts: 12
Joined: Sun Mar 14, 2010 3:20 pm

Re: Transliteration of sanskrit

Post by Omkarnath »

Robert Tucker wrote:
Omkarnath wrote:
Robert Tucker wrote:If you switch to the US International (AltGr dead keys) keyboard you can get ṣ with AltGr+<shift>+<hyphen> followed by s.
I tried - right now I have US intl AltGr dead keys, and I do not get s-subdot. Actually I just get a normal s.
Definitely working here with Gnome on Fedora 12.
Hmm strange..Well, just found:
https://help.ubuntu.com/community/ComposeKey and
https://help.ubuntu.com/community/GtkComposeTable
So I guess the solution lies here.

Robert - it is GREAT that you have brought this to my attention. I remember that in the mid-nineties there were no @ signs on some Unix keyboards, and you would type (I think) [compose character] a <space> to get the @ sign. I had been wondering why not use it on Linux, but never gave it a second thought.

This is really a great method!
Omkarnath wrote: Here is the sample: ṝ ḹ
Oh! I see, a dot below with a macron above (I thought just a macron above). <compose key> followed by <underscore> followed by <exclamation mark> followed by letter.
Wow, this works. I have one problem with my ã - in some apps (like OOo) it is displayed with tilde rather than with macron. But here in Chrome it displays with macron.
à A asciiTilde 00C3 capital A with tilde
à A minus 00C3 capital A with tilde
à asciiTilde A 00C3 capital A with tilde
à minus A 00C3 capital A with tilde
OpenOffice 3.1 on Ubuntu 9.10
Omkarnath
Posts: 12
Joined: Sun Mar 14, 2010 3:20 pm

Re: Transliteration of sanskrit

Post by Omkarnath »

Yes got it -
There is no a with macron listed in the GTK Compose Table. There is only A with tilde. But all other vowels with macron are available. Strange - why did they leave this out? Perhaps this should be reported as a bug.

So far only the <ctrl>+<shift>+U 0304 method works for the macron (the combining macron ̄, producing "ā"), but this results in two code points rather than one precomposed character.
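
As a possible workaround for text that has already been typed this way, the two-code-point sequence can be composed afterwards with Unicode normalization. This is a hedged sketch and not something proposed in the thread: it uses Python's standard `unicodedata.normalize` with form NFC, and the function name `to_precomposed` is made up for illustration.

```python
import unicodedata


def to_precomposed(text):
    """Compose base letter + combining mark sequences into single code points (NFC)."""
    return unicodedata.normalize("NFC", text)


typed = "a\u0304"  # 'a' + COMBINING MACRON, as entered with Ctrl+Shift+U 0304
composed = to_precomposed(typed)

print(len(typed), len(composed))  # code point counts before and after
print(composed == "\u0101")       # compare with LATIN SMALL LETTER A WITH MACRON
```

NFC can only compose sequences for which Unicode defines a precomposed character. A plain r followed by U+0304 stays as two code points, while r followed by U+0323 (dot below) and U+0304 composes to U+1E5D.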
OpenOffice 3.1 on Ubuntu 9.10
Omkarnath
Posts: 12
Joined: Sun Mar 14, 2010 3:20 pm

Re: Transliteration of sanskrit

Post by Omkarnath »

I found that for my locale the following is defined:

Code: Select all

<dead_macron> <A>                	: "Ā"   U0100 # LATIN CAPITAL LETTER A WITH MACRON
<Multi_key> <macron> <A>         	: "Ā"   U0100 # LATIN CAPITAL LETTER A WITH MACRON
<Multi_key> <underscore> <A>     	: "Ā"   U0100 # LATIN CAPITAL LETTER A WITH MACRON
<dead_macron> <a>                	: "ā"   U0101 # LATIN SMALL LETTER A WITH MACRON
<Multi_key> <macron> <a>         	: "ā"   U0101 # LATIN SMALL LETTER A WITH MACRON
<Multi_key> <underscore> <a>     	: "ā"   U0101 # LATIN SMALL LETTER A WITH MACRON
<dead_breve> <A>                 	: "Ă"   U0102 # LATIN CAPITAL LETTER A WITH BREVE
However, I do not know which key combination will produce it. Multi_key is AltGR, but which is the macron?

Likewise:

Code: Select all

<dead_belowdot> <S>              	: "Ṣ"   U1E62 # LATIN CAPITAL LETTER S WITH DOT BELOW
<Multi_key> <exclam> <S>         	: "Ṣ"   U1E62 # LATIN CAPITAL LETTER S WITH DOT BELOW
<dead_belowdot> <s>              	: "ṣ"   U1E63 # LATIN SMALL LETTER S WITH DOT BELOW
<Multi_key> <exclam> <s>         	: "ṣ"   U1E63 # LATIN SMALL LETTER S WITH DOT BELOW
how to produce the s with dot below?
OpenOffice 3.1 on Ubuntu 9.10
Robert Tucker
Volunteer
Posts: 1250
Joined: Mon Oct 08, 2007 1:34 am
Location: Manchester UK

Re: Transliteration of Sanskrit

Post by Robert Tucker »

I can get ā with <compose key> followed by <hyphen> followed by a.

As above, I can get ṣ with the US International (AltGr dead keys) keyboard and AltGr+<shift>+<hyphen> followed by s.
keyboards.jpg
The alternative is Ctrl+Shift+U followed by 1E63 for lowercase or 1E62 for uppercase, followed by <Enter>.

My multikey is set to the left Windows key:
multikey.jpg
If you use right Alt (AltGr) might it not mess up the combinations you can already get with that key?
LibreOffice 7.x.x on Arch and Fedora.
Omkarnath
Posts: 12
Joined: Sun Mar 14, 2010 3:20 pm

Re: Transliteration of Sanskrit

Post by Omkarnath »

Oh, I made an error in my previous posting - Multi_key is Compose, not AltGr.
I have Compose on the right menu key, so that won't contribute to the mess-up.

When I type <compose key> followed by <hyphen> followed by a. I get ã, which looks like a with macron in this window, but once I copy-paste it to any other window, I can see that in fact it is a-tilde. If I produce the same combination in any other window, it looks a-tilde and when I paste it here (Google Chrome) it looks a-macron. Same happens in Firefox.

Also the s is defined correctly in the GtkCompose Table:

Code: Select all

<dead_abovedot> <S>              	: "Ṡ"   U1E60 # LATIN CAPITAL LETTER S WITH DOT ABOVE
<Multi_key> <period> <S>         	: "Ṡ"   U1E60 # LATIN CAPITAL LETTER S WITH DOT ABOVE
<dead_abovedot> <s>              	: "ṡ"   U1E61 # LATIN SMALL LETTER S WITH DOT ABOVE
<Multi_key> <period> <s>         	: "ṡ"   U1E61 # LATIN SMALL LETTER S WITH DOT ABOVE
<dead_belowdot> <S>              	: "Ṣ"   U1E62 # LATIN CAPITAL LETTER S WITH DOT BELOW
<Multi_key> <exclam> <S>         	: "Ṣ"   U1E62 # LATIN CAPITAL LETTER S WITH DOT BELOW
<dead_belowdot> <s>              	: "ṣ"   U1E63 # LATIN SMALL LETTER S WITH DOT BELOW
<Multi_key> <exclam> <s>         	: "ṣ"   U1E63 # LATIN SMALL LETTER S WITH DOT BELOW
but I cannot produce it correctly. <compose> followed by <exclamation mark> followed by <s> produces §.
Also on US International (AltGr dead keys) hitting AltGr+<shift>+<hyphen> followed by s produces normal s.

Something is conflicting somewhere. I think it has something to do with fi_FI locale. I cannot think of anything else.
I have created a new ~/.XCompose file where I have disabled the annoying § and a few other annoying things, and once I finish my work, I will restart X and see whether that file helps.
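
The contents of that ~/.XCompose file are not shown in the thread. As a rough illustration only, the Python sketch below generates entries in the same format as the Compose table excerpts quoted above. The choice of sequences, the helper names `compose_line` and `xcompose_text`, and the leading `include "%L"` line (which asks libX11 to load the locale's default table first) are assumptions, and not every application or toolkit necessarily reads ~/.XCompose.

```python
import unicodedata

# Compose sequences taken from the tables quoted in this thread:
# (keysym names typed after the compose key, resulting character)
SEQUENCES = [
    (("exclam", "s"), "\u1e63"),      # s with dot below
    (("exclam", "S"), "\u1e62"),      # S with dot below
    (("underscore", "a"), "\u0101"),  # a with macron
    (("underscore", "A"), "\u0100"),  # A with macron
]


def compose_line(keysyms, char):
    """Format one entry in X11 Compose file syntax."""
    keys = " ".join(f"<{k}>" for k in keysyms)
    return f'<Multi_key> {keys} : "{char}" U{ord(char):04X} # {unicodedata.name(char)}'


def xcompose_text(sequences):
    """Build the text of a Compose file: default table first, then the overrides."""
    lines = ['include "%L"']
    lines += [compose_line(keys, char) for keys, char in sequences]
    return "\n".join(lines) + "\n"


print(xcompose_text(SEQUENCES))
```

The printed text could be reviewed by hand and saved as ~/.XCompose. Whether the entries actually replace the conflicting § mapping depends on the input method in use.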
OpenOffice 3.1 on Ubuntu 9.10
Robert Tucker
Volunteer
Posts: 1250
Joined: Mon Oct 08, 2007 1:34 am
Location: Manchester UK

Re: Transliteration of Sanskrit

Post by Robert Tucker »

Omkarnath wrote:When I type <compose key> followed by <hyphen> followed by a. I get ã, which looks like a with macron in this window, but once I copy-paste it to any other window, I can see that in fact it is a-tilde. If I produce the same combination in any other window, it looks a-tilde and when I paste it here (Google Chrome) it looks a-macron. Same happens in Firefox.
I know ā and ã look much the same in these forum posts even at a greater font size:
ā ã
though the difference is apparent if you use code:

Code: Select all

ā ã
I get a with a macron ā with <compose key> followed by a hyphen followed by a and a with a tilde ã with <compose key> followed by a swung dash followed by a.
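
When the two glyphs are hard to tell apart on screen, the character itself can be inspected rather than its rendering. A minimal sketch, assuming Python is at hand; the helper name `label` is made up for illustration and is not from the thread.

```python
import unicodedata


def label(ch):
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# The two characters being compared above, written as escapes to avoid ambiguity
for ch in ("\u0101", "\u00e3"):
    print(label(ch))
```

Pasting the character produced by a compose sequence into `label()` shows whether it is the macron or the tilde form, independently of the font used by the browser or by Writer.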

Omkarnath wrote: Also the s is defined correctly in the GtkCompose Table:

Code: Select all

<dead_abovedot> <S>              	: "Ṡ"   U1E60 # LATIN CAPITAL LETTER S WITH DOT ABOVE
<Multi_key> <period> <S>         	: "Ṡ"   U1E60 # LATIN CAPITAL LETTER S WITH DOT ABOVE
<dead_abovedot> <s>              	: "ṡ"   U1E61 # LATIN SMALL LETTER S WITH DOT ABOVE
<Multi_key> <period> <s>         	: "ṡ"   U1E61 # LATIN SMALL LETTER S WITH DOT ABOVE
<dead_belowdot> <S>              	: "Ṣ"   U1E62 # LATIN CAPITAL LETTER S WITH DOT BELOW
<Multi_key> <exclam> <S>         	: "Ṣ"   U1E62 # LATIN CAPITAL LETTER S WITH DOT BELOW
<dead_belowdot> <s>              	: "ṣ"   U1E63 # LATIN SMALL LETTER S WITH DOT BELOW
<Multi_key> <exclam> <s>         	: "ṣ"   U1E63 # LATIN SMALL LETTER S WITH DOT BELOW
but I cannot produce it correctly. <compose> followed by <exclamation mark> followed by <s> produces §.
Yes I have the same file at /usr/share/X11/locale/en_US.UTF-8/Compose and get a section mark § with <compose> followed by <exclamation mark> followed by <s>
Omkarnath wrote:Also on US International (AltGr dead keys) hitting AltGr+<shift>+<hyphen> followed by s produces normal s.
Are you pressing the <AltGr> <shift> and <hyphen> keys all at the same time?
LibreOffice 7.x.x on Arch and Fedora.
Omkarnath
Posts: 12
Joined: Sun Mar 14, 2010 3:20 pm

Re: Transliteration of Sanskrit

Post by Omkarnath »

This is really strange behavior. Switching between the Finnish, US International and India layouts causes the layouts to get totally screwed up. Now my AltGr combinations do not work on the Finnish keyboard anymore. I cannot produce the @ sign on the Finnish keyboard (normally AltGr+2), nor can I produce square brackets or the plus sign. Also, the compose function does not work properly; I cannot produce a-macron, i-macron, or u-macron.

AltGr+Shift+S produces now: ʂ
Compose-!-s produces now: ṣ

I think it is the "include" definitions in the xkb file that are causing this behavior. From now on I will disable the international English layout and stick to my native Finnish one.

While working on this, I finally got the green light from the Indic developers, and my transliteration keyboard layout was approved for inclusion in the xkb layouts. I will do some fine-tuning and submit it. If everything goes well, it will be included in the xkb updates in a few months' time and will be available to all Ubuntu (and other xkb) users as part of the standard XKB package.

Once that is done, you can go to keyboard preferences and choose the following:
India - IAST/NLCR transliteration / translit
OpenOffice 3.1 on Ubuntu 9.10
Robert Tucker
Volunteer
Posts: 1250
Joined: Mon Oct 08, 2007 1:34 am
Location: Manchester UK

Re: Transliteration of Sanskrit

Post by Robert Tucker »

When playing around with keyboard layouts some years ago I found it very easy to get the Gnome keyboard layouts out of step with those of the X system. Google gnome+x-system+layouts+error and you will get hits such as:
“The X system keyboard settings differ from your current GNOME keyboard settings…“

Followed by:
“Error activating XKB configuration…“
http://linuxtidbits.wordpress.com/2007/ ... -warnings/

You don't mention getting any error signals but what you describe reads very much like a similar result.
LibreOffice 7.x.x on Arch and Fedora.