ScriptSource

Script

Devanagari (Nagari) Deva

Subject areas for this script

7

Entries in this subject area

Entries can contain text, graphics, media, files and software.

Title
Character list for Sanskrit written with Devanagari (Nagari)
Reordering and Data Storage Order
Unicode Status (Currency)
Unicode Status (Devanagari)
Unicode Status (Vedic)
Visarga and glottal stop alternates
Writing Conjuncts in the Devanagari Script

1

Blog posts in this subject area

These are posts from the blogs on this site; the full blogs can be accessed under the Topics link.

Title
Virama or Halant, which model do I choose?

0

Discussions in this subject area

Discussions include ideas, opinions or questions that invite comments from other ScriptSource users.

There are no discussions for this subject.


The following table shows which Unicode characters are uniquely associated with this script. A language which uses the script may use additional symbols not listed here. See individual writing system pages for complete listings.

Characters associated with this script

USV Graphic Character Name
0900 DEVANAGARI SIGN INVERTED CANDRABINDU
0901 DEVANAGARI SIGN CANDRABINDU
0902 DEVANAGARI SIGN ANUSVARA
0903 DEVANAGARI SIGN VISARGA
0904 DEVANAGARI LETTER SHORT A
0905 DEVANAGARI LETTER A
0906 DEVANAGARI LETTER AA
0907 DEVANAGARI LETTER I
0908 DEVANAGARI LETTER II
0909 DEVANAGARI LETTER U
090A DEVANAGARI LETTER UU
090B DEVANAGARI LETTER VOCALIC R
090C DEVANAGARI LETTER VOCALIC L
090D DEVANAGARI LETTER CANDRA E
090E DEVANAGARI LETTER SHORT E
090F DEVANAGARI LETTER E
0910 DEVANAGARI LETTER AI
0911 DEVANAGARI LETTER CANDRA O
0912 DEVANAGARI LETTER SHORT O
0913 DEVANAGARI LETTER O
0914 DEVANAGARI LETTER AU
0915 DEVANAGARI LETTER KA
0916 DEVANAGARI LETTER KHA
0917 DEVANAGARI LETTER GA
0918 DEVANAGARI LETTER GHA
0919 DEVANAGARI LETTER NGA
091A DEVANAGARI LETTER CA
091B DEVANAGARI LETTER CHA
091C DEVANAGARI LETTER JA
091D DEVANAGARI LETTER JHA
091E DEVANAGARI LETTER NYA
091F DEVANAGARI LETTER TTA
0920 DEVANAGARI LETTER TTHA
0921 DEVANAGARI LETTER DDA
0922 DEVANAGARI LETTER DDHA
0923 DEVANAGARI LETTER NNA
0924 DEVANAGARI LETTER TA
0925 DEVANAGARI LETTER THA
0926 DEVANAGARI LETTER DA
0927 DEVANAGARI LETTER DHA
0928 DEVANAGARI LETTER NA
0929 DEVANAGARI LETTER NNNA
092A DEVANAGARI LETTER PA
092B DEVANAGARI LETTER PHA
092C DEVANAGARI LETTER BA
092D DEVANAGARI LETTER BHA
092E DEVANAGARI LETTER MA
092F DEVANAGARI LETTER YA
0930 DEVANAGARI LETTER RA
0931 DEVANAGARI LETTER RRA
0932 DEVANAGARI LETTER LA
0933 DEVANAGARI LETTER LLA
0934 DEVANAGARI LETTER LLLA
0935 DEVANAGARI LETTER VA
0936 DEVANAGARI LETTER SHA
0937 DEVANAGARI LETTER SSA
0938 DEVANAGARI LETTER SA
0939 DEVANAGARI LETTER HA
093A DEVANAGARI VOWEL SIGN OE
093B DEVANAGARI VOWEL SIGN OOE
093C DEVANAGARI SIGN NUKTA
093D DEVANAGARI SIGN AVAGRAHA
093E DEVANAGARI VOWEL SIGN AA
093F ि DEVANAGARI VOWEL SIGN I
0940 DEVANAGARI VOWEL SIGN II
0941 DEVANAGARI VOWEL SIGN U
0942 DEVANAGARI VOWEL SIGN UU
0943 DEVANAGARI VOWEL SIGN VOCALIC R
0944 DEVANAGARI VOWEL SIGN VOCALIC RR
0945 DEVANAGARI VOWEL SIGN CANDRA E
0946 DEVANAGARI VOWEL SIGN SHORT E
0947 DEVANAGARI VOWEL SIGN E
0948 DEVANAGARI VOWEL SIGN AI
0949 DEVANAGARI VOWEL SIGN CANDRA O
094A DEVANAGARI VOWEL SIGN SHORT O
094B DEVANAGARI VOWEL SIGN O
094C DEVANAGARI VOWEL SIGN AU
094D DEVANAGARI SIGN VIRAMA
094E DEVANAGARI VOWEL SIGN PRISHTHAMATRA E
094F DEVANAGARI VOWEL SIGN AW
0950 DEVANAGARI OM
0953 DEVANAGARI GRAVE ACCENT
0954 DEVANAGARI ACUTE ACCENT
0955 DEVANAGARI VOWEL SIGN CANDRA LONG E
0956 DEVANAGARI VOWEL SIGN UE
0957 DEVANAGARI VOWEL SIGN UUE
0958 DEVANAGARI LETTER QA
0959 DEVANAGARI LETTER KHHA
095A DEVANAGARI LETTER GHHA
095B DEVANAGARI LETTER ZA
095C DEVANAGARI LETTER DDDHA
095D DEVANAGARI LETTER RHA
095E DEVANAGARI LETTER FA
095F DEVANAGARI LETTER YYA
0960 DEVANAGARI LETTER VOCALIC RR
0961 DEVANAGARI LETTER VOCALIC LL
0962 DEVANAGARI VOWEL SIGN VOCALIC L
0963 DEVANAGARI VOWEL SIGN VOCALIC LL
0966 DEVANAGARI DIGIT ZERO
0967 DEVANAGARI DIGIT ONE
0968 DEVANAGARI DIGIT TWO
0969 DEVANAGARI DIGIT THREE
096A DEVANAGARI DIGIT FOUR
096B DEVANAGARI DIGIT FIVE
096C DEVANAGARI DIGIT SIX
096D DEVANAGARI DIGIT SEVEN
096E DEVANAGARI DIGIT EIGHT
096F DEVANAGARI DIGIT NINE
0970 DEVANAGARI ABBREVIATION SIGN
0971 DEVANAGARI SIGN HIGH SPACING DOT
0972 DEVANAGARI LETTER CANDRA A
0973 DEVANAGARI LETTER OE
0974 DEVANAGARI LETTER OOE
0975 DEVANAGARI LETTER AW
0976 DEVANAGARI LETTER UE
0977 DEVANAGARI LETTER UUE
0978 DEVANAGARI LETTER MARWARI DDA
0979 DEVANAGARI LETTER ZHA
097A DEVANAGARI LETTER HEAVY YA
097B DEVANAGARI LETTER GGA
097C DEVANAGARI LETTER JJA
097D DEVANAGARI LETTER GLOTTAL STOP
097E DEVANAGARI LETTER DDDA
097F ॿ DEVANAGARI LETTER BBA
A8E0 COMBINING DEVANAGARI DIGIT ZERO
A8E1 COMBINING DEVANAGARI DIGIT ONE
A8E2 COMBINING DEVANAGARI DIGIT TWO
A8E3 COMBINING DEVANAGARI DIGIT THREE
A8E4 COMBINING DEVANAGARI DIGIT FOUR
A8E5 COMBINING DEVANAGARI DIGIT FIVE
A8E6 COMBINING DEVANAGARI DIGIT SIX
A8E7 COMBINING DEVANAGARI DIGIT SEVEN
A8E8 COMBINING DEVANAGARI DIGIT EIGHT
A8E9 COMBINING DEVANAGARI DIGIT NINE
A8EA COMBINING DEVANAGARI LETTER A
A8EB COMBINING DEVANAGARI LETTER U
A8EC COMBINING DEVANAGARI LETTER KA
A8ED COMBINING DEVANAGARI LETTER NA
A8EE COMBINING DEVANAGARI LETTER PA
A8EF COMBINING DEVANAGARI LETTER RA
A8F0 COMBINING DEVANAGARI LETTER VI
A8F1 COMBINING DEVANAGARI SIGN AVAGRAHA
A8F2 DEVANAGARI SIGN SPACING CANDRABINDU
A8F3 DEVANAGARI SIGN CANDRABINDU VIRAMA
A8F4 DEVANAGARI SIGN DOUBLE CANDRABINDU VIRAMA
A8F5 DEVANAGARI SIGN CANDRABINDU TWO
A8F6 DEVANAGARI SIGN CANDRABINDU THREE
A8F7 DEVANAGARI SIGN CANDRABINDU AVAGRAHA
A8F8 DEVANAGARI SIGN PUSHPIKA
A8F9 DEVANAGARI GAP FILLER
A8FA DEVANAGARI CARET
A8FB DEVANAGARI HEADSTROKE
A8FC DEVANAGARI SIGN SIDDHAM
A8FD DEVANAGARI JAIN OM
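
The USV and Name columns above can be cross-checked against a local copy of the Unicode Character Database. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `describe` is hypothetical, and the names returned depend on the Unicode version bundled with the Python interpreter in use.

```python
import unicodedata

def describe(usv):
    """Return (character, Unicode name) for a hex USV string such as '0915'."""
    ch = chr(int(usv, 16))
    return ch, unicodedata.name(ch)

# Look up a few rows from the table by their USV.
for usv in ("0915", "093F", "094D", "A8F2"):
    ch, name = describe(usv)
    print(f"{usv} {ch} {name}")
```

Combining signs such as 093F and 094D are printed without a base letter here, so a terminal may draw them on a dotted circle or attach them to the preceding space.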

  • Main characters

    [अ आ इ ई उ ऊ ऋ ॠ ऌ ए ऐ ओ औ क ख ग घ ङ च छ ज झ ञ ट ठ ड ढ ण त थ द ध न प फ ब भ म य र ल व श ष स ह ा ि ी ु ू ृ ॄ ॢ े ै ो ौ ँ ः ् ऽ]


  • The term ‘reordering’ is difficult to define, partly because the concept can be approached from two different perspectives. The definition we use on ScriptSource is that a script is said to require reordering if the order in which some characters are written does not match the order in which they are pronounced. This definition approaches the concept of reordering from an orthographic perspective.

    To illustrate, the word चिन्नु, written in the Devanagari script, is pronounced chinnu and means “to know” in the Nepali language.

    Notice that the order in which the characters are written in Devanagari does not reflect the order in which they are spoken. Specifically, the i is written before the ch, even though it is pronounced after it.

    An alternative way to approach the concept of reordering is from an encoding perspective. From this perspective, a script such as Devanagari is only said to require reordering if the characters are stored in memory in the order in which they are pronounced, but are reordered before rendering. So in the example above, the characters are stored as च + ि + न... (ch + i + n...), but before they are rendered on the screen they are reordered to ि + च + न... (i + ch + n...) to produce the correct spelling of the word. This system of encoding and storing the characters is called logical ordering.
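
    The stored order described above can be inspected directly. This is a minimal sketch, assuming the first three characters of the example are U+091A, U+093F and U+0928; the variable and function names are illustrative only.

```python
import unicodedata

# First three characters of the example, in the order they are stored in memory:
# ch (U+091A), then the vowel sign i (U+093F), then n (U+0928).
stored = "\u091A\u093F\u0928"

def code_points(text):
    """Return (code point label, Unicode name) pairs in storage order."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

for label, name in code_points(stored):
    print(label, name)

# Printing the string lets the rendering system draw the vowel sign
# to the left of the consonant, even though it is stored second.
print(stored)
```

    The listing shows the consonant first and the vowel sign second, which is the logical (pronounced) order; any visual reordering happens only at rendering time.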

    For the majority of scripts, the answer to whether a script requires reordering is the same whether you approach it from an orthographic or an encoding perspective. This is because, if a script requires orthographic reordering, like Devanagari, it is usually encoded in Unicode in logical order. So for the text to be correctly written or rendered, reordering is required both in the orthographic and in the encoding sense.

    However, a small number of scripts (Thai, Lao, and Tai Viet) require orthographic reordering but do not require reordering in the encoding sense. This is because these scripts are encoded in Unicode in visual order, not in logical order. So the characters are stored in memory in the order in which they appear on the page, and do not need to be reordered before they are rendered.

    For this reason, these three scripts include in their features table “Reordering: yes”, even though someone who defines the concept of reordering from an encoding perspective would disagree. These scripts do require orthographic reordering, but they are encoded in visual order, so they do not require reordering in the encoding sense.
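
    The contrast between logical and visual order can be seen by comparing one syllable from each kind of script. This is a sketch under stated assumptions: the sample syllables (Devanagari कि and Thai เก) and the function name `stored_names` are chosen here for illustration and do not come from the entry itself.

```python
import unicodedata

def stored_names(text):
    """Return the Unicode names of the characters in their stored (memory) order."""
    return [unicodedata.name(ch) for ch in text]

# In both syllables the vowel is drawn to the left of the consonant.
devanagari_ki = "\u0915\u093F"  # logical order: consonant KA first, vowel sign I second
thai_ke = "\u0E40\u0E01"        # visual order: vowel SARA E first, consonant KO KAI second

print(stored_names(devanagari_ki))
print(stored_names(thai_ke))
```

    In the Devanagari string the stored order differs from the visual order, so the renderer must reorder. In the Thai string the stored order already matches the page, so no reordering is needed.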

    Contributor: Steph Holloway
  • In The Unicode Standard, currency symbols are discussed in Chapter 22, Symbols. Currency symbols generally have the Common script property rather than a specific script property.

    The Currency Symbols block was first encoded in The Unicode Standard version 1.1. Since that time the encoding has undergone a number of modifications; the symbols are now encoded in the following blocks:

    Blocks                              Character Range               Added in Unicode Version  Unicode Charts
    C0 Controls and Basic Latin         0024                          1.1                       U0000
    C1 Controls and Latin-1 Supplement  00A2..00A5                    1.1                       U0080
    Latin Extended-B                    0192                          1.1                       U0180
    Arabic                              060B                          4.1                       U0600
    Bengali                             09F2..09F3                    1.1                       U0980
    Gujarati                            0AF1                          4.0                       U0A80
    Tamil                               0BF9                          4.0                       U0B80
    Thai                                0E3F                          1.1                       U0E00
    Khmer                               17DB                          3.0                       U1780
    Currency Symbols                    20A0..20CF                    1.1                       U20A0
    Letterlike Symbols                  2133                          1.1                       U2100
    CJK Unified Ideographs              5143, 5186, 5706, 5713        1.1                       U4E00
    Arabic Presentation Forms-A         FDFC                          3.2                       UFB50
    Halfwidth and Fullwidth Forms       FF04, FFE0, FFE1, FFE5, FFE6  1.1                       UFF00
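
    The Character Range column mixes single code points, comma-separated lists, and `first..last` ranges. As a rough sketch, a cell can be expanded into code points and checked against the Unicode general category `Sc` (Currency Symbol). The function name `parse_ranges` is hypothetical, and not every code point in the table is expected to be `Sc`, since some rows are letters or ideographs used as currency signs.

```python
import unicodedata

def parse_ranges(spec):
    """Expand a cell such as '00A2..00A5' or '5143, 5186' into a list of code points."""
    code_points = []
    for part in spec.split(","):
        part = part.strip()
        if ".." in part:
            first, last = part.split("..")
            code_points.extend(range(int(first, 16), int(last, 16) + 1))
        else:
            code_points.append(int(part, 16))
    return code_points

# Show the general category and name for one row of the table.
for cp in parse_ranges("00A2..00A5"):
    ch = chr(cp)
    print(f"U+{cp:04X} {unicodedata.category(ch)} {unicodedata.name(ch)}")
```

    Applying the same loop to the 0192 row would report a letter category rather than `Sc`, which is one reason a list like this table has to be compiled by hand.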

    Subsequent to version 1.1, the following Currency characters have been added:

    Characters  Unicode Version  Documentation
    058F 6.1  WG2 N3771,  L2/10-008
    060B 4.1  WG2 N2640,  L2/03-330
    09FB 5.2  WG2 N3311,  L2/07-192,  L2/08-288
    0AF1 4.0  L2/09-331
    0BF9 4.0  
    17DB 3.0  
    20AB 2.0  
    20AC 2.1  WG2 N1566.html, L2/97-081 (not online)
    20AD 3.0  WG2 N1720.doc,  WG2 N1720,  L2/98-061
    20AE 3.0 WG2 N1857 (not online), L2/98-360 (not online)
    20AF 3.0  WG2 N1946,  WG2 N1946_drachma,  L2/99-025,  WG2 N3866,  L2/10-253
    20B0 3.2  WG2 N2188, L2/98-309 (not available online),  L2/00-092
    20B1 3.2  WG2 N2040.doc,  WG2 N2156.doc,  L2/00-013,  WG2 N2161.doc,  L2/00-053
    20B2..20B3 4.1  WG2 N2579,  L2/03-095
    20B4..20B5 4.1  WG2 N2743,  L2/04-139
    20B6 5.2  WG2 N3387,  L2/07-332
    20B7 5.2  WG2 N3390,  L2/08-115
    20B8 5.2  WG2 N3392,  L2/08-116
    20B9 6.0  L2/10-051,  L2/10-251,  WG2 N3862,  L2/10-249,  WG2 N3887,  L2/10-258
    20BA 6.2  WG2 N4258,  L2/12-117,  WG2 N4273,  L2/12-132
    20BB 7.0  WG2 N4308,  L2/12-242
    20BC 7.0  L2/11-231,  L2/11-366,  WG2 N4163,  L2/11-420,  WG2 N4168,  L2/12-047,  WG2 N4445,  L2/13-180
    20BD 7.0  WG2 N4512,  L2/13-235,  WG2 N4529,  L2/14-039
    20BE 8.0  WG2 N4593,  L2/14-161,  L2/15-168
    20BF 10.0  L2/11-129,  L2/15-229
    A838 5.2  WG2 N3334,  L2/07-238,  WG2 N3367,  L2/07-354,  L2/07-390
    FDFC 3.2 WG2 N1856 (not online), L2/98-359 (not online),  WG2 N2373,  L2/01-354

    Documentation refers to ISO Working Group and Unicode proposals.

    A number of proposals for the inclusion of currency symbols have been submitted to the Unicode Technical Committee and WG2:

    1997-06-23 Proposal for addition of a new character: EURO SIGN — National bodies of Canada, Finland, Iceland, US, the Unicode Consortium and V.S. Umamaheswaran (expert) ( WG2 N1566.html, L2/97-081 (not online))

    1998-02-27 KIP SIGN - Laotian Currency Sign — V.S. Umamaheswaran ( WG2 N1720.doc,  WG2 N1720,  L2/98-061)

    1998-09-10 Proposal to encode the "German Penny Symbol" — Elmar Dünßer ( WG2 N2188, L2/98-309 (not available online))

    1998-09 Addition of the RIAL sign on ISO 10646 — Japan (WG2 N1856 (not online), L2/98-359 (not online))

    1998-09 Addition of Tugrik sign on ISO 10646 — Japan (WG2 N1857 (not online), L2/98-360 (not online))

    1998-09 Addition of Peso sign on ISO 10646 — Japan (WG2 N1858 (not online), L2/98-361 (not online))

    1999-01-20 Addition of the DRACHMA SIGN to the UCS — ELOT / Everson ( WG2 N1946,  WG2 N1946_drachma,  L2/99-025)

    1999-06-10 Peso sign — Philippines and Japan ( WG2 N2040.doc)

    2000-01-06 Peso sign and Peseta sign (U-20A7) — Takayuki K. Sato ( WG2 N2156.doc,  L2/00-013)

    2000-02-20 Peso -Character sample — Takayuki K. Sato ( WG2 N2161.doc,  L2/00-053)

    2000-03-14 Proposal to add German Penny Symbol — The Unicode Consortium ( L2/00-092)

    2001-09-20 Proposal to add Arabic Currency Sign Rial to the UCS — Roozbeh Pournader ( WG2 N2373,  L2/01-354)

    2003-02-24 Proposal to encode the GUARANI SIGN and the AUSTRAL SIGN in the UCS — Michael Everson ( WG2 N2579,  L2/03-095)

    2003-10-01 Revised proposal to encode the AFGHANI SIGN in the UCS — Michael Everson, Roozbeh Pournader ( WG2 N2640,  L2/03-330)

    2004-04-23 Proposal to encode the HRYVNIA SIGN and the CEDI SIGN in the UCS — Michael Everson ( WG2 N2743,  L2/04-139)

    2004-05-18 Encoding of Devanagari Rupee Sign in Devanagari code block — Gov't of India ( L2/04-236)

    2004-05-19 Proposal of Myanmar Currency Sign — Myanmar N B ( WG2 N2769,  L2/04-199)

    2007-07-31 Towards an Encoding for North Indic Number Forms in the UCS — Anshuman Pandey ( WG2 N3334,  L2/07-238)

    2007-09-24 Proposal to encode the Livre Tournois sign in the UCS — David R. Sewell ( WG2 N3387,  L2/07-332)

    2007-10-07 Proposal to Encode North Indic Number Forms in ISO/IEC 10646 — Anshuman Pandey ( WG2 N3367,  L2/07-354)

    2007-10-08 Proposal to Encode the Ganda Currency Mark for Bengali in ISO/IEC 10646 — Anshuman Pandey ( WG2 N3311,  L2/07-192)

    2007-10-14 Changes in L2/07-354 North Indic Number Forms (vs. L2/07-139) — Deborah Anderson ( L2/07-390)

    2008-03-06 Proposal to encode the Esperanto spesmilo sign in the UCS — Michael Everson ( WG2 N3390,  L2/08-115)

    2008-03-06 Proposal to encode the Kazakh tenge sign in the UCS — Michael Everson ( WG2 N3392,  L2/08-116)

    2008-08-04 Public Review Issue #123: Bengali Currency Numerator Values — Ken Whistler ( L2/08-288)

    2009-04-06 Proposal to encode a Florin currency symbol — German N.B. ( WG2 N3588,  L2/09-113)

    2009-10-07 Proposal to Deprecate GUJARATI RUPEE SIGN — Anshuman Pandey ( L2/09-331)

    2010-01-29 Govt. of India’s inputs on document no. L2/10-029 — Swaran Lata ( L2/10-051)

    2010-02-10 Proposal to encode an Armenian Dram currency symbol — Karl Pentzlin ( WG2 N3771,  L2/10-008)

    2010-05-03 Additional notes on the Florin symbol — Karl Pentzlin ( L2/10-163)

    2010-07-16 Proposal to Encode India’s National Currency Symbol — Rabin Deka ( L2/10-251)

    2010-07-19 Proposal to encode the INDIAN RUPEE SIGN in the UCS — Michael Everson ( WG2 N3862,  L2/10-249)

    2010-07-19 Proposal to change the glyph of the DRACHMA SIGN — Michael Everson ( WG2 N3866,  L2/10-253)

    2010-08-04 How to Pick a Representative Glyph for a New Currency Symbol — Ken Whistler, Asmus Freytag ( L2/10-289)

    2010-08-09 Comment on L2/10-230, Proposal to encode a modifier letter used in French abbreviations in the UCS — Eric Muller ( L2/10-315)

    2010-09-01 Proposal to encode the Indian Rupee Symbol in the UCS — Gov't of India / Swaran Lata ( WG2 N3887,  L2/10-258)

    2011-03-24 Addition of Bitcoin Sign — Sander van Geloven ( L2/11-129)

    2011-08-05 Revised Proposal to encode Azerbaijani manat sign in the UCS (minor update) — Mykyta Yevstifeyev ( L2/11-231)

    2011-10-18 Proposal to encode historic currency signs of Russia in the UCS — Yuri Kalashnov, Ilya Yevlampiev, Karl Pentzlin, Roman Doroshenko ( WG2 N4208,  L2/11-273)

    2011-10-21 Additional evidence for the Azerbaijan Manat symbol — Karl Pentzlin ( L2/11-366)

    2011-10-31 Letter from Central Bank of Azerbaijan Regarding Manat Sign — Karl Pentzlin ( WG2 N4163,  L2/11-420)

    2011-11-10 Proposal to add the currency sign for the Azerbaijani Manat to the UCS — German N.B. ( WG2 N4168,  L2/12-047)

    2012-04-17 Proposal to encode the Turkish Lira Sign in the UCS — Michael Everson ( WG2 N4258,  L2/12-117)

    2012-04-24 Feedback on Early Russian Currency Symbols (L2/11-273=N4208) — Ralph Cleminson, David Birnbaum ( L2/12-148)

    2012-04-27 Proposal to Encode the Turkish Lira Symbol in the UCS — N. Sacit Uluirmak ( WG2 N4273,  L2/12-132)

    2012-05-06 Notes on the feedback document L2/12-148 regarding Early Russian Currency Symbols (L2/11-273 = WG2 N4208) by Ralph Cleminson and David Birnbaum (dated 2012-04-24) — Karl Pentzlin ( L2/12-183)

    2012-07-24 Proposal for one historic currency character, MARK SIGN — Nina Marie Evensen, Deborah Anderson ( WG2 N4308,  L2/12-242)

    2012-10-29 Default property values for unassigned code points in the Currency Symbols block — Laurentiu Iancu ( L2/12-345)

    2013-06-10 Proposal to add the currency sign for the Azerbaijani Manat to the UCS — Karl Pentzlin ( WG2 N4445,  L2/13-180)

    2014-02-04 Proposal to encode the RUBLE SIGN in the UCS — Michael Everson ( WG2 N4512,  L2/13-235)

    2014-02-11 Proposal to add the currency sign for the RUSSIAN RUBLE to the UCS — Russian NB ( WG2 N4529,  L2/14-039)

    2014-08-14 Adding Georgian Lari currency sign — George Melashvili ( WG2 N4593,  L2/14-161)

    2014-07-28 Recommendations to UTC #140 August 2014 on Script Proposals — Deborah Anderson, Ken Whistler, Rick McGowan, Roozbeh Pournader, Laurentiu Iancu ( L2/14-170)

    2015-07-06 The Lari Symbol: Implementation Principles and Supplementary Manual — National Bank of Georgia / Giorgi Shermazanashvili ( L2/15-168)

    2015-10-02 Proposal for addition of bitcoin sign — Ken Shirriff ( L2/15-229)

    UTC #145 Minutes ( L2/15-254) (See E.2 for decision and action items)

    2017-01-31 Proposal to encode Iranian Currency Sign TOMAN to the UCS — Toman O Rial ( L2/17-060)

    Recommendations to UTC #151 May 2017 on Script Proposals ( L2/17-153) (See point 18.)

    UTC #151 Minutes ( L2/17-103) (See E.7 for decision and action items)

    Contributor: Lorna Evans
  • In The Unicode Standard, Devanagari script implementation is discussed in Chapter 12, South and Central Asia-I: Official Scripts of India.

    The Devanagari script was first encoded in The Unicode Standard version 1.0. The script is now encoded in the following blocks:

    Blocks               Character Range  Added in Unicode Version  Unicode Charts
    Devanagari           0900..097F       1.0                       U0900
    Devanagari Extended  A8E0..A8FF       5.2                       UA8E0

    Vedic Extensions may be used with the Devanagari script as well as many other Indic scripts.

    Subsequent to version 1.0, the following characters have been added to the Devanagari script:

    Characters  Unicode Version  Documentation
    0900 5.2  n3235.pdf/ L2/07-095,  n3383.pdf/ L2/08-050,  n3385.pdf/ L2/08-092 (comparison between n3235 and n3383)
    0904 4.0  n2425.pdf/ L2/02-117
    093A..093B 6.0  n3731.pdf/ L2/09-389
    094E 5.2  n3235.pdf/ L2/07-095,  n3383.pdf/ L2/08-050,  n3385.pdf/ L2/08-092 (comparison between n3235 and n3383)
    094F 6.0  n3731.pdf/ L2/09-389
    0955 5.2  n3235.pdf/ L2/07-095,  n3383.pdf/ L2/08-050,  n3385.pdf/ L2/08-092 (comparison between n3235 and n3383)
    0956..0957 6.0  n3731.pdf/ L2/09-389
    0971 5.1  n3125.pdf/ L2/06-137
    0972 5.1  n3249.pdf/ L2/07-027R
    0973..0977 6.0  n3731.pdf/ L2/09-389
    0978 7.0  WG2 N3970,  L2/10-475
    0979..097A 5.2  n3235.pdf/ L2/07-095,  n3383.pdf/ L2/08-050,  n3385.pdf/ L2/08-092 (comparison between n3235 and n3383)
    097B..097C 5.0  n2934.pdf/ L2/05-082
    097D 4.1  n2543.pdf/ L2/02-394
    097E..097F 5.0  n2934.pdf/ L2/05-082
    A8E0..A8FB 5.2  n3235.pdf/ L2/07-095,  n3383.pdf/ L2/08-050,  n3385.pdf/ L2/08-092 (comparison between n3235 and n3383)
    A8FC 8.0  WG2 N4260,  L2/12-123
    A8FD 8.0  WG2 N4408,  L2/13-056
    A8FE..A8FF pending  L2/15-335

    Documentation refers to ISO Working Group and Unicode proposals.

    Contributor: ScriptSource Staff
  • The Vedic Extensions block was first encoded in The Unicode Standard version 5.2. Vedic Extensions are discussed in Chapter 12, South and Central Asia-I: Official Scripts of India, in the Devanagari Extended section. Vedic Extensions may be used with many Indic scripts. Unicode Status pages exist for the following scripts that use the Vedic Extensions: Bengali, Devanagari, Grantha, Gujarati, Gurmukhi, Kannada, Malayalam, Newa, Oriya, Sharada, Tamil and Telugu.

    These characters are now encoded in the following block:

    Blocks            Character Range  Added in Unicode Version  Unicode Chart
    Vedic Extensions  1CD0..1CFF       5.2                       U1CD0

    Subsequent to version 5.2, the following characters have been added to the Vedic Extensions block:

    Characters  Unicode Version  Documentation
    1CF3 6.1  WG2 N3861,  L2/09-343
    1CF4 6.1  WG2 N3844,  L2/09-344
    1CF5..1CF6 6.1  WG2 N3881,  L2/10-257
    1CF7 10.0  L2/15-160
    1CF8..1CF9 7.0  WG2 N4134,  L2/11-267

    Documentation refers to ISO Working Group and Unicode proposals.

    Contributor: Lorna Evans
  • The image below shows alternate forms of the Devanagari visarga and glottal stop. Both of these letters have variations with respect to the inclusion of the horizontal "clothesline."

    In the case of the visarga, the horizontal line is never written in Hindi and Nepali, but several other languages prefer to include it in order to not give the appearance of a word break.

    The glottal stop character is used in Limbu, where inclusion of the line is a stylistic alternate reflecting the writer's preference.

    Contributor: Sharon Correll
  • The languages for which the Devanagari script is used regularly employ consonant clusters such as [kʃ], [pr], [sv] and [tj], both at the beginning of and within words. To represent these in writing, the inherent vowel contained in every consonant letter must be deleted somehow. To this end, the script uses consonant ligatures (generally called conjuncts when referring to Indic scripts), which number in the hundreds. Some consonants, particularly those with a rounded bottom such as ड ḍa, cannot be written with a ligated conjunct; these are modified with a halant symbol to indicate that the vowel has been muted. So the sequence ḍla would be written with two distinct letters: ड्ल.

    Most consonant letters have a 'half' form which enables them to represent consonant clusters without the use of halant. The half form tends to look similar to the 'full' form of the letter, but without the vertical stem which is graphically and historically related to the a vowel letter. This 'half' form can join to the left of a 'full' consonant letter so that they can be pronounced in sequence. For example, the letters त ta and स sa can combine to produce त्स [tsə].

    Some letters do not join horizontally in this way, but instead stack vertically. This particularly applies in cases where the first letter does not have a vertical stem, such as ट ṭa and ठ ṭha, which stack to produce the conjunct ट्ठ [ʈʈhə]. Some combinations are attested in both horizontal and vertical arrangements, such as kka (क+क), which can be written either side by side or stacked; in plain text both appear as क्क, since the arrangement depends on the font.

    The final class of consonant clusters are those which are represented using a new letter, which is not so easily decomposable into the shapes of the individual letters comprising it. Commonly used examples of these are:

    क ka + ष ṣa = क्ष
    ज ja + ञ ña = ज्ञ
    त ta + त ta = त्त
    त ta + र ra = त्र
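
    In encoded text, each of the conjuncts above is a sequence of the first consonant, U+094D DEVANAGARI SIGN VIRAMA (the halant), and the second consonant. The sketch below builds the four examples; the helper name `conjunct` is hypothetical, and whether a ligature, a half form, or a visible halant is drawn is assumed to depend on the font in use.

```python
VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA (halant)

def conjunct(*consonants):
    """Join consonant letters with VIRAMA; a font may then draw them as one conjunct."""
    return VIRAMA.join(consonants)

KA, SSA, JA, NYA, TA, RA = "\u0915", "\u0937", "\u091C", "\u091E", "\u0924", "\u0930"

# The four examples from the list above: k+ṣ, j+ñ, t+t, t+r.
for first, second in [(KA, SSA), (JA, NYA), (TA, TA), (TA, RA)]:
    cluster = conjunct(first, second)
    print(cluster, [f"U+{ord(ch):04X}" for ch in cluster])
```

    Each cluster is three code points long even when it is displayed as a single shape.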

    The use or non-use of a ligature is optional in some words. For example, [bɪlkul] ('entirely') can be written with the l+k conjunct ल्क or with the full forms of both letters (probably using a halant below the l) without changing the pronunciation or the meaning of the word. In the case of other words, the use or non-use of a conjunct is determined by the word's morphology and changes its meaning. For example, [kərta], when written with the r+t conjunct represents the Hindi word for 'doer/maker' कर्ता, but when written with a full r and a full t, the break in letters represents a morpheme break between the root of the verb 'to do' [kər-], and the imperfect participle masculine ending [-ta], so the word करता means '(he) does'. Note that this means the orthography does not necessarily represent syllables in a phonological sense; both senses of [kərta] contain the syllables kər+ta but the syllable break is not represented when using a conjunct.

    Contributor: ScriptSource Staff

  • Posted by Martin Hosken on 2012-06-06 03:49:00

    In comparing the encoding of a script like Myanmar with Devanagari, one immediately notices that the encoding models differ somewhat. Devanagari follows what I will call the halant model. In this model, the halant marks that the inherent vowel associated with a consonant is not sounded. (That is, the inherent vowel is killed.) Myanmar, on the other hand, follows what I will call the virama model. Myanmar uses a virama to kill the inherent vowel of a consonant. On the surface one would expect that, since both scripts are marking an identical linguistic process, they should use the same encoding model. To understand how and why the models are different, we need to examine them in some more detail.

    In the halant model, whether a conjunct is formed is a matter of convention and free variation. If a conjunct isn't created where one might be expected, the text might look slightly odd, but it can't be considered to be wrongly spelled. Thus the question of when two consonants with a halant between should be conjoined is left up to the font. Yes, there are codes to inhibit conjunction in certain situations, but these codes (ZWJ and ZWNJ) are not considered part of spelling. The question as to their usefulness is one for a different posting.
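
    As a rough illustration of the halant model, the three sequences below differ only in ZWJ and ZWNJ. The sketch assumes the usual Devanagari conventions that ZWNJ after the halant requests a visible halant and ZWJ after the halant requests a half form; the function name `spelling` and the choice of क and ष are illustrative only.

```python
VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA (the halant)
ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"     # ZERO WIDTH JOINER
KA, SSA = "\u0915", "\u0937"

font_decides = KA + VIRAMA + SSA           # conjunct or not is left to the font
visible_halant = KA + VIRAMA + ZWNJ + SSA  # asks for a visible halant, no conjunct
half_form = KA + VIRAMA + ZWJ + SSA        # asks for a half form of the first letter

def spelling(text):
    """Drop ZWJ/ZWNJ, treating them as rendering hints rather than part of spelling."""
    return text.replace(ZWJ, "").replace(ZWNJ, "")

for label, text in [("font decides", font_decides),
                    ("visible halant", visible_halant),
                    ("half form", half_form)]:
    print(label, text, spelling(text) == font_decides)
```

    All three reduce to the same underlying spelling once the joiners are removed, which is the sense in which the choice is stylistic.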

    In the virama model, there are two characters that have the function of killing a consonant's inherent vowel. In Devanagari terms, one corresponds to a forced visible halant and the other to a forced conjunct-forming character. Why are there these two? The reason is that in scripts that use the virama model, whether the killer is visible or conjoining (rendered by stacking the second consonant under the first) is not a free, stylistic question, but is part of spelling. If you conjoin when you should use a visible killer then you have spelt the word wrongly. Likewise the other way around.
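
    For comparison, here is a sketch of the virama model in Myanmar. The post does not name the two characters; this example assumes they are U+1039 (the conjunct-forming, stacking killer) and U+103A (the visible killer), and uses the letter ka purely as sample data.

```python
import unicodedata

MYANMAR_KA = "\u1000"
STACKER = "\u1039"  # assumed: the conjunct-forming killer
ASAT = "\u103A"     # assumed: the visible killer

stacked = MYANMAR_KA + STACKER + MYANMAR_KA  # second ka is stacked under the first
visible = MYANMAR_KA + ASAT + MYANMAR_KA     # first ka carries a visible killer

for ch in (STACKER, ASAT):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# The two spellings are different strings, not stylistic variants of one string.
print(stacked == visible)
```

    Because the distinction is carried by different characters, a spell checker or search routine sees two different words, unlike the ZWJ and ZWNJ hints used in the halant model.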

    Those designing encodings, therefore, have to decide whether the script they are encoding conjoins consonants for stylistic reasons or whether making a conjunct in contrast to marking a visible killer is a question of spelling. If it is free and stylistic, then the halant model is probably the best way to go; if it is not free, but is part of the spelling, then the virama model is best.


Copyright © 2017 SIL International and released under the Creative Commons Attribution-ShareAlike 3.0 license (CC-BY-SA) unless noted otherwise. Language data includes information from the Ethnologue. Script information partially from the ISO 15924 Registration Authority. Some character data from The Unicode Standard Character Database and locale data from the Common Locale Data Repository. Used by permission.