
Hindi – Urdu Transliteration issues

Rahmat Yousufzai and Amba Kulkarni. Hindi – Urdu Transliteration issues. ا ب پ ت ث ٹ ج چ ح خ د ذ ڈ ر ز ژ ڑ س ش ص ض ط ظ ع غ ف ق ک گ ل م ن و ہ ھ ء ی ے. Urdu Alphabet. Characteristics of Urdu alphabet.



Presentation Transcript

  1. Rahmat Yousufzai and Amba Kulkarni Hindi – Urdu Transliteration issues

  2. ا ب پ ت ث ٹ ج چ ح خ د ذ ڈ ر ز ژ ڑ س ش ص ض ط ظ ع غ ف ق ک گ ل م ن و ہ ھ ء ی ے Urdu Alphabet

  3. Characteristics of Urdu alphabet • Most Urdu characters join with the following character to form a single ligature. • Example: تکلیف तक्लीफ • This Urdu word is a combination of 5 characters: • 5 4 3 2 1 • ت ک ل ی ف • Urdu characters typically have different shapes in different positions: initial, medial and final.

  4. Characteristics contd ... • Some characters do not join with the following character and are written in full form even if they come in a middle position. • ا آ د ذ ڈ ر ز ژ ڑ و • Example: • مالک آم بدلہ تہذیب بڈھا برما بزنس ویژن کڑیل بولنا • Please note that there is no space in between.
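The joining behaviour described on these two slides can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function name `split_into_ligatures` is hypothetical, the non-joining letters are exactly the ten listed on the slide, and diacritics and other Unicode joining subtleties are ignored.

```python
# Letters that do not connect to the following letter (the list from the slide):
# alif, alif madda, dal, zal, ddal, re, ze, zhe, rre, wav.
NON_JOINERS = set("\u0627\u0622\u062F\u0630\u0688\u0631\u0632\u0698\u0691\u0648")

def split_into_ligatures(word):
    """Split a word into runs of letters that are written connected."""
    ligatures = []
    current = ""
    for ch in word:
        current += ch
        if ch in NON_JOINERS:
            # A non-joining letter ends the connected run.
            ligatures.append(current)
            current = ""
    if current:
        ligatures.append(current)
    return ligatures

# "malik": the alif breaks the word into two connected runs.
print(split_into_ligatures("\u0645\u0627\u0644\u06A9"))
# "takleef": no non-joining letters, so the whole word is one run.
print(split_into_ligatures("\u062A\u06A9\u0644\u06CC\u0641"))
```

The break between runs is purely visual; no space character is stored in the word.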

  5. Hindi Alphabet • अ आ इ ई उ ऊ ऋ • ए ऐ ओ औ अं अः • क ख ग घ ङ • च छ ज झ ञ • ट ठ ड ढ ण • त थ द ध न • प फ ब भ म • य र ल व ष श स ह

  6. Consonants missing in Hindi • ث ، ح ،خ ، ز ، ذ ، ص ، ض ، ط ، ظ ، ع ، غ ، ف ، ق • These characters do not exist in Hindi. They are borrowed from Arabic and are used for the words borrowed from Arabic/Persian only.

  7. contd2

  8. Contd3

  9. Contd4

  10. Contd5 • If ع occurs in the middle of a word or as its last character, most Urdu speakers pronounce it as Alif; in poetry, however, special care is taken to pronounce it correctly.

  11. Contd6

  12. Contd7 • غ This sound also does not exist in Hindi. To represent this character, ग or ग़ is used. In practice, however, Hindi writers usually omit the dot.

  13. Contd8 • ژ This is a Persian character and does not exist in Arabic. In Hindi it is represented by ज or ज़. • Example: Television ٹےلی ویژن टेलीविज़न

  14. Contd9 • ھ This is Do-chashmi He; it contributes its sound only when joined with certain characters. • An important point to note: Arabic has a character ھ (do-chashmi he). This character retains its shape only in the initial and medial positions. In the final position it changes to ہ (gola he). • Suggestion: For Urdu we should use ھ with Unicode code point U+06BE.

  15. Contd8 • ں Noon without dot (Noon Gunna) • When this character occurs in the middle of a word, the dot is written. This creates ambiguity, as it can then be read as ن • Example: • کانچ چھینٹا

  16. Urdu characters borrowed from Hindi • ٹ ، ڈ ، ڑ • ट, ड, ड़ • These characters do not exist in Arabic or Persian. They have been borrowed from Hindi.

  17. Certain Hindi characters representation in Urdu • چھ ، جھ ، گھ ، کھ ، ٹھ ، ڈھ ، ڑھ ، تھ ، دھ ، بھ ، پھ • ख, घ, झ, छ, ठ, ढ, ढ़, थ, ध, फ, भ • These are Hindi characters; they are represented in Urdu by appending ھ (Do-Chashmi He) to the base character.
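A minimal sketch of how the aspirated pairs on this slide might be detected, assuming do-chashmi he is encoded as U+06BE as suggested earlier. The table `ASPIRATED` and the function `find_aspirates` are hypothetical names; the pairing of each Urdu base letter with a Hindi aspirate is the conventional one and is inferred from the two lists on the slide, which are not given in matching order.

```python
DO_CHASHMI_HE = "\u06BE"

# Urdu base letter -> Hindi aspirated letter (pairing assumed, see lead-in).
ASPIRATED = {
    "\u06A9": "\u0916",        # kaf  -> kha
    "\u06AF": "\u0918",        # gaf  -> gha
    "\u0686": "\u091B",        # che  -> chha
    "\u062C": "\u091D",        # jim  -> jha
    "\u0679": "\u0920",        # tte  -> ttha
    "\u0688": "\u0922",        # ddal -> ddha
    "\u0691": "\u0922\u093C",  # rre  -> rha (ddha with dot below)
    "\u062A": "\u0925",        # te   -> tha
    "\u062F": "\u0927",        # dal  -> dha
    "\u067E": "\u092B",        # pe   -> pha
    "\u0628": "\u092D",        # be   -> bha
}

def find_aspirates(word):
    """Return (urdu_pair, hindi_letter) for each base letter followed by do-chashmi he."""
    found = []
    for base, nxt in zip(word, word[1:]):
        if nxt == DO_CHASHMI_HE and base in ASPIRATED:
            found.append((base + nxt, ASPIRATED[base]))
    return found

# The word for "splash" from the Noon Gunna slide starts with che + do-chashmi he.
print(find_aspirates("\u0686\u06BE\u06CC\u0646\u0679\u0627"))
```

Text that encodes the aspirate marker with a different he code point would not be matched, which is one practical reason for the U+06BE recommendation.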

  18. Contd2 • رھ ، لھ ، مھ ، نھ ، ںھ ، وھ ، ےھ • Hindi has no specific characters for these. The sounds are represented as follows: • न्ह, म्ह, ल्ह, र्ह, व्ह, ँह, ंह, ेह

  19. Contd3 • Example: تےرھواں ، کولھو ، تمھارا ، ننھا ، اوںھ ، وھےل • तेरहवाँ, कोल्हू, तुम्हारा, नन्हा, ऊँह, व्हेल • However, in Urdu these words are also written as تےرہواں ، کولہو ، تمہارا • But ننھا is written without change.

  20. Ambiguous Characters • 1. aliph (ا) & Ena (ع) • These two characters are pronounced differently, but Urdu speakers do not pay attention to the difference. • Example: عام (आम common) ، آم (आम mango)

  21. Contd2 • 2. sa: se (ث), sIna (س), svAda (ص) • se (ث) and svAda (ص) are purely Arabic and Persian characters and are used only in Arabic or Persian words, whereas sIna (س) is used in Hindi, Urdu, Persian and Arabic. • All three characters are written as स in Hindi.

  22. Contd3 • 3. Ta: Te (ت), Toya (ط) • Toya (ط) is a purely Arabic and Persian character and is used only in Persian and Arabic words. • Both characters are written as त in Hindi.

  23. Contd4 • 4. he: badI he (ح) & gola he (ہ) • badI he (ح) is an Arabic/Persian character. • gola he (ہ) is common to Arabic, Persian, Urdu and Hindi. • Hindi equivalent for both: ह

  24. Contd5 • 5. ja: jAla (ذ), je (ز), jvAda (ض), joya (ظ), PArasI je (ژ) • The Z sound is not available in Hindi or in almost any other Indian language; j is used instead. • The sounds of jAla (ذ), je (ز), jvAda (ض) and joya (ظ) are almost the same, and all are Arabic/Persian characters. • PArasI je (ژ) is a purely Persian character and is not available even in Arabic. • For all the above characters ज or ज़ is used. Most of the time the dot below is not written.
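The groups 2 to 5 above amount to a many-to-one mapping from Urdu to Hindi, which is easy in one direction and ambiguous in the other. The sketch below illustrates this; the names `GROUPS`, `URDU_TO_HINDI` and `urdu_candidates` are hypothetical, and the table covers only the letters named on these slides.

```python
# Hindi letter -> Urdu letters that the slides say are written with it.
GROUPS = {
    "\u0938": "\u062B\u0633\u0635",                    # sa: se, sin, svad
    "\u0924": "\u062A\u0637",                          # ta: te, toy
    "\u0939": "\u062D\u06C1",                          # ha: badi he, gol he
    "\u091C\u093C": "\u0630\u0632\u0636\u0638\u0698",  # za: zal, ze, zvad, zoy, zhe
}

# Urdu -> Hindi is a plain lookup because every Urdu letter has one target.
URDU_TO_HINDI = {u: h for h, letters in GROUPS.items() for u in letters}

def urdu_candidates(hindi_letter):
    """Hindi -> Urdu: return every Urdu letter that could be meant."""
    return list(GROUPS.get(hindi_letter, ""))

print(URDU_TO_HINDI["\u0635"])               # svad maps to sa
print(len(urdu_candidates("\u091C\u093C")))  # five Urdu letters share one Hindi letter
```

Choosing among the candidates when going from Hindi to Urdu needs information beyond the character itself, for example a word list; that step is not shown here.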

  25. Contd2 • Note that if the character following अं is प, फ, ब, भ or म, it is pronounced as "अम"; otherwise it is pronounced as "अन". • Example: • 1. अंबाला, अंभोज, अंमर انبالہ ، انبھوج ، عنبر • 2. चंपा, कंफू, बंबू, गंभीर, संमान • 3. अंक, अंग, अंतर, अंधा, अंडा

  26. Contd3 • Also, when अं comes as the last character of a word, it gives the sound of "अम". • Example: अहं, स्वयं, बालकं • بالکم سویم اہم • Interestingly, the same rule of प, फ, ब, भ, म applies in Urdu also, but the character م is mostly used in proper nouns and English words. • Example: امبانی امپھل امپائر • अंपायर, अंफल, अंबानी
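The pronunciation rule for the anusvara stated on these two slides can be written as a small function. This is a sketch of the rule exactly as stated, with a hypothetical function name `anusvara_sound`; it does not attempt to choose the Urdu spelling, which the slides note may still use ن.

```python
ANUSVARA = "\u0902"
# pa, pha, ba, bha, ma
LABIALS = set("\u092A\u092B\u092C\u092D\u092E")

def anusvara_sound(word, index):
    """Return 'm' or 'n' for the anusvara at word[index], per the rule on the slides."""
    if word[index] != ANUSVARA:
        raise ValueError("no anusvara at this position")
    if index == len(word) - 1:
        return "m"  # word-final anusvara sounds like m
    return "m" if word[index + 1] in LABIALS else "n"

print(anusvara_sound("\u091A\u0902\u092A\u093E", 1))        # champa: followed by pa
print(anusvara_sound("\u0905\u0902\u0915", 1))              # ank: followed by ka
print(anusvara_sound("\u0938\u094D\u0935\u092F\u0902", 4))  # svayam: word-final
```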

  27. Contd4 • अः This is also a combination of अ and ः • It does not come as the first character of a word; it occurs in the middle and at the end. • Example: अंतःप्रवेश, अंतःकरण, प्रायः, अतः • اتہ پرایہ انتہ کرن انتہ پروےش

  28. Contd6 • ङ, ञ, ण • These characters do not exist in Urdu. Instead ن is used. • Example: वाङ्मय, पिङ्गला • وانگ مے ، پنگلا • चञ्चल (चञ्चल), गुञ्जन (गुञ्जन) • چنچل ، گنجن • कारण, कणक • کارن ، کنک

  29. Contd7 • ष, श • These two characters have almost the same sound, with a minute difference. In Urdu there is only one character, ش, to express this sound. • Example: • षष्ठी, आकर्षण, पुरूष • ششٹھی ، آکرشن ، پرش • शरीर, मुश्किल, किशमिश • کشمش مشکل شریر

  30. Diacritic marks in Urdu • In Urdu, there are no Matras like Hindi. • Urdu has some diacritic marks but uses them only in elementary books.

  31. Contd2 • Zabar َ This is placed above the character to indicate a consonant with अ. • Ex- بَ ब • Zer ِ This is placed below the character to indicate the vowel इ or ए. • Ex- بِ बि, बे

  32. Contd3 • Pesh ُ This is placed above the character and adds the sound of उ to the sound of the character on which it is applied. • Ex- بُ बु • Jazama ْ Equivalent of Halant in Devanagari • Ex- بْ ब् शब्द شبْد

  33. Contd4 • Tashdeed ّ This is used for reduplication, as in ہلّا _گلّا، دھبّا • हल्लागुल्ला, धब्बा • Do zabar ً This is placed above the last character Alif (ا) and gives the sound of n. Note that the character just before Alif should carry Zabar. • Example: فوراً फ़ौरन (the character ر carries Zabar, but normally it is not written)

  34. Contd5 • Do zer ٍ This is placed below the last character Alif (ا) and gives the sound of n. Note that the character just before Alif should carry Zer, but normally it is not written. • Example: نسلاً بعد نسلاٍ नस्लन बाद नस्लिन (the character ل carries Zer) • Ulta pesh ٗ This is placed above the character and gives the sound of oo (ऊ). • Example: بعدہٗ ، مالہٗ बादहू, मालहू

  35. Contd6 • Khada zabar ٰ This adds the sound of Alif to the character on which it is applied. It is mostly put on Choti ye and Badi ye (ی ، ے), and its effect, i.e. the Aa sound, is transferred to the character preceding the Choti ye or Badi ye. It also occurs in the middle of a word, where the Aa sound is added to the character on which it is applied. • Example: اعلیٰ ، مصطفےٰ ، ہٰذا • आला, मुस्तफ़ा, हाज़ा

  36. Contd7 • Other Diacritic marks are pesh and khada zer.

  37. Gender • For most words derived from Hindi/Indian languages, gender does not change between Urdu and Hindi. But for some of the words borrowed from Arabic or Persian, the gender changes. • vyavastha (feminine) انتظام (masculine) • aakarshan (masculine) کشش (feminine) • prakash (masculine) روشنی (feminine)

  38. Compound Words • In Urdu two words can be joined together to form a compound. • (This is similar to English, where the apostrophe joins words and gives the sense of "of".) In Urdu, the words are joined by "Izaafat". There are three types of Izaafat.

  39. Contd2 • 1. Zer ِ is added after the first word and then the second word is written. • Example: دردِ دل • Most importantly, there must be a space after the zer ِ ; otherwise the words may join together and become difficult to read correctly. • Example: شاخِ گل • If the space is not given, the word will appear like this: شاخِگل

  40. Contd3 • 2. If the last character of the first word is "he" or "choti ye", then Hamza is added after the first word. Example: نغمۂ آب، گرمیٔ عشق

  41. Contd4 • 3. If the last character of the first word is alif or wav, then "Hamza Badi ye" is added after the first word. • Example: اداۓ خاص، بوۓ گل
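The three Izaafat types can be summarised as a choice of marker based on the last letter of the first word. The sketch below is an illustration under stated assumptions: the function name `join_with_izaafat` is hypothetical, "he" is taken to be gol he (U+06C1), the hamza is the combining hamza above (U+0654), and "Hamza Badi ye" is the single code point U+06D3. Real text may use other encodings of these letters.

```python
ZER = "\u0650"            # zer, used in the default case
HAMZA_ABOVE = "\u0654"    # combining hamza above, used after he or choti ye
HAMZA_BADI_YE = "\u06D3"  # badi ye with hamza above, used after alif or wav

GOL_HE, CHOTI_YE = "\u06C1", "\u06CC"
ALIF, WAV = "\u0627", "\u0648"

def join_with_izaafat(first, second):
    """Join two words with the Izaafat marker chosen by the last letter of `first`."""
    last = first[-1]
    if last in (GOL_HE, CHOTI_YE):
        marker = HAMZA_ABOVE
    elif last in (ALIF, WAV):
        marker = HAMZA_BADI_YE
    else:
        marker = ZER
    # The space after the marker keeps the two words from joining visually.
    return first + marker + " " + second

print(join_with_izaafat("\u062F\u0631\u062F", "\u062F\u0644"))        # dard + dil
print(join_with_izaafat("\u0627\u062F\u0627", "\u062E\u0627\u0635"))  # ada + khaas
```

The explicit space in the return value reflects the point made under the first type: without it the two words render as one.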

  42. Compound words without Izaafat • In Hindi some words are normally written together, for example isaka, usaka, Taajmahal etc., but in Urdu they are written separately: • اس کا، اس کا، تاج محل • Note that if Tajmahal is written together it will not be readable: تاجمحل

  43. Typographical errors • 1. In Urdu, when choti ye ی or badi ye ے is the last character of a word, it retains its original shape. In the middle of a word, however, the two are difficult to tell apart, because they look the same in handwritten text and in word-processing packages such as Inpage. Unicode badi ye does not join with the following character. • Example: کےلا کیلا

  44. Contd2 • In all Urdu packages this will be written as کیلا, which creates ambiguity and makes machine transliteration problematic.

  45. Contd3 • 2. There are certain characters in Urdu which do not join with the following character. These characters are ا،آ،د، ذ، ڈ، ر، ز، ژ، ڑ ، و . Data entry operators often omit the space between two words when the first ends in one of these characters, because it is difficult for them to notice that the words have run together. As a result, the machine takes the two words as one and fails to process the word.

  46. Contd4 • 3. Diacritic marks are ignored in Urdu, and hence there is apparently no difference between • इस اس and उस اس
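The loss of information when diacritics are dropped can be shown directly. The sketch below assumes the marks described in the diacritics slides map to the Unicode code points listed in `DIACRITICS`; the set and the function name `strip_diacritics` are illustrative, not part of the presentation.

```python
DIACRITICS = {
    "\u064E",  # zabar
    "\u0650",  # zer
    "\u064F",  # pesh
    "\u0652",  # jazm
    "\u0651",  # tashdeed
    "\u064B",  # do zabar
    "\u064D",  # do zer
    "\u0657",  # ulta pesh
    "\u0670",  # khada zabar
}

def strip_diacritics(text):
    """Remove the diacritic marks, as ordinary Urdu writing does."""
    return "".join(ch for ch in text if ch not in DIACRITICS)

IS = "\u0627\u0650\u0633"  # alif + zer + sin, read "is"
US = "\u0627\u064F\u0633"  # alif + pesh + sin, read "us"

print(IS == US)                                      # False
print(strip_diacritics(IS) == strip_diacritics(US))  # True
```

Once the marks are gone, a transliterator has only context to decide between the two Hindi spellings.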

  47. Ambiguity • हवा ہوا (noun) • हुआ ہوا (verb) • In Urdu both words are written the same way, but their meanings differ. In Hindi the spellings also differ.
