About
The online Pashto-English dictionary is a research project funded by the University of Arizona Libraries.
- Project Directors: Yan Han and Atifa Rawan
- Language Experts: Noor Dawari and Rafi Baaser
- Architecture and Design: Yan Han and Paul Shen
- Software Developers: Paul Shen and Junxian Zhang
- 2020 June: Migrated to Ubuntu 18.04
- 2014 July: Initial design testing
- 2014 October: Beta version released with 12,000 Pashto words
To create a recognized Pashto dictionary that supports a standard Pashto language with standardized spelling, vocabulary, and pronunciation.
Pashto (پښتو), alternatively spelled Pukhto, Pakhto, or Pushto and historically known as Afghani (افغاني), is one of the two official languages of Afghanistan (the other is Dari/Farsi/Persian). It is also spoken as a regional language in Pakistan. Pashto is spoken by over 16 million people in Afghanistan and 30 million in Pakistan.
While building metadata records for the Afghanistan Digital Libraries, we found several Pashto language dictionaries online but encountered issues with standardized spelling, pronunciation, romanization/transliteration, and limited content. Standard languages commonly feature: a recognized dictionary (standardized spelling and vocabulary); a recognized grammar; a standard pronunciation; a linguistic institution defining usage norms; constitutional (legal) status (frequently as an official language); effective public use (court, legislature, schools, news media); and acceptance in the community.
The Pashto-English dictionary is nearing completion. It currently contains about 12,000 words, and we are still adding new ones. In addition, we are working to improve and extend the English-Pashto dictionary, which is not yet deployed.
The Pashto-English dictionary has been created with the following objectives in mind:
- Standardized spelling and vocabulary - Our language specialists verify and key in every word to ensure that its spelling, pronunciation, and meanings are correct.
- Standard Pronunciation - Like many other languages, Pashto has various dialects. Our language specialists checked every term and verified the most common and standard pronunciations used in major parts of Afghanistan.
- Standardized Romanization
The dictionary currently follows the American Library Association - Library of Congress (ALA-LC) romanization scheme. Several romanization systems are in use, including ALA-LC, BGN/PCGN, DIN 31635, ISO 233, ArabTeX, and others. Published Pashto dictionaries normally use either one of these systems or a combination of several.
- ALA-LC is a set of romanization guidelines developed by ALA and LC. The guidelines have been used in North American libraries and the British Library since 1975. The ALA-LC romanization guidelines cover over 72 languages. The Pashto guidelines were updated in 2013 and are available online.
- BGN/PCGN romanization refers to the romanization systems adopted by the United States Board on Geographic Names (BGN) and the Permanent Committee on Geographical Names for British Official Use (PCGN). Used within the U.S. and British governments and their agencies, these systems serve primarily to standardize roman spellings of foreign geographical names written in non-Roman scripts (NGA, 2014). The BGN/PCGN systems were approved for geographic names, and they have also been used for romanization of text and names.
- ISO 233 is titled “Information and documentation - Transliteration of Arabic characters into Latin characters”. It consists of 3 parts, where 233-2 is “Arabic language - Simplified transliteration” and 233-3 is “Persian language - Simplified transliteration”.
- DIN 31635 is a German standard for the transliteration of the Arabic alphabet, adopted in 1982 and used in most German-language publications.
- ArabTeX is a free software package for TeX and LaTeX (document markup languages widely used in academia) that produces output for languages written in the Arabic script, such as Arabic, Persian, Urdu, and Pashto.