Nafees Nastalique - Character-Based Nastalique Font for Urdu

Grant awarded in January 2002 to National University of Computer and Emerging Sciences to develop a Nastalique font for Urdu.

Abstract and Project Proposal

Nafees Web Naskh

Center for Research in Urdu Language Processing (CRULP, www.crulp.org) at National University of Computer and Emerging Sciences (www.nu.edu.pk) is pleased to announce the Beta release of the character-based Nafees Web Naskh Open Type Font for writing Urdu in Naskh script, based on the Unicode standard. Guidance and calligraphy of basic glyphs for the font have been provided by Syed Jameel-ur-Rehman, a pupil of Syed Nafees Shah and Hafiz Syed Anees-ul-Hassan.

Previous Nafees fonts focused on maintaining the calligraphic beauty of Urdu. Numerous rules were therefore put in, which made the fonts slow and made Nafees Nastaleeq, Nafees Naskh and Nafees Pakistani Naskh harder to use on web pages. Keeping this problem in view, the CRULP team has worked on developing a simplified Naskh style while still trying to maintain the beauty of Urdu Naskh. Nafees Web Naskh OTF contains approximately 250 glyphs, including 2 ligatures. With very few OpenType substitution rules and not a single OpenType positioning rule, this font is much faster than all of the other Nafees fonts and can be used extensively for making Urdu web sites. The font is also developed with a flat qat to enable better viewing at small font sizes. This font also supports the basic ASCII characters.

This font is being released under an Open Source License (available at the website). Nafees Web Naskh is freely downloadable from www.crulp.nu.edu.pk or www.crulp.org. Please send comments and report bugs to CrulpFonts@nu.edu.pk.

Free Font For Writing Urdu

Center for Research in Urdu Language Processing at National University of Computer and Emerging Sciences is pleased to announce the release of character-based Nafees Nasta’leeq Open Type Font for writing Urdu, based on Unicode standard.

This font is developed according to calligraphic rules, following the style of Syed Nafees Al-Hussaini (Nafees Raqam), who is one of the finest calligraphers of Pakistan. Guidance and calligraphy of basic glyphs for the font have been provided by Syed Jameel-ur-Rehman, a pupil of Syed Nafees Shah and Hafiz Syed Anees-ul-Hassan. Nafees Nasta’leeq OTF contains approximately 1,000 glyphs, including about 26 ligatures. This font is operable on all platforms supporting OTF specifications. This work has been funded by the Small Grants Program of IDRC, APDIP UNDP and APNIC.

Nafees Nasta’leeq allows Urdu computing on Microsoft Windows 2000, NT and XP, Unix and Linux platforms. This font enables desktop and internet publishing, and electronic communication in Urdu using existing software (without any plug-ins) supporting OTF specifications, e.g. MS Word, MS Excel, MS Outlook (email), Internet Explorer, Netscape Navigator, Mozilla and MS PowerPoint.

Nafees Nasta’leeq is freely downloadable from www.crulp.nu.edu.pk or www.crulp.org. Read the release notes in PDF format (795kb).


Project Title:
Nafees Nastalique - Character-Based Nastalique Font for Urdu

Recipient Institution:
National University of Computer and Emerging Sciences
852B Block, Faisal Town,
Lahore, Pakistan

URL: www.nu.edu.pk

Project Leader:
Dr. Sarmad Hussain, Senior Research Fellow
sarmad.hussain@nu.edu.pk

Amount and Duration: USD 29,833 / 18 months

Commencement Date: February 2002

Abstract

With 60 million speakers in more than 20 countries, Urdu is a widely spoken language, especially in South Asia. It has a rich tradition of poetry and prose. Traditionally, Urdu has been written in Nastalique script. This script is cursive having a complex and context-sensitive structure. Though it is defined by well-formed rules passed down through generations of calligraphers, these rules have not been quantitatively examined and published in enough detail to enable modeling of character-based Nastalique font for computers. In addition, prevalent font specifications, e.g. True Type font, are not mathematically powerful enough to model such complex fonts. Therefore, there is no practical way to post or exchange information in Urdu through electronic media including internet (except putting scanned images of text, which makes web sites extremely slow to access). Recent advances in font technology now enable complex scripts like Nastalique to be modeled. Open Type Font (OTF) specification is one such example. This project aims to perform a quantitative analysis of Nastalique rules, and model them using OTF. On successful completion of the project, it will become easy to disseminate information in Urdu language through electronic media. As OTF is a standard formalism, no specialized software will be required to read and render this font, as it will easily be operable within the existing software supporting OTF. Output of this project will be a character-based Nastalique font for Urdu. Furthermore, the project will also produce research papers quantifying Nastalique rules to significant detail and analyzing methods for modeling and rendering complex fonts.

Background and Justification

Urdu is the national language of Pakistan and has more than 60 million speakers in more than 20 countries. Even with such extensive readership, very limited information is published on internet in Urdu. A significant limiting factor is absence of a character-based font for Urdu. Urdu is written in Nastalique script which is highly context-sensitive and cannot be realized using earlier font specifications (e.g. true type fonts). Therefore, Urdu websites are made by either using Naskh font (normally used for Arabic, and which is unnatural for Urdu readership) or by putting scanned images of text written in Nastalique (which takes a large amount of memory and makes the websites very slow to access). Therefore, to make Urdu web and other publishing more effective and efficient, a character-based Nastalique font for Urdu needs to be developed, which can be used by existing software.

Project Objectives

The recent Open Type Font (OTF) specification extends the existing font formalism to enable the realization of context-sensitive writing systems like Urdu Nastalique. This project focuses on using this formalism to develop a Nastalique font for Urdu, which in turn requires first, a scientific study of Nastalique orthography, and then its modeling using OTF specification.

Development of this font will enable the users to publish electronically in Urdu and reach out to the extensive readership across the world. As OTF is a standard formalism, specialized software would not be needed, and this font will be enabled within the existing applications and web browsers (which support OTF).

Project Beneficiaries

Definition and free disbursement of Nastalique font for Urdu will accelerate Urdu publishing through electronic media and will benefit the 60 million readers of Urdu across the world. People who do not understand a second language (e.g. English, which is the lingua franca of computers and internet) will also be able to publish and access web pages, email, chat, etc., and a host of other computer applications.

Project Sustainability

Center for Research in Urdu Language Processing (CRULP, www.crulp.nu.edu.pk) is a dedicated effort of National University of Computer and Emerging Sciences (NUCES) showing its long-term commitment to the development of Urdu Computing. CRULP is currently working in the areas of script, speech and language processing of Urdu, with dedicated faculty and labs. In addition, NUCES offers coursework and specialization in its graduate Computer Science program in these areas.

Development of OTF for Urdu Nastalique is one of many projects initiated at CRULP to provide Urdu interface to computing and information technology. The research and development undertaken for this project will be used towards these long-term objectives of CRULP.

Thus, the work done for this project will be sustained through the continuing research and development at CRULP.

Project Methodology

What is Nastalique?

Nastalique font is computationally complex for many reasons. First, letters are written using a flat nib (traditionally using bamboo pens) and both trajectory of the pen and angle of the nib define a letter. Each letter has precise writing rules, relative to the length of the flat nib, as described in Figure 1 below.

Figure 1: Rules for writing mad, alif, bay, tay and jim letters of Urdu in Nastalique

Second, this cursive font is highly context sensitive. The shape of a letter depends on multiple neighboring characters. Current work shows that the shape of each target letter Lt may depend on the context of four neighboring characters: L1 Lt L2 L3 L4. This analysis also shows that a letter may have as many as sixty shapes depending on context. Third, there is no concept of right or left alignment, and all text written in Nastalique is justified. In addition, there is no concept of space in Nastalique font.

Therefore, unlike roman script that uses space to separate words and justify lines, justification in Nastalique is achieved through two measures: (i) stretching of certain letters within ligatures, (ii) moving and overlapping ligatures. This adds another dimension to existing complex contextual rules to determine the shape of letters. Further, this also adds to the difficulty of computing multiple baselines, which depend on letters within ligatures.
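
The context dependence described above can be illustrated with a small sketch. It is not the project's rule-base: the rule table, the glyph names and the reading of the context window as one preceding and three following letters are all assumptions made for illustration. Only the letter names (bay, tay, jim, alif) come from Figure 1. The sketch picks, for a target letter, the rule whose context matches and is most specific.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class ShapeRule(NamedTuple):
    target: str              # letter whose shape is being chosen
    before: Tuple[str, ...]  # letters required immediately before the target (at most 1 here)
    after: Tuple[str, ...]   # letters required immediately after the target (at most 3 here)
    shape: str               # hypothetical glyph name, not taken from the real font


# Hypothetical rules for one letter; a real inventory would be far larger.
RULES = [
    ShapeRule("bay", (), (), "bay.default"),
    ShapeRule("bay", (), ("alif",), "bay.init.before_alif"),
    ShapeRule("bay", (), ("jim",), "bay.init.before_jim"),
    ShapeRule("bay", (), ("jim", "alif"), "bay.init.before_jim_alif"),
    ShapeRule("bay", ("tay",), ("alif",), "bay.medi.after_tay_before_alif"),
]


def select_shape(letters: Sequence[str], index: int) -> Optional[str]:
    """Return the shape of letters[index] given by the most specific matching rule."""
    target = letters[index]
    best = None
    for rule in RULES:
        if rule.target != target:
            continue
        nb, na = len(rule.before), len(rule.after)
        prev = tuple(letters[max(0, index - nb):index]) if nb else ()
        nxt = tuple(letters[index + 1:index + 1 + na])
        if prev == rule.before and nxt == rule.after:
            # Prefer the rule that constrains the most neighbours.
            if best is None or nb + na > len(best.before) + len(best.after):
                best = rule
    return best.shape if best else None


print(select_shape(["bay", "jim", "alif"], 0))
print(select_shape(["tay", "bay", "alif"], 1))
```

A rule with an empty context acts as the fallback shape, so every letter listed in the table always resolves to some glyph.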

Proposed Methodology to Model Nastalique

Due to the complexities outlined above, considerable effort is required to segment connected letters (ligatures) and to formulate placement and joining rules. To avoid these complexities, current systems use scanned images of ligatures, without segmentation. Although effective in producing good Nastalique, these ligature-based systems require excessive storage space because the number of ligatures is far greater than the number of letters (at least 25,000 ligatures vs. 40 characters for Urdu).

Current project addresses this issue by creating a character-based font for Nastalique for Urdu. The approach taken is to identify all the different shapes assumed by characters in different contexts, formulate rules to join them, and model them using OTF specification.
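
The size of the saving can be estimated from figures quoted on this page. The short calculation below is only a rough comparison of inventory sizes: it takes the 25,000 ligatures and 40 characters from the paragraph above, the upper limit of sixty shapes per letter from the description of Nastalique, and the approximate 1,000 glyphs of the released Nafees Nasta'leeq font. It says nothing about file sizes, which depend on how each shape is stored.

```python
# Figures quoted in this document.
LIGATURES = 25_000            # minimum ligature count for a ligature-based system
URDU_CHARACTERS = 40          # characters of Urdu
MAX_SHAPES_PER_LETTER = 60    # "as many as sixty shapes depending on context"
RELEASED_GLYPHS = 1_000       # approximate glyph count of Nafees Nasta'leeq OTF

# Worst case for a character-based font: every letter takes the maximum number of shapes.
upper_bound_shapes = URDU_CHARACTERS * MAX_SHAPES_PER_LETTER

print(upper_bound_shapes)                           # 2400
print(f"{LIGATURES / upper_bound_shapes:.1f}")      # 10.4
print(f"{LIGATURES / RELEASED_GLYPHS:.1f}")         # 25.0
```

Even under the worst-case assumption, the character-based inventory is roughly a tenth of the ligature inventory, and the released font is about 25 times smaller in glyph count.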

The project is divided into three main phases. First phase is based on orthographic analysis of Nastalique for Urdu. Second phase concentrates on modeling these findings using the current font technology - Open Type Font (OTF) specification. The final phase focuses on quality assurance of the designed system.

Orthographic Analysis of Nastalique for Urdu

This phase is divided into the following steps:

1. Context-dependent shape identification - part 1
The first part of the analysis will take this continuously varying script and determine a set of shapes for each letter occurring in different contexts. Initial work done in this area shows as many as 700 shapes for 20 characters. Because calligraphers are not trained to look at shapes from the point of view of font modeling, the research team will do this work.

2. Context-dependent shape identification - part 2
Once the initial analysis of shapes is completed, two calligraphers will be employed to verify it, making the shape inventory more objective and accurate.

3. Classification of Urdu characters
Various letters change similarly in similar contexts. This stage will identify the characters and contexts for which similar shape changes are observed. Documenting this behaviour will reduce the number of entries/rules in the modeling stages.

4. Formulation of placement rules
As noted above, to form a ligature (letters connected together in one form), various letters are moved in a two dimensional plane to join effectively. This results in multiple baselines for a letter depending on the context. This part of the study will concentrate on determining these placement rules.

5. Formulation of joining rules
In the final stage of analysis of Nastalique, the joining rules for various letters will be determined, i.e. how various shapes of letters join effectively with other letter shapes and how pen angle and trajectory may be kept continuous over these joins.
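
The placement and joining steps above can be pictured with a minimal sketch. It assumes, purely for illustration, that each letter shape carries an entry point and an exit point, and that a ligature is assembled by making each exit coincide with the next entry. The glyph names and coordinates are hypothetical, and this is not claimed to be the method used in Nafees Nastalique. Letters are listed in writing order (right to left), and the last letter is placed on the baseline.

```python
from typing import List, NamedTuple, Tuple

Point = Tuple[int, int]


class Glyph(NamedTuple):
    name: str
    entry: Point  # where the stroke from the previous letter joins, relative to the glyph origin
    exit: Point   # where the stroke leaves toward the next letter, relative to the glyph origin


def place_ligature(glyphs: List[Glyph]) -> List[Point]:
    """Return an origin for each glyph so that joins meet and the last glyph sits at (0, 0)."""
    origins = [(0, 0)]
    for prev, cur in zip(glyphs, glyphs[1:]):
        ox, oy = origins[-1]
        # Absolute position of the join: the previous glyph's exit point.
        jx, jy = ox + prev.exit[0], oy + prev.exit[1]
        # Move the current glyph so that its entry point lands on the join.
        origins.append((jx - cur.entry[0], jy - cur.entry[1]))
    # Shift everything so that the final glyph rests on the baseline.
    lx, ly = origins[-1]
    return [(x - lx, y - ly) for x, y in origins]


# Hypothetical shapes in arbitrary font units; unused points are left at (0, 0).
ligature = [
    Glyph("bay.init", entry=(0, 0), exit=(0, 0)),
    Glyph("jim.medi", entry=(400, 150), exit=(0, 0)),
    Glyph("alif.fina", entry=(200, 100), exit=(0, 0)),
]

print(place_ligature(ligature))  # [(600, 250), (200, 100), (0, 0)]
```

In this example the first letter ends up 250 units above the baseline and the second 100 units above it, which shows how the height of a letter depends on the letters that follow it in the ligature.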

Nastalique modeling using OTF specification

This phase is divided into the following steps:

1. Calligraphy of ligatures
Once all the shapes of each letter have been determined, ligatures containing each shape have to be carefully written on a fixed scale for scanning purposes. This work will be done by an expert calligrapher using a fixed pen width.

2. Scanning and Segmentation
Ligatures that have been written will be scanned at a high resolution (approx. 4000 dpi) and researchers under the supervision of an expert calligrapher will segment target letter shapes. The segmentation will be done using Adobe Photoshop.

3. Vectorization of Segments for font production
The segmented letters will then be fitted with splines using Corel Trace (shipped within Corel Draw). These will be the basic vectorized shapes used in the OTF shape tables.

4. Modeling within OTF specification
Once the basic shapes have been incorporated, the placement and joining rules researched earlier will be modeled within OTF specification to create Nastalique.
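
The earlier classification step matters at this stage because a rule stated once for a class of letters stands in for many letter-by-letter rules. The sketch below illustrates the saving with two groups of Urdu letters that share a base form and differ only in their dots or marks. The class names, the rule and the glyph-name suffix are hypothetical and are not taken from the project's rule-base.

```python
from itertools import product

# Letters that share a base form (illustrative grouping).
CLASSES = {
    "BAY": ["bay", "pay", "tay", "ttay", "say"],
    "JIM": ["jim", "chay", "hay", "khay"],
}

# One hypothetical class-level rule: a BAY-class letter followed by a
# JIM-class letter takes the shape variant named by the suffix.
CLASS_RULES = [("BAY", "JIM", "init.before_jim")]


def expand(class_rules, classes):
    """List the letter-by-letter rules that the class-level rules stand in for."""
    return [
        (first, second, f"{first}.{suffix}")
        for first_class, second_class, suffix in class_rules
        for first, second in product(classes[first_class], classes[second_class])
    ]


letter_rules = expand(CLASS_RULES, CLASSES)
print(len(CLASS_RULES), len(letter_rules))  # 1 20
print(letter_rules[0])                      # ('bay', 'jim', 'bay.init.before_jim')
```

Here a single class rule replaces twenty letter-pair rules, and the reduction compounds as the context grows to more neighbouring letters.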

Verification and Validation of Nafees Nastalique

During the development of Nafees Nastalique a test plan will be generated to verify and validate the font generated. Two aspects will be addressed. First, how this font described with OTF specification fits in with existing applications supporting this font standard. Second, to what degree the OTF modeling has faithfully captured the rule-base of Nastalique.

1. Conformance with software supporting OTF specification
Conformance will be tested with two types of applications
a. With common browsers, by creating and publishing a web-site.
b. With common desktop tools.

2. Verification of rendered Nastalique
The rendered Nastalique will be verified by expert calligraphers. They will scrutinize the basic shapes, placement and joining of these shapes vis-à-vis Nastalique. Two calligraphers will be used to maintain objectivity.
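
For the browser conformance test, a page of sample text can be generated and opened in each browser under test. The sketch below is one possible way to build such a page, not the project's test plan. The font family name "Nafees Nastaleeq" and the sample words are assumptions, and the font must already be installed for the browser to use it.

```python
from html import escape
from typing import Sequence


def build_test_page(font_family: str, samples: Sequence[str]) -> str:
    """Return a right-to-left HTML page that shows each sample in the given font."""
    rows = "\n".join(f"    <p>{escape(text)}</p>" for text in samples)
    return (
        "<!DOCTYPE html>\n"
        '<html lang="ur" dir="rtl">\n'
        "<head>\n"
        '  <meta charset="utf-8">\n'
        f"  <style>body {{ font-family: '{font_family}', serif; font-size: 24px; }}</style>\n"
        "</head>\n"
        "<body>\n"
        f"{rows}\n"
        "</body>\n"
        "</html>\n"
    )


# The family name below is an assumption; use the name the installed font reports.
page = build_test_page("Nafees Nastaleeq", ["اردو", "پاکستان"])
print(page)
```

The same sample list can then be shown to the calligraphers, so that the rendered output is judged on identical text in every application.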

Project Timelines


The timelines of each of the phases explained in the Methodology section are tabulated below, showing their duration, expected start and end dates and their dependency on any earlier activities. Each phase is divided into sub-phases such that a report will be generated at the end of each sub-phase (as explained in the Monitoring section below).


Project Outputs

The project will have the following outputs.

1. Self-extracting installation program that will install Nafees Nastalique.

2. Publications on the lexicon and rule base developed for Nastalique and modeling techniques employed for the realization of Nafees Nastalique.

3. Sample web-site creation and publication using the Nafees Nastalique.

4. The above will be made available for free download at the website of CRULP, NUCES ( www.crulp.nu.edu.pk ).

5. The developed font will also be advertised at relevant research and development forums.

Project Monitoring

In order to monitor the project, the following mechanisms will be put in place. First, a detailed project plan (in line with the plan shown above) will be made and updated weekly to track the progress of the project.

Second, reports listed below will be published at various milestones shown in the timelines.

  1. Context-dependent Shape Inventory - Document detailing context-dependent shapes of each letter, with examples.
  2. Placement and Joining rules - Document containing rule-base for placement and joining of letters, and classification of letters optimal for this rule-base.
  3. Vectorized Font - Vectorized segments without placement or joining specification.
  4. OTF Specification - Representation of Urdu Nastalique font using OTF specification.
  5. Test Plan and Execution - Pre-specified test cases for establishing conformance of Nafees Nastalique with some general-purpose software supporting OTF specification, and verification of rendered Nastalique against these test cases.
  6. Nafees Nastalique - Publication of Urdu Nastalique font using OTF specification.

Finally, six-monthly progress reports will be published. These reports will highlight progress along the research and development activities and the financial expenditures incurred against the allocated budget.

All the above will be available for internal and external audits. 

Additional Resources

Interim Technical Report
Final Technical Report (PDF, 154kb)

ApniUrdu.com
UrduPoint.com
UrduSeek.com
UrduWord.com
Urdustan.com


Last modified 2005-06-21 02:08 PM
 
 
