About Arabic Domain names

Domain names are used widely by Internet users to locate resources on the Internet via a format that is easy to remember and understand. They are used instead of the numerical addresses which are known as Internet Protocol (IP) addresses. Hence, the main objective of using domain names is to ease and simplify the use of the Internet.
Despite the worldwide spread of the Internet, the domain name system for a long time did not support languages other than English for locating resources on the Internet. Users in non-English-speaking countries were therefore at a disadvantage. Using domain names in a language that is different from the users' native language defeats the main objective of having the domain name in characters rather than just numbers.

Internationalized Domain Names (IDNs) are increasingly required by nations and users world-wide. The Internet penetration in some countries is very low due to the language barrier. For example, the Internet penetration in the Arab world is estimated to be around 5.4% in 2004, which is indeed very low.
This short introduction will highlight some critical issues related to Arabic domain names pointing out some historical events and achievements.

Needs and motives

The Internet has become a global network that covers most, if not all, countries of the world and has hundreds of millions of users. Currently, it is estimated that more than 45% of Internet content is in languages other than English. Also, according to a 2011 estimate, at least 73% of web users are native speakers of a language other than English. Despite the worldwide spread of the Internet, the domain name system has only recently supported languages other than English.

Users in non-English-speaking countries, such as Arab users, were at a disadvantage. Using domain names in a language that is different from the native language defeated the main objective of having the domain name in characters rather than just numbers. Internet penetration in the Arab world is estimated to be about 3.3%, which is very low compared to other language communities in the world. One of the obstacles to the growth of Internet use in the Arab world was the language barrier.

Many countries and nations are encouraging their people to use the Internet. It is therefore important that the Internet support the Arabic language not only in web content but also in the means of reaching that content, namely domain names.

The requirement was that Arabic users should be able to use the Arabic language from the moment they switch on the computer until the required data are retrieved from the Internet. This entails limiting the need to enter non-Arabic web (URI) addresses, particularly if the sites are in Arabic.

There are a number of reasons why Arabizing domain names is needed, such as:

  • There is only a small percentage of Arabs who can read and write English.
  • There are many well-known Arabic names that need to be used on the Internet.
  • English letters are not capable of representing (or substituting for) Arabic letters.
  • Encouraging the use of the Internet by Arabs who do not speak English. As the trend

Internationalized domain names (IDNs)

Multilingual domain names, or internationalized domain names, were first developed in Asia-Pacific countries in 1998. This later led to the creation of a number of not-for-profit organizations to supervise and pursue the deployment of multilingual domain names.

Some of those organizations are the Multilingual Internet Names Consortium (MINC), the Arabic Internet Names Consortium (AINC), the Chinese Domain Name Consortium (CDNC), the International Forum for IT in Tamil (INFITT), and the Japanese Domain Names Association (JDNA). Also, the Internet Corporation for Assigned Names and Numbers (ICANN) established an internal Internationalized Domain Name (IDN) Working Group, and the Internet Engineering Task Force (IETF) created an internationalized DNS group dedicated to exploring the possibility of supporting internationalized domain names. The IETF IDN group has issued three important RFCs (3490, 3491, 3492) for Internationalized Domain Names. These RFCs made it possible for domain name servers to register non-ASCII domain names and for application and client vendors to implement standardized support for handling non-ASCII characters in domain names.

Main Characteristics of Arabic Script Based Languages

The Arabic script is the second most widely used alphabetic writing system in the world after the Latin alphabet. Originally developed for writing the Arabic language and carried across much of the Eastern Hemisphere, the Arabic script has been adapted to such diverse languages as Persian, Urdu, Turkish, Spanish, and Swahili.
The Arabic language is the official language of the 22 member countries of the Arab League. It is also widely used in more than 43 Islamic countries. This means that more than one billion potential users could be interested in using Arabic script domain names.

The Arabic script based languages have a number of characteristics, including the following:

  • They are written from right to left.
  • Most characters have different shapes depending on their position (beginning, middle, or end) within a word, and are usually joined to the preceding and following characters. These different shapes of a single character do not count as different code points; they are handled by the font when the text is displayed.
  • Tashkeel (a diacritic) is a small sign that is usually placed above or below a character to indicate the correct pronunciation, which may lead to a different meaning. It is not a letter by itself but a means of pronouncing a letter correctly. It is not widely used, except where words that have the same letters but different pronunciations, and hence different meanings, could be mispronounced.
  • Word abbreviations are not widely used.
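
The characteristics listed above can be observed in Unicode character data. The following is a minimal sketch, assuming Python's standard unicodedata module; the helper name describe and the choice of the letter Beh as the example are illustrative assumptions, not part of any standard mentioned here. It shows that an Arabic letter carries a right-to-left directional class, that a Tashkeel mark is a combining mark rather than a letter, and that the legacy positional presentation forms of a letter normalize back to a single code point.

```python
import unicodedata

BEH = "\u0628"    # ARABIC LETTER BEH, the abstract letter
FATHA = "\u064E"  # ARABIC FATHA, one of the Tashkeel marks


def describe(ch):
    """Return a few Unicode properties of a single character (illustrative helper)."""
    return {
        "name": unicodedata.name(ch),
        "bidi": unicodedata.bidirectional(ch),  # "AL" means Arabic letter, right to left
        "category": unicodedata.category(ch),   # "Mn" means non-spacing combining mark
    }


# Isolated, final, initial and medial compatibility forms of Beh.
PRESENTATION_FORMS = ["\uFE8F", "\uFE90", "\uFE91", "\uFE92"]

# Compatibility normalization (NFKC) maps every positional form to the same letter.
folded = {unicodedata.normalize("NFKC", form) for form in PRESENTATION_FORMS}

print(describe(BEH))
print(describe(FATHA))
print(folded == {BEH})
```

In normal text only the abstract letter is stored; the positional forms exist in Unicode for compatibility with older encodings.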

Arabic Language and Domain names

Supporting the Arabic language in domain names created the need to investigate and address a number of issues in order to produce a set of standards that are acceptable to the Internet community at large. These standards should cover several aspects of supporting Arabic domain names at different levels, such as:

  • Linguistic issues and the accepted Arabic character set.
  • The Arabic domain name tree structure, i.e., Arabic gTLDs and ccTLDs.
  • Technical solutions to Arabize the domain name system
  • The administrative and organizational issues of Arabic root servers.

The 1st and 2nd points have been addressed by the Internet Draft that was produced during 2003-2004 by the Arabic Domain Names Task Force (ADN-TF) under the auspices of ESCWA, and which has since undergone several enhancements and updates. The last update was made after a thorough review at the first meeting of the LAS Arab Working Group on ADN, held in Damascus from 31 January to 2 February 2005.

The 3rd point is partially addressed by the three IETF RFCs (3490, 3491, 3492).

Arabic Language Linguistic Issues

There are a number of linguistic issues that have been discussed and agreed upon with respect to the usage of the Arabic language in domain names. These issues include:

  • Usage of diacritics.
  • Usage of Kashidah.
  • Supporting character folding.
  • Which set of numerical digits should be supported.
  • Connecting multiple words.
  • Usage of special characters.
  • Mixing with other languages.

The table below summarizes the recommendations for the main linguistic issues (based on the LAS Arab Working Group on ADN).

| Issue | Recommendation |
|---|---|
| Tashkeel (diacritics) | Tashkeel should not be allowed. However, if there is a need to allow users to enter Tashkeel as part of a domain name, it should be stripped off by nameprep. |
| Kasheeda | Kasheeda should be disallowed. |
| Folding Teh Marbuta + Heh; folding different forms of Hamzah; folding Alif Maqsura + Ya | Folding should not be allowed. |
| Numbers; Arabic zero | If it is technically possible, it is preferred to support both sets (Latin and Arabic) with folding to one set. Otherwise, the Latin set is sufficient. |
| Connecting multiple words; spaces | It is recommended that multiple words be separated by the character "-". |
| Mixing Latin and Arabic characters | It is recommended that Arabic domain names be pure Arabic and not be mixed with other languages. |
| Special characters (e.g., @, #, $, %, ...) | It is recommended that Arabic domain names follow the standard with respect to the use of special characters. |
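
The recommendations above can be sketched as a small preparation step applied to a name before registration. This is only an illustration under stated assumptions: the function name prepare_name is hypothetical, Tashkeel is taken to be the marks U+064B through U+0652, digits are folded to the Latin set (the recommendation says "one set" without saying which), and Kasheeda and Latin letters cause a rejection rather than being silently removed.

```python
import re

TASHKEEL = re.compile("[\u064B-\u0652]")  # assumed range: fathatan through sukun
KASHEEDA = "\u0640"                       # Arabic tatweel
LATIN_LETTER = re.compile("[A-Za-z]")
# Map Arabic-Indic digits U+0660..U+0669 to the Latin digits 0-9 (assumed direction).
DIGIT_FOLD = {0x0660 + i: ord("0") + i for i in range(10)}


def prepare_name(text):
    """Apply the linguistic recommendations to a proposed Arabic name (hypothetical helper)."""
    if KASHEEDA in text:
        raise ValueError("Kasheeda (U+0640) is not allowed")
    if LATIN_LETTER.search(text):
        raise ValueError("Latin letters must not be mixed with Arabic")
    text = TASHKEEL.sub("", text)      # strip diacritics
    text = text.translate(DIGIT_FOLD)  # fold both digit sets to one set
    return "-".join(text.split())      # connect multiple words with "-"


# A word with a fatha, followed by a space and two Arabic-Indic digits.
print(prepare_name("\u0645\u064E\u0631\u0643\u0632 \u0661\u0662"))
```

A registry could equally choose to fold digits toward the Arabic set or to strip Kasheeda instead of rejecting it; the table leaves those details open.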

Accepted Character Set

It is recommended to use Unicode. The following Unicode characters are accepted in Arabic domain names:

  • U+0621 (hamza) through U+063A (ghain)
  • U+0641 (feh) through U+064A (yeh)
  • Arabic numbers: 0-9 (U+0660 through U+0669)
  • Latin numbers: 0-9 (U+0030 through U+0039)
  • Hyphen (U+002D)
  • Dot (U+002E)
  • No other characters are allowed.
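
The accepted character set above translates directly into a range check. The following is a minimal sketch; the names ALLOWED_RANGES, disallowed_chars and is_accepted are hypothetical and chosen only for illustration. Note that Kasheeda (U+0640) and the Tashkeel marks fall outside the listed ranges, so they are rejected without any special handling.

```python
# Code point ranges taken from the accepted character set listed above.
ALLOWED_RANGES = [
    (0x0621, 0x063A),  # hamza through ghain
    (0x0641, 0x064A),  # feh through yeh
    (0x0660, 0x0669),  # Arabic-Indic digits
    (0x0030, 0x0039),  # Latin digits
    (0x002D, 0x002D),  # hyphen
    (0x002E, 0x002E),  # dot
]


def disallowed_chars(name):
    """Return the characters of name that are outside the accepted set."""
    return [
        ch for ch in name
        if not any(low <= ord(ch) <= high for low, high in ALLOWED_RANGES)
    ]


def is_accepted(name):
    """True if name is non-empty and uses only accepted characters."""
    return name != "" and not disallowed_chars(name)


print(is_accepted("\u0633\u062c\u0644.\u0627\u0644\u0633\u0639\u0648\u062f\u064a\u0629"))
print(disallowed_chars("\u0633\u0640\u062c\u0644"))
```

This checks characters only; rules such as label length or the position of hyphens would need separate checks.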


How IDN Works

When a browser sees a URL such as http://www.arabic-domains.org, it passes the host name to the DNS resolver service (usually built into the operating system), which in turn asks a nearby domain name server to return the IP address that corresponds to the host name. This IP address is then used to connect to the web server in question.

IDN allows host and domain names with non-ASCII characters to be entered in a browser's location bar or embedded in URLs in web pages. At the network protocol level, there is no change in the restriction that only a subset of ASCII characters may be used in a URL. If end users enter non-ASCII characters as part of a domain name, or if a web page contains a link that uses a non-ASCII domain name, the application must convert such input into a special encoded format that uses only the usual ASCII subset. RFC 3490 (Internationalizing Domain Names in Applications, IDNA) defines the characters used in IDN as drawn from Unicode Standard 3.2. It also defines how an application should process non-ASCII characters so as to conform to the existing host name character restrictions.

As an example, the Arabic domain name (نطاق.السعودية) takes the form "xn--mgb5a8an.xn--mgberp4a5d4ar" after it is converted from IDN to ASCII format (using the "stringprep", "nameprep" and "punycode" operations).
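
The conversion described above can be tried with Python's built-in "idna" codec, which implements the RFC 3490 procedure (nameprep followed by punycode, label by label). This is a minimal sketch; the helper names to_ascii and to_unicode are illustrative and not part of any standard API.

```python
def to_ascii(domain):
    """Convert an internationalized domain name to its ASCII (xn--) form."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain):
    """Convert an ASCII (xn--) domain name back to its Unicode form."""
    return domain.encode("ascii").decode("idna")


arabic_name = "نطاق.السعودية"
ascii_name = to_ascii(arabic_name)

print(ascii_name)  # xn--mgb5a8an.xn--mgberp4a5d4ar
print(to_unicode(ascii_name) == arabic_name)
```

Each label is encoded separately, so every name registered under the same Arabic top-level domain shares the same final label in ASCII form.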

History of Arabic domain names

The following are some of the most important milestones in the history of Arabic domain names.

Defining the standard for accepted Arabic letters in ADN

In 2003, ESCWA held a meeting in Beirut, Lebanon, where a group of experts in the Arabic language, domain names, and the Internet (the task force operating under the auspices of the UN Economic and Social Commission for Western Asia, ESCWA) gathered to discuss which letters and characters should be accepted in Arabic domain names. That effort later produced an Internet Draft that specified the set of allowed characters in Arabic domain names along with other linguistic issues, and the document later evolved into an RFC.

The draft was then reviewed by the Arabic Team for Domain Names and submitted to the Arab League for final approval. The main parts of the document are the accepted Arabic character set and the Arabic top-level domains. The following tables summarize the main recommendations of the document. The document also covered some management issues. The ADNPP agreed to follow this document throughout the whole project, leaving sufficient room for each ccTLD manager to draft its own guidelines for registering Arabic domain names, provided they do not conflict with the drafted document.

GCC Pilot Project for Arabic Domain Names

In 2004, ICANN showed no sign of fully supporting IDNs (i.e., idn.idn) in the near term. For that reason, the managers of the GCC (Gulf Cooperation Council) ccTLDs (ae, bh, kw, om, qa, sa) agreed at their meeting on 7 March 2004 to initiate a pilot project for Arabic domain names.

The goal of this project was to implement a test bed for Arabic domain names in the GCC countries. It allowed all GCC countries to gain early experience with Arabic domain names, identify needs, locate possible problems, and develop some tools.

The main objectives of the project were:

  • To gain experience and knowledge of the Arabic Domain names and share it with Arab countries.
  • Test the implementations of Arabic domain names.
  • Build local awareness of Arabic domain names.
  • Establish joint work with other entities (e.g., ISPs, universities).
  • Possibly develop some tools related to Arabic domain names and DNS.

The project team was assembled from one or two technical staff from each GCC ccTLD (ae, bh, kw, om, qa, sa). By the end of the project, a number of important achievements had been realized, including the successful preparation of GCC root servers and early testing of GCC Arabic domain names. Some important technical guidelines were also drafted as a result of this project, along with policies and regulations for registering Arabic domain names.

Arab Pilot Project for Arabic Domain Names

The success of the GCC pilot project led the Arabic Team for Domain Names, in its 2nd meeting held in Cairo on the 7th and 9th of May 2005, to recommend expanding the GCC Pilot Project for Arabic Domain Names to include all members of the Arab League (22 countries). The project was therefore renamed the Arabic Domain Names Pilot Project and placed under the auspices of the Arab League. Two committees were created for the management and operation of the project: a steering committee and a technical committee. The Steering Committee's tasks included the general supervision of the project, the management and supervision of the Arabic root servers, and setting policies and procedures, including participation policies and terms and conditions of use. The Technical Committee's tasks included providing technical support for participants and users, technical coordination between participants, technical supervision of the Arabic root servers, and enhancing and improving the project from the technical standpoint.

The mission of the project was: "Implementing a test bed for Arabic domain names (ADN) in the Arab world. This will allow for early experience of the use of Arabic domain names by all Arab countries, the identification of their needs, the agreement upon uniform standards, the identification of possible problems, and the development of required tools and policies."

The project was expected to contribute to the following strategic objectives:

  • Establish and implement Arabic domain names.
  • Increase Internet use in the Arab world by addressing linguistic barriers facing Arabic-speaking users.
  • Promote the use of Arabic language and to increase the Arabic content on the Internet.
  • Promote Arab cultural identity on the Internet.

The main objectives of the project were:

  • Make the Internet easier to use for native Arabic speakers.
  • Gain experience and knowledge in the use of Arabic domain names and share it with the Internet community.
  • Test the implementations of Arabic domain names based upon the guidelines drafted by the "Arabic Team for Domain Names".
  • Build local awareness related to Arabic domain names.
  • Possibly, to develop necessary tools required for Arabic domain names and DNS.
  • Develop required policies and guidelines that help achieve the above objectives.

More information about the project is available at www.arabic-domains.org.

The fast track for applying to the IDN

The first breakthrough in the history of Arabic domain names, and of IDN in general, was ICANN's announcement on 16 November 2009 that the fast track for IDN ccTLD applications was open. Saudi Arabia was among the first countries to apply through the fast track, and it was one of the first countries whose IDN ccTLD was approved (along with the UAE, Egypt and Russia). Saudi Arabia obtained the Arabic ccTLD (.السعودية).

First Non-Latin Domain Names Go Online

Thursday, 6 May 2010, was a remarkable day in the history of the Internet. On that day ICANN officially launched internationalized domain names, and for the first time in the Internet's history non-Latin characters could be used for an entire Internet address name. The first internationalized domain names (IDNs) were entered into the Internet's root zone on that day, marking a historic first in ICANN's globalization of the Internet.

Egypt, Saudi Arabia and the United Arab Emirates were the first three countries to use Arabic characters in the last portion of their Internet domain names. This is the portion to the right of the dot (or to the left for languages like Arabic, which are written and read from right to left), such as dot-eg (Egypt), dot-sa (Saudi Arabia) or dot-ae (United Arab Emirates). These portions are called country code top-level domains, or ccTLDs.

As soon as the IDN ccTLD for Saudi Arabia (.السعودية) was added to the root servers, a number of Arabic domain names automatically became some of the first Arabic domain names on the Internet, e.g.:

  • سجل.السعودية xn--rgbn6c.xn--mgberp4a5d4ar
  • موقع.السعودية xn--4gbrim.xn--mgberp4a5d4ar
  • مركزالتسجيل.السعودية xn--mgbggrfi2ikdb7d.xn--mgberp4a5d4ar

This remarkable achievement came after many years of dedication and hard work by SaudiNIC, in close cooperation with the Arab team for Arabic domain names, to support the use of Arabic domain names. For more information on projects related to Arabic domain names, please see www.arabic-domains.org.

Following that, Saudi Arabia was the first country to provide complete Arabic domain name registration services. The plan was to introduce registration in phases, with the goal of protecting rights and avoiding overloading the registry.

Sunrise Phase of Arabic domain names registration

This was the first phase of registering Arabic domain names under the Saudi Arabic ccTLD (.السعودية). It ran from 10:00 AM on Monday the 17th of Jumada II 1431 H (31st of May 2010 G) to 10:00 AM on Monday the 30th of Rajab 1431 H (12th of July 2010 G). During that phase SaudiNIC accepted applications for registration of Arabic domain names from any entity or individual who had a trade name or trademark registered with the Ministry of Trade and Industry in Saudi Arabia, as well as from all governmental or semi-governmental authorities and organizations.
Those entities were given the opportunity to register Arabic domain names that correspond to their official or commercial Arabic names (without alteration, abbreviation, or translation) as they appear in the commercial registrations, trademark registration certificates, or the official name in the case of governmental entities.
By the end of that phase, all requests that complied with the regulations were officially registered.

Landrush Phase of Arabic domain names registration

This was the second phase of registering Arabic domain names under the Saudi Arabic ccTLD (.السعودية), in which SaudiNIC began to accept applications for registration of Arabic domain names from members of the general public. During that phase, registration was opened to any entity or individual on a first come, first served basis, according to the terms and conditions governing the registration process as well as SaudiNIC's other regulations and procedures.
The Landrush phase started at 10:00 AM on Monday the 18th of Shawwal 1431 H (27th of September 2010 G). It has no end date, since it represents the continuous Arabic domain names registration process.

Project based on the Arabic domain names

Now that Arabic domain name registration is officially supported, users in the local community can obtain domain names in their native language. Making full use of the Internet's potential, however, also requires services built on Arabic domains, one of which is Arabic electronic mail. The goal of this project is to develop a web-based Arabic email system that allows its users to obtain Arabic email addresses and then use them to communicate in the same way as regular email.

Raseel project ( email with Arabic addresses )

Motivated by the same needs and requirements that drove the Arabic domain names efforts, CITC, represented by SaudiNIC, and KACST, represented by the Computer Research Institute, launched a pilot project (Raseel - رسيل) for email with fully Arabic addresses.

Raseel (رسيل) is the first deliverable of the agreement that was signed between the two entities with the goal of conducting several research and development projects. These projects explore the challenges of, and possible solutions for, adopting Arabic domains in practical tools and services now that they are officially supported.

Recently the project team successfully completed all the technical requirements of the project and launched the online web email system, which supports fully Arabic domain names. A dedicated website (رسيل.السعودية) was also launched to be the official reference for the project and the main communication gateway with users.

The project team also successfully developed a plug-in that correctly displays and handles Arabic domain names in the popular mail client MS Outlook (versions 2007 and 2010).

A focus group of selected participants from CITC, KACST and other national and international entities was also invited to test the system and the plug-in and provide feedback.

All the systems were developed using free/open source software and were hosted locally. All the work was done by CITC staff participating in the project.

Arabic domain names registration

The registration of Arabic domain names under the Saudi Arabic ccTLD (.السعودية) follows the same process and regulations as normal domain name registration under (.sa).

Regulations

The following are some of the published related regulations:

 

Hosting an Arabic domain

Hosting an Arabic domain name is exactly like hosting a normal domain, with one exception: the Arabic domain name must first be converted to a special encoding called punycode. That encoded form is what is actually hosted in the DNS. The video below explains this (in Arabic).