Latin & Arabic Characters on the Same Page

Hello CIWAH ...

I want to propose full internationalization of three related websites:
http://africadatabase.org/
http://people.africadatabase.org/
http://institutions.africadatabase.org/

My role is mainly advisory and server management. I have very little
to do with content or page generation, so it's not something I can
do myself -- I have to persuade other people to do it.

On all three home pages, we have mixed Latin and Arabic characters
(all other pages right now are Latin-only).

However there is a difference: on http://africadatabase.org/ and
http://institutions.africadatabase.org/ the character set is
windows-1256, which is commonly used by Arabic websites. The Arabic
characters are all 8-bit, and their correct display depends (presumably)
on visitors having the required codepage installed on their computers.
If they haven't got it, the first four Arabic words will look like this:
" ".

Now I don't think that is a major problem in Arabic-speaking countries,
because they nearly always have the windows-1256 codepage. I'm more
concerned about what all the rest of us see.

The other home page http://people.africadatabase.org/ uses UTF-8, and
the Arabic characters are converted to numeric entities. Although the
conversion is a bit of extra work, I think it is worthwhile.
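
For readers who want to see what the conversion to numeric entities involves, here is a minimal sketch in Python. The sample word is an assumption for illustration only, and the helper name to_numeric_references is hypothetical; the encoding call itself is standard-library behaviour.

```python
import html

# Assumed sample word for illustration; not copied from the pages themselves.
arabic = "المعلومات"

def to_numeric_references(text: str) -> str:
    """Replace every non-ASCII character with a decimal reference such as &#1575;."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")

references = to_numeric_references(arabic)
print(references)

# An HTML parser turns the references back into the same characters,
# whatever 8-bit or UTF-8 encoding the page itself is labelled with.
assert html.unescape(references) == arabic
```

The output is pure ASCII, so it survives any charset label, at the cost of a page source that nobody can read or edit as Arabic text.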

CIWAH has always been an exceptionally useful resource, so I would like
to ask anyone here who has a few minutes spare to look at these pages,
and report on what they see. I guess the UTF-8 page will look exactly
the same to everybody. But what about the windows-1256 pages? So far, I
have only looked at them on PC machines in countries where the Latin
character set is normal. They look OK. But what do Mac users and Linux
desktop users see?

I would be really grateful for any information people can offer -- it
will all help. And I always appreciate any other comments, criticism
or advice that might be useful. I know there are a number of character
set problems on some of those pages -- they are on a long to-do list,
which you are all welcome to make longer still.
tia etc ...

Jul 20 '05 #1
thinkfirst wrote:
[snip]

good work...they all come up fine here using Opera on Win98...it looks
like Arabic script to me, though I can't read the language myself...I'm
afraid I can't check on any other OS from here

for safety's sake I'd consider going over to utf-8 for all the
multilingual pages

--
eric
www.ericjarvis.co.uk
all these years I've waited for the revolution
and all we end up getting is spin
Jul 20 '05 #2
"thinkfirst" <th***********@yahoo.com> wrote:
> On all three home pages, we have mixed Latin and Arabic characters
> (all other pages right now are Latin-only).
> However there is a difference: on http://africadatabase.org/ and
> http://institutions.africadatabase.org/ the character set is
> windows-1256, which is commonly used by Arabic websites.

It doesn't matter what "is commonly used". For example, tag soup
is commonly used. If you need an 8-bit encoding, use ISO-8859-6,
which is identical with Arabic Standard ASMO 708. My test page
http://www.unics.uni-hannover.de/nhtcapri/arabic.html6
displays perfectly in Mozilla 1.3 on Mac OS 9.1 and also with other
browsers/operating systems. On the Macintosh, you need the Arabic
Language Kit. http://www.unics.uni-hannover.de/nhtcapri/arabic.html

> Now I don't think that is a major problem in Arabic-speaking countries,
> because they nearly always have the windows-1256 codepage.

They don't "have the windows-1256 codepage" but they have fonts with
Arabic glyphs and operating systems suitable for the Arabic script.

> The other home page http://people.africadatabase.org/ uses UTF-8, and
> the Arabic characters are converted to numeric entities.

This is possible but a better idea would be to write the characters
directly in UTF-8:
http://www.unics.uni-hannover.de/nht...l1.html#arabic

> I guess the UTF-8 page will look exactly the same to everybody.

What makes you think so?

> But what about the windows-1256 pages?


The main difference between UTF-8 and Windows-1256 is *not* what you
think, i.e. different encodings. The main difference is that current
browsers use different typefaces to display them.
http://ppewww.ph.gla.ac.uk/~flavell/...ers-fonts.html
If you have much Latin text on your pages, especially West European letters,
then use "charset=UTF-8" by all means.

You'll find further information on
http://ppewww.ph.gla.ac.uk/~flavell/...direction.html
Most important: Label *all* your text with DIR and LANG attributes:
e.g. <p dir="ltr" lang="fr"> <span dir="rtl" lang="ar">
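
The advice above (UTF-8 plus DIR and LANG on all text) can be sketched as a small Python script that generates such a page. This is only an illustration under assumptions: the function name build_page, the page title and the French and Arabic sample strings are all made up, and a real server should also send the matching charset in the HTTP Content-Type header.

```python
import html

def build_page(french: str, arabic: str) -> bytes:
    """Return a small HTML 4.01 page, encoded as UTF-8, with LANG and DIR labels."""
    page = (
        '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
        '"http://www.w3.org/TR/html4/strict.dtd">\n'
        '<html lang="fr" dir="ltr">\n'
        "<head>\n"
        '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">\n'
        "<title>Exemple</title>\n"
        "</head>\n"
        "<body>\n"
        # Latin text is labelled left-to-right, the Arabic span right-to-left.
        f'<p dir="ltr" lang="fr">{html.escape(french)} '
        f'<span dir="rtl" lang="ar">{html.escape(arabic)}</span></p>\n'
        "</body>\n"
        "</html>\n"
    )
    # The bytes written to disk must really be UTF-8, to match the label.
    return page.encode("utf-8")

page_bytes = build_page("Informations :", "المعلومات")
print(page_bytes.decode("utf-8"))
```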
Jul 20 '05 #3
thinkfirst wrote:

[snip]
> CIWAH has always been an exceptionally useful resource, so I would like
> to ask anyone here who has a few minutes spare to look at these pages,
> and report on what they see. I guess the UTF-8 page will look exactly
> the same to everybody. But what about the windows-1256 pages? So far, I
> have only looked at them on PC machines in countries where the Latin
> character set is normal. They look OK. But what do Mac users and Linux
> desktop users see?


On Linux:

http://africadatabase.org/ looks fine in Firebird 0.6.1, Konqueror 3.1.4 and
Opera 7.20 B9. Lynx 2.8.4rel.1 shows the Arabic text as a+l+m+e+l+w+m+a+t+
w+ a+l+m+e+tjy and so on, but the rest seems fine. Links 2.1pre9 gives the
same behaviour as Lynx in console mode, and shows almost all of the Arabic
text properly in graphical mode. W3M 0.4.1 and Netscape 4.79 show the
Arabic text as ÇáãÚáæã Ç and so on.

The same goes for http://people.africadatabase.org/ and
http://institutions.africadatabase.org/ except that W3M and Netscape show
the Arabic text as a series of question marks on the people site, Netscape
shows question marks on the institutions site, and W3M shows ÇáãÚáæã Ç...
on the institutions site.
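
The Latin gibberish reported above is what you get when windows-1256 bytes are read as if they were Latin-1. A minimal Python sketch, assuming the first Arabic word on the page is the one that Lynx transliterates as a+l+m+e+l+w+m+a+t+ (that reconstruction is an assumption, not something confirmed in this thread):

```python
# Assumed original word, reconstructed from the Lynx transliteration.
word = "المعلومات"

raw = word.encode("cp1256")        # the bytes a windows-1256 page sends
mojibake = raw.decode("latin-1")   # a browser that ignores the label and assumes Latin-1
print(mojibake)                    # ÇáãÚáæãÇÊ

# A browser that knows the characters but cannot represent them
# may fall back to one question mark per character.
fallback = word.encode("latin-1", errors="replace").decode("latin-1")
print(fallback)                    # ?????????

# No information is lost in the first case: re-reading the same bytes
# with the right label recovers the Arabic text.
assert mojibake.encode("latin-1").decode("cp1256") == word
```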

Bear in mind that there's no single release of Linux, and systems can vary
wildly. I'm using a Gentoo system if you think it matters - exactly what
fonts are installed I couldn't say without rummaging around a fair bit, but
I'm pretty sure I have a few of the decent Microsoft fonts. As far as I
know, it's common for desktop distributions to include these fonts.

On Mac OS X 10.2.8:

http://africadatabase.org looks fine in Safari 1.0, Mozilla 1.4, Opera 6.03
and Omniweb 4.5. Internet Explorer 5.2.3 displays the Arabic text as
random Latin glyphs but is otherwise fine. The same applies to the other
two sites.
--
Jim Dabell

Jul 20 '05 #4
On Fri, 17 Oct 2003, Jim Dabell wrote:
> Lynx 2.8.4rel.1 shows the Arabic text as a+l+m+e+l+w+m+a+t+
> w+ a+l+m+e+tjy and so on,


Well, not that it's of any practical use to anyone, but take a look
at: http://ppewww.ph.gla.ac.uk/~flavell/tests/ARALYNX.GIF

This is Lynx in a putty terminal window, set for utf-8 coding, with
Courier New font selected. (RedHat 9). (It's Lynx 2.8.5dev.7 if
anyone wanted to know.)

It doesn't understand right-to-left (not even when specified
explicitly, it seems), -NOR- the need for initial, medial and final
forms; so it's not much use in this context.

To the original poster: despite the fact that many Arabic pages, for
some incomprehensible reason(?), use a proprietary 8-bit coding
provided by a USA corporation, I couldn't advise choosing it yourself.
You'd probably have to look quite hard to find a web browser which
understood Windows-125x codings that didn't support the corresponding
coding in the iso-8859-* series. So if you want an 8-bit coding then
I'd have to recommend the iso-series one.
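
As a rough illustration of why the choice between the two 8-bit codings matters, this Python sketch encodes one sample word (an assumption, chosen only for the demonstration) both ways. Most of the Arabic letters land on different byte values, so the charset label has to match the bytes actually sent.

```python
# Assumed sample word; any Arabic text covered by both codings would do.
word = "المعلومات"

as_windows = word.encode("cp1256")     # windows-1256
as_iso = word.encode("iso8859_6")      # iso-8859-6

print(as_windows.hex(" "))
print(as_iso.hex(" "))

# Both codings use one byte per letter, but not the same bytes.
same_length = len(as_windows) == len(as_iso)
same_bytes = as_windows == as_iso

# Reading windows-1256 bytes under an iso-8859-6 label still yields Arabic
# letters, just the wrong ones, which is harder to spot than Latin gibberish.
mislabelled = as_windows.decode("iso8859_6", errors="replace")
print(mislabelled)
```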

But use of utf-8 codings is catching up. I'm not familiar with the
current browser population in use in the area myself, so it's hard to
make any more-detailed practical recommendations, even if I take an
interest in the technologies of character representation at a more
basic level.

Jul 20 '05 #5
"thinkfirst" <th***********@yahoo.com> wrote:
> http://africadatabase.org/
> http://institutions.africadatabase.org/
> http://people.africadatabase.org/


Further thoughts:

| font-family: Arial, Helvetica, Geneva, Swiss, sans-serif;

Never ever specify typefaces for non-Roman scripts! Your example is
especially bad since Helvetica, Geneva, Swiss do not contain Arabic
glyphs. Arial also does not necessarily include Arabic glyphs.
http://ppewww.ph.gla.ac.uk/~flavell/...onts.html#dont

| text-align: justify

Don't do this even with Latin/Cyrillic script as long as we don't have
reliable hyphenation. No justification without hyphenation!
"text-align: justify" with Arabic and Hebrew scripts is just silly.
Jul 20 '05 #6
Eric Jarvis <we*@ericjarvis.co.uk> wrote:
> they all come up fine here using Opera on Win98...it looks
> like Arabic script to me, though I can't read the language myself..


I'm afraid it's an illusion. I don't know Arabic either, but I know a
little about the writing system. And viewing http://africadatabase.org/
on Opera 7.11 (Win98), I see Arabic characters but
a) written left to right
b) presented using isolated form glyphs, making - I suppose - the
appearance really inappropriate.

IE 6.0 on Win98 seems to show it OK. But testing on another system,
with IE 5.5, I get a prompt saying that the browser is trying to
download some plugin (or something), and it hangs there. I have a vague
recollection of older versions of IE not being able to select glyphs
appropriately.

I'm afraid there's not much an author can do. Unicode contains
characters that correspond to the specific glyph forms, but using them
would be awkward and would not guarantee correct display (partly
because those characters may not appear in common fonts).

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 20 '05 #7
"Jukka K. Korpela" <jk******@cs.tut.fi> wrote:
> IE 6.0 on Win98 seems to show it OK. But testing on another system,
> with IE 5.5, I get a prompt saying that the browser is trying to
> download some plugin (or something) and hangs there. I have a vague
> recollection of older versions of IE not being able to select glyphs
> appropriately.


You need the "Arabic Language Support". I don't remember whether this
comes with Windows 98 or is installed by Internet Explorer.
Mozilla 1.5 on MS Windows 98 with "Arabic Language Support" displays
Arabic and Persian OK. (For Urdu, you would need third-party extensions
or Windows 2000.)
Jul 20 '05 #8
On Fri, 17 Oct 2003, Jukka K. Korpela wrote:
> I'm afraid there's not much an author can do.

I don't honestly think there's much that an author -needs- to do.

Surely those who can read Arabic script will have their browsers set
up to be capable of browsing the normal ways in which Arabic is
published on the web? Whereas for those who cannot read it, it hardly
matters what's displayed there. Missing-glyphs might look ugly, but
it's not going to disrupt the whole display: so they can still read
the parts which are in a script that they can read.

Andreas will be along any moment with one of his magic Google searches
to tell us how many Arabic pages were found with this or that
encoding...?

> Unicode contains characters that correspond to the specific glyph
> forms, but using them would be awkward and would not guarantee
> correct display

Furthermore, such usage is deprecated, and will almost certainly ruin
the ability to find the text via a search engine. One is supposed to
use the characters from the 06xx Unicode range and let the rendering
engine choose the correct glyph forms.

> (partly because those characters may not appear in common fonts).


I'm not sure about that. The glyphs have to be there anyway, so that
the rendering engine can use them for proper rendering of the normal
characters. Have you tried ListFont or a similar tool? I'm no
expert, but I can see the presentation forms up there at FB50 onwards,
and in the FExx block, in fonts which have 06xx populated.
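
The relationship between the presentation forms and the ordinary 06xx letters can be checked with Python's unicodedata module. A small sketch; the two code points are picked arbitrarily from the FExx block as examples:

```python
import unicodedata

# Two presentation-form code points from the FExx block.
presentation = "\uFE8D\uFEDF"

for ch in presentation:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# NFKC compatibility normalization maps them back to the ordinary letters
# U+0627 (alef) and U+0644 (lam), which is what pages should contain.
normal = unicodedata.normalize("NFKC", presentation)
for ch in normal:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# A plain text search for the ordinary letter does not match the
# presentation form, which is one reason such usage hurts findability.
found_before = "\u0644" in presentation
found_after = "\u0644" in normal
print(found_before, found_after)
```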
Jul 20 '05 #9
Andreas Prilop wrote:
> "Jukka K. Korpela" <jk******@cs.tut.fi> wrote:
>> IE 6.0 on Win98 seems to show it OK. But testing on another system,
>> with IE 5.5, I get a prompt saying that the browser is trying to
>> download some plugin (or something) and hangs there. I have a vague
>> recollection of older versions of IE not being able to select glyphs
>> appropriately.
>
> You need the "Arabic Language Support". I don't remember whether this
> comes with Windows 98 or is installed by Internet Explorer.
> Mozilla 1.5 on MS Windows 98 with "Arabic Language Support" displays
> Arabic and Persian OK. (For Urdu, you would need third-party extensions
> or Windows 2000.)


I installed it later, but so long ago I can't remember how since it was
part of a flurry of measures I took when I started having to work
multilingually

--
eric
www.ericjarvis.co.uk
all these years I've waited for the revolution
and all we end up getting is spin
Jul 20 '05 #10
