Bytes | Software Development & Data Engineering Community

Chinese filenames

Hi everyone,

Firstly, I would like to know if you can open chinese filenames under
win2000 using PHP 5.0? I have a file named 中国.php, and try to open it
using
fopen('中国.php','r');. I save the source file as UTF-8. I get the
error:

Warning: fopen(中国.php) [function.fopen]: failed to open stream: No
such file or directory in E:\Translation\Website
Development\web root\testing\zhongguo.php on line 8

I have triple checked that the file exists. Changing the source code
encoding to 'Unicode (UCS-2)' leads to no output in the browser
window.

Does PHP even support opening chinese filenames?

Secondly, I'm having some trouble accessing filenames with chinese
characters that have been uploaded via HTTP. I'm using PHP 5 and
Apache 2.2. When I
attempt to upload a file with chinese filenames, the file name gets
mutated into dashes, pretty much matching the behaviour described at
'http://gallery.menalto.com/node/57709'. However, I need the original
filename (to store in a DB). The post on the manual website by
kweechang at yahoo dot com at
http://au3.php.net/manual/en/features.file-upload.php describes using
javascript to set a hidden
field. This would work fine but for now I'm trying not to resort to
javascript on my webpage. Does anyone know how the original filename
can be retrieved without using javascript? Maybe there is a setting in
Apache?

Thanks

Taras

Jan 29 '07 #1
"Taras_96" <ta******@gmail.com> wrote:
> Does PHP even support opening chinese filenames?

I don't know exactly how fopen() handles strings containing characters
other than ASCII, but it is better not to rely on the underlying file
system, for portability reasons. Always use simple ASCII characters. For
files uploaded via HTTP, store their original name in a DB table properly
created to support UTF-8.
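
A note on the fopen() question, offered as a sketch rather than a tested fix: as far as I know, PHP 5 on Windows hands file names to the non-Unicode file API, so the name is interpreted in the system's ANSI code page and not as UTF-8. One workaround is to convert the UTF-8 name to that code page with iconv() before calling fopen(). The helper name and the choice of 'GBK' below are assumptions. This can only work if the Windows system code page really is a Chinese one; on an English-locale system the characters cannot be represented at all.

```php
<?php
// Hypothetical helper, not from the original post: convert a UTF-8 file
// name to a legacy code page before handing it to fopen().
// Returns null when iconv() reports that the conversion failed.
function name_for_codepage($utf8Name, $codepage)
{
    // @ silences the notice iconv() may raise on a failed conversion.
    $converted = @iconv('UTF-8', $codepage, $utf8Name);
    return $converted === false ? null : $converted;
}

// 'GBK' is an assumption about the system code page of the server.
$local = name_for_codepage("中国.php", 'GBK');
if ($local === null) {
    echo "The name cannot be converted to that code page.\n";
} else {
    // Show the converted bytes; on a matching Windows system one would
    // then try: $handle = fopen($local, 'r');
    echo bin2hex($local), "\n";
}
```

The source file holding the literal name must itself be saved as UTF-8 for the conversion input to be correct.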

> Secondly, I'm having some trouble accessing filenames with chinese
> characters that have been uploaded via HTTP. I'm using PHP 5 and
> Apache 2.2. When I
> attempt to upload a file with chinese filenames, the file name gets
> mutated into dashes, pretty much matching the behaviour described at
> 'http://gallery.menalto.com/node/57709'. However, I need the original
> filename (to store in a DB). The post on the manual website by
> kweechang at yahoo dot com at
> http://au3.php.net/manual/en/features.file-upload.php describes using
> javascript to set a hidden
> field. This would work fine but for now I'm trying not to resort to
> javascript on my webpage. Does anyone know how the original filename
> can be retrieved without using javascript? Maybe there is a setting in
> Apache?

0. Ensure your PHP script is properly UTF-8 encoded. This is important
if it contains literal strings.

1. Ensure the page containing the FORM is UTF-8 encoded. For example:

<?php
header("Content-Type: text/html; charset=UTF-8");
?>
<html><body>
<FORM method=post
enctype="multipart/form-data"
accept="image/gif,image/png,image/jpeg,image/pjpeg"
action="the-program-receiving-the-data.php">
Photo (accepted GIF, PNG or JPEG, max 500 KB):
<INPUT type=hidden name=MAX_FILE_SIZE value=512000>
<INPUT type=file name=PHOTO size=50 maxlength=512000><p>
<INPUT type=submit name=save_button value=Save>
</FORM>
</body></html>

The file name returned from the client will have the same encoding as the
page containing the FORM, that is, UTF-8.

2. The name of the file can be acquired as a UTF-8 string:

$field = "PHOTO";

if( ! isset($_FILES) || ! isset($_FILES[$field]) )
die("No file uploaded.");

$error = (int) $_FILES[$field]['error'];
$name = (string) $_FILES[$field]['name'];
$type = (string) $_FILES[$field]['type'];
$size = (int) $_FILES[$field]['size'];
$tmp_name = (string) $_FILES[$field]['tmp_name'];

if( $error !== 0 )
die("Upload error code $error.");

# Here: check actual UTF-8 encoding and max length for $name.
# Here: check actual MIME $type against the allowed MIME types.
# Here: check actual $size limit.
# Here: store the file $tmp_name in a proper place with a proper name.
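
The first two "# Here:" checks could be sketched roughly like this. The function names and the 255-character limit are assumptions for illustration, not part of the code above. Keep in mind that $_FILES[...]['type'] is whatever the browser sent, so it is a hint and not proof of the file's content.

```php
<?php
// Hypothetical helpers for the checks marked "# Here:" above.

// True if $name is non-empty, well-formed UTF-8 and at most $maxChars
// characters long.
function is_acceptable_name($name, $maxChars = 255)
{
    // With the /u modifier preg_match() returns false on malformed UTF-8.
    if ($name === '' || preg_match('//u', $name) !== 1) {
        return false;
    }
    // Count characters, not bytes: a Chinese character takes three
    // bytes in UTF-8, so strlen() would overcount.
    $chars = preg_match_all('/./us', $name, $unused);
    return $chars <= $maxChars;
}

// True if the client-reported MIME type is on the whitelist, which here
// mirrors the accept attribute of the form.
function is_allowed_type($type)
{
    $allowed = array('image/gif', 'image/png', 'image/jpeg', 'image/pjpeg');
    return in_array($type, $allowed, true);
}
```

In the script above these would be called as is_acceptable_name($name) and is_allowed_type($type) right after the $error check, with die() on failure.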

3. Ensure the DB you are using really has support for UTF-8. For example,
retrieve the file name once saved and compare it with the string just
acquired from the POST.

4. Don't save the file on the underlying file system under the name
provided by the client; always use some other identifier, for example
the primary key assigned by the DB (typically a simple number). Since the
file name then contains only simple ASCII chars, fopen() should not give
problems.
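
Steps 3 and 4 together might look roughly like this, assuming PDO is available (PHP 5.1 or later) and a hypothetical table uploads(id INTEGER PRIMARY KEY, original_name TEXT). The function names and the .dat suffix are made up for illustration.

```php
<?php
// Hypothetical sketch: keep the original UTF-8 name in the DB and store
// the file itself under an ASCII-only name derived from the primary key.

// Insert the client-supplied name and return the key the DB assigned.
function register_upload(PDO $db, $originalName)
{
    $stmt = $db->prepare('INSERT INTO uploads (original_name) VALUES (?)');
    $stmt->execute(array($originalName));
    return (int) $db->lastInsertId();
}

// Read the name back so it can be compared with what was sent (step 3).
function stored_name(PDO $db, $id)
{
    $stmt = $db->prepare('SELECT original_name FROM uploads WHERE id = ?');
    $stmt->execute(array($id));
    return $stmt->fetchColumn();
}

// Build an ASCII-only path from the primary key (step 4).
function storage_path($dir, $id)
{
    return rtrim($dir, '/\\') . '/' . (int) $id . '.dat';
}
```

In the upload script one would call register_upload() with $name, check that stored_name() returns exactly the same string, and then use move_uploaded_file($tmp_name, storage_path($dir, $id)), where $dir is whatever upload directory the site uses.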

Best regards,
___
/_|_\ Umberto Salsi
\/_\/ www.icosaedro.it

Jan 29 '07 #2
Hey Umberto,
>> Does PHP even support opening chinese filenames?
> I don't know exactly how fopen() handles strings containing characters
> other than ASCII, but it is better not to rely on the underlying file
> system, for portability reasons. Always use simple ASCII characters. For
> files uploaded via HTTP, store their original name in a DB table properly
> created to support UTF-8.

I actually thought of doing this, but was wondering if PHP could in
fact do it. Never mind, I'll go along with your suggestion.

As for the file names after uploading, I had done everything you
suggested. I noticed that after choosing the filename in Firefox 1.5,
the filename got mangled in the text input section of the file input
(where the filename goes after you choose a file when browsing)! I
tried it on IE 6 and it worked correctly!! This implies to me that
there might be a bug in FF. I'll file a bug report on mozilla and see
what happens.

Taras
Jan 30 '07 #3
