Store in a file a web page written in Chinese

Hi,
I want to read an HTML page written in Chinese and store it in a file
with the .aspx extension. I'm not sure where the problem is. I use the
following lines of code:

using System.IO;
using System.Net;
using System.Text;

// Translated page to download (Babelfish, English to Chinese).
String sAddress = "http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http://www.etantonio.it/EN/index.aspx";

// Fetch the page and decode the response body as UTF-8.
WebRequest req = WebRequest.Create(sAddress);
WebResponse result = req.GetResponse();
Stream ReceiveStream = result.GetResponseStream();
StreamReader reader = new StreamReader(ReceiveStream, Encoding.UTF8);
String sHtmlTradotto = reader.ReadToEnd();

// Write the decoded text to the output file, again as UTF-8.
StreamWriter writer = new StreamWriter("prova.aspx", false, Encoding.UTF8);
writer.Write(sHtmlTradotto);
writer.Flush();
writer.Close();

But the file produced doesn't contain the Chinese characters. How
can I solve the problem?

Many Thanks in advance ...

Ing. Antonio D'Ottavio
Jul 21 '05 #1
Antonio <et*******@libero.it> wrote:
> But the file produced didn't contain the chinese characters so, how
> can I solve the problem???

Are you sure that it's returning the data in UTF-8? How are you
checking whether or not the file contained Chinese characters?

I'd look in more depth myself, but using the code above, it's
complaining that the server committed an HTTP protocol violation :(

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Jul 21 '05 #2
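
Both questions in the reply above can be checked in code rather than by eye. What follows is only a minimal sketch: the class and method names (CharsetCheck, IsValidUtf8, ContainsCjk, DownloadBytes) are hypothetical and do not come from the thread, and the CJK test covers only the basic ideograph block. It downloads the raw bytes, prints the Content-Type header the server sent, and lets you test whether the bytes are valid UTF-8 and whether the decoded text has any Chinese characters in it.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical helper class; none of these names come from the thread.
public static class CharsetCheck
{
    // Returns true when the bytes form valid UTF-8.
    // The second constructor argument makes the decoder throw on bad bytes
    // instead of silently substituting replacement characters.
    public static bool IsValidUtf8(byte[] data)
    {
        try
        {
            new UTF8Encoding(false, true).GetString(data);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }

    // Returns true when the text has at least one character in the
    // CJK Unified Ideographs block (U+4E00 to U+9FFF).
    public static bool ContainsCjk(string text)
    {
        foreach (char c in text)
        {
            if (c >= '\u4E00' && c <= '\u9FFF') return true;
        }
        return false;
    }

    // Downloads the raw response bytes and prints the Content-Type header,
    // so the declared charset can be compared with the actual bytes.
    public static byte[] DownloadBytes(string address)
    {
        WebRequest req = WebRequest.Create(address);
        using (WebResponse resp = req.GetResponse())
        using (Stream body = resp.GetResponseStream())
        using (MemoryStream buffer = new MemoryStream())
        {
            Console.WriteLine("Content-Type: " + resp.ContentType);
            body.CopyTo(buffer);
            return buffer.ToArray();
        }
    }
}
```

A possible way to use it: pass the result of CharsetCheck.DownloadBytes(sAddress) to IsValidUtf8, then decode it with Encoding.UTF8.GetString and pass the string to ContainsCjk. Running the same two checks on File.ReadAllBytes("prova.aspx") would show whether the characters are lost while downloading or while writing the file.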
Hi,
I simply opened the URL

http://babelfish.altavista.com/babel.../EN/index.aspx

in Internet Explorer. In the result below I can see that the page declares
charset=UTF-8, and the Chinese characters display normally:
<html><meta http-equiv="content-type" content="text/html;
charset=UTF-8"><base href="http://www.etantonio.it/EN/index.aspx">
<!-- removed --><meta http-equiv="Content-Type" content="text/html ;
CHARSET=UTF-8"><base href="http://www.etantonio.it/EN/index.aspx">
<!doctype HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<head>
<title>Etantonio</title>
<meta name="author" content="Antonio DOttavio">
<meta name="description" content="Etantonio Index">
<link href="Stili.css" rel="stylesheet" type="text/css">
</head>
<body>

<script language=JavaScript src="menu_array.js"
type=text/javascript></script>
<script language=JavaScript src="mmenu.js"
type=text/javascript></script>

<table width="750" height="430" border="0" cellpadding="0"
cellspacing="0" background="/images/EsserSpettatoriNonEstSerioElefante.jpg">
<tr>
<td valign="top">

<table width="90%" border="0" align="center" cellspacing="12">
<tr height="70" valign="top">
<td>&nbsp;</td>
<td width="25%" rowspan="2">
<p align="center"><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww. etantonio.it%2fEN%2fUniversita%2findex.aspx"
class="testoMedioVerde">大学</a></p>
<p align="center"
class="testoPiccolissimoVerde">学士路线的笔记在工程学电子,
论文、研究方法和适当尊敬对起源村庄。
</p>
</td>
<td width="25%" rowspan="2">
<p align="center"><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww. etantonio.it%2fEN%2fEconomia%2findex.aspx"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww. etantonio.it%2fEN%2fEconomia%2findex.aspx"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww. etantonio.it%2fEN%2fEconomia%2findex.aspx"
class="testoMedioVerde">经济</a> </p>
<p align="center"
class="testoPiccolissimoVerde">委员会、为财政社区的技术和仪器,
详尽阐述对您在,
在供选择变迁之间,
持续从1994
年个人经验的基地。</p></td>
<td width="25%">&nbsp;</td>
</tr>
<tr height="140" valign="top">
<td width="25%">
<p align="center"><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww. etantonio.it%2fEN%2fLavoro%2findex.aspx"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww. etantonio.it%2fEN%2fLavoro%2findex.aspx"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww. etantonio.it%2fEN%2fLavoro%2findex.aspx"
class="testoMedioVerde">工作</a> </p>
<p align="center"
class="testoPiccolissimoVerde">简历,
图象证实对您,
和一些仪器和参考为工作机会查寻。
</p>
</td>
<td width="25%">
<p align="center" ><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww. etantonio.it%2fEN%2fWeb%2fGifAnimate%2findex.aspx"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww. etantonio.it%2fEN%2fWeb%2fGifAnimate%2findex.aspx"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww. etantonio.it%2fEN%2fWeb%2fGifAnimate%2findex.aspx"
class="testoMedioVerde">网</a> </p>
<p align="center"
class="testoPiccolissimoVerde">搜索引擎在无数GIF
赋予生命从我选择了和详尽阐述了,
随后将来网的被插入的实验。
</p>
</td>
</tr>
<tr valign="top">
<td width="25%">
<p align="center"><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww. etantonio.it%2fEN%2fVarie%2findex.aspx"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww. etantonio.it%2fEN%2fVarie%2findex.aspx"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww. etantonio.it%2fEN%2fVarie%2findex.aspx"
class="testoMedioVerde">数</a> </p>
<p align="center"
class="testoPiccolissimoVerde">巨大我的利益发现这里出气孔,
艺术, 旅行,
激情以远对我的热点表的链接。
</p>
</td>
<td width="25%"> <div align="center"></div></td>
<td width="25%"> <div align="center"></div></td>
<td width="25%">
<p align="center"><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww. etantonio.it%2fEN%2fContatti%2findex.aspx"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww. etantonio.it%2fEN%2fContatti%2findex.aspx"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww. etantonio.it%2fEN%2fContatti%2findex.aspx"
class="testoMedioVerde">联络</a></p>
<p align="center"
class="testoPiccolissimoVerde">这里它是可能接触对我为每必要或理事会是 通过编写形式或插入消息nel
论坛delle 想法的邮件。
</p>
</td>
</tr>
</table>

</td>
</tr>
</table>
<script>InserisciFooter();</script>
<br>
<a href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww. etantonio.it%2fEN%2fEN%2fUniversita%2findex.aspx"
class="trasparente"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww. etantonio.it%2fEN%2fEN%2fUniversita%2findex.aspx"
class="trasparente"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww. etantonio.it%2fEN%2fEN%2fUniversita%2findex.aspx"
class="trasparente">Universita 用 </a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww. etantonio.it%2fEN%2fFR%2fUniversita%2findex.aspx"
class="trasparente">Universita</a>
<a href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww. etantonio.it%2fEN%2fFR%2fUniversita%2findex.aspx"
class="trasparente"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww. etantonio.it%2fEN%2fEN%2fUniversita%2findex.aspx"
class="trasparente">英语
</a><a href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww. etantonio.it%2fEN%2fFR%2fUniversita%2findex.aspx"
class="trasparente">用法语</a>
</td>
</a>
<td>
</body>
</html>

I'm trying to read this page and store it in a file with the .aspx
extension, using the same code as in my first post. The result is that
many characters are not decoded correctly.

Can you help me solve the problem?

Many Thanks in advance ...

Ing. Antonio D'Ottavio
Jul 21 '05 #3
Hi Jon,
The HTTP protocol violation error you are getting can be resolved by
enabling unsafe header parsing in the machine.config file. Add this entry:

<httpWebRequest useUnsafeHeaderParsing="true" />

under the <system.net><settings> section of machine.config.
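
For reference, here is a sketch of where that entry sits in a .NET Framework configuration file. The element names are lowercase (system.net, settings). Putting it in the application's own config file instead of machine.config is an assumption on my part and not something stated in the thread, but it keeps the change local to one application. Note that the setting relaxes validation of response headers, so it is best enabled only when a server is known to send malformed ones.

```xml
<configuration>
  <system.net>
    <settings>
      <!-- Accept responses whose headers do not strictly follow the HTTP spec -->
      <httpWebRequest useUnsafeHeaderParsing="true" />
    </settings>
  </system.net>
</configuration>
```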
"Jon Skeet [C# MVP]" wrote:
> I'd look in more depth myself, but using the code above, it's
> complaining that the server committed an HTTP protocol violation :(

Jul 21 '05 #4
