How to check the encoding format of an XML

Dear All,
I have an XML file without an XML declaration such as "<?xml version="1.0" encoding="UTF-8"?>" at the start. I am parsing it with the XML::Parser module, and it parses without any errors.

How can I check the encoding of this XML? Is there any way in Perl to determine the encoding of an XML file that has no XML declaration at the start?

The XML looks like this:
    <record>
      <name>a</name>
      <place>b</place>
    </record>
    <record>
      <name>c</name>
      <place>d</place>
    </record>
Can anyone help me out please?
Nov 20 '07 #1
eWish
Have you looked at CPAN? There is a module called XML::ParseDTD that might be what you need.

--Kevin
Nov 20 '07 #2
How can we check the encoding of an XML file when there is no DTD to validate against? I have tried the Encode::Guess module but was not able to get the encoding. Maybe something is wrong in the code:

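    use strict;
    use warnings;
    use Encode::Guess;

    my $filename = 'records.xml';
    # Read raw bytes: Encode::Guess expects undecoded octets.
    open(my $fh, '<:raw', $filename) or die "Can't open $filename: $!";
    my $data = "";
    while (my $line = <$fh>) {
        $data .= $line;
    }
    close($fh);

    # guess_encoding returns an encoding object, or an error string on failure.
    my $decoder = guess_encoding($data);
    die $decoder unless ref($decoder);
    # The original printed an undeclared @ref, which fails under strict.
    print "\nencoding = ", $decoder->name, "\n";
    my $utf8 = $decoder->decode($data);
where records.xml has the format I posted above.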

Can anyone please help me out.
Nov 21 '07 #3
Can anyone help me out?
Dec 3 '07 #4
eWish
    use strict;
    use warnings;
    use Encode::Guess;

    my $file = '/path/to/file/my_xml.xml';

    # Open in raw mode so Encode::Guess sees undecoded bytes.
    open(my $FH, '<:raw', $file) or die "Can't open file: $!";
    while (my $data = <$FH>) {
        chomp($data);

        # guess() returns an encoding object on success, or an error string.
        my $decoder = Encode::Guess->guess($data);
        die $decoder unless ref($decoder);
        my $utf8 = $decoder->decode($data);
    }
    close($FH);
Dec 4 '07 #5
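
A note on the original question: for a file with no XML declaration, the XML 1.0 specification requires the document to be UTF-8 unless it starts with a byte order mark or an external protocol such as HTTP states the encoding. So the check can be reduced to "is there a BOM, and if not, are the bytes valid UTF-8?". The following is only a minimal sketch of that idea, assuming the whole document fits in memory. The name sniff_xml_encoding is a hypothetical helper, not part of XML::Parser or Encode, and the sketch does not try to tell UTF-32 apart from UTF-16.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper: report the encoding of raw XML bytes that carry
# no XML declaration, using the BOM and a strict UTF-8 validity check.
sub sniff_xml_encoding {
    my ($bytes) = @_;

    # A byte order mark, if present, identifies the encoding directly.
    return 'UTF-8'    if $bytes =~ /\A\xEF\xBB\xBF/;
    return 'UTF-16BE' if $bytes =~ /\A\xFE\xFF/;
    return 'UTF-16LE' if $bytes =~ /\A\xFF\xFE/;

    # No BOM: a well-formed document must be UTF-8, so verify that the
    # bytes decode strictly. FB_CROAK makes decode() die on bad input.
    my $ok = eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC); 1 };
    return $ok ? 'UTF-8' : 'unknown (not valid UTF-8)';
}

# Example usage with in-memory data instead of a file.
print sniff_xml_encoding("<record><name>a</name></record>"), "\n";
```

For a real file, the bytes would be read with open(my $fh, '<:raw', $filename) so that Perl does not decode them first. Pure ASCII data like the sample records is also valid UTF-8, which is why Encode::Guess may report it as ascii.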
