parsing xml with perl

This should be simple enough (famous last words)...
I have an SVG file whose elements I need to split into separate SVG files. This is still a work in progress, but I am stuck.
Each output file needs the header info and then the data block.
Each element is written to a separately indexed file name (I am still working out how to get the index range from aaaa to zzzz so that even Windows will keep them in order).
When I run the code I get "error with <?xml version="1.0" encoding="UTF-8" standalone="no"?>: file invalid argument".
I know it's reading and interpolating the special characters... I'm sure my code is sloppy; it's the first thing I've written in 5 years. I'm also sure it could be handled in a far less clunky way than I went about it.
Anything from telling me how to get past that single step to the 12 or so lines of code that would get this done the right way is appreciated.


#!/usr/local/bin/perl
#use strict;
use warnings;

my $infile = <>;
my $count = 0;
my $outfile = ">>Form_" . $count . '.svg';
my @arr;

sub write_slice {
    open(OUT,">$outfile") or die "Error with outfile: $!\n";
    print OUT @arr;
    close(OUT);
    @arr=();
    $count++;
    $outfile = ">>Form_" . $count . '.svg';
    create_slice();
}

sub create_slices {
    foreach my $line (@arr) {

        if ($line !~ '</g>') {
            push @arr, "$line\n";
            next;
        }
        elsif ($line =~ '</g>') {
            push @arr, "$line\n";
            next;
        }
        else {
            push @arr, "$line\n";
            write_slice();
        }
    }
}

sub create_head {
    open(OUT,">headinfo") or die "Error with outfile: $!\n";
    print OUT @arr;
    close(OUT);
    @arr=();
    $outfile = ">>Form_" . $count . '.svg';
    create_slices();
}

chomp $infile;
open IN, $infile or die "Error with infile $infile: $!\n";
my @data=<IN>;
close(IN);

foreach my $line (@data) {

    if ($line !~ "</metadata>") {
        push @arr, "$line\n";
        next;
    }
    elsif ($line =~ "</metadata>") {
        push @arr, "$line\n";
        next;
    }
    else {
        push @arr, "$line\n";
        create_head();
    }
}

Example file, truncated for size:
[sample]
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- Created with vcg library -->
... (bunch of stuff)
</rdf:RDF>
</metadata>

<rect width= " 23.983494cm " height= " 23.983480cm " x="1.000000cm " y="1.000000cm " style= " stroke-width:1pt; fill-opacity:0.0; stroke:rgb(0,0,0)" />
... (more stuff)
</g>

<rect width= " 23.983494cm " height= " 23.983480cm " x="1.000000cm " y="1.000000cm " style= " stroke-width:1pt; fill-opacity:0.0; stroke:rgb(0,0,0)" />
... (more stuff)
</g>
etc..
</svg>
</g>
</svg>
[/sample]
Dec 24 '10 #1
RonB
Don't try to manually parse XML with a simplistic regex and a simplistic if/elsif/else block.
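
As for the error message itself, a likely cause (an assumption, since the command line used is not shown) is the line `my $infile = <>;`. When the script is run as `script.pl file.svg`, `<>` reads the first line of that file rather than its name, so `open` is handed the XML declaration as a file name. Note also that the `if (... !~ ...)` / `elsif (... =~ ...)` pairs in the posted code match every possible line between them, so the `else` branches that call `create_head()` and `write_slice()` never run. A minimal sketch of taking the file name from `@ARGV` instead, where `read_lines` is a hypothetical helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: return all lines of the file at $path.
# Three-argument open with a lexical filehandle keeps characters such as
# '<' or '>' in a file name from being read as an open mode.
sub read_lines {
    my ($path) = @_;
    open my $in, '<', $path or die "Error with infile $path: $!\n";
    my @lines = <$in>;
    close $in;
    return @lines;
}

# shift takes the file name itself from the command line;
# <> would instead read the first line of the file named there.
my $infile = shift @ARGV;
if (defined $infile) {
    my @data = read_lines($infile);
    print scalar(@data), " lines read from $infile\n";
}
```

This only gets past the open error; it does not make line-by-line matching a safe way to split the XML.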

Use one of the standard XML parsers on CPAN.
http://search.cpan.org/search?query=xml&mode=all

Here are 2 of the more commonly used parsers.
http://search.cpan.org/~grantm/XML-S.../XML/Simple.pm
http://search.cpan.org/~mirod/XML-Twig-3.37/Twig.pm
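
A rough sketch of how the split might look with XML::Twig follows. It rests on assumptions not confirmed by the truncated sample: that the groups to split are `<g>` elements directly under the root `<svg>`, and that the shared header is the `<metadata>` element. `split_svg` and the `Form` prefix are illustrative names. The aaaa-to-zzzz file index uses Perl's string auto-increment (`'aaaa'` becomes `'aaab'` after `++`).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Write each top-level <g> of $infile to its own SVG file and
# return the list of file names written.
# Assumption: the groups are direct children of the root <svg> element.
sub split_svg {
    my ($infile, $prefix) = @_;

    my $twig = XML::Twig->new;
    $twig->parsefile($infile);    # dies if the file is not well-formed XML
    my $root = $twig->root;

    # Assumption: the header shared by every slice is the <metadata> element.
    my $metadata = $root->first_child('metadata');
    my $header   = $metadata ? $metadata->sprint : '';

    my $suffix = 'aaaa';          # string increment: aaaa, aaab, ... zzzz
    my @written;
    for my $group ($root->children('g')) {
        my $outfile = "${prefix}_$suffix.svg";
        open my $out, '>:encoding(UTF-8)', $outfile
            or die "Cannot write $outfile: $!\n";
        print {$out}
            qq{<?xml version="1.0" encoding="UTF-8" standalone="no"?>\n},
            $root->start_tag, "\n",   # <svg ...> with its original attributes
            $header, "\n",
            $group->sprint, "\n",     # the whole <g>...</g> subtree
            $root->end_tag, "\n";
        close $out or die "Cannot close $outfile: $!\n";
        push @written, $outfile;
        $suffix++;
    }
    return @written;
}

my ($infile) = @ARGV;
if (defined $infile) {
    print "$_\n" for split_svg($infile, 'Form');
}
```

Run as `perl split_svg.pl drawing.svg` (both names hypothetical), this would write Form_aaaa.svg, Form_aaab.svg and so on, which sort correctly by name. If the groups are nested more deeply, the `children('g')` call would need to change to match the real structure.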
Dec 24 '10 #2
