split large xml files

Hi all,

I have an XML file that takes longer than the hosting time limit to be read by
a PHP script.

What I'd like to do is split the large XML file (it can be more than 30MB) into
smaller parts and keep the header in every file.

Here is the idea:

<total>
<head>
</head>
<info>
</info>
<info>
</info>
<info>
</info>
....
</total>

The only thing that varies is the number of "info" elements. What I'd like is
to split the file into smaller ones with the same <head></head> data, but each
with fewer <info> tags (say, limited to 3 per file).

Is there any simple way? This will only be done if the file is bigger
than 1MB.

Bob
Aug 10 '07 #1
> $xml = simplexml_load_file($xmlFile);
> And take it from there. Have a quick read of the simplexml docs. You should
> have your solution in very little time.

Thanks for replying...
After a quick search, I have to say I'm still on PHP 4!!! Damn!!!
Aug 10 '07 #2
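
For reference, the SimpleXML suggestion quoted above could look roughly like the sketch below. It needs PHP 5 or later, so it does not help on PHP 4, and it loads the whole document into memory, so it only suits files that fit comfortably in RAM. The function names `splitWithSimpleXml` and `writeSimpleXmlPart`, the `part_N.xml` naming, and the assumption that the root element is `<total>` with no attributes are all made up for illustration.

```php
<?php
// Sketch only: split <total><head/><info/>...</total> into files holding at
// most $perFile <info> elements each, repeating <head> in every file.
// Both function names and the "part_N.xml" naming are hypothetical.
function writeSimpleXmlPart($outDir, $n, $head, array $infos)
{
    $path = $outDir . '/part_' . $n . '.xml';
    $body = "<total>\n" . $head . "\n" . implode("\n", $infos) . "\n</total>\n";
    file_put_contents($path, $body);
    return $path;
}

function splitWithSimpleXml($xmlFile, $outDir, $perFile = 3)
{
    $xml = simplexml_load_file($xmlFile);   // loads the WHOLE file into memory
    if ($xml === false) {
        return array();
    }
    // asXML() on a child element returns just that element's markup.
    $head = isset($xml->head) ? $xml->head->asXML() : '';
    $chunk = array();
    $written = array();
    foreach ($xml->info as $info) {
        $chunk[] = $info->asXML();
        if (count($chunk) === $perFile) {
            $written[] = writeSimpleXmlPart($outDir, count($written) + 1, $head, $chunk);
            $chunk = array();
        }
    }
    if ($chunk) {                            // leftover <info> elements
        $written[] = writeSimpleXmlPart($outDir, count($written) + 1, $head, $chunk);
    }
    return $written;                         // paths of the files created
}
```

The "only if the file is bigger than 1MB" condition from the original question could be handled by checking `filesize($xmlFile)` before calling the function.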
Hmm, what more can I say than thank you!!!

I'll implement it...thanks
Aug 10 '07 #3
On 10.08.2007 11:21 David Gillen wrote:
> Bob Bedford said:
>> [snip]
>
> $xml = simplexml_load_file($xmlFile);
> And take it from there. Have a quick read of the simplexml docs. You should
> have your solution in very little time.

I haven't tested it, but I doubt simplexml would be able to load a 30MB XML
file. I think the OP's best option is to use a tool that can read and
parse in small chunks, like expat (see
http://www.php.net/manual/en/function.xml-parse.php).
--
gosha bine

makrell ~ http://www.tagarga.com/blok/makrell
php done right ;) http://code.google.com/p/pihipi
Aug 10 '07 #4
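
A minimal sketch of the chunked approach suggested above, using the expat-style `xml_*` functions and feeding the parser 8 KB at a time so the whole document is never held in memory. The `xml_*` functions themselves exist in PHP 4, but this sketch is written for current PHP (visibility keywords, `file_put_contents`, `ENT_XML1`) and would need porting to run there. The class name `ChunkedXmlSplitter`, its method names, and the `part_N.xml` naming are hypothetical, and it assumes `<head>` comes before the first `<info>` and that the root element carries no attributes, as in the original question.

```php
<?php
// Sketch only: stream-split <total><head/><info/>...</total> with xml_parse().
// Class, method, and output file names are hypothetical.
class ChunkedXmlSplitter
{
    private $outDir;
    private $perFile;
    private $depth = 0;
    private $root = '';
    private $current = '';      // re-serialised XML of the child being read
    private $head = '';
    private $infos = array();
    public $written = array();

    public function __construct($outDir, $perFile = 3)
    {
        $this->outDir = $outDir;
        $this->perFile = $perFile;
    }

    public function split($xmlFile)
    {
        $parser = xml_parser_create('UTF-8');
        // Keep element names as written; they are upper-cased by default.
        xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);
        xml_set_element_handler($parser, array($this, 'open'), array($this, 'close'));
        xml_set_character_data_handler($parser, array($this, 'text'));
        $fh = fopen($xmlFile, 'rb');
        if ($fh === false) {
            return false;
        }
        do {
            $data = fread($fh, 8192);           // small chunks keep memory flat
            $final = ($data === false) || feof($fh);
            if ($data === false || !xml_parse($parser, $data, $final)) {
                fclose($fh);
                return false;                   // read error or malformed XML
            }
        } while (!$final);
        fclose($fh);
        $this->flush();                         // write the last, partial group
        return $this->written;
    }

    public function open($parser, $name, $attrs)
    {
        $this->depth++;
        if ($this->depth === 1) {               // root element: remember its name
            $this->root = $name;
            return;
        }
        $tag = '<' . $name;
        foreach ($attrs as $k => $v) {
            $tag .= ' ' . $k . '="' . htmlspecialchars($v, ENT_QUOTES | ENT_XML1, 'UTF-8') . '"';
        }
        $this->current .= $tag . '>';
    }

    public function text($parser, $data)
    {
        if ($this->depth >= 2) {                // ignore whitespace between children
            $this->current .= htmlspecialchars($data, ENT_NOQUOTES | ENT_XML1, 'UTF-8');
        }
    }

    public function close($parser, $name)
    {
        if ($this->depth >= 2) {
            $this->current .= '</' . $name . '>';
        }
        $this->depth--;
        if ($this->depth === 1) {               // a direct child of the root ended
            if ($name === 'head') {
                $this->head = $this->current;
            } elseif ($name === 'info') {
                $this->infos[] = $this->current;
                if (count($this->infos) === $this->perFile) {
                    $this->flush();
                }
            }
            $this->current = '';
        }
    }

    private function flush()
    {
        if (!$this->infos) {
            return;
        }
        $path = $this->outDir . '/part_' . (count($this->written) + 1) . '.xml';
        file_put_contents($path, '<' . $this->root . ">\n" . $this->head . "\n"
            . implode("\n", $this->infos) . "\n</" . $this->root . ">\n");
        $this->written[] = $path;
        $this->infos = array();
    }
}
```

Usage would be along the lines of `$s = new ChunkedXmlSplitter('/some/output/dir', 3); $files = $s->split($xmlFile);`, where the output directory is a placeholder. Because the parser only reports events, the sketch has to rebuild the markup itself, which is why text and attribute values are re-escaped before being written.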
On Aug 10, 2:34 am, "Bob Bedford" <b...@bedford.com> wrote:
>> $xml = simplexml_load_file($xmlFile);
>> And take it from there. Have a quick read of the simplexml docs. You should
>> have your solution in very little time.
>
> Thanks for replying....
> after a quick search, I've to say I'm still in PHP 4 !!! damn !!!

If you have files that big, SimpleXML is not an option: it reads the whole
file into memory and makes a copy of it, so you will run out of memory.
What you really want is XML parsing in "streaming" or "pull parsing" mode.
You can read about it here:

http://www.ibm.com/developerworks/xm...nxw06XMLReader

However, I guess this is not very helpful either, since you're running
PHP 4 and XMLReader was introduced in PHP 5. I am fighting with this at
the moment too (with no solution yet), as I have to parse huge ONIX
files from book publishers (some are 90 MB!). Let me know if you get
lucky.

Aug 10 '07 #5
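
For readers who do have PHP 5 with XMLReader available, the "pull parsing" idea described above might be sketched as follows. The reader walks the file node by node and only materialises one `<head>` or `<info>` element at a time. The function names `splitWithXmlReader` and `writeReaderPart` and the `part_N.xml` naming are hypothetical, and the sketch assumes `<head>` precedes the `<info>` elements and that the root element has no attributes, as in the original question.

```php
<?php
// Sketch only: split <total><head/><info/>...</total> with XMLReader.
// Function names and the "part_N.xml" naming are hypothetical.
function writeReaderPart($outDir, $n, $root, $head, array $infos)
{
    $path = $outDir . '/part_' . $n . '.xml';
    file_put_contents($path, '<' . $root . ">\n" . $head . "\n"
        . implode("\n", $infos) . "\n</" . $root . ">\n");
    return $path;
}

function splitWithXmlReader($xmlFile, $outDir, $perFile = 3)
{
    $reader = new XMLReader();
    if (!$reader->open($xmlFile)) {
        return array();
    }
    $head = '';
    $infos = array();
    $written = array();

    // Advance to the root element (skipping any doctype or comments).
    $more = $reader->read();
    while ($more && $reader->nodeType !== XMLReader::ELEMENT) {
        $more = $reader->read();
    }
    if (!$more) {
        $reader->close();
        return array();
    }
    $root = $reader->name;

    $more = $reader->read();                 // step inside the root
    while ($more) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->depth === 1) {
            if ($reader->name === 'head') {
                $head = $reader->readOuterXml();
            } elseif ($reader->name === 'info') {
                $infos[] = $reader->readOuterXml();
                if (count($infos) === $perFile) {
                    $written[] = writeReaderPart($outDir, count($written) + 1, $root, $head, $infos);
                    $infos = array();
                }
            }
            $more = $reader->next();         // jump past this element's subtree
        } else {
            $more = $reader->read();         // whitespace, comments, end tags
        }
    }
    $reader->close();
    if ($infos) {                            // leftover <info> elements
        $written[] = writeReaderPart($outDir, count($written) + 1, $root, $head, $infos);
    }
    return $written;                         // paths of the files created
}
```

Using `next()` rather than `read()` after handling a child keeps the loop at the top level, so a nested element that happens to be called `info` deeper in the tree is not treated as a separate record.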
.oO(Pavel Lepin)
>And your point is..?
Exactly what I said. The posted code doesn't follow any coding
guidelines and is _very_ hard to read and understand.

Micha
Aug 14 '07 #6

Michael Fesser <ne*****@gmx.de> wrote in
<45************ *************** *****@4ax.com>:
> .oO(Pavel Lepin)
>>And your point is..?
>
> Exactly what I said. The posted code doesn't follow any
> coding guidelines

The code I posted follows the PHP coding style guidelines
(the variant for short code snippets in our dev dept's CMS)
of the organisation I'm working for. I don't think I should
snap out of my habits (which weren't all that easy to
develop to boot, since the coding style I personally prefer
uses *way* more whitespace than the snippet in my OP) just
for the sake of your ease of understanding. Not only are you
not signing my paychecks, other people might actually
find the code easier to read in the style I used, so there's
no reason to give you any preference.

> and is _very_ hard to read and understand.

I find the coding style promoted by Zend IDE ugly and hard
to parse even with syntax highlighting, let alone by naked
eye. It's a matter of perception, and if you believe
there's any sort of consensus on preferable coding style
even in the PHP community alone, you're sadly mistaken.

--
"Patience is a minor form of despair, disguised as
virtue." -- Ambrose Bierce
Aug 14 '07 #7
