473,380 Members | 1,121 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,380 software developers and data experts.

How do I open and use this xml file with php ?

290 100+
Hi,

I am looking at a download from clickbank and I notice that
it has two files, a very small on suffixed with .dtd which I
list below, and a huge file 26 Mb suffixed with .xml

( not sure if I should use the "code" tags for these filesĖyes, you should, Mod)

Here is the .dtd
Expand|Select|Wrap|Line Numbers
  1. <!ELEMENT Catalog ( Category* ) >
  2. <!ELEMENT Category ( Name, Site*, Category* ) >
  3. <!ELEMENT Commission ( #PCDATA ) >
  4. <!ELEMENT Description ( #PCDATA ) >
  5. <!ELEMENT EarnedPerSale ( #PCDATA ) >
  6. <!ELEMENT TotalEarningsPerSale ( #PCDATA ) >
  7. <!ELEMENT TotalRebillAmt ( #PCDATA ) >
  8. <!ELEMENT HasRecurringProducts ( #PCDATA ) >
  9. <!ELEMENT Gravity ( #PCDATA ) >
  10. <!ELEMENT Id ( #PCDATA ) >
  11. <!ELEMENT Name ( #PCDATA ) >
  12. <!ELEMENT PercentPerSale ( #PCDATA ) >
  13. <!ELEMENT PopularityRank ( #PCDATA ) >
  14. <!ELEMENT Referred ( #PCDATA ) >
  15. <!ELEMENT Site ( Commission? | Description+ | EarnedPerSale? | TotalEarningsPerSale? | TotalRebillAmt? | Gravity? | Id+ | PercentPerSale? | PopularityRank+ | Referred? | Title+ | HasRecurringProducts )* >
  16. <!ELEMENT Title ( #PCDATA ) >

And here is the first few lines on the .xml file.

Expand|Select|Wrap|Line Numbers
  1. <?xml version="1.0" encoding="ISO-8859-1"?>
  2. <!DOCTYPE Catalog SYSTEM "marketplace_feed_v1.dtd">
  3. <Catalog>
  4. <Category>
  5. <Name>Business to Business</Name>
  6. <Site>
  7. <Id>REGEASY</Id>
  8. <PopularityRank>1</PopularityRank>
  9. <Title><![CDATA[Registry Easy - #1 Converting Registry Cleaner & System Optimizer.]]></Title>
  10. <Description><![CDATA[Stunning Conversions With Extremely Low Refund Rate. Dedicated Affiliate Support. Extraordinary Customer Service. Any Kind Of Conversion Tracking & Multiple Landing Pages. Talk To Us! Http://www.cheesesoft.com/affiliates/registry-easy/.]]></Description>
  11. <HasRecurringProducts>false</HasRecurringProducts>
  12. <Gravity>226.333</Gravity>
  13. <EarnedPerSale>31.7204</EarnedPerSale>
  14. <PercentPerSale>75.0</PercentPerSale>
  15. <TotalEarningsPerSale>31.7204</TotalEarningsPerSale>
  16. <TotalRebillAmt>0.0</TotalRebillAmt>
  17. <Referred>68.0</Referred>
  18. <Commission>75</Commission>
  19. </Site>
  20. <Site>
  21. <Id>BRYXEN4</Id>
  22. <PopularityRank>2</PopularityRank>
  23. <Title><![CDATA[Keyword Elite 2.0: The New Generation Of Keyword Research Software!]]></Title>
  24. <Description><![CDATA[Dominate Adwords. Dominate Niche Marketing. Dominate The Search Engines. Go Here For Tons Of Affiliate Tools: Http://www.keywordelite.com/affiliate/.]]></Description>
  25. <HasRecurringProducts>true</HasRecurringProducts>
  26. <Gravity>229.6</Gravity>
  27. <EarnedPerSale>65.1052</EarnedPerSale>
  28. <PercentPerSale>48.0</PercentPerSale>
  29. <TotalEarningsPerSale>74.1738</TotalEarningsPerSale>
  30. <TotalRebillAmt>15.2186</TotalRebillAmt>
  31. <Referred>79.0</Referred>
  32. <Commission>50</Commission>
  33. </Site>

Ok - so that shows the header info and the first two lines of data.

Now, the first line of the header info refers to the .dtd file.

I could use the info in the .dtd file to create a table
with columns ( fields) as it states.

Or I could just create the table structure from looking at the first few
records in the xml file that I have shown.

Once I have done that, I guess that I write a php script
to open the xml file and then step through each row and pull out the contents that is found between the tags.

As it finds each tag it can locate the contents and update the table.

So:


Expand|Select|Wrap|Line Numbers
  1. OPEN DBASE CONNECTION then:
  2.  
  3. $CB_file = file('clickbank.xml');
  4.  
  5. for($i=0; $i<count($CB_file); $i++) {
  6.     $arrayOfLine = explode('???', $geo_arr[$i]);
  7.  
  8.     Update cbdb SET ????? = ??????
  9.     $result = mysql_query($sql) or die("could not CBDB"). mysql_error();  
  10.    break;
  11.    }
  12. }

Yes, I know that I have a lot of gaps to fill in

But, my question is, can this approach work with a
xml file of 28 Mb and based on the files that I have
can you please help me fill in the gaps.

PS I have searched and read up about XML -> MySQL but I didn' like the
look of XML-DOM or SAX-based parsers, and so would prefer to try and
get something "hand made" to work for my specific files.


Thanks for any input and help.
Oct 10 '09 #1
14 3155
zorgi
431 Expert 256MB
Did you try simplexml?

http://php.net/manual/en/book.simplexml.php

Very easy to use. Converts XML to object. Only available from PHP5
Oct 10 '09 #2
jeddiki
290 100+
Hi,
Thanks for the tip.

I have read it and as far as I can tell
this is what I need to do:

Expand|Select|Wrap|Line Numbers
  1. $CB_file = file('clickbank.xml');
  2.  
  3. $xmlstr ="<<<XML ".$CB_file." XML";
  4.  
  5. // ( do I have to ass line breaks at all ? )
  6.  
  7. // Then I continue to check validity:
  8.  
  9. $xmlObject = simplexml_load_string($xmlstr); 
  10.  
  11. // not sure about how I go to this next line
  12.  
  13. $xml = new SimpleXMLElement($xmlstr);
  14.  
After this I guess that I need a foreach loop to work through the
whole file ?
Oct 10 '09 #3
zorgi
431 Expert 256MB
If you know path to your xml file you can do this:

Expand|Select|Wrap|Line Numbers
  1. $xml = simplexml_load_file('clickbank.xml');
  2.  
$xml is now object and you could do something like this:
Expand|Select|Wrap|Line Numbers
  1.     foreach($xml->Category as $category){
  2.             echo $category->Name."<br />";            
  3.             foreach($category->Site as $site){
  4.                 echo $site->Id."<br />";
  5.             }
  6.     }
  7.  
Oct 10 '09 #4
jeddiki
290 100+
Thanks
I will work with what you have given me. :)

So are you saying that I don't need to bother with:
checking validity?

# $xmlObject = simplexml_load_string($xmlstr);
Oct 10 '09 #5
zorgi
431 Expert 256MB
simplexml_load_file ó Interprets an XML file into an object

http://php.net/manual/en/function.si...-load-file.php

extracts from that page in the manual:
  • Convert the well-formed XML document in the given file to an object.
  • Produces an E_WARNING error message for each error found in the XML data.

There is also loads of tutorials on line for simplexml:
http://www.phpfever.com/simplexml-tutorial-part1.html
Oct 10 '09 #6
Dormilich
8,658 Expert Mod 8TB
@jeddiki
I donít think you can explicitly validate the XML against the DTD with SimpleXML (well, itís not necessary in most cases)
Oct 10 '09 #7
jeddiki
290 100+
Thanks, I read that tute.

I now have something close to working :)

But I get and error on trying to write to the database:
it may be becasue I need to convert the data ?
Expand|Select|Wrap|Line Numbers
  1. $sql = "INSERT INTO `clickbank` (cat,id,pop)
  2.  VALUES                              ('$category->Name','$site->Id','$site->PopularityRank')";
  3.  
I noticed in the tute it said something that might apply:

It gave this example:

Expand|Select|Wrap|Line Numbers
  1.       $xml = Ďtest_file.xmlí;
  2.       $xml = simplexml_load_file($xml);
  3.       $value_to_store = (string) $xml->make[0]->model;
  4.       // This converts the "Mustang" SimpleXMLElement object to a string, making it disk storable.
  5.  
Does this mean that I have to do this:
$Db_id = (string) $xml->Category->$site->Id;
for each field?

And is this enough? Or do I need to add counters to keep track of which row is being processed and then use something like:

$Db_id = (string) $xml->Category[$cnt1]->$site->Id;


The error message I get from the script is :
could not execute INSERT set up clients.You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'desc,recurr,grav,earn,percent,totearn,rebill,refe r,comm) VALUES ' at line 1
Oct 10 '09 #8
zorgi
431 Expert 256MB
gettype( $category->Name ) results in object but
gettype( '$category->Name' ) results in sting

Based on that I suspect that your problem is not XML related any more but somewhere within you sql statement.
Oct 10 '09 #9
jeddiki
290 100+
Thanks,

I have looked again bıt I can not see anything wrong:

This is my code:

Expand|Select|Wrap|Line Numbers
  1. $file = 'clickbank.xml';
  2. $xml = simplexml_load_file($file);
  3. $cnt1 = "0";
  4. $cnt2 = "0";
  5.  
  6.  
  7. foreach($xml->Category as $category){
  8.    echo "$cnt1) ".$category->Name."<br>";
  9.    $cnt1 = $cnt1+1;            
  10.    foreach($category->Site as $site){
  11.       echo "$cnt2) ". $site->Id, $site->PopularityRank, $site->Title,    
  12.      $site->Description, $site->HasRecurringProducts, $site->Gravity, 
  13.      $site->EarnedPerSale, $site->PercentPerSale, $site->TotalEarningsPerSale, $site->TotalRebillAmt, $site->Referred, $site->Commission."<br><br> ";
  14.  
  15.  
  16.      $sql = "INSERT INTO clickbank   
  17.          (cat,id,pop,title,desc,recurr,grav,earn,percent,totearn,rebill,refer,comm)
  18.         VALUES     
  19.       ('$category->Name','$site->Id','$site->PopularityRank','$site->Title','$site->Description','$site->HasRecurringProducts','$site->Gravity','$site->EarnedPerSale','$site->PercentPerSale','$site->TotalEarningsPerSale','$site->TotalRebillAmt','$site->Referred','$site->Commission')";
  20.  
  21. $result = mysql_query($sql) or die("could not execute INSERT set up clients.". mysql_error());  
  22.          $cnt2 = $cnt2+1;
  23.              }
  24.      }
  25.  
And this is the output:
0) Business to Business
0) REGEASY1Registry Easy - #1 Converting Registry Cleaner & System Optimizer.Stunning Conversions With Extremely Low Refund Rate. Dedicated Affiliate Support. Extraordinary Customer Service. Any Kind Of Conversion Tracking & Multiple Landing Pages. Talk To Us! Http://www.cheesesoft.com/affiliates....72040.068.075

could not execute INSERT set up clients.You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'desc,recurr,grav,earn,percent,totearn,rebill,refe r,comm) VALUES ' at line 1
Maybe there is a reserved name I have used for a field or something ?
Oct 10 '09 #10
zorgi
431 Expert 256MB
What is structure of your clickbank table?
Oct 10 '09 #11
jeddiki
290 100+
I added this:
Expand|Select|Wrap|Line Numbers
  1. print '<pre>'.$sql.'</pre>'; 

And I got this result:

INSERT INTO clickbank (cat,id,pop,title,desc,recurr,grav,earn,percent,to tearn,rebill,refer,comm)
VALUES
('Business to Business','REGEASY','1','Registry Easy - #1 Converting Registry Cleaner & System Optimizer.','Stunning Conversions With Extremely Low Refund Rate. Dedicated Affiliate Support. Extraordinary Customer Service. Any Kind Of Conversion Tracking & Multiple Landing Pages. Talk To Us! Http://www.cheesesoft.com/affiliates...0','75')
Can anyone see any problem ?

The numbers look a bit odd!

Could the problem be that grav is set up in the table as:

double(5.2) maybe the 31.7204 doesn't fit.

Actually I don't understand that number, is it supposed be 31720.40 dollars or 317,204 dollars ? Or only 31.72 dollars ?

Or maybe it is something in that long description ?
Oct 10 '09 #12
zorgi
431 Expert 256MB
@jeddiki
From the look of that first thing that comes to mind is that something is wrong with your desc field, probably too short.

Also you should do casting of your INSERT types to match those in the database table or change your tables types all to varchar. Not sure what is better but something should be done.

What is your table structure?
Oct 10 '09 #13
jeddiki
290 100+
OK - think I have found the problem

I think it was the field named "desc" because it can be used in the ORDER part of sql command.

It was a guess but when I changed the name to "descrip" the first three rows get processed OK

This is my file structure now:

Expand|Select|Wrap|Line Numbers
  1. $sql = "CREATE TABLE `clickbank` (
  2.         `cb_id` smallint(8) NOT NULL AUTO_INCREMENT,
  3.      `id` varchar(10) NOT NULL default 'none',
  4.         `cat` varchar(50) NOT NULL default 'none',
  5.     `pop` smallint(8) NOT NULL default '1',
  6.     `title` varchar(100) NOT NULL default 'n',
  7.     `descrip` varchar(300) NOT NULL default 'n',
  8.     `recurr` char(1) NOT NULL default 'n',
  9.     `grav` double(10,2) NOT NULL default '99.99',
  10.     `earn`  double(10,2) NOT NULL default '99.99',
  11.     `percent`  double(5,2) NOT NULL default '99.99',
  12.     `totearn`  double(10,2) NOT NULL default '99.99',
  13.     `rebill`  double(10,2) NOT NULL default '99.99',
  14.     `refer`  double(10,2) NOT NULL default '99.99',
  15.     `comm`  double(5,2) NOT NULL default '99.99',
  16.     PRIMARY KEY (cb_id)
  17.  
I still have a problem and it seems to be caused by single quotes in the description data.

This is my out put
0) Business to Business
0) REGEASY1Registry Easy - #1 Converting Registry Cleaner & System Optimizer.Stunning Conversions With Extremely Low Refund Rate. Dedicated Affiliate Support. Extraordinary Customer Service. Any Kind Of Conversion Tracking & Multiple Landing Pages. Talk To Us! Http://www.cheesesoft.com/affiliates....72040.068.075

INSERT INTO clickbank ( cat, id, pop, title, descrip, recurr, grav, earn, percent, totearn, rebill, refer, comm )
VALUES ( 'Business to Business', 'REGEASY', '1', 'Registry Easy - #1 Converting Registry Cleaner & System Optimizer.', 'Stunning Conversions With Extremely Low Refund Rate. Dedicated Affiliate Support. Extraordinary Customer Service. Any Kind Of Conversion Tracking & Multiple Landing Pages. Talk To Us! Http://www.cheesesoft.com/affiliates/registry-easy/.', 'false',
'226.333', '31.7204', '75.0', '31.7204', '0.0', '68.0', '75' )

1) BRYXEN42Keyword Elite 2.0: The New Generation Of Keyword Research Software!Dominate Adwords. Dominate Niche Marketing. Dominate The Search Engines. Go Here For Tons Of Affiliate Tools: Http://www.keywordelite.com/affiliat...815.218679.050

INSERT INTO clickbank ( cat, id, pop, title, descrip, recurr, grav, earn, percent, totearn, rebill, refer, comm )
VALUES ( 'Business to Business', 'BRYXEN4', '2', 'Keyword Elite 2.0: The New Generation Of Keyword Research Software!', 'Dominate Adwords. Dominate Niche Marketing. Dominate The Search Engines. Go Here For Tons Of Affiliate Tools: Http://www.keywordelite.com/affiliate/.', 'true', '229.6', '65.1052', '48.0', '74.1738', '15.2186', '79.0', '50' )

2) MAVERICKCO3Maverick Coaching - Cell Phone Cash.Cell Phone Cash: A Brand New Course By Maverick Coaching Members Are Making At Least $279/Day With Cell Phones! Customers Get Our 'Make Money Or Its Free' Guarantee, 24/7 Phone Support! Affiliates: Http://cellphonecash.maverickcoachin...212.563486.050

INSERT INTO clickbank ( cat, id, pop, title, descrip, recurr, grav, earn, percent, totearn,rebill, refer, comm )
VALUES ( 'Business to Business', 'MAVERICKCO', '3', 'Maverick Coaching - Cell Phone Cash.', 'Cell Phone Cash: A Brand New Course By Maverick Coaching Members Are Making At Least $279/Day With Cell Phones! Customers Get Our 'Make Money Or Its Free' Guarantee, 24/7 Phone Support! Affiliates: ttp://cellphonecash.maverickcoaching.com/affiliates.php.', 'true',
'674.459', '12.9828', '50.0', '25.5462', '12.5634', '86.0', '50' )

could not execute INSERT set up clients.You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'Make Money Or Its Free' Guarantee, 24/7 Phone Support! Affiliates: Http://cellph' at line 4
Oct 10 '09 #14
Dormilich
8,658 Expert Mod 8TB
you have to escape single quotes inside single quotes, if you donít want to break the string.
Oct 10 '09 #15

Sign in to post your reply or Sign up for a free account.

Similar topics

0
by: Stephen | last post by:
could someone give me some advise on a problem I have. At present i have a web application which displays pdf files in a web browser using javascript and the window.open method and pointing to a...
2
by: gary smith | last post by:
I am trying to open a file (file --> open --> file) at an FTP location using visual Studio.net. Unfortunately, I get an "unspecified error" alert whenever I select a file. Can anyone help...
3
by: Murasama | last post by:
Hi there, Im trying a simple file IO operation in Visual Studio .NET 2003 and it can't seem to open the file. If I run the exe in the debug directory it works fine but if I click the start...
6
by: Charles Morrall | last post by:
I have no experience with DB2 as such, but I've been tasked with configuring backup of a server running DB2 v8 on Windows Server 2003. I do have some experience with backups in general though. The...
2
by: nissiml | last post by:
hi, i'm trying to open a asp.net web page that list files from a Windows application like winword and select a file from it . what do i have to do to make it happen, is it simple ? Thanks in...
2
by: Mattbooty | last post by:
Hello, Not sure if anyone else has seen this bug, but I have a form where the entire form is covered with a picturebox. The picturebox has a mouseup event. I also have an open file dialog for...
2
by: agphoto | last post by:
There is big or problem in open file in read and write mode.. $file = "data.txt"; $fp = fopen($file,"w+"); $line = fgets($fp,"120"); // i need only 1st line to read and upto 120 bytes echo...
4
by: DyslexicAnaboko | last post by:
Hello, I have a module that is part of larger project that is giving me trouble, so I setup an example. Brief ===== I simply want to open a text file and make the contents avaliable...
5
by: Ryan Liu | last post by:
Hi, Both way works, I'd just ask some experts which way is better? My application creates a log file daily. Now each time when I write a log, I will open the file and append to the end....
18
by: Coffee Pot | last post by:
Thanks for any advice. ~ CP
1
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.