urgent need help in parsing html tables

I am trying to parse a simple table with two headings and get its rows, but I cannot work out how to pass the HTML, or the path to the HTML file, to the parser.

The HTML file is on my own desktop, so I have a path to it, but I have no clue how to use that path with HTML::TableExtract.

    use HTML::TableExtract;
    $te = HTML::TableExtract->new( headers => [qw(Date Price Cost)] );
    $te->parse($html_string);

    # Examine all matching tables
    foreach $ts ($te->tables) {
      print "Table (", join(',', $ts->coords), "):\n";

      foreach $row ($ts->rows) {
        print join(',', @$row), "\n";
      }
    }
Let's say I put my own headings, say heading1 and heading2, in place of Date Price Cost.
Where should I put the path to the HTML file, which is something like /home/jack/desktop/sample.html?
I tried $html_string="/home/jack/desktop/sample.html" but it does not work at all.

What am I supposed to do? I would appreciate it if you could help me with this.

Thanks a lot
Jul 1 '08 #1
KevinADC
If you use the better HTML::TableParser module it can open the file for you. See the parse_file method:

http://search.cpan.org/~djerius/HTML...TableParser.pm

basically:

    $p->parse_file('c:/windows/desktop/foo.html');

where $p is the parser object and the file path is the correct one for your computer and file. Note: you can use forward slashes in Windows file and directory paths.
Jul 2 '08 #2
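For anyone who would rather stay with HTML::TableExtract: it is a subclass of HTML::Parser, so the inherited parse_file method should accept a path directly. Below is a minimal sketch, assuming the file contains a table whose header cells read heading1 and heading2 (placeholders taken from the question). rows_from_file is a hypothetical helper name, not part of the module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TableExtract;

# Return the data rows of every table that matches the given headers.
# rows_from_file is a hypothetical helper name, not part of the module.
sub rows_from_file {
    my ( $file, @headers ) = @_;
    my $te = HTML::TableExtract->new( headers => \@headers );

    # parse_file is inherited from HTML::Parser; it returns undef
    # (and sets $!) if the file cannot be opened.
    $te->parse_file($file) or die "Cannot read $file: $!";

    return map { $_->rows } $te->tables;
}

# Usage: perl program.pl /home/jack/desktop/sample.html
if (@ARGV) {
    foreach my $row ( rows_from_file( $ARGV[0], 'heading1', 'heading2' ) ) {
        # Empty cells come back as undef, so print them as empty strings.
        print join( ',', map { defined $_ ? $_ : '' } @$row ), "\n";
    }
}
```

Passing the path on the command line keeps the script free of hard-coded paths. The headers are what select the table, so no table id or size is needed.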
Thanks for the post, but that looks more complicated than the previous one.
I just need to parse a table in an HTML file that is on my desktop.
I do not want to use any kind of table id or sizes, just the heading names.

What would be the best way to use HTML::TableExtract?
- I need to put the file path for the HTML somewhere.
(The problem I am facing is that throughout the examples on CPAN, $html_string is used without ever being initialized, so the programs are incomplete.)

- I need to supply the headers.

Results: I need the table data, that's all. I am sorry, but I do not want to have to work out what id my table has and all that.


Please help me; this seems like a simple problem. I could not debug it because whenever I run the program I get no errors and nothing is printed. I am very irritated and more hopeless every day. I think I made a big mistake trying to use Perl for this project; the whole thing is so disorganized that I can't find a single example that does just that.

Please, I would really appreciate it if someone could help me.

Thanks in advance to everyone, and thanks for the reply.
Jul 2 '08 #3
KevinADC
here you go:

    open (HTML, 'c:/path/to/foo.html') or die "$!";
    my $html = do {local $/; <HTML>}; # puts the entire file in a scalar variable
    close HTML;
Now you can parse $html.
Jul 2 '08 #4
This is the program I wrote:

    #!/usr/bin/perl
    use HTML::TableExtract;
    open (HTML, '/root/Desktop/test.html') or die "$!";
    my $html = do {local $/; <HTML>}; # puts the entire file in a scalar variable
    $te = HTML::TableExtract->new( headers => [qw(Heading Heading_2)] );
    $te->parse($HTML);
    # Examine all matching tables
    foreach $ts ($te->tables) {
      print "Table (", join(',', $ts->coords), "):\n";
      foreach $row ($ts->rows) {
        print join(',', @$row), "\n";
      }
    }

But when I run perl program.pl it does not do anything; it just gives me the prompt back.
Thanks for the reply. I would appreciate it if you could solve this problem.

I am literally not getting any output: after I run perl program.pl I just get another prompt.
Thanks, please help
Jul 2 '08 #5
OK, I think I got it; there was a minor problem. Thanks a lot for the help, I appreciate it.
Jul 2 '08 #6
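The poster did not say what the minor problem was. One visible candidate in the program from post #5 is that the file is read into $html but $te->parse is given $HTML. Perl variable names are case-sensitive, so $HTML is a different, empty variable and nothing gets parsed. The sketch below is one possible corrected version, assuming that was the cause. slurp is a hypothetical helper name, and the headers must still match text in the table's header cells.

```perl
#!/usr/bin/perl
use strict;      # turns the undeclared $HTML into a compile-time error
use warnings;
use HTML::TableExtract;

# Read a whole file into one scalar, using a lexical filehandle.
# slurp is a hypothetical helper name.
sub slurp {
    my ($path) = @_;
    open( my $fh, '<', $path ) or die "Cannot open $path: $!";
    local $/;    # undefine the record separator so <$fh> reads everything
    my $content = <$fh>;
    close $fh;
    return $content;
}

# Usage: perl program.pl /root/Desktop/test.html
if (@ARGV) {
    my $html = slurp( $ARGV[0] );
    my $te   = HTML::TableExtract->new( headers => [qw(Heading Heading_2)] );
    $te->parse($html);    # lowercase $html, the variable that holds the file

    # Examine all matching tables
    foreach my $ts ( $te->tables ) {
        print "Table (", join( ',', $ts->coords ), "):\n";
        foreach my $row ( $ts->rows ) {
            # Empty cells come back as undef, so print them as empty strings.
            print join( ',', map { defined $_ ? $_ : '' } @$row ), "\n";
        }
    }
}
```

With use strict in place, the original $te->parse($HTML) line would have stopped the script with a "Global symbol requires explicit package name" error instead of silently printing nothing.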
Hi,
I got the table extracted, and I have a huge document full of tables. Using the tables parsed by HTML::TableExtract, I am trying to search for keywords (from user input) and print only the relevant data.
I tried CPAN but could not find how to search the parsed tables for particular keywords.

One way to do it would be to write the parsed tables out to a text file and parse that.
But this is the wrong way for me: if a keyword is found in a table, I also need the corresponding columns or other relevant data from that table, and parsing the text file would lose that.

Aim and problem: I can't find any way to search through the parsed tables and get the necessary data.


Thanks for the reply, I appreciate it.
Jul 2 '08 #7
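Since each parsed row is just a reference to an array of cell values, one option is to search the rows in memory with grep rather than writing them to a text file first. A matching row then still carries all of its columns. The sketch below assumes a case-insensitive substring match is what is wanted. matching_rows is a hypothetical helper, and the sample rows are made up to stand in for the list that $ts->rows returns.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the rows in which at least one cell contains $keyword
# (case-insensitive). matching_rows is a hypothetical helper; $rows has
# the same shape as the list returned by $ts->rows in HTML::TableExtract:
# one array reference per row.
sub matching_rows {
    my ( $keyword, $rows ) = @_;
    return grep {
        my $row = $_;
        # \Q...\E makes the keyword match literally, not as a regex.
        grep { defined $_ && /\Q$keyword\E/i } @$row;
    } @$rows;
}

# Made-up sample data standing in for one parsed table.
my @rows = (
    [ '2008-07-01', 'Widget', '4.50' ],
    [ '2008-07-02', 'Gadget', '7.25' ],
    [ '2008-07-02', undef,    '1.00' ],
);

foreach my $row ( matching_rows( 'gadget', \@rows ) ) {
    print join( ',', map { defined $_ ? $_ : '' } @$row ), "\n";
}
# prints: 2008-07-02,Gadget,7.25
```

In a real script the call would sit inside the foreach over $te->tables, for example matching_rows( $keyword, [ $ts->rows ] ), so each hit can be reported together with the table it came from.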
