How can I delete contents between

How can I delete the contents between <SEC-HEADER> and </SEC-HEADER> in an .htm file?
Why does my code not work?
Thanks!
    #!/usr/bin/perl

    # This is a program which can process the Edgar 10-k html file into a plain text
    # file without graphs and tables.

    $filename="H:/Test Data/wmt2004.htm";
    open IN, '<', $filename or die;
    @contents = <IN>;
    close IN;

    @contents = grep !/<SEC-HEADER>.*</SEC-HEADER>/ @contents;

    $filenameout="H:/Test Data/wmt2004-processed.htm";
    open OUT, '>', $filenameout or die;
    print OUT @contents;
    close OUT;

Jun 8 '10 #1
numberwhun
3,509 Expert Mod 2GB
Have you taken a look at the perldoc page for grep in Perl? You will note that your grep statement should actually be coded as follows:

    @contents = grep { !/<SEC-HEADER>.*<\/SEC-HEADER>/ } @contents;

As for "not working", can you please elaborate? What are you seeing that is going wrong and what are you expecting to see?
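
For reference, grep has two call forms: `grep BLOCK LIST` (no comma after the block) and `grep EXPR, LIST` (comma required). The slash in the closing tag also has to be escaped, or the pattern needs a different delimiter such as `m{...}`. Here is a minimal sketch of both forms; the sample lines and variable names are made up for illustration and are not from the original file:

```perl
use strict;
use warnings;

# Made-up sample input; each element stands for one line of a file.
my @lines = (
    "<html>\n",
    "<SEC-HEADER>foo</SEC-HEADER>\n",
    "<p>body</p>\n",
);

# Block form: grep BLOCK LIST. No comma after the closing brace.
# The slash in the closing tag is escaped because / is the delimiter.
my @kept_block = grep { !/<SEC-HEADER>.*<\/SEC-HEADER>/ } @lines;

# Expression form: grep EXPR, LIST. The comma is required.
# m{...} delimiters avoid having to escape the slash.
my @kept_expr = grep !m{<SEC-HEADER>.*</SEC-HEADER>}, @lines;

print @kept_block;
```

Both forms keep only the lines that do not contain a complete `<SEC-HEADER>...</SEC-HEADER>` pair on a single line.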

Regards,

Jeff
Jun 8 '10 #2
toolic
70 Expert
Your code has syntax errors and does not compile. Please post the actual code you are running.

You should have also posted a small snippet of your input file. Here is my guess: your input file has start and end tags on different lines. Consider:

    use warnings;
    use strict;

    my @contents = <DATA>;
    @contents = grep { !/<SEC-HEADER>.*<\/SEC-HEADER>/ } @contents;
    print @contents;

    __DATA__
    <html>

    <SEC-HEADER>foo</SEC-HEADER>

    <SEC-HEADER>
    bar</SEC-HEADER>

    </html>
This prints out:

    <html>


    <SEC-HEADER>
    bar</SEC-HEADER>

    </html>
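
If a regular expression is acceptable for a one-off cleanup, the multi-line case can be handled by treating the input as a single string instead of filtering it line by line. The sketch below assumes the tags themselves should be removed along with their contents and that they always appear in upper case; `strip_sec_header` is a hypothetical helper name, not something from the original script:

```perl
use strict;
use warnings;

# Hypothetical helper: removes every <SEC-HEADER>...</SEC-HEADER> section,
# tags included, from a string that holds the whole document.
sub strip_sec_header {
    my ($text) = @_;
    # /s lets . match newlines, so a section may span several lines.
    # .*? is non-greedy, so each section is matched separately.
    # /g repeats the substitution for every section in the string.
    $text =~ s{<SEC-HEADER>.*?</SEC-HEADER>}{}gs;
    return $text;
}

# Same sample data as in the example above, held in one string.
my $input = <<'END_HTML';
<html>

<SEC-HEADER>foo</SEC-HEADER>

<SEC-HEADER>
bar</SEC-HEADER>

</html>
END_HTML

my $output = strip_sec_header($input);
print $output;
```

With this sample, both sections are removed and only the surrounding `<html>` lines and blank lines remain. For a real file, the contents would first have to be read into one string, for example by setting `local $/;` before reading from the filehandle.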
In any case, you really should use one of the HTML parser modules from CPAN instead of regular expressions.
Jun 8 '10 #3
Thank you, guys, I figured it out. Thanks very much!
I am trying to get familiar with Perl.
Jun 9 '10 #4
