Bytes | Software Development & Data Engineering Community

HTML parsing with regular expressions

Hi, I am new to writing Perl and am just learning the basics of regular expressions. I have an assignment to parse some data from an HTML file and, in the end, create a CSV file from it. Where do I start? I can only use pattern matching to extract the material I need. Below is the link to the page I have to parse. I need to parse out the data in the table (Date, Time, Name, Address, etc.). Thanks!

http://www.iowa-city.org/police/arrests.asp?charge=94000
Nov 9 '08 #1
KevinADC
If you need help with your existing code, please post it. If you have yet to write any code, search online for Perl regular expression tutorials. There are some here:

http://www.perlmonks.com/index.pl?node=Tutorials
Nov 9 '08 #2
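
As a possible starting point for the question above, here is a minimal sketch of pulling table rows out of HTML with pattern matching only and turning them into CSV lines. The markup of the linked page is not shown in this thread, so the sample table in $html is made up, and the sub names row_to_csv and html_to_csv are hypothetical. It assumes each record sits in its own <tr> with one <td> (or <th>) per field and that tables are not nested; the patterns would likely need adjusting for the real page.

```perl
use strict;
use warnings;

# Hypothetical sample input; the real page's markup is an assumption here.
my $html = <<'HTML';
<table>
<tr><th>Date</th><th>Time</th><th>Name</th><th>Address</th></tr>
<tr><td>11/08/2008</td><td>01:15</td><td>Doe, John</td><td>123 Main St</td></tr>
<tr><td>11/08/2008</td><td>02:40</td><td><b>Roe, Jane</b></td><td> 456 Oak Ave </td></tr>
</table>
HTML

# Turn the inner HTML of one table row into a single CSV line.
sub row_to_csv {
    my ($row) = @_;
    my @cells;
    # Non-greedy capture of each cell's contents; /s lets . match newlines,
    # /i ignores tag case, /g walks through every cell in the row.
    while ($row =~ m{<t[dh]\b[^>]*>(.*?)</t[dh]>}gis) {
        my $cell = $1;
        $cell =~ s/<[^>]+>//g;      # drop tags nested inside the cell
        $cell =~ s/^\s+|\s+$//g;    # trim leading and trailing whitespace
        $cell =~ s/"/""/g;          # CSV rule: double any embedded quotes
        push @cells, qq("$cell");   # quote every field so commas are safe
    }
    return join ',', @cells;
}

# Find every <tr>...</tr> block and convert it, skipping rows with no cells.
sub html_to_csv {
    my ($text) = @_;
    my @lines;
    while ($text =~ m{<tr\b[^>]*>(.*?)</tr>}gis) {
        my $line = row_to_csv($1);
        push @lines, $line if length $line;
    }
    return @lines;
}

print "$_\n" for html_to_csv($html);
```

To write a file instead of printing, open a filehandle (for example open my $out, '>', 'arrests.csv' or die $!;, where the file name is only an example) and print each line to it. Quoting every field is a deliberate choice: names such as "Doe, John" contain commas that would otherwise split into two columns.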
