What is wrong with my script?

I'm using the code below to extract the text between all the <br></br> tags.

But it does not print all of the text, and it also prints plain text that is not part of an HTML link tag.

For example, if you have <a href="test.html" ><b>The Testing Page is here</b></a>
<b> extrat text</b>
I want to extract only "The Testing Page is here".

Here, the variable $myfile contains the whole HTML page.

while ($myfile =~ /<br.+?>(.*)<\/br>/xg)
{
    print ("a");
    print $1;
}

Can someone help me out? What am I doing wrong here?

More information: I am trying to extract all the text that is a link in the given HTML page.

Feb 6 '10 #1

modmans2ndcoming

There is no line-break tag (<br>) in your example, and the line-break tag does not have a closing tag; it is self-closing. I will assume you mean the bold tag (<b>).

The way you have written your regex, it is looking for a line-break tag, so right off the bat, that needs to be fixed.

Furthermore, the way you have it written, it will only pick up link text that sits between bold tags. Not very flexible.

The pattern you want to look for is an anchor tag, followed by zero or more tags, followed by alphanumeric characters of any length, ending when you hit the opening bracket of the next tag.

But even with that, there is a problem if a tag is embedded in the middle of a sentence used as the link text. I'll leave that to you to figure out, though, if you care to.
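One way to sketch the approach described above, including the case where a tag is embedded in the middle of the link text: capture everything between the opening and closing anchor tags, then strip whatever tags remain inside the capture. The sub name link_texts and the sample markup are illustrative assumptions, not part of the original script. A regex is only a rough tool for HTML, and a parsing module such as HTML::TokeParser is likely to be more robust on real pages.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the visible text of every <a>...</a> element.
sub link_texts {
    my ($html) = @_;
    my @texts;
    # <a\b[^>]*> matches an opening anchor tag with any attributes;
    # (.*?) lazily captures up to the nearest </a>;
    # /s lets . cross newlines, /i ignores tag case, /g finds every link.
    while ($html =~ /<a\b[^>]*>(.*?)<\/a\s*>/sig) {
        my $text = $1;
        $text =~ s/<[^>]+>//g;      # drop nested tags such as <b> or <i>
        $text =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
        push @texts, $text if length $text;
    }
    return @texts;
}

# Sample input (an assumption, modelled on the example in the question).
my $myfile = <<'HTML';
<a href="test.html" ><b>The Testing Page is here</b></a>
<b> extrat text</b>
<a href="other.html">Read the <i>full</i> story</a>
HTML

print "$_\n" for link_texts($myfile);
```

With this sample input, the bold text outside the anchor is skipped, and the <i> tag inside the second link is removed from its text.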

Feb 7 '10 #2

numberwhun

You really need to examine what you are telling your code to extract and what you actually have in your data.

You are telling it to match everything between <br> and </br>, but those tags do not exist in your example. Instead, remove the 'r' and try matching the <b> </b> tag set.

Regards,

Jeff

Feb 8 '10 #3

nithinpes

If you use:

$myfile =~ /<b>(.*)<\/b>/xg

$1 would contain "The Testing Page is here</b></a>
<b> extrat text", provided that . is allowed to match the newline (the /s modifier) or the markup is all on one line.
This is because the * quantifier is greedy. To limit this behaviour so that the match stops at the first </b>, use:

$myfile =~ /<b>(.*?)<\/b>/xg
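The difference between the two quantifiers can be checked with a small script. This is a minimal sketch: the helper sub captures is a hypothetical name, and the sample string is the question's example placed on a single line so that . can reach the second </b> without the /s modifier.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every $1 captured by $pattern in $text.
sub captures {
    my ($pattern, $text) = @_;
    my @found;
    while ($text =~ /$pattern/g) {
        push @found, $1;
    }
    return @found;
}

# Sample input on one line (an assumption; see the lead-in).
my $myfile = '<a href="test.html" ><b>The Testing Page is here</b></a> <b> extrat text</b>';

my @greedy = captures(qr/<b>(.*)<\/b>/,  $myfile);   # runs to the last </b>
my @lazy   = captures(qr/<b>(.*?)<\/b>/, $myfile);   # stops at the first </b>

print "greedy: [$_]\n" for @greedy;
print "lazy:   [$_]\n" for @lazy;
```

The greedy pattern yields a single capture that runs through to the last </b>, while the non-greedy pattern yields one capture per bold element. Note that the non-greedy version still returns the bold text outside the link, so the pattern also has to be tied to the anchor tag if only link text is wanted.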

Feb 8 '10 #4
