Regular Expression help

Rob
Hi,
I need to convert our word documents to html for our website. I've used
MS Word's "Save as HTML" feature and ran "Microsoft Office HTML Filter
2.0" to clean up the code but I am stuck with a lot of additional code
and I want to write a script that will do a custom cleanup.

The Word document has a "Table of Contents" and when I convert, I get
links at the top of my page that link to the appropriate section but I
get code like this:

<a name="_Toc54767572"></a><a name="_Toc58978952"></a><a
name="_Toc58980987"></a><a
name="_Toc58981749"></a><a name="_Toc90871301"></a><a
name="_Toc93973545"></a><a
name="_Toc126114863"></a>
<a name="_Toc157391168">My Title</a>

I get a whole bunch of empty anchor tags each with a different name and
only the last anchor tag is correct. I would like to use regular
expressions to remove all empty "a" tags.

I know how to use regular expressions with ASP 3.0 but I don't know the
pattern.

Does anyone know the regex.pattern to replace all empty <a> tags with an
empty string?

Thanks
Rob

Apr 26 '07 #1

"Rob" <ro****@hotmail.com> wrote in message
news:uM**************@TK2MSFTNGP05.phx.gbl...
> [..]
> I get a whole bunch of empty anchor tags each with a different name and
> only the last anchor tag is correct. I would like to use regular
> expressions to remove all empty "a" tags.

Rob, I think something similar to

Set RegularExpressionObject = New RegExp

With RegularExpressionObject
.Pattern = "\<a(.|\n)*\>\<\/a\>"
.IgnoreCase = True
.Global = True
End With

ReplacedText = RegularExpressionObject.Replace(InitialText, "")
Apr 27 '07 #2
Alexey Smirnov wrote on 27 apr 2007 in
microsoft.public.inetserver.asp.general:

> "Rob" <ro****@hotmail.com> wrote in message
> news:uM**************@TK2MSFTNGP05.phx.gbl...
> [..]
>> I get a whole bunch of empty anchor tags each with a different name
>> and only the last anchor tag is correct. I would like to use regular
>> expressions to remove all empty "a" tags.
>
> Rob, I think something similar to
>
> Set RegularExpressionObject = New RegExp
>
> With RegularExpressionObject
> .Pattern = "\<a(.|\n)*\>\<\/a\>"
> .IgnoreCase = True
> .Global = True
> End With
>
> ReplacedText = RegularExpressionObject.Replace(InitialText, "")

.Pattern = "<a[^>]*>\s*<\/a>"

will do.

=================

However, why [yes, I know it is personal preference] not use a bit of
jscript even if you use vbs in ASP:
<% ' vbs
dim t,result
t="x<a \nhref='bbb'\n> </a>\n\n<a href='bbb'>x </a>"
result = deleteEmptyAnchors(t)
%>
<script language='jscript' runat='server'>
function deleteEmptyAnchors(t){
return t.replace(/<a[^>]*>\s*<\/a>/gi,'');
};
</script>
--
Evertjan.
The Netherlands.
(Please change the x'es to dots in my emailaddress)
Apr 27 '07 #3
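
For anyone who wants to try the pattern from the post above outside ASP, here is a minimal sketch that runs it against the markup from the original question. It assumes a plain JavaScript engine such as Node.js (console.log would be Response.Write in classic ASP), and the variable names sample and cleaned are only for illustration.

```javascript
// Sample markup from the original question, with the line breaks
// that Word's export left inside some of the tags.
var sample =
  '<a name="_Toc54767572"></a><a name="_Toc58978952"></a><a\n' +
  'name="_Toc58980987"></a><a\n' +
  'name="_Toc58981749"></a><a name="_Toc90871301"></a><a\n' +
  'name="_Toc93973545"></a><a\n' +
  'name="_Toc126114863"></a>\n' +
  '<a name="_Toc157391168">My Title</a>';

// Same pattern as in the post: "<a", anything up to the first ">",
// optional whitespace, then "</a>".
function deleteEmptyAnchors(t) {
  return t.replace(/<a[^>]*>\s*<\/a>/gi, '');
}

var cleaned = deleteEmptyAnchors(sample);

// Only the anchor that wraps "My Title" is left (plus one leading newline).
console.log(cleaned);
```

Because [^>] is a negated class, it also matches the newlines inside the wrapped tags, so no multiline handling is needed. One caveat: <a[^>]* would also accept other tag names starting with "a" (for example <abbr>) if they were directly followed by </a>, which should not occur in valid markup.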
Rob
Thanks Evertjan

I tried the other example "\<a(.|\n)*\>\<\/a\>" but my page was taking
too long to process it. Then I tried your example "<a[^>]*>\s*<\/a>" and
it works great.

Thanks again.

Rob

Apr 27 '07 #4
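
A short sketch of why the two patterns quoted above behave so differently. This is an illustration in plain JavaScript (assumed to run under Node.js), with a made-up input string html; the first pattern is written without the redundant backslashes before < and >, which should not change what it matches.

```javascript
// Hypothetical input: two empty anchors with real content between them.
var html =
  '<a name="x"></a><a name="y">My Title</a><p>text</p><a name="z"></a>';

// Pattern from reply #2: (.|\n)* is greedy, so it runs to the end of the
// input and backtracks to the LAST "></a>" it can find.
var greedy = /<a(.|\n)*><\/a>/gi;

// Pattern from reply #3: [^>]* cannot cross the ">" that closes the
// opening tag, so each match stays inside a single empty anchor.
var bounded = /<a[^>]*>\s*<\/a>/gi;

var greedyResult = html.replace(greedy, '');
var boundedResult = html.replace(bounded, '');

console.log(JSON.stringify(greedyResult));  // everything is removed
console.log(JSON.stringify(boundedResult)); // title and paragraph survive
```

So the first pattern is not only slow, it can also delete real content between the first and the last empty anchor. The slowness is probably due to the greedy (.|\n)* scanning the rest of the page from every "<a" and then backtracking one character at a time, which adds up quickly on a large Word export.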