reg exp

Perl script is formatting text for an HTML page. It changes things like
an & to &amp. But should not change &nbsp. It uses \ as an escape
character. So \&nbsp will become &nbsp. The final results are
correct, but is there a better way to do this?

Input file test.txt
\HOME & \&nbsp; BORN \& FREE BORN FREE ' \' HELP " \" w\\\\\\\w

1st change
1a= \HOME &amp; \&nbsp; BORN \& FREE BORN FREE '' \' HELP &quot; \"
w\\\\\\\w
2nd changes
1b= HOME &amp; &nbsp; BORN & FREE BORN FREE '' ' HELP &quot; "
w\\\w

#!/usr/local/bin/perl5
#
%encode = ( '&' => '&amp;',
'"' => '&quot;',
'\'' => '\'\'' );

$data = `cat test.txt`;
print "Oa= $data\n";
$data =~ s/(?<!\\)(.)/defined($encode{$1})?$encode{$1}:$1/eg;
print "1a= $data\n";
$data =~ s/(\\)(.)/$2/g;
print "1b= $data\n";
This is perl, v5.8.0 built for PA-RISC2.0 On HP-Unix.
Jul 19 '05 #1
Ken Chesak wrote:
Perl script is formatting text for an HTML page. It changes things like
an & to &amp. But should not change &nbsp. It uses \ as an escape
character. So \&nbsp will become &nbsp. The final results are
correct, but is there a better way to do this?

Input file test.txt
\HOME & \&nbsp; BORN \& FREE BORN FREE ' \' HELP " \" w\\\\\\\w

1st change
1a= \HOME &amp; \&nbsp; BORN \& FREE BORN FREE '' \' HELP &quot; \"
w\\\\\\\w
2nd changes
1b= HOME &amp; &nbsp; BORN & FREE BORN FREE '' ' HELP &quot; "
w\\\w

#!/usr/local/bin/perl5
#
%encode = ( '&' => '&amp;',
'"' => '&quot;',
'\'' => '\'\'' );

$data = `cat test.txt`;
print "Oa= $data\n";
$data =~ s/(?<!\\)(.)/defined($encode{$1})?$encode{$1}:$1/eg;
print "1a= $data\n";
$data =~ s/(\\)(.)/$2/g;
print "1b= $data\n";


Don't know about better, but this does it with one substitution, and
does not require escaping of HTML entities in the original text:

$data =~ s{(&#?\w+;)|\\(.)|([&"'])}
{ $1 ? $1 : $2 ? $2 : $encode{$3} }eg;

Another thing is that I'm a bit confused about the wider purpose with
the exercise...
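
For anyone who wants to try the single substitution against the sample line from the original post, here is a minimal sketch. The sub name `encode_html` is made up for this example, and the `defined` tests are a small change from the version above: with plain truth tests an escaped zero (`\0`) gives a false `$2` and falls through to `$encode{$3}`.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same mapping as in the original post.
my %encode = (
    '&'  => '&amp;',
    '"'  => '&quot;',
    '\'' => '\'\'',
);

# encode_html is a hypothetical helper name, not something from the thread.
sub encode_html {
    my ($data) = @_;
    # 1st alternative: an existing entity such as &nbsp; or &#169; is kept.
    # 2nd alternative: backslash plus one character yields that character.
    # 3rd alternative: a bare &, " or ' is looked up in %encode.
    $data =~ s{(&#?\w+;)|\\(.)|([&"'])}
              { defined($1) ? $1 : defined($2) ? $2 : $encode{$3} }eg;
    return $data;
}

# Quoted heredoc, so the backslashes reach the sub untouched.
my $input = <<'END';
\HOME & \&nbsp; BORN \& FREE BORN FREE ' \' HELP " \" w\\\\\\\w
END
chomp $input;

print encode_html($input), "\n";
# prints: HOME &amp; &nbsp; BORN & FREE BORN FREE '' ' HELP &quot; " w\\\w
```

Note that anything shaped like `&word;` is treated as an entity, so text such as `R&D;` would pass through unencoded.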

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
Jul 19 '05 #2
Ken Chesak wrote:
Perl script is formatting text for an HTML page. It changes things like
an & to &amp. But should not change &nbsp.


You've got bad or inconsistent input data.
Whatever process created the "&nbsp;" items is responsible for making
sure that all the other & occurrences are set to "&amp;". You should
fix the upstream process instead of doing post-processing.
-Joe
Jul 19 '05 #3
Gunnar,

Thanks, that works nicely. I had not thought of using the ";" to
anchor the html reserved words.

I had one question, what does the ? and : do on the following line,
{ $1 ? $1 : $2 ? $2 : $encode{$3} }eg;

The purpose of the script is to format the text for HTML. It was
originally changing every & to &amp;. So when they started putting
&nbsp; in, that was being changed to &amp;nbsp;, which does not mean
anything to HTML.
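
The double-encoding problem described here is easy to reproduce. A minimal sketch with a made-up sample string: the first substitution encodes every ampersand, the second uses a negative lookahead to skip ampersands that already start an entity. The lookahead variant is an assumption added for illustration, not something from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up sample text with one bare ampersand and two entities.
my $text = 'Tom & Jerry&nbsp;&#169; 2005';

# Naive pass: every ampersand is encoded, including those that start an entity.
(my $naive = $text) =~ s/&/&amp;/g;

# Lookahead pass: only encode an ampersand that is NOT followed by
# an optional #, word characters and a semicolon.
(my $careful = $text) =~ s/&(?!\#?\w+;)/&amp;/g;

print "$naive\n";    # Tom &amp; Jerry&amp;nbsp;&amp;#169; 2005
print "$careful\n";  # Tom &amp; Jerry&nbsp;&#169; 2005
```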

Thanks again,
Ken
Jul 19 '05 #4
Ken Chesak wrote:
I had one question, what does the ? and : do on the following line,
{ $1 ? $1 : $2 ? $2 : $encode{$3} }eg;


It's called the conditional operator, and is a shorter way of writing

if ($1) {
$1
} elsif ($2) {
$2
} else {
$encode{$3}
}

See "perldoc perlop".

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
Jul 19 '05 #5
Gunnar Hjalmarsson <no*****@gunnar.cc> wrote in message news:<91*********************@newsc.telia.net>...
Ken Chesak wrote:
I had one question, what does the ? and : do on the following line,
{ $1 ? $1 : $2 ? $2 : $encode{$3} }eg;


It's called the conditional operator, and is a shorter way of writing

if ($1) {
$1
} elsif ($2) {
$2
} else {
$encode{$3}
}


Or a longer way of writing...

$1 || $2 || $encode{$3}

...depending on your point of view.
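
The three spellings can be compared side by side. A minimal sketch, with hypothetical helper names (`pick_if`, `pick_ternary`, `pick_or`) standing in for `$1`, `$2` and `$encode{$3}`:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Each helper returns the first true argument, else the third one.
sub pick_if {
    my ($x, $y, $z) = @_;
    if    ($x) { return $x }
    elsif ($y) { return $y }
    else       { return $z }
}

# The conditional operator is right-associative: $x ? $x : ($y ? $y : $z)
sub pick_ternary { my ($x, $y, $z) = @_; return $x ? $x : $y ? $y : $z }

# || returns its left operand if true, otherwise evaluates the right one.
sub pick_or { my ($x, $y, $z) = @_; return $x || $y || $z }

for my $args ([ 'a', undef, 'c' ], [ undef, 'b', 'c' ], [ undef, undef, 'c' ]) {
    print join(' ', pick_if(@$args), pick_ternary(@$args), pick_or(@$args)), "\n";
}
# prints:
# a a a
# b b b
# c c c
```

All three forms test for truth rather than definedness, so a captured `'0'` or empty string falls through to the next choice in each of them.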
Jul 19 '05 #6
