reg ex expression - finding long character strings

"Garp" <ga***@no7.blueyonder.co.uk> wrote in message news:<_v*********************@news-text.cableinet.net>...
"lawrence" <lk******@geocities.com> wrote in message
news:da**************************@posting.google.com...
This reg ex will find me any block of strings 40 or more characters in
length without a white space, yes?

[^ ]{40}
To get it to include tabs and newlines, do I do this?

[^ \n\t]{40}


\s is the whitespace token, if that's easier for you.
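
A small sketch of the difference, assuming the PCRE functions (preg_*) rather than ereg_*; the sample string is made up for illustration. {40} matches exactly 40 characters, so on a longer run it stops after the first 40, while {40,} takes the whole run. \S is the shorthand for "any non-whitespace character".

```php
<?php
// Hypothetical sample: a 50-character run with ordinary words around it.
$text = 'short words ' . str_repeat('x', 50) . ' more words';

// {40} matches exactly 40 characters: only the first 40 of the run.
preg_match('/[^ \n\t]{40}/', $text, $exact);

// {40,} matches 40 or more: the whole 50-character run.
preg_match('/\S{40,}/', $text, $whole);

echo strlen($exact[0]), "\n"; // 40
echo strlen($whole[0]), "\n"; // 50
```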


Good, but now my question is how to insert the white space that I
want. If I do this:

$string = ereg_replace('[^\s]{40}', " ", $string);

Then the text gets obliterated and replaced by a white space. That is
not what I want. I simply want to break up long strings (mostly urls)
that threaten to destroy the format of a page. This is especially true
of Internet Explorer, which tends to expand DIV tags to fit the
contents (Netscape lets long urls burst outside the boundaries of the
DIV.)
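
One way to keep the text and only add a break is to capture the run and put it back through a backreference in the replacement. A minimal sketch, assuming preg_replace (the ereg_* functions were removed in PHP 7) and a made-up sample string; break_long_runs is a hypothetical helper name.

```php
<?php
// Hypothetical helper: insert a space after every $max consecutive
// non-whitespace characters, keeping the original text intact.
function break_long_runs($string, $max = 40) {
    // (\S{N}) captures N non-whitespace characters; (?=\S) only lets the
    // match succeed when more non-whitespace follows, so no space is
    // added at the end of a run.
    $rx = '/(\S{' . (int) $max . '})(?=\S)/';
    // '$1 ' puts the captured text back, followed by one space.
    return preg_replace($rx, '$1 ', $string);
}

// Made-up input: a 45-character run between two short words.
$string = 'see ' . str_repeat('a', 45) . ' here';
echo break_long_runs($string), "\n";
```

The run of 45 characters comes back as 40 characters, a space, and the remaining 5. Note that a URL broken this way is no longer a single token in the visible text.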

Go look at this page using IE 5 or 6:

http://www.publicdomainsoftware.org

You'll see a comment (right now it is the second one down) that looks
like this:
>>>>>>>>>>> Misty, I assume you're the one who came up with these interesting
photos of vegetables? Are they from the ARE garden?
http://www.publicdomainsoftware.org/...egetables2.JPG ...read
more>>>>>>>>>>>


That long url is distorting the whole page. I need to break it up.

I suppose I could hit the whole string with explode() and break them
on the white space and then loop through the array and test each entry
for a length of more than 30 or 40 or so, and then stitch it all back
together with implode, but I was assuming I could do it all more
elegantly with regular expressions. I don't know much about regular
expressions, but if someone does, please let me know.
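
The explode/implode route described above can be sketched as follows. This is only an illustration: it uses preg_split with PREG_SPLIT_DELIM_CAPTURE instead of explode() so that tabs and newlines survive, and split_long_words is a hypothetical name.

```php
<?php
// Hypothetical helper: break every whitespace-free token longer than
// $max characters into space-separated chunks of at most $max characters.
function split_long_words($text, $max = 40) {
    // Split on runs of whitespace but keep the separators in the result,
    // so the original spacing survives the round trip.
    $pieces = preg_split('/(\s+)/', $text, -1, PREG_SPLIT_DELIM_CAPTURE);
    foreach ($pieces as $i => $piece) {
        // Only touch long tokens; long whitespace runs are left alone.
        if (strlen($piece) > $max && !ctype_space($piece)) {
            $pieces[$i] = implode(' ', str_split($piece, $max));
        }
    }
    // Stitch tokens and separators back together.
    return implode('', $pieces);
}

// Made-up input: a 90-character token between a newline and a tab.
$text = "intro\n" . str_repeat('z', 90) . "\tend";
echo split_long_words($text), "\n";
```

strlen and str_split count bytes, so this sketch assumes single-byte text.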
Jul 17 '05 #1
lawrence wrote:
I suppose I could hit the whole string with explode() and break them
on the white space and then loop through the array and test each entry
for a length of more than 30 or 40 or so, and then stitch it all back
together with implode, but I was assuming I could do it all more
elegantly with regular expressions. I don't know much about regular
expressions, but if someone does, please let me know.


Try this. Change as you see fit.
<?php
function compress_url($txt, $size=40) {
    $rx = '=(http://\S{' . ($size-7) . ',})=e';
    $compressed_txt = preg_replace($rx,
        "'[<a class=\"compressed\" href=\"$1\" title=\"$1\">'
        . substr('$1', 0, $size-10)
        . '...'
        . substr('$1', -7)
        . '</a>]'",
        $txt);
    return $compressed_txt;
}
$txt = '
Misty, I assume you\'re the one who came up with these interesting
photos of vegetables? Are they from the ARE garden?
http://www.publicdomainsoftware.org/...egetables2.JPG ...read
more';

# # # # # # # # # # # # # # # # # # # # # # # #
#
# Remember to define a "compressed" class in your stylesheet
#
# # # # # # # # # # # # # # # # # # # # # # # #

echo compress_url($txt);
?>
Happy Coding :-)
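
The e modifier used above was deprecated in PHP 5.5 and removed in PHP 7, so the function as posted will not run on current PHP versions. A rough equivalent using preg_replace_callback might look like the sketch below. The name compress_url_cb, the htmlspecialchars escaping, and the sample URL are additions for illustration, not part of the original post.

```php
<?php
// Hypothetical rewrite of compress_url() for PHP 7+, without the e modifier.
function compress_url_cb($txt, $size = 40) {
    // "http://" is 7 characters, so the rest must be at least $size - 7.
    $rx = '=(http://\S{' . ($size - 7) . ',})=';
    return preg_replace_callback($rx, function ($m) use ($size) {
        // Escape the full URL for use inside the href and title attributes.
        $url = htmlspecialchars($m[1], ENT_QUOTES);
        // Visible label: first $size - 10 characters, "...", last 7.
        $label = htmlspecialchars(
            substr($m[1], 0, $size - 10) . '...' . substr($m[1], -7),
            ENT_QUOTES
        );
        return '[<a class="compressed" href="' . $url . '" title="' . $url
             . '">' . $label . '</a>]';
    }, $txt);
}

// Made-up URL, 50 characters long.
$url = 'http://example.com/' . str_repeat('a', 27) . '.jpg';
echo compress_url_cb("See $url for details"), "\n";
```

The callback receives the match array directly, so nothing is evaluated as code; that is the main reason the e modifier was dropped in favour of this form.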

--
USENET would be a better place if everybody read: : mail address :
http://www.catb.org/~esr/faqs/smart-questions.html : is valid for :
http://www.netmeister.org/news/learn2quote2.html : "text/plain" :
http://www.expita.com/nomime.html : to 10K bytes :
Jul 17 '05 #2
Pedro Graca <he****@hotpop.com> wrote in message news:<sl*******************@ID-203069.user.uni-berlin.de>...
$rx = '=(http://\S{' . ($size-7) . ',})=e';

That looks brilliant, though I have trouble reading it. When you write:

http://\S{' . ($size-7) . ',

are the dots saying "one or more of this white space"?
Jul 17 '05 #3
lawrence wrote:
That looks brilliant, though I have trouble reading it. When you write:

http://\S{' . ($size-7) . ',

are the dots saying "one or more of this white space"?


No. The dots are PHP's string concatenation operator; they are not part
of the regular expression.

If I want to find 40 or more non-whitespace characters in a regular
expression I do

\S{40,}

In the function, I made the length a parameter, so that should be

\S{$size,} *** DOES NOT WORK LIKE THIS!

but, for that specific function I'm already using "http://" (7 chars),
so, that part of the regexp is

\S{$size-7,} *** DOES NOT WORK LIKE THIS!

So, that $rx line concatenates these three strings:
=(http://\S{
$size - 7 *** the result of the subtraction
,})=e

giving, for $size=40

=(http://\S{33,})=e

so it will match http urls (and not https, ftp, mailto, ...) that are
at least 40 characters long.
HTH
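
To make the concatenation concrete, the pattern can be built and printed on its own. A small sketch: the e modifier is left off here (it is unavailable from PHP 7 on), and the test URLs are invented.

```php
<?php
$size = 40;

// Three pieces joined with the "." operator: two string literals around
// the result of the subtraction.
$rx = '=(http://\S{' . ($size - 7) . ',})=';
echo $rx, "\n"; // =(http://\S{33,})=

// Inside single quotes PHP does not expand variables, so this stays literal.
$literal = '\S{$size,}';
echo $literal, "\n"; // \S{$size,}

// Invented test URLs: 40 and 39 characters long.
$long  = 'http://' . str_repeat('a', 33);
$short = 'http://' . str_repeat('a', 32);
var_dump(preg_match($rx, $long));  // int(1)
var_dump(preg_match($rx, $short)); // int(0)
```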

Jul 17 '05 #4
Pedro Graca <he****@hotpop.com> wrote in message news:<sl*******************@ID-203069.user.uni-berlin.de>...
So, that $rx line concatenates these three strings:
=(http://\S{
$size - 7 *** the result of the subtraction
,})=e

giving, for $size=40

=(http://\S{33,})=e

so it will match http urls (and not https, ftp, mailto, ...) that are
at least 40 characters long.

So you can, so to speak, go in and out of "regex mode" by using a
single quote:

'

I assume this is simply the way PHP is built. And when I want to use
a real ' I suppose I would do this:

\'
Jul 17 '05 #5
lawrence wrote:

So you can, so to speak, go in and out of "regex mode" by using a
single quote:

'


No, not quite!
This is all standard string management:
http://www.php.net/manual/en/language.types.string.php

I prefer to use single quotes most of the time.

$x = 'abc';   // $x holds a three-character string
$x = $x . 14; // PHP automagically transforms the number 14 into a
              // two-character string; $x now holds a five-character
              // string
$x .= 'yz';   // add two more characters to $x,
              // making it "abc14yz" (without the quotes)

It's the exact same thing with the regexp above :)
Instead of it being constant, it is a /dynamic/ regexp.

I assume this is simply the way PHP is built. And when I want to use
a real ' I suppose I would do this:

\'


If it's inside single quotes, yes.
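
For reference, a short sketch of how the two quoting styles differ; the sample strings are arbitrary.

```php
<?php
$a = 'It\'s';   // \' gives a literal single quote inside single quotes
$b = "It's";    // no escaping needed inside double quotes
var_dump($a === $b); // bool(true)

// Other backslash sequences are left alone in single quotes, which is
// convenient for regular expressions.
echo strlen('\S\n'), "\n"; // 4: backslash, S, backslash, n
echo strlen("\n"), "\n";   // 1: a real newline character

// Variables are expanded only in double quotes.
$size = 40;
echo 'size is $size', "\n"; // size is $size
echo "size is $size", "\n"; // size is 40
```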

--
USENET would be a better place if everybody read: | to email me: use |
http://www.catb.org/~esr/faqs/smart-questions.html | my name in "To:" |
http://www.netmeister.org/news/learn2quote2.html | header, textonly |
http://www.expita.com/nomime.html | no attachments. |
Jul 17 '05 #6
