
regex

Seb
Hi,

Does anyone have an idea how I can replace every character in a string that is not alphanumeric?

Something like eregi_replace, but I don't know how to say NOT in a regex.

Thanks
Jul 17 '05 #1
4 Replies



http://ch.php.net/manual/en/function.preg-replace.php

$out = preg_replace("/^\W+/, "", $in);

Seb wrote:
> Hi,
>
> Does anyone have an idea how I can replace every character in a string that
> is not alphanumeric?
>
> Something like eregi_replace, but I don't know how to say NOT in a regex.
>
> Thanks


Jul 17 '05 #2

Seb
Thanks, that works for me.

Here is a reference for other people with the same problem:

Wildcard  Description
\d        Matches a digit (character class [0-9])
\D        Matches a non digit ([^0-9])
\w        Matches a word character ([a-zA-Z0-9_])
\W        Matches a non-word character ([^a-zA-Z0-9_])
\s        Matches a space character ([\t\n ])
\S        Matches a non-space character ([^\t\n ])
.         Matches any character
$         Matches "end of line" if placed at the end of a regular expression

"Allan Rydberg" <al****@southtech.net> wrote in message
news:c4**********@newshispeed.ch...

> http://ch.php.net/manual/en/function.preg-replace.php
>
> $out = preg_replace("/^\W+/, "", $in);
>
> Seb wrote:
> > Hi,
> >
> > Does anyone have an idea how I can replace every character in a string that
> > is not alphanumeric?
> >
> > Something like eregi_replace, but I don't know how to say NOT in a regex.
> >
> > Thanks

Jul 17 '05 #3

Followup-to c.l.p. This is off-topic in two groups.

Seb wrote upsidedown:
> [Allan Rydberg wrote upsidedown:]
> > Seb wrote:
> > > Does anyone have an idea how I can replace every character in a string that
> > > is not alphanumeric?

What do you mean? Will you recast the question, please, Seb?

> > $out = preg_replace("/^\W+/, "", $in);

(-----------------------------^
A typo there!)

I can't fit the above pattern into any of my interpretations of Seb's
question.

preg_replace('_^\W+_','',$foo)

returns $foo with one or more non-"word" characters at the beginning
stripped off. If $foo were "-_-", the first hyphen would match and
get replaced by an empty string, but the underscore and second hyphen
would remain.
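
To make the difference concrete, here is a minimal sketch. The sample strings are invented for illustration, and reading "not alphanumeric" as "anything outside [a-zA-Z0-9]" is only one possible interpretation of the question:

```php
<?php
// Anchored pattern from above: only a leading run of non-"word"
// characters is removed. The underscore is a "word" character,
// so matching stops in front of it.
$foo = "-_-";
$anchored = preg_replace('_^\W+_', '', $foo);

// Negated character class without an anchor: every character that
// is not an ASCII letter or decimal digit is removed, wherever it occurs.
$in = "Hello, World_2004!";
$stripped = preg_replace('/[^a-zA-Z0-9]/', '', $in);

var_dump($anchored); // string(2) "_-"
var_dump($stripped); // string(14) "HelloWorld2004"
```

If underscores should survive, replacing the class with \W (unanchored) would be the closer fit.
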
> Thanks, that works for me.

Really? You used an atypical definition of "alphanumeric" then.
Despite Merriam-Webster Online's definition allowing punctuation
marks -- the inclusion of underscores is described as perverse by
FOLDOC -- alphanumerics are usually represented by the character
class [a-zA-Z0-9]. M-W gives the etymology of "alphanumeric" as
"/alpha/bet/ic/ + /numeric/", i.e., it derived from "alphabet" and
"numeric". The Manual's pattern syntax guide, however, doesn't
include underscores in its implicit definition of "alphanumeric".
(Is there an explicit definition anywhere in the Manual?) Cf. the
character type functions,

http://www.php.net/manual/en/ref.ctype.php
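
As a rough illustration of how the ctype notion of "alphanumeric" differs from the regex "word" character, assuming the default "C" locale and made-up sample strings:

```php
<?php
// ctype_alnum() is true only if every character is a letter or a
// decimal digit; the underscore is neither.
$plain      = ctype_alnum("abc123");
$underscore = ctype_alnum("abc_123");

// \w, by contrast, accepts the underscore as a "word" character.
$word = preg_match('/^\w+$/', "abc_123");

var_dump($plain);      // bool(true)
var_dump($underscore); // bool(false)
var_dump($word);       // int(1)
```
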
> Here is a reference for other people with same probs :

I reckon a better reference is the Manual, don't you?

http://www.php.net/manual/en/pcre.pattern.syntax.php

> \d  Matches a digit (character class [0-9])
> \D  Matches a non digit ([^0-9])

Although your character classes are correct and clarify your
definition, it'd be less ambiguous to state that \d matches *decimal*
digits, not just digits, and that \D matches any character that isn't
a *decimal* digit. \d does not match all hexadecimal digits, for
example.
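
A small sketch of that point, with hypothetical input strings. The explicit class [0-9a-fA-F] and ctype_xdigit() are shown only as two possible ways to test for hexadecimal digits:

```php
<?php
// \d covers the decimal digits 0-9 only.
$decimal = preg_match('/^\d+$/', "2004");

// "ff" is made of hexadecimal digits, but \d does not match them.
$hexWithD = preg_match('/^\d+$/', "ff");

// An explicit class, or ctype_xdigit(), covers the hexadecimal case.
$hexWithClass = preg_match('/^[0-9a-fA-F]+$/', "ff");
$hexWithCtype = ctype_xdigit("ff");

var_dump($decimal);      // int(1)
var_dump($hexWithD);     // int(0)
var_dump($hexWithClass); // int(1)
var_dump($hexWithCtype); // bool(true)
```
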
> \w  Matches a word character ([a-zA-Z0-9_])
> \W  Matches a non-word character ([^a-zA-Z0-9_])

Your character classes are misleading.

| A "word" character is any letter or digit or the underscore
| character, that is, any character which can be part of a Perl
| "word". The definition of letters and digits is controlled by
| PCRE's character tables, and may vary if locale-specific matching
| is taking place. [ ... ]

http://www.php.net/manual/en/pcre.pattern.syntax.php
> \s  Matches a space character ([\t\n ])
> \S  Matches a non-space character ([^\t\n ])

Your character classes are incorrect and out of sync with your
natural language descriptions, which are also incorrect. The generic
character type \s matches "whitespace" characters, not just the space
character; \S matches any non-"whitespace" character. According to
the Manual, the characters \s matches are, by default, normally:
"space, formfeed, newline, carriage return, horizontal tab, and
vertical tab". The "space" in the above definition covers
non-breaking spaces and spaces, I think.
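
A minimal sketch of \s versus a literal space, using an invented subject string. Vertical tabs and non-breaking spaces are deliberately left out, since their treatment may depend on the PCRE version and locale:

```php
<?php
// Subject contains a tab, a newline, a space, a carriage return
// and a form feed between the letters.
$in = "a\t\n b\r\fc";

// \s+ collapses each run of whitespace into a single space.
$collapsed = preg_replace('/\s+/', ' ', $in);

// A literal space in the pattern touches only the space character.
$onlySpace = preg_replace('/ +/', '_', $in);

var_dump($collapsed);                     // string(5) "a b c"
var_dump($onlySpace === "a\t\n_b\r\fc");  // bool(true)
```
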
> .  Matches any character

... excluding newlines by default.

| Outside a character class, a dot in the pattern matches any one
| character in the subject, including a non-printing character, but
| not (by default) newline. If the PCRE_DOTALL option is set, then
| dots match newlines as well. [ ... ] Dot has no special meaning in
| a character class.

http://www.php.net/manual/en/pcre.pattern.syntax.php
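
In PHP the PCRE_DOTALL option is set with the `s` pattern modifier. A short sketch with a made-up subject:

```php
<?php
$subject = "a\nb";

// By default the dot does not match the newline between "a" and "b".
$default = preg_match('/a.b/', $subject);

// With the s modifier (PCRE_DOTALL) the dot matches the newline too.
$dotall = preg_match('/a.b/s', $subject);

var_dump($default); // int(0)
var_dump($dotall);  // int(1)
```
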
> $  Matches "end of line" if placed at the end of a regular expression

While that may sometimes be true, it doesn't tell the whole story.
The $ isn't a "wildcard" or generic character type metacharacter.

| A dollar character is an assertion which is TRUE only if the
| current matching point is at the end of the subject string, or
| immediately before a newline character that is the last character
| in the string (by default). Dollar need not be the last character
| of the pattern if a number of alternatives are involved, but it
| should be the last item in any branch in which it appears.
|
| [ ... ] The meaning of dollar can be changed so that it matches
| only at the very end of the string, by setting the
| PCRE_DOLLAR_ENDONLY option at compile or matching time.

http://www.php.net/manual/en/pcre.pattern.syntax.php
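
The behaviour described in the quoted passage can be sketched as follows. The subjects are invented; `D` is PHP's modifier for PCRE_DOLLAR_ENDONLY, and `m` (PCRE_MULTILINE) is added only for contrast and is not discussed above:

```php
<?php
// Default: $ also matches just before a newline that ends the subject.
$beforeFinalNewline = preg_match('/foo$/', "foo\n");

// D modifier: $ matches only at the very end of the subject.
$endOnly = preg_match('/foo$/D', "foo\n");

// Default: a newline in the middle of the subject is not an "end".
$notAtEnd = preg_match('/foo$/', "foo\nbar");

// m modifier: $ matches before every newline, i.e. at each line end.
$multiline = preg_match('/foo$/m', "foo\nbar");

var_dump($beforeFinalNewline); // int(1)
var_dump($endOnly);            // int(0)
var_dump($notAtEnd);           // int(0)
var_dump($multiline);          // int(1)
```
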

HTH.

--
Jock
Jul 17 '05 #4

On Tue, 30 Mar 2004 17:26:23 +0100, John Dunlop wrote:

[ snip ]

> > Here is a reference for other people with same probs :
>
> I reckon a better reference is the Manual, don't you?
>
> http://www.php.net/manual/en/pcre.pattern.syntax.php

[ snip ]

And to complement John's great response: Regex Coach may be of
interest to help learn and understand regular expressions too. It is by
no means aimed only at beginners. I use it pretty regularly to
help build regex patterns more quickly for Postfix filtering as well as
coding.

Download / official site available at:
<http://www.weitz.de/regex-coach/>

Regards,

Ian

--
Ian.H
digiServ Network
London, UK
http://digiserv.net/

Jul 17 '05 #5
