regex

Seb
Hi,

Does anyone have an idea how I can replace every character in a string if it is not alphanumeric?

Something like eregi_replace, but I don't know how to say NOT in a regex.

Thanks
Jul 17 '05 #1

http://ch.php.net/manual/en/function.preg-replace.php

$out = preg_replace("/^\W+/, "", $in);

Seb wrote:
> Hi,
>
> Does anyone have an idea how I can replace every character in a string if it
> is not alphanumeric?
>
> Something like eregi_replace, but I don't know how to say NOT in a regex.
>
> Thanks


Jul 17 '05 #2
Seb
Thanks, that works for me.

Here is a reference for other people with the same problem:

Wildcard  Description
\d        Matches a digit (character class [0-9])
\D        Matches a non digit ([^0-9])
\w        Matches a word character ([a-zA-Z0-9_])
\W        Matches a non-word character ([^a-zA-Z0-9_])
\s        Matches a space character ([\t\n ])
\S        Matches a non-space character ([^\t\n ])
.         Matches any character
$         Matches "end of line" if placed at the end of a regular expression

"Allan Rydberg" <al****@southtech.net> wrote in message
news:c4**********@newshispeed.ch...
> http://ch.php.net/manual/en/function.preg-replace.php
>
> $out = preg_replace("/^\W+/, "", $in);
>
> Seb wrote:
>> Hi,
>>
>> Does anyone have an idea how I can replace every character in a string if it
>> is not alphanumeric?
>>
>> Something like eregi_replace, but I don't know how to say NOT in a regex.
>>
>> Thanks

Jul 17 '05 #3
Followup-to c.l.p. This is off-topic in two groups.

Seb wrote upsidedown:
> [Allan Rydberg wrote upsidedown:]
>> Seb wrote:
>>> Does anyone have an idea how I can replace every character in a string if it
>>> is not alphanumeric?

What do you mean? Will you recast the question, please, Seb?

>> $out = preg_replace("/^\W+/, "", $in);

(-----------------------------^
A typo there!)

I can't fit the above pattern into any of my interpretations of Seb's
question.

preg_replace('_^\W+_','',$foo)

returns $foo with one or more non-"word" characters at the beginning
stripped off. If $foo were "-_-", the first hyphen would match and
get replaced by an empty string, but the underscore and second hyphen
would remain.
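
To make the difference concrete, here is a minimal sketch. The sample strings and variable names are illustrative assumptions, and [^a-zA-Z0-9] is only one reading of "not alphanumeric" (ASCII letters and digits), not necessarily the one Seb had in mind.

```php
<?php
// Hypothetical inputs, chosen to match the "-_-" case discussed above.
$foo = '-_-';
$mixed = 'a-b_c d!1';

// Anchored pattern: strips only a leading run of non-"word" characters.
// The underscore counts as a "word" character, so matching stops there.
$anchored = preg_replace('/^\W+/', '', $foo);

// Negated character class without an anchor: strips every character
// that is not an ASCII letter or digit, wherever it occurs.
$strict = preg_replace('/[^a-zA-Z0-9]/', '', $mixed);

echo $anchored, "\n"; // _-
echo $strict, "\n";   // abcd1
```

The second pattern drops the ^ anchor, so the replacement applies throughout the string and not just at its start.
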

> Thanks, that works for me.

Really? You used an atypical definition of "alphanumeric" then.
Despite Merriam-Webster Online's definition allowing punctuation
marks -- the inclusion of underscores is described as perverse by
FOLDOC -- alphanumerics are usually represented by the character
class [a-zA-Z0-9]. M-W gives the etymology of "alphanumeric" as
"/alpha/bet/ic/ + /numeric/", i.e., it derived from "alphabet" and
"numeric". The Manual's pattern syntax guide, however, doesn't
include underscores in its implicit definition of "alphanumeric".
(Is there an explicit definition, anywhere in the Manual?) Cf. the
character type functions,

http://www.php.net/manual/en/ref.ctype.php
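
A small sketch of how these notions of "alphanumeric" differ on an underscore. The sample string is hypothetical, and ctype_alnum() is assumed to be available (the ctype extension is enabled) and running under the default "C" locale.

```php
<?php
// Hypothetical sample containing an underscore.
$candidate = 'foo_bar1';

// \w accepts letters, digits and the underscore.
$isWord = preg_match('/^\w+$/', $candidate) === 1;

// The explicit class accepts ASCII letters and digits only.
$isAlnumClass = preg_match('/^[a-zA-Z0-9]+$/', $candidate) === 1;

// The character type function also rejects the underscore.
$isAlnumCtype = ctype_alnum($candidate);

var_dump($isWord);       // bool(true)
var_dump($isAlnumClass); // bool(false)
var_dump($isAlnumCtype); // bool(false)
```
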

> Here is a reference for other people with the same problem:

I reckon a better reference is the Manual, don't you?

http://www.php.net/manual/en/pcre.pattern.syntax.php

> \d  Matches a digit (character class [0-9])
> \D  Matches a non digit ([^0-9])

Although your character classes are correct and clarify your
definition, it'd be less ambiguous to state that \d matches *decimal*
digits, not just digits, and that \D matches any character that isn't
a *decimal* digit. \d does not match all hexadecimal digits, for
example.
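
As a rough illustration of the decimal-versus-hexadecimal point, the sketch below tests a few hypothetical strings against \d and against an explicit hexadecimal class. The array names are made up for the example.

```php
<?php
// Hypothetical samples: one decimal number, two hexadecimal ones.
$samples = ['2004', 'ff', '1A'];

$isDecimal = [];
$isHex = [];
foreach ($samples as $s) {
    // \d matches decimal digits only.
    $isDecimal[$s] = preg_match('/^\d+$/', $s) === 1;
    // Hexadecimal digits need an explicit class.
    $isHex[$s] = preg_match('/^[0-9a-fA-F]+$/', $s) === 1;
}

var_dump($isDecimal); // only "2004" is true
var_dump($isHex);     // all three are true
```
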

> \w  Matches a word character ([a-zA-Z0-9_])
> \W  Matches a non-word character ([^a-zA-Z0-9_])

Your character classes are misleading.

| A "word" character is any letter or digit or the underscore
| character, that is, any character which can be part of a Perl
| "word". The definition of letters and digits is controlled by
| PCRE's character tables, and may vary if locale-specific matching
| is taking place. [ ... ]

http://www.php.net/manual/en/pcre.pattern.syntax.php

> \s  Matches a space character ([\t\n ])
> \S  Matches a non-space character ([^\t\n ])

Your character classes are incorrect and out of sync with your
natural language descriptions, which are also incorrect. The generic
character type \s matches "whitespace" characters, not just the space
character; \S matches any non-"whitespace" character. According to
the Manual, the characters \s matches are, by default, normally:
"space, formfeed, newline, carriage return, horizontal tab, and
vertical tab". The "space" in the above definition covers non-
breaking spaces and spaces, I think.
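
The gap between [\t\n ] and \s can be checked directly. This is only a sketch: it samples five of the characters listed above (vertical tab and non-breaking space are deliberately left out, since their handling is less certain), and the array names are hypothetical.

```php
<?php
// Five of the whitespace characters named in the Manual's list.
$chars = [
    'space'           => ' ',
    'tab'             => "\t",
    'newline'         => "\n",
    'carriage return' => "\r",
    'formfeed'        => "\f",
];

$narrow = [];
$generic = [];
foreach ($chars as $name => $c) {
    // The class from the quoted table: tab, newline and space only.
    $narrow[$name] = preg_match('/[\t\n ]/', $c) === 1;
    // The generic character type \s.
    $generic[$name] = preg_match('/\s/', $c) === 1;
}

var_dump($narrow);  // carriage return and formfeed are false
var_dump($generic); // all five are true
```
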

> .   Matches any character

... excluding newlines by default.

| Outside a character class, a dot in the pattern matches any one
| character in the subject, including a non-printing character, but
| not (by default) newline. If the PCRE_DOTALL option is set, then
| dots match newlines as well. [ ... ] Dot has no special meaning in
| a character class.

http://www.php.net/manual/en/pcre.pattern.syntax.php
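
In PHP the PCRE_DOTALL option corresponds to the "s" pattern modifier. A minimal sketch with hypothetical subject strings:

```php
<?php
// Hypothetical subject with a newline between the two letters.
$subject = "a\nb";

// Default: the dot does not match the newline.
$default = preg_match('/a.b/', $subject);

// With the "s" modifier (PCRE_DOTALL) the dot matches the newline too.
$dotall = preg_match('/a.b/s', $subject);

// Inside a character class the dot is a literal dot.
$literalDot = preg_match('/a[.]b/', 'a.b');
$notWildcard = preg_match('/a[.]b/', 'axb');

var_dump($default, $dotall, $literalDot, $notWildcard); // int(0) int(1) int(1) int(0)
```
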

> $   Matches "end of line" if placed at the end of a regular expression

While that may sometimes be true, it doesn't tell the whole story.
The $ isn't a "wildcard" or generic character type metacharacter.

| A dollar character is an assertion which is TRUE only if the
| current matching point is at the end of the subject string, or
| immediately before a newline character that is the last character
| in the string (by default). Dollar need not be the last character
| of the pattern if a number of alternatives are involved, but it
| should be the last item in any branch in which it appears.
|
| [ ... ] The meaning of dollar can be changed so that it matches
| only at the very end of the string, by setting the
| PCRE_DOLLAR_ENDONLY option at compile or matching time.

http://www.php.net/manual/en/pcre.pattern.syntax.php
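
In PHP the PCRE_DOLLAR_ENDONLY option corresponds to the "D" pattern modifier. The following sketch, using a hypothetical subject with a trailing newline, shows the default behaviour and the effect of that modifier.

```php
<?php
// Hypothetical subject ending in a newline.
$subject = "foo\n";

// Default: $ also matches immediately before a final newline.
$default = preg_match('/foo$/', $subject);

// With "D" (PCRE_DOLLAR_ENDONLY), $ matches only at the very end.
$endOnly = preg_match('/foo$/D', $subject);

// Without the trailing newline both forms match.
$exact = preg_match('/foo$/D', 'foo');

var_dump($default, $endOnly, $exact); // int(1) int(0) int(1)
```

This matters when validating input: without "D", a pattern ending in $ accepts a string that carries one trailing newline.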

HTH.

--
Jock
Jul 17 '05 #4
On Tue, 30 Mar 2004 17:26:23 +0100, John Dunlop wrote:

[ snip ]

>> Here is a reference for other people with the same problem:
>
> I reckon a better reference is the Manual, don't you?
>
> http://www.php.net/manual/en/pcre.pattern.syntax.php

[ snip ]

And to complement John's great response: Regex Coach may be of
interest to help learn and understand regular expressions too. It is by
no means aimed only at beginners; I use it pretty regularly to
help build regex patterns more quickly for Postfix filtering as well as
coding.

Download / official site available at:
<http://www.weitz.de/regex-coach/>

Regards,

Ian

--
Ian.H
digiServ Network
London, UK
http://digiserv.net/

Jul 17 '05 #5
