regex

Seb

Hi,

Does anyone have an idea how I can replace every character in a string that is not alphanumeric?

Something like eregi_replace, but I don't know how to say NOT in a regex.

Tnx

Jul 17 '05 #1

Allan Rydberg

http://ch.php.net/manual/en/function.preg-replace.php

$out = preg_replace("/^\W+/, "", $in);

Seb wrote:

Hi,

Does anyone have an idea how I can replace every character in a string
that is not alphanumeric?

Something like eregi_replace, but I don't know how to say NOT in a regex.

Tnx

Jul 17 '05 #2

Seb

Tnxx, that works for me.

Here is a reference for other people with the same problem:

Wildcard  Description
\d        Matches a digit (character class [0-9])
\D        Matches a non digit ([^0-9])
\w        Matches a word character ([a-zA-Z0-9_])
\W        Matches a non-word character ([^a-zA-Z0-9_])
\s        Matches a space character ([\t\n ])
\S        Matches a non-space character ([^\t\n ])
.         Matches any character
$         Matches "end of line" if placed at the end of a regular expression

"Allan Rydberg" <al****@southtech.net> wrote in message
news:c4**********@newshispeed.ch...

http://ch.php.net/manual/en/function.preg-replace.php

$out = preg_replace("/^\W+/, "", $in);

Seb wrote:

Hi,

Does anyone have an idea how I can replace every character in a string
that is not alphanumeric?

Something like eregi_replace, but I don't know how to say NOT in a regex.

Tnx

Jul 17 '05 #3

John Dunlop

Followup-to c.l.p. This is off-topic in two groups.

Seb wrote upsidedown:

[Allan Rydberg wrote upsidedown:]
Seb wrote:
Does anyone have an idea how I can replace every character in a string
that is not alphanumeric?

What do you mean? Will you recast the question, please, Seb?

$out = preg_replace("/^\W+/, "", $in);

(--------------------------^
A typo there!)

I can't fit the above pattern into any of my interpretations of Seb's
question.

preg_replace('_^\W+_','',$foo)

returns $foo with one or more non-"word" characters at the beginning
stripped off. If $foo were "-_-", the first hyphen would match and
get replaced by an empty string, but the underscore and second hyphen
would remain.
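
The difference can be checked directly. The following is only a minimal sketch, assuming "alphanumeric" means the ASCII class [a-zA-Z0-9]; the variable names and the sample strings are illustrative and not taken from the thread:

```php
<?php
// Anchored pattern from the thread: it strips only a leading run of
// non-"word" characters, and the underscore counts as a "word" character.
$foo = '-_-';
$anchored = preg_replace('/^\W+/', '', $foo);

// Negated character class: strips every character, anywhere in the
// string, that is not an ASCII letter or decimal digit.
$in = 'Hello, World_42!';
$stripped = preg_replace('/[^a-zA-Z0-9]/', '', $in);

var_dump($anchored);  // the first hyphen is gone, "_-" remains
var_dump($stripped);  // only letters and digits remain
```

Inside square brackets a leading ^ negates the class, which is one way of saying NOT in a regex; outside brackets the same character anchors the match to the start of the subject.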

Tnxx, that works for me.

Really? You used an atypical definition of "alphanumeric" then.
Despite Merriam-Webster Online's definition allowing punctuation
marks -- the inclusion of underscores is described as perverse by
FOLDOC -- alphanumerics are usually represented by the character
class [a-zA-Z0-9]. M-W gives the etymology of "alphanumeric" as
"/alpha/bet/ic/ + /numeric/", i.e., it derived from "alphabet" and
"numeric". The Manual's pattern syntax guide, however, doesn't
include underscores in its implicit definition of "alphanumeric".
(Is there an explicit definition, anywhere in the Manual?) Cf. the
character type functions,

http://www.php.net/manual/en/ref.ctype.php
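
A short sketch of how the character type functions treat the underscore, compared with \w. It assumes the ctype extension is available and the default "C" locale; the sample strings are illustrative:

```php
<?php
// ctype_alnum() returns true only if every character is a letter or a digit.
$plain = ctype_alnum('abc123');
$withUnderscore = ctype_alnum('abc_123');

// \w, by contrast, also accepts the underscore.
$wordMatch = preg_match('/\A\w+\z/', 'abc_123');

var_dump($plain);           // letters and digits only
var_dump($withUnderscore);  // the underscore is not alphanumeric here
var_dump($wordMatch);       // \w+ matches the whole string
```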

Here is a reference for other people with the same problem:

I reckon a better reference is the Manual, don't you?

http://www.php.net/manual/en/pcre.pattern.syntax.php

\d        Matches a digit (character class [0-9])
\D        Matches a non digit ([^0-9])

Although your character classes are correct and clarify your
definition, it'd be less ambiguous to state that \d matches *decimal*
digits, not just digits, and that \D matches any character that isn't
a *decimal* digit. \d does not match all hexadecimal digits, for
example.

\w        Matches a word character ([a-zA-Z0-9_])
\W        Matches a non-word character ([^a-zA-Z0-9_])

Your character classes are misleading.

| A "word" character is any letter or digit or the underscore
| character, that is, any character which can be part of a Perl
| "word". The definition of letters and digits is controlled by
| PCRE's character tables, and may vary if locale-specific matching
| is taking place. [ ... ]

http://www.php.net/manual/en/pcre.pattern.syntax.php

\s        Matches a space character ([\t\n ])
\S        Matches a non-space character ([^\t\n ])

Your character classes are incorrect and out of sync with your
natural language descriptions, which are also incorrect. The generic
character type \s matches "whitespace" characters, not just the space
character; \S matches any non-"whitespace" character. According to
the Manual, the characters \s matches are, by default, normally:
"space, formfeed, newline, carriage return, horizontal tab, and
vertical tab". The "space" in the above definition covers non-
breaking spaces and spaces, I think.
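
As a rough check, the sketch below compares \s with the class [\t\n ] on a few whitespace characters. It is only an illustration: the array keys are made-up labels, and the vertical tab and non-breaking space are deliberately left out because their handling is not settled here:

```php
<?php
// Double-quoted strings let PHP expand the escape sequences.
$samples = [
    'space'           => ' ',
    'tab'             => "\t",
    'newline'         => "\n",
    'carriage return' => "\r",
    'formfeed'        => "\f",
];

$results = [];
foreach ($samples as $name => $char) {
    $results[$name] = [
        'generic' => preg_match('/\A\s\z/', $char),        // the \s type
        'class'   => preg_match('/\A[\t\n ]\z/', $char),   // the narrower class
    ];
}

print_r($results);
```

The carriage return and the formfeed are matched by \s but not by [\t\n ], which is where the two definitions part ways.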

.         Matches any character

... excluding newlines by default.

| Outside a character class, a dot in the pattern matches any one
| character in the subject, including a non-printing character, but
| not (by default) newline. If the PCRE_DOTALL option is set, then
| dots match newlines as well. [ ... ] Dot has no special meaning in
| a character class.

http://www.php.net/manual/en/pcre.pattern.syntax.php
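
The default and the PCRE_DOTALL behaviour can be compared with a short sketch. It relies on PHP's s pattern modifier, which sets PCRE_DOTALL; the subject string is illustrative:

```php
<?php
$subject = "a\nb";

// By default the dot does not match a newline.
$default = preg_match('/a.b/', $subject);

// With the s modifier (PCRE_DOTALL) the dot matches newlines as well.
$dotall = preg_match('/a.b/s', $subject);

var_dump($default);  // no match
var_dump($dotall);   // match
```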

$         Matches "end of line" if placed at the end of a regular expression

While that may sometimes be true, it doesn't tell the whole story.
The $ isn't a "wildcard" or generic character type metacharacter.

| A dollar character is an assertion which is TRUE only if the
| current matching point is at the end of the subject string, or
| immediately before a newline character that is the last character
| in the string (by default). Dollar need not be the last character
| of the pattern if a number of alternatives are involved, but it
| should be the last item in any branch in which it appears.
|
| [ ... ] The meaning of dollar can be changed so that it matches
| only at the very end of the string, by setting the
| PCRE_DOLLAR_ENDONLY option at compile or matching time.

http://www.php.net/manual/en/pcre.pattern.syntax.php
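
A small sketch of the dollar assertion under different options. It relies on PHP's D modifier (PCRE_DOLLAR_ENDONLY) and m modifier (PCRE_MULTILINE); the subject strings are illustrative:

```php
<?php
// Default: $ matches at the very end, or just before a final newline.
$beforeFinalNewline = preg_match('/foo$/', "foo\n");

// D modifier: $ matches only at the very end of the subject.
$endOnly = preg_match('/foo$/D', "foo\n");

// Default: a newline in the middle of the subject is not an "end of line".
$midString = preg_match('/foo$/', "foo\nbar");

// m modifier: $ also matches before every newline inside the subject.
$multiline = preg_match('/foo$/m', "foo\nbar");

var_dump($beforeFinalNewline, $endOnly, $midString, $multiline);
```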

HTH.

--
Jock

Jul 17 '05 #4

Ian.H

On Tue, 30 Mar 2004 17:26:23 +0100, John Dunlop wrote:
[ snip ]

Here is a reference for other people with the same problem:

I reckon a better reference is the Manual, don't you?

http://www.php.net/manual/en/pcre.pattern.syntax.php

[ snip ]

And to complement John's great response: Regex Coach may be of
interest to help learn and understand regular expressions too. It is by
no means aimed only at beginners; I use it pretty regularly to
help build regex patterns more quickly for Postfix filtering as well as
coding.

Download / official site available at:
<http://www.weitz.de/regex-coach/>

Regards,

Ian

--
Ian.H
digiServ Network
London, UK
http://digiserv.net/

Jul 17 '05 #5
