Hi,
this should test you gurus. I want a function that accepts text as an
argument and converts all & into &amp; except where it is already an HTML
entity such as &nbsp;, &quot;, and of course &amp;.
If there is already a PHP function for this I would like to know, but
if not, what is the grep equivalent?
Thanks
*** KhanyBoy wrote/escribió (10 Jun 2004 14:22:24 -0700): this should test you gurus. I want a function that accepts text as an argument and converts all & into &amp; except where it is already an HTML entity such as &nbsp;, &quot;, and of course &amp;.
I can't figure out how you manage to get such garbled input. Anyway, I
guess a combination of html_entity_decode() and htmlentities() should
help.
--
--
-- Álvaro G. Vicario - Burgos, Spain
--
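For reference, the decode-then-encode combination suggested above might look roughly like the sketch below. The function names are made up for illustration, and UTF-8 input is an assumption. The second function relies on the double_encode argument of htmlspecialchars(), which only arrived in PHP 5.2.3, after this thread was written, so treat it as a later alternative rather than what was meant here.

```php
<?php
// Hypothetical helper: decode whatever entities are present, then encode
// everything again. Note that this also escapes <, > and quotes, so it is
// only suitable for plain text, not for strings that contain markup.
function reencode_entities(string $text): string
{
    $decoded = html_entity_decode($text, ENT_QUOTES | ENT_HTML401, 'UTF-8');
    return htmlentities($decoded, ENT_QUOTES | ENT_HTML401, 'UTF-8');
}

// Hypothetical helper: let PHP skip entities that are already there.
// The fourth argument (double_encode = false) requires PHP 5.2.3 or later.
// ENT_NOQUOTES leaves quotes alone, but < and > are still escaped.
function escape_without_double_encoding(string $text): string
{
    return htmlspecialchars($text, ENT_NOQUOTES, 'UTF-8', false);
}

echo reencode_entities('Tom & Jerry &amp; caf&eacute;'), "\n";
echo escape_without_double_encoding('Tom & Jerry &amp; co &nbsp;'), "\n";
```

Both approaches touch more than the ampersand, which may or may not be acceptable depending on where the text ends up.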
KhanyBoy wrote: this should test you gurus. I want a function that accepts text as an argument and converts all & into &amp; except where it is already an HTML entity such as &nbsp;, &quot;, and of course &amp;.
If there is already a PHP function for this I would like to know, but if not, what is the grep equivalent?
OK, this does that, but it may not be a very elegant solution. I recently
needed the same functionality for a project involving oscommerce.
function ampersandFix($x){
    // Pass 1: escape every ampersand, including those in existing entities.
    $x=str_replace('&','&amp;',$x);
    // Pass 2: undo the escaping where the ampersand started a known entity.
    $pattern='`&amp;(#[0-9]{2,3}|aacute|acirc|acute|aelig|agrave|amp'.
        '|aring|atilde|auml|brvbar|brkbar|ccedil|cedil|cent'.
        '|copy|curren|deg|divide|eacute|ecirc|egrave|eth|euml'.
        '|frac12|frac14|frac34|gt|iacute|icirc|iexcl|igrave'.
        '|iquest|iuml|laquo|lt|macr|hibar|micro|middot|nbsp|not'.
        '|ntilde|oacute|ocirc|ograve|ordf|ordm|oslash|otilde'.
        '|ouml|para|plusmn|pound|quot|raquo|reg|sect|shy|sup1|sup2'.
        '|sup3|szlig|thorn|times|uacute|ucirc|ugrave|uml'.
        '|die|uuml|yacute|yen|yuml);`i';
    $replace='&$1;';
    return preg_replace($pattern,$replace,$x);
}
--
Justin Koivisto - sp**@koivi.com
PHP POSTERS: Please use comp.lang.php for PHP related questions,
alt.php* groups are not recommended.
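The escape-everything-then-restore idea in the function above can be seen more easily with a short entity list. This is only a sketch: the function name and the default list of names are made up for illustration, and a real list would need to be as long as the one in the post.

```php
<?php
// Hypothetical, shortened variant of the two-pass approach.
function two_pass_fix(string $text, array $names = ['amp', 'lt', 'gt', 'quot', 'nbsp']): string
{
    // Pass 1: escape every ampersand, so "&nbsp;" becomes "&amp;nbsp;".
    $escaped = str_replace('&', '&amp;', $text);

    // Pass 2: where "&amp;" is followed by a known name (or a decimal
    // reference) and a semicolon, put the single ampersand back.
    $pattern = '/&amp;(#[0-9]+|' . implode('|', array_map('preg_quote', $names)) . ');/';
    return preg_replace($pattern, '&$1;', $escaped);
}

echo two_pass_fix('Tom & Jerry &amp; co &nbsp; &#169; &bogus;'), "\n";
// prints: Tom &amp; Jerry &amp; co &nbsp; &#169; &amp;bogus;
```

Names that are not in the list, such as the invented &bogus; here, stay escaped, which is the accuracy this approach buys at the cost of two passes over the string.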
Justin Koivisto wrote: OK, this does that, but it may not be a very elegant solution. I recently needed the same functionality for a project involving oscommerce.
A lot faster, but not as accurate:
$text = preg_replace('!&(?![#a-z0-9]{1,7};)!i','&amp;',$text);
You could also use this method (it's a lookahead assertion) with
Justin's function, which will still be a lot faster than his hack ;)
Greetings Christian.
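To make the trade-off concrete, here is a small sketch of how the lookahead one-liner behaves. The wrapper function name is made up, and the replacement string is assumed to be '&amp;'.

```php
<?php
// Hypothetical wrapper around the one-liner above.
function escape_bare_ampersands(string $text): string
{
    // (?! ... ) is a negative lookahead: the "&" is matched only when it is
    // NOT followed by 1 to 7 characters from [#a-z0-9] and a semicolon.
    return preg_replace('!&(?![#a-z0-9]{1,7};)!i', '&amp;', $text);
}

echo escape_bare_ampersands('Tom & Jerry &amp; co &nbsp;'), "\n";
// prints: Tom &amp; Jerry &amp; co &nbsp;

echo escape_bare_ampersands('&bogus; is left alone'), "\n";
// prints: &bogus; is left alone
```

The second call shows the "not as accurate" part: anything that merely looks like an entity is skipped, whether or not it is a real one.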
KhanyBoy wrote: this should test you gurus. I want a function that accepts text as an argument and converts all & into &amp; except where it is already an HTML entity such as &nbsp;, &quot;, and of course &amp;.
Why are entity references recognised in text?
--
Jock
Regarding this well-known quote, often attributed to KhanyBoy's famous "10
Jun 2004 14:22:24 -0700" speech: Hi,
this should test you gurus. I want a function that accepts text as an argument and converts all & into &amp; except where it is already an HTML entity such as &nbsp;, &quot;, and of course &amp;.
If there is already a PHP function for this I would like to know, but if not, what is the grep equivalent?
Thanks
It might be a different direction, but here are some functions to determine
whether something is an HTML entity, using the built-in PHP entity
functions. It's sort of a recycled reply to an earlier question:
A simple (rough) test of the concept is online at: http://php.pixelsaredead.com/htmlentities.php
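<?php
/*
How to determine whether a given string decodes to an HTML entity
*/
function contains_entities($raw)
{
    // $raw can be a string of any length.
    // If encoding makes the string longer, it contained at least one
    // character that has an entity form.
    $raw = trim($raw);
    return (strlen(htmlentities($raw)) > strlen($raw));
}
function is_entity_reference($raw)
{
    // $raw should be a string with only the entity ref in it,
    // in the form "&...;"
    // The charset is given explicitly so that a decoded entity is exactly
    // one byte long; with a multi-byte default charset strlen() would
    // report more than 1 for characters such as &eacute;.
    return (preg_match('/&.+;/', $raw)) &&
        (strlen(html_entity_decode(trim($raw), ENT_COMPAT, 'ISO-8859-1')) == 1);
}
?>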
--
-- Rudy Fleminger
-- sp@mmers.and.evil.ones.will.bow-down-to.us
(put "Hey!" in the Subject line for priority processing!)
-- http://www.pixelsaredead.com
Christian Fersch wrote: A lot faster, but not as accurate: $text = preg_replace('!&(?![#a-z0-9]{1,7};)!i','&amp;',$text);
You could also use this method (it's a lookahead assertion) with Justin's function, which will still be a lot faster than his hack ;)
Let's fix the hack then, shall we?
function ampersandFix($x){
    $pattern='`&(?!(#[0-9]{2,3}|aacute|acirc|acute|aelig|agrave|amp'.
        '|aring|atilde|auml|brvbar|brkbar|ccedil|cedil|cent'.
        '|copy|curren|deg|divide|eacute|ecirc|egrave|eth|euml'.
        '|frac12|frac14|frac34|gt|iacute|icirc|iexcl|igrave'.
        '|iquest|iuml|laquo|lt|macr|hibar|micro|middot|nbsp|not'.
        '|ntilde|oacute|ocirc|ograve|ordf|ordm|oslash|otilde'.
        '|ouml|para|plusmn|pound|quot|raquo|reg|sect|shy|sup1|sup2'.
        '|sup3|szlig|thorn|times|uacute|ucirc|ugrave|uml'.
        '|die|uuml|yacute|yen|yuml);)`i';
    return preg_replace($pattern,'&amp;',$x);
}
Now it's faster AND accurate. ;)
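One possible variation on the function above, sketched here under a few assumptions: rather than typing the entity names by hand, the list could be taken from PHP's own table via get_html_translation_table(). The function name is made up, the ENT_HTML401 flag needs PHP 5.4 or later, and the numeric part is widened to any decimal or hexadecimal reference, which goes beyond the original pattern.

```php
<?php
// Hypothetical variant: build the lookahead from PHP's entity table.
function ampersand_fix_from_table(string $text): string
{
    static $pattern = null; // built once, reused on later calls

    if ($pattern === null) {
        $names = [];
        $table = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES | ENT_HTML401, 'UTF-8');
        foreach ($table as $entity) {
            // Table values look like "&nbsp;"; keep only the bare name and
            // skip numeric values such as "&#039;".
            if (preg_match('/^&([a-z0-9]+);$/i', $entity, $m)) {
                $names[] = $m[1];
            }
        }
        // Case-sensitive on purpose: &Eacute; and &eacute; are different.
        $pattern = '/&(?!(?:#[0-9]+|#[xX][0-9a-fA-F]+|' . implode('|', $names) . ');)/';
    }

    return preg_replace($pattern, '&amp;', $text);
}

echo ampersand_fix_from_table('a & b &amp; c &nbsp; &bogus; &#169; &#x2019;'), "\n";
// prints: a &amp; b &amp; c &nbsp; &amp;bogus; &#169; &#x2019;
```

The older draft names in the hand-written list (brkbar, hibar, die) are not in PHP's table, so text that relies on them would be treated differently by this variant.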
--
Justin Koivisto - sp**@koivi.com
PHP POSTERS: Please use comp.lang.php for PHP related questions,
alt.php* groups are not recommended.