Hello all,
I've been struggling for a few days with the question of how to convert
"smart" (curly) quotes into straight quotes. I tried playing with the
htmlentities() function, but all that is doing is changing the smart
quotes into nonsense characters. I also searched the web for quite a
while and was unsuccessful in finding a solution.
What puzzles me is that doing it the other way around is simple enough.
For example, this works fine in converting a straight quote into an
"open" smart quote:
if ($content[$k] == "\"")
    $content = substr($content, 0, $k) . "&#147;" . substr($content, $k+1, strlen($content)-$k+1);
But the other way around doesn't work. Any ideas?
Thanks,
Martin Goldman
My e-mail address's correct domain name is mgoldman.com.
Martin Goldman <ww*@nowhere.fo o> wrote:
> I've been struggling for a few days with the question of how to convert
> "smart" (curly) quotes into straight quotes.
Smart/curly quotes? straight quotes? What are these?
> What puzzles me is that doing it the other way around is simple enough.
> For example, this works fine in converting a straight quote into an
> "open" smart quote:
> if ($content[$k] == "\"") $content = substr($content, 0, $k) . "&#147;" . substr($content, $k+1, strlen($content)-$k+1);
Funny way to do a str_replace :)
What character is represented by #147? AFAIK it's not in any character
set I know (ASCII or ISO-8859-x). So your actual problem might be that
you are using a different encoding for the character you want to replace
than PHP is actually using!
BTW, the 3rd parameter of htmlentities() specifies the character set.
--
Daniel Tryba
On Fri, 14 Nov 2003 17:42:08 GMT, Martin Goldman <ww*@nowhere.fo o> wrote:
> I've been struggling for a few days with the question of how to convert
> "smart" (curly) quotes into straight quotes. I tried playing with the
> htmlentities() function, but all that is doing is changing the smart
> quotes into nonsense characters. I also searched the web for quite a
> while and was unsuccessful in finding a solution.
You've got to work out what character set the text is encoded in, for
starters, since 'smart quotes' exist in Microsoft's Codepage 1252 but not in
the standard ISO 8859 character sets, e.g. iso-8859-15.
In codepage 1252:
hex  dec  Unicode  Unicode name
91   145  8216     LEFT SINGLE QUOTATION MARK
92   146  8217     RIGHT SINGLE QUOTATION MARK
93   147  8220     LEFT DOUBLE QUOTATION MARK
94   148  8221     RIGHT DOUBLE QUOTATION MARK
But in iso-8859-15, 145-148 aren't defined as printable characters; 128-159
are reserved for control characters.
So if you change it to &#147;, but output your page encoded in iso-8859-1,
you're just changing it to the code for a non-printable character. The same
entity will appear as a left double quotation mark if encoded in Windows-1252
though.
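To see the difference concretely, here is a minimal sketch that treats byte 147 as Windows-1252 text and re-encodes it as UTF-8. It assumes the iconv extension is available and that the underlying iconv library recognises the name CP1252; the helper name cp1252_to_utf8 is made up for illustration.

```php
<?php
// Hypothetical helper: treat $bytes as Windows-1252 text and re-encode it as UTF-8.
function cp1252_to_utf8(string $bytes): string
{
    $converted = iconv('CP1252', 'UTF-8', $bytes);
    if ($converted === false) {
        throw new RuntimeException('iconv could not convert the input');
    }
    return $converted;
}

// Byte 147 (hex 93) is LEFT DOUBLE QUOTATION MARK in codepage 1252.
echo bin2hex(cp1252_to_utf8(chr(147))), "\n"; // e2809c, the UTF-8 form of U+201C
```

The same byte has no printable meaning in the ISO 8859 sets, which is why the encoding has to be known before any replacement can work.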
> What puzzles me is that doing it the other way around is simple enough.
> For example, this works fine in converting a straight quote into an
> "open" smart quote:
> if ($content[$k] == "\"") $content = substr($content, 0, $k) . "&#147;" . substr($content, $k+1, strlen($content)-$k+1);
> But the other way around doesn't work. Any ideas?
In what way doesn't it work? What does str_replace(chr(147), '"', $content);
appear to do in your setup?
--
Andy Hassall (an**@andyh.co. uk) icq(5747695) ( http://www.andyh.co.uk)
Space: disk usage analysis tool ( http://www.andyhsoftware.co.uk/space)
Martin Goldman wrote:
> I've been struggling for a few days with the question of how to convert
> "smart" (curly) quotes into straight quotes.
As D. Tryba hinted at, str_replace should work fine. After all,
you're replacing one character with another.
$string = str_replace($chr,'"',$string)
where $chr is the character you want to replace.
> I tried playing with the htmlentities() function, but all that is doing
> is changing the smart quotes into nonsense characters.
I'd be interested in seeing what you actually tried. Since so-called
smart quotes aren't in the Latin-1 repertoire, you'd have to specify
a charset other than the default ISO-8859-1. Say you typed smart
quotes on a bog standard Windows system by holding down Alt and
pressing 0, 1, 4, and 7 (or 8) on the numeric keypad, you'd use
$string = htmlentities($string,ENT_COMPAT,'cp1252')
where $string is the string containing smart quotes. That converts
smart quotes to their respective entity references.
> What puzzles me is that doing it the other way around is simple enough.
Eek! I'd have thought that was *more* difficult...
> if ($content[$k] == "\"") $content = substr($content, 0, $k) . "&#147;" . substr($content, $k+1, strlen($content)-$k+1);
How does your script know that the quotation mark was intended as an
opening quotation mark? ;-)
In HTML, the character reference &#147; is undefined. The LEFT DOUBLE
QUOTATION MARK can be represented using the character reference
&#8220; or the entity reference &ldquo;. The RIGHT DOUBLE QUOTATION
MARK can be represented using the character reference &#8221; or the
entity reference &rdquo;.
--
Jock
John Dunlop <jo*********@jo hndunlop.info> wrote in
news:MP******** *************** *@news.freeserv e.net: Martin Goldman wrote:
> I'd be interested in seeing what you actually tried. Since so-called
> smart quotes aren't in the Latin-1 repertoire, you'd have to specify
> a charset other than the default ISO-8859-1. Say you typed smart
> quotes on a bog standard Windows system by holding down Alt and
> pressing 0, 1, 4, and 7 (or 8) on the numeric keypad, you'd use
> $string = htmlentities($string,ENT_COMPAT,'cp1252')
> where $string is the string containing smart quotes. That converts
> smart quotes to their respective entity references.
This results in the smart quotes being replaced with nonsense characters.
The thing is, though, that I'm totally unfamiliar with character sets,
the differences between them, etc. I've never had any reason to care
about them. So I'm a little confused about what you guys are talking
about when it comes to them.
> How does your script know that the quotation mark was intended as an
> opening quotation mark? ;-)
Well, I didn't paste the whole thing. :) I wrote a loop that goes through
the string. It toggles a flag each time a quotation mark is found. If the
flag is set, it makes it an open quote; if it's not, it makes it a closed
quote. Hence the reason I'm not just using a str_replace for that. :)
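A loop of that kind might look roughly like the sketch below. This is a hypothetical reconstruction, not the original script: the function name curl_double_quotes is invented, and it emits the &ldquo; and &rdquo; entity references rather than a numeric reference in the 128-159 range.

```php
<?php
// Hypothetical reconstruction of the toggling loop; not the original script.
function curl_double_quotes(string $content): string
{
    $result = '';
    $open = true; // the next straight quote found is treated as an opening quote
    $length = strlen($content);
    for ($k = 0; $k < $length; $k++) {
        if ($content[$k] === '"') {
            $result .= $open ? '&ldquo;' : '&rdquo;';
            $open = !$open; // flip the flag for the next quote
        } else {
            $result .= $content[$k];
        }
    }
    return $result;
}

echo curl_double_quotes('He said "hi" and left.'), "\n";
// He said &ldquo;hi&rdquo; and left.
```

Building a new string avoids the index shifting that happens when a one-character quote is replaced in place by a longer entity.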
Oh, and to answer Mr. Hassall's question -- str_replace(chr(147), "\"",
$content) doesn't do anything. The exact same string is returned.
-Martin
Martin Goldman <ww*@nowhere.fo o> wrote:
[confused about charsets]
> Oh, and to answer Mr. Hassall's question -- str_replace(chr(147), "\"",
> $content) doesn't do anything. The exact same string is returned.
That might mean that there is no chr(147) in the string although you
_see_ a character that might be represented as the character you know as
147 in cp1252! Another fine example is the euro symbol: IIRC it's 128 in
cp1252 and 164 in iso-8859-15, while in iso-8859-1 164 is the generic
currency symbol and the euro symbol is missing entirely. That's why, if you
want to display the euro symbol, you are encouraged to use the HTML entity
&euro;, which can be rendered in any font and any character set (with a
fallback to EUR).
So your job is to figure out how your quote is encoded (just step through
the string and print the ord() value for each character)...
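Stepping through the string could be done along these lines. This is only a sketch; byte_values is a made-up helper name, and ord() is the PHP function that returns the numeric value of a single byte.

```php
<?php
// Hypothetical helper: return the numeric value of every byte in $text.
function byte_values(string $text): array
{
    $values = [];
    $length = strlen($text);
    for ($i = 0; $i < $length; $i++) {
        $values[] = ord($text[$i]);
    }
    return $values;
}

// A cp1252-style smart quote after the letter "a" shows up as a single byte.
echo implode(' ', byte_values('a' . chr(147))), "\n"; // 97 147
```

If the quote prints as several numbers instead of one, the text is probably in a multi-byte encoding rather than cp1252.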
BTW Unicode kind of solves the problem by defining every known character
in one set; the problem is that not every program supports it yet. But
Unicode also introduces another problem: the way the characters are
encoded (e.g. UTF-7, UTF-8, UTF-16...). I don't know if PHP supports UTF-16+.
--
Daniel Tryba
Daniel Tryba <ne************ ****@canopus.nl > wrote in news:bp5nhq$d0e $1 @news.tue.nl:
> That might mean that there is no chr(147) in the string although you
> _see_ a character that might be represented as the character you know as
> 147 in cp1252! Another fine example is the euro symbol: IIRC it's 128 in
> cp1252 and 164 in iso-8859-15, while in iso-8859-1 164 is the generic
> currency symbol and the euro symbol is missing entirely. That's why, if you
> want to display the euro symbol, you are encouraged to use the HTML entity
> &euro;, which can be rendered in any font and any character set (with a
> fallback to EUR).
> So your job is to figure out how your quote is encoded (just step through
> the string and print the ord() value for each character)...
Interesting you should suggest this, because I just did that. And indeed,
it's not coming out as 147. It's coming out as 226, followed by 128,
followed by 156. I suppose I could do a str_replace for these 3
characters and replace it with 147. Although, then I'd have to do that
for every character I want to support. What a drag.
Thanks,
Martin
On Sat, 15 Nov 2003 19:57:14 GMT, Martin Goldman <ww*@nowhere.fo o> wrote:
> Daniel Tryba <ne************ ****@canopus.nl > wrote in news:bp5nhq$d0e $1 @news.tue.nl:
>> That might mean that there is no chr(147) in the string although you
>> _see_ a character that might be represented as the character you know as
>> 147 in cp1252! Another fine example is the euro symbol: IIRC it's 128 in
>> cp1252 and 164 in iso-8859-15, while in iso-8859-1 164 is the generic
>> currency symbol and the euro symbol is missing entirely. That's why, if you
>> want to display the euro symbol, you are encouraged to use the HTML entity
>> &euro;, which can be rendered in any font and any character set (with a
>> fallback to EUR).
>> So your job is to figure out how your quote is encoded (just step through
>> the string and print the ord() value for each character)...
> Interesting you should suggest this, because I just did that. And indeed,
> it's not coming out as 147. It's coming out as 226, followed by 128,
> followed by 156. I suppose I could do a str_replace for these 3
> characters and replace it with 147. Although, then I'd have to do that
> for every character I want to support. What a drag.
Your text is encoded in UTF-8. Going back to the characters again:
hex  dec  Unicode  Unicode name
91   145  8216     LEFT SINGLE QUOTATION MARK
92   146  8217     RIGHT SINGLE QUOTATION MARK
93   147  8220     LEFT DOUBLE QUOTATION MARK
94   148  8221     RIGHT DOUBLE QUOTATION MARK
226,128,156 in binary is:
11100010
10000000
10011100
'1110' in the first few bits of the first byte indicates it is a lead byte for
a three-byte character. The remaining two are trail bytes, as they start with
10. So separating out the data gets:
1110 0010
  10 000000
  10 011100
=> 0010000000011100 (binary)
= 8220 (decimal)
Which is LEFT DOUBLE QUOTATION MARK.
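The same bit arithmetic can be written as a short sketch. The function name decode_three_byte_utf8 is invented for illustration, and it only handles the three-byte case walked through above; it does not check for overlong forms or other invalid sequences.

```php
<?php
// Hypothetical helper: decode one three-byte UTF-8 sequence into its code point.
function decode_three_byte_utf8(string $bytes): int
{
    if (strlen($bytes) !== 3) {
        throw new InvalidArgumentException('expected exactly three bytes');
    }
    $lead = ord($bytes[0]);
    $trail1 = ord($bytes[1]);
    $trail2 = ord($bytes[2]);
    // The lead byte must look like 1110xxxx, each trail byte like 10xxxxxx.
    if (($lead & 0xF0) !== 0xE0 || ($trail1 & 0xC0) !== 0x80 || ($trail2 & 0xC0) !== 0x80) {
        throw new InvalidArgumentException('not a three-byte UTF-8 sequence');
    }
    // Keep 4 data bits from the lead byte and 6 from each trail byte.
    return (($lead & 0x0F) << 12) | (($trail1 & 0x3F) << 6) | ($trail2 & 0x3F);
}

echo decode_three_byte_utf8(chr(226) . chr(128) . chr(156)), "\n"; // 8220
```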
--
Andy Hassall (an**@andyh.co. uk) icq(5747695) ( http://www.andyh.co.uk)
Space: disk usage analysis tool ( http://www.andyhsoftware.co.uk/space)
Andy Hassall <an**@andyh.co. uk> wrote:
>>> So your job is to figure out how your quote is encoded (just step through
>>> the string and print the ord() value for each character)...
>> Interesting you should suggest this, because I just did that. And indeed,
>> it's not coming out as 147. It's coming out as 226, followed by 128,
>> followed by 156. I suppose I could do a str_replace for these 3
>> characters and replace it with 147. Although, then I'd have to do that
>> for every character I want to support. What a drag.
> Your text is encoded in UTF-8. Going back to the characters again:
[in depth UTF-8 decoding :)]
So Martin, you should take a look at iconv or, if your server lacks
iconv support, utf8_decode(). The manual page for the latter also has a
user-contributed note on how to use str_replace on UTF-8 encoded strings.
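For the four quote characters discussed in this thread, a direct replacement on the UTF-8 bytes is one possible approach. The sketch below is not the user-contributed note mentioned above; straighten_quotes is a made-up name, and it uses strtr() with a map instead of repeated str_replace() calls.

```php
<?php
// Hypothetical helper: replace UTF-8 encoded curly quotes with straight ASCII quotes.
function straighten_quotes(string $utf8Text): string
{
    return strtr($utf8Text, [
        "\xE2\x80\x98" => "'", // U+2018 LEFT SINGLE QUOTATION MARK
        "\xE2\x80\x99" => "'", // U+2019 RIGHT SINGLE QUOTATION MARK
        "\xE2\x80\x9C" => '"', // U+201C LEFT DOUBLE QUOTATION MARK
        "\xE2\x80\x9D" => '"', // U+201D RIGHT DOUBLE QUOTATION MARK
    ]);
}

echo straighten_quotes("\xE2\x80\x9CHello\xE2\x80\x9D"), "\n"; // "Hello"
```

Because each map key is a complete three-byte sequence, other multi-byte characters in the text are left untouched.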
--
Daniel Tryba
Daniel Tryba <ne************ ****@canopus.nl > wrote in
news:bp******** **@news.tue.nl: Andy Hassall <an**@andyh.co. uk> wrote:
> So Martin, you should take a look at iconv or, if your server lacks
> iconv support, utf8_decode(). The manual page for the latter also has a
> user-contributed note on how to use str_replace on UTF-8 encoded strings.
Great. Thanks to everyone who replied.
-Martin
my correct domain name is mgoldman.com