Hi!
We've been building a fairly large web app for internal use. SMS
messages come in from an aggregator and are stored in a MySQL 4
db. Our operators then deal with them through a web interface. The db
is queried from PHP4 and the results are output as XML, which is then
transformed into XHTML using XSLT (Sablotron).
Now, in our testing environment everything works just fine. However,
when we try to run it with actual live data, any incoming SMS message
that contains a non-ASCII character breaks the system at the Sablotron
stage (invalid token).
The aggregating service sends us the incoming messages UTF-8 encoded,
and the XML and XSL are both set up to be UTF-8. However, somewhere
along the line something is getting mangled so that Sablotron chokes
(typical examples are pound signs or euro signs).
I'm having a hard time debugging this because, as far as I can tell,
everything is set to use UTF-8 by default. Clearly something isn't
(possibly MySQL). I'd really appreciate some pointers on things to
check.
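For what it's worth, the one check I can think of is looking at the raw
bytes of a failing message at each stage (as received, as read back from
the db, as written into the XML). Below is a rough sketch of that idea.
It is in Python purely as a standalone illustration, not our PHP4 code,
and the name classify_bytes is made up for this post. It relies on the
fact that a pound sign is the two bytes C2 A3 in UTF-8 but the single
byte A3 in Latin-1, and a lone A3 is not valid UTF-8, which would be
consistent with an XML parser rejecting it.

```python
def classify_bytes(raw: bytes) -> str:
    """Report whether a byte string is plain ASCII, valid UTF-8, or neither.

    Hypothetical helper for illustration only.
    """
    try:
        raw.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Typical cause: single-byte data (Latin-1 / Windows-1252)
        # being passed along as if it were UTF-8.
        return "not-utf-8"


if __name__ == "__main__":
    samples = {
        "plain text": b"hello",
        "pound, UTF-8": "\u00a35".encode("utf-8"),     # bytes C2 A3 35
        "pound, Latin-1": "\u00a35".encode("latin-1"),  # bytes A3 35
        "euro, UTF-8": "\u20ac".encode("utf-8"),        # bytes E2 82 AC
        "euro, cp1252": "\u20ac".encode("cp1252"),      # byte 80
    }
    for label, raw in samples.items():
        print(label, raw.hex(), classify_bytes(raw))
```

If the bytes are still valid UTF-8 when they arrive but not when they
come back out of the db, that would point at the storage or connection
step rather than at the XML or XSL.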
TIA,
Darren