Correct escaping of untrusted data

Hi folks,

The thread on injection attacks was very instructive, but seemed to
run out of steam at an interesting point. Now that you guys have kindly
educated me about the real nature of the issues, can I ask again
what effective escaping really means?

Are the standard escaping functions found in the PHP, Tcl etc APIs to
Postgres bombproof? Are there any encodings that might slip through
and be cast to malicious strings inside Postgres? What about functions
like convert(): could they be used to slip something through the
escaping function?

I don't really have enough knowledge in this area to be confident in
the results of my own experiments. Any advice from the more
technically savvy would be much appreciated.

------------------
Geoff Caplan
Vario Software Ltd
(+44) 121-515 1154
---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 23 '05 #1
Geoff Caplan <ge***@variosoft.com> writes:
> Are the standard escaping functions found in the PHP, Tcl etc APIs to
> Postgres bombproof?

I dunno; you'd probably want to look at the source for each one you
planned to use, anyway, if you're being paranoid. As long as they
escape ' and \ they should be okay. If your source language allows
embedded nulls (\0) in strings you might want to reject those as well.
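
A minimal Python sketch of the rule above: double every ' and \ and reject embedded NULs. The name `escape_literal` is hypothetical and not part of any PostgreSQL client API, and the sketch assumes the server treats backslash as an escape character inside string literals, as this thread does.

```python
def escape_literal(value: str) -> str:
    """Quote value as an SQL string literal (illustrative sketch only)."""
    # Reject embedded NULs outright, as suggested above.
    if "\x00" in value:
        raise ValueError("embedded NUL is not allowed in a string literal")
    # Double each backslash and each single quote.
    escaped = value.replace("\\", "\\\\").replace("'", "''")
    return "'" + escaped + "'"


print(escape_literal("O'Reilly"))   # 'O''Reilly'
print(escape_literal("C:\\temp"))   # 'C:\\temp'
```

This is only meant to make the two characters concrete; an escaping function shipped with the client library is likely to know more about the connection (for instance its encoding) than a hand-rolled one.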

> Are there any encodings that might slip through
> and be cast to malicious strings inside Postgres?

All the supported encodings are supersets of ASCII, so I don't think
there is any such risk. There is a risk in the opposite direction
I think: if the escaping function doesn't know the encoding being used
it might think that one byte of a multibyte character is ' or \ and
try to escape it, thereby breaking the data. This could not happen in
"sane" encodings like UTF-8, however, just in the one or two Far Eastern
encodings that allow multibyte characters to contain bytes <= 0x7F.

Since you as the application programmer can control what client-side
encoding is used, the simplest answer here is just to be sure you're
using a sane encoding, or at least that the escaping function knows
the encoding you're using.
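
The "opposite direction" risk can be seen with a short Python sketch. It assumes a hypothetical byte-level escaper, `naive_escape`, that knows nothing about the client encoding, and uses the katakana character ソ, whose Shift_JIS encoding happens to end in the byte 0x5C (the ASCII backslash).

```python
def naive_escape(raw: bytes) -> bytes:
    """Byte-level escaper that is unaware of the client encoding (hypothetical)."""
    return raw.replace(b"\\", b"\\\\").replace(b"'", b"''")


text = "ソ"  # katakana "so"

sjis = text.encode("shift_jis")
utf8 = text.encode("utf-8")

print(sjis.hex())                 # 835c
print(naive_escape(sjis).hex())   # 835c5c  (an extra byte: the data is now broken)

print(utf8.hex())                 # e382bd
print(naive_escape(utf8).hex())   # e382bd  (unchanged)

# In UTF-8 every byte of a multibyte character is >= 0x80,
# so it can never be mistaken for ' or \.
print(all(b >= 0x80 for b in utf8))  # True
```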

> What about functions like convert(): could they be used to slip
> something through the escaping function?


Don't see how. The issue is to be sure that the query string traveling
to the backend will be interpreted the way you expected. By the time
any server-side function executes it is far too late to change that
interpretation.

regards, tom lane

Nov 23 '05 #2
Geoff Caplan wrote:
> Are the standard escaping functions found in the PHP, Tcl etc APIs to
> Postgres bombproof? Are there any encodings that might slip through
> and be cast to malicious strings inside Postgres? What about functions
> like convert(): could they be used to slip something through the
> escaping function?


What about writing Nessus plugin(s) or a specific scanner for these
escaping issues? I don't know if such a thing already exists...

--
Olivier

Nov 23 '05 #3
Tom,

Belated thanks for the info (I've been away from my desk).

Very helpful.

------------------
Geoff Caplan
Vario Software Ltd
(+44) 121-515 1154

Nov 23 '05 #4
At 11:09 AM 7/31/2004 -0400, Tom Lane wrote:
> All the supported encodings are supersets of ASCII, so I don't think
> there is any such risk.

Is the 7.4.x multibyte support bombproof? How would we avoid problems like
this:
http://groups.google.com/groups?hl=e...ii%40sra.co.jp

Summary of that problem: an invalid multibyte character "eats" the
following character.

I know it's fixed already, but is there a way to reduce exposure to such bugs?

> There is a risk in the opposite direction I think: if the escaping
> function doesn't know the encoding being used it might think that
> one byte of a multibyte character is ' or \ and try to escape it,
> thereby breaking the data.


Is the escaping function always consistent with the backend's
interpretation? Is it impossible for them to be inconsistent (e.g. because
they use the same code to interpret data)?

My concern is if the escaping function thinks one byte of a multibyte is \
but the rest of the backend doesn't then you can end up with an escaped
backslash which does not escape a naughty '...

Also: what is the proper/official way to deal with:

update tablea set data=3-? where a=1;

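when the parameter is -1? If the driver substitutes the value as plain
text, this becomes data=3--1, and -- starts an SQL comment, so the WHERE
clause is lost.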

Somehow ensure it's always like this?
update tablea set data=3 - ? where a=1;

This doesn't seem to be escaped safely with DBD::Pg 1.22 (three versions
old) and PostgreSQL 7.3.4.

Would it be best to compute the 3-? part in the application and then run:
update tablea set data=? where a=1;

Possibly result in less CPU usage at backend too.
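
To make the problem concrete, here is a small Python sketch of a driver that fills placeholders by plain text substitution. `naive_bind` is a hypothetical stand-in and is not how any particular DBD::Pg version is implemented; it only reproduces the kind of output described above.

```python
def naive_bind(sql: str, params: list) -> str:
    """Replace each ? with the text of the next parameter (hypothetical driver)."""
    parts = sql.split("?")
    if len(parts) - 1 != len(params):
        raise ValueError("placeholder count does not match parameter count")
    out = parts[0]
    for value, tail in zip(params, parts[1:]):
        out += str(value) + tail
    return out


broken = naive_bind("update tablea set data=3-? where a=1;", [-1])
print(broken)   # update tablea set data=3--1 where a=1;

# Doing the arithmetic in the application sidesteps the problem.
safe = naive_bind("update tablea set data=? where a=1;", [3 - (-1)])
print(safe)     # update tablea set data=4 where a=1;
```

In the first statement everything from `--` onward is a comment, so the statement that actually runs has no WHERE clause.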

Regards,

Link.

Nov 23 '05 #5

> Is the 7.4.x multibyte support bombproof? How would we avoid problems
> like this:
> http://groups.google.com/groups?hl=e...ii%40sra.co.jp

Well, maybe using UTF-8 encoding would fix this?

> update tablea set data=3-? where a=1;

Add parentheses:
update tablea set data=3-(?) where a=1;


Or do it in your program... but you can't do this if you have a db field
or function instead of the 3.
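
A quick way to check the parenthesised form is to try the variants against a real SQL parser. The sketch below uses Python's built-in `sqlite3` with an in-memory database purely as a stand-in for PostgreSQL (both treat `--` as the start of a comment); the f-strings imitate a text-substituting driver and are for demonstration only. The helper `run` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def run(sql: str):
    """Execute a one-value query and return that value."""
    return conn.execute(sql).fetchone()[0]


value = -1

print(run(f"select 3-{value}"))     # 3  ("--1" is read as a comment)
print(run(f"select 3-({value})"))   # 4  (parentheses keep the two minus signs apart)
print(run(f"select 3 - {value}"))   # 4  (so does a space)

# With a placeholder that the database itself binds, no text is spliced at all.
print(conn.execute("select 3-?", (value,)).fetchone()[0])  # 4
```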

Nov 23 '05 #6
Hi folks

I'm designing a table to be used for web session management. If all
goes well with the project, the table should have 100,000+ records and
be getting hammered with SELECTS, INSERTS and UPDATES.

The table will need a technical key. The question is, what is the most
efficient way to do this?

a) Generate a random 24 character string in the application. Very
quick for the INSERTs, but will the longer key slow down the
SELECTs and UPDATEs?

b) Use a sequence. Faster for the SELECTS and UPDATES, I guess, but
how much will the sequence slow down the INSERTS on a medium sized
record-set?

There will probably be 6-8 SELECTs & UPDATEs for each INSERT.

I appreciate that I could set up some tests, but I am under the hammer
time-wise. Some rule-of-thumb advice from the list would be most
welcome.

------------------
Geoff Caplan
Vario Software Ltd
(+44) 121-515 1154

Nov 23 '05 #7
After a long battle with technology, ge***@variosoft.com (Geoff Caplan), an earthling, wrote:
> b) Use a sequence. Faster for the SELECTS and UPDATES, I guess, but
> how much will the sequence slow down the INSERTS on a medium sized
> record-set?


Why, in particular, would you expect the sequence to slow down
inserts? They don't lock the table.

Note that if you're really doing a lot of INSERTs in parallel, you
might find it worthwhile to configure the sequence to cache some
number of entries so that they are pre-allocated and stored in memory
for each session (e.g. - for each connection) for quicker access. See
the documentation for "create sequence" for more details...
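
As a rough illustration of what per-session caching means, here is a toy Python model. It is not PostgreSQL code: `CachedSequenceSession` and `cache_size` are hypothetical names, and the shared counter stands in for the sequence itself. Each session reserves a block of values in one step and then hands them out locally.

```python
import itertools


class CachedSequenceSession:
    """One connection's view of a sequence with a cache of cache_size (toy model)."""

    def __init__(self, shared_counter, cache_size):
        self._shared = shared_counter
        self._cache_size = cache_size
        self._local = []

    def nextval(self):
        if not self._local:
            # One trip to the shared sequence reserves a whole block.
            self._local = [next(self._shared) for _ in range(self._cache_size)]
        return self._local.pop(0)


shared = itertools.count(1)
a = CachedSequenceSession(shared, cache_size=5)
b = CachedSequenceSession(shared, cache_size=5)

print([a.nextval(), b.nextval(), a.nextval(), b.nextval()])  # [1, 6, 2, 7]
```

The values stay unique, but they are no longer handed out in strict insertion order across sessions, and values left in a session's block when it disconnects are simply skipped.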
--
output = reverse("gro.gultn" "@" "enworbbc")
http://www3.sympatico.ca/cbbrowne/x.html
Think of C++ as an object-oriented assembly language.
Nov 23 '05 #8
On Thu, Aug 12, 2004 at 13:05:45 +0100,
Geoff Caplan <ge***@variosoft.com> wrote:

> b) Use a sequence. Faster for the SELECTS and UPDATES, I guess, but
> how much will the sequence slow down the INSERTS on a medium sized
> record-set?


Using a sequence shouldn't be slow. The main potential problem is that
it will make the session IDs guessable if you don't take any other
steps. That may or may not be a problem. One way around this is to
encrypt the sequence number in the database with a key and use a combination
of the encrypted string and an index for which key is used (this makes it
possible to change keys for new sessions while allowing continued use of an
old key for old sessions) as the session id. You can change the keys as often
as needed and practical for your application.
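
The scheme could be sketched as follows in Python. Everything here is an assumption made for illustration: the key table `KEYS`, the `index-token` layout of the id, and above all the cipher, which is a toy 4-round Feistel network built from HMAC-SHA256 so that the example runs with the standard library alone. It is not a vetted cipher; a real deployment would use an established one.

```python
import hashlib
import hmac

# Hypothetical key table: index -> secret. Old keys stay so old sessions still decode.
KEYS = {1: b"key-for-older-sessions", 2: b"key-for-new-sessions"}
CURRENT_KEY_INDEX = 2


def _round(key: bytes, round_no: int, half: int) -> int:
    """Round function: 32 bits derived from HMAC-SHA256 of one half-block."""
    msg = bytes([round_no]) + half.to_bytes(4, "big")
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest()[:4], "big")


def encrypt_seq(key: bytes, seq: int) -> int:
    """Map a 64-bit sequence value to a 64-bit token (toy Feistel network)."""
    if not 0 <= seq < 2**64:
        raise ValueError("sequence value out of range")
    left, right = seq >> 32, seq & 0xFFFFFFFF
    for r in range(4):
        left, right = right, left ^ _round(key, r, right)
    return (left << 32) | right


def decrypt_seq(key: bytes, token: int) -> int:
    """Invert encrypt_seq by running the rounds backwards."""
    if not 0 <= token < 2**64:
        raise ValueError("token out of range")
    left, right = token >> 32, token & 0xFFFFFFFF
    for r in reversed(range(4)):
        left, right = right ^ _round(key, r, left), left
    return (left << 32) | right


def make_session_id(seq: int) -> str:
    """Session id = index of the key used + encrypted sequence number."""
    token = encrypt_seq(KEYS[CURRENT_KEY_INDEX], seq)
    return f"{CURRENT_KEY_INDEX}-{token:016x}"


def parse_session_id(session_id: str) -> int:
    """Recover the sequence number, using whichever key the id names."""
    index, token_hex = session_id.split("-", 1)
    return decrypt_seq(KEYS[int(index)], int(token_hex, 16))


sid = make_session_id(12345)
print(parse_session_id(sid))  # 12345
```

A tampered id still decrypts to some number, so the application would still need to check that the recovered value names a live session.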

Nov 23 '05 #9
Bruno Wolff III wrote:
> Using a sequence shouldn't be slow.

Thanks - that's the main thing I need to know.

> The main potential problem is that it will make the session IDs
> guessable if you don't take any other steps. That may or may not
> be a problem.

Thanks for the warning, but I won't be using the sequence number as
the session id: as you say, not a safe thing to do. The session record
key persists from session to session: it is used to link sessions with
browsers and with user accounts. The session key will be a random 32
character key generated for each session.
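
For what it's worth, a random 32 character key of this kind might be generated like this in Python. The function name `new_session_key` is hypothetical, and hexadecimal is just one possible alphabet; the thread does not say which one is used.

```python
import secrets


def new_session_key() -> str:
    """Return 32 hex characters (128 random bits) from the OS random source."""
    return secrets.token_hex(16)


key = new_session_key()
print(len(key))  # 32
```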

Christopher Browne wrote:
> Why, in particular, would you expect the sequence to slow down
> inserts? They don't lock the table.

I was assuming that generating the sequence number was expensive: it
is in some other DBs I have used. That was why I was thinking of
providing a unique id via a random string. But a practical test shows
that in PG it is pretty fast, so there is no need.

> Note that if you're really doing a lot of INSERTs in parallel, you
> might find it worthwhile to configure the sequence to cache some
> number of entries so that they are pre-allocated and stored in memory
> for each session (e.g. - for each connection) for quicker access. See
> the documentation for "create sequence" for more details...


I think that would be worthwhile.

Thanks for the input, folks.

------------------
Geoff Caplan
Vario Software Ltd
(+44) 121-515 1154

Nov 23 '05 #10

You could use Apache mod_auth_tkt:
http://www.openfusion.com.au/labs/mod_auth_tkt/

Its main advantage is that it will authenticate the user, so your script
gets the user ID, which you can use as a key in your session table, for
instance.
Cut & paste for the lazy:

mod_auth_tkt is a lightweight cookie-based authentication module for
Apache 1.3.x, written in C. It implements a single-signon framework that
works across multiple apache instances and multiple machines. The actual
authentication is done by a user-supplied CGI or script in whatever
language you like (examples are provided in Perl), meaning you can
authenticate against any kind of user repository you can access (password
files, ldap, databases, etc.)

mod_auth_tkt supports inactivity timeouts (including the ability to
control how aggressively the ticket is refreshed), the ability to include
arbitrary user data within the cookie, configurable cookie names and
domains, and token-based access to subsections of a site.

mod_auth_tkt works by checking incoming Apache requests for a (user-
defined) cookie containing a valid authentication ticket. The ticket is
checked by generating an MD5 checksum for the username and any (optional)
user data from the ticket together with the requesting IP address and a
shared secret available to the server. If the generated MD5 checksum
matches the ticket's checksum, the ticket is valid and the request is
authorised. Requests without a valid ticket are redirected to a
configurable URL which is expected to validate the user and generate a
ticket for them. This package includes both a sample C executable and a
Perl module for generating the cookies; implementations for other
environments should be relatively straightforward.
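
The checking step described above can be sketched in Python as follows. This is a simplified illustration of the idea only and is not mod_auth_tkt's actual ticket format: the field order, the NUL separator, the names `ticket_checksum` and `ticket_is_valid`, and the `SECRET` value are all assumptions.

```python
import hashlib
import hmac

SECRET = b"shared-secret-known-to-the-servers"  # hypothetical value


def ticket_checksum(user: str, user_data: str, ip: str, secret: bytes = SECRET) -> str:
    """MD5 over the username, optional user data, client IP and shared secret."""
    fields = "\x00".join([user, user_data, ip]).encode("utf-8")
    return hashlib.md5(fields + b"\x00" + secret).hexdigest()


def ticket_is_valid(user: str, user_data: str, ip: str, checksum: str,
                    secret: bytes = SECRET) -> bool:
    """Recompute the checksum for this request and compare it with the ticket's."""
    expected = ticket_checksum(user, user_data, ip, secret)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), checksum.encode("utf-8"))


ticket = ticket_checksum("geoff", "role=admin", "192.0.2.10")
print(ticket_is_valid("geoff", "role=admin", "192.0.2.10", ticket))   # True
print(ticket_is_valid("geoff", "role=admin", "192.0.2.99", ticket))   # False
```

Because the requesting IP address is part of the checksum, a ticket copied to another machine fails validation, which is the property the description relies on.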

Nov 23 '05 #11
Pierre-Frédéric,

PFC> You could use apache mod_auth_tkt :
PFC> http://www.openfusion.com.au/labs/mod_auth_tkt/

I think their own description of "lightweight" is a fair summary of
mod_auth_tkt.

My own approach needs to be more security-conscious. Secure web
sessions are an area that deserves more attention. The only good source
I know is:

http://cookies.lcs.mit.edu/pubs/webauth.html

The ease with which the MIT team were able to compromise so many
leading corporate sites is sobering.

My own approach is mainly a blend of the MIT ideas, the Yahoo ideas
reported in the latest version of the MIT paper, and the OpenACS
approach:

http://openacs.org/doc/openacs-5-1/security-design.html

But this is a bit OT here. If you want to carry on with this, perhaps
you could contact me off list?

------------------
Geoff Caplan
Vario Software Ltd
(+44) 121-515 1154

Nov 23 '05 #12
