Correct escaping of untrusted data

Geoff Caplan

Hi folks,

The thread on injection attacks was very instructive, but seemed to
run out of steam at an interesting point. Now you guys have kindly
educated me about the real nature of the issues, can I ask again
what effective escaping really means?

Are the standard escaping functions found in the PHP, Tcl etc APIs to
Postgres bombproof? Are there any encodings that might slip through
and be cast to malicious strings inside Postgres? What about functions
like convert(): could they be used to slip something through the
escaping function?

I don't really have enough knowledge in this area to be confident in
the results of my own experiments. Any advice from the more
technically savvy would be much appreciated.

------------------
Geoff Caplan
Vario Software Ltd
(+44) 121-515 1154
---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 23 '05 #1

Subscribe Post Reply

2156

Tom Lane

Geoff Caplan <ge***@variosoft.com> writes:

Are the standard escaping functions found in the PHP, Tcl etc APIs to
Postgres bombproof?
I dunno; you'd probably want to look at the source for each one you
planned to use, anyway, if you're being paranoid. As long as they
escape ' and \ they should be okay. If your source language allows
embedded nulls (\0) in strings you might want to reject those as well.
Are there any encodings that might slip through
and be cast to malicious strings inside Postgres?
All the supported encodings are supersets of ASCII, so I don't think
there is any such risk. There is a risk in the opposite direction
I think: if the escaping function doesn't know the encoding being used
it might think that one byte of a multibyte character is ' or \ and
try to escape it, thereby breaking the data. This could not happen in
"sane" encodings like UTF-8, however, just in the one or two Far Eastern
encodings that allow multibyte characters to contain bytes <= 0x7F.

Since you as the application programmer can control what client-side
encoding is used, the simplest answer here is just to be sure you're
using a sane encoding, or at least that the escaping function knows
the encoding you're using.
What about functions like convert(): could they be used to slip
something through the escaping function?

Don't see how. The issue is to be sure that the query string traveling
to the backend will be interpreted the way you expected. By the time
any server-side function executes it is far too late to change that
interpretation.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Nov 23 '05 #2

Olivier Guilyardi

Geoff Caplan wrote:

Are the standard escaping functions found in the PHP, Tcl etc APIs to
Postgres bombproof? Are there any encodings that might slip through
and be cast to malicious strings inside Postgres? What about functions
like convert(): could they be used to slip something through the
escaping function?

What about writing nessus plugin(s) or a specific scanner for these
escaping issues ? I don't know if a such thing already exists...

--
Olivier

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 23 '05 #3

Geoff Caplan

Tom,

Belated thanks for the info (I've been away from my desk).

Very helpful.

------------------
Geoff Caplan
Vario Software Ltd
(+44) 121-515 1154
---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 23 '05 #4

Lincoln Yeoh

At 11:09 AM 7/31/2004 -0400, Tom Lane wrote:

All the supported encodings are supersets of ASCII, so I don't think
there is any such risk.
Is the 7.4.x multibyte support bombproof? How would we avoid problems like
this:
http://groups.google.com/groups?hl=e...ii%40sra.co.jp

Summary of that problem: an invalid multibyte character "eats" the
following character.

I know it's fixed already, but is there a way to reduce exposure to such bugs?
There is a risk in the opposite direction I think: if the escaping
function doesn't know the encoding being used
it might think that one byte of a multibyte character is ' or \ and try to
escape it, thereby breaking the data.

Is the escaping function always consistent with the backend's
interpretation? Is it impossible for them to be inconsistent (e.g. they use
the same code to interpret data).

My concern is if the escaping function thinks one byte of a multibyte is \
but the rest of the backend doesn't then you can end up with an escaped
backslash which does not escape a naughty '...

Also: what is the proper/official way to deal with:

update tablea set data=3-? where a=1;

And the parameter is -1

Somehow ensure it's always like this?
update tablea set data=3 - ? where a=1;

This doesn't seem to be escaped safely for: DBD::Pg 1.22 (3 versions old)
with Postgresql 7.3.4

Would it be best to do the 3-? part in the application and then do update
tablea set data=? where a=1;

Possibly result in less CPU usage at backend too.

Regards,

Link.
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 23 '05 #5

Pierre-Frédéric Caillaud

Is the 7.4.x multibyte support bombproof? How would we avoid problems
like this:
http://groups.google.com/groups?hl=e...ii%40sra.co.jp
Well, maybe using UTF-8 encoding would fix this ?
update tablea set data=3-? where a=1;
Add parentheses :
update tablea set data=3-(?) where a=1;

Or do it in your program... but you can't do this if you have a db field
or function instead of the 3.

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 23 '05 #6

Geoff Caplan

Hi folks

I'm designing a table to be used for web session management. If all
goes well with the project, the table should have 100,000+ records and
be getting hammered with SELECTS, INSERTS and UPDATES.

The table will need a technical key. The question is, what is the most
efficient way to do this?

a) Generate a random 24 character string in the application. Very
quick for the INSERTs, but will the longer key slow down the the
SELECTs and UPDATES?

b) Use a sequence. Faster for the SELECTS and UPDATES, I guess, but
how much will the sequence slow down the INSERTS on a medium sized
record-set?

There will probably be 6-8 SELECTs & UPDATEs for each INSERT.

I appreciate that I could set up some tests, but I am under the hammer
time-wise. Some rule-of-thumb advice from the list would be most
welcome.

------------------
Geoff Caplan
Vario Software Ltd
(+44) 121-515 1154
---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org

Nov 23 '05 #7

Christopher Browne

After a long battle with technology, ge***@variosoft.com (Geoff Caplan), an earthling, wrote:

b) Use a sequence. Faster for the SELECTS and UPDATES, I guess, but
how much will the sequence slow down the INSERTS on a medium sized
record-set?

Why, in particular, would you expect the sequence to slow down
inserts? They don't lock the table.

Note that if you're really doing a lot of INSERTs in parallel, you
might find it worthwhile to configure the sequence to cache some
number of entries so that they are pre-allocated and stored in memory
for each session (e.g. - for each connection) for quicker access. See
the documentation for "create sequence" for more details...
--
output = reverse("gro.gultn" "@" "enworbbc")
http://www3.sympatico.ca/cbbrowne/x.html
Think of C++ as an object-oriented assembly language.

Nov 23 '05 #8

Bruno Wolff III

On Thu, Aug 12, 2004 at 13:05:45 +0100,
Geoff Caplan <ge***@variosoft.com> wrote:

b) Use a sequence. Faster for the SELECTS and UPDATES, I guess, but
how much will the sequence slow down the INSERTS on a medium sized
record-set?

Using a sequence shouldn't be slow. The main potential problem is that
it will make the session IDs guessible if you don't take any other
steps. That may or may not be a problem. One way around this is to
encrypt the sequence number in the database with a key and use a combination
of the encrypted string and an index for which key is used (this makes
changing keys for new sessions while allowing continued use of an old
key for old sessions) as the session id. You can change the keys as often
as needed and practical for your application.

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 23 '05 #9

Geoff Caplan

Bruno Wolff III wrote:

Using a sequence shouldn't be slow.
Thanks - that's the main thing I need to know.
The main potential problem is that it will make the session IDs
guessible if you don't take any other steps. That may or may not
be a problem.
Thanks for the warning, but I won't be using the sequence number as
the session id: as you say, not a safe thing to do. The session record
key persists from session to session: it is used to link sessions with
browsers and with user accounts. The session key will be a random 32
character key generated for each session.

Christopher Browne wrote:
Why, in particular, would you expect the sequence to slow down
inserts? They don't lock the table.
I was assuming that generating the sequence number was expensive: it
is some other DBs I have used. That was why I was thinking of
providing a unique id via a random string. But a practical test shows
that in PG it is pretty fast, so there is not need.
Note that if you're really doing a lot of INSERTs in parallel, you
might find it worthwhile to configure the sequence to cache some
number of entries so that they are pre-allocated and stored in memory
for each session (e.g. - for each connection) for quicker access. See
the documentation for "create sequence" for more details...

I think that would be worthwhile.

Thanks for the input, folks.

------------------
Geoff Caplan
Vario Software Ltd
(+44) 121-515 1154
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 23 '05 #10

Pierre-Frédéric Caillaud

You could use apache mod_auth_tkt :
http://www.openfusion.com.au/labs/mod_auth_tkt/

Its main advantage is that it'll authentify a user, hence your script
gets the user ID, which you can use as a key in your session table for
instance.
Cut & paste for the lazies :

mod_auth_tkt is a lightweight cookie-based authentication module for
Apache 1.3.x, written in C. It implements a single-signon framework that
works across multiple apache instances and multiple machines. The actual
authentication is done by a user-supplied CGI or script in whatever
language you like (examples are provided in Perl), meaning you can
authenticate against any kind of user repository you can access (password
files, ldap, databases, etc.)

mod_auth_tkt supports inactivity timeouts (including the ability to
control how aggressively the ticket is refreshed), the ability to include
arbitrary user data within the cookie, configurable cookie names and
domains, and token-based access to subsections of a site.

mod_auth_tkt works by checking incoming Apache requests for a (user-
defined) cookie containing a valid authentication ticket. The ticket is
checked by generating an MD5 checksum for the username and any (optional)
user data from the ticket together with the requesting IP address and a
shared secret available to the server. If the generated MD5 checksum
matches the ticket's checksum, the ticket is valid and the request is
authorised. Requests without a valid ticket are redirected to a
configurable URL which is expected to validate the user and generate a
ticket for them. This package includes both a sample C executable and a
Perl module for generating the cookies; implementations for other
environments should be relatively straightforward.

Hi folks

I'm designing a table to be used for web session management. If all
goes well with the project, the table should have 100,000+ records and
be getting hammered with SELECTS, INSERTS and UPDATES.

The table will need a technical key. The question is, what is the most
efficient way to do this?

a) Generate a random 24 character string in the application. Very
quick for the INSERTs, but will the longer key slow down the the
SELECTs and UPDATES?

b) Use a sequence. Faster for the SELECTS and UPDATES, I guess, but
how much will the sequence slow down the INSERTS on a medium sized
record-set?

There will probably be 6-8 SELECTs & UPDATEs for each INSERT.

I appreciate that I could set up some tests, but I am under the hammer
time-wise. Some rule-of-thumb advice from the list would be most
welcome.

------------------
Geoff Caplan
Vario Software Ltd
(+44) 121-515 1154
---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 23 '05 #11

Geoff Caplan

Pierre-Frédéric,

PFC> You could use apache mod_auth_tkt :
PFC> http://www.openfusion.com.au/labs/mod_auth_tkt/

I think their own description of "lightweight" is a fair summary of
mod_auth.

My own approach needs to be a more security conscious. Secure web
sessions is an area that deserves more attention. The only good source
I know is:

http://cookies.lcs.mit.edu/pubs/webauth.html

The ease with which the MIT team were able to compromise so many
leading corporate sites is sobering.

My own approach is mainly a blend of the MIT ideas, the Yahoo ideas
reported on the the latest version of the MIT paper, and the OpenACS
approach:

http://openacs.org/doc/openacs-5-1/security-design.html

But this is a bit OT here. If you want to carry on with this, perhaps
you could contact me off list?

------------------
Geoff Caplan
Vario Software Ltd
(+44) 121-515 1154
---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 23 '05 #12

by: Dave Moore | last post by:

Hi All, Can anybody point me to a FAQ or similar that describes what all this stuff is about please?. I'm interfacing with a MySQL database if that's relavent. I've read a couple of books which...

PHP

Escaping shell commands

by: Inspector Chan | last post by:

Hi, I'm using some external data on shell commands which are to be executed with os.system (other functions doesn't provide enough flexibility for executing these shell lines). So I have...

Python

Directory names from untrusted data

by: Jim Dabell | last post by:

I'm in the middle of writing a small app for Linux that needs to create directories that take their names from untrusted data. If possible, I'd like to preserve special characters rather than...

Python

Correct way to handle independent interpreters when embedding in asingle-threaded C++ app

by: Craig Ringer | last post by:

Hi folks I'm a bit of a newbie here, though I've tried to appropriately research this issue before posting. I've found a lot of questions, a few answers that don't really answer quite what I'm...

Python

Sanitizing untrusted code for eval()

by: Jim Washington | last post by:

I'm still working on yet another parser for JSON (http://json.org). It's called minjson, and it's tolerant on input, strict on output, and pretty fast. The only problem is, it uses eval(). It's...

Python

PHP, mysql, and escaping characters

by: Taras_96 | last post by:

Hi everyone, I'm having a bit of trouble understanding the purpose of escaping nulls, and the use of addcslashes. Firstly, the manual states that: "Strictly speaking, MySQL requires only...

PHP

Executing Untrusted Code

by: Ben | last post by:

Hello, I've been developing apps in Delphi for years and have just started writing my first big project in c# + ms .net and have some questions about security and untrusted code. I've got an...

.NET Framework

firefox doesn't like 'disable-output-escaping' setting... looking for alternatives

by: David Henderson | last post by:

I know 'disable-output-escaping' has been discussed in the past, but I can't put my finger on any of the threads to see if my current problem is addressed. Sorry for re-asking the question if it...

.NET Framework

Is a closure's scope accessible by untrusted code?

by: Andrey Fedorov | last post by:

Is the scope of a closure accessible after it's been created? Is it safe against XSS to use closures to store "private" auth tokens? In particular, in... ....can untrusted code access...

Javascript

Easy Steps to Fix "Canon Printer Won't Connect to WiFi Network"

by: taylorcarr | last post by:

A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...

General

Batch import of multiple excel files into the database

by: ryjfgjl | last post by:

If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...

Data Management

Merging data from multiple Excel files

by: ryjfgjl | last post by:

In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...

Data Management

Migrating Website to Cloud - Emmanuel Katto

by: emmanuelkatto | last post by:

Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel

General

Navigating the Data Structures and Algorithms (DSA)

by: BarryA | last post by:

What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...

Algorithms / Advanced Math

Looking to do Android software development, any suggestions? Is flutter better?

by: nemocccc | last post by:

hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?

General

Is that possible of reading the .csv file in column wise and the column have different lengths ?

by: Sonnysonu | last post by:

This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...

C / C++

How to build RAID in BIOS?

by: Hystou | last post by:

There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

Computer Hardware

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...

Windows Server

Correct escaping of untrusted data

Similar topics