General question on charging for data access

Frnak McKenney

Back when computer dinosaurs roamed the earth and the precursors to
today's Internet were tiny flocks of TDMs living symbiotically with
the silicon giants, tracking access to data processing resources was
much simpler: you logged in with a userID and password, and when
you were done you ended your session by logging out (or occasionally
by being disconnected). Connection time was easy to measure, and it
made sense to both the customer and the service provider as a rough
metric for value received by a customer.

Things have changed. The stateless design of HTTP makes this kind
of simple "time = value" approach to charging for services much more
difficult to implement, since there seems to be no way of easily
defining a "session" in its older sense. The term "session ID" used
by PHP and other HTTP support environments is somewhat misleading,
since, while it has a distinct beginning where authorization can be
tested, there is no event corresponding to a "session ending".

This is good for some purposes. As long as the HTTP server retains
its copy of session information (perhaps days or weeks) a browser
with the correct session ID can pick up a partially-completed user
transaction (e.g. a book order from Amazon.com) from where it left
off several hours earlier. In fact, for better or worse, a
different browser on a different system in a different location can
pick up the same partially-completed user transaction.

Trouble is, there are still some situations where it would be very
useful to track usage of some HTTP-based service, or limit access,
based on something like the older concept of a "session". One of my
customers provides online access to a small, limited-purpose
database, and until recently was able to maintain a simple
fixed-rate billing-by-userID scheme. This works tolerably well when
each userID comes from a single IP, but one of my customer's
customers would like to allow multiple individuals access to the
database from their internal network. Oh, and they'd like "instant
access", that is, recognition of (say) their IP as one automatically
authorized and not requiring an ID/password step to access the data.
My customer would like some way of limiting access so that no more
than N (initially one) of this customer's individual users is/are
"accessing the database" at one time, so that usage past a certain
limit will cost the customer more. Unfortunately, this idea of
"only N simultaneous users" seems extremely difficult to implement
cleanly in the HTTP environment.

From the server's point of view, each HTTP transaction is completely
defined by its contents, and persistence is available only through
the combined efforts of the browser and whatever "session memory"
the server provides for the browser's use. In theory (translation:
I haven't completely tested this) the server could distinguish
multiple end-user browsers making requests through the same IP
because each browser would be assigned a different "session ID", but
given the expected short duration of HTTP transactions (in this
case, on the order of seconds) it would be very unlikely for the
handling of transactions from different browsers to overlap.

Limiting the number of simultaneous "sessions" (in the PHP sense)
from a given IP is tricky. What defines when one browser's PHP
"session" ends so the server can let another one be initiated? Do
you attempt to require that each end-user go through a "logout" page
when they are done, something at odds with the user's experience
with every other 'web site? (And do you set up a switchboard to
handle the complaints when half the end users go to lunch without
bothering to "log out"? <grin>)

I came up with one promising approach using server-side persistent
non-session storage (an SQL table, say) and a fixed "expiration
time" (say 15 minutes). If the server stores the session ID, IP
address, and a timestamp for each HTTP transaction, the server can
declare a "quota exceeded" condition (refuse to assign new session
IDs) whenever a given IP address (or userID) has more than N
"unexpired" entries in the SQL table. This comes _close_ to
approaching the old idea of a "session", and I _think_ it can be
done through straight PHP and xxSQL; that is, without having to
rewrite any server internals <grin>. But it's really klugy, and I
sincerely hope some better method exists.
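
A minimal sketch of the expiring-entry table described above, written in Python with SQLite standing in for "xxSQL" (an assumption; the post has PHP in mind). The table name `activity`, the function `admit`, and the constants are hypothetical. One detail worth noting: to hold an IP to N sessions, a new session has to be refused once N unexpired entries already exist.

```python
import sqlite3
import time

EXPIRY_SECONDS = 15 * 60  # the "say 15 minutes" window from the post
MAX_ACTIVE = 1            # N, "initially one"


def open_store():
    """Create an in-memory table of (session ID, IP, last-seen timestamp)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE activity ("
        " session_id TEXT PRIMARY KEY,"
        " ip TEXT NOT NULL,"
        " last_seen REAL NOT NULL)"
    )
    return db


def admit(db, session_id, ip, now=None):
    """Record one HTTP transaction; return False on "quota exceeded"."""
    now = time.time() if now is None else now
    cutoff = now - EXPIRY_SECONDS
    known = db.execute(
        "SELECT 1 FROM activity"
        " WHERE session_id = ? AND ip = ? AND last_seen >= ?",
        (session_id, ip, cutoff),
    ).fetchone()
    if known is None:
        # New (or expired) session: count the other unexpired ones for this IP.
        (active,) = db.execute(
            "SELECT COUNT(*) FROM activity"
            " WHERE ip = ? AND session_id <> ? AND last_seen >= ?",
            (ip, session_id, cutoff),
        ).fetchone()
        if active >= MAX_ACTIVE:
            return False
    # Insert the session, or refresh its timestamp if it is already present.
    db.execute(
        "INSERT OR REPLACE INTO activity (session_id, ip, last_seen)"
        " VALUES (?, ?, ?)",
        (session_id, ip, now),
    )
    return True
```

Calling `admit()` at the top of every request both enforces the limit and refreshes the caller's timestamp, so an idle browser gives up its slot after 15 minutes without any "logout" step.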

Am I spending too much time and effort studying trees instead of the
forest? Is there some other way (or ways) of looking at the problem
of charging for access that would be a better fit for data access
through HTTP?

Any suggestions or comments will be welcomed.
Frank McKenney

Frank McKenney, McKenney Associates
Richmond, Virginia / (804) 320-4887
Munged E-mail: frank uscore mckenney ayut minds pring dawt cahm (y'all)

Jul 17 '05 #1


jblanch

Well, I'm going to attempt to weed out what you want, and some issues I
can foresee.

First off, the sessions are a great idea, but what about just an MD5
hash or some other encrypted value in a cookie? That way it's easier to
maintain and use, because I always have trouble changing my session
settings and things like that (unless it's a personal server, where you
could easily edit the php.ini file). What you could do is have the
one-time signup (the actual registration/activation) set the encrypted
cookie with the data you want. That cookie would be read when they
enter the site, put into the database with a timestamp, and checked.
What this cookie approach will solve is the inability to record IPs
correctly, because with all the wireless LANs out there, anyone can
have 2-100 computers on the same IP.

I think using the cookie with an xxSQL database is one of the only ways
to go, because you can easily erase the records from the database, and
the other table will have the stored identifiers, keeping the two at
low memory usage.

The HTTP overlapping situation is something I had occur when programming
with winsock. I would use simultaneous SendData calls and end up running
the data together, effectively crashing whatever I was doing. But I'm
almost certain that with the HTTP protocol, this "overlap" would cause a
server error and not get read and parsed to CGI at all, and it would
bring up some 40x, 50x, or other error.

Hope this helps. By the way, what kind of project are you
_actually_ working on? I hope I'm not helping with some ad/spam/adult
site :-/.

JBlanch
jblanch at gmail dot com
http://jblanch.us

Jul 17 '05 #2

Gordon Burditt

> by being disconnected). Connection time was easy to measure, and it
> made sense to both the customer and the service provider as a rough
> metric for value received by a customer.

Connection time no longer provides a good metric (if it ever did),
as it used to be limited mostly by bandwidth, and now it is not.

> Things have changed. The stateless design of HTTP makes this kind
> of simple "time = value" approach to charging for services much more
> difficult to implement, since there seems to be no way of easily
> defining a "session" in its older sense. The term "session ID" used
> by PHP and other HTTP support environments is somewhat misleading,
> since, while it has a distinct beginning where authorization can be
> tested, there is no event corresponding to a "session ending".

Consider this possibility: an HTTP transaction *IS* a session. (Not
in the PHP sense.) Charge for them, or at least some of them, like
submitting an order or getting query results. Time is not a good
metric here (the slower your server is, the WORSE service the user
got). Consider a charge per query or a charge per unit of data.

> This is good for some purposes. As long as the HTTP server retains
> its copy of session information (perhaps days or weeks) a browser
> with the correct session ID can pick up a partially-completed user
> transaction (e.g. a book order from Amazon.com) from where it left
> off several hours earlier. In fact, for better or worse, a
> different browser on a different system in a different location can
> pick up the same partially-completed user transaction.

A well-designed setup won't do this with any reliability if
sessions are done properly and wiretapping and spyware are kept
under control. Session IDs should be taken from a large enough
number space and generated randomly so they are not guessable.

> Trouble is, there are still some situations where it would be very
> useful to track usage of some HTTP-based service, or limit access,
> based on something like the older concept of a "session". One of my
> customers provides online access to a small, limited-purpose
> database, and until recently was able to maintain a simple
> fixed-rate billing-by-userID scheme. This works tolerably well when
> each userID comes from a single IP, but one of my customer's
> customers would like to allow multiple individuals access to the
> database from their internal network. Oh, and they'd like "instant
> access", that is, recognition of (say) their IP as one automatically
> authorized and not requiring an ID/password step to access the data.

That sounds a lot like "one page, one session, one charge". Remember,
if the users are taught to log out IMMEDIATELY (after saving the
results) to keep the number of simultaneous sessions down, they can
get a *LOT* of mileage out of one allowed simultaneous connection.
This suggests that the billing method isn't in touch with reality
and doesn't reflect value received.

> My customer would like some way of limiting access so that no more
> than N (initially one) of this customer's individual users is/are
> "accessing the database" at one time, so that usage past a certain
> limit will cost the customer more. Unfortunately, this idea of
> "only N simultaneous users" seems extremely difficult to implement
> cleanly in the HTTP environment.

Not only is it difficult to implement, it's difficult to even DEFINE
what a "simultaneous user" is.

> From the server's point of view, each HTTP transaction is completely
> defined by its contents, and persistence is available only through
> the combined efforts of the browser and whatever "session memory"
> the server provides for the browser's use. In theory (translation:
> I haven't completely tested this) the server could distinguish
> multiple end-user browsers making requests through the same IP
> because each browser would be assigned a different "session ID", but
> given the expected short duration of HTTP transactions (in this
> case, on the order of seconds) it would be very unlikely for the
> handling of transactions from different browsers to overlap.

Exactly. Consider charging per query. Or charge per significant
query (if it takes getting through 3 pages to do a query, charge
for the final one that gives useful results).

> Limiting the number of simultaneous "sessions" (in the PHP sense)
> from a given IP is tricky. What defines when one browser's PHP
> "session" ends so the server can let another one be initiated? Do

Nothing. Maybe you should consider whether it is APPROPRIATE to
generate a large bill for someone who forgets to log out over a
weekend, but generates no activity during that time. And is it
really appropriate to generate an even larger bill if one browser
does a few queries one year, the employee retires, and when it's
reactivated 3 years later, it still has the session open, so that's
three years of continuous connect time?

> you attempt to require that each end-user go through a "logout" page
> when they are done, something at odds with the user's experience
> with every other 'web site? (And do you set up a switchboard to
> handle the complaints when half the end users go to lunch without
> bothering to "log out"? <grin>)

Why should going to lunch and forgetting to log out cost a lot?

> I came up with one promising approach using server-side persistent
> non-session storage (an SQL table, say) and a fixed "expiration
> time" (say 15 minutes). If the server stores the session ID, IP
> address, and a timestamp for each HTTP transaction, the server can
> declare a "quota exceeded" condition (refuse to assign new session
> IDs) whenever a given IP address (or userID) has more than N
> "unexpired" entries in the SQL table. This comes _close_ to
> approaching the old idea of a "session", and I _think_ it can be
> done through straight PHP and xxSQL; that is, without having to
> rewrite any server internals <grin>. But it's really klugy, and I
> sincerely hope some better method exists.
>
> Am I spending too much time and effort studying trees instead of the
> forest?

Yes. You're trying to charge for man-hours in the forest when the
value received is how much they remove in logs to be turned into lumber.
Professional loggers get a bargain and the Cub Scouts get charged
a huge amount for one tree.

> Is there some other way (or ways) of looking at the problem
> of charging for access that would be a better fit for data access
> through HTTP?

Pay-per-query. Either a fixed amount, some measure of how much
data is returned, or a combination of both. For example, the
subscription says that for $x a month, they get 500 queries per
month, after that it costs $y per additional query. Or make it a
per-day or per-week limit. No need to count individual users (only
what subscription they have access under) at all.
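
As a rough illustration of the "$x a month for 500 queries, then $y per additional query" arithmetic, here is a small Python sketch. The prices are made-up stand-ins for $x and $y, and `monthly_charge` is a hypothetical name; only the 500-query allowance comes from the post.

```python
from decimal import Decimal


def monthly_charge(queries_used, base_fee, included_queries, overage_fee):
    """Flat fee covers the included queries; each extra query costs overage_fee."""
    extra = max(0, queries_used - included_queries)
    return base_fee + extra * overage_fee


# Hypothetical prices standing in for the post's $x and $y.
BASE_FEE = Decimal("50.00")
OVERAGE_FEE = Decimal("0.25")
INCLUDED = 500  # the post's example allowance

# A light month stays at the flat fee; a heavy month pays for 120 extra queries.
light_month = monthly_charge(120, BASE_FEE, INCLUDED, OVERAGE_FEE)
heavy_month = monthly_charge(620, BASE_FEE, INCLUDED, OVERAGE_FEE)
```

The only state the server needs is one query counter per subscription per billing period, which is the point of the suggestion: no per-user or per-session tracking is involved.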

Think: what kind of behavior do you wish to ENCOURAGE? Don't
charge extra for that. What kind of behavior do you wish to
DISCOURAGE? Charge extra for that. What kind of behavior gives
the customer value? Charge for that. Assume the customer will (or
at least may) modify his behavior to reduce his bill.

BAD: The user logs in, does a query, logs out quickly. Repeat as
necessary. This costs 3 hits on your server compared to
staying logged in.
GOOD: The user stays logged in for a few hours worth of queries, then
logs out.
BAD: The user logs in, issues one query which returns 50 GB, then
uses it all day. Repeat next day. This is hell on your
bandwidth.
GOOD: The user issues queries with a small amount of data returned,
all he needs to use for that query.
Gordon L. Burditt

Jul 17 '05 #3

Gordon Burditt

> The HTTP overlapping situation is something I had occur when programming
> with winsock. I would use simultaneous SendData calls and end up running
> the data together, effectively crashing whatever I was doing. But I'm
> almost certain that with the HTTP protocol, this "overlap" would cause a
> server error and not get read and parsed to CGI at all, and it would
> bring up some 40x, 50x, or other error.

A web server ought to be able to deal with multiple simultaneous
connections from the same IP address, whether or not they are
actually on the same client machine or whether NAT is distributing
them to dozens of different client machines. TCP considers a
connection to be (client IP, client port, server IP, server port),
and the client ports will be different (with a decent NAT implementation)
even if the other three are the same. Note that a browser will
typically open up several connections at a time, to fetch images
in parallel. It should just work.
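
To illustrate the four-tuple point, here is a small Python sketch that opens two connections from the same IP address to one listening socket on the loopback interface. It is only a local demonstration (the loopback address and the two-client setup are assumptions, not anything from the site under discussion), but it shows the server seeing two separate connections that differ only in client port.

```python
import socket

# Listen on an ephemeral loopback port (port 0 lets the OS choose one).
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(2)
server_addr = server.getsockname()

# Two clients on the same IP, standing in for two machines behind one NAT address.
clients = [socket.create_connection(server_addr) for _ in range(2)]
accepted = [server.accept() for _ in clients]

# Each accepted connection reports its own (client IP, client port) pair.
peers = [addr for _conn, addr in accepted]
same_ip = peers[0][0] == peers[1][0]
distinct_ports = peers[0][1] != peers[1][1]

for conn, _addr in accepted:
    conn.close()
for client in clients:
    client.close()
server.close()
```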

Server load is a different issue, but even a Pentium 100 machine
should be able to handle a few simultaneous connections without
slowing down too much.

It sounds like you ran into some winsock bug.

Gordon L. Burditt

Jul 17 '05 #4

Colin McKinnon

Frnak McKenney wrote:

> Back when computer dinosaurs roamed the earth and the precursors to

<snip long story>

This is the kind of problem where an application server can achieve a
solution more simply than a CGI-type solution (but as others have said, it
might be a lot better for a number of reasons to redefine the problem).

If you really want to go down this road, you might want to look at SRM (try
google for SRM PHP banana). Although not a full front controller, it may
simplify the process.

HTH

C.

Jul 17 '05 #5

Frnak McKenney

On 15 Dec 2004 20:40:13 -0800, jblanch <jb*****@gmail.com> wrote:

> Well, I'm going to attempt to weed out what you want, and some
> issues I can foresee.

JBlanch,

Thanks. I can use all the help I can get. <grin>

> First off, the sessions are a great idea, but what about just an MD5
> hash or some other encrypted value in a cookie? That way it's easier
> to maintain and use, because I always have trouble changing my
> session settings and things like that (unless it's a personal server,
> where you could easily edit the php.ini file). What you could do is
> have the one-time signup (the actual registration/activation) set the
> encrypted cookie with the data you want. That cookie would be read
> when they enter the site, put into the database with a timestamp, and
> checked. What this cookie approach will solve is the inability to
> record IPs correctly, because with all the wireless LANs out there,
> anyone can have 2-100 computers on the same IP.

Ah. I hadn't considered the idea of "tagging" the specific browser
with a unique, permanent (okay, a "persistent and machine-unique but
user-erasable") identifier. A copy of the ID stays with the site,
and, once the ID is "approved for use" by the site administrator,
any browser submitting that ID gets access to the site. Thanks --
you just added to my options.
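
A minimal Python sketch of that "tag the browser, then approve the tag" idea, under some assumptions: the identifier is a random token rather than an MD5 of anything, the site's copy lives in an in-memory dictionary where a real site would use its database, and the function names are hypothetical.

```python
import secrets

# Token -> approved flag; a real site would keep this in its database.
registered_tokens = {}


def issue_token():
    """Create an unguessable identifier to store in the browser's cookie."""
    token = secrets.token_hex(16)  # 128 random bits as 32 hex characters
    registered_tokens[token] = False  # not usable until an administrator approves it
    return token


def approve(token):
    """Administrator step: mark a previously issued token as approved."""
    if token not in registered_tokens:
        raise KeyError("unknown token")
    registered_tokens[token] = True


def has_access(cookie_value):
    """Grant access only to cookies carrying an issued and approved token."""
    return registered_tokens.get(cookie_value, False)
```

Because the token is copied along with the cookie, this identifies a browser profile rather than a person, and it does nothing by itself to limit how many approved browsers are active at once.
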

> The HTTP overlapping situation is something I had occur when
> programming with winsock. I would use simultaneous SendData calls
> and end up running the data together, effectively crashing whatever
> I was doing. But I'm almost certain that with the HTTP protocol,
> this "overlap" would cause a server error and not get read and
> parsed to CGI at all, and it would bring up some 40x, 50x, or other
> error.

Oops -- I've miscommunicated. My thinking has been that most of the
end-user HTTP transactions (e.g. POST, GET, INITIATE-SELF-DESTRUCT)
will complete within a few seconds at most, and that _HTTP_ overlap
is unlikely. What my customer wants is some form of limiting access
on a coarser-grained scale, but one that "makes sense" to my
customer's customers.

From the point of view of someone being _billed_ for database
access, "one person at a time for $X/month or $Y/year" is a
straightforward quantization of access and payment. An organization
sends in a check, and one person in that organization gets access to
the data for K months. From the site's point of view, doing it this
way means extremely simple billing. What gets hairy is actually
_implementing_ something along those lines -- this is one of those
things that _looks_ like a technical problem, but one that is being
driven by _people_. ('Course, that description also fits the
Internet, and the WorldWideWeb. <grin>)

> Hope this helps. By the way, what kind of project are you
> _actually_ working on? I hope I'm not helping with some
> ad/spam/adult site :-/.

Neither. It's a niche database used as a research tool, so testing
it isn't nearly as entertaining as it might be for an XXX-rated
site. <grin>

Thanks for the feedback.
Frank McKenney
--
There is no avoiding war; it can only be postponed to the advantage
of others. -- Niccolo Machiavelli

Jul 17 '05 #6

Frnak McKenney

Gordon,

Thanks for responding.

On 16 Dec 2004 05:21:20 GMT, Gordon Burditt <go***********@burditt.org> wrote:

>> by being disconnected). Connection time was easy to measure, and it
>> made sense to both the customer and the service provider as a rough
>> metric for value received by a customer.
>
> Connection time no longer provides a good metric (if it ever did),
> as it used to be limited mostly by bandwidth, and now it is not.

Yup. Also it generally involved an "authorization" step when the
connection was being established and a "disconnection", both of
which aren't explicitly available for HTTP transactions.

>> Things have changed. The stateless design of HTTP makes this kind
>> of simple "time = value" approach to charging for services much more
>> difficult to implement, since there seems to be no way of easily
>> defining a "session" in its older sense. The term "session ID" used
>> by PHP and other HTTP support environments is somewhat misleading,
>> since, while it has a distinct beginning where authorization can be
>> tested, there is no event corresponding to a "session ending".
>
> Consider this possibility: an HTTP transaction *IS* a session. (Not
> in the PHP sense.) Charge for them, or at least some of them, like
> submitting an order or getting query results. Time is not a good
> metric here (the slower your server is, the WORSE service the user
> got). Consider a charge per query or a charge per unit of data.

That's certainly _possible_, but it increases the complexity of the
billing process (currently a fixed fee for one user for K months) by
the proverbial order-of-magnitude. I'm hoping for an approach that
would be easy to understand, easy to implement, and easy to bill.
(Oh, and can I get fries with that? <grin>)

--snip--

> That sounds a lot like "one page, one session, one charge". Remember,
> if the users are taught to log out IMMEDIATELY (after saving the
> results) to keep the number of simultaneous sessions down, they can
> get a *LOT* of mileage out of one allowed simultaneous connection.
> This suggests that the billing method isn't in touch with reality
> and doesn't reflect value received.

Well... after working in this industry for three decades I'm not
sure I'd claim that billing methods need to be "in touch with
reality" to work <grin>, but I'll agree that, within broad limits,
any method for charging needs to reflect the perceived value
received by the user/customer. It also has to produce sufficient
revenue to cover the costs of the site, its administration, and some
measure of profit.

Human beings prefer predictability, not just in billing but in most
things in life. It reduces stress. If I've paid a fixed fee for
database access -- or for operating my automobile -- I can use it
or not use it without worrying about whether my use is affecting my
wallet. I've "chunked" the decision process into a simple Yes/No
to the question "Am I getting $Y out of this?".

From the site's point of view, of course, it would like some way of
distinguishing levels of use. As long as the end users are
generally scattered, flat-rate billing on a per-userID, per-month or
per-year basis is simple and makes sense to both parties. What's
tricky here is coming up with a "site-wide access license" in a way
that doesn't overstretch the current billing paradigm -- it's the
kind of shift that generates mythical billing constructs like "user
equivalents" <grin?>.

>> My customer would like some way of limiting access so that no more
>> than N (initially one) of this customer's individual users is/are
>> "accessing the database" at one time, so that usage past a certain
>> limit will cost the customer more. Unfortunately, this idea of
>> "only N simultaneous users" seems extremely difficult to implement
>> cleanly in the HTTP environment.
>
> Not only is it difficult to implement, it's difficult to even DEFINE
> what a "simultaneous user" is.

I think a definition can be _created_, but in order for the
definition to be acceptable it has to be recognized by all parties
as a workable measure of service.

> ... Consider charging per query. Or charge per significant
> query (if it takes getting through 3 pages to do a query, charge
> for the final one that gives useful results).

This is certainly possible, but doing this makes billing more
complex.

>> Limiting the number of simultaneous "sessions" (in the PHP sense)
>> from a given IP is tricky. What defines when one browser's PHP
>> "session" ends so the server can let another one be initiated? Do
>
> Nothing. Maybe you should consider whether it is APPROPRIATE to
> generate a large bill for someone who forgets to log out over a
> weekend, but generates no activity during that time. And is it
> really appropriate to generate an even larger bill if one browser
> does a few queries one year, the employee retires, and when it's
> reactivated 3 years later, it still has the session open, so that's
> three years of continuous connect time?

Exactly. Or, as I explained to my customer, it "represents certain
technical difficulties". <grin>

>> you attempt to require that each end-user go through a "logout" page
>> when they are done, something at odds with the user's experience
>> with every other 'web site? (And do you set up a switchboard to
>> handle the complaints when half the end users go to lunch without
>> bothering to "log out"? <grin>)
>
> Why should going to lunch and forgetting to log out cost a lot?

It would be reasonable if (as in the olden days) forgetting to log
out blocked access by other users. That doesn't apply here.

>> I came up with one promising approach using server-side persistent
>> non-session storage (an SQL table, say) and a fixed "expiration
>> time" (say 15 minutes).
>>
>> --snip--
>>
>> Am I spending too much time and effort studying trees instead of the
>> forest?
>
> Yes. You're trying to charge for man-hours in the forest when the
> value received is how much they remove in logs to be turned into lumber.
> Professional loggers get a bargain and the Cub Scouts get charged
> a huge amount for one tree.

Ah! Nice analogy.

Up until now, my customer (the site) has been charging forest access
on (say) an annual per-person basis, and the number of trees removed
has been limited by the needs of each individual. Now a local co-op
is asking for an annual permit to cover all of its members who have
fireplaces and is asking what we'd charge for this.

And the site and I are trying to figure out how to word the permit
so that (a) we don't feel that we're "giving the access away" if
we charge the same amount we would for (say) three per-person
licenses, and that (b) we can enforce the terms of the permit.

Is this any clearer?

>> Is there some other way (or ways) of looking at the problem
>> of charging for access that would be a better fit for data access
>> through HTTP?
>
> Pay-per-query. Either a fixed amount, some measure of how much
> data is returned, or a combination of both. For example, the
> subscription says that for $x a month, they get 500 queries per
> month, after that it costs $y per additional query. Or make it a
> per-day or per-week limit. No need to count individual users (only
> what subscription they have access under) at all.

FWIW, at this point most subscribers are "onesies" (which, of
course, is why we've managed to dodge this particular bullet for so
long <grin>).

> Think: what kind of behavior do you wish to ENCOURAGE? Don't
> charge extra for that. What kind of behavior do you wish to
> DISCOURAGE? Charge extra for that. What kind of behavior gives
> the customer value? Charge for that. Assume the customer will (or
> at least may) modify his behavior to reduce his bill.

Thank you for putting that into words. Again, the site would prefer
to work at a much coarser level than "transaction", but your
principles still apply.

Time to go fix some lunch and (ig!) think some more.
Frank McKenney
--
If you're riding' ahead of the herd, take a look back every now
and then to make sure it's still there. -- Will Rogers

Jul 17 '05 #7

Frnak McKenney

Colin,

Thanks for the reply.

On Thu, 16 Dec 2004 13:06:57 +0000, Colin McKinnon <co**************@andthis.mms3.com> wrote:

> Frnak McKenney wrote:
>
>> Back when computer dinosaurs roamed the earth and the precursors to
>
> <snip long story>
>
> This is the kind of problem where an application server can achieve a
> solution more simply than a CGI-type solution (but as others have said, it
> might be a lot better for a number of reasons to redefine the problem).
>
> If you really want to go down this road, you might want to look at SRM (try
> google for SRM PHP banana). Although not a full front controller, it may
> simplify the process.

? banana ? It's _definitely_ lunch time. <grin>

I'll take a look at it. Thanks for the suggestion.
Frank McKenney
--
"Human felicity is produced not so much by great pieces of
good fortune that seldom happen, as by little advantages that
occur every day." -- Benjamin Franklin / The Autobiography

Jul 17 '05 #8

Chung Leong

"Frnak McKenney" <fr***@far.from.the.madding.crowd.com> wrote in message
news:cq**************@newsread3.news.atl.earthlink.net...

> I came up with one promising approach using server-side persistent
> non-session storage (an SQL table, say) and a fixed "expiration
> time" (say 15 minutes). If the server stores the session ID, IP
> address, and a timestamp for each HTTP transaction, the server can
> declare a "quota exceeded" condition (refuse to assign new session
> IDs) whenever a given IP address (or userID) has more than N
> "unexpired" entries in the SQL table. This comes _close_ to
> approaching the old idea of a "session", and I _think_ it can be
> done through straight PHP and xxSQL; that is, without having to
> rewrite any server internals <grin>. But it's really klugy, and I
> sincerely hope some better method exists.

That's essentially what a PHP session does. If you generate your session ID
by concatenating the IP address with a random number, then you can see how
many sessions are from a particular IP by simply doing a
glob("$session_path/$ip_address.*").

If you want to simulate a traditional session more closely, set your session
expiration time to a low number (say 3 minutes), then add an invisible iframe
to your page that refreshes itself every minute. If a user continues to look
at a page, the session is preserved. If she closes the browser or navigates
to a different site, then the session quickly dies.
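
The two ideas above (per-IP session names counted with a glob, plus a short expiry kept alive by a periodic refresh) can be sketched in Python as follows. This is an illustration only: it uses plain files in a directory of your choosing rather than PHP's own session files, and `new_session`, `heartbeat`, and `live_sessions` are hypothetical names.

```python
import glob
import os
import secrets
import time

SESSION_TTL = 180  # the "say 3 minutes" expiry, in seconds


def new_session(session_dir, ip):
    """Create a session file named '<ip>.<random>' and return its path."""
    path = os.path.join(session_dir, f"{ip}.{secrets.token_hex(8)}")
    with open(path, "w"):
        pass  # an empty file is enough; only its name and mtime matter here
    return path


def heartbeat(path, now=None):
    """What the self-refreshing iframe's request would do: bump the timestamp."""
    now = time.time() if now is None else now
    os.utime(path, (now, now))


def live_sessions(session_dir, ip, now=None):
    """Count this IP's session files touched within the last SESSION_TTL seconds."""
    now = time.time() if now is None else now
    pattern = os.path.join(glob.escape(session_dir), glob.escape(ip) + ".*")
    return sum(
        1 for p in glob.glob(pattern) if now - os.path.getmtime(p) <= SESSION_TTL
    )
```

A new session would be refused when `live_sessions()` already equals the allowed N. A browser that is closed simply stops sending heartbeats and drops out of the count within three minutes.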

Jul 17 '05 #9
