[cvs] [Wiki] created: ChucksHorde4Thoughts
Chuck Hagenbuch
chuck at horde.org
Tue Jul 8 02:33:02 UTC 2008
chuck Mon, 07 Jul 2008 22:33:02 -0400
Created page: http://wiki.horde.org/ChucksHorde4Thoughts
[[toc]]
+ Chuck's Horde 4 Thoughts
++ Unsorted
horde debugging:
http://www.sitepoint.com/blogs/2008/05/13/useful-in-browser-development-tools-for-php/
chuckhagenbuch: we could use standard ways of gathering debug output...
chuckhagenbuch: we could have a 'debug' driver...
chuckhagenbuch: specify an extra parameter like 'debug_driver' to say
what real subclass
to use...
chuckhagenbuch: it would be able to intercept all calls and return
values. that'd be
better than nothing...
AlkernF: could it be generic enough to work with
everything? and how do you intercept calls?
chuckhagenbuch: it would probably have to be written for each driver type
chuckhagenbuch: and by intercept, i mean that you would have it
instantiate a real driver, keep it as a variable, and then on every
method call, you can see the parameters, then you pass them to the real
driver, look at the return value, then pass the return value back...
chuckhagenbuch: (by driver type i mean Connection, Prefs, etc... -
API, i guess)
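A minimal sketch of the intercepting "debug" driver described in the chat above. All names here (Debug_Driver, Example_Prefs_Driver, the $log property) are hypothetical and not existing Horde classes. The chat suggests writing one wrapper per driver type; this sketch instead uses PHP's __call() as a generic stand-in, which only works for callers that do not type-check the driver.

```php
<?php
// Hypothetical stand-in for a real driver (e.g. a Prefs driver).
class Example_Prefs_Driver
{
    private $values = array('theme' => 'default');

    public function getValue($name)
    {
        return isset($this->values[$name]) ? $this->values[$name] : null;
    }
}

// Hypothetical debug wrapper: holds the real driver and records every call.
class Debug_Driver
{
    private $real;
    public $log = array();

    public function __construct($real)
    {
        $this->real = $real;
    }

    // Invoked for any method not defined on the wrapper itself.
    public function __call($method, $args)
    {
        // Pass the parameters to the real driver, then record and return the result.
        $result = call_user_func_array(array($this->real, $method), $args);
        $this->log[] = array('method' => $method, 'args' => $args, 'return' => $result);
        return $result;
    }
}

$driver = new Debug_Driver(new Example_Prefs_Driver());
$theme = $driver->getValue('theme');
```

After the call, $driver->log holds one entry with the method name, its arguments, and the return value, which could then be handed to whatever debug output channel is in use.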
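// Return a 304 if the file hasn't been modified since the If-Modified-Since date.
// There is no point in resending all the data if the browser already has it cached.
// Assumes $serve_data['modified_time'] was set earlier to the file's mtime (Unix timestamp).
if (function_exists('apache_request_headers')) {
    $headers = apache_request_headers();
    if (!empty($headers['If-Modified-Since'])) {
        $ims = strtotime($headers['If-Modified-Since']);
        // strtotime() returns false for an unparseable date; skip the check in that case.
        if ($ims !== false && $ims >= $serve_data['modified_time']) {
            header('HTTP/1.0 304 Not Modified');
            exit(0);
        }
    }
}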
horde apps - "instance" of a horde app == installed horde app + a
group of horde_policies that configure it
let those policies be named
instead of using shipping/foo api calls, use $instance->foo()
$Horde->api->method() (chaining)?
I just went through my first signup process that required an
SMS-capable device for confirmation. It also didn't make me pick my
credit card type, and instead used my country code (+1) to decide on a
card detection algorithm.
update_client.pl /modules/future_contribution /modules/future_signup
I think I have now found the right mysql-server settings, with which
performance is quite OK. Increasing sort_buffer_size was one of
the changes that helped.
skip-external-locking
skip-thread-priority
key_buffer = 64M
max_connections = 1024
max_connect_errors = 1000
max_allowed_packet = 8M
table_cache = 512
sort_buffer_size = 8M
read_buffer_size = 1M
read_rnd_buffer_size = 2M
myisam_sort_buffer_size = 64M
thread_cache_size = 50
query_cache_size = 128M
tmp_table_size= 1024M
thread_concurrency = 12
wait_timeout = 60
interactive_timeout = 60
log_slow_queries
add dynamic finders (find_by_name, find_by_id, etc.) to Rdo Mappers or
Horde_Db_Model or whatever
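One way the dynamic finders could work, sketched with PHP's __call(). Example_Mapper and its in-memory rows are hypothetical; they stand in for an Rdo mapper or Horde_Db_Model backed by a real table, and findBy() is an assumed generic lookup, not an existing API.

```php
<?php
// Hypothetical mapper holding rows in memory instead of querying a database.
class Example_Mapper
{
    private $rows;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    // Generic lookup that the dynamic finders delegate to.
    public function findBy($field, $value)
    {
        $matches = array();
        foreach ($this->rows as $row) {
            if (array_key_exists($field, $row) && $row[$field] == $value) {
                $matches[] = $row;
            }
        }
        return $matches;
    }

    // Turn find_by_<field>($value) into findBy('<field>', $value).
    public function __call($method, $args)
    {
        if (strpos($method, 'find_by_') === 0 && count($args) === 1) {
            return $this->findBy(substr($method, strlen('find_by_')), $args[0]);
        }
        throw new BadMethodCallException("Unknown method: $method");
    }
}

$mapper = new Example_Mapper(array(
    array('id' => 1, 'name' => 'imp'),
    array('id' => 2, 'name' => 'kronolith'),
));
$found = $mapper->find_by_name('kronolith');
```

A real mapper would translate the field name into a WHERE clause (after checking it against the known columns) rather than scanning an array.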
Controller classes/objects vs. Action classes/objects vs. Resources vs. API
how to develop? give up central config?
http://www.w3.org/Provider/Style/URI
index.php - global dispatcher
how to do themes/custom templates? chain local -> app -> horde?
a horde 4 installation:
config/
lib/
apps/
public/ <- with app/ subdirs containing images, etc.
everything routable goes in apps/
apps/
login/
help/
prefs/
admin/
etc...
... auto-install web files to a writable dir, either in web ui or in
cli? keep apps self-contained that way?
app name is the first part of the route > /login
subdomain support
route aliases
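A rough sketch of "app name is the first part of the route" combined with route aliases. resolve_app() and the alias map are hypothetical names for illustration; query strings and subdomains are not handled here.

```php
<?php
// Hypothetical helper: split a request path into an app name and the remaining segments.
function resolve_app($path, array $aliases = array())
{
    // "/mail/inbox/42" becomes array('mail', 'inbox', '42'); empty segments are dropped.
    $segments = array_values(array_filter(explode('/', $path), 'strlen'));
    if (!$segments) {
        return array('app' => null, 'rest' => array());
    }
    $name = array_shift($segments);
    // A route alias maps a public name onto an installed app.
    if (isset($aliases[$name])) {
        $name = $aliases[$name];
    }
    return array('app' => $name, 'rest' => $segments);
}

$route = resolve_app('/mail/inbox/42', array('mail' => 'imp'));
```

Here /mail/inbox/42 resolves to the imp app with the remaining segments left for the app's own routes, which matches the idea of installing imp under a slug such as "mail".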
an app should uncompress over a horde/ dir - /config/app/*.php ->
config dir is compiled/cached
horde is not rails. it is designed as a container for multiple,
collaborating apps
horde apps are configured by Horde_Policy objects
need Horde_Db, whatever implements DML, DDL, and SQL - Mad? MDB2?
- prefer PHP over XML
merge Rdo and Mad into Horde_Db
allow for overriding the mappers so that non-SQL can be used, but,
default to SQL/sqlite and leverage it
framework repository/module
Horde/lib/...
Rampage/lib/...
use subpackages or multiple *.xml for packages to avoid silliness?
apps should be installable into a horde container. shouldn't be tied
to the app name - keep imp, krono, etc, but install as mail, cal,
events (should be able to install two versions of krono w/ different
permissions - see HordeSpaces)
installing gives a slug, that slug manages config, templates, themes,
perms, etc.
----
figure out how to merge luxor into Chora
----
for now, build Horde_Content_* based on Rdo, then move to Horde_Db
Horde_Db provides Horde_Db_Mapper which creates Horde_Model_Base objects
apps have a config/ dir, but that's just defaults and defining base
routes, policies, etc. user settings are stored in the db or a global
directory.
should have parallel web and cli configuration and installation/update
tools; web requires webserver to have write access to a config/ dir
and to public/; cli tools do not (if run as another user)
Horde 4 app - a Horde 3.x app updated for PHP 5 and to use the latest
libraries
Rampage app - "RAD" (rapid application development) MVC app that uses Horde 4
/horde/page/ -> dispatcher for Rampage modules w/ views (overridable),
routes, controllers, etc.?
have generic views for rampage_login, rampage_admin_*, etc.
configuration:
config/routes.php
config/routes_local.php -> do this for all config files
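A sketch of how the foo.php / foo_local.php convention might be loaded. load_config() is a hypothetical helper, and the assumption that each config file returns an array is mine, not something the notes specify. The demo writes two throwaway files to the system temp directory so it can run on its own.

```php
<?php
// Hypothetical loader: shipped defaults first, then optional local overrides.
function load_config($dir, $name)
{
    $config = require "$dir/$name.php";
    $localFile = "$dir/{$name}_local.php";
    if (is_file($localFile)) {
        $local = require $localFile;
        // Local values win over shipped defaults, key by key.
        $config = array_replace($config, $local);
    }
    return $config;
}

// Demo setup: a temporary config/ directory with a default and a local routes file.
$dir = sys_get_temp_dir() . '/routes_demo_' . uniqid();
mkdir($dir);
file_put_contents("$dir/routes.php", "<?php return array('/login' => 'login', '/help' => 'help');");
file_put_contents("$dir/routes_local.php", "<?php return array('/help' => 'custom_help');");

$routes = load_config($dir, 'routes');
```

Keeping local changes in a separate file means an app upgrade can overwrite routes.php without touching site-specific settings.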
Horde_Content_Index -> horde-wide search
Random Horde Ideas
mini-cms for building your own sidebar/menu/etc?
- shortcuts to any bit of horde
labels labels labels
keywords also or just labels? probably just flexible labels
"smart folders"
Getting Things Done support? (other apps that do it - Tracks, Kinkless
GTD, Midnight Inbox)
make mnemo into more of a snippet keeper? sort of like a personal cms
- or wiki. carry the encryption feature through to other kinds of
content
create an outliner!
tags/labels for mail
rename virtual folders to smart folders? too apple?
freetext boolean mail searches:
apples & oranges
apples | oranges
apples ! oranges (apples but not oranges)
apples & (oranges | lemons)
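A small sketch of how the four query forms above could be evaluated against a piece of text. Everything here is an assumption for illustration: the function names are hypothetical, the operators are applied left to right with equal precedence (parentheses override), "!" is read as "and not", and a word matches by case-insensitive substring.

```php
<?php
// Split a query into words, operators and parentheses.
function tokenize_query($query)
{
    preg_match_all('/[()&|!]|[^\s()&|!]+/', $query, $m);
    return $m[0];
}

// expr := term (('&' | '|' | '!') term)*
function eval_expr(array &$tokens, $text)
{
    $value = eval_term($tokens, $text);
    while ($tokens && in_array($tokens[0], array('&', '|', '!'), true)) {
        $op = array_shift($tokens);
        $right = eval_term($tokens, $text);
        if ($op === '&') {
            $value = $value && $right;
        } elseif ($op === '|') {
            $value = $value || $right;
        } else {
            $value = $value && !$right; // "a ! b" means a but not b
        }
    }
    return $value;
}

// term := WORD | '(' expr ')'
function eval_term(array &$tokens, $text)
{
    $token = array_shift($tokens);
    if ($token === null || in_array($token, array('&', '|', '!', ')'), true)) {
        throw new InvalidArgumentException('Expected a word or "("');
    }
    if ($token === '(') {
        $value = eval_expr($tokens, $text);
        if (array_shift($tokens) !== ')') {
            throw new InvalidArgumentException('Missing ")"');
        }
        return $value;
    }
    return stripos($text, $token) !== false;
}

function matches_query($query, $text)
{
    $tokens = tokenize_query($query);
    $result = eval_expr($tokens, $text);
    if ($tokens) {
        throw new InvalidArgumentException('Unexpected token: ' . $tokens[0]);
    }
    return $result;
}

$hit = matches_query('apples & (oranges | lemons)', 'Apples with lemons');
```

For real mail searches the same parse would more likely be translated into IMAP SEARCH criteria or SQL than evaluated against message text in PHP.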
security of redirects:
http://www.xssed.com/mirror/39494/
This is sort of an interesting one. For the actual attack he merely
figured out that we are base64 encoding the successurl and reflecting
back whatever is there. The interesting thing is that merely
filtering the unencoded data is not going to save us here. The string
was javascript:alert(/XSS.By.Mityo/) and was being loaded into the URL
field of a meta redirect. So our max filter of strip_tags is useless.
It just illustrates the rationale for Phase 2 of the security build
out where we have to be careful when we are dealing with redirects.
In this particular case, we need to make sure we are getting a valid
URL format. That will prevent javascript insertions. But we also
want to make sure the URL is not redirecting outside the intended
domain for some phishing scam. In this case I will fix the problem by
validating the URL on the cons/login.inc.php where the data is coming
in but will also try doing it on the generic show_redirect_message()
call if I think I can do so without breaking other pages.
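The two checks described above (valid URL format, and no redirect outside the intended domain) might look roughly like this. is_safe_redirect() and the allowed-host list are hypothetical; the real fix in login.inc.php or show_redirect_message() is not shown in the source. This sketch deliberately rejects relative URLs too, which a real implementation may want to allow.

```php
<?php
// Hypothetical check: accept only absolute http(s) URLs on a whitelisted host.
function is_safe_redirect($url, array $allowed_hosts)
{
    $parts = parse_url($url);
    if ($parts === false || empty($parts['scheme']) || empty($parts['host'])) {
        return false;
    }
    // Blocks javascript:, data: and any other non-web scheme.
    if (!in_array(strtolower($parts['scheme']), array('http', 'https'), true)) {
        return false;
    }
    // Blocks redirects to other domains (phishing).
    return in_array(strtolower($parts['host']), $allowed_hosts, true);
}

// The attack string from the report, base64 encoded as the successurl was.
$payload = base64_encode('javascript:alert(/XSS.By.Mityo/)');
$decoded = base64_decode($payload, true);
$attack_allowed = is_string($decoded) && is_safe_redirect($decoded, array('example.com'));
$normal_allowed = is_safe_redirect('https://example.com/home', array('example.com'));
```

The point is that validation has to run on the decoded value and has to understand URL structure; strip_tags() sees nothing to strip in a javascript: URL.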
Event-driven apps:
"Understanding and implementing this event model can free your
application from the constraints of defined elements. For example,
instead of applying an event listener for each link in a menu, you can
assign a single listener to the menu item itself and retrieve the
event target. That way you don't need to change your script when the
menu gets larger or when links get removed from it."
http://yuiblog.com/blog/2007/01/17/event-plan/
tagging/instant hierarchies as specialized permission-based search
RBAC
what is horde?
groupware?
horde data services?
horde data access?
ui layers
be the php dojo framework? or the php yui framework?
see http://tigermouse.epsi.pl/ ?
or, don't do desktop-like widgets? see UI design bookmarks
move away from gettext, at least as a default? midgard i18n notes:
http://www.midgard-project.org/discussion/developer-forum/midgard-s-multilang-support/
try to rely only on thread-safe extensions?
reduce dependency tree
avoid globals and non horde-namespaced functions/methods in framework
and core app code
class-based registry apis
against edge cases: http://www.bakesalehq.com/contents/show/12/
features from Prado? http://www.urdalen.com/blog/?p=198
use functions where appropriate for shortcuts/helpers, like Mike's
t("translated string") function? but would be horde_t? would call
configured translation system
helper sets for dojo, protaculous, yui - simple functions like
dojo_editor(), dojo_pane(), yui_map(), etc. Load with something like
Horde/Layout/Helpers/YUI.php, etc. See http://www.ngcoders.com/projax/
Horde as a set of apps and methodology needs to pick a js lib, pick a
template methodology, etc. - this is Rampage. Horde as a framework can
allow for flexibility.
To make it even better, separate the control logic from the
presentation. That way, back could be reverse, etc. I do this in all
my forms since application logic and presentation "word play" are two
distinct things to me. This is what I use:
<form method="post" action="form.php">
<input name="submit[back]" value="reverse" type="submit" />
<input name="submit[next]" value="speed ahead" type="submit" />
<input name="submit[home]" value="no place like home" type="submit" />
</form>
Then, you can have a simple routine that captures submit actions
regardless of the presentation value. You check for the array submit
-- count 1 and whitelist against the acceptable values. A multi-row
table can expand upon the theme by using this: submit[edit_3],
submit[delete_3], submit[edit_5], etc.
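The "simple routine" mentioned above is not shown in the quoted post; a sketch of what it could look like follows. submitted_action() is a hypothetical name, and $post stands in for $_POST.

```php
<?php
// Hypothetical routine: return the whitelisted action name, or null if the request is not valid.
function submitted_action(array $post, array $allowed)
{
    // Exactly one submit[...] key must be present.
    if (!isset($post['submit']) || !is_array($post['submit']) || count($post['submit']) !== 1) {
        return null;
    }
    $keys = array_keys($post['submit']);
    $action = (string) $keys[0];
    // The button label (the array value) is ignored; only the key matters.
    return in_array($action, $allowed, true) ? $action : null;
}

$action = submitted_action(
    array('submit' => array('back' => 'reverse')),
    array('back', 'next', 'home')
);
```

For the multi-row variant (submit[edit_3], submit[delete_3]) the key could be split at the last underscore, with the verb whitelisted and the row id validated as an integer.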
caching
make sure Rdo and other services allow dropping in caching rules
http://sebastian-bergmann.de/pages/talks.html
phpunit - @test markup in methods
phpunit + selenium
cruise control?
really hope google will integrate any product of theirs with any other
products of theirs? receive an email, transform it to document, add
spreadsheet, add notes, add bookmarks saved from search history and a
link to an event in calendar anyone?
From nyphp-talk:
The other day I had to get an application started in a hurry. It's
doing something useful at < 700 lines, but I'm considering options that
could grow it out to about 10 times that. It depends on a "core
library" that's < 500 lines. This library deals with common issues in
string handling, parameter handling, and HTML form generation.
About 10% of the application, or 70 lines, is a microframework
that's loosely built on Struts. About 20 of those lines are in 2
functions which would be generally useful for microframeworks (such as
file_exists_in_include_path()). Like Struts, the microframework
chooses an "action" based on form parameters: the action then chooses a
"view" -- a "view" is basically a template that a designer can edit
which can be supplemented by an optional "query" which pulls stuff out
of the database. Like Ruby-on-Rails, the microframework uses
convention instead of configuration: the dispatcher computes an "action
name" based on query parameters, and uses that to compute a
filename... It checks that the file exists and executes it with the
"require method".
The microframework uses no object-oriented techniques. That's not
because I have any antipathy to OO, but because I didn't need it, and
I like writing my actions, queries, and views in a style that "feels
like PHP".
Yes, my microframework is nowhere near as powerful as CakePHP or
Symfony. Yet, it's more flexible, because I can codesign it with my
application. Because it's so simple, I can easily adapt it to do what
I want. If I decide I really hate it, I can write a new one in an
hour. I'm an expert on it, because I developed it, and I wouldn't
have to take on the technical, social and emotional burdens of
"forking" an open-source codebase if I wanted to make a change in direction.
I'm moving towards a vision of web app architecture where we move
towards shared vocabulary and standardized interfaces. Rather than
working with a "comprehensive framework" that does everything, I'd like
to have a "framework construction set" that contains a number of
elements that I can take or leave."
Resources:
http://www.ryandaigle.com/articles/2006/06/30/whats-new-in-edge-rails-activeresource-is-here
mixins: http://www.symfony-project.com/book/trunk/17-Extending-Symfony
split db ideas: http://pear.php.net/pepr/pepr-proposal-show.php?id=359
http://dataspill.org/pages/projects/ruby-activeldap
More php features to look in to:
__toString works everywhere
SPL features: Regex Iterators, SplFileObject CSV support, Caching Iterator
Data: stream support
DateTime and DateTimeZone classes
set date.timezone ini setting automatically based on user?
Search engine sitemap stuff - of use at all? maybe support in rampage cms
http://p7.hostingprod.com/@www.ysearchblog.com/archives/000437.html
5. I want a registration info tab like Inbox.lv where they can change
their personal stuff they put on file with us on the signup forms.
14. We may need Windows address book synchronization (this is a
feature that fastmail is adding, and hotmail already has, so I guess
we will have to also?) It is not a must in my books.
17. I want to add a new feature next to the attach button, something like
"send message after attached", so if they are uploading a big file they
can leave and it will be sent automatically.
19. We MUST have an easy user interface. Fastmail has lots of features
and they try to make it so you can do everything in 2 clicks or
less. We need to try to do this. Fastmail is all bunched up and looks
like shit though. We need to make ours packed with features
like Fastmail, but spread out like AOL. This will
attract all the old people and beginners of the internet who have just
gotten off of AOL and moved to DSL. Fastmail looks like it is only made
for advanced users and is hard to get used to. We need to
have a main navigation bar at the top of every page, with all
the mail icons that people use the most, like compose, inbox,
addressbook, and options. Then we will have a subnavigation bar for
each other page. For example, if you were to hit the calendar icon on the
main navigation bar that is at the top of EVERY page, it would take
you to the calendar page and show you the calendar, and the
subnavigation bar would have all the calendar icons like add events,
etc. I was thinking, in IMP we could have the logo at the top left
corner of the page, then on the top right we could have all the main
navigation icons. Both the logo and the main navigation icons would be on
every single page in IMP, so it would be easy to get around. Then the
subnavigation bars could go where the main navigation bar is now in
IMP, understand?
2. Make a bounce button like fastmail.fm. This is how fastmail
explains their bounce button:
'Bounce' takes the currently selected emails and sends back an email
to the addresses the email(s) came from saying basically that 'the
email address does not exist' in a standard internet email protocol
way. Some more organised spammers remove these from their lists. After
sending the bounce response, the messages are deleted."
* If accessed with a browser, public folder is also a personal
web-site, accessible at http://username.fastmail.fm
* Provide tool allowing synchronization of Outlook Express etc address
book with FastMail contacts, possibly using LDAP
* Use JavaScript for browsers that support it to speed up many
actions, such as searching through the address book
* A general notification system, so you can send a pager message, SMS
message, instant message, or short email
eGroupWare over Horde reasons
Linking: There is the "infolog" for linking items. An infolog item
can be a to-do, call, or note. It can link to the addressbook,
projects, calendar, or another infolog item. That is very flexible.
Access Control: Under Preferences, there is a "Grant Access" link for
the calendar, addressbook, infolog, and projects. It allows you to
select Read, Add, Edit, Delete, and Private access for each group and
each user. Again, very flexible.
Categories: Multiple category selection is allowed in the addressbook,
projects, calendar and infolog.
Custom Fields: I can create custom fields.
PHP_SELF
Executive summary: PHP_SELF intentionally includes extra URL garbage (or
valuable URL variables, take your pick) tacked on by the user. Don't use
it without knowing what it does.
Here's what you get when you hit the URL:
http://example.com/info.php/testing1?testing2 :
_SERVER["REQUEST_URI"] /info.php/testing1?testing2
_SERVER["PHP_SELF"] /info.php/testing1
_SERVER["SCRIPT_NAME"] /info.php
Get it? If you don't want that extra stuff tacked on by the user, use the
correct _SERVER variable. If you use REQUEST_URI or PHP_SELF, be aware the
user can affect the contents of that variable. 99% of the time, you want
SCRIPT_NAME, not PHP_SELF.
By the way, here's another test:
http://example.com/info.php/testing<script>?testing :
_SERVER["REQUEST_URI"] /info.php/testing%3Cscript%3E?testing
_SERVER["PHP_SELF"] /info.php/testing<script>
_SERVER["SCRIPT_NAME"] /info.php
Note that the REQUEST_URI variable, which comes from Apache, is encoded,
while the PHP_SELF variable, which comes from PHP, is not. So PHP 5.2.0
still makes it possible to shoot yourself in the foot, and as I've pointed
out below, well-known PHP authorities actually recommend that you do so.
Here's the email that I sent in July 2005:
Subject: Re: [nyphp-talk] $_SERVER['PHP_SELF'} not working?
Date: Friday 22 July 2005 12:05 pm
From: Michael Sims <jellicle at gmail.com>
To: NYPHP Talk <talk at lists.nyphp.org>
On Thursday 21 July 2005 17:16, Dan Cech wrote:
You could put:
$_SERVER['PHP_SELF'] = $_SERVER['SCRIPT_NAME'];
into one of your common include files.
Yes. I'm afraid I don't understand this entire thread. Apparently
because of the numerous PHP developer articles recommending it, and
because of the php.net page which for whatever reason lists it first on
the list of predefined variables, people are using PHP_SELF when they
really want SCRIPT_NAME. SCRIPT_NAME solves all the problems mentioned
in this thread - it's just the script name, without any extra garbage
that might be tacked on by the user. PHP_SELF explicitly includes that
extra garbage, so solutions in this thread that involve stripping the
garbage off of PHP_SELF to make it safe are really, really missing the
point - just use SCRIPT_NAME instead. Please don't use FORM ACTION="";
according to the spec, what the browser does with that is undefined, so
even if it works in current browsers, it might not work in future ones.
People can be forgiven for making this mistake -- I'm here holding my
copy of _Learning PHP 5_, and it recommends on page 8 and again on page
86 the use of PHP_SELF for self-referencing forms, ahem -- but it's time
to put it to bed: PHP_SELF is unsafe for any usage where it is echoed
back to the page.
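To make the advice concrete, here is a small sketch of a self-referencing form built from SCRIPT_NAME. form_action() is a hypothetical helper, and the $server array simply replays the values from the second test above in place of the real $_SERVER. The htmlspecialchars() call is an extra precaution of mine, not part of the quoted email.

```php
<?php
// Hypothetical helper: build a form action from SCRIPT_NAME, escaped for an HTML attribute.
function form_action(array $server)
{
    return htmlspecialchars($server['SCRIPT_NAME'], ENT_QUOTES, 'UTF-8');
}

// Values PHP reported for http://example.com/info.php/testing<script>?testing
$server = array(
    'REQUEST_URI' => '/info.php/testing%3Cscript%3E?testing',
    'PHP_SELF'    => '/info.php/testing<script>',
    'SCRIPT_NAME' => '/info.php',
);

$html = '<form method="post" action="' . form_action($server) . '">';
```

Echoing $server['PHP_SELF'] in the same place would have written the user-supplied <script> text straight into the page.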
SESSIONS:
I'll try to reply to this and some other people who replied to my
previous message.
I'll start with my background. I've often been the person who the
buck stops with --
somebody else develops an application that almost works (perhaps even
puts it in
production) and then I have to clean up the mess. The app might be
written in PHP,
Java, Cold Fusion, Perl, you name it. I've learned to see session
variables as a "bad
smell".
When I develop my own applications, I use cookies for
personalization and caching. I
use the authentication system described in
http://cookies.lcs.mit.edu/pubs/webauth:sec10-slides.ps.gz
This mechanism can carry a "session id", which in turn can be
used as a key against
application state stored in a relational database. I think through
the boundary cases,
and find that my greenfield apps behave predictably -- my only woe is
that you'll
discover that browsers have a lot of undocumented behavior connected
with cookies, form
handling, and caching. All problems that you still need to fight
with if you use
sessions, see the comments for
http://www.php.net/manual/en/function.session-cache-limiter.php
----
The context of this is that the average web application is poor in
the areas of
usability and security: recent studies show that 80% of web
applications have serious
security problems
http://www.whitehatsec.com/home/resources/presentations/files/wh_security_stats_webinar.pdf
Jakob Nielsen's website has been chronicling the sorry state of
web application
usability:
http://www.useit.com/
Perhaps the top 20% of programmers can write applications with
$_SESSION that don't
have serious security and usability problems, but what about the other 80%?
----
(1) Session variables are treacherous. Odd things can happen in
boundary cases, such
as when sessions expire, or when you are targeted by session fixation
attacks.
http://shiflett.org/articles/security-corner-feb2004
I've looked at many apps that use sessions that seem to be
working... Until you walk
away for two hours, come back, and discover that you're logged in as
somebody else. I
suppose I could have spent hours or days tracking down an intermittent
problem, which
involved some confluence of browser oddness (IE was fine, Firefox was
screwy), the
behavior of the session system, and crooked logic in the application.
Or I could use
cryptographically signed cookies to implement an authentication system
which won't give
me surprises in the future.
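For readers unfamiliar with the idea, a generic sketch of a cryptographically signed cookie value follows. This is not the scheme from the slides linked above; it is a plain HMAC example under stated assumptions: $secret is a long random key known only to the server, the user id contains no ':' character, and sign_token()/verify_token() are hypothetical names.

```php
<?php
// Build "user_id:expires:mac", where the MAC covers the user id and the expiry time.
function sign_token($user_id, $expires, $secret)
{
    $payload = $user_id . ':' . $expires;
    return $payload . ':' . hash_hmac('sha256', $payload, $secret);
}

// Return the user id if the token is authentic and unexpired, otherwise null.
function verify_token($token, $secret, $now)
{
    $parts = explode(':', $token);
    if (count($parts) !== 3) {
        return null;
    }
    list($user_id, $expires, $mac) = $parts;
    $expected = hash_hmac('sha256', $user_id . ':' . $expires, $secret);
    // hash_equals() compares in constant time to avoid leaking the MAC byte by byte.
    if (!hash_equals($expected, $mac)) {
        return null;
    }
    if ((int) $expires < $now) {
        return null;
    }
    return $user_id;
}

$token = sign_token('202', 2000, 'example-secret');
$user = verify_token($token, 'example-secret', 1000);
```

Because the server can verify the cookie without looking anything up, there is no server-side session to expire unexpectedly or to be mixed up between users.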
Anybody can write applications that work 95% of the time with
$_SESSION. Getting the
other 5% right requires a deep understanding of state and
statelessness on the web...
Which is what (many) people are trying to avoid when they use
$_SESSION variables.
There are more than twenty configuration variables that affect the
way sessions work
under PHP. Incorrect configuration of any of these can cause
applications to fail,
often in intermittent ways. The use of a custom session handler can
have unpredictable
effects on security, reliability and performance.
Other languages are a lot worse than PHP -- the use of the "scope"
concept in
languages such as Cold Fusion and Tango makes it easy to use a session
variable without
realizing it... Resulting in an application that "works" sometimes,
but fails in
mysterious ways.
(2) Session variables are bound to a particular language. In the real
world, I work
with legacy systems that might be written in other languages. I might
have some old
pages in Cold Fusion that work just fine, and I won't rework them in
PHP until I've got
a good reason. If users can set a customization parameter, such as
the background of a
page, it's easy to write a cookie that all languages can read.
Applications stuck in
the session variable roach motel aren't as maintainable and portable.
(3) PHPSESSID. Do I need to say more? I consider the client that
wants user tracking
and can't accept cookies, so all the pages on their
site look like
http://www.example.com/about_us.php?PHPSESSID=**pseudo-random blob**
Three months later they come back and wonder why their site isn't
being indexed in
Google. Yes, there's a saner way to use this feature, but this
"cure" to privacy
violation is worse than the cookie "disease", since session ids will
leak out through
referrers, bookmarks, links that people cut-and-paste...
(4) The back button. When somebody asks a question about sessions on
a forum, they'll
usually ask another question a few days or weeks later: "How do I
disable the back
button?"
The underlying problem is a deep aspect of the structure of the
web. There is certain
state information that's particular to a request (GET and POST
variables) and certain
state information that has a more persistent scope (cookies, session
information, a
relational database.) The back button makes it possible for these two
things to get out
of sync.
Ultimately, we need a systematic strategy to deal with this. One
pattern is to put
the complete state of the application in form variables. Applications
that use this
pattern always work perfectly with the back button. This pattern
doesn't work always
(hitting the back button shouldn't cancel your order on an e-commerce
site), but it
works often... For instance, you can use hidden variables to hold
onto form variables
for complicated forms that spread over several pages.
(5) Multiple windows. I think it's a human right to be able to have
more than one window
open on a web site. If I'm shopping, for instance, I'd like to be
able to look at two
products simultaneously. An application that keeps state in form
variables doesn't care
how many you have open. If you're looking for jobs at an organization
that uses
taleo.net's software, you'll find that it uses trickery to prevent
you from having more
than one window open... So you can't look at two jobs at once, or
look at the job
description while you're filling out the application. I suspect that
they did this
because they don't want to spend forever debugging "race conditions"
that could be caused
by a user acting in two windows simultaneously.
Session variables introduce problems of locking. PHP gets an
exclusive lock on the
session for each page displayed. This hurts the performance of pages that use
dynamically generated images and Javascript, and can mysteriously
deadlock AJAX
applications.
(6) Scalability, Reliability, and all that. This is a tricky one,
because it depends
on particulars. Sessions can be lightning-fast in systems that keep
them in RAM, such
as Java and Cold Fusion. The default session handler in PHP uses
files, and is probably
faster than a relational database in a direct comparison: however,
the session handler
will load all of the data into RAM, whereas a relational
implementation may only need to
load information when it's needed. Keeping information in POST
variables or cookies also
involves a tradeoff -- this is as scalable as it gets so far as server
resources, but
requires that the state be passed back and forth between the browser
and server. This is
no big deal if the state is 500 bytes. It's unacceptable if the state
is 500 megabytes.
In most cases, it starts looking expensive when we're passing an
extra 10k-100k around.
I've recently been working on a legacy app that contains a query
(select a subset of
items) and reporting (display user-selected fields of those items)
function. The
interface between those modules is simple: the query system passes a
comma-separated
list of item identifiers to the reporting system. I like this,
because it meant that
one system could be changed without affecting the other. I had to
update the app so it
would work with a changed database schema, so both sides needed some work.
I discovered that the app was passing the item list as a session
variable. This worked:
unless I was using the application in two windows at a time. In that
case, a query in
one window would change the report delivered in another window. I
thought about it, and
realized that in this case, result sets would always be under about
10k, and usually be
around 1k. Therefore, it made sense to pass this as a hidden
variable in the form and
ditch the session variable.
This shows the kind of problems that regularly turn up in the
applications that
developers "throw over the wall" to testers and clients. Choose a
session variable, and
your application behaves mysteriously for a user who didn't respect
the "one window at a
time" assumption you made. Passing hidden variables in forms, on the
other hand, might
work OK when you're testing with a small data set over a LAN, but
could rapidly become a
performance nightmare for dialup users using a production database.
Performance can be improved in a number of ways: for instance, by
delta-sigma
compressing the item list, or creating a "form scope" variable that's
keyed against a
unique identifier in the form. Either way, quality web applications
take quality
thought.
(7) Lack of engineered application state: Engineered Application
State is the gem of
database-backed web applications.
If you keep the state of your application in a relational database,
you need to ~design~
the state of your application. You need to ~think~ every time you add
or change a table
in your relational database. By contrast, you can add a new variable to your
application as easily as
typing '$'.
Desktop apps keep the application state in a tangle of pointers. C
and C++ applications
tend to contain 5 or more defects per thousand lines of code. Errors
show up in data
structures over time, just as mutations occur in your cells. Memory
leaks, application
hangs, and crashes are cancers caused by these mutations.
PHP apps die at the end of each request, and are reborn for the next
request. They
don't accumulate errors over time. Web application environments such
as Java and Cold
Fusion that involve a long-running process regularly hang or crash and
require restarts.
When is the last time you've had to restart PHP?
A database protects you from errors in multiple ways. Transactions,
for instance,
protect against data corruption caused by crashing scripts. It's easy
to write
$_SESSION["logged_in"]=true;
in one place and
$_SESSION["logged-in"]=false;
in another, introducing unpredictable behavior and security holes. A
relational
database will give you an error if you try something like that.
-------------
Can users of $_SESSION avoid the seven deadly sins?
Yes.
In practice they don't.
Paul,
That looks like a lot of info to digest without specific examples. Is
there a book or
other resource on session management that you recommend that deals
with these issues in
more detail?
Thanks.
-Leo
I'm not aware of one, but I wish there was. I think the question
isn't so much "session management" but about how to manage state in a
stateless protocol -- sessions
are one abstraction for doing that, but other abstractions exist too.
I think the best approach here is the "Pattern Vocabulary"
approach. There are
certain practices, that when applied to an application, have certain
results.
For instance, there's the pattern of "Stateless Server" -- the
complete state of the
application (or subsystem thereof) is kept in hidden POST and GET
variables. You accept
some limits, but get some real benefits: infinite scalability, no
headaches with the
back button, no need for cookies...
You might try the above and then notice that you're passing 100K
around in your hidden
form variables... People are complaining that your app is slow. Now
you can generate a
unique id each time you draw a form ("Generated Form Scope", for lack
of a better term.)
You can stuff your "hidden" variables into the database under this
key, and restore
them when the key comes back... If your code is organized right (does
something like
$vars=$_POST, and only looks at $vars afterwards), you can do this
transparently to the
rest of your app.
The same kind of thinking can protect you against certain kinds of
back button woes --
you can at least stop people from submitting the same form more than
once, by checking
to see if a form with that unique id has been submitted before.
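A sketch of the "Generated Form Scope" idea and the duplicate-submission check from the last few paragraphs. Form_Scope is a hypothetical class, and its in-memory arrays stand in for the database table the email describes; random_bytes() assumes PHP 7 or later.

```php
<?php
// Hypothetical form-scope store keyed by a unique id generated per form.
class Form_Scope
{
    private $stored = array();    // form id => variables kept on the server
    private $submitted = array(); // form id => true once the form has come back

    // Called when drawing the form; the returned id goes into one hidden field.
    public function open(array $vars)
    {
        $id = bin2hex(random_bytes(16));
        $this->stored[$id] = $vars;
        return $id;
    }

    // Called when the form comes back; returns the merged variables,
    // or null for an unknown id or a repeated submission.
    public function submit($id, array $post)
    {
        if (!isset($this->stored[$id]) || isset($this->submitted[$id])) {
            return null;
        }
        $this->submitted[$id] = true;
        return array_merge($this->stored[$id], $post);
    }
}

$scope = new Form_Scope();
$form_id = $scope->open(array('item_ids' => '17,42,99'));
$vars = $scope->submit($form_id, array('report' => 'summary'));
$again = $scope->submit($form_id, array('report' => 'summary'));
```

The rest of the application only ever reads $vars, so it does not need to know whether a value travelled in the form or was restored from the store.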
"Shopping Cart" is another pattern. People often use session
variables to handle
shopping carts, but that's really not ideal from a user interface
perspective...
Ideally, each instance of a shopping cart has its own unique id...
Imagine we want to
make an e-commerce site that behaves like amazon.com:
(1) User visits e-commerce site from a home computer -- a long-term
tracking cookie gets
stuck on their browser
(2) User adds item A to their shopping cart... A new shopping cart is
created with id #101, associated with the tracking cookie.
(3) User adds items B, C, D, and E to their shopping cart in the course
of 30 minutes of browsing. Each time an item is added, we add a row to
a table in the database that links the item id to the shopping cart id.
(4) 4-year old hits reset button
(5) User comes back to e-commerce site... He's happy to find his cart
is still there.
User creates account #202 to check out. Shopping cart #101 is
associated with account
#202
(6) User checks out shopping cart.
(7) User comes back a week later, wants to buy a few more items. The
site recognizes
who he is. He adds two of item A and an item F to a newly created
shopping cart with id
#102, associated with user account #202.
(8) User goes to work, logs in... The system sees that he has
shopping cart #102 open.
He adds item G, and then checks out.
(9) User learns that he can trust this site to work correctly and
becomes a loyal
customer.
It's nice that we've got a historical record of the shopping cart
after the fact, but
there's a more important point -- we could have lost the customer's
dollar at many points
in the above transaction if we were using a $_SESSION based cart.
The session wouldn't
have survived step 4, for instance. A good user interface isn't
academic here... It
puts money in our pocket.
The above scenario is complex, and it might not be fair to expect that a
first-generation shopping cart has those features. A $_SESSION-based
shopping cart would
need to be completely reworked to add the features above. A cart
that uses a unique
"cart id" and a relational back end will be a lot more maintainable...
You could even
start out using $_SESSION to keep track of the "cart id", then keep
it in a cookie,
then associate it with a user name, add the facility to promote an
anonymous cart to an
authenticated cart and so on. Starting with a good design, we can
provide the interface
that we ~want~ to provide, not the one that our abstraction layer
~forces~ us to provide.
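A rough sketch of the "cart id" design, under the assumption that two arrays stand in for the relational tables (one for carts, one for cart items). CartStore and its method names are hypothetical; the point is only that the cart has its own id and that promoting it from a cookie to an account is a single update.

```php
<?php
// Array-backed stand-in for two relational tables: carts and cart_items.
class CartStore
{
    private $carts = array();  // cart id => array('cookie' => ..., 'account' => ...)
    private $items = array();  // rows of array('cart' => cart id, 'item' => item id)
    private $nextId = 101;

    // Create an anonymous cart tied to a long-term tracking cookie.
    public function createForCookie($cookie)
    {
        $id = $this->nextId++;
        $this->carts[$id] = array('cookie' => $cookie, 'account' => null);
        return $id;
    }

    // One row per added item, linking the item id to the cart id.
    public function addItem($cartId, $itemId)
    {
        $this->items[] = array('cart' => $cartId, 'item' => $itemId);
    }

    // Promote an anonymous cart to an authenticated one.
    public function attachToAccount($cartId, $accountId)
    {
        $this->carts[$cartId]['account'] = $accountId;
    }

    // Find the cart id for a tracking cookie, or null if there is none.
    public function findByCookie($cookie)
    {
        foreach ($this->carts as $id => $cart) {
            if ($cart['cookie'] === $cookie) {
                return $id;
            }
        }
        return null;
    }

    // Account the cart belongs to, or null while it is still anonymous.
    public function accountOf($cartId)
    {
        return $this->carts[$cartId]['account'];
    }

    // Item ids in a cart, in the order they were added.
    public function itemsIn($cartId)
    {
        $out = array();
        foreach ($this->items as $row) {
            if ($row['cart'] === $cartId) {
                $out[] = $row['item'];
            }
        }
        return $out;
    }
}
```

Because the cart lives in the store rather than in $_SESSION, it survives a lost session: the tracking cookie (or later the account) is enough to find it again.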
Regarding slides 29 and 30, can you elaborate and give a more detailed
example of what they are trying to say? Are they saying that the session key
should contain a hash of the data? Or does the hash become the "salt" in
encrypting the data? Finally, how does doing that make it easier to prevent
circumvention and forgery?
Let's take it a step at a time... Imagine we've got a token of the
following format...
$token="$user_id:$session_id"
The session_id doesn't have to be unpredictable -- it could come from an
auto_increment column in a database table... With the caveat that
people could estimate
the usage of your site by looking at the session ids.
You could put this in a cookie, and it would work quite well, as
long as you didn't
have users who knew how to look at or change the cookies. An attacker
who understands
cookies can easily change the user id, or session_id.
To protect the cookies from tampering, we could do something like
$hash=sha1($token);
$signed_token="$hash:$token";
We could check the integrity of the token by recomputing the hash
and seeing if it
matches the one in the signed token. This protects against accidental
damage, or very
simple attacks. Still, it's quite possible that an attacker could
guess what you're
doing: it wouldn't be safe at all in an open source system.
That's where the salt comes in... For a particular web site, we
create a random
"salt" that, effectively, gives us a unique hash function for our web site.
$salt="... a random salt defined in a per-site configuration file ...";
function private_hash($token) {
global $salt;
return sha1("$salt:$token");
}
$private_hash=private_hash($token);
$signed_token="$private_hash:$token";
Now, nobody can alter your tokens unless they know your salt.
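For comparison, here is a sketch of signing and checking a token with PHP's built-in hash_hmac() and hash_equals(). Note that this swaps in HMAC-SHA256 for the sha1("$salt:$token") construction in the text; it is a substitution for illustration, not the scheme described above. The function names sign_token and verify_token are made up for this example.

```php
<?php
// Sign a token with a per-site secret. HMAC-SHA256 is used here in place of
// sha1("$salt:$token"); that is an assumption of this sketch.
function sign_token($token, $secret)
{
    return hash_hmac('sha256', $token, $secret) . ':' . $token;
}

// Return the original token if the signature checks out, otherwise null.
function verify_token($signed, $secret)
{
    // The token itself contains colons, so split only on the first one.
    $parts = explode(':', $signed, 2);
    if (count($parts) !== 2) {
        return null;
    }
    list($mac, $token) = $parts;
    $expected = hash_hmac('sha256', $token, $secret);
    // hash_equals() compares the two strings in constant time.
    return hash_equals($expected, $mac) ? $token : null;
}
```

Changing the user id or session id in the cookie makes verify_token() return null, since the attacker cannot recompute the signature without the secret.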
Because the tokens are cryptographically signed, the token itself
is a proof that
somebody has logged in -- you don't need to look at the database or
keep ~any~ server
side state. This makes it a highly scalable system... This basic
approach is used on
some of the biggest sites in the world, such as yahoo.com.
Except for one little detail: replay attacks.
Nothing stops a person from saving his token and presenting it later
-- after his account
may have been deactivated, or after associated session information
has been purged (an
error condition.) An attacker that gets the person's cookie jar, or
who intercepts
network traffic, can also steal the token.
It's not possible to completely protect against sophisticated
attacks where a hostile
party controls your network without installing complex software on
both ends, and
solving some intrinsically difficult problems having to do with mutual
authentication.
Let's just say that the developers of SSL have solved these problems,
and that you
should use SSL for applications with the strongest security needs.
We can, however, make replay attacks a lot harder by adding a
timestamp... Now the
token looks like
$timestamp:$user_id:$session_id
Now we're keeping a table on the server that looks like
create table session (
session_id ... session id ... primary key,
user_id ... user id ...,
last_updated ... timestamp ...,
begin_time ... timestamp ...,
end_time ... timestamp ...
);
Now we've got two constants:
REFRESH_TIME: how old a timestamp is before we issue a token with a
new timestamp and
write the timestamp to the last_updated column.
EXPIRE_TIME: how old a timestamp is before we eliminate the session.
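The two constants might be applied like this. The values (5 minutes and 1 hour) are placeholders chosen for the sketch, not values from the text, and token_status / refresh_token are hypothetical names. Signature checking and the database update of last_updated are left out to keep the example short.

```php
<?php
// Illustrative values only; the text does not specify them.
define('REFRESH_TIME', 300);   // seconds before a fresh timestamp is issued
define('EXPIRE_TIME', 3600);   // seconds before the session is eliminated

// Classify a "$timestamp:$user_id:$session_id" token by the age of its timestamp.
function token_status($token, $now)
{
    $parts = explode(':', $token);
    if (count($parts) !== 3 || !ctype_digit($parts[0])) {
        return 'invalid';
    }
    $age = $now - (int) $parts[0];
    if ($age < 0) {
        return 'invalid';      // timestamp from the future
    }
    if ($age >= EXPIRE_TIME) {
        return 'expired';
    }
    if ($age >= REFRESH_TIME) {
        return 'refresh';      // caller issues a new token and updates last_updated
    }
    return 'ok';
}

// Build the replacement token with a new timestamp, keeping user and session ids.
function refresh_token($token, $now)
{
    list(, $userId, $sessionId) = explode(':', $token);
    return "$now:$userId:$sessionId";
}
```

Most requests fall in the 'ok' case and need no database access at all; only the 'refresh' case writes to the session table.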
You might think you could put the client ip address in the token,
and lock the
session to an ip address to make it harder to steal tokens. I tried
this, but found out
that some of the largest ISPs (such as aol) have a proxy server that
makes users seem to
"jump around". You can do it if you know people are logging in from a
sane ISP, but you
can't do it in general.
---
This system can be improved in numerous ways, such as adding
anonymous sessions,
operating in a split http/https mode, and caching authorization
information in the token.
If you're worried about information leakage (you don't want
someone to know that he
got session 88427 yesterday and 99105 today), you can encrypt the
token. But be
careful... It's easy to use cryptography the wrong way: don't rely
on encryption to
protect token integrity against tampering -- most of the obvious
schemes don't really
work.
cookie usage:
20 per domain, 4094 characters (bytes) in the value
Horde_Model -> Horde_Rdo_Model extends it
Horde_Type
Page/Block object
- how to return block from driver, inherit Block methods, but also
inherit Rdo_Base?
Mapper! _Mappers are the drivers_
Nag - tasks are a model
different models for different sources of tasks
so maybe horde_rdo_model isn't extension but delegate?
types are string, etc.
types can be used by rdo as well as by forms (models)
form helpers go into horde_view helper pack
Horde_Model:
validation:
validatesPresenceOf
validatesUniquenessOf
validatesAcceptanceOf
validatesConfirmationOf
one database, one real filesystem space
no globals
webroot has:
index.php
.htaccess
assets/ (css, images, js)
mod_rewrite rules
everything else pear-installable
make assets pear installable somehow
viewbuilder/pagebuilder - custom views
command line and web service actions (still api/method/params)
catalyst::message() - replaces logmessage - fatal, notification,
observer - has a return value (?)
session object management
cms for rampage based on (replacing) ulaform + wicked + giapeto
horde_form
- db and xml descriptions instead of just php building
reconcile driver architecture with Rdo Models
apps provide models instead of forms?
apps provide route bundles? (if frontcontroller)
forms are models!
reconcile models and mappers
what do routes point to (models? mappers? views?) -> controllers
controllers handle mappers vs. models?
composite mapper? (turba, etc.)
After reading that theserverside.com entry, it seems like we've
been doing this in Solar (framework for PHP5) for a little while now.
Essentially, after processing a form, you call
$this->_redirectNoCache('controller/action') and you shouldn't get any
re-POST troubles.
Boring code from the page-controller follows.
<http://solarphp.com/svn/trunk/Solar/Controller/Page.php>
/**
*
* Redirects to another page and action after disabling HTTP caching.
*
* The _redirect() method is often called after a successful POST
* operation, to show a "success" or "edit" page. In such cases,
* clicking "back" or "reload" will generate a warning in the
* browser allowing for a possible re-POST if the user clicks OK.
* Typically this is not what you want.
*
* In those cases, use _redirectNoCache() to turn off HTTP caching, so
* that the re-POST warning does not occur.
*
* This method sends the following headers before setting Location:
*
* {{code: php
* header("Cache-Control: no-store, no-cache, must-revalidate");
* header("Cache-Control: post-check=0, pre-check=0", false);
* header("Pragma: no-cache");
* }}
*
* @param Solar_Uri_Action|string $spec The URI to redirect to.
*
* @param int|string $code The HTTP status code to redirect with; default
* is '303 See Other'.
*
* @return void
*
*/
protected function _redirectNoCache($spec, $code = 303)
{
// reset cache-control
$this->_response->setHeader(
'Cache-Control',
'no-store, no-cache, must-revalidate'
);
// append cache-control
$this->_response->setHeader(
'Cache-Control',
'post-check=0, pre-check=0',
false
);
// reset pragma header
$this->_response->setHeader('Pragma', 'no-cache');
// continue with redirection
return $this->_redirect($spec, $code);
}