[horde] babel?

Michael Hoennig michael at hoennig.de
Fri Oct 27 08:07:56 PDT 2000


Hi Chuck - and everybody else here as well, of course!

> I am nominally - it's something I want working sooner rather than
> later, and I'm in the midst of re-architecting it a bit and converting
> it over to PEAR. So now is a good time to make design changes. =)

Sounds good!

> I think that you really want to have messages stored in a database or
> other dynamic store somewhere, to enable searching, generating other
> forms of content, decent locking on updates, rebuilding static pages
> with a new template easily, etc.

And I really think database storage should be optional ;-) Getting to
your points:

to enable searching
	I am pretty sure a grep is faster, at least for
	full-text search. And isn't full-text search what is mostly
	needed? My approach would be a search engine anyway (that one
	with a database, yes). But the originals should be stupid
	plain HTML, XHTML or XML files. Reindexing is no problem.
	To me it's even OK to have an HTML version (so static requests
	can be served) based on the original XML version of each
	article.

generating other forms of content
	This is a point, but there are still other solutions. One
	solution is to have an XML format as the original and, in
	addition, a rendered HTML version. Another, of course, is
	stripping the old layout off the (structured) HTML files and
	merging a new template into them.

decent locking on updates
	There is only one piece of software accessing the pages
	anyway, so locking should be easy. On the other hand:
	a supporting database is not a real problem for me, I
	just like to have th

A busy, but not huge, forum gets about 5,000 to 10,000 articles per
year. So I would probably set up a tree of directories for the articles
(*) plus a control file plus two indexes (a flat index with recent
articles and a tree index with the latest n discussions/threads).
Additionally I would have m indexes with n threads each for the archive
(older stuff).

(*) directory tree like this:

.../index.html
.../recent.html
.../archives.html
.../.control
.../articles/0000000001.html
.../articles/0000000001/0000000002.html
.../articles/0000000001/0000000004.html
.../articles/0000000001/0000000005.html
.../articles/0000000003.html
.../articles/0000000006.html

(only article 1 has some replies in this picture)
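
A minimal Python sketch of how such paths could be derived, assuming
ten-digit zero-padded ids and that a reply lives in a directory named
after each article above it in the thread. The function name and the
"ancestors" parameter are made up for illustration only:

```python
from pathlib import PurePosixPath


def article_path(article_id, ancestors=()):
    """Return the relative path of an article file.

    ancestors lists the ids of the articles above this one in the
    thread, from the thread root downwards (hypothetical parameter).
    """
    parts = ["articles"]
    # Each ancestor contributes one directory level.
    parts += [f"{a:010d}" for a in ancestors]
    # The article itself is a zero-padded .html file.
    parts.append(f"{article_id:010d}.html")
    return PurePosixPath(*parts)


if __name__ == "__main__":
    print(article_path(1))        # a thread root
    print(article_path(2, [1]))   # a reply to article 1
```

With this scheme a whole thread can be moved or archived by moving one
file and one directory.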

My point is, I just don't like document storage in a database. I am fine
with having "administrative" data in databases, like the user accounts
(the forum should be able to have user registrations and even have a
"registered users only" mode). Even having some extracts stored in the
database, to make some nice search functions easier, is OK with me.

My idea is an object-oriented abstraction layer for all data and all
documents we have to deal with. Then there can be one implementation
with plain files, one with a database - or mixed: administrative objects
using the database, documents using the filesystem.
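
To make the idea concrete, here is a rough sketch in Python (chosen
only for illustration, not the language babel is written in). All class
and method names are hypothetical, and SQLite stands in for whatever
database would really be used:

```python
import os
import sqlite3
from abc import ABC, abstractmethod


class ArticleStore(ABC):
    """Storage-independent interface the forum code would talk to."""

    @abstractmethod
    def save(self, article_id: int, body: str) -> None: ...

    @abstractmethod
    def load(self, article_id: int) -> str: ...


class FileArticleStore(ArticleStore):
    """Keeps every article as a plain file below a root directory."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, article_id):
        return os.path.join(self.root, f"{article_id:010d}.html")

    def save(self, article_id, body):
        with open(self._path(article_id), "w", encoding="utf-8") as f:
            f.write(body)

    def load(self, article_id):
        try:
            with open(self._path(article_id), encoding="utf-8") as f:
                return f.read()
        except FileNotFoundError:
            raise KeyError(article_id) from None


class SqliteArticleStore(ArticleStore):
    """Keeps every article as a row in a database table."""

    def __init__(self, db_path=":memory:"):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS articles "
            "(id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
        )

    def save(self, article_id, body):
        with self.conn:  # commits on success
            self.conn.execute(
                "INSERT OR REPLACE INTO articles (id, body) VALUES (?, ?)",
                (article_id, body),
            )

    def load(self, article_id):
        row = self.conn.execute(
            "SELECT body FROM articles WHERE id = ?", (article_id,)
        ).fetchone()
        if row is None:
            raise KeyError(article_id)
        return row[0]
```

The calling code only ever sees ArticleStore, so a mixed setup would
simply hand a file-based store to the document code and a
database-based store to the account code.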

> Here are my arguments for generating on demand:
> 
> 1. You make sure that you don't do unnecessary work. When the page 
> is updated, and needs to be viewed, you regenerate it.

My argument here is that virtually every single newly created article
will be viewed within the next 5 minutes, at least in forums that
are busy enough that it is worth bothering about efficiency at all.

> 2. It makes it easier to check for things like a changed configuration
> (template, whatever) and regenerate those pages, also. If you
> generated on post, and you wanted to change the look of the whole
> site, you'd need a separate method to regenerate everything, and you'd
> have to do it all at once, instead of having old articles converted as
> they were viewed.

One point is that I do not really believe rebuilding is so much faster
than stripping off the old stuff and merging in a new template. The
other is, as mentioned before, that an XML format for the original
storage plus a rendered HTML version is OK with me. Although again, this
could be optional.
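
As an illustration of the "strip and re-merge" variant, a small Python
sketch. It assumes the generated pages wrap the article body in two
marker comments; the marker strings, the "{body}" placeholder and the
function names are all invented for this example:

```python
# Hypothetical markers written around the article body at generation time.
BEGIN = "<!-- article:begin -->"
END = "<!-- article:end -->"


def extract_body(page):
    """Return the article body found between the two markers."""
    start = page.index(BEGIN) + len(BEGIN)
    end = page.index(END, start)
    return page[start:end]


def retemplate(page, template):
    """Strip the old layout from page and merge the body into template.

    template is expected to contain the placeholder "{body}". The
    markers are written out again so the page can be re-templated later.
    """
    body = extract_body(page)
    return template.replace("{body}", BEGIN + body + END)


if __name__ == "__main__":
    old = "<html><h1>Old look</h1>" + BEGIN + "<p>Hello</p>" + END + "</html>"
    print(retemplate(old, "<html><h1>New look</h1>{body}</html>"))
```

Whether this is actually cheaper than rebuilding from an XML original
would have to be measured; the sketch only shows that no database is
needed for it.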

> 3. It seems easier to ensure that you don't munge things. 

I personally trust filesystems much more than databases ;-) But you
meant the forum software anyway ...

> If you're
> generating on view, and you manage to get two simultaneous requests,
> well, you generate the same page twice, and maybe one overwrites the
> other - no big deal. And even if one gets an old copy and misses an
> update, then the page will be regenerated the next time it's viewed 
> in order to be correct. If you generate on post, you have to be much
> more careful about locking files when outputting to make sure
> you don't miss anything. At least, that's how it seems to me - 
> I could have this wrong in my head, though.

I don't see this problem at all. With a single program accessing the
data, locking is never a real problem. The problem only comes up if
multiple programs access the data directly.
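
One common way to keep generate-on-post safe without explicit locks,
shown here as a Python sketch and not as something either of us has
proposed in detail: write the page to a temporary file in the same
directory and rename it into place, so the web server only ever sees
the old or the new page, never a half-written one. The function name is
hypothetical:

```python
import os
import tempfile


def publish_page(path, html):
    """Write html to path so that readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target
    # for the final rename to be atomic.
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(html)
        os.replace(tmp, path)  # atomic on POSIX filesystems
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

If two posts do race, the last rename wins, which is why updates to
shared files such as the indexes would still need to be serialized
inside the one program that writes them.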

> Thoughts? I'd definitely be interested in seeing this ball rolling.

Me too. And I'm really interested in defining a layer of classes which
could be implemented in different ways - with a database and with flat
files.

Does it make sense to take the existing babel as a base or does it make
more sense to start from scratch? What do you think? Although I took a
look at the sources, I have not even bothered to make them run. Do they
run at all? You know more about that anyway.

	Michael

-- 
Boytinstr. 10 - D-22143 Hamburg - Germany ----- http://www.hoennig.de
home:++49 40 67581412 office:++49 40 23646910 mobile:++49 177 3787491
http://www.binational-in.de -- Forum für binationale Paare & Familien
http://www.hostsharing.org - Webhosting-Spielregeln mal neu definiert



