[cvs] [Wiki] changed: ChucksHorde4Thoughts

Tue Jul 8 16:44:25 UTC 2008

chuck  Tue, 08 Jul 2008 12:44:25 -0400

Modified page: http://wiki.horde.org/ChucksHorde4Thoughts
New Revision:  1.2
Change log:  but wait, there are more text files

@@ -921,4 +921,550 @@
          // continue with redirection
          return $this->_redirect($spec, $code);
      }

+
+
+
+apps provide models instead of forms
+apps provide route bundles
+apps provide controllers
+
+
+
+seekable iterators?
+use of ArrayIterator
+adding LimitIterators and FilterIterators on top of Rdo
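
A minimal sketch of the iterator idea above, using only SPL classes. The rows are a hypothetical stand-in for an Rdo result set, and `OpenRowsFilter` is an illustrative name, not an existing Horde class.

```php
<?php
// Hypothetical stand-in for an Rdo result set: plain arrays in an ArrayIterator.
$rows = new ArrayIterator(array(
    array('id' => 1, 'title' => 'Alpha', 'done' => false),
    array('id' => 2, 'title' => 'Beta',  'done' => true),
    array('id' => 3, 'title' => 'Gamma', 'done' => false),
    array('id' => 4, 'title' => 'Delta', 'done' => false),
));

// FilterIterator subclass that keeps only rows not marked done.
class OpenRowsFilter extends FilterIterator
{
    public function accept(): bool
    {
        $row = $this->current();
        return !$row['done'];
    }
}

// ArrayIterator implements SeekableIterator, so seek() jumps to a position.
$rows->seek(2);
$third = $rows->current();
$rows->rewind();

// Stack a filter, then a limit (skip 1 match, take 2), on the same source.
$page = new LimitIterator(new OpenRowsFilter($rows), 1, 2);

$titles = array();
foreach ($page as $row) {
    $titles[] = $row['title'];
}
```

Because each layer only wraps the one below it, the same filter and limit could sit on top of any Traversable result, not just an array.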
+
+
+
+match up RDO with making resources first class - a wiki page, a task,  
etc. all get a URI
+
+Meanwhile HTTP was designed for access to resources, the "primary
key" being determined by its URL (vs. having to worry about the
insert id). If you think "documents", it's clear there's no need to
make a distinction between creating and updating: creating a document
results in the first version. Updating means overwriting an existing
document with a new version. But in both cases the client is POSTing
the same thing and does not need to be aware of whether the document
already existed or not.
+
+Meanwhile a common first demo app for server side frameworks is a  
CRUD example. The implication here is frameworks place a strong  
emphasis on the database, while HTTP is largely ignored (it's rare to
even see HTTP status codes as a fundamental part of a framework).
+
+Avoiding a long filesystem vs. database discussion (like the need for
virtual file systems with extensible properties), suffice it to say:
consider how Dokuwiki stores wiki pages 1-to-1 as files compared
to MediaWiki. What makes more sense to you? Perhaps our websites have  
been driven too far by the database?
+
+The point here is, given the mismatch between HTTP and CRUD, we've
put CRUD first, which in turn makes actions first class in our
frameworks. We aim to support N different types of action (verbs) when
really we should have been dealing with only three: GET, POST and
DELETE (the latter being perhaps re-routed to a specific "resource
class" method according to some framework / form conventions).
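
One way to picture the three-verb view is a resource class with one method per verb, where POST covers both create and update. This is only a sketch: `DocumentStore` and `dispatch()` are hypothetical names, and the in-memory array stands in for real storage.

```php
<?php
// Hypothetical resource class: documents keyed by URI, held in memory.
class DocumentStore
{
    private $docs = array();

    public function get($uri)
    {
        return isset($this->docs[$uri])
            ? array(200, $this->docs[$uri])
            : array(404, null);
    }

    // Create and update are the same operation; only the status code differs.
    public function post($uri, $body)
    {
        $status = isset($this->docs[$uri]) ? 200 : 201;
        $this->docs[$uri] = $body;
        return array($status, $body);
    }

    public function delete($uri)
    {
        if (!isset($this->docs[$uri])) {
            return array(404, null);
        }
        unset($this->docs[$uri]);
        return array(204, null);
    }
}

// Route the HTTP verb to the matching method; anything else is 405.
function dispatch(DocumentStore $store, $method, $uri, $body = null)
{
    switch (strtoupper($method)) {
        case 'GET':
            return $store->get($uri);
        case 'POST':
            return $store->post($uri, $body);
        case 'DELETE':
            return $store->delete($uri);
        default:
            return array(405, null);
    }
}
```

The client posts the same body either way; whether the document already existed shows up only in the returned status code.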
+
+
+# Nannying: tell me how to get organised, with clear signposts for where to
put my code.
+# Just add water: give me my prototype now!
+# Don't make me think: I can do this stuff even on my dumbest days.
+# DRY: making the same change 50 times is not cool.
+# Anti-pasta: help me avoid spaghetti
+# Security: no nasty surprises please. Help me get this right first time.
+# Testing: help me protect myself against myself.
+
+
+
+To me, what we should look at is the basic reasons why we want to  
manage web-pages and satisfy them:
+
+   1. centralized control over page rights and access
+   2. ability to remap urls due to changes in web-site structure
+   3. handling 404-errors intelligently
+   4. ability to dynamically add headers and footers to pages for  
displaying alerts such as "system going down at 5pm"
+   5. separates content from presentation in a reasonable manner, eg.  
with templates
+   6. managing tainted data (eg. POSTS, GETS, COOKIES)
+
+
+AJAX Considered Harmful
+
+Please pardon the provocative title, but this post is intended to
+surface one point I buried in yesterday's presentation in the hopes
+that by making it a separate post it will attract a wider audience.
+
+I intend for this post to be constructive, so I will focus on two
+specific suggestions which hopefully will serve as the seed for the
+development of a set of best practices for AJAX.  Here are the two
+humble suggestions on things that people should standardize on:
+
+    * the data should first be encoded as octets according to the
+      UTF-8 character encoding
+
+    * GET should never be used to initiate another operation which
+      will change state
+
+Rationale for these two suggestions follows.
+
+Encoding
+
+For the former, I proposed a simple test:
+
+    The first thing I want you to do is to copy the string
+    "Iñtërnâtiônàlizætiøn" into your tool
+    and observe what comes out the other side.
+
+When expressed as a part of the query component of a URI, it should
+look like I%C3%B1t%C3%ABrn%C3%A2ti%C3%B4n%C3%A0liz%C3%A6ti%C3%B8n.
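
The percent-encoded form quoted above can be reproduced with PHP's `rawurlencode()`. In this small sketch the test string is written with explicit UTF-8 byte escapes so the result does not depend on how the source file is saved.

```php
<?php
// The test string spelled out as UTF-8 octets.
$text = "I\xC3\xB1t\xC3\xABrn\xC3\xA2ti\xC3\xB4n\xC3\xA0liz\xC3\xA6ti\xC3\xB8n";

// rawurlencode() percent-encodes every octet outside the unreserved set.
$encoded = rawurlencode($text);

echo $encoded, "\n";
```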
+
+Standardizing improves interoperability, and the reason why I am
+suggesting UTF-8 is that it is backwards compatible with ASCII, can
+express the full range of the Unicode character set, and is widely
+implemented.
+
+Idempotency
+
+Looking into the current PHP implementation of SAJAX, you will see the
+following:
+
+// Bust cache in the head
+header ("Expires: Mon, 26 Jul 1997 05:00:00 GMT");    // Date in the past
+header ("Last-Modified: " . gmdate("D, d M Y H:i:s") . " GMT");
+               // always modified
+header ("Cache-Control: no-cache, must-revalidate");  // HTTP/1.1
+header ("Pragma: no-cache");                          // HTTP/1.0
+
+
+This code should be a rather large clue that you are probably doing
+something wrong.  Apparently the author recognized that these headers
+are somewhat sporadically and inconsistently implemented, and hoped
+that by combining them the chances of success would be improved.
+
+The danger that the responses may be cached is actually the smaller of
+several concerns.  A much bigger concern is that unsuspecting
+grandmothers and bots everywhere can be tricked into modifying online
+databases simply by following a link.
+
+Judicious use of HTTP GET can be a very good thing.  Perhaps toolkits
+can adopt a convention that procedure names that start with the
+characters "Get" use GET, everything else uses POST.
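
The naming convention suggested above is small enough to sketch directly. `httpMethodFor()` is a hypothetical helper, not part of SAJAX or Horde, and it treats the prefix as case-sensitive.

```php
<?php
// Procedures whose names start with "Get" are treated as safe reads (GET);
// everything else may change state and therefore goes over POST.
function httpMethodFor($procedure)
{
    return strncmp($procedure, 'Get', 3) === 0 ? 'GET' : 'POST';
}
```

A toolkit following this rule would call `httpMethodFor('GetArticle')` before building the request and never issue a GET for something like `DeleteArticle`.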
+
+
+
+
+possible dispatcher:
+
+<?php
+
+define('HORDE_CONFIG_DSN', 'file:///var/horde/config/head-site1');
+define('HORDE_BASE', '/var/horde/head');
+
+require_once HORDE_BASE . '/core.php';
+
+Horde_Rampage_Dispatcher::run();
+
+
+no-cache headers:
+
+header("Cache-Control: no-store, private, must-revalidate, "
+     . "proxy-revalidate, post-check=0, pre-check=0, max-age=0, s-maxage=0");
+
+
+
+meta tags to include:
+
+<meta name="MSSmartTagsPreventParsing" content="true" />
+<meta http-equiv="imagetoolbar" content="no" />
+
+
+
+Things to watch out for:
+
+PHP_SELF
+SERVER_NAME
+Referer - never depend on it
+passwords - don't use just md5, add a salt.
+
+or, consider everything in $_SERVER tainted. Of course $_GET, $_POST,
+$_REQUEST, $_COOKIE are.
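
For the password note above, a sketch that assumes PHP 5.5 or later (newer than these notes): `password_hash()` generates a random salt for every call and stores it inside the returned hash, so no separate salt column is needed.

```php
<?php
// Each call picks a fresh random salt and embeds it in the result.
$hash = password_hash('correct horse', PASSWORD_DEFAULT);

// password_verify() reads the salt and algorithm back out of the hash.
$ok  = password_verify('correct horse', $hash);
$bad = password_verify('wrong guess', $hash);

// Hashing the same password again yields a different string (different salt).
$differs = $hash !== password_hash('correct horse', PASSWORD_DEFAULT);
```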
+
+Edge cases: $_SESSION, backend databases. If you don't consider it
+input, then it's part of your application for security purposes.
+
+Never display credit card info - this means it shouldn't be
+repopulated!
+
+Filtering is _inspection_, not correction. Don't try to correct
+invalid data. Casts are relatively safe but still miss simplistic
+attacks.
+
+When possible, whitelist - prove data valid. Simple list of values, or
+a regexp. Everything else is bad.
+
+Need a model for making the filtered data clearly available, and don't
+touch the tainted data.
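
A rough sketch of the whitelist approach described above: values are inspected and copied into a separate clean array only when proven valid, and the tainted array is never modified. The field names (`sort`, `page`, `slug`) and `cleanInput()` are made up for illustration.

```php
<?php
// Inspect tainted input and return only the values proven valid.
function cleanInput(array $tainted)
{
    $clean = array();

    // Whitelist by a simple list of allowed values (strict comparison).
    $allowedSorts = array('date', 'title', 'author');
    if (isset($tainted['sort']) && in_array($tainted['sort'], $allowedSorts, true)) {
        $clean['sort'] = $tainted['sort'];
    }

    // Whitelist by character class: digits only, then cast.
    if (isset($tainted['page']) && is_string($tainted['page'])
        && ctype_digit($tainted['page'])) {
        $clean['page'] = (int) $tainted['page'];
    }

    // Whitelist by anchored regexp; \A and \z reject trailing newlines.
    if (isset($tainted['slug']) && is_string($tainted['slug'])
        && preg_match('/\A[a-z0-9-]{1,64}\z/', $tainted['slug']) === 1) {
        $clean['slug'] = $tainted['slug'];
    }

    return $clean;
}
```

Invalid values are dropped rather than repaired, so anything missing from the result can be treated as "not supplied" by the rest of the application.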
+
+ctype_* - fast, and locale aware (byte-based, not multibyte aware). Often simpler than regexp tests.
+
+
+Output filtering
+
+Escaping is preservation, not changing data.
+
+HTML, javascript, cli output, session data, rss feeds, XML, etc. Any
+remote destination.
+
+Need a clever way to integrate this into the template system! Perhaps
+a content-type on variables (too much?) of text/html, text/plain,
+text/xml, etc.? How about instead of tag, have text:foo, html:foo,
+xml:foo? Or <tag:foo type="html">, defaulting to type="text".
+
+Escaping MUST be charset aware. Data escaped for us-ascii might result
+in JavaScript in Japanese (not necessarily a valid example).
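
As one possible reading of the typed-variable idea above, the template layer could pick an escaper from the declared output type and always pass the charset explicitly. `escapeFor()` and its type names are hypothetical; `htmlspecialchars()` is the only real API used.

```php
<?php
// Escape a value for a declared output type, defaulting to plain text.
function escapeFor($type, $value, $charset = 'UTF-8')
{
    switch ($type) {
        case 'html':
        case 'xml':
            // Passing the charset keeps multibyte input from being misread.
            return htmlspecialchars($value, ENT_QUOTES, $charset);
        case 'text':
        default:
            // Plain text destinations get the data unchanged.
            return $value;
    }
}

$escaped = escapeFor('html', '<a href="x">Tom & Jerry</a>');
```

The original data is preserved in both cases; only its representation for the destination changes.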
+
+For filtering complex data, use checksums instead.
+
+
+fopen_wrappers - turn off if possible?
+
+display_errors - write a custom error handler, handle errors elegantly
+& integrated with Log object.
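
A small sketch of the custom error handler idea, assuming PHP 5.3 or later for the closure. The `$logged` array is a hypothetical stand-in for the Log object; a real handler would write to that instead.

```php
<?php
// Collected entries; stands in for a real logger in this sketch.
$logged = array();

set_error_handler(function ($severity, $message, $file, $line) use (&$logged) {
    $logged[] = sprintf('[%d] %s at %s:%d', $severity, $message, basename($file), $line);
    // Returning true tells PHP the error was handled, so nothing is displayed.
    return true;
});

// Raise a user-level warning to exercise the handler.
trigger_error('disk quota low', E_USER_WARNING);

restore_error_handler();
```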
+
+complexity leads to mistakes
+
+http://phpsec.org/
+http://brainbulb.com/
+http://shiflett.org/
+http://md5.rednoize.com/
+
+
+
+http://www.midgard-project.org/updates/2003-05-29-000.html
+
+
+* Standardized URL-to-object mapping
+* Standardized object-to-application mapping
+* Standardized navigational system
+* Standardized object extensibility API
+* Standardized way to make application output configurable
+
+
+
+So, MidCOM is about standardizing how to build Midgard applications
+and site features. Let's look at each of the points in more detail.
+
+Standardized URL-to-object mapping
+
+Before MidCOM, Midgard site and application developers have had to
+figure out how to map URL requests into Midgard objects, typically to
+topics and articles. Everybody has rolled their own solution for this,
+using object names, IDs or GUIDs as the identifiers, and using either
+GET parameters or active page arguments.
+
+With MidCOM, application development no longer has to start by
+writing a URL parser, as the MidCOM system provides this already. URL
+parsing happens completely in topic and article space, using object
+names as the identifiers. This makes for very clean URLs. Consider the
+following:
+
+/gallery/spring-2003/IMG_2442.html
+
+This example would translate to the article named "IMG_2442" in topic
+"spring-2003" under topic "gallery". Clean, pronounceable and easy to
+use. And even better, any Midgard object instantiated using a MidCOM
+component is aware of its location, providing the URL through MidCOM's
+metadata API.
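
To make the mapping concrete, here is a rough sketch of splitting such a URL path into a topic chain and an article name. `parseObjectPath()` is a hypothetical helper written for illustration; it is not MidCOM's actual parser.

```php
<?php
// Split "/topic/subtopic/article.html" into its topic chain and article name.
function parseObjectPath($path)
{
    // Drop empty segments produced by leading or trailing slashes.
    $segments = array_values(array_filter(explode('/', $path), 'strlen'));

    $article = null;
    if ($segments) {
        $last = end($segments);
        // A final "*.html" segment names an article; anything else is a topic.
        if (substr($last, -5) === '.html') {
            array_pop($segments);
            $article = substr($last, 0, -5);
        }
    }

    return array('topics' => $segments, 'article' => $article);
}

$parsed = parseObjectPath('/gallery/spring-2003/IMG_2442.html');
```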
+
+Standardized object-to-application mapping
+
+In addition to connecting URLs to Midgard objects, URLs also need to
+be connected to specific applications, or in MidCOM terms, components.
+
+All topics in MidCOM are assigned to be managed by a component. This
+means that different parts of the site can work in different ways. For
+example, URL:
+
+/news/midgard-tutorial.html
+
+Could load a "news ticker" component, and provide the topic "news" and
+article "midgard-tutorial" to be handled and displayed by it.
+
+The newsticker component can fully control the administrative
+interface for managing content under it, and the output provided by
+URLs it manages.
+
+A component is selected for each topic separately. This means that the
+example URL:
+
+/news/contacts/bergius.html
+
+Could be handled by an "employee directory" component.
+
+Standardized navigational system
+
+Each MidCOM component provides all navigational information about
+objects managed by it to a system called NAP, which is accessible by
+an easy object-oriented API.
+
+The NAP system means that site developers don't worry about different
+components or object types when writing the site's navigational
+interface. You can write one script for generating the whole site
+navigation, and it will work with the site and any component under it.
+
+This makes standardized navigational tools like breadcrumbs or the
+NemeinNavBar utility much more useful, as they can be used with any
+MidCOM-based site. I expect that in the near future site developers will
+have a huge library of prebuilt navigational systems to select from.
+
+Standardized object extensibility API
+
+Enabling content managers to define their own object types or metadata
+fields has always been a problem with Midgard, meaning that any new
+metadata field has forced site developers to write their own content
+creation UIs.
+
+MidCOM provides an easier system for this called datamanager. With
+datamanager, site developers can define their own custom data
+structures, called "layouts". Layouts are PHP arrays telling
+datamanager what fields to allow for objects handled for that
+component, how to present those fields in an administrative interface,
+and where to store them (parameters, object fields or attachments).
+
+Using datamanager, component writers don't really have to care about
+what object fields site developers will want to use, they just need to
+use the datamanager utility. Data structure "layouts" can be provided
+as part of the default component configuration, and can be overridden
+on a per-sitegroup basis.
+
+Datamanager is integrated into the MidCOM AIS content management
+interface, providing customized editing forms for all components based
+on widgets defined in the "layouts" configuration. The widgets can be
+anything from text input boxes to a WYSIWYG editor or image upload
+system.
+
+Standardized way to make application output configurable
+
+The MidCOM specification requires that all application output is
+handled through the MidCOM style system. MidCOM's style engine is an
+extension of the Midgard style engine, allowing component outputs to
+be configured using style elements, but also for fallback elements to
+be provided as snippets.
+
+This means that output of any MidCOM component will be fully
+configurable by site developers using the familiar Midgard style
+engine. The style to be used can be defined separately for each topic,
+allowing different output styles from the same component on a
+per-site-area basis.
+
+Because components can be loaded dynamically to a Midgard page, site
+developers can have different parts of the same page use different
+styles, making administration of the style elements much easier.
+
+Conclusions
+
+MidCOM brings into Midgard something that has been lacking so far: a
+"write once and run everywhere" framework for building site
+components, styles and navigational tools.
+
+This promotes component sharing and code reuse, both within a single
+Midgard solution provider company, and within the international Open
+Source community.
+
+So far Midgard has provided a nice content management framework, but
+actual sites have needed to be built from scratch. MidCOM promises to
+change that, making Midgard much easier to implement.
+
+Of course, sloppy coding is still possible with MidCOM, but if
+component writers adhere to the MidCOM specification, PEAR coding
+standards and use NemeinLocalization for internationalizing their
+components, we should achieve global reusability.
+
+I invite all Midgard developers to seriously study and consider MidCOM
+for their projects. There is some learning curve, but real code
+reusability should repay that very quickly.
+
+
+The Midgard Framework is a powerful toolkit for managing online
+information. Writing applications and functionalities to the platform
+is done using the easy-to-learn PHP scripting language. All
+interfacing with the system is done via a regular Web browser, and no
+special tools are needed for developers or content authors.
+
+Main features of Midgard Framework include:
+
+    * Easy and well documented Application Programming Interface (API)
+    * Efficient management of Web content using a hierarchical topic system
+    * Separation of layout, content and site logic
+    * Support for editorial workflow and approval mechanisms
+    * Attachment of metadata to all content objects
+    * Management of PIM data including contacts and calendaring information
+    * Multilingual support (including Unicode) and localization
+    * Replication for clustered setups and staging
+    * Multi-company support using virtual databases
+    * Flexible user and group management
+
+Midgard works on most common UNIX platforms, including Linux, FreeBSD
+and Solaris. Prebuilt binary packages are available for some Linux
+platforms (including Red Hat, Debian and Mandrake), and the system can
+be installed from sources to most other environments.
+
+For other environments, including hosted servers and Windows systems,
+there is the pure-PHP implementation, Midgard Lite.
+
+The Midgard Application Server is free software developed
+internationally with the Open Source model and distributed under the
+GNU licenses. Commercial support, applications and services for the
+platform are available from a range of companies worldwide.
+
+The PHPmole toolkit provides Midgard developers with a
+freely-available Integrated Development Environment (IDE) comparable
+to DreamWeaver and MS Visual Studio, with additional content
+management functionalities.
+
+
+With the Midgard CMS package, the ease-of-use of productivity software
+and office suites can be brought to Midgard content management.
+
+
+query building:
+
+<?php
+// Instantiate the Query Builder for seeking MidgardArticles
+$query = new MidgardQueryBuilder("MidgardArticle");
+
+// Next add the SQL constraints you need
+
+// List articles only from specific topic
+$query->addConstraint("topic", "=", $topic->id);
+
+// List only articles that have been approved since some timestamp
+$query->addConstraint("approved", ">", $starting_time);
+
+// Order the articles based on their approval time
+$query->addOrder("approved", "DESC");
+
+// Get only 20 articles for this particular view
+$query->setLimit(20);
+
+// Start from the requested offset in this article list (tainted input, so cast to int)
+$query->setOffset(isset($_REQUEST["startfrom"]) ? max(0, (int) $_REQUEST["startfrom"]) : 0);
+
+// Execute the query returning an array of matching MidgardArticle objects
+// The MidgardArticles are the full article objects with all regular methods
+$articles = $query->execute();
+
+if (!$articles)
+{
+    // Handle error
+}
+
+// And then display your articles
+print_r($articles);
+?>
+
+Query Builder in action
+Thanks to Jukka's efforts, we already have a working MidgardQueryBuilder.
+
+Let's start with a simple example.
+
+/* Define which MgdSchema type should be used and returned by QB */
+$qb = new midgardquerybuilder("NewMidgardArticle");
+
+/* Define constraints */
+$qb->addConstraint("topic", "<", 2);
+$qb->addConstraint("title", "=", "News");
+
+/* Execute SQL query and return array*/
+$f = $qb->execute();
+
+MySQL query executed:
+
+SELECT article.id FROM article_i,article
+WHERE
+article.topic < 2 AND article_i.title = 'News'
+AND article.id=article_i.sid
+
+
+As you can see, the title property is defined in the article_i table while
the topic property is defined in the article table.
+Query Builder follows the class's table definitions and is able to search
for objects which have more than one table as storage.
+$qb->execute(); returned an array with only one object (because SELECT
returned a single record), so
+
+print_r($f[0]);
+
+
+ NewMidgardArticle Object
+        (
+            [sitegroup] => 0
+            [author] => 0
+            [owner] => 0
+            [realm] => article
+            [guid] => cedda8cb461c9f846c73f043aaf888e9
+            [changed] =>
+            [updated] =>
+            [action] => create
+            [errno] => 0
+            [errstr] =>
+            [id] => 28
+            [calstart] => 0000-00-00
+
+etc etc etc
+
+Let's try to use datetime fields:
+
+$qb = new midgardquerybuilder("NewMidgardArticle");
+$qb->addConstraint("revised", ">", "2003-04-30 09:46:00");
+$f = $qb->execute();
+
+MySQL query executed:
+
+SELECT article.id FROM article_i,article
+WHERE
+article.revised > '2003-04-30 09:46:00'
+AND article.id=article_i.sid
+
+
+Now $qb->execute() returned an array with 5 objects. I do not want to
+print them all, so let's just check that the revised properties were
+selected correctly:
+
+print_r($f);
+
+Array(
+
+    [0] => NewMidgardArticle Object
+        (
+    [revised] => 2003-04-30 10:30:06
+
+   [1] => NewMidgardArticle Object
+        (
+    [revised] => 2003-04-30 10:01:18
+
+   [2] => NewMidgardArticle Object
+        (
+    [revised] => 2003-04-30 11:03:31
+
+   [3] => NewMidgardArticle Object
+        (
+    [revised] => 2005-04-05 16:29:16
+
+    [4] => NewMidgardArticle Object
+        (
+    [revised] => 2005-05-12 12:36:18
+
+
+Simple, fast and useful :)
+
+OK, now try to read about classes which extend MgdSchema classes and
+think how this could be used with Query Builder. PHP class names are
+not case-sensitive, but MgdSchema type names are. So if we used only
+lowercase for type and class names in MgdSchema, we could extend
+MgdSchema classes and use our own classes and objects with Query
+Builder too.
+
+Just like this:
+
+class Amerigard extends NewMidgardArticle
+{
+}
+
+class FlyHigh extends Amerigard
+{
+}
+
+$qb = new midgardquerybuilder("FlyHigh");
+$qb->addConstraint("topic", "<", 2);
+$qb->addConstraint("title", "=", "News");
+
+$f = $qb->execute();
+
+MySQL query executed:
+
+SELECT article.id FROM
+article_i,article
+WHERE
+article.topic < 2
+AND article_i.title = 'News'
+AND article.id=article_i.sid
+
+
+The above is not a working example of course, but it could be :)