19 July 2005

Some AJAX thoughts

Every webdeveloper has heard of AJAX last time. If you have not, you are not a webdeveloper, or you have invented it yourself and use another name for it. AJAX means: Asynchronous JavaScript and XML. But not all AJAX implementations use XML (see this article about JSON). But DHTML is not a buzz-word nowadays... Wikipedia has more info about AJAX than about DHTML.

Because I am a PHP developer, I am interested in the available PHP libraries for AJAX. After some searching I found the sajax and xajax libraries. I will not use sajax because it is not object oriƫnted.
But the JPSpan project is an AJAX project too. However, they do not use the AJAX buzz-word. And the JPSpan libraries make it possible to use PHP functions through JavaScript with the same name and parameters.

While reading around I found an interesting article titled: "Listen kids, AJAX is not cool". The author states JavaScript is not meant to implement basic functionality of a web page. I agree; too much web pages does not work without JavaScript. Use JavaScript (and all based on it) only to add some helper functions or do some layout enhancements.

However, I like the AJAX technology. You can use it to add hints (if typing the recipient in GMail) or create a dropdown list under an edit box. But always use it asynchronous (first A of AJAX). Requests have latency and it is very annoying if you can not type because the web page is waiting for an AJAX request.

For sending forms I have developed another asynchronous technology. I have implemented it in one site for the poll yet. But unfortunately that site is not updated frequently and the vote period has past. The maintainers should publish a new poll.
If JavaScript is enabled, the form will be sent using JavaScript and some info on the page will be dynamically updated. Without JavaScript a normal GET request will be sent and the whole page will be updated.
With JavaScript the request is sent by adding a <script> node to the HTML DOM tree with the url of the GET request in the src attribute. An extra parameter in the GET request indicates it is sent using JavaScript. If the script is sent using HTML this parameter is unavailable. The server responds depending on the presence of this parameter with JavaScript or with HTML.

An example of the JavaScript function to send a form:

/**
* Send contents of a form using a special JavaScript GET
*
* @access public
* @param object theForm
* @return bool return false on error (remember to set a return false in the onclick handler to prevent a double submit)
*/
function FormSendJs(theForm)
{
myRadioGroups = new Array();
myQuery = 'request_method=javascript';
for (i = 0; i < theForm.elements.length; i++) {
if ((theForm.elements.item(i).nodeName == 'INPUT') && (theForm.elements.item(i).type == 'radio')) {
/* handle radio buttons */
if (theForm.elements.item(i).checked) {
myRadioGroups[theForm.elements.item(i).name] = new Array();
myRadioGroups[theForm.elements.item(i).name]['Name'] = theForm.elements.item(i).name;
myRadioGroups[theForm.elements.item(i).name]['Value'] = theForm.elements.item(i).value;
} else if (!myRadioGroups[theForm.elements.item(i).name]) {
myRadioGroups[theForm.elements.item(i).name] = null;
}
} else {
/* other fields */
if (theForm.elements.item(i).name && (theForm.elements.item(i).name.length > 0)) {
myQuery += '&' + theForm.elements.item(i).name + '=' + theForm.elements.item(i).value;
}
}
}
/* detect if all radiogroups are filled */
myResult = true;
for (myRadioGroup in myRadioGroups) {
if (myRadioGroup == null) {
myResult = false;
break;
}
myQuery += '&' + myRadioGroups[myRadioGroup]['Name'] + '=' + myRadioGroups[myRadioGroup]['Value'];
}
if (!myResult) {
lastError = 'U heeft geen optie gekozen!';
return false;
}
/* construct URL */
myQuery = myQuery.replace(/&$/, '');
myUrl = new String('');
myUrl += theForm.getAttribute('action');
if (!myUrl.match(/\?/)) {
myUrl += '?';
} else if (!myUrl.match(/&$/)) {
myUrl += '&';
}
myUrl += myQuery;
/* construct results id */
myResultsId = theForm.id.replace(/_form_id$/, '_results_div_id');
/* remove form */
theForm.parentNode.removeChild(theForm);
/* display results */
myResults = document.getElementById(myResultsId);
myResults.style.display = 'block';
/* include JavaScript with constructed Url which will update the results */
myScript = document.createElement('SCRIPT');
myScript.type = 'text/javascript';
myScript.src = myUrl;
myResults.parentNode.insertBefore(myScript, myResults);
return true;
}

11 July 2005

Store MPTT trees with history

When searching for a method to store trees in a database (in this case MySQL) in the spring of 2004, I found a method called "Nested Sets" by Joe Celko (other version). This method is also known as MPTT (Modified Preorder Tree Traversal). Using this method, queries to get all parents or all childs are much faster, then recursively getting the next parent or next childs.

To be able to save the history of my tree, I added two columns to the database: valid_from and valid_till. To insert an item in the tree, I backuped the items to be changed (the items on the right and the parent items) setting the valid_till field to now, inserted copies of the backuped ones with higher left and right values (and a valid_from field set to now) to make room for the new item, and finally inserted the new item. Doing so the number of rows in the database grew fast; every insert caused all right items and parents to be copied for history.

I researched alternatives for the MPTT algoritm and found some good articles on the internet. Vadim Tropashko wrote an article about "Nested Intervals" as an alternative to Nested Sets (other articles). Using that method you did not need updating parent items, but the right items must be updated as well.
Another drawback an implementation of Nested Intervals is, that you must use fractions or very accurate floats. When using fractions your performance decreases because they must be evaluated to select items. Using floats may give round errors if there are many levels and/or many items.

When brainstorming with my colleague Hermen van den Hoorn about our growing database table containing the tree, we got the ultimate idea. Why should there be no gaps in the tree? Why are we reformatting the tree each time to use each value between the left and right value of the root? The MPTT model does not dictate that.
Wanting to save the history of the tree, we will never delete an item; only expire it by setting its valid_till field to now. If we leave room for the deleted item, i.e. do not rebuild the left and right values of the tree, the tree remains correct. Adding an element requires making room for it, but does not need to backup the entire tree.
The final tree will contain all items: deleted items, current items and perhaps future items. If we take care of the whole tree, even the deleted and future items, if changing it, we are able to get the tree as it was on each timestamp.

An example to illustrate.
Base table (from Joe Celko's article):
id name mptt_left mptt_right mptt_level valid_from valid_till
1 Albert 1 12 1 2000-01-01 9999-12-31
2 Bert 2 3 2 2000-01-01 9999-12-31
3 Chuck 4 11 2 2000-01-01 9999-12-31
4 Donna 5 6 3 2000-01-01 9999-12-31
5 Eddie 7 8 3 2000-01-01 9999-12-31
6 Fred 9 10 3 2000-01-01 9999-12-31

The corresponding tree (after 2000-01-01) looks like:
          Albert (1,12)
           /       \
         /           \
   Bert (2,3)   Chuck (4,11)
                  /   |  \
                /     |    \
              /       |      \
            /         |        \
       Donna (5,6) Eddie (7,8) Fred (9,10)

Now I want to add employee Jack under Bert per 2003-01-01. We make room and the table becomes:
id name mptt_left mptt_right mptt_level valid_from valid_till
1 Albert 1 14 1 2000-01-01 9999-12-31
2 Bert 2 5 2 2000-01-01 9999-12-31
3 Chuck 6 13 2 2000-01-01 9999-12-31
4 Donna 7 8 3 2000-01-01 9999-12-31
5 Eddie 9 10 3 2000-01-01 9999-12-31
6 Fred 11 12 3 2000-01-01 9999-12-31

For all employees right of Bert the left and right values are increased by 2. For all parents, including Bert, the right value is increased by 2. The tree is still valid.
After adding Jack the table becomes:
id name mptt_left mptt_right mptt_level valid_from valid_till
1 Albert 1 14 1 2000-01-01 9999-12-31
2 Bert 2 5 2 2000-01-01 9999-12-31
3 Chuck 6 13 2 2000-01-01 9999-12-31
4 Donna 7 8 3 2000-01-01 9999-12-31
5 Eddie 9 10 3 2000-01-01 9999-12-31
6 Fred 11 12 3 2000-01-01 9999-12-31
7 Jack 3 4 3 2003-01-01 2004-01-01

And the tree (per 2003-01-01):
          Albert (1,14)
           /       \
         /           \
   Bert (2,5)   Chuck (6,13)
      /           /   |  \
    /           /     |    \
 Jack (3,4)   /       |      \
            /         |        \
       Donna (7,8) Eddie (9,10) Fred (11,12)

When Jack stopped on 2004-01-01 the tree is:
          Albert (1,14)
           /       \
         /           \
   Bert (2,5)   Chuck (6,13)
                  /   |  \
                /     |    \
              /       |      \
             /         |        \
       Donna (7,8) Eddie (9,10) Fred (11,12)

The table is the same as above.

It is very simple to maintain the tree. You can move an item by expiring it and insert it as new node (after having made room) with the same id. Note that the id, the valid_from and valid_till columns form an unique key.

I am now busy with the conversion from our 72,000 records table to a < 900 records table...