There and Back Again

Improving JPSpan page load performance

If you use JPSpan to augment a normal website, you may have noticed that your page loads now take quite a bit longer. The reason is the JPSpan client JavaScript code: it’s generated on each load, so it can’t be cached on the client. This file can become huge if you register lots of classes with lots of methods to create stubs for.

The solution is to allow this page to be cached. For that I’m using HTTP_Cache, and I’ll also be caching the output generated by PHP, since the client files shouldn’t change often.

When you start caching, the most important question to answer is, “How do I know when I need to regenerate my cache?” In the JPSpan case two things can cause a cache regeneration: the server URL changing (the URL of the JPSpan server is embedded in the file) or the API you’re exporting changing. The server URL might never change for some people, but in my usage the IP is different depending on whether you’re using the VPN or the public URL, so this needs to be taken into account. Keying the cache on the URL also keeps a stale cache file from development from ruining things.
The API is a little harder, but since JPSpan has to know what to export, it’s something that must be available.

In fact, it’s in the descriptions array on the JPSpan_Server_PostOffice object. So to know if it has changed, you just make a hash of it and check for a change in that. Code to make our API hash is shown below.

// create a hash from the API of the handlers
// turn the descriptions into a string: class name followed by its method names
$api = "";
foreach($S->descriptions as $key => $val) {
	$api .= $key;
	foreach($val->methods as $method) {
		$api .= $method;
	}
}
// any change to the exported classes or methods changes this hash
$apihash = md5($api);
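
To see why this hash works as a change detector, here is a small sketch that runs the same loop over stand-in objects. The `apiHash()` helper and the `stdClass` stubs are hypothetical and only mimic the shape assumed for `$S->descriptions` (a class name mapped to an object with a `methods` array); they are not real JPSpan descriptions.

```php
<?php
// Hypothetical helper wrapping the loop above. $descriptions stands in for
// $S->descriptions: class name => object with a `methods` array (assumed shape).
function apiHash(array $descriptions)
{
    $api = '';
    foreach ($descriptions as $class => $description) {
        $api .= $class;
        foreach ($description->methods as $method) {
            $api .= $method;
        }
    }
    return md5($api);
}

// Stand-in description object, not a real JPSpan handler description.
$math = new stdClass();
$math->methods = array('add', 'subtract');

$before = apiHash(array('math' => $math));

// The exported API grows by one method, so the hash must change.
$math->methods[] = 'multiply';
$after = apiHash(array('math' => $math));

var_dump($before === $after); // bool(false)
```

Because the hash ends up in the cache key, exporting a new method automatically invalidates any cached client.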

Now that we know when to regenerate our cache, it’s just a matter of implementing it. First we’ll implement the client-side cache; to do this we use HTTP_Cache to send 304 codes when nothing has changed. The code to do this is shown below. Notice that we’re not compressing the generated JavaScript yet (stripping whitespace and comments) since it’s a really expensive operation; that will have to wait until we cache that part too. Also take note that we’re not calling displayClient anymore: it echoes its output and then calls exit, which makes it pretty useless for caching.

	// set up HTTP_Cache, give it our custom ETag, and see if we need to generate the client
	// ($etag is built from the API hash and the host; see the complete listing below)
	require_once 'HTTP/Cache.php';

	$cache = &new HTTP_Cache();
	$cache->setEtag($etag);

	if (!$cache->isValid()) {

		// Compress the output Javascript (e.g. strip whitespace)
		//define('JPSPAN_INCLUDE_COMPRESS',TRUE);

		// Display the Javascript client
		$G = & $S->getGenerator();
		require_once JPSPAN . 'Include.php';
		$I = & JPSpan_Include::instance();
		
		// HACK - this needs to change
		$I->loadString(__FILE__,$G->getClient());
		$client = $I->getCode();
		header('Content-Type: application/x-javascript');

		$cache->setBody($client);
	}
	else {
		// something is setting Cache-Control 
		header('Cache-Control: must-revalidate');
	}

	$cache->send();

If things are working correctly, the web browser shouldn’t be fetching the entire page after the first time; it should just be getting a 304 status as the result. There are a couple of ways to test that this is working. The easiest is to just view the client page, get the page info in Firefox, and see if it’s in your browser’s cache; if not, something is broken. In debugging these types of problems I’ve found the LiveHTTPHeaders Firefox extension to be useful. What you’re looking for is the server sending the ETag and the client sending it back in an If-None-Match header on the next reload. The server should then respond with a 304 instead of a 200.
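
The exchange to look for in the headers can be sketched as a small function. This is only an illustration of the ETag comparison, not HTTP_Cache’s actual implementation; `statusForRequest()` and the sample values are hypothetical.

```php
<?php
// Hypothetical helper: given the server's current ETag and the raw
// If-None-Match header from the request (or null when the browser sent none),
// return the status code the server should answer with.
function statusForRequest($etag, $ifNoneMatch)
{
    if ($ifNoneMatch === null) {
        return 200; // first visit: nothing cached yet, send the full body
    }
    // A client may send several comma-separated, quoted tags.
    foreach (explode(',', $ifNoneMatch) as $candidate) {
        $candidate = trim(trim($candidate), '"');
        if ($candidate === $etag) {
            return 304; // cached copy is current, send headers only
        }
    }
    return 200; // tag is stale (API or host changed), send a fresh body
}

$etag = md5('client-abc123-example.com.js'); // sample value for the demo

echo statusForRequest($etag, null), "\n";              // 200
echo statusForRequest($etag, '"' . $etag . '"'), "\n"; // 304
echo statusForRequest($etag, '"stale-tag"'), "\n";     // 200
```

If the second request in your header trace still shows a 200, check whether the browser actually sent the tag back or whether something else on the server is overriding the caching headers.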

The next step is to add a file cache in PHP for the generated client stubs so that we can turn on whitespace stripping and make the first download faster. That’s pretty simple stuff: build your filename, check if the file exists, and if so use it; otherwise generate the client and write it out to a file for later use. I’m using file_put_contents to write to the file, so if you’re on PHP 4 you’ll want to replace that code or check out PHP_Compat. Complete code showing both client-side caching using HTTP_Cache and 304s, and server-side caching, is shown below.

// Include JPSpan and setup your server here
// now our new updated client serving code with lots o caching
if (isset($_SERVER['QUERY_STRING']) && strcasecmp($_SERVER['QUERY_STRING'], 'client')==0) {
	// cache dir
	$cacheDir = APP_ROOT."/tmp/";

	// create a hash from the api of the handlers
	// turn the descriptions into a string
	$api = "";
	foreach($S->descriptions as $key => $val) {
		$api .= $key;
		foreach($val->methods as $method) {
			$api .= $method;
		}
	}
	$apihash = md5($api);

	// get the host the request is being made with since it gets embedded in the client file
	$server = preg_replace('/[^a-zA-Z0-9\._]/','',$_SERVER['HTTP_HOST']);

	// create the filename
	$cacheFile = "client-$apihash-$server.js";

	// create the etag
	$etag = md5($cacheFile);

	// setup HTTP_Cache give it our custom etag and see if we need to generate the client
	require_once 'HTTP/Cache.php';

	$cache = &new HTTP_Cache();
	$cache->setEtag($etag);

	if (!$cache->isValid()) {

		if (!file_exists($cacheDir.$cacheFile)) {
			// Compress the output Javascript (e.g. strip whitespace)
			define('JPSPAN_INCLUDE_COMPRESS',TRUE);

			// Display the Javascript client
			$G = & $S->getGenerator();
			require_once JPSPAN . 'Include.php';
			$I = & JPSpan_Include::instance();
			
			// HACK - this needs to change
			$I->loadString(__FILE__,$G->getClient());
			$client = $I->getCode();

			file_put_contents($cacheDir.$cacheFile,$client);
		}
		else {
			$client = file_get_contents($cacheDir.$cacheFile);
		}


		header('Content-Type: application/x-javascript');

		$cache->setBody($client);
	}
	else {
		// something is setting Cache-Control 
		header('Cache-Control: must-revalidate');
	}

	$cache->send();

} else {

	// This is where the real serving happens...
	// Include error handler
	// PHP errors, warnings and notices serialized to JS
	require_once JPSPAN . 'ErrorHandler.php';

	// Start serving requests...
	$S->serve();

}

Update: added data cleaning around HTTP_HOST since I use that in a file name
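
For anyone on PHP 4, where file_put_contents isn’t available, the write can be done with fopen and fwrite instead. This is a minimal sketch, assuming you don’t want to pull in PHP_Compat; `writeCacheFile()` is a hypothetical name, and the demo lines at the bottom use sys_get_temp_dir (PHP 5.2.1+) only to have somewhere safe to write.

```php
<?php
// Hypothetical replacement for file_put_contents($cacheDir.$cacheFile, $client)
// that only uses functions available in PHP 4.
// Returns the number of bytes written, or false on failure.
function writeCacheFile($filename, $data)
{
    $handle = fopen($filename, 'wb');
    if ($handle === false) {
        return false;
    }
    $written = fwrite($handle, $data);
    fclose($handle);
    return $written;
}

// Demo only: write a small stand-in client file and read it back.
$file = tempnam(sys_get_temp_dir(), 'jpspan');
$bytes = writeCacheFile($file, "var client = 1;\n");
$roundTrip = file_get_contents($file);
unlink($file);
```

In the listing above, the call would take the place of the file_put_contents line.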

9 thoughts on “Improving JPSpan page load performance”

  1. Mike Bulman

    I’ve actually been using JPSpan a lot since I discovered it. I am in the middle of creating an application that will use multiple JPSpan servers. I realized that doing this meant forcing the user to download 12KB of the same javascript code (the base JPSpan code) for every server. Since I am manually crunching javascript anyway, I decided to break this code up into a 12KB base.js and multiple javascript files for each server. This caused some hassle, but I didn’t see any other way around it.

    Any chance future functionality might allow the postoffice to only output server-specific JavaScript?

  2. Joshua Eichorn Post author

    It shouldn’t be that hard to do that, but if you use this code I’m not sure that it matters unless you’re using multiple pages to expose the stub classes.

  3. Patrick Nijs

    I am using JPSpan too to serialize the POST-string and get it to a PHP page which can then unserialize the POST-values.
    But I’m having problems with enters in Mozilla Firefox v1.0.6.
    It seems that Mozilla is counting an \r\n (chr(13) && chr(10)) as 1 character, while Microsoft Internet Explorer v6 counts an \r\n as two characters.
    PHP will unserialize a JPSpan serialized string correctly with Internet Explorer, but will generate an error on a serialized string POST’ed by FireFox, because the string length is too short (differs 1 character on every enter).
    Are you familiar with this?

    Patrick

  4. Joshua Eichorn Post author

    Patrick:
    I haven’t seen this happen. \r\n is two characters, so getting the wrong count from Firefox does sound like a bug. Are you using the default XML serializer in JPSpan or something else?

  5. Patrick Nijs

    Hi Joshua,

    That was a quick reply 😉

    Well, in my Smarty (http://smarty.php.net) templates, I’ve defined following in the -section of my HTML-output:

    {php}
    require_once 'JPSpan.php';
    require_once JPSPAN . 'Include.php';
    JPSpan_Include_Register('util/data.js');
    JPSpan_Include_Register('encode/php.js');
    {/php}

    {php}
    JPSpan_Includes_Display();
    {/php}

    And then I am calling this function to serialize the POST-values and send the serialized string to a php-page:

    function postForm() {

    var Encoder = new JPSpan_Encode_PHP();
    var data = new Array();
    var serialized_string;

    data["textfield"] = document.getElementById('textfield').value;
    serialized_string = Encoder.encode(data);

    alert(serialized_string);

    document.serialize.data.value = serialized_string;
    document.serialize.submit();

    }

    Now in FireFox I get this when I POST an ENTER in my TEXTAREA:
    HTTP_POST String:
    a:1:{s:9:"textfield";s:1:"
    ";}

    Notice: unserialize(): Error at offset 27 of 31 bytes in c:\htdocs\temp\serialize_bleeding_edge.php on line 15

    As you see, FireFox counts this enter only as 1 character.
    It also goes wrong when I’m using Debian Testing as my PHP-enabled webserver.

  6. Joshua Eichorn Post author

    In the email I got, things make sense.

    As far as I can tell you’re one of the few people using the PHP serializer in JPSpan. I wouldn’t really recommend using it: since its format includes string sizes, it’s always going to be really brittle when generated from other languages.

    Also, it can be used to create an instance of any class that’s already been included, which can be a security risk. If you really want the added performance this gives you, I would recommend writing your own version of unserialize that won’t create new objects, or only creates ones on a whitelist. You could also cover this bug in that case.

    You might also want to have a look at HTML_AJAX, which uses JSON by default in both directions.

  7. Patrick Nijs

    Well, like I stated before, the main reason we are using JPSpan is because JPSpan can serialize an array of data client-side through Javascript.

    It adds structure to POSTDATA which normally without JPSpan or serialize functions, isn’t structured.

    So with this functionality in mind, are you still recommending HTML_AJAX? (Haven’t quite looked at it yet…)

  8. Joshua Eichorn Post author

    The default JPSpan encoding type is XML, it doesn’t suffer from this bug, but gives you the exact same result.

    The default encoding for HTML_AJAX is JSON (in both directions).

    If you’re using JPSpan with the default XML serializer or HTML_AJAX with JSON, you’ll be able to transparently move JavaScript datatypes to PHP ones and vice versa.

    Minus any bugs you hit, of course, but both have worked well for me. Also, I’m the author of HTML_AJAX, just to be clear, but its goal is to support everything JPSpan does plus more while being lighter weight and faster.