There and Back Again

Webthumb API additions

If you were wondering, an API that requires polling isn’t a very good thing for scalability. On my current setup I can pretty easily handle about 20 status requests per second on top of my normal traffic; the problem is that it’s not hard for a single user running a bad polling implementation to make that many requests.

To solve this problem I’m adding an extension to the Webthumb API that will allow you to skip polling altogether. The basic idea is that you make an API request, and when your thumbnail is complete I’ll make a GET request back to your server telling you that the request is complete.

So on the request side it’s just a matter of including a notify tag in your request block. An example is shown below:

<webthumb>
	<apikey>apikeyhere</apikey>
	<request>
		<url>webthumb.bluga.net</url>
		<notify>http://webthumb.bluga.net/sample/notify.php?secret=blahblahblah</notify>
	</request>
</webthumb>

I’m including a secret variable in the URL so that if someone found the URL of my notify script, they couldn’t DoS me by making me download hundreds of different thumbnails from the server.
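
As a rough sketch of how a client might assemble this request block, the helper below builds the same XML as a string and appends the secret to the notify URL. The function name `buildWebthumbRequest` and its parameters are made up for illustration, it assumes the notify URL has no query string of its own, and actually sending the XML to the API is left out.

```php
<?php
// Hypothetical helper (not part of Webthumb): builds the request block shown above.
function buildWebthumbRequest($apiKey, $url, $notifyUrl, $secret)
{
    // Assumption: $notifyUrl has no query string yet.
    // http_build_query() URL-encodes the secret for us.
    $notify = $notifyUrl . '?' . http_build_query(array('secret' => $secret));

    // Escape values so characters like & and < stay valid inside the XML.
    $esc = function ($value) {
        return htmlspecialchars($value, ENT_QUOTES | ENT_XML1, 'UTF-8');
    };

    return "<webthumb>\n"
        . "\t<apikey>" . $esc($apiKey) . "</apikey>\n"
        . "\t<request>\n"
        . "\t\t<url>" . $esc($url) . "</url>\n"
        . "\t\t<notify>" . $esc($notify) . "</notify>\n"
        . "\t</request>\n"
        . "</webthumb>\n";
}
```

Keeping the secret out of the function body means the same value can be shared with the notify script's configuration.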

I’ve written a basic notify script to get you started. Feel free to use this script as a basis for whatever you need.

Update: Paths to thumbnails have changed, so the download code listed here won’t work. The new directory hash is:

<?php
substr($id,-2).'/'.substr($id,-4,-2).'/'.substr($id,-6,-4)
?>

Which means if your job id is: wt4761c8f914559 then the directory is: http://webthumb.bluga.net/data/59/45/91/
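
To make the hash easier to reuse, here is a small sketch that wraps it in a function. The names `webthumbJobDir` and `webthumbJobUrl` are made up for illustration; the base URL is the one from the example above.

```php
<?php
// Hypothetical helper: turns a job id into its data directory using the hash above.
// The last two characters come first, then the two before those, and so on.
function webthumbJobDir($id)
{
    return substr($id, -2) . '/' . substr($id, -4, -2) . '/' . substr($id, -6, -4);
}

// Hypothetical helper: full directory URL for a job.
function webthumbJobUrl($id, $base = 'http://webthumb.bluga.net/data/')
{
    return $base . webthumbJobDir($id) . '/';
}
```

With the job id from the example, `webthumbJobDir('wt4761c8f914559')` gives `59/45/91`.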

<?php

// this is a really simple notify script
// it downloads the specified thumbs for a job and puts them in the storage dir
// if you want to store files based on URLs etc. you'll need to store the id at request time
// and then do a mapping in this script

// download options
// zip     - all sizes in a zip
// zipAuto - zip download and auto uncompress 
// large   - 640x480
// medium2 - 320x240 
// medium  - 160x120
// small   - 80x60
$downloadType = 'zipAuto';


// secret id I'm using to make sure no one can make me download every thumb etc.
$mysecret = 'changeme';

// directory to write files to
$storageDir = 'tmp';

// webthumb base url
$url = 'http://webthumb.bluga.net/data/';

// unzip command
$unzipCommand = 'unzip';


// END CONFIG
if (!isset($_GET['id']) || !isset($_GET['secret'])) {
    exit;
}

if ($mysecret == 'changeme') {
    echo "Configure notify script";
    exit;
}

$jobId = $_GET['id'];
$secret = $_GET['secret'];

if ($secret !== $mysecret) {
    echo "bad secret";
    exit;
}

// NOTE: this is the original path layout; see the update above for the current directory hash
$jobDir = substr($jobId,-4);

switch($downloadType) {
    case 'zip':
    case 'zipAuto':
        $file = "$jobId.zip";
        break;
    default:
        $file = "$jobId-thumb_$downloadType.jpg";
        break;
}


// this is the simplest possible download code, curl, PEAR http_request might be better
// will only work if allow_url_fopen is on
$contents = file_get_contents($url.$jobDir."/$file");
file_put_contents($storageDir."/$file",$contents);

if ($downloadType == 'zipAuto') {
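    // escape the arguments so a crafted job id can't inject shell commands
    exec("cd ".escapeshellarg($storageDir)." && $unzipCommand ".escapeshellarg($file));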
    unlink($storageDir."/$file");
}

?>
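
The comments at the top of the script mention storing the job id at request time and mapping it back when the notification arrives. A minimal sketch of that idea, using a JSON file as the store, might look like the following. The function names and the JSON-file storage are assumptions made for illustration (a real application would more likely use a database), and it assumes your request code has the job id in hand when it calls `rememberJob`.

```php
<?php
// Hypothetical helpers: a tiny JSON-file map from job id to the URL that was requested.

// Load the whole map, treating a missing or empty file as an empty map.
function loadJobMap($mapFile)
{
    if (!is_file($mapFile)) {
        return array();
    }
    $map = json_decode(file_get_contents($mapFile), true);
    return is_array($map) ? $map : array();
}

// Call this at request time, once you know the job id.
// Note: this read-modify-write is not safe against concurrent writers.
function rememberJob($mapFile, $jobId, $url)
{
    $map = loadJobMap($mapFile);
    $map[$jobId] = $url;
    file_put_contents($mapFile, json_encode($map), LOCK_EX);
}

// Call this from the notify script: returns a filesystem-safe name
// derived from the stored URL, or null if the job id is unknown.
function nameForJob($mapFile, $jobId)
{
    $map = loadJobMap($mapFile);
    if (!isset($map[$jobId])) {
        return null;
    }
    return preg_replace('/[^A-Za-z0-9._-]/', '_', $map[$jobId]);
}
```

In the notify script, the result of `nameForJob()` could replace `$jobId` when building the local filename, while the download URL still uses the job id.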

Let me know if you find any major bugs in the code. There will always be cases where the polling API makes more sense (command-line utilities, for example), but I think this notify API should work well for any application integration.

19 thoughts on “Webthumb API additions”

  1. Pingback: PHPDeveloper.org

  2. Mike Naberezny

    Joshua,

    Another approach is that you could extend the WebThumb API to allow the client to specify which image size(s) the server should return. This would give the server enough information to then complete the next transaction in single asynchronous POST request back to the client with the image data instead of just notifying.

    The client end would be simplified by not having to make another request and no longer needing the secret key. Clients would also gain the familiarity and efficiency of having the images you POST back handled like any other file uploads.

    On the server, it saves a roundtrip. Also, scripts like notify.php above can hold your notifier connection open until the download is complete on a separate connection, effectively doubling the number of connections your server has to handle. POSTing the data back in a single request eliminates this possibility so it will allow you to scale better as the API usage grows.

    If you implement this, consider PUTting the image data back instead of POSTing it. RESTifarians will appreciate your good taste in verbs and PHP provides handling for PUT requests (http://www.php.net/manual/en/features.file-upload.put-method.php). If you wanted to get fancy and add a little extra security, allow users to specify auth credentials to use in the return PUT and then they even have the option of configuring their HTTP server to collect the images you send automatically.

    Regards,
    Mike Naberezny

  3. Joshua Eichorn Post author

    Mike: I’ll look into putting the data back with a notify-type approach. I went with making a separate request for the image because it’s something I could get up and running in an hour. It also allows me to serve up the images with straight Apache instead of through PHP. Anyhow, there are some other features I’d like to add to the API and I’ll put your idea on that list.

  4. Nima

    Hi,

    I use your notify script on my own server, but it doesn’t seem to be getting any hits (requests) from your server.
    Is it working correctly?

  5. Joshua Eichorn Post author

    Nima, it was in my testing, but I don’t think many people have used it yet so there could be a bug.

    I don’t see a request in the logs that contains a notify URL, but things are working correctly in my testing.

    Send me an email (josh@bluga.net) with the XML payload you’re sending and we will get it worked out.

  6. Oscar Merida

    Great service – the jobDir has changed since this script was written, I had to change it to the following to get it to work:

    $jobDir = substr($jobId,-2) . '/' . substr($jobId, -4, 2) . '/' . substr($jobId, -6, 2);

  7. John

    Great web service, thank you.

    Is there a way to make this get multiple images (from multiple URLs at a time)? Or feed it a list of URLs?
    And also to save them with filename domain1.com.jpeg, domain2.com.jpeg etc?

    Seems to me these changes would make it very useful for lots of people

    Thanks
    John

  8. Joshua Eichorn Post author

    John:
    You can have multiple request blocks for multiple URLs; see http://bluga.net/webthumb/api.txt for an example.

    Filename-wise, I don’t have a way to do nicer ones until I do a new rev of the API, since I need unique names on my end and the files are directly exposed with the current design.

  9. epto

    I think:
    json or serialize are better than xml in this application!
    This script is simple and it works.

    sid: is the filename

    <?
    $sid=@$_GET['sid'];
    $id=@$_GET['id'];

    $savepath="/mysavepath";
    $path=substr($id,-2).'/'.substr($id,-4,-2).'/'.substr($id,-6,-4);
    $url="http://webthumb.bluga.net/data/$path/$id.jpg";
    $data=file_get_contents($url);
    if ($data!==FALSE AND strpos($data,'OK

    Take this script as an example and test it.
    Don’t lose money and time with the others!

  10. Joshua Eichorn Post author

    epto: I am looking into adding JSON support in the future, but it’s not like the XML we’re working with is complex, so I don’t see a huge win for anyone in supporting it, and there are more interesting features to work on.

  11. Drew

    Hi Josh,

    I am currently implementing your script on the CodeIgniter framework, which means no query strings (or there are, but they are in directory path format). Currently your notify callback appends the job id to the URL, which results in a 404 Not Found for me.

    Even if you would still like to do it via post, perhaps you could come up with a way to format the way in which it is returned?

    If I could get it to post back as in the following it would be good to go as well:
    /notifiers/webthumb/

    -or-

    Is it possible to POST this data to the notify_url rather than GET?

    Sending the POST data to the url
    /notifiers/webthumb/

    would be another option.

    Thoughts?

  12. Drew

    Oops. The first example URL above is supposed to be
    /notifiers/webthumb/{job_id}

  13. Marc Palmer

    Josh… I think it would be much nicer if the notify callback included the URLs of the thumbnails, or at least the base URL.

    This also gives you more options about where this stuff is stored without breaking all the existing clients – eg temporarily shift files up to Amazon S3 or to some other temporary domain.

    eg post to the notify callback some XML data with thumbnail type + url pairs.

  14. Joshua Eichorn Post author

    Marc: I just saw your post. I do have some additions to notify that are in testing; send me an email if you want to help test/shape the new features.

  15. Soren

    I agree with ad. 16 – It would be really nice if you could choose a notification that just includes a URL to the specified thumb.