REST Easy With Zope
Plain old HTTP as an All-Purpose Web Services Protocol
John Barham
DTS Digital Images (formerly Lowry Digital Images)
The DTS DI Production Environment
- DTS DI is in the movie restoration business (e.g., Star Wars, Lady and the Tramp, Peter Pan, James Bond)
- I have been a software developer for DTS DI since early 2004, writing internal
web applications using Zope and PostgreSQL, and network applications in Python
- Massively distributed environment:
- > 550 Mac dual and quad G5 image processing nodes
- ~400 terabytes of storage on 90 Linux file servers
- ~25 touch-up artist OS X workstations
- Gigabit Ethernet network
- One Zope control server to rule them all!
The Movie Restoration Pipeline
- Scan best available film negatives at "4K" (4096 x 3112 pixels, ~49 MB) or "2K" (2048 x 1556 pixels, ~12 MB):
~8 terabytes of disk space required for a 4K scan of a two hour movie!
- Project managers set parameters of custom image processing software, which is run on
the cluster to remove damage from movie frames
- In-house touch-up artists manually fix damage that the software missed (or, in some cases,
introduced as artifacts)
- Restored images are then returned to the client, either transferred back to film or on disk
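The disk space figure for a 4K scan can be checked with simple arithmetic. The sketch below assumes the standard film rate of 24 frames per second (not stated on the slide) and decimal units (1 TB = 1,000,000 MB); the frame sizes are the approximate figures quoted above. Under those assumptions a two-hour 4K scan comes to roughly 8.5 TB, in line with the ~8 TB quoted.

```python
# Back-of-the-envelope storage estimate for a scanned movie.
# Assumption: 24 frames per second; frame sizes are the approximate slide figures.
FRAMES_PER_SECOND = 24
FRAME_SIZE_MB = {"4K": 49, "2K": 12}


def scan_size_tb(hours, resolution):
    """Return the approximate size in terabytes of a full scan."""
    frames = int(hours * 3600 * FRAMES_PER_SECOND)
    return frames * FRAME_SIZE_MB[resolution] / 1_000_000


if __name__ == "__main__":
    for resolution in ("4K", "2K"):
        print(f"{resolution}: {scan_size_tb(2, resolution):.1f} TB")
```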
Details: The Manual Touch-Up Stage
- Movie is physically divided into "clips" with ~12,000 - 15,000 frames/clip;
all frames for a clip are stored in a single folder on FTP file server
- Physical clips are divided into logical shots; shot list is stored in PostgreSQL database
- Once automated image processing cleanup has been completed, the movie (or
clip) is released to the touch-up dept.
- Each shot is assigned to a touch-up artist who pulls shot frames from file
server, repairs damage on local Mac workstation and saves frames back to file
server. Repeat until complete.
- Artists repair frame damage using an in-house developed Photoshop-like
application, Paint, written in C++ and based on the Qt toolkit
Example Project: Disney's Peter Pan
- 110,000 frames
- ~20 terabytes of image data
- > 8.5 years of Mac dual 2.0 GHz G5 CPU processing hours
- In touch-up for ~4 months
Touch-Up Stage Requirements
- Store physical location of movie frames; given size of image files, multiple file servers are required per movie project
- Track which shots have been assigned to which artists and ensure that only one artist is assigned per shot, otherwise artists duplicate or overwrite work
- Record repair activities (e.g., dirt, scratch or gate hair removal) per shot
- Track shot state to generate % completion rate for each clip/movie for management/clients
- Generate statistics (e.g., touch-up hours per frame) for billing and estimating purposes
Tracking Touch-Up: The Early Days
- Shot ranges and locations were entered in a binder at the front of the touch-up room
- Artists walked up to the binder, copied down (or memorized) the frame numbers of a free shot,
initialed the shot they were working on, and returned to their workstations, where
they typed the same information into an Open Office timesheet
- Artists then downloaded frames from the FTP server using a graphical Mac FTP client,
cleaned up the frames on their workstations, saved them back to the FTP server,
updated their timesheets, walked back up to the binder at the front of the room,
annotated the checked-out shot, self-assigned another shot, and repeated the cycle...
Manual Touch-Up Tracking Problems
- Everything is written down on paper: too much scope for error
- Since artists are self-assigning shots, they might get a shot that is too
easy or difficult for their skill level
- Open Office spreadsheet timesheet files are used primarily for tracking
artists' hours for payroll; they are not useful for tracking overall progress of a movie or
cumulative time spent per movie for client billing or estimating purposes
Enter the Touch-Up "WebApp"
- A web-based application for the touch-up dept., built using PostgreSQL and Zope
- Touch-up shift managers assign shots to artists based on their skill levels
- Artist authentication using the Zope LDAPUserFolder Product
and workstation identification from the IP address of HTTP requests
- Artists initiate specific activities from the web app (e.g., downloading
frames, cleaning dirt, removing scratches, saving frames)
- An artist's daily activities together make up their timesheet, hence no
more need for the Open Office spreadsheet
- Real-time report generation from PostgreSQL
The WebApp: Better, but not Perfect
- Files still had to be downloaded from and saved to FTP servers manually:
- 15,000 file folder listings still took a long time to render
- Error-prone with occasional erroneous transfer destroying hours of work
- Artists had to constantly flip between web browser, where WebApp was
running, and Paint, in-house touch-up application
- Why couldn't Paint itself know what an artist was supposed to work on?
Direct Integration Option
- Authenticate artists using OpenLDAP C library
- Directly access PostgreSQL database using the PostgreSQL C client library (libpq)
- Problems:
- LDAP programming is painful
- Accessing database directly means replicating in C++ business logic
already implemented in Zope (e.g., invariant that artists can only
have one touch-up activity in progress at a time is a rule enforced
by the web app, not by the database)
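To illustrate the kind of rule that would have to be duplicated in C++, here is a minimal sketch of the one-activity-at-a-time invariant enforced in application code. The class and method names are hypothetical, and an in-memory dictionary stands in for the PostgreSQL tables; the actual Zope implementation is not shown in these slides.

```python
class ActivityInProgressError(Exception):
    """Raised when an artist tries to start a second concurrent activity."""


class ActivityTracker:
    """Hypothetical stand-in for the web app's activity bookkeeping."""

    def __init__(self):
        # artist -> (activity_id, range_id, activity_type) of the open activity
        self._open = {}
        self._next_id = 1

    def start(self, artist, range_id, activity_type):
        # The invariant lives here, in application code, not in the database.
        if artist in self._open:
            raise ActivityInProgressError(
                f"{artist} already has activity {self._open[artist][0]} in progress")
        activity_id = self._next_id
        self._next_id += 1
        self._open[artist] = (activity_id, range_id, activity_type)
        return activity_id

    def end(self, artist):
        # Close the artist's open activity and return its id.
        return self._open.pop(artist)[0]
```

Any second client that wrote to the database directly would need its own copy of the check in `start`, which is the duplication the slide warns about.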
Enter REST
- REST: "Representational State Transfer"
- Coined and defined by Roy Fielding of the Apache Software Foundation
in his Ph.D. thesis
- Also championed by VanPyZ's very own Paul Prescod
- Core idea: Universal addressability of resources, using URIs, combined with
HTTP's operations—POST, GET, PUT, DELETE—is sufficient for
constructing programmatic APIs over the web
- Looser definition: Any interface that uses XML (or even plain text) over HTTP without
an additional messaging layer such as SOAP or XML-RPC
DTS DI REST API Requirements
- Complement rather than replace the existing web app; some
things (e.g., ad-hoc report generation) are fine done through
a web interface
- Authenticate artists and identify workstations when an artist logs in to Paint
- Provide known URLs for Paint to query what shots have been assigned
to the artist, and to record when an artist starts or stops a touch-up
activity on a shot
- Provide known URLs for workstation scripts to query when frames
have been assigned to a workstation and are ready for downloading
REST API Implementation
- Authentication:
- Call a known Zope URL that requires authentication
with basic HTTP authentication header
- Existing Zope LDAPUserFolder Product handles gory details of
authenticating against LDAP server
- Security: Basic HTTP authentication is little better than plain
text but if in-house staff are running packet sniffers, that's the
least of our problems...
- Workstation identification: Get IP address of calling workstation
from REMOTE_ADDR environment variable of HTTP request
(HTTP_X_FORWARDED_FOR variable if using Apache rewriting, as we are)
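The two mechanisms above can be sketched in a few lines of Python. This is an illustration, not the production code: the function names are hypothetical, and the sketch assumes the proxy lists the original client first in a comma-separated X-Forwarded-For value, which is the usual convention.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic 'Authorization' header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def workstation_ip(environ):
    """Return the caller's IP address from a CGI-style request environment.

    Behind a reverse proxy REMOTE_ADDR is the proxy itself, so prefer the
    forwarded address when the proxy supplies one.
    """
    forwarded = environ.get("HTTP_X_FORWARDED_FOR", "")
    if forwarded.strip():
        return forwarded.split(",")[0].strip()
    return environ.get("REMOTE_ADDR")
```

Note that the header value is only Base64-encoded, not encrypted, which is why Basic authentication is little better than plain text. The forwarded address is also only trustworthy if clients cannot reach the application server without going through the proxy.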
REST API Implementation (cont'd)
- Paint application calls Zope URLs using C++ HTTP object
and populates Qt widgets using returned XML
- In some cases simply reuse existing Zope Python scripts that are
targets of HTML forms (although might need to suppress redirects after
script completes)
- Reuse ZSQL methods created for web app, but return XML instead of HTML
- Return plain text where appropriate (e.g., id number of created activity)
- Use standard HTTP error codes (e.g., 403, "Forbidden") to indicate
problems with API call
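As a sketch of returning query results as XML instead of HTML, the following standard-library Python turns a list of row dictionaries into an XML document. The element and column names are hypothetical, since the slides do not show the actual format, and in Zope the rows would come from a ZSQL method rather than a literal list.

```python
import xml.etree.ElementTree as ET


def rows_to_xml(rows, root_tag="ranges", row_tag="range"):
    """Serialize a list of dict rows as XML, one child element per row."""
    root = ET.Element(root_tag)
    for row in rows:
        item = ET.SubElement(root, row_tag)
        for column, value in row.items():
            # Each column becomes a child element holding the value as text.
            ET.SubElement(item, column).text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(rows_to_xml([{"range_id": 101, "start_frame": 1, "end_frame": 240}]))
```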
Automated file download/upload
- Touch-up shift manager assigns shot to artist's workstation before shift begins
- Pending file download record is created in PostgreSQL database
- Script running periodically on touch-up workstations calls Zope URL
for XML list of pending file transfers for that workstation
- No need to configure each script locally as Zope knows which workstation
is calling from the IP address of the HTTP request
- Script downloads the files and records the download using the same
URL methods called by Paint for synchronous file transfers
- Similarly, script uploads files to FTP servers in background
so that artist can work on another shot
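A minimal sketch of one pass of such a workstation script follows, suitable for running from a scheduler. The URL, the XML layout (`transfer` elements with `direction`, `host`, and `path` attributes), and the function names are all assumptions made for illustration; the slides do not show the real format.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical URL; the slides do not name the real one.
PENDING_URL = "http://touchup/API/PendingTransfers.xml"


def fetch_pending_xml(url=PENDING_URL):
    """Fetch the XML listing; the server identifies the caller by IP address."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def parse_pending(xml_text):
    """Return a list of (direction, host, path) tuples from the XML listing."""
    root = ET.fromstring(xml_text)
    return [(t.get("direction"), t.get("host"), t.get("path"))
            for t in root.iter("transfer")]


def run_once(handle_transfer, fetch=fetch_pending_xml):
    """Process every pending transfer once and return how many were handled."""
    transfers = parse_pending(fetch())
    for direction, host, path in transfers:
        handle_transfer(direction, host, path)
    return len(transfers)
```

Because no workstation identifier appears anywhere in the script, the same file can be deployed unchanged to every workstation.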
Benefits of REST API
- Artists work in a native GUI application that is transparently making HTTP
calls on their behalf
- Easy, often trivial, reuse of existing code implemented for HTML interface
- Test client side of API using a web browser; modern browsers (e.g., Firefox, IE 6)
will even pretty-print XML and check it for well-formedness
- Can leverage extensive system integration capabilities of Zope products
(e.g., LDAP authentication, SQL database access) through simpler HTTP interface
- Centralized configuration: Set Paint features available to artists within Zope;
feature set is loaded by Paint after successful artist login
REST API Implementation Examples
- Retrieve listing of assigned shots:
GET http://touchup/API/AssignedRanges.xml
Note: Zope extracts identity of calling artist from HTTP authentication header.
- Result: XML listing of assigned shots
- Start an activity on an assigned shot:
POST http://touchup/API/StartRangeActivity
Parameters: range_id, activity_type, activity_start_frame, activity_end_frame
- Result: Numeric id of created activity as plain text
- End a running activity:
POST http://touchup/API/EndRangeActivity
Parameters: range_activity_id, start_frame, end_frame, range_state_id, comments
- Result: 200 OK/204 No Content
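From the client's side, a call such as StartRangeActivity is just a form-encoded POST followed by reading a short plain-text body. Paint does this in C++; the sketch below uses Python's standard library only to show the shape of the exchange. The helper names are hypothetical, and form encoding of the parameters is an assumption based on the scripts being targets of HTML forms.

```python
import urllib.parse
import urllib.request

API_BASE = "http://touchup/API"  # host name as shown in the examples above


def start_activity_request(range_id, activity_type, start_frame, end_frame):
    """Build (but do not send) the POST request for StartRangeActivity."""
    body = urllib.parse.urlencode({
        "range_id": range_id,
        "activity_type": activity_type,
        "activity_start_frame": start_frame,
        "activity_end_frame": end_frame,
    }).encode("ascii")
    return urllib.request.Request(
        API_BASE + "/StartRangeActivity", data=body, method="POST")


def parse_activity_id(response_body):
    """Convert the plain-text response body into an integer activity id."""
    return int(response_body.decode("ascii").strip())
```

Sending the request is then a matter of attaching an Authorization header and calling `urllib.request.urlopen`; an error status such as 403 surfaces in Python as `urllib.error.HTTPError`.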
REST API Implementation Examples (cont'd)
Zope Python script to return per-user features:
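# Look up the feature set assigned to the authenticated artist.
request = context.REQUEST
user = str(request.AUTHENTICATED_USER).lower()
try:
    features = container[user].data
except (KeyError, AttributeError):
    # No per-user entry: fall back to the default feature set.
    features = "TouchUp"
return container[features].data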
Disadvantages of HTTP
- Stateless request/response protocol (each call stands alone):
- Computationally expensive to use encryption for every request
- Client must poll server to retrieve list of pending events
- HTTP status codes do not always map cleanly onto application-level errors
Industrial REST API Providers
REST vs. SOAP
Yahoo!
Question: Does Yahoo! plan to support SOAP?
Answer: Not at this time. We may provide SOAP interfaces in the future, if
there is significant demand. We believe REST has a lower barrier to entry, is
easier to use than SOAP, and is entirely sufficient for these services.
Amazon
Of more than 50,000 users of its web services, fewer than 20% choose to use its SOAP interfaces.
The End
Thanks for listening!
Questions?