Archive for the ‘Scheme’ Category

Articles

Deploying a Scheme web application: the Taubatron

In functional programming,Scheme,Web development on April 23, 2012 by gcbenison Tagged: , , ,

This is the story of how I deployed the Taubatron – a web application written in Scheme.  I wrote about what it does earlier; this post is concerned with getting it running on the actual internet, as opposed to on my personal computer.

CGI vs server process

The most straightforward approach would have been to deploy the application as a CGI program.  I had deployed Scheme applications at my hosting provider that way before.  This is also the type of service offered, at the time of this writing, at the free Scheme web hosting site ellyps.net.  But for this application, performance with CGI was a problem -

Document Path:          /taubatron.cgi
Document Length:        248 bytes

Concurrency Level:      10
Time taken for tests:   23.234 seconds
Complete requests:      100
Failed requests:        0
Write errors:           0
Total transferred:      29000 bytes
HTML transferred:       24800 bytes
Requests per second:    4.30 [#/sec] (mean)
Time per request:       2323.402 [ms] (mean)
Time per request:       232.340 [ms] (mean, across all concurrent requests)
Transfer rate:          1.22 [Kbytes/sec] received

Percentage of the requests served within a certain time (ms)
  50%   2321
  66%   2331
  75%   2333
  80%   2338
  90%   2354
  95%   2365
  98%   2380
  99%   2401
 100%   2401 (longest request)

The slowness – up to several seconds to complete a request – had two causes: first, the expense of forking a new guile process for every request, and second, the application’s lengthy initialization phase (building a graph out of a dictionary of words).  Web developers in other languages have found ways to avoid these costs – for Python there is WSGI, and of course there is mod_perl – couldn’t I do as well in Scheme?  I considered mod_lisp and FastCGI but frankly these seemed difficult and perhaps not possible on my host.  The approach that seemed most promising was to run the application as a long-living server process using the built-in HTTP server found in recent versions of the guile Scheme compiler.  Getting such a server application running was about as easy as setting up the CGI program and the performance boost was remarkable:

Concurrency Level:      10
Time taken for tests:   2.488 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      385000 bytes
Total POSTed:           170000
HTML transferred:       306000 bytes
Requests per second:    401.97 [#/sec] (mean)
Time per request:       24.878 [ms] (mean)
Time per request:       2.488 [ms] (mean, across all concurrent requests)
Transfer rate:          151.13 [Kbytes/sec] received
                        66.73 kb/s sent
                        217.86 kb/s total

That’s right – performing the initialization steps just once resulted in a 100-fold performance increase.  Serving requests directly from the built-in HTTP server like this probably represents a lower bound on the latency of this application.  But to use this approach at all, first I would have to get a recent-enough version of guile running on my host, which turned out to be non-trivial.

Installing guile on the host

Of course guile did not come pre-installed on my host the way PHP did and I had no root access so I could not simply install it like I would on a personal machine; however I did have user-level shell access and the necessary tools to compile guile from source.  This is how I had gotten an older version of guile running to host some simpler applications.  But the high-performance web server is found only in a newer guile version which failed to compile from source due to resource limits imposed by the hosting service.  I tried simply uploading a build of guile from a Debian package appropriate for the hosts’ architecture; this failed at run time with an error about a glibc version mismatch.  However, I noticed that the early parts of the build process that involved compiling and linking C code were working fine; the build wouldn’t fail until later when guile was trying to compile parts of itself into bytecode (files with extension ‘.go’ in the build tree).  Figuring that these bytecode files might be architecture-dependent but should not depend on any specific glibc version, I tried simply copying to the build tree from the Debian package those bytecode files which were failing to build.    And it worked – I had a working guile-2.0 compiler installed on my host.

Configuring the front-end server

But of course I wasn’t finished – it’s not as though I could just bind the guile server to port 80 on the shared host and be done with it.  I needed a way to integrate it with the front-end server, Apache in the case of this host.  One way is to bind the guile process to some high-numbered port and use Apache’s RewriteRule to map requests to that port.  But in a shared hosting environment I couldn’t count on just being able to grab an available port.  I had a good discussion about this at Barcamp Portland and concluded that the best approach was to bind the guile process to a local unix socket, and then configure the front-end web server to forward requests to that socket.  Binding the guile-based HTTP server to a unix socket was no problem, but trying to figure out how to get Apache to cooperate in this seemingly simple task was frustrating.  I eventually tried asking the Internet, but apparently it either did not know or did not care.  In contrast, it is easy to find examples of this in nginx.  I soon had my application serving requests through a unix socket behind nginx with a latency of less than 3 msec per request – nearly as fast as the bare guile HTTP server.  (It entertains me that this benchmark, my first use of nginx for any purpose, was achieved on a laptop on Portland’s MAX blue line, heading East.)

The CGI trampoline

Still, I was not finished, because  I didn’t have the option of using nginx on my host – I had to figure out a way to make their Apache installation work for me.  I gave up on finding an Apache configuration directive to do this and realized that there was a alternative that was also likely to be portable to just about any host, no matter which web server it was running or how it was configured -  I could write a lightweight CGI process that would simply open up a connection to the socket, forward the request, and echo back the response.  I called this a “CGI trampoline”, implemented it, and after the fact found at least one other implementation of the same idea using the same name.  My first implementation was in Scheme, and I had my web application serving requests through a unix socket behind Apache with a latency of 39 msec – ten times slower than the bare guile HTTP server, but still ten times better than the whole application as a CGI process.  The performance hit was due of course to the cost of starting a new guile process for each request.  I rewrote the CGI trampoline in C and request latency dropped to 3.6 msec – pretty good compared to the lower bound of 2.4 msec achieved by serving requests directly from the application running as an HTTP server.

And that’s how the Taubatron was deployed – try it out here!

Articles

Guest commentary: Thien-Thi Nguyen on guile-www

In Scheme,Web development on March 11, 2012 by gcbenison Tagged: , , , ,

When I needed to develop web applications for the first time, naturally I turned to Scheme, a language which has been central to my evolution as a programmer and which remains my go-to language for many things.  I am still an enthusiast for web development in Scheme, and I think it could become much more popular than it is, particularly if it were easier to find hosting and support.

All the libraries you need for Scheme web development already exist. One of the earliest was guile-www, which provides support for HTTP, cookies, CGI, and related things.  For this post I turned to the current maintainer of guile-www to ask what motivated him to become involved in Scheme web development.  It appears that, perhaps because Scheme remains something of a fringe language on the web, the ecosystem around it continues to evolve rapidly.  Projects fork, merge, and fork again; there isn’t any one “Scheme stack” that dominates.  Nevertheless, the basic reasons for using Scheme for web development – it’s simple, it’s concise, it’s elegant – remain the same.

What drew you to get involved in maintaining guile-www?

ttn: Back in 1998 I made various motorcycle trips, and published a “trip log” with photos to a website.  This used some custom Emacs Lisp to push the html and jpeg files to the server (you might find wup.el — web update — somewhere on the net).  I had just discovered Guile and was intrigued by its module system, certainly a step up from Emacs’ crowded flatland.

I stumbled onto Guile-WWW, which IIRC at that time was client-side and HTTP/1.0 only, and started fiddling with it.  Around 2000 or 2001, I somehow also finangled write privs to Guile proper (the umbrella project that also included the CVS sub-repo of Guile-WWW) and started landing patches there.  Unfortunately, due to a misunderstanding with the former Guile maintainer (Marius Vollmer), those privs got cancelled.

Since the code is GPL, i started maintaining the package privately, naming myself as the copyright holder for new code and making releases every so often.  The hope was (and continues to be, though ever more faintly, seeing how things have developed since) that after some time, my changes could be merged back to GNU Guile.

What advantages do you see Scheme having over other languages for web development?

ttn: Less typing and more fun: My Emacs inserts () instead of [] automatically, edits and motions are by sexps, Emacs boldfaces the current list, etc.

Do you have a favorite application that uses guile-www?

ttn: That would be Sizzweb , i suppose.  I use it to serve Debian ISOs and other stuff on the home network.  Client-side, not so much happening…

The development of guile has picked up in pace in the past few  years; do you see its niche growing and that of guile-www with it?

ttn: Yes.  I believe bundling (web …) modules in Guile proper was a strategic mistake that renders Guile-WWW independence even more important.  So “with it” might be better read as “parallel to it”.

Articles

You’re only using PHP because you don’t have Guile and (sxml simple)

In functional programming,Scheme,Web development on February 19, 2012 by gcbenison Tagged: , , ,

OK, maybe you would still use PHP even if you did have a Scheme implementation and the (sxml simple) module.  But it is true that just about any web application written in PHP could also be written in Scheme or another Lisp dialect.   And yes you can get it hosted, more easily than you might think; I’ve done it (see this example, described at the end) with just the basic services at my hosting provider.  There are even complete Lisp web frameworks available.

So web developers do use Scheme, even if it is far from the most popular web language.  But web development in Lisp is not just possible: it is also a good idea, especially as your project evolves past “a page with some dynamic content” to “a full application that runs over the web”.  HTML can be understood in two ways:

  1. As plain text with a few formatting tags thrown in
  2. As a tree of nodes, where nodes have names and optionally, attributes and values

These two views lead to two different approaches to dynamic web programming:

  1. Treat the web page as a plain text document, with little embedded snippets of code that expand like macros when the page is displayed
  2. Represent the entire web page as a tree using a native data structure, which is then rendered as HTML at page display time

Both HTML and many web programming languages were designed with view (1) in mind because it can be very simple.  You don’t need to be a programmer, or to understand what the phrase “tree of nodes” means, in order to understand what this is supposed to do:

Today is: <?php echo date('Y-m-d'); ?> and it is a <i>good</i> day

But the problem with this approach is that it doesn’t scale well to more complicated uses.   Take, for instance, this production PHP snippet from the WordPress code base (pasted here as an image because of the difficulty of embedding actual literal PHP here):

That, despite being an example of well written PHP, is not terribly intuitive even to a programmer.   I really dislike the  intermingling of two distinct syntaxes (that of HTML and PHP).   It forces you to switch back and forth between the two languages when mentally parsing such code.  It leads to odd orderings of delimiters -  <?php { ?> etc.   It’s what leads to PHP being called things like “bloated, messy, and counter-intuitive“.

Such contortions do not appear in web applications that treat the whole page as a native data structure which the system then renders as HTML.   Since all of the actual HTML is generated by the renderer, the application developer need not even know any HTML syntax in order to program this way!  Scheme is well-suited to representing such a tree of nodes as a native data structure, and there is a standard convention for doing so known as SXML.  One of the things I like best about Scheme for web development is the clear equivalence of a live Scheme object, its serialized representation as SXML, and its serialized representation as HTML – all of which can be interconverted with a single command.

Interconversion of HTML and its equivalent Scheme representation

This means we can take the result of any Scheme computation, manipulate it into SXML, and render it as HTML.  A working implementation of the following example can be found here.  Consider a function top-word-usage which splits a string into words and returns the number of times each word appears, sorted with the most common words first:


> (top-word-usage "Hello, world.  Yes- hello, world, hello!" 5)
$1 = (("hello" . 3) ("world" . 2) ("yes" . 1))

Now define a function to format each entry as a string, and another function to wrap any content as an SXML “list element”:

(define (format-usage usage)
(format #f "~a -- ~a occurrences" (car usage)(cdr usage)))

(define (as-li content)
`(li ,content))

> (map as-li (map format-usage $1))

$2 = ((li "hello -- 3 occurrences")
      (li "world -- 2 occurrences")
      (li "yes -- 1 occurrences"))

Around this core functionality, we can use SXML to build a template for a complete web application:

(display "Content-type: text/html\n\n")
(sxml->xml
 `(html
   (head (title "wc demo"))
   (body
    (h1 (a (@ (href ,(car (command-line)))) "Word count demo"))
    ,(if input
	 `(p "Top words of at least " ,min-word-length " characters:"
	     (ul
	      ,(map as-li
		    (map format-usage (top-word-usage input 5)))))
	    '(p "Enter some text below, to count word usage:")))
    (form (@ (method "post")
	     (name "input-form")
	     (action ,(car (command-line))))
	  (textarea (@ (name "input-content")
		       (rows 20)
		       (cols 60))
		    ,(or input ""))
	  (p "Minimum word length: "
	     (input (@ (type "text")
		       (name "min-length"))))
	  (input (@ (type "submit")
		    (name "input-submit")
		    (value "Determine word usage")))))))

The full application is less than 100 lines long and can be found here. Try pasting in a favorite piece of writing and seeing what happens – it’s actually kind of fun!

Articles

API discovery with Scheme

In APIhackday,Scheme,Web development on August 2, 2011 by gcbenison

By now JSON is a widely-accepted standard, well-supported in the most common web development languages. But the whole point of an open standard is that one ought to be able work with it in any language and on any platform. Attempting to work with a standard on an unusual platform is a good way to test how “open” it really is, and standards have good reason to pay attention to the results: not only do fringe players sometimes become more important (see, e.g. Linux kernel), they also help keep standards honest by preventing language-dependent or platform-dependent “conveniences” from sneaking in.

So when my 13-year-old brother-in-law Lucien and I arrived at APIhackday, we decided to work with web API’s in two of our favorite languages – lua and Scheme – even though these are not often thought of as web programming languages. We chose to work with PDXCouncilConnect, which provides access to civic records of the City of Portland through a web API. A typical JSON result stream looks like this:

{
    "sessions": [
        {
            "header": "", 
            "id": 304, 
            "location": "City Hall - 1221 SW Fourth Avenue", 
            "message": "Due to lack of an agenda there will be no meeting.", 
            "start_date": "2011-06-01 14:00:00.0", 
            "title": ""
        }, 
        {
            "header": "THOSE PRESENT WERE:  Commissioner Saltzman, Presiding; Commissioners Fish \nand Fritz, 3. ", 
            "id": 302, 
            "items": [
                {
                    "bureaus": "", 
                    "disposition": "", 
                    "disposition_url": "", 
                    "emergency": 0, 
                    "files": "", 
                    "heading": "Times Certain", 
                    "id": 921, 
                    "item_number": "", 
                    "item_number_string": "", 
                    "motions": [
                        {
                            "status": "passed", 
                            "title": "Accepted", 
                            "type": "motion", 
                            "votes": [
                                {
                                    "owner": "City Auditor LaVonne Griffin-Valade", 
                                    "vote": "-"
                                }, 
etc...

A simple JSON parser can be written quickly, but it is also possible to find at least a rudimentary one in almost any language. Lucien found a lua JSON parser, and using the CouncilConnect stream, actually found a bug, wrote a workaround, and by the end of the session was parsing the JSON stream. Though there seems to be no dominant JSON parser for Scheme, I quickly found a simple one. Now that I had the JSON stream as a Scheme object, what to do with it? My first idea was to pull in the guile-gnome library and use it to build a graphical desktop app (using GTK+ widgets) that draws its data from the web stream. By the end of the one-day hackathon it was working, demonstrating the utility of Scheme in pulling together disparate libraries quickly. Though interesting, this result didn't seem to harness the unique advantages of the Scheme language. After all, JSON comprises nested lists of values, and manipulating lists is what Scheme is for. Couldn't we use Scheme to transform the data in some useful way? Here is the CouncilConnect stream, represented as a tree:

Tree representation of JSON input

JSON input, represented as a tree

As I watched people around the room working with this API for the first time, I noticed that I and others were spending much of our time studying the JSON stream to understand this underlying tree structure. Though we often knew immediately which element we were interested in (for example, the "votes" node, highlighted), we first had to learn how to navigate the tree in order to get to it. What if there was a way to bring this part of the API, buried deep in the tree, to the forefront without having to understand the whole API? We can re-root a tree at any node by changing the direction of some of the links (visually, grab the new root and hold it up; let the rest of the tree dangle down):

.Re-rooted JSON tree

JSON tree, re-rooted at "votes" element. Edges that have been flipped are highlighted

Now, the most interesting piece of information is easily accessible, right at the top of the tree. It is generally easier - both conceptually and in code - to navigate down a tree than it is to backtrack up it: in the rotated tree, I can still see that the "votes" object is part of a "motions" object which is part of an "items" object which is part of a "session", and I can still navigate those relationships, but I don't need to understand them just to get at the "votes" data.

It is possible to write a completely generic Scheme function that takes as arguments any JSON tree and any node-matching predicate, and returns a tree re-rooted at the node matching the predicate (or a list of such trees where more than one node matches). This is the sort of thing Scheme excels at; the heart of the code looks like this:

;; Return an array node,
;; whose members are trees T' which are T re-rooted
;; at nodes N matching (pred N)
(define (reroot T pred)
  (node 'array
	#f
	#f
	(let loop ((T T)
		   (path '()))
	  (if (pred T)
	      (list (rebuild T path 1))
	      (apply append
		     (map-partitions
		      (lambda (child other-children)
			(loop child
			      (cons
			       (node:replace-children T other-children)
			       path)))
		      (node:children T)))))))

;; Rebuild tree rooted at T, but appending links following 'path'
(define (rebuild T path level)
  (if (null? path)
      T
      (node 'object
	    (node:key T)
	    #f
	    (list
	     (node:replace-missing-key T "new_root")
	     (rebuild (node:replace-key
		       (car path)
		       (format #f "xpath-~a-~a"
			       level
			       (or (node:key (car path)) "NA")))
		      (cdr path)
		      (+ level 1))))))

Now, having the re-rooted tree, what should we do with it? How about de-parsing it straight back into JSON? The Scheme program is now acting as a sort of middleware or API virtualization layer - the re-factored JSON stream can be fed to any other JSON-capable tool (in any language). Here is an excerpt of the CouncilConnect JSON stream with "votes" rotated to the top:

[ { "votes": [
            { "owner": "City Auditor LaVonne Griffin-Valade", 
              "vote": "-"  }, 
            {   "owner": "Commissioner Amanda Fritz", 
                "vote": "Yea" }, 
            {   "owner": "Commissioner Nick Fish", 
                "vote": "Yea" }, 
            { "owner": "Mayor Sam Adams", 
                "vote": "-" }
        ], 
        "xpath-1-NA": {
            "xpath-1-NA": {
                "status": "passed", 
                "title": "Accepted", 
                "type": "motion"
            }, 
            "xpath-2-motions": {
                "xpath-2-motions": [], 
                "xpath-3-NA": {
                    "xpath-3-NA": {
                        "bureaus": "", 
                        "emergency": 0, 
                        "heading": "Times Certain", 
                        "id": 921, 
                        "owners": "", 
                        "title": "Portland City Council Agenda", 
                        "topic": "TIME CERTAIN: 9:30 AM - Combined Sewer Overflow Program  (Report introduced by Commissioner Saltzman)  30 minutes requested"
                    }, 

Now, I can easily access the contents of the "vote" object at the head of the stream, and if I care to, I can drill down further to see that the vote was passed and that it was related to a Sewer program.

For another example, let's hoist the node called "bureaus" to the top and see what happens. In the result below, we find at the head of the JSON stream a piece of business related to the transportation bureau; navigating down a level, we see the result of a vote; down one more level, and we see that this was about disabled parking permits.

[
    { "bureaus": [  "Portland Bureau of Transportation" ], 
        "xpath-1-NA": {
            "xpath-1-NA": {
                "emergency": 0, 
                "files": "", 
                "heading": "Regular Agenda", 
                "id": 922, 
                "item_number": "", 
                "item_number_string": "", 
                "motions": [
                    {
                        "status": "", 
                        "title": "", 
                        "type": "item", 
                        "votes": [
                            {   "owner": "Commissioner Nick Fish", 
                                "vote": "Yea"
                            }, 
                            ...
                        ]
                    }
                ], 
                "owners": [ "Mayor Sam Adams" ], 
                "title": "Portland City Council Agenda", 
                "topic": "Extend the date of the privileges for regular disabled person parking permits (Second Reading Agenda 520; amend Code Section 16.20.640)"
...

So, there it is: an application of Scheme code to web API programming, which leverages the strengths of the language, and which might even be useful. Here is the full code for the JSON-refactoring program:

(Updated 8/7/2011: this code is now also available on github)

;; Greg Benison, 1 August 2011
;; transform JSON input stream (stdin) into a tree;
;; re-root the tree and re-output as JSON (stdout).

(use-modules (json reader)
	     (json writer)
	     (srfi srfi-1)
	     (ice-9 pretty-print))

;; Change this as desired...
(define node-of-interest "votes")

; ---------------- utility functions -----------

(define (->string x)(format #f "~a" x))

(define (compose . ops)
  (if (null? ops)
      (lambda (x) x)
      (lambda (x)
	((car ops)
	 ((apply compose (cdr ops)) x)))))

;; like 'map', but supplies a second
;; argument to 'op' consisting of
;; "the other elements of set" in their
;; original order.
(define (map-partitions op set)
  (let loop ((first-part '())
	     (result '())
	     (rest set))
    (if (null? rest)
	(reverse result)
	(let ((this-element   (car rest))
	      (other-elements (append (reverse first-part)(cdr rest))))
	  (loop (cons this-element first-part)
		(cons (op this-element other-elements) result)
		(cdr rest))))))

(define (json:dump-to-file json fname)
  (let ((outfile (open-output-file fname)))
    (display (json:dump json) outfile)
    (close outfile)))

(define (hash-keys ht)(hash-fold (lambda (k v p)(cons k p)) '() ht))

; ---------------- JSON as tree ----------
(define (node type key value children)
  (cons (cons type key)
	(cons value children)))

(define node:type     caar)
(define node:key      cdar)
(define node:value    cadr)
(define node:children cddr)

(define (node:replace-children N new-children)
  (node (node:type  N)
	(node:key   N)
	(node:value N)
	new-children))

(define (node:replace-key N new-key)
  (node (node:type N)
	new-key
	(node:value N)
	(node:children N)))

(define (node:replace-missing-key N new-key)
  (if (node:key N) N (node:replace-key N new-key)))

(define (node:key? key)
  (lambda (N)(equal? key (node:key N))))

(define (node:take N limit)
  (node:replace-children N (take (node:children N) limit)))

(define (json->tree json key)
  (cond ((hash-table? json)
	 (node 'object
	       key
	       #f
	       (hash-fold
		(lambda (k v p)
		  (cons (json->tree v k) p))
		'()
		json)))
	((list? json)
	 (node 'array
	       key
	       #f
	       (map (lambda (v)(json->tree v #f)) json)))
	(else (node 'scalar key json '()))))

(define (tree->json tree)
  (case (node:type tree)
    ((scalar)(node:value tree))
    ((array)(map tree->json (node:children tree)))
    ((object)
     (let ((table (make-hash-table)))
       (for-each
	(lambda (node)
	  (hash-set! table (node:key node) (tree->json node)))
	(node:children tree))
       table))))

;; Rebuild tree rooted at T, but appending links following 'path'
(define (rebuild T path level)
  (if (null? path)
      T
      (node 'object
	    (node:key T)
	    #f
	    (list
	     (node:replace-missing-key T "new_root")
	     (rebuild (node:replace-key
		       (car path)
		       (format #f "xpath-~a-~a"
			       level
			       (or (node:key (car path)) "NA")))
		      (cdr path)
		      (+ level 1))))))


;; Return an array node,
;; whose members are trees T' which are T re-rooted
;; at nodes N matching (pred N)
(define (reroot T pred)
  (node 'array
	#f
	#f
	(let loop ((T T)
		   (path '()))
	  (if (pred T)
	      (list (rebuild T path 1))
	      (apply append
		     (map-partitions
		      (lambda (child other-children)
			(loop child
			      (cons
			       (node:replace-children T other-children)
			       path)))
		      (node:children T)))))))

; ---------------- input parsing ---------------

(define infile
  (cond ((null? (cdr (command-line)))       (current-input-port))
	((equal? "-" (cadr (command-line))) (current-input-port))
	(else (open-input-file (cadr (command-line))))))

(define input (json:read-value infile))
(define input-as-tree (json->tree input #f))

; ---------------- transform & output ----------

(display
 (json:dump
  (tree->json
   (reroot input-as-tree (node:key? node-of-interest)))))
Follow

Get every new post delivered to your Inbox.