Clojure Magic: proxy and proxy-super

Clojure is fun, and Clojure macros are even more fun to use and really useful for developing domain-specific languages. You can do amazing things with macros once you get used to them. This post describes how Clojure's proxy and proxy-super can be combined with macros to implement DSLs that need to interact with both Java and Clojure.

Recently I was developing a DSL for Apache Storm as an experiment and had to bridge Storm’s Java API and Clojure. The Storm project already ships a Clojure DSL, but I wanted to implement a DSL similar to StreamIt, a popular stream processing language. I mainly followed the existing Storm Clojure DSL implementation and built my DSL on the techniques it uses.

The existing Storm Clojure DSL also bridges Storm’s Java API and Clojure. It uses a combination of Java interfaces and Clojure's reify to let users define Storm bolts and spouts in Clojure. It basically follows the same pattern as the Java API. Below is an example of a Storm bolt written in Java, followed by one written in Clojure.

public class RollingCountBolt extends BaseRichBolt {

...

  @Override
  public void execute(Tuple tuple) {
    if (TupleHelpers.isTickTuple(tuple)) {
      LOG.debug("Received tick tuple, triggering emit of current window counts");
      emitCurrentWindowCounts();
    } else {
      countObjAndAck(tuple);
    }
  }

...

}

I have only shown the execute method above because it is the main method, the one Storm calls whenever a tuple is delivered to the bolt.

(defbolt word-count ["word" "count"] {:prepare true}
  [conf context collector]
  (let [counts (atom {})]
    (bolt
     (execute [tuple]
       (let [word (.getString tuple 0)]
         (swap! counts (partial merge-with +) {word 1})
         (emit-bolt! collector [word (@counts word)] :anchor tuple)
         (ack! collector tuple))))))

The bolt form in the snippet above is a Clojure macro, and the code inside it is reified into an instance of type IBolt.

public interface IBolt extends Serializable {
    void prepare(Map stormConf, TopologyContext context, OutputCollector collector);

    void execute(Tuple input);

    void cleanup();
}

As the Clojure code above shows, the OutputCollector instance is in scope inside execute thanks to the macro magic, so the reified execute method can use it directly.
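The closure-capturing behaviour of reify can be tried without Storm. The following is only a rough sketch, not Storm's actual macro expansion: java.util.function.Consumer stands in for IBolt, its accept method stands in for execute, and the collector is a plain atom rather than an OutputCollector. The make-counter function and the emitted atom are names invented for this example.

```clojure
;; Hypothetical stand-ins: Consumer plays the role of IBolt, and
;; `collector` is a plain atom instead of a Storm OutputCollector.
(defn make-counter
  [collector]
  (let [counts (atom {})]
    ;; The reified object closes over both `collector` and `counts`.
    (reify java.util.function.Consumer
      (accept [_ word]
        ;; Count the word, then "emit" the pair [word current-count].
        (swap! counts update word (fnil inc 0))
        (swap! collector conj [word (@counts word)])))))

(def emitted (atom []))
(def counter (make-counter emitted))

(.accept counter "a")
(.accept counter "b")
(.accept counter "a")

(println @emitted)
```

Every call to accept sees the same collector and counts, just as the reified execute method sees the collector passed to the bolt.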

In the new DSL I am developing, I wanted to get rid of explicit calls to output collectors and similar objects by introducing push, pop, and peek functions for interacting with the stream of data. For this purpose I used an abstract class instead of an interface, and used Clojure's proxy and proxy-super macros to build instances of that abstract class. This lets me implement push, pop, and peek internally in any way I want.

In the proxy-based implementation, push, pop, and peek are Clojure macros that internally use proxy-super to call the relevant methods of the superclass, as below.

(proxy-super push tuple)
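To make the proxy and proxy-super combination concrete, here is a minimal sketch that runs without Storm. It is not the StormIt implementation: java.util.Stack stands in for the abstract base class (it happens to have a public push method), Runnable's run method stands in for work, and the push macro is a hypothetical helper written only for this example.

```clojure
;; Hypothetical helper: expands to a proxy-super call. The method name is
;; written as ~'push so that syntax-quote does not namespace-qualify it.
(defmacro push
  [value]
  `(proxy-super ~'push ~value))

;; java.util.Stack is a stand-in for the DSL's abstract base class, and
;; Runnable/run is a stand-in for the `work` method.
(def source
  (proxy [java.util.Stack Runnable] []
    (run []
      (let [i (rand-int 1000)]
        ;; Expands to (proxy-super push i), which calls Stack's own push.
        (push i)))))

(.run source)
(println (.peek source))
```

Because proxy-super refers to the this that proxy binds, a macro like push only works when it expands lexically inside a proxy method body. As far as I know, proxy-super also swaps the proxy's method map temporarily while the superclass call runs, so it should not be relied on when several threads call the same proxy instance.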

A stream processor written in this new DSL looks like the following.

(stormit/filter int-source [] [[] -> ["int"]]
         (init [max 1000])
         (work {:push 1}
               (let [i (rand-int max)]
                 (Thread/sleep 100)
                 (stormit/push [i]))))

I believe this way of bridging Clojure and Java can come in handy in lots of situations. If you are interested in my work, you can follow it at https://github.com/milinda/StormIt.

 