Double Negative

Software, code and things.

PhantomJS, Mocha, and Chai for functional testing

I have been playing around with a number of open source projects pertaining to testing different aspects of a web-based application. Over the past few days I have been playing with PhantomJS, Mocha, and Chai.

What is PhantomJS?

PhantomJS is a full headless web browser based on WebKit. That means it uses the same browser engine as Safari, and the engine that Chrome was originally built on.

ZombieJS was the other option I considered. The difference is that Zombie works with JSDOM, a JavaScript implementation of the DOM.

I opted to use PhantomJS because Zombie is not a particularly stable product (in my opinion); having tested both the 1.4.1 version and the 2.0.0 alpha, I encountered a number of issues. The biggest problem for me was that it did not work well with my complex shimming of externally loaded JavaScript files, nor did it play nicely with ReactJS.

The other obvious consideration is that I believe tests should be run in as real an environment as possible.

PhantomJS is very good - it is easy to install and set up, though the documentation is a little sparse.

What is Mocha?

Mocha is a JavaScript test framework that can be used with Node or in the browser. I want to use it in the browser, my browser being a PhantomJS instance.

Mocha allows you to hook in at various points to make sure, for example, that the necessary setup is complete before running your tests. It also has a really nice way of dealing with code that executes asynchronously. It is actively developed and has a good community around it.

before(function(done) {  
    //run asynchronous setup

    //tell mocha when you are done
    done();
});

//test code

What is Chai?

Chai is an assertion library. It essentially provides methods that can be used to assert that what you get is what you expect. Mocha and Chai work extremely well together.

I want to use Chai because it is extremely readable (BDD constructs) and very well documented.

expect(resultCount).to.be.above(0);  
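
Putting the two together, a spec reads almost like English. Here is a minimal sketch - the suite, the .result selector, and the asynchronous setup are all hypothetical placeholders:

var expect = chai.expect;

describe("search results", function() {

    before(function(done) {
        //asynchronous setup goes here
        done();
    });

    it("returns at least one result", function() {
        //resultCount is hypothetical - derive it from your page however you like
        var resultCount = document.querySelectorAll(".result").length;
        expect(resultCount).to.be.above(0);
    });

});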

Combining the three

This is where my adventure got a little tougher.

I want to essentially load a webpage (the page under test), execute a number of commands, and then assert that they did what I expected.

It is simple enough to create a PhantomJS browser instance and load a page, but how does one then load both Mocha and Chai and manipulate the page in a testable way?

Whereas with Node you can simply require() dependencies, because we are driving PhantomJS from the command line we cannot.

There is a PhantomJS runner available called mocha-phantomjs, however I found it somewhat constraining. You point it at a file containing the code you want to test and the libraries you want to test it with, which are then run. I can see this being useful for unit testing, but I want to test an already built page without needing to adapt it for testing. It essentially takes control of the browser (PhantomJS) piece of the puzzle, which in my case is unsuitable.

My approach

PhantomJS has a webpage module with an injectJs() method. I chose to use this to inject my test code (and all its requirements) into the page under test. This means I can use jQuery (which is already loaded on my page) to manipulate the DOM and access the elements, properties, and values that I want to test.
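
As an illustration, an injected test file might look something like the following - the file name, selectors, and behaviour are hypothetical; the point is that the tests run against the real page, driving it with the jQuery that is already loaded:

//mocha/example-test.js (hypothetical) - injected into the page under test
var expect = chai.expect;

describe("product listing", function() {

    it("renders at least one product", function() {
        expect($(".product").length).to.be.above(0);
    });

    it("filters the list when a category is clicked", function() {
        //drive the page using the jQuery already present on it
        $(".category-filter:first").click();
        expect($(".product:visible").length).to.be.above(0);
    });

});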

PhantomJS also provides a method on the client side, callPhantom(). This allows the page to call back to the PhantomJS instance, which triggers the callback you set up on page.onCallback().
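
Stripped right down, the pairing looks like this:

//in the page (client side)
if (window.callPhantom) {
    window.callPhantom({ message: "hello from the page" });
}

//in the controlling PhantomJS script
page.onCallback = function(data) {
    console.log("Received: " + data.message);
};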

As such my approach is to:

  • Run a PhantomJS browser instance and load the page I want to test
  • Inject my tests
  • Run my tests using Mocha and Chai
  • Pass the formatted response back to PhantomJS
  • Output the results on the command line.

Execution

Given the above, my execution is as follows:

var page = require("webpage").create();  
var args = require('system').args;

//pass in the name of the file that contains your tests
var testFile = args[1];  
//pass in the url you are testing
var pageAddress = args[2];

if (typeof testFile === 'undefined') {  
    console.error("Did not specify a test file");
    phantom.exit();
}

page.open(pageAddress, function(status) {  
    if (status !== 'success') {
        console.error("Failed to open", page.frameUrl);
        phantom.exit();
    }

    //Inject mocha and chai
    page.injectJs("../node_modules/mocha/mocha.js");
    page.injectJs("../node_modules/chai/chai.js");

    //inject your test reporter
    page.injectJs("mocha/reporter.js");

    //inject your tests
    page.injectJs("mocha/" + testFile);

    page.evaluate(function() {
        window.mocha.run();
    });
});

page.onCallback = function(data) {  
    data.message && console.log(data.message);
    data.exit && phantom.exit();
};

page.onConsoleMessage = function(msg, lineNum, sourceId) {  
  console.log('CONSOLE: ' + msg + ' (from line #' + lineNum + ' in "' + sourceId + '")');
};

The only bit of the above code that I have yet to explain is the reporter. Mocha provides a number of reporters for formatting your test results, but because of the nature of this setup you cannot simply use Mocha's built-in reporters - you have to build your own. This is one benefit of mocha-phantomjs (see above): its author has successfully ported the reporters over for you to use.

My basic implementation of a reporter is as follows:

(function() {

    var color = Mocha.reporters.Base.color;

    function log() {

        var args = Array.apply(null, arguments);

        if (window.callPhantom) {
            window.callPhantom({ message: args.join(" ") });
        } else {
            console.log( args.join(" ") );
        }

    }

    var Reporter = function(runner){

        Mocha.reporters.Base.call(this, runner);

        var out = [];
        var stats = { suites: 0, tests: 0, passes: 0, pending: 0, failures: 0 }

        runner.on('start', function() {
            stats.start = new Date;
            out.push([ "Testing",  window.location.href, "\n"]);
        });

        runner.on('suite', function(suite) {
            stats.suites++;
            out.push([suite.title, "\n"]);
        });

        runner.on('test', function(test) {
            stats.tests++;
        });

        runner.on("pass", function(test) {
            stats.passes++;
            if ('fast' == test.speed) {
                out.push([ color('checkmark', '  ✓ '), test.title, "\n" ]);
            } else {
                out.push([
                    color('checkmark', '  ✓ '),
                    test.title,
                    color(test.speed, test.duration + "ms"),
                    '\n'
                ]);
            }

        });

        runner.on('fail', function(test, err) {
            stats.failures++;
            out.push([ color('fail', '  × '), color('fail', test.title), ":\n    ", err ,"\n"]);
        });

        runner.on("end", function() {

            out.push(['ending']);

            stats.end = new Date;
            stats.duration = new Date - stats.start;

            out.push([stats.tests, "tests ran in", stats.duration, "ms"]);
            out.push([ color('checkmark', stats.passes), "passed and", color('fail', stats.failures), "failed"]);

            while (out.length) {
                log.apply(null, out.shift());
            }

            if (window.callPhantom) {
                window.callPhantom({ exit: true });
            }

        });

    };

    mocha.setup({
        ui: 'bdd',
        ignoreLeaks: true,
        reporter: Reporter
    });

}());

Issues

When I was playing with ZombieJS, my usage of React caused a number of issues. In my mind this was understandable - given how React works with its virtual DOM, I figured a JavaScript DOM implementation might have problems with it.

There was, however, an issue using React with PhantomJS. This is outlined in detail here - you just need to polyfill the bind method. It occurs because PhantomJS uses an old version of WebKit. PhantomJS 2.0 is coming at some point and will resolve the issue. That update (when it comes) may also change callPhantom() (discussed above), as the documentation notes that it is an experimental API.
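
For reference, a polyfill along the lines of the (simplified) one documented on MDN, loaded before React, is all that is needed:

//simplified Function.prototype.bind polyfill - does not handle use with the new operator
if (!Function.prototype.bind) {
    Function.prototype.bind = function(oThis) {
        var fToBind = this;
        var aArgs = Array.prototype.slice.call(arguments, 1);

        return function() {
            return fToBind.apply(
                oThis,
                aArgs.concat(Array.prototype.slice.call(arguments))
            );
        };
    };
}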

Soo..

Hopefully you find the above helpful. I'd be interested to hear people's thoughts on this approach, as well as any suggestions for improvements.

Facebook's Jest JavaScript unit testing framework

Over the past few days I have been investigating Facebook's Jest, a JavaScript unit testing framework. I have found a number of issues with it for my use case.

My main issue is what it is actually useful for. The documentation outlines how to test microscopically small sections of code. Admittedly, that is exactly what a unit test is, however I don't personally believe many people build JavaScript apps with such minute separation of concerns.

One of my projects uses React for a number of its client facing interfaces. Jest is used by Facebook to test their React components, so I thought I would try and do the same.

I can absolutely see how one could and would test a component. React is great in that you can make simple components and reuse them in multiple places. The example in the Jest documentation outlines how you could test a checkbox component - and that really is the extent to which Jest shines, in my opinion. I can test that when I change my <SelectButton /> component's value, my valueChanged callback is called and an AJAX request is sent off (sketched below).
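
A rough sketch of such a test - SelectButton, its valueChanged prop, and the file layout are all hypothetical, and the APIs are the Jest/React ones current at the time of writing:

/** @jsx React.DOM */
//__tests__/SelectButton-test.js (hypothetical)
jest.dontMock('../SelectButton');

describe('SelectButton', function() {

    it('fires valueChanged when its value changes', function() {
        var React = require('react/addons');
        var TestUtils = React.addons.TestUtils;
        var SelectButton = require('../SelectButton');

        var valueChanged = jest.genMockFunction();
        var component = TestUtils.renderIntoDocument(
            <SelectButton valueChanged={valueChanged} />
        );

        //simulate the user changing the value
        TestUtils.Simulate.change(component.getDOMNode());

        expect(valueChanged).toBeCalled();
    });

});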

Integration

A React component can be made up of a number of other React components. In fact, a React interface is a number of React components laid out together and interacting with one another. As such, having tested each component individually, I would like to test the interface as a whole, and the interactions between the components. This is more of an integration test than a unit test, and Jest is certainly not designed for this.

Issues

Jest is extremely new and is not particularly well documented at the moment, and within a React setup it only works for the most basic configurations. For example, I build my code using gulp and browserify, and I use browserify-shim to allow me to require() modules that I load from a CDN. Jest does not play well with this. You can work around it, but writing tests should not be hard or complex, and I expect a testing framework to remove the need for complex boilerplate.

Another problem I have encountered is the automocking functionality - in principle it is great, but it does not work across the board. In certain situations you have to implement your own mocks, and who wants to do that :P It is also not immediately clear what is mocked and when - although the documentation makes it seem clear, I created a manual mock and, within it, required some other modules which were not mocked. I then manually created a mock with jest.mock(), but it broke the unit under test.

Conclusion

If you are testing extremely small, simple units, Jest is great. It is too immature to stand out to me, but I will certainly keep an eye on it. If I get the time I would like to read into the internals some more - I feel that if you know exactly what it is doing, and how, you will get a lot more from it.

At the end of the day I use testing as a way of making myself confident that my code works as I expect it to and does not regress. I think Jest is a nice complement to a test suite, but it certainly wouldn't be my first port of call.

Consider Sphinx for your search needs

Intro

For most entities building a website, search is not really a consideration - in the sense that, for your search functionality, you intend to simply query your database backend for the results you need... like everyone does... right?

Certainly for most use cases, the power of modern database backends means that dedicated search software is not, and never will be, a requirement.

Everyone builds software intending for it to become popular, but unless your application is going to need to query millions of rows using complex queries, considerations like this are not necessary. That said, if you are unsure of the potential of your product it may be worth considering now because, as with most things, it will be significantly more difficult to integrate into a legacy project down the line.

What is Sphinx?

Sphinx is an open source full text search server. It is extremely powerful, easy to set up, and has a well-documented, well-architected PHP API for you to use.

It is used by Craigslist as well as many many other entities, small and large.

Use case

There are many use cases for software such as Sphinx. The most apparent use case in my opinion is one that I have utilized Sphinx to resolve - querying large, complex data sets.

If for example you have a denormalized database architecture (for good reasons) and you need to produce search functionality that queries many tables for millions of rows, Sphinx may well be a suitable answer. You have denormalized your database with good reason, and the only respect in which your architecture is lacking is in its ability to be searched. What can you do?

An extremely complex MySQL query, for example, might take seconds or even tens of seconds. If you want to provide a good user experience, you cannot keep your user waiting that long. Instead you can index all of this data on a Sphinx server (running independently of your web/database server(s)) and query it quickly using the provided API.

I implemented this such that 4 million rows could be queried in a negligible amount of time, where negligible = milliseconds.

Issues

The most apparent issue with indexing your data and searching it is that the indexed data quickly becomes outdated. Fortunately Sphinx can execute delta indexes, which only index changed data. You can, for example, run a full initial index and then run delta indexes every 15 minutes. You can alter this based on your requirements - if your data changes regularly, you may want to run delta indexes more often.

Conclusion

The above is an abstract look at Sphinx search in relation to a personal implementation of it. To get down to the nitty gritty I suggest you take a look at the latest documentation. I highly recommend the product - it is well documented, well supported, and actively developed.

SEO, ReactJS, v8js, and PHP

ReactJS from the guys at Facebook is a really nice JavaScript UI framework. It powers the Facebook UI, which in my opinion looks good, is responsive, and is easy to use.

One thing that concerns me with JavaScript is Search Engine Optimization (SEO) - it is great to have a nice UI, but if no one is getting to your site it is somewhat pointless. This is definitely a consideration when using a framework like React to potentially build your whole interface. It is no longer a case of the Googlebot missing a few interactive elements of your site; it has the potential to result in a completely unindexable website.

Fortunately the guys at React/Facebook are well aware of this and built in a provision to help developers with this problem: renderComponentToString

The basic premise of using this method for SEO is that you build your initial interface on the server side and return it when an HTTP request hits your website. You then initiate the client-side rendering of the same React component into the containing element once the page has loaded. React will attach its event handlers and take control of the displayed content.
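
In JavaScript terms the two halves look roughly like this - CommentsApp, initialData, and the container id are hypothetical, and I am assuming a React version where renderComponentToString returns the markup synchronously:

//server side - executed by v8js, Node, or whichever engine you choose.
//CommentsApp and initialData are hypothetical; the returned markup is
//embedded in the initial HTML response.
var markup = React.renderComponentToString(CommentsApp({ data: initialData }));

//client side, once the page has loaded - render the same component into
//the container that already holds the server-generated markup. React
//attaches its event handlers rather than rebuilding the DOM.
React.renderComponent(
    CommentsApp({ data: initialData }),
    document.getElementById("comments-app")
);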

The other piece of the puzzle is of course finding a way of executing javascript on the server.

My first thought when I heard server side and JavaScript was Node. On further consideration, however, I decided to investigate PHP's binding to the V8 JavaScript engine, namely v8js.

My reasoning was that by rendering my JavaScript with PHP I would reduce the complexity of the setup, have (to a greater extent) a DRY codebase (by easily and efficiently reusing the same JavaScript for both server-side and client-side rendering), and be able to integrate more easily with my build scripts. In addition, my initial idea for a Node setup seemed slow and inefficient - a Node server executing the JavaScript, which one would hit with a cURL request.

Given that I knew I would have to write a wrapper class around whatever was rendering the JavaScript, it seemed more logical to build it around v8js.

I also asked around, and Ben Alpert - one of the guys working on React - pointed me in the direction of v8js.

Stoyan gives a good overview of how one can set up React with v8js, and this was my starting point. Of course, to deploy a production-ready site using such a setup there are certainly a lot of additional considerations. Furthermore, the v8js PHP extension is complex in nature and there is not much in the way of documentation or tutorials.

A few issues that I encountered were:

  • memory usage when rendering JavaScript server side
  • the lack of a 'real' DOM server side, meaning either working with a fake DOM (jsdom for example) or avoiding JavaScript that requires a DOM

I currently have one of my websites running in production with such a setup. The initial setup was complex, but working with it once you are set up is an absolute pleasure. My site is being indexed... so it works, and I have a beautiful client-facing React interface.

I went down the route of setting this up with PHP. It is, however, perfectly possible to implement a similarly premised setup with any codebase. Pete Hunt (another member of the React team) mentions here how Instagram uses a combination of Node and Python.

I would certainly be interested to hear how others have set up server-side JavaScript rendering for the purposes of SEO, and if anyone has any questions about my setup I would be happy to answer them.