Wednesday, June 21, 2017

5 ways to make an http request

Version 0.2.1 of the GPF-JS library delivers an HTTP request helper that can be used on all supported hosts. It was quite a challenge as it implied 5 different developments; here are the details.

The need for HTTP requests

In a world of interoperability, internet of things and microservices, the - almost 30-year-old - HTTP protocol defines a communication foundation that is widely known and implemented.

Originally designed for human-to-machine communication, this protocol also supports machine-to-machine communication through standardized concepts and interfaces.

Evolution of HTTP requests in browsers

Web browsers were the first applications implementing this protocol to access the World Wide Web.

Netscape Communicator loading screen

Before AJAX was conceptualized, web pages had to be fully refreshed from the server to reflect any change. JavaScript was used for simple client manipulations. From a user experience point of view, it was OK (mostly because we had no other choices) but this limited the development of user interfaces.

Then AJAX introduced new ways to design web pages: only the new information could be fetched from the server without reloading the page. Therefore, the pages were faster, crisper and fully asynchronous.

However, each browser had its own implementation of AJAX requests (not mentioning DOM, event handling and other incompatibilities). And that's why jQuery, which was initially designed to offer a uniform API that would work identically on any browser, became so popular.

jQuery everywhere

Today, the situation has changed: almost all browsers are implementing the same APIs and, consequently, modern libraries are considering browsers to be one environment only.

GPF-JS

GPF-JS obviously supports browsers and it leverages AJAX requests to implement HTTP requests in this environment. But the library is also compatible with NodeJS as well as other - less common - command line hosts.

Designing a single API that is compatible with all these hosts means dealing with each host's specificities.

How to test HTTP requests

When you follow the TDD practice, you write tests before writing any line of production code. But in that case, the first challenge was to figure out how the whole HTTP layer could be tested. Mocking was not an option.

The project development environment heavily relies on the grunt connect task to deliver the dashboard: a place where the developer can access all the tools (source list, tests, documentation...).

dashboard

As a lazy developer, I just need one command line for my development (grunt). Then all the tools are available within the dashboard.

Some middleware is plugged in to add extra features such as:

  • cache: introduced with version 0.1.7, it is leveraged by the command line used to test browsers when Selenium is not available. It implements a data storage service similar to Redis.
  • fs: a file access service used to read, create and delete files within the project storage. For instance, it is used by the sources tile to check if a source has a corresponding test file.
  • grunt: a wrapper used to execute and format the log of grunt tasks.

Based on this experience, it became obvious that the project needed another extension: the echo service. It basically accepts any HTTP request and the response either reflects the request details or can be modified through URL parameters.
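To give an idea, a connect-style sketch of such an endpoint could look like this (a sketch only: the status URL parameter and the response format are illustrative, not the actual middleware):

// Sketch of an echo endpoint (connect-style middleware); names are illustrative
module.exports = function echo (request, response) {
    var query = require("url").parse(request.url, true).query,
        status = parseInt(query.status, 10) || 200, // override with ?status=500
        body = [];
    request.on("data", function (chunk) {
        body.push(chunk);
    }).on("end", function () {
        response.writeHead(status, {
            "Content-Type": "application/json",
            "Cache-Control": "no-cache, no-store, must-revalidate" // prevent client-side caching
        });
        // The response reflects the request details
        response.end(JSON.stringify({
            method: request.method,
            url: request.url,
            headers: request.headers,
            body: Buffer.concat(body).toString()
        }));
    });
};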

POSTMAN was used to test the tool that will be used to test the HTTP layer...

GET

GET 500

POST

One API to rule them all

Now that the HTTP layer can be tested, the API must be designed to write the tests.

Input

An HTTP request starts with some parameters:

  • The Uniform Resource Locator, which determines the web address you want to send the request to. There are several ways to specify this location: NodeJS offers a URL class which exposes its different parts (host, port...). However, the simplest representation remains the one everybody is used to: the string you can read in the browser location bar.
  • The request method (also known as verb) which specifies the kind of action you want to execute.
  • An optional list of header fields meant to configure the request processing (such as specifying the expected answer type...). The simplest way to provide this list is to use a key/value dictionary, meaning an object.
  • The request body, mostly used for POST and PUT actions, which contains the data to upload to the server. Even if the library supports the concept of streams, most of the expected use cases imply sending an envelope that is synchronously built (text, JSON, XML...). Also, JavaScript (in general) is not good at handling binary data, hence a simple string is expected as a request body.

This leads to the definition of the httpRequestSettings type.

Output

On completion, the server sends back a response composed of:

  • A status code that provides feedback about how the server processed the request. Typically, 200 means everything went well. On the contrary, 4xx messages signal an error and 500 is a critical server error.
  • A list of response headers. For instance, this is how cookies are transmitted by the server to the client (and, actually, they are also sent back by the client to the server through headers).
  • The response body: depending on what has been requested, it will contain the server answer. This response could be deserialized using a readable stream. But, for the same reasons, a simple string containing the whole response text will be returned.

This leads to the definition of the httpRequestResponse type.

If needed, the API may evolve later to introduce the possibility to use streams.

Waiting for the completion

An HTTP request is asynchronous; hence the client must wait for the server to answer. To avoid the callback hell, a Promise is used to represent the eventual completion of the request.

This leads to the definition of the gpf.http.request API.

The promise is resolved when the server answers, whatever the status code (including 500). The promise is rejected only when something goes wrong during the communication.
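For example, a POST request could look like the following sketch (assuming the httpRequestSettings members are named method, url, headers and data; check the documentation for the exact definition):

gpf.http.request({
    method: "POST",
    url: baseUrl,
    headers: {
        "Content-Type": "application/json"
    },
    data: JSON.stringify({hello: "world"})
}).then(function (response) {
    // Resolved whatever the status code, including 500
    console.log(response.status, response.responseText);
}, function (reason) {
    // Rejected only if the communication failed
    console.error(reason);
});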

Shortcuts

For simple requests, such as a GET with no specific header, the API must be easy to use. Shortcuts are defined to shorten the call, for instance:

gpf.http.get(baseUrl).then(function (response) {
    process(response.responseText);
}, handleError);

See the documentation.

Handling different environments

Inside the library, there are almost as many implementations as there are supported hosts. Each one is inside a self-titled file below the http source folder. This will be detailed right after.

There are basically several ways to call the proper implementation depending on the host:

  • Inside the request API, create an if / else condition that checks every possibility:

    gpf.http.request = function (/*...*/) {
        if (_GPF_HOST.NODEJS === _gpfHost) {
            // call NodeJS implementation
        } else if (_GPF_HOST.BROWSER === _gpfHost) {
            // call Browser implementation
        } else /* ... */
    };

  • Have a global variable receiving the proper implementation, using an if condition inside each implementation file:

    // Inside src/host/nodejs.js
    if (_GPF_HOST.NODEJS === _gpfHost) {
        _gpfHttpRequestImpl = function (/*...*/) {
            /* ... NodeJS implementation ... */
        };
    }
    // Inside src/http.js
    gpf.http.request = function (/*...*/) {
        _gpfHttpRequestImpl(/*...*/);
    };

  • Create a dictionary indexing all implementations per host and then fetch the proper one on call:

    // Inside src/host/nodejs.js
    _gpfHttpRequestImplByHost[_GPF_HOST.NODEJS] = function () {
        /* ... NodeJS implementation ... */
    };
    // Inside src/http.js
    gpf.http.request = function (/*...*/) {
        _gpfHttpRequestImplByHost[_gpfHost](/*...*/);
    };

My preference goes to the last choice for the following reasons:

  • if / else conditions generate cyclomatic complexity. In general, the fewer if statements, the better. In this case, they are useless because they merely compare a variable (here the current host) with a list of predefined values (the host names). A dictionary lookup is more efficient.
  • It is simpler to manipulate a dictionary to dynamically declare a new host or even update an existing implementation. Indeed, we could imagine a plugin mechanism that would change the way requests are working by replacing the default handler.

Consequently, the internal library variable _gpfHttpRequestImplByHost contains all implementations indexed by host name. The request API calls the proper one by fetching the implementation at runtime.

Browsers

As explained in the introduction, browsers offer AJAX requests to make HTTP requests. This is possible through the XMLHttpRequest JavaScript class.

There is one major restriction when dealing with AJAX requests. You are mainly limited to the server you are currently browsing. If you try to access a different server (or even a different port on the same server), then you are entering the realm of cross-origin requests.

If needed, you will find many examples on the web showing how to use it. Long story short, you can trigger a simple AJAX request in 5 lines of code.
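For instance, a minimal GET (the URL is illustrative):

var xhr = new XMLHttpRequest();
xhr.open("GET", "/echo");
xhr.onreadystatechange = function () {
    if (xhr.readyState === 4) { // request completed
        console.log(xhr.status, xhr.responseText);
    }
};
xhr.send();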

In terms of processing, it is interesting to note that, once triggered from the JavaScript code, the network communication is fully handled by the browser: it does not require the JavaScript engine. This means that the page may execute some code while the request is being transmitted to the server as well as while waiting for the response. However, to be able to process the result (i.e. trigger the callback), the JavaScript engine must be idle.

Test preview

Browser implementation is done inside src/http/xhr.js.

Two external helpers are defined inside src/http/helpers.js:

Setting the request headers and sending the request data are done almost the same way for three hosts. To avoid code duplication, these two helpers generate specialized versions capable of calling host-specific methods.

NodeJS

Besides being a JavaScript host, NodeJS comes with a complete set of APIs for a wide variety of tasks. Specifically, it comes with the http feature.

But unlike AJAX requests, triggering an HTTP request requires more effort than in a browser.

The http.request method allocates an http.ClientRequest. However, it expects a structure that details the web address. That's why the URL parsing API is needed.

The clientRequest object is also a writable stream and it exposes the methods to push data over the connection. Since things are done at the lowest level, you are responsible for ensuring the consistency of the request details. Indeed, it is mandatory to set the request headers properly: for instance, forgetting the Content-Length specification on a POST or a PUT will lead to the HPE_UNEXPECTED_CONTENT_LENGTH error. The library takes care of that part.

In the same way, the response body is a readable stream. Fortunately, GPF-JS provides a NodeJS-specific stream reader and it deserializes the content into a string using _gpfStringFromStream.
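The following sketch illustrates these steps with the native APIs only (the URL and payload are illustrative; the library wraps all of this):

var http = require("http"),
    data = JSON.stringify({hello: "world"}),
    // http.request expects the address split into parts, hence the URL parsing API
    options = require("url").parse("http://localhost:8000/echo");
options.method = "POST";
options.headers = {
    "Content-Type": "application/json",
    "Content-Length": Buffer.byteLength(data) // set Content-Length explicitly (see above)
};
var clientRequest = http.request(options, function (response) {
    var chunks = [];
    // The response body is a readable stream
    response.on("data", function (chunk) {
        chunks.push(chunk);
    }).on("end", function () {
        console.log(response.statusCode, Buffer.concat(chunks).toString());
    });
});
clientRequest.end(data); // clientRequest is a writable stream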

Test preview

NodeJS implementation is done inside src/http/nodejs.js.

WScript

WScript is the scripting host available on almost all Microsoft Windows operating systems since Windows XP. It comes in two flavors:

  • WScript.exe, which shows output in dialog boxes
  • cscript.exe, its command line counterpart

This host has a rather old and weird support of JavaScript features. It does not support timers, but GPF-JS provides all the necessary polyfills to compensate for the missing APIs.

Despite all these troubles, it has one unique advantage over the other hosts: it offers the possibility to manipulate COM components.

Indeed, the host-specific class ActiveXObject gives you access to thousands of external features within a script.

For instance, a few years ago, I created a script capable of reconfiguring a virtual host to fit user preferences and make it unique on the network.

Among the list of available objects, there is one that is used to generate HTTP requests: the WinHttp.WinHttpRequest.5.1 object.

Basically, it mimics the interface of the XMLHttpRequest object with one significant difference: its behavior is synchronous. As the GPF API returns a Promise, the developer does not have to care about this difference.
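A synchronous GET then boils down to the following sketch (the URL is illustrative):

// WScript: synchronous HTTP GET through a COM component
var winHttp = new ActiveXObject("WinHttp.WinHttpRequest.5.1");
winHttp.Open("GET", "http://localhost:8000/echo", false); // false = synchronous
winHttp.SetRequestHeader("Accept", "application/json");
winHttp.Send();
// Send returns once the response is available
WScript.Echo(winHttp.Status + " " + winHttp.ResponseText);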

Test preview

WScript implementation is done inside src/http/wscript.js.

Rhino

Rhino is probably one of the most challenging - and fun - environments because it is based on Java.

The fascinating aspect of Rhino comes from the tight integration between the two languages. Indeed, the JavaScript engine is implemented in Java and you can access any Java class from JavaScript. In terms of language support, it is almost the same as WScript: no timers and a relatively old specification. Here again, the polyfills take care of filling the blanks.

To implement HTTP requests, one has to figure out which part of the Java platform to use. After some investigation (thanks Google), the solution appeared to be the java.net.URL class.

As with NodeJS, Java streams are used to send or receive data over the connection. Likewise, the library offers Rhino-specific stream implementations.

Stream reading works by consuming bytes. To read text, a java.util.Scanner instance is used.

Surprisingly, if the status code is in the 5xx range, then getting the response stream will fail and you have to go with the error stream.
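Putting it all together, a minimal Rhino sketch could be (the URL is illustrative):

// Rhino: HTTP GET through java.net.URL
var connection = new java.net.URL("http://localhost:8000/echo").openConnection(),
    status = connection.getResponseCode(),
    // 5xx: the response stream is not available, use the error stream instead
    stream = status >= 500 ? connection.getErrorStream() : connection.getInputStream(),
    // A Scanner with the \A delimiter consumes the whole stream as a single token
    scanner = new java.util.Scanner(stream).useDelimiter("\\A"),
    responseText = scanner.hasNext() ? String(scanner.next()) : "";
print(status + " " + responseText);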

Test preview

Rhino implementation is done inside src/http/rhino.js.

PhantomJS

To put it in a nutshell, PhantomJS is a command line tool simulating a browser. It is mainly used to script access to websites and it is the perfect tool for test automation.

But there are basically two styles of PhantomJS scripts:

  • On one hand, it browses a website and simulates what would happen in a browser
  • On the other hand, it is a command line executing some JavaScript code

As a matter of fact, GPF-JS uses both ways:

  • mocha is used to automate browser testing with PhantomJS
  • a dedicated command line runs the test suite without any web page

As a result, in this environment, the XMLHttpRequest JavaScript class is available.

However, like in a browser, this host is also subject to security concerns. Hence, you are not allowed to request a server other than the one currently opened.

Luckily, you can bypass this constraint using a command line parameter: --web-security=false.

Test preview

PhantomJS implementation is done inside src/http/xhr.js.

Conclusion

If you survived the whole article, congratulations (and sorry for the broken English).

Now you might be wondering...

What's the point?

Actually, this kind of challenge satisfies my curiosity. I learned a lot by implementing this feature and it was immediately applied to greatly improve the coverage measurement.

Indeed, each host is tested with instrumented files and the collected coverage data is serialized to be later consolidated and reported on. However, as of today, only two hosts support file storage: NodeJS and WScript. Thanks to the HTTP support, all hosts now send the coverage data to the fs middleware so that it generates the file.

Q.E.D.

Monday, June 12, 2017

Release 0.2.1: Side project support

This new release supports my side projects by implementing html generation and http request helpers. It also improves the coverage measurement and this is the first version to be published again as an NPM package.

New version

Here comes the new version:

NPM Publishing

Starting from this version, the library will be published as an NPM package on every release.

The package already existed: it was first published for version 0.1.4. However, the library has since been redesigned in a way that is not backward compatible. That is why the MINOR version number was increased.

This violates the normal backward compatibility rule but, actually, nobody was really using it... And I didn't want to increase the MAJOR number to 1 until the library is ready.

An .npmignore file instructs NPM which files should be included or not. The package is almost limited to the build folder.

HTML generation helper

I was watching the excellent funfunfunction video about the hidden costs of templating languages and, at some point, he showed some HTML generation helpers whose syntax amazed me.

Hence, to support my side project, I decided to create my own HTML generation helpers based on this syntax.

For instance:

var div = gpf.web.createTagFunction("div"),
    span = gpf.web.createTagFunction("span");
document.getElementById("placeholder").innerHTML = div({className: "test1"},
    "Hello ",
    span({className: "test2"}, "World!")
).toString();

Sadly, I realized that I completely forgot to document the feature properly. Fortunately, the tests demonstrate it.

HTTP Requests

The prominent feature of this version is the gpf.http.request helper. With it, you can trigger HTTP requests from any supported host using only one API. The response is given back to the script asynchronously through Promises.

The promise is resolved when the server provides a response, whatever the status code.

Some shortcuts are also defined to improve code readability:

gpf.http.get(requestUrl).then(function (response) {
    if (response.status === 500) {
        return Promise.reject(new Error("Internal Error"));
    }
    return process(response.responseText);
}).catch(function (reason) {
    onError(reason);
});

The supported (and tested) features are:

  • Most common HTTP verbs
  • Possibility to provide request headers
  • Possibility to access response headers
  • Possibility to submit textual payload (on POST and PUT)
  • Access to response text

Other features will be added depending on the needs but this is enough to make it fully functional.

I will soon write an article about this very specific part as I faced a lot of challenges to test and implement it.

Improved coverage

Because each host benefits from its own http request implementation, it was important to assess the code coverage.

Up to this version, the coverage was measured with NodeJS through istanbul and mochaTest. Each host-specific piece of code was flagged to be ignored using the istanbul ignore comment.

However, as the library grew, more and more code was actually left unverified.

After analyzing how istanbul generates, stores and consolidates the coverage information, I found a way to run the instrumented code on all hosts. A new grunt task consolidates all data into the global result.

In the end, there are still some branches / instructions that are ignored but they are all documented.

  • Statements ignored: 0.74%
  • Branches ignored: 1.77%
  • Functions ignored: 1.36%

Lessons learned

ECHO service

To be able to test the http helpers, I needed a server whose responses could be controlled. I already had some middleware plugged inside the grunt connect task. I decided to create the ECHO service.

It took me a while to figure out the proper coding: for instance, I had to disable the cache because Internet Explorer was storing the request results and failing the tests.

Also, I had to change the way tests are run to transmit the http port information. This is done through the config global object.

Some code was definitely not tested

Having the coverage measured on almost every line revealed sections of untested code. This led to some simplifications (boot, for instance) but also to new tests.

Incomplete design for ending a stream

In order to prepare the CSV reader, I created a line adapter stream. It implements both the IReadableStream and IWritableStream interfaces. It caches the data being received with write until some line separator is detected.

However, because of caching, it requires a method to flush the buffered data.

As of now, I decided to create a method to explicitly flush the stream.
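Usage could look like the following sketch (the class name gpf.stream.LineAdapter and the flush method name are assumptions based on the description above):

var lineAdapter = new gpf.stream.LineAdapter(); // hypothetical name
lineAdapter.write("first line\r\nsecond li")
    .then(function () {
        return lineAdapter.write("ne\r\nlast line without separator");
    })
    .then(function () {
        // Without this call, the last line would remain buffered
        return lineAdapter.flush();
    });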

However, I am still wondering if it may change in the future: if you pipe several streams together, it could be convenient to have a standard way to flush all the different levels. One idea could be to introduce a special token that would be written at the end of a stream but then it would require all streams to implement it.

Next release

For now the project is on pause because of vacations. I will take some time to plan the next iteration more carefully but, still, I have to support my side project by creating a CSV reader as well as a record container.

Saturday, April 29, 2017

Release 0.1.9: Records files

This new release leverages the interface concept and delivers a file storage implementation for NodeJS and WScript. It also introduces the necessary tools for a side project I am currently working on.

New version

Here comes the new version:

Path management

Because this version comes with file management, it all started with path management. Some existing code was waiting to be re-enabled in the library. This was done easily.

The gpf.path helpers consider that the normal path separator is the Unix one. The translation is done whenever necessary and the tests cover each case.

IFileStorage and streams

Three new interfaces were designed in this release:

The purpose is to encapsulate the file system in a flexible way using Promises to handle asynchronicity. Also, streams were introduced to abstract the file concept into reading or writing data.

The method gpf.fs.getFileStorage retrieves the current host's IFileStorage implementation (if any).

Reading a file becomes host independent:

function read (path) {
    var iFileStorage = gpf.fs.getFileStorage(),
        iWritableStream = new gpf.stream.WritableString();
    return iFileStorage.openTextStream(path, gpf.fs.openFor.reading)
        .then(function (iReadableStream) {
            return iReadableStream.read(iWritableStream)
                .then(function () {
                    return iFileStorage.close(iReadableStream);
                })
                .then(function () {
                    return iWritableStream.toString();
                });
        });
}

And so is writing to a file:

function write (path, content) {
    var iFileStorage = gpf.fs.getFileStorage();
    return iFileStorage.openTextStream(path, gpf.fs.openFor.appending)
        .then(function (iWritableStream) {
            return iWritableStream.write(content)
                .then(function () {
                    return iFileStorage.close(iWritableStream);
                });
        });
}

Following the TDD practice, all the methods were first tested and then implemented for NodeJS and WScript.

Some notes:

  • If you need to replace a file's content, you must delete it first.
  • File storage, streams and host implementations already existed and were waiting to be re-enabled. However, they used a convoluted notification mechanism that has been dropped in favor of Promises.
  • Some existing code handled binary streams (even with WScript). However, I doubt this would be useful for the coming projects so I removed it. Indeed, the file storage method is named openTextStream.
  • I plan to later implement file storage for Rhino and even browsers (with a dedicated backend).

Filters and Sorters

This release was labeled "Record files" because I started a side project in which thousands of records will be manipulated. Handling an array of records requires that you can easily filter or sort them on any property.

As of now, records are supposed to be flat objects.

To make it efficient, the code generation approach was preferred.

Two functions are proposed:

The chosen syntax is documented and lots of examples can be found in the test cases. I also integrated a regular expression operator that allows interesting extractions.

I plan to create some parsers to generate the filter from more readable syntaxes (SQL, LDAP...).

Sorting can be done on any property; two types of comparison are offered:

  • Number (default) where values are compared using a subtraction
  • String through localeCompare

You may sort on several properties, see the examples.
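To illustrate the code generation approach, here is a simplified sketch of how a comparison function can be assembled (not the library's actual syntax):

function buildComparer (property, type, ascending) {
    var expression;
    if (type === "string") {
        // String comparison through localeCompare
        expression = "a." + property + ".localeCompare(b." + property + ")";
    } else {
        // Number comparison using a subtraction
        expression = "a." + property + " - b." + property;
    }
    if (!ascending) {
        expression = "-(" + expression + ")";
    }
    return new Function("a", "b", "return " + expression + ";");
}

var records = [{age: 30}, {age: 20}, {age: 25}];
records.sort(buildComparer("age", "number", true)); // 20, 25, 30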

Lessons learned

WScript specific behaviors

WScript has a pretty weak JavaScript implementation: I found two issues that broke some of my tests.

  • Newly created prototypes automatically have the constructor property assigned. I had to remove it for interfaces, see this fix.

Tests

The library now has 541 tests. Compared with version 0.1.8, which had 396, that is roughly one third more!

In terms of execution time, it takes only 1.5 seconds to run one round of them with NodeJS. This is still acceptable.

One notable challenge was to test the NodeJS streams wrappers. Some error situations are almost impossible to simulate. With the help of mocking, and a good comprehension of the stream events, the code was secured.

Code coverage

Now that the library offers two different implementations for the file storage object, a big change occurred in the way coverage is measured. Indeed, the source version now only loads what's specific to the host so that it prevents adding countless "istanbul ignore" comments. But this also means that some files are not covered anymore.

I plan to fix that in the next version.

Next release

The next release will be dedicated to support my other project:

  • A streamed line reader will be developed (and it may lead to a CSV reader)
  • The library will be published to NPM (automatically when releasing the library)
  • As stated above, the coverage will be re-designed to include other hosts

However, because of the side project, the release frequency may slow down.

Thursday, March 30, 2017

Sneaky JavaScript Technics III

A ninja is a master of disguise. How do you apply the JavaScript ninjutsu to hide an object member so that nobody can see (and use) it? Here are several ways.

The context

Recently, I started a new task on my gpf-js project and it involves manipulating an AST structure within NodeJS. Because the structure analysis requires access to the parent nodes, I augmented the AST items with a link to their parent. The result is later serialized into JSON for tracing purposes but the generated circular reference broke the conversion.

Actually, I could have used the replacer parameter of the JSON.stringify method but this circular reference also caused trouble in my own code. Hence I had to find another way.

Reflection

The JavaScript language offers some reflection mechanisms. Indeed, the for..in syntax is capable of listing all enumerable properties of any object.

I would recommend using Object.keys instead. However, the lack of support in old browsers requires that you polyfill it with for..in combined with hasOwnProperty.
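Such a polyfill may look like this sketch (it ignores the old Internet Explorer DontEnum bug):

if (!Object.keys) {
    Object.keys = function (object) {
        var keys = [],
            property;
        for (property in object) {
            if (Object.prototype.hasOwnProperty.call(object, property)) {
                keys.push(property);
            }
        }
        return keys;
    };
}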

This fundamental machinery is widely used:

  • ESlint offers a rule to detect unsafe uses
  • JavaScript evolved to make the enumerable state configurable on a property

Object.defineProperty

The simplest way to add a property to an object is to simply do it:

var obj = {};
obj.newProperty = "value";

This newly created property will be enumerable by default.

ECMAScript 5.1 introduced Object.defineProperty to create a property with options. This feature is implemented by most recent browsers with some limitations when using it on DOM objects.

It can be used in different ways and most of them are beyond the scope of this article. I will mostly focus on the possibility to create a non-enumerable property:

var obj = {};
Object.defineProperty(obj, "newProperty", {
    value: "value",
    writable: true,
    enumerable: false,
    configurable: false
});

By setting enumerable to false, this property will not be enumerated when using the for..in syntax. By setting configurable to false, this property can't be deleted and can't be reconfigured to make it visible again.
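Continuing the previous example, the effect on enumeration is easy to verify:

var names = [];
for (var name in obj) {
    names.push(name);
}
console.log(names.indexOf("newProperty")); // -1: hidden from enumeration
console.log(obj.newProperty); // "value": still accessible when you know its name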

The advantages are:

  • It is easy to implement
  • It is available since IE9

But, regarding the initial purpose, it comes with drawbacks:

To make the task harder for a hacker to figure out the property name, you may generate a complex random name and store it in a 'private' variable.

Symbol

Another simple way to add a property to an object is to use the square bracket syntax:

var obj = {};
var newPropertyName = "newProperty";
obj[newPropertyName] = "value";

In that example, the newly created property will be named "newProperty" as per the value of the variable. The only difference with the dot syntax is the use of a variable to name the property.

But what if the variable is not a string?

For most standard objects (and primitive types), the variable value is converted to string. Consequently, the following syntax is valid (even if it does not make sense):

var obj = {};
function any () {}
obj[any] = "value";

The created property name will be "function any() {}"

This obviously means that you can use names that are not valid identifiers. Hence, it is mandatory to use the bracket syntax to access them.

However, there is one mechanism that behaves differently. It was introduced with ECMAScript 2015. Every time you call the Symbol function, it returns a unique value. This value type is primitive but it does not convert to string.
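For example:

var obj = {};
var hidden = Symbol("newProperty"); // unique value, not convertible to string
obj[hidden] = "value";
console.log(Object.keys(obj)); // []: not enumerated
console.log(JSON.stringify(obj)); // {}
console.log(obj[hidden]); // "value": requires the symbol value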

This feature is implemented only by most recent browsers and not by IE.

The advantages are:

  • It is easy to implement
  • There is no way to access the property unless you have the symbol value

Tuesday, March 28, 2017

Release 0.1.8

This new release introduces the interface concept and also prepares future optimizations of the release version.

New version

Here comes the new version:

Interfaces

A new entity type can now be defined: interface.

Being able to define interfaces and to check if an object conforms with their specification will normalize and improve the way encapsulations are handled within the library but also when using it.

Release optimization

When the library is built, some source manipulations occur.

Before this release, the debug and release versions were almost the same. The main differences were:

  • The release version is serialized without the comments (see rewriteOptions)
  • Also, the release version is minified using UglifyJS

Because I am planning to implement some optimization patterns on the release version, I started to redesign the build process. After converting all the relevant sources to ES6, I realized that the former AST manipulation was not working properly. Indeed, it was trying to rename all variables to shorten them but... it didn't work. Not a big deal because UglifyJS was already doing the job.

Finally, I rewrote the AST manipulation to start optimizing the concatenated AST structure.

Estimating function usage

I implemented a detection of unused functions which revealed that _gpfIsUnsignedByte was not used! Hence, it is removed from the output.

NOP

The build process includes a pre-processor capable of conditioning some JavaScript lines using C-like defines. However, as a lazy developer, I don't want to wrap all DEBUG specific functions inside a #ifdef / #endif pair.

That's why some variables are tagged with a comment containing only gpf:nop. This is a way to signal that the variable represents a no operation.

For instance:

/*#ifdef(DEBUG)*/
// DEBUG specifics
_gpfAssertAttributeClassOnly = function (value) {
    _gpfAsserts({
        "Expected a class parameter": "function" === typeof value,
        "Expected an Attribute-like class parameter": value.prototype instanceof _gpfAttribute
    });
};
_gpfAssertAttributeOnly = function (value) {
    _gpfAssert(value instanceof _gpfAttribute, "Expected an Attribute-like parameter");
};
/* istanbul ignore if */ // Because tested in DEBUG
if (!_gpfAssertAttributeClassOnly) {
/*#else*/
/*gpf:nop*/ _gpfAssertAttributeClassOnly = _gpfEmptyFunc;
/*gpf:nop*/ _gpfAssertAttributeOnly = _gpfEmptyFunc;
/*#endif*/
/*#ifdef(DEBUG)*/
}

The optimizer is now capable of locating all variables flagged with gpf:nop and safely removing them from the output.

Automated release (final)

In the last release, I forgot one last step: closing the milestone. This is now completed.

Lessons learned

Performance measurement

When dealing with performance, one of the biggest challenges is to establish the point of reference. Indeed, if you want to quantify how much you gain, you need to make sure that the measurement environment is stable enough so that you can compare your results.

In my case, this is tough: the JavaScript hosts are evolving very fast and, since I started this project, I already changed my workstation twice. Furthermore, I don't have any performance specific tests.

So, I decided to take a pragmatic approach.

I reuse the test cases (which cover almost all the library) and compare the debug version with the release one. By quantifying the difference of execution time between the two versions, I get a good indication of how much the release is optimized.

Today, both versions demonstrate similar performance, even after implementing some of the optimizations.

jsdoc plugin

In the article My own jsdoc plugin, I explained how I tweaked jsdoc to facilitate generation of the documentation.

However, I noticed that defineTags could be a better alternative to define custom tags. After experimenting with it (knowing that jsdoc plugins are badly documented), it appeared to be extremely limited:

  • No lookup function to access other existing doclets
  • Very limited information on the current doclet (for instance: I tried to implement @gpf:chainable but the memberOf property was not set). Indeed, we can't know when onTagged is triggered (no control over the order).

Next release

The next release will introduce some features that are required for a side project I created.

Monday, March 6, 2017

Release 0.1.7

This new release secures the class mechanism and improves project tools.

New version

Here comes the new version:

Improved $super keyword

When writing the article My own super implementation I found several issues that were immediately fixed.

Better tooling

The following tools were modified:

  • It is now possible to remove unused files from the sources page
  • Each file modification triggers one pass of tests (it takes less than 1 second to execute). This way, I know as soon as possible if something goes wrong
  • fs middleware is secured to limit access to the project files only. It now supports the DELETE verb
  • The watch and serve tasks monitor the sources.json file modifications to update the list of linted files. This way, it is no longer required to restart the grunt task.

More flavors for browser testing

Selenium was upgraded to version 3 and the detection has been fixed to make it more reliable.

On top of Selenium, a command line execution wrapper combined with a caching service (to store & grab the results) allows testing of non-automated browsers (such as Safari on Windows). Once the tests are done, the command line is killed.

Automated release

Releasing a version has never been so easy: a script using the github-api module to call the GitHub API implements the following steps:

  • Check version number
  • Update package.json (if needed)
  • Check GitHub milestones to identify the milestone details and check that all issues are closed
  • Update README.md
  • Grunt make
  • Copy tmp/plato/report.history. to build/ (grunt copy:releasePlatoHistory)
  • commit & push
  • Create a new release on GitHub
  • Copy build/tests.js into test/host/legacy/{version}.js
  • commit & push

However, one last step was forgotten: closing the milestone. An incident was created.

Lessons learned

Documenting features makes them better

I will take the risk of repeating myself here, but the article about super made me realize several mistakes in the implementation of the $super equivalent. Furthermore, taking the time to explore the ECMAScript super keyword gave me a better understanding of the feature.

In general, it is valuable to step back from the code and document the intent behind a feature.

Better Selenium detection

One of the reasons why I wanted to remove Selenium was the buggy detection. Indeed, there are some failures which are not encapsulated properly in a Promise. As a result, the whole process fails when they happen.

After digging on the web, I found this excellent thread on NodeJS Exception handling. It allowed me to handle those unmanaged exceptions the proper way and it secured the detection.

Next release

The next release will introduce the interface concept.

Saturday, February 18, 2017

My own super implementation

Release 0.1.6 of GPF-JS delivers a basic class definition mechanism. Working on release 0.1.7, the focus is to improve this implementation by providing mechanisms that mimic the ES6 class definition. In particular, the super keyword is replaced with a $super member that provides the same level of functionality. Here is how.

Introduction

The super keyword was introduced with ECMAScript 2015. Its goal is to simplify the access to parent methods of an object. It can be used within a class definition or directly in object literals. We will focus on class definition.

Class examples

To demonstrate the usage, let's define a simple class A:

class A {
    constructor (value = "a") {
        this._a = true;
        this._value = value;
    }
    getValue () {
        return this._value;
    }
}

In that example, the class A offers a constructor with an optional parameter (defaulted to "a"). Upon execution, it sets the member _a to true (this will be used later to validate the constructor call). Also, the member _value receives the value of the parameter. Finally, the method getValue exposes _value.

Then, let's subclass it with class B:

class B extends A {
    constructor () {
        super("b");
        this._b = true;
    }
    getValue () {
        return super.getValue().toUpperCase();
    }
}

When instances of B are built, the constructor of A is explicitly called with the parameter "b". Also, the behavior of the method getValue is modified to uppercase the result of parent implementation.

None of these features are new to JavaScript. Indeed, the exact same definition can be achieved without any of the ECMAScript 2015 keywords.

For instance:

function A (value) {
    this._a = true;
    this._value = value || "a";
}
Object.assign(A.prototype, {
    getValue: function () {
        return this._value;
    }
});

function B () {
    A.call(this, "b");
    this._b = true;
}
B.prototype = Object.create(A.prototype);
Object.assign(B.prototype, {
    getValue: function () {
        return A.prototype.getValue.call(this).toUpperCase();
    }
});

There are several ways to implement inheritance in JavaScript. In this example, the pattern used in GPF-JS is demonstrated.

Differences

Whether you use one syntax or the other, both versions of A and B will look (and behave) the same:

  • A and B are functions
  • A.prototype has a method named getValue
  • b instances only have own properties _a, _b and _value
  • b instanceof A works

Class version (Chrome & Firefox only)

Function version

So, why would you use the super keyword?

As you may see in the examples, accessing the parent methods without super is possible but requires the knowledge of the parent class being extended. Furthermore, the syntax is not easy to remember... Well, after using it a thousand times, you end up knowing it by heart.

  • In the constructor, super("b") is replaced with A.call(this, "b")
  • In a method, super.getValue() is replaced with A.prototype.getValue.call(this)

As a consequence, any update in the class hierarchy would lead to a mass search & replace in the code.

Besides, one could say that this keyword is a typical example of syntactic sugar as it does not bring any new feature...

if you forget about object literals...

Exploring the feature

Even if the documentation on super is extensive, some questions remain about the way it reacts to edge cases.

Redefining parent method

What happens if the parent prototype is modified? Does it call the modified method or the method that existed when the child method was defined?

The link is dynamic.

Example (Chrome & Firefox only)

This is consistent with the function implementation: A.prototype.getValue re-evaluates the member every time it is called.

Getting function object

Is it possible to access the parent method without invoking it? Does it return a function object?

It returns the parent function object.

Example (Chrome & Firefox only)

It is important to notice that if not invoked immediately (look at getSuperGetValue in the example), the value of this is undefined.

Checking parent method existence

Finally, how does the super keyword validate the method that is accessed? What happens if you try to reference a non-existing member: does it fail when generating the class or upon method execution?

Accessing a non-existing member returns undefined.

Example (Chrome & Firefox only)

This is also consistent with the function implementation: it makes sense that the error is thrown at evaluation time.

A super idea

One of the goals of GPF-JS is to provide the same feature set whatever the host running the script. Because some of them are old (Rhino and WScript), it is not only impossible to use recent features but it also prevents the use of transpilers.

Transpilers like babel are capable of generating compatible JavaScript code out of next-gen JavaScript source.

gpf.define is a class definition helper exposed by the library since version 0.1.6. But it would not be complete without a mechanism that mimics the super keyword in order to reduce the complexity of calling parent methods.

super being a reserved keyword, it could not be used. But as the library reserves $ properties for specific usage, the idea of defining $super naturally came up.

In order to make the $super keyword a global one (like super), the library had to tweak the global context object which generated lots of issues (leaks detected in mocha, validation errors in linters, the variable could already be defined by the developer...). So, $super had to be attached to the context of the class instance.

this.$super was defined and had to support two different syntaxes (see the sketch after this list):

  • Calling this.$super must be equivalent to super
  • Calling this.$super.methodName must be equivalent to super.methodName
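For instance, the class B shown earlier could be declared like this sketch (the exact definition dictionary members, such as $class and $extend, are assumptions here):

// Sketch of a subclass using this.$super (dictionary members are illustrative)
var B = gpf.define({
    $class: "B",
    $extend: A,
    constructor: function () {
        this.$super("b"); // equivalent to super("b")
        this._b = true;
    },
    getValue: function () {
        return this.$super.getValue().toUpperCase(); // equivalent to super.getValue()
    }
});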

Class definition

The library internally uses an object to retain the initial definition dictionary, parse it and build the class handler. This class definition object is not yet exposed but will be in the future through a read-only interface.

This object is a key component of this implementation as it keeps track of the class properties such as the extended base class. This will be leveraged to access parent methods.

Object.getPrototypeOf could be used to escalate the prototype chain and retrieve the base methods. However, it is poorly polyfilled on old hosts and it does not work as expected with standard objects.

Wrapping methods

In order to be able to cope with this.$super calls inside a method, the library has to make sure that the $super member exists before executing the method.

A long time ago, when studying JavaScript inheritance, I found this very interesting article from John Resig (the creator of jQuery).

It took me ages to fully understand its Class.extend helper but it demonstrates a brilliant JavaScript ninja technique: by testing the method with a regular expression, it is capable of finding out if a class method uses the _super keyword. If so, the method is wrapped inside a container function that defines the _super member for the lifetime of the call.

// Check if we're overwriting an existing function
prototype[name] = typeof prop[name] == "function" &&
    typeof _super[name] == "function" && fnTest.test(prop[name]) ?
    (function (name, fn) {
        return function () {
            var tmp = this._super;
            // Add a new ._super() method that is the same method
            // but on the super-class
            this._super = _super[name];
            // The method only need to be bound temporarily, so we
            // remove it when we're done executing
            var ret = fn.apply(this, arguments);
            this._super = tmp;
            return ret;
        };
    })(name, prop[name]) :
    prop[name];

Typically, GPF-JS uses the same strategy to detect the use of $super and wrap the method in a new one that defines the value of this.$super upon execution.

The use of _gpfFunctionDescribe and _gpfFunctionBuild ensures that the signature of the final method will be the same as the initial one. Indeed, GPF-JS will soon enable interface validation and method signatures have to match.

Dynamic mapping of super method

So, when the class is being defined, a dictionary mapping method names to their implementation is passed to gpf.define. This definition dictionary is enumerated so that when the use of $super is detected in a method, the name of the parent method is deduced.

This name (as well as the members of $super, as explained right after) is remembered in a closure and passed to the function _get$Super before calling the method.

Building a new $super method object

The class definition method _get$Super creates a new function instead of reusing the parent one. The reason is quite simple: JavaScript functions being objects, it is allowed to add properties to them... and this will be required to define expected additional super method names.

But then, you may wonder why the parent function object is not reused: those additional member names could be backed up, overwritten and restored once the call is completed. In the end, it would allow the child method to use the parent's members.

However, there are several considerations here:

  • This object could be frozen with Object.freeze meaning it would be read-only.
  • This function object could be used elsewhere meaning that the modification could be visible outside of the method.

One may argue that this is also true for this.$super. However, since this member is detected, it is overwritten anyway.

While writing the above comment, I realized that the current implementation has an issue. If you ever wonder why I wrote this article, this is a good reason.

  • Even if super returned the parent method object, it would be extremely confusing to have members on it that are being used. Consider the following example: with super.getValue, how do you know if it designates a parent method named getValue or the member getValue of the parent method?

I suspect this is the reason why super() is supported only in constructor functions. Try using super() in a class method: you will get the error "SyntaxError: 'super' keyword unexpected here". this.$super overcomes this limitation.

  • If the developer expects to get members on the parent method, he would have a hard time defining and reusing them (not to mention the code complexity). This encapsulation prevents this bad practice and avoids headaches.

Detecting $super members

Once $super is detected in a method content, the list of $super members is extracted using a regular expression.
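A simplified sketch of this extraction (not the library's exact expression):

function getValue () {
    return this.$super.getValue().toUpperCase();
}
// Capture the member names used on this.$super
var reSuperMember = /\.\$super\.(\w+)/g,
    source = getValue.toString(),
    memberNames = [],
    match;
while ((match = reSuperMember.exec(source)) !== null) {
    memberNames.push(match[1]);
}
// memberNames contains ["getValue"]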

This detection part is critical as it greatly improves performance by generating only what is required.

Then, for each extracted name, the member is created inside _get$Super right after allocating $super.

Invoking super methods

When calling this.$super(), the method $super would obviously receive the proper context.

However, things are more complicated when calling this.$super.methodName().

If you understand how JavaScript function invocation works, you know that inside methodName, this would be equal to this.$super.

And that is a function object.

So how can the library make sure that the proper context is transmitted to methodName?

Function binding could be used to force the value of this but then we would lose the possibility to invoke it with any context.

Function binding

Before Function.prototype.bind was introduced, people used to create a closure to force the value of this inside a function.

Function.prototype.bind = function (oThis) {
    var fToBind = this;
    return function () {
        return fToBind.apply(oThis, arguments);
    };
};

This concept was also made popular with jQuery.proxy.

The drawback is that, once a function is bound, it is no longer possible to change the context it is executed with.

Demonstration:

function getValue () {
    return this.value;
}
// Passing the context
log(getValue.call({value: "Hello World"})); // output "Hello World"
// Binding
var boundGetValue = getValue.bind({value: "Bound"});
log(boundGetValue()); // output "Bound"
// Trying to pass a different context
log(boundGetValue.call({value: "Hello World"})); // output "Bound"
// Trying to bind again
var reboundGetValue = boundGetValue.bind({value: "Hello World"});
log(reboundGetValue()); // output "Bound"

Back to the $super.methodName example, it requires a sort of weak bind: a method binding that could be overwritten with a different context using bind, call or apply.

Weak binding

$super being known when the methods are created, it can be compared with the value of this and substituted when matched.

This realizes the weak binding and allows the developer to bind, call or apply the method without any problem.
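A simplified sketch of the idea (names are illustrative):

// The wrapper substitutes the context only when the method is invoked
// directly on the $super function object, i.e. this.$super.methodName()
function wrapSuperMember (instance, $super, superMethod) {
    return function () {
        var context = this === $super ? instance : this;
        return superMethod.apply(context, arguments);
    };
}
// e.g. $super.getValue = wrapSuperMember(instance, $super, A.prototype.getValue);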

Conclusion

There is no revolution in this article and many will consider this realization useless as they focus on modern environments and use the latest JavaScript features. However, my curiosity is satisfied as I learned a lot about the super feature. Moreover, the library will soon deliver new features on top of this one that should make the difference.