Thursday, December 21, 2017

Release 0.2.3: Streams and Pipes

This new release focuses on streams by offering two new tools: a way to pipe streams together (and simplify their use) and a CSV parser.

New version

Just before Christmas vacations, here comes the new version:

Release content

Cleaning

A significant effort was put into cleaning the content of the publication repository, as it previously contained one plato report folder as well as one documentation folder for each version.

Now everything is consolidated in a single folder and the plato report also includes file history (for instance: regexp.js).

All existing links are adjusted.

gpf.stream.pipe

Combining streams can become tedious when more than two streams must be plugged together. This is greatly simplified using the gpf.stream.pipe helper.

It takes care of chaining the read / write methods and returns a Promise that is resolved when all the data has been processed (including flushable streams).

For example:

// Reading a CSV file
var csvFile = gpf.fs.getFileStorage()
        .openTextStream("file.csv", gpf.fs.openFor.reading),
    lineAdapter = new gpf.stream.LineAdapter(),
    csvParser = new gpf.stream.csv.Parser(),
    output = new gpf.stream.WritableArray();
// csvFile -> lineAdapter -> csvParser -> output
gpf.stream.pipe(csvFile, lineAdapter, csvParser, output)
    .then(function () {
        return output.toArray();
    })
    .then(function (records) {
        // process records
    });

gpf.stream.csv.Parser

As demonstrated in the previous example, CSV parsing is now possible using the gpf.stream.csv.Parser class.

Lessons learned

There are still some doubts regarding the mechanics of the gpf.stream.pipe implementation.

In particular, if an intermediate read completes before the whole stream has been processed, a new read must be triggered. Also, error handling remains unclear, even though it is fully tested. Finally, flushing a stream does not necessarily mean its end: it should remain possible to continue writing after a flush.

As these behaviors are not yet clearly defined, the code might require adjustments in the future.

In any case, it demonstrates that some 'real life' examples are required to see how this could simplify (or not) the development.

Next release

The next release content will mostly focus on:

  • Finishing the release process improvements
  • Introducing attributes that will be later used to simplify CSV generation
  • Adding new stream helpers to filter / page batches of records

In parallel, the article on gpf.require.define implementation is still in progress and some additional side projects are popping up.

Friday, November 3, 2017

Release 0.2.2: gpf.require

This new release includes a modularization helper as well as feedback from a side project. It also includes fixes for some annoying bugs and quality improvements.

New version

Here comes the new version:

Why did it take so long to get a new version?

First, no timeline is clearly defined for the project. Usually, a release spans about one month, which creates an implicit commitment. However, whatever time is necessary is spent to make sure that a release contains everything needed.

In that case, there are three main reasons to explain the delay.

  • The second reason is that a side project dragged a lot of my time. I will later write an article about it as it was my first HTML mobile app experience with a tough customer... my wife.
  • Finally, this release addresses an issue that appeared from time to time and that was never considered seriously. Some tests were failing on a regular basis but not frequently enough to represent a real threat. This problem was prioritized and tackled once for all but it brought many time-consuming challenges.

Oh, and I also designed a logo
logo

Release content

gpf.require

This is clearly the new feature coming with this version. It allows developers to modularize their project by creating separate source files that depend on each other. A separate article will be written to detail the concept (and implementation) but, in the meantime, you may read the tutorial on how to use it.

To put it in a nutshell:

gpf.require.define({
    hello: "hello.js"
}, function (require) {
    "use strict";
    console.log(require.hello()); // Output "World!"
});

Provided the hello.js file is in the same folder and contains:

gpf.require.define({}, function () {
    "use strict";
    return function () {
        return "World!";
    };
});

Or (using CommonJS syntax)...

"use strict";
module.exports = function () {
    return "World!";
};

This mechanism was envisioned for a while but it was hard to figure out how (and where) to start. After reading some documentation about NodeJS and now that the library is mature enough, it appeared to be surprisingly easy to implement.

I did it
I did it

Some of you may object that, indeed, it already exists in NodeJS and browsers (natively or through different libraries such as RequireJS or browserify). But this new mechanism brings one uniform feature to all supported platforms.

And it comes with a way to alter the require cache to simplify testing by injecting mocked dependencies.

Improved gpf.web.createTagFunction

To develop the mobile app, Scalable Vector Graphics were extensively leveraged. In order to create the appropriate markup, some specific DOM APIs must be used (such as createElementNS). That's why the support of the predefined svg namespace was added to gpf.web.createTagFunction.

... But the use of namespaces is not yet documented... an incident is assigned to the next release.

Similarly, this method allocates a function that generates tags exposing methods (toString and appendTo). The documentation was missing but this is now resolved.

Improved gpf.http layer

The HEAD verb was not supported by the HTTP layer. It is now supported.

Another improvement is the possibility to mock any HTTP requests by adding a custom handler. Using this feature, any code that involves HTTP communication can be tested independently from the web server.

For instance:

gpf.http.mock({
    method: gpf.http.methods.get,
    url: new RegExp("echo\\?status=([0-9]+)"),
    response: function (request, status) {
        if (status === "400") {
            return; // Don't mock
        }
        return {
            status: parseInt(status, 10) + 1,
            headers: {
                "x-mock": true
            },
            responseText: "It works"
        };
    }
});

// No HTTP request will be sent to any server
gpf.http.get("/echo?status=200").then(function (response) {
    assert(response.status === 201);
    assert(response.headers["x-mock"] === true);
    assert(response.responseText === "It works");
    /*...*/
});

// But this one will...
gpf.http.get("/echo?status=400");

Improved Timeout & Promise tests

From time to time, roughly once every 1000 executions, the WScript tests were failing. It took ages to narrow the problem down to the place where the assertion failed:

  • An 'infinite performance' mode was added to the testing command line: it loops forever and shows the mean as well as the deviation of the execution time
  • Errors occurring in the testing command line are now documented to help locate the issue

Failure example 1
Failure example 1

Failure example 2
Failure example 2

This highlights the fact that tests should be small and clear enough to immediately spot a problem when it happens.

Upon qualification, it appeared that two test suites were badly designed:

  • Promises testing was using timeouts for no good reason. Removing them from the test suite was easy: diff
  • Timeout testing was based on global variables to assess several timeout executions and the data got corrupted depending on the execution sequence. Changing the test suite to secure it was more challenging: before after

Once the whole thing was figured out and fixed, the problem of legacy tests remained. Indeed, the test suites are saved after each release and they should remain untouched to ensure backward compatibility.

But then, how do you handle tests that were badly designed? You can't just drop the legacy test suites (more than 600 tests) just because some of them are invalid.

Legacy code
Legacy code

That's why the idea of legacy management came up. With the help of the legacy.json file, the library offers a way to disable some tests based on their name and version.

Improved quality and tooling

The minimum maintainability ratio has been increased to 70, raising the overall quality requirement for source files.

The grunt connect middleware was modified to automatically scroll the page content. This allowed the recording of this joyful video.

Lessons learned

  • Almost everything can be implemented in the oldest JavaScript hosts. This demonstrates the power and flexibility of the language.
  • The way tests are written today significantly impacts tomorrow's versions. This is something to keep in mind.
  • Documentation must be reviewed before releasing. However, it becomes more and more complex and I am not necessarily the best person to review my own writing.

Next release

The next release content will mostly focus on:

  • Improving the streams formalism and implementing a CSV from/to records engine
  • Improving the release process
  • Doing some cleaning

Wednesday, November 1, 2017

Released 0.2.2

Just a quick note to announce that version 0.2.2 is released.
I am now working on release notes and next release preparation.

I captured the operation to illustrate how easy (and safe) the release process is.

Saturday, October 14, 2017

0.2.2 in progress

This version 0.2.2 is still in progress. Here is a sneak peek of the build process.

When building the GPF library, it is tested with 4 command line hosts:

  • NodeJS
  • WScript (cscript)
  • Rhino
  • PhantomJS

And at least 3 browsers:

  • Internet Explorer

Backward compatibility with all previous versions is checked as well as maintainability ratio (using Plato) and coverage (using Jasmine).

The library is published in three flavors:

  • source (development version)
  • debug (concatenated sources)
  • release (an optimized and uglify-ed version of debug)

The test library contains more than 600 unit tests covering almost 100% of the code (exceptions are documented).

Last but not least, the code is linted and documentation is generated with jsdoc.

This represents a total of 55 test executions.

Obviously, I am too lazy to do it all by myself: everything is automated with Grunt.

Today was a big day working on version 0.2.2 (still in progress): I am finalizing a feature that will help me better modularize my code.

I wanted to share those 2 minutes of pure development joy that concluded my work (full screen recommended).

Saturday, September 16, 2017

Sneaky JavaScript Technics IV

A ninja is a lazy fighter. As Sun Tzu stated "Every battle is won or lost before it's ever fought". Here are quick tips to simplify your development when dealing with optional parameters.

Handling default parameters

There are many ways to default parameters when they are not passed during function invocation.

ES6 proposes a syntax to formalize them, you can try it by yourself.

The following example:

function test (firstParam = "default value") {
    return firstParam;
}

is transpiled into:

"use strict";

function test() {
    var firstParam = arguments.length > 0 && arguments[0] !== undefined
        ? arguments[0]
        : "default value";
    return firstParam;
}

Indeed, the arguments object contains the list of parameters that were passed during function invocation. It looks like an array (but it is not) and exposes length as well as all passed parameters.

This is quite complex and I usually go with:

function test (firstParam) {
    if (firstParam === undefined) {
        firstParam = "default value";
    }
    return firstParam;
}

Or, shorter,

function test (firstParam) {
    firstParam = firstParam || "default value";
    return firstParam;
}

This last syntax may lead to errors when dealing with falsy values.
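For instance, a perfectly valid falsy value is silently replaced by the default:

function test (firstParam) {
    firstParam = firstParam || "default value";
    return firstParam;
}
test(0);     // "default value" instead of 0
test("");    // "default value" instead of ""
test(false); // "default value" instead of false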

All these non-ES6 syntaxes work fine, but they come with two drawbacks:

  • You have to manually handle the missing parameters
  • The condition adds cyclomatic complexity to your function

The lazy solution

Wouldn't it be nice to have a way to wrap a function so that default parameters would be handled without having to code anything?

Considering that optional parameters are usually at the end of a function, let's introduce a new method to the function object that will default its last parameters.

So considering this function:

function add (value, increment) {
    return value + increment;
}

We could define inc as:

var inc = add.withDefaults(1);

That would be equivalent to the ES6 version of:

function inc (value, increment = 1) {
    return add(value, increment);
}

Forewords

Before going straight to the proposal, you need to understand the following concepts:

  • A function exposes the size of its signature through the property length: it allows any developer to know the number of expected parameters
  • The arguments object is not an array but it can be easily converted into one using the following pattern: [].slice.call(arguments)
  • You can create an array of any size using new Array(size); it will be filled with undefined values
  • If the provided default parameters are not enough to set missing ones, they are replaced with undefined (as expected)
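A quick illustration of these concepts:

function sample (a, b, c) {}
console.log(sample.length); // 3: the number of declared parameters

function dump () {
    var args = [].slice.call(arguments); // converts arguments into a real array
    return args;
}
console.log(dump(1, 2)); // [1, 2]

var blanks = new Array(2);
console.log(blanks.length, blanks[0], blanks[1]); // 2 undefined undefined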

Now you are ready.

Proposal

Here is the proposal to handle default values:

Function.prototype.withDefaults = function () {
    var defaultParameters = [].slice.call(arguments),
        wrappedFunction = this;
    return function () {
        var receivedParameters = [].slice.call(arguments),
            missingCount = wrappedFunction.length - receivedParameters.length,
            actualParameters,
            sliceFrom;
        if (missingCount > 0) {
            sliceFrom = defaultParameters.length - missingCount;
            actualParameters = receivedParameters
                .concat(
                    new Array(Math.max(-sliceFrom, 0)),
                    defaultParameters.slice(Math.max(sliceFrom, 0))
                );
        } else {
            actualParameters = receivedParameters;
        }
        return wrappedFunction.apply(this, actualParameters);
    };
};

As well as the associated test case.
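For illustration, here is how the add / inc example above behaves with this implementation:

function add (value, increment) {
    return value + increment;
}
var inc = add.withDefaults(1);
console.log(inc(2));    // 3: increment was defaulted to 1
console.log(inc(2, 5)); // 7: passed parameters take precedence over the defaults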

Improvements

This solution is not perfect. Indeed, the resulting function has a length of 0. Consequently, you cannot chain another call to default the parameters of such a wrapped function.
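Continuing the previous example, the limitation can be observed directly:

console.log(add.length); // 2
console.log(inc.length); // 0: the wrapper declares no named parameter,
                         // so chaining inc.withDefaults(...) has nothing to complete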

There are several ways to work around this limitation but I wanted to keep this article short and simple.

Wednesday, June 21, 2017

5 ways to make an http request

The version 0.2.1 of the GPF-JS library delivers an HTTP request helper that can be used on all supported hosts. It was quite a challenge as it implied 5 different developments; here are the details.

The need for HTTP requests

In a world of interoperability, internet of things and microservices, the almost 30-year-old HTTP protocol defines a communication foundation that is widely known and implemented.

Originally designed for human-to-machine communication, this protocol also supports machine-to-machine communication through standardized concepts and interfaces.

Evolution of HTTP requests in browsers

Web browsers were the first applications implementing this protocol to access the World Wide Web.

Netscape communicator loading screen
Netscape communicator loading screen

Before AJAX was conceptualized, web pages had to be fully refreshed from the server to reflect any change. JavaScript was used for simple client-side manipulations. From a user experience point of view, it was OK (mostly because we had no other choice) but this limited the development of user interfaces.

I guess
I guess

Then AJAX introduced new ways to design web pages: only the new information could be fetched from the server without reloading the page. Therefore, the pages were faster, crisper and fully asynchronous.

However, each browser had its own implementation of AJAX requests (not mentioning DOM, event handling and other incompatibilities). And that's why jQuery, which was initially designed to offer a uniform API that would work identically on any browser, became so popular.

jQuery everywhere
jQuery everywhere

Today, the situation has changed: almost all browsers are implementing the same APIs and, consequently, modern libraries are considering browsers to be one environment only.

GPF-JS

GPF-JS obviously supports browsers and it leverages AJAX requests to implement HTTP requests in this environment. But the library is also compatible with NodeJS as well as other - less common - command line hosts: WScript, Rhino and PhantomJS.

Designing only one API that is compatible with all these hosts means to deal with each host specificities.

How to test HTTP request

When you follow the TDD practice, you write tests before writing any line of production code. But in that case, the first challenge was to figure out how the whole HTTP layer could be tested. Mocking was not an option.

The project development environment heavily relies on the grunt connect task to deliver the dashboard: a place where the developer can access all the tools (source list, tests, documentation...).

dashboard
dashboard

As a lazy developer, I just need one command line for my development (grunt). Then all the tools are available within the dashboard.

Some middleware is plugged to add extra features such as:

  • cache: introduced with version 0.1.7, it is leveraged by the command line used to test browsers when Selenium is not available. It implements a data storing service like Redis.
  • fs: a file access service used to read, create and delete files within the project storage. For instance, it is used by the sources tile to check if a source has a corresponding test file.
  • grunt: a wrapper used to execute and format the log of grunt tasks.

Based on this experience, it became obvious that the project needed another extension: the echo service. It basically accepts any HTTP request and the response either reflects the request details or can be modified through URL parameters.

POSTMAN was used to test the tool that will be used to test the HTTP layer...

GET
GET

GET 500
GET 500

POST
POST

One API to rule them all

Now that the HTTP layer can be tested, the API must be designed to write the tests.

Input

An HTTP request starts with some parameters:

  • The Uniform Resource Locator which determines the web address you want to send the request to. There are several ways to specify this location: NodeJS offers an URL class which exposes the different parts of it (host, port ...). However, the simplest representation remains the one everybody is used to: the string you can read inside the browser location bar.
  • The request method (also known as verb) which specifies the kind of action you want to execute.
  • An optional list of header fields meant to configure the request processing (such as specifying the expected answer type...). The simplest way to provide this list is to use a key/value dictionary, meaning an object.
  • The request body, mostly used for POST and PUT actions, which contains the data to upload to the server. Even if the library supports the concept of streams, most of the expected use cases imply sending an envelope that is synchronously built (text, JSON, XML...). Also, JavaScript (in general) is not good at handling binary data, hence a simple string is expected as a request body.

This leads to the definition of the httpRequestSettings type.

Output

On completion, the server sends back a response composed of:

  • A status code that provides feedback about how the server processed the request. Typically, 200 means everything went well. On the contrary, 4xx messages signal an error and 500 is a critical server error.
  • A list of response headers. For instance, this is how cookies are transmitted by the server to the client (and, actually, they are also sent back by the client to the server through headers).
  • The response body: depending on what has been requested, it will contain the server answer. This response could be deserialized using a readable stream. But, for the same reasons, a simple string containing the whole response text will be returned.

This leads to the definition of the httpRequestResponse type.

If needed, the API may evolve later to introduce the possibility to use streams.

Waiting for the completion

An HTTP request is asynchronous; hence the client must wait for the server to answer. To avoid the callback hell, a Promise is used to represent the eventual completion of the request.

This leads to the definition of the gpf.http.request API.

The promise is resolved when the server answers, whatever the status code (including 500). The only case where the promise is rejected is when something went wrong during the communication.

Shortcuts

For simple requests, such as a GET with no specific header, the API must be easy to use. Shortcuts are defined to shorten the call, for instance:

gpf.http.get(baseUrl).then(function (response) {
    process(response.responseText);
}, handleError);

See the documentation.

Handling different environments

Inside the library, there are almost as many implementations as there are supported hosts. Each one sits in a file named after the host below the http source folder. This will be detailed right after.

Consequently, there are several ways to call the proper implementation depending on the host:

  • Inside the request API, create an if / else condition that checks every possibility:

    gpf.http.request = function (/*...*/) {
        if (_GPF_HOST.NODEJS === _gpfHost) {
            // call NodeJS implementation
        } else if (_GPF_HOST.BROWSER === _gpfHost) {
            // call Browser implementation
        } else /* ... */
    };

  • Have a global variable receiving the proper implementation, using an if condition inside each implementation file:

    // Inside src/host/nodejs.js
    if (_GPF_HOST.NODEJS === _gpfHost) {
        _gpfHttpRequestImpl = function (/*...*/) {
            /* ... NodeJS implementation ... */
        };
    }

    // Inside src/http.js
    gpf.http.request = function (/*...*/) {
        _gpfHttpRequestImpl(/*...*/);
    };

  • Create a dictionary indexing all implementations per host and then fetch the proper one on call:

    // Inside src/host/nodejs.js
    _gpfHttpRequestImplByHost[_GPF_HOST.NODEJS] = function () {
        /* ... NodeJS implementation ... */
    };

    // Inside src/http.js
    gpf.http.request = function (/*...*/) {
        _gpfHttpRequestImplByHost[_gpfHost](/*...*/);
    };

My preference goes to the last choice for the following reasons:

  • if / else conditions generate cyclomatic complexity. In general, the fewer ifs the better. In this case, they are useless because they only compare a variable (here, the current host) with a list of predefined values (the list of host names). A dictionary lookup is more efficient.
  • It is simpler to manipulate a dictionary to dynamically declare a new host or even update an existing implementation. Indeed, we could imagine a plugin mechanism that would change the way requests are working by replacing the default handler.

Consequently, the internal library variable _gpfHttpRequestImplByHost contains all implementations indexed by host name. The request API calls the proper one by fetching the implementation at runtime.

Browsers

As explained in the introduction, browsers offer AJAX requests to make HTTP Requests. This is possible through the XmlHttpRequest JavaScript class.

There is one major restriction when dealing with AJAX requests. You are mainly limited to the server you are currently browsing. If you try to access a different server (or even a different port on the same server), then you are entering the realm of cross-origin requests.

If needed, you will find many examples on the web on how to use it. Long story short, you can trigger a simple AJAX request in 5 lines of code.
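As a minimal sketch (url and process are assumed names, not part of any library):

var xhr = new XMLHttpRequest();
xhr.open("GET", url);
xhr.onreadystatechange = function () {
    if (xhr.readyState === 4) { // request completed
        process(xhr.responseText);
    }
};
xhr.send();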

In terms of processing, it is interesting to note that, once triggered from the JavaScript code, the network communication is fully handled by the browser: it does not require the JavaScript engine. This means that the page may execute some code while the request is being transmitted to the server as well as while waiting for the response. However, to be able to process the result (i.e. trigger the callback), the JavaScript engine must be idle.

Test preview
Test preview

Browser implementation is done inside src/http/xhr.js.

Two external helpers are defined inside src/http/helpers.js:

Setting the request headers and sending the request data are done almost the same way for three hosts. To avoid code duplication, these two functions generate specialized versions capable of calling host-specific methods.
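As an illustration of the idea (this is not the library's actual helper, just a sketch of the pattern), a factory can generate a host-specific "set all headers" function from the way each host sets a single header:

// Illustration only: generate a function applying a headers dictionary
// using the host-specific way of setting one header
function buildSetHeaders (setOneHeader) {
    return function (request, headers) {
        Object.keys(headers || {}).forEach(function (name) {
            setOneHeader(request, name, headers[name]);
        });
    };
}

// Hypothetical specialization for the XHR-based hosts
var setXhrHeaders = buildSetHeaders(function (xhr, name, value) {
    xhr.setRequestHeader(name, value);
});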

NodeJS

Besides being a JavaScript host, NodeJS comes with a complete set of APIs for a wide variety of tasks. Specifically, it comes with the http feature.

But unlike AJAX requests, triggering an HTTP request requires more effort than in a browser.

The http.request method allocates an http.ClientRequest. However, it expects a structure that details the web address. That's why the URL parsing API is needed.

The clientRequest object is also a writable stream and it exposes the methods to push data over the connection. Since things are done at the lowest level, you are responsible for ensuring the consistency of the request details. Indeed, it is mandatory to set the request headers properly. For instance, forgetting the Content-Length specification on a POST or a PUT will lead to the HPE_UNEXPECTED_CONTENT_LENGTH error. The library takes care of that part.

The same way, the response body is a readable stream. Fortunately, GPF-JS provides a NodeJS-specific stream reader and it deserializes the content inside a string using _gpfStringFromStream.
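As a minimal sketch of what this looks like with the raw NodeJS API (url and body are assumed variables; this is not the library's implementation):

var http = require("http"),
    urlModule = require("url"),
    parsed = urlModule.parse(url),
    clientRequest = http.request({
        method: "POST",
        hostname: parsed.hostname,
        port: parsed.port,
        path: parsed.path,
        headers: {
            "Content-Length": Buffer.byteLength(body) // must be consistent with the body
        }
    }, function (response) {
        var chunks = [];
        response.on("data", function (chunk) {
            chunks.push(chunk);
        });
        response.on("end", function () {
            console.log(response.statusCode, Buffer.concat(chunks).toString());
        });
    });
clientRequest.write(body); // clientRequest is a writable stream
clientRequest.end();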

Test preview
Test preview

NodeJS implementation is done inside src/http/nodejs.js.

WScript

WScript is the scripting host available on almost all Microsoft Windows operating systems released after Windows XP. It comes in two flavors:

  • WScript.exe, which shows its outputs in dialog boxes
  • cscript.exe, which is the command line counterpart

This host has rather old and weird support of JavaScript features. It does not support timers, but GPF-JS provides all the necessary polyfills to compensate for the missing APIs.

Despite all these troubles, it has one unique advantage over the other hosts: it offers the possibility to manipulate COM components.

Indeed, the host-specific class ActiveXObject gives you access to thousands of external features within a script:

For instance, a few years ago, I created a script capable of reconfiguring a virtual host to fit user preferences and make it unique on the network.

Among the list of available objects, there is one that is used to generate HTTP requests: the WinHttp.WinHttpRequest.5.1 object.

Basically, it mimics the interface of the XmlHttpRequest object with one significant difference: its behavior is synchronous. As the GPF API returns a Promise, the developer does not have to care about this difference.
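As a minimal sketch of its usage (url is an assumed variable; this is not the library's code):

var winHttp = new ActiveXObject("WinHttp.WinHttpRequest.5.1");
winHttp.Open("GET", url, false); // false: synchronous call
winHttp.SetRequestHeader("Accept", "application/json");
winHttp.Send();
// Execution resumes only once the response has been received
WScript.Echo(winHttp.Status + " " + winHttp.ResponseText);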

Test preview
Test preview

Wscript implementation is done inside src/http/wscript.js.

Rhino

Rhino is probably one of the most challenging - and fun - environments because it is based on Java.

The fascinating aspect of Rhino comes from the tight integration between the two languages. Indeed, the JavaScript engine is implemented in Java and you can access any Java class from JavaScript. In terms of language support, it is almost the same as WScript: no timers and a relatively old specification. Here again, the polyfills take care of filling the blanks.

To implement HTTP requests, one has to figure out which part of the Java platform to use. After some investigation (thanks Google), the solution appeared to be the java.net.URL class.

Like with NodeJS, Java streams are used to send or receive data over the connection. Likewise, the library offers a Rhino-specific streams implementation.

Stream reading works by consuming bytes. To read text, a java.util.Scanner instance is used.

Surprisingly, if the status code is in the 5xx range, then getting the response stream will fail and you have to go with the error stream.
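Putting it together, a minimal sketch (url is an assumed variable; this is not the library's implementation) could look like:

var connection = new java.net.URL(url).openConnection();
connection.setRequestMethod("GET");
var status = connection.getResponseCode(),
    stream = status >= 500
        ? connection.getErrorStream() // 5xx: the response stream is not available
        : connection.getInputStream(),
    scanner = new java.util.Scanner(stream).useDelimiter("\\A"),
    responseText = scanner.hasNext() ? String(scanner.next()) : "";
print(status + " " + responseText);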

Test preview
Test preview

Rhino implementation is done inside src/http/rhino.js.

PhantomJS

To put it in a nutshell, PhantomJS is a command line simulating a browser. It is mainly used to script access to web sites and it is the perfect tool for test automation.

But there are basically two styles of PhantomJS scripts:

  • On one hand, it browses a website and simulates what would happen in a browser
  • On the other hand, it is a command line executing some JavaScript code

As a matter of fact, GPF-JS uses those two ways:

  • mocha is used to automate browser testing with PhantomJS
  • a dedicated command line runs the test suite without any web page

As a result, in this environment, the XmlHttpRequest JavaScript class is available.

However, like in a browser, this host is also subject to security concerns. Hence you are not allowed to request a server that is not the one being opened.

Luckily, you can bypass this constraint using a command line parameter: --web-security=false.

Test preview
Test preview

PhantomJS implementation is done inside src/http/xhr.js.

Conclusion

If you survived the whole article, congratulations (and sorry for the broken English).

Now you might be wondering...

What's the point?
What's the point?

Actually, this kind of challenge satisfies my curiosity. I learned a lot by implementing this feature and it was immediately applied to greatly improve the coverage measurement.

Indeed, each host is tested with instrumented files and the collected coverage data is serialized to be later consolidated and reported on. However, as of today, only two hosts support file storage: NodeJS and WScript. But thanks to the HTTP support, all hosts send the coverage data to the fs middleware, which generates the file.

Q.E.D.

Monday, June 12, 2017

Release 0.2.1: Side project support

This new release supports my side projects by implementing html generation and http request helpers. It also improves the coverage measurement and this is the first version to be published again as an NPM package.

New version

Here comes the new version:

NPM Publishing

Starting from this version, the library will be published as an NPM package on every release.

The package already existed since it was first published for version 0.1.4. However, the library has since been redesigned in a way that is not backward compatible. That's the reason why the MINOR version number was increased.

It is violating the normal backward compatibility rule but, actually, nobody was really using it... And I didn't want to increase the MAJOR number to 1 until the library is ready.

An .npmignore file instructs NPM which files should be included or not. The package is almost limited to the build folder.

HTML generation helper

I was watching the excellent funfunfunction video about the hidden costs of templating languages and, at some point, he showed some HTML generation helpers whose syntax amazed me.

Hence, to support my side project, I decided to create my own HTML generation helpers based on this syntax.

For instance:

var div = gpf.web.createTagFunction("div"),
    span = gpf.web.createTagFunction("span");
document.getElementById("placeholder").innerHTML = div({className: "test1"},
    "Hello ",
    span({className: "test2"}, "World!")
).toString();

Sadly, I realized that I completely forgot to document the feature properly. Fortunately, the tests demonstrate it.

HTTP Requests

The prominent feature of this version is the gpf.http.request helper. With it, you can trigger HTTP requests from any supported host using only one API. The response is given back to the script asynchronously through Promises.

The promise is resolved when the server provides a response, whatever the status code.

Some shortcuts are also defined to improve code readability:

gpf.http.get(requestUrl).then(function (response) {
    if (response.status === 500) {
        return Promise.reject(new Error("Internal Error"));
    }
    return process(response.responseText);
})
.catch(function (reason) {
    onError(reason);
});

The supported (and tested) features are:

  • Most common HTTP verbs
  • Possibility to provide request headers
  • Possibility to access response headers
  • Possibility to submit textual payload (on POST and PUT)
  • Access to response text

Other features will be added depending on the needs but this is enough to make it fully functional.

I will soon write an article about this very specific part as I faced a lot of challenges to test and implement it.

Improved coverage

Because each host benefits from its own http request implementation, it was important to assess the code coverage.

Up to this version, the coverage was measured with NodeJS through istanbul and mochaTest. Each piece of host-specific code was flagged to be ignored using the istanbul ignore comment.

However, as the library grew, more and more code was actually not verified.

After analyzing how istanbul generates, stores and consolidates the coverage information, I found a way to run the instrumented code on all hosts. A new grunt task consolidates all data into the global one.

In the end, there are still some branches / instructions that are ignored but they are all documented.

  • Statements ignored: 0.74%
  • Branches ignored: 1.77%
  • Functions ignored: 1.36%

Lessons learned

ECHO service

To be able to test the http helpers, I needed a server which responses could be controlled. I already had some middleware plugged inside the grunt connect task. I decided to create the ECHO service.

It took me a while to figure out the proper coding: for instance, I had to disable caching because Internet Explorer was storing the request results and that was failing the tests.

Also, I had to change the way tests are run to transmit the http port information. This is done through the config global object.

Some code was definitely not tested

Having the coverage measured on almost every line revealed sections of untested code. This led to some simplifications (boot, for instance) but also to new tests.

Incomplete design for ending a stream

In order to prepare the CSV reader, I created a line adapter stream. It implements both the IReadableStream and IWritableStream interfaces. It caches the data being received with write until some line separator is detected.

However, because of caching, it requires a method to flush the buffered data.

As of now, I decided to create a method to explicitly flush the stream.

However, I am still wondering if it may change in the future: if you pipe several streams together, it could be convenient to have a standard way to flush all the different levels. One idea could be to introduce a special token that would be written at the end of a stream but then it would require all streams to implement it.

Next release

For now the project is on pause because of vacations. I will take some time to plan the next iteration more carefully but, still, I have to support my side project by creating a CSV reader as well as a record container.

Saturday, April 29, 2017

Release 0.1.9: Records files

This new release leverages the interface concept and delivers a file storage implementation for NodeJS and WScript. It also introduces the necessary tools for a side project I am currently working on.

New version

Here comes the new version:

Path management

Because this version comes with file management, it all started with path management. Some existing code was waiting to be re-enabled in the library. This was done easily.

The gpf.path helpers consider that the normal path separator is the Unix one. The translation is done whenever necessary and the tests cover each case.

IFileStorage and streams

Three new interfaces were designed in this release:

  • IFileStorage
  • IReadableStream
  • IWritableStream

The purpose is to encapsulate the file system in a flexible way, using Promises to handle asynchronicity. Also, streams were introduced so that reading or writing data is abstracted from the file concept.

The method gpf.fs.getFileStorage retrieves the current host's IFileStorage (if existing).

Reading a file becomes host independent:

function read (path) {
    var iFileStorage = gpf.fs.getFileStorage(),
        iWritableStream = new gpf.stream.WritableString();
    return iFileStorage.openTextStream(path, gpf.fs.openFor.reading)
        .then(function (iReadableStream) {
            return iReadableStream.read(iWritableStream)
                .then(function () {
                    return iFileStorage.close(iReadableStream);
                })
                .then(function () {
                    return iWritableStream.toString();
                });
        });
}

And so is writing to a file:

function write (path, content) {
    var iFileStorage = gpf.fs.getFileStorage();
    return iFileStorage.openTextStream(path, gpf.fs.openFor.appending)
        .then(function (iWritableStream) {
            return iWritableStream.write(content)
                .then(function () {
                    return iFileStorage.close(iWritableStream);
                });
        });
}

Following TDD practice, all the methods were first tested and then implemented for:

  • NodeJS
  • WScript

Some notes:

  • If you need to replace a file content, you must delete it first.
  • File storage, streams and host implementations already existed and were waiting to be re-enabled. However, they used a convoluted notification mechanism that has been dropped in favor of Promises.
  • Some existing code handled binary streams (even with WScript). However, I doubt this would be useful for the coming projects so I removed it. Indeed, the file storage method is named openTextStream.
  • I plan to later implement file storage for rhino and even browsers (with a dedicated backend).

Filters and Sorters

This release was labeled "Record files" because I started a side project in which thousands of records will be manipulated. Handling an array of records requires that you can easily filter or sort them on any property.

As of now, records are supposed to be flat objects.

To make it efficient, the code generation approach was preferred.

Two functions are proposed:

The chosen syntax is documented and lots of examples can be found in the test cases. I also integrated a regular expression operator that allows interesting extractions.

I plan to create some parsers to generate the filter from more readable syntaxes (SQL, LDAP...).

Sorting can be done on any property; two types of comparison are offered:

  • Number (default), where values are compared using a subtraction
  • String, through localeCompare

You may sort on several properties, see the examples.
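To illustrate the code generation approach (this is not the library's API, just a sketch of the technique with hypothetical records), a comparer mixing both comparison types could be generated like this:

// Illustration only: build the body of a comparison function as a string,
// then compile it once with the Function constructor
function buildComparer (specifications) {
    var body = specifications.map(function (spec, index) {
        var diff = spec.type === "string"
            ? "a." + spec.property + ".localeCompare(b." + spec.property + ")"
            : "a." + spec.property + " - b." + spec.property;
        return "var r" + index + " = " + diff + ";"
            + " if (r" + index + ") { return r" + index + "; }";
    }).join("\n");
    /*jslint evil: true*/
    return new Function("a", "b", body + "\nreturn 0;");
}

// Hypothetical usage: sort by age (number) then by name (string)
records.sort(buildComparer([
    { property: "age", type: "number" },
    { property: "name", type: "string" }
]));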

Lessons learned

WScript specific behaviors

WScript has a pretty weak JavaScript implementation: I found two issues that broke some of my tests.

  • Newly created prototypes automatically have the constructor property assigned. I had to remove it for interfaces, see this fix

Tests

The library now has 541 tests. Compared with version 0.1.8, which had 396, that is almost one third more!

In terms of the time required to execute them, it takes only 1.5 seconds to run one round of them with NodeJS. This is still acceptable.

One notable challenge was to test the NodeJS streams wrappers. Some error situations are almost impossible to simulate. With the help of mocking, and a good comprehension of the stream events, the code was secured.

Code coverage

Now that the library offers two different implementations for the file storage object, a big change occurred in the way coverage is measured. Indeed, the source version now loads only what is specific to the host, which prevents adding countless "istanbul ignore" comments. But this also means that some files are not covered anymore.

I plan to fix that in the next version.

Next release

The next release will be dedicated to support my other project:

  • A streamed line reader will be developed (and it may lead to a CSV reader)
  • The library will be published to NPM (automatically when releasing the library)
  • As stated above, the coverage will be re-designed to include other hosts

However, because of the side project, the release frequency may slow down.