In this article, I will introduce you to esprima that allows a deep and precise code analysis.
### Linting is necessary...
In a previous post, I have been talking about the benefits of using
[code linting tools](http://gpf-js.blogspot.ca/2013/11/
why-lint-tool-can-reduce-development.html) as it really saves the developer time
during the coding phase:
* It highlights common mistakes (undeclared variables, unused parameters,
use of [hasOwnProperty](https://developer.mozilla.org/en-US/docs/Web/JavaScript/
Reference/Global_Objects/Object/hasOwnProperty), == vs ===...)
* It maintains code consistency (formatting, naming conventions...)
* It checks code complexity, in particular the [cyclomatic complexity](http://
en.wikipedia.org/wiki/Cyclomatic_complexity)
Regarding JavaScript, in my opinion, one of the best options is to apply
[JSHint](http://www.jshint.com/).
In the following example, several errors are raised because of:
* quotes inconsistencies
* undeclared functions (let say they are declared in other sources)
* missing semicolon
* undeclared or improper variable names
* ...
Move your mouse over the underlined errors to read them *(note that this is
using live JSHint parsing)*.
`(function () {/*gpf:apply-jshint*/
"use strict";
// Handles the click on a specific button
function on_click (ctrl) {
freezeInterface("Please wait, updating information...")
$.ajax ('getInformation.aspx', {
success: function (data) {
resultCode = data.resultCode;
if (resultCode == 0)
updateInterfaceUsingData(data);
}
})
unfreezeInterface();
}
document.getElementById('myButton')
.addEventListener("click", on_click);
})();`
### ...but not sufficient
However, this kind of tool has limits as it tends to apply general validation
rules without considering the meaning of the code that is executed.
To illustrate the problem, let me first fix the JSHint errors:
`(function () {/*gpf:apply-jshint*/
"use strict";
/*global $, freezeInterface, updateInterfaceUsingData, unfreezeInterface*/
// Handles the click on a specific button
function onClick () {
freezeInterface("Please wait, updating information...");
$.ajax("getInformation.aspx", {
success: function (data) {
var resultCode = data.resultCode;
if (resultCode === 0) {
updateInterfaceUsingData(data);
}
}
});
unfreezeInterface();
}
document.getElementById("myButton")
.addEventListener("click", onClick);
})();`
Using the "global" comment, JSHint is configured to be aware of the global
variables to be defined. With this declaration, the code successfully passes
JSHint validation.
But do you see any problem here?
...
no?
...
really?
...
### Incorrect use of the API
The method **[$.ajax](http://api.jquery.com/jquery.ajax/)** comes from the
[jQuery](http://jquery.com/) framework and it encapsulates the necessary code to
handle AJAX calls with the server.
There are two important things to remember about AJAX calls:
* They are asynchronous: this is why callbacks are used to be notified when the
answer comes back from the server.
* Like any function call, they may generate errors: either because the API that
was invoked can't provide the requested result or because a network failure
prevents the call to complete.
Consequently, there are two problems with the way the method **onClick** is
implemented:
* The use of **unfreezeInterface** is done right after the call to **$.ajax**.
It means that it does not even wait for the call to succeed.
* Furthermore, if the call fails, nothing will happen. The error is ignored.
There are some situations where it is acceptable to ignore errors but, in a
general manner, all errors must be handled.
### Understanding the code
To detect such a misconception, one must first locate any function call to
**$.ajax** and then analyze the parameters to verify that it has been used the
right way.
One easy solution could be to consider the whole source as a big string, look
for "$.ajax" and then check that the "success" as well as the "error"
keywords appear just after.
But how reliable is that?
`(function () {/*gpf:apply-jshint*/
"use strict";
/*global $, freezeInterface, updateInterfaceUsingData, unfreezeInterface*/
/*global showError*/
// Handles the click on a specific button
function onClick () {
freezeInterface("Please wait, updating information...");
$.ajax("getInformation.aspx", {
success: function (data) {
var resultCode = data.resultCode;
if (resultCode === 0) {
updateInterfaceUsingData(data);
} else {
// The 'error' keyword is here but inside a string
showError("An error occurred");
}
}
});
unfreezeInterface();
}
document.getElementById("myButton")
.addEventListener("click", onClick);
})();`
In the above example, the "error" keyword appears in a comment and a string.
Furthermore, how can you make sure that no instruction is executed after the
call to **$.ajax**?
### Parsing the code
The only good way to understand the JavaScript code is to parse the source in
order to build a structured representation of the instructions it contains.
Consequently, the **$.ajax** would appear as a function call and it would be
possible to enumerate and check its parameters.
In particular, we could verify that the second one is an object with a member
named "error".
JavaScript parsing is a wide topic... In my [library](https://github.com/
ArnaudBuchholz/gpf-js), I created a [tokenizer](https://github.com/
ArnaudBuchholz/gpf-js/blob/master/tokenizer.js) that is used in this website to
apply syntax coloring but it does not really provides the program structure: it
just identifies the keywords, strings, comments and symbols.
### esprima: a free JavaScript parser
Fortunately, there are existing libraries that do the job for you. I will
focus on [esprima](http://esprima.org/index.html).
To make a long story short, esprima is a JavaScript library that parses
JavaScript sources. It generates a [Sensible syntax tree](http://esprima.org/
doc/index.html#ast) format, compatible with [Mozilla Parser AST](https://
developer.mozilla.org/en-US/docs/Mozilla/Projects/SpiderMonkey/Parser_API).
The previous example is converted into:
`{
"type": "Program",
"body": [
{
"type": "ExpressionStatement",
"expression": {
"type": "CallExpression",
"callee": {
"type": "FunctionExpression",
"id": null,
"params": [],
"defaults": [],
"body": {
"type": "BlockStatement",
"body": [
{
"type": "ExpressionStatement",
"expression": {
"type": "Literal",
"value": "use strict",
"raw": "\"use strict\""
}
},
{
"type": "FunctionDeclaration",
"id": {
"type": "Identifier",
"name": "onClick"
},
"params": [],
"defaults": [],
"body": {
"type": "BlockStatement",
"body": [
{
"type": "ExpressionStatement",
"expression": {
"type": "CallExpression",
"callee": {
"type": "Identifier",
"name": "freezeInterface"
},
"arguments": [
{
"type": "Literal",
"value": "Please wait, updating information...",
"raw": "\"Please wait, updating information...\""
}
]
}
},
{
"type": "ExpressionStatement",
"expression": {
"type": "CallExpression",
"callee": {
"type": "MemberExpression",
"computed": false,
"object": {
"type": "Identifier",
"name": "$"
},
"property": {
"type": "Identifier",
"name": "ajax"
}
},
"arguments": [
{
"type": "Literal",
"value": "getInformation.aspx",
"raw": "\"getInformation.aspx\""
},
{
"type": "ObjectExpression",
"properties": [
{
"type": "Property",
"key": {
"type": "Identifier",
"name": "success"
},
"value": {
"type": "FunctionExpression",
"id": null,
"params": [
{
"type": "Identifier",
"name": "data"
}
],
"defaults": [],
"body": {
"type": "BlockStatement",
"body": [
{
"type": "VariableDeclaration",
"declarations": [
{
"type": "VariableDeclarator",
"id": {
"type": "Identifier",
"name": "resultCode"
},
"init": {
"type": "MemberExpression",
"computed": false,
"object": {
"type": "Identifier",
"name": "data"
},
"property": {
"type": "Identifier",
"name": "resultCode"
}
}
}
],
"kind": "var"
},
{
"type": "IfStatement",
"test": {
"type": "BinaryExpression",
"operator": "===",
"left": {
"type": "Identifier",
"name": "resultCode"
},
"right": {
"type": "Literal",
"value": 0,
"raw": "0"
}
},
"consequent": {
"type": "BlockStatement",
"body": [
{
"type": "ExpressionStatement",
"expression": {
"type": "CallExpression",
"callee": {
"type": "Identifier",
"name": "updateInterfaceUsingData"
},
"arguments": [
{
"type": "Identifier",
"name": "data"
}
]
}
}
]
},
"alternate": {
"type": "BlockStatement",
"body": [
{
"type": "ExpressionStatement",
"expression": {
"type": "CallExpression",
"callee": {
"type": "Identifier",
"name": "showError"
},
"arguments": [
{
"type": "Literal",
"value": "An error occurred",
"raw": "\"An error occurred\""
}
]
}
}
]
}
}
]
},
"rest": null,
"generator": false,
"expression": false
},
"kind": "init",
"method": false,
"shorthand": false
}
]
}
]
}
},
{
"type": "ExpressionStatement",
"expression": {
"type": "CallExpression",
"callee": {
"type": "Identifier",
"name": "unfreezeInterface"
},
"arguments": []
}
}
]
},
"rest": null,
"generator": false,
"expression": false
},
{
"type": "ExpressionStatement",
"expression": {
"type": "CallExpression",
"callee": {
"type": "MemberExpression",
"computed": false,
"object": {
"type": "CallExpression",
"callee": {
"type": "MemberExpression",
"computed": false,
"object": {
"type": "Identifier",
"name": "document"
},
"property": {
"type": "Identifier",
"name": "getElementById"
}
},
"arguments": [
{
"type": "Literal",
"value": "myButton",
"raw": "\"myButton\""
}
]
},
"property": {
"type": "Identifier",
"name": "addEventListener"
}
},
"arguments": [
{
"type": "Literal",
"value": "click",
"raw": "\"click\""
},
{
"type": "Identifier",
"name": "onClick"
}
]
}
}
]
},
"rest": null,
"generator": false,
"expression": false
},
"arguments": []
}
}
]
}`
In particular, the **$.ajax** call is:
`// [...]
{
"type": "ExpressionStatement",
"expression": {
"type": "CallExpression",
"callee": {
"type": "MemberExpression",
"computed": false,
"object": {
"type": "Identifier",
"name": "$"
},
"property": {
"type": "Identifier",
"name": "ajax"
}
},
"arguments": [
{
"type": "Literal",
"value": "getInformation.aspx",
"raw": "\"getInformation.aspx\""
},`
And, at the same level, you find the call to **unfreezeInterface** call:
`// [...]
{
"type": "ExpressionStatement",
"expression": {
"type": "CallExpression",
"callee": {
"type": "Identifier",
"name": "unfreezeInterface"
},
"arguments": []
}
}`
So it is possible to write a program that uses esprima to parse the JavaScript
source and then analyze the result structure to do the necessary checks.
### verify.js
So I created the program [verify.js](https://github.com/ArnaudBuchholz/
ArnaudBuchholz.github.io/blob/master/blog/post/Code%20analysis/verify.js) that
relies on esprima to parse a file and locate the two problems described above.
The main algorithm relies on the structure exploration done in the function
[walk](https://github.com/ArnaudBuchholz/ArnaudBuchholz.github.io/blob/
65e9bf2164a3081a90f92fe2ff4805479880ea6b/blog/post/Code%20analysis/
verify.js#L71) and, once the **$.ajax** structure is detected, the function
[checkAjaxCallbacks](https://github.com/ArnaudBuchholz/ArnaudBuchholz.github.io/
blob/65e9bf2164a3081a90f92fe2ff4805479880ea6b/blog/post/Code%20analysis/
verify.js#L36) takes care of checking the parameters.
To use it, first get it from the GitHub repository
(use the [raw](https://raw.githubusercontent.com/ArnaudBuchholz/
ArnaudBuchholz.github.io/master/blog/post/Code%20analysis/verify.js) button).
Supposing you have [nodeJs](http://nodejs.org/) installed, open a command prompt
and type:
* npm install gpf-js
* npm install esprima
![Setup](https://arnaudbuchholz.github.io/blog/post/Code%20analysis/setup.png)
Then you are ready to go.
Type "node verify"
![Help](https://arnaudbuchholz.github.io/blog/post/Code%20analysis/help.png)
Or download the two samples and test them:
* [sample1.js](https://arnaudbuchholz.github.io/blog/post/Code%20analysis/
sample1.js)
![Sample 1](https://arnaudbuchholz.github.io/blog/post/Code%20analysis/
sample1.png)
* [sample2.js](https://arnaudbuchholz.github.io/blog/post/Code%20analysis/
sample2.js)
![Sample 2](https://arnaudbuchholz.github.io/blog/post/Code%20analysis/
sample2.png)
### To conclude
JavaScript linting with JSHint is necessary. However, you might need a more
advanced tool that is capable of understanding the meaning of the algorithm
to get a deeper validation of your code.
With this article, I tried to demonstrate only a fragment of what can be done
with the AST structure generated by esprima.
But once the door is opened:
* You can also check function signatures or dependencies, keep track of closures,
handle variable types...
* You may also think of generating statistics, documentation...
* Why not modifying the AST structure in order to manipulate the code and
generate a modified version of it: have a look on
[escodegen](https://github.com/estools/escodegen).