
To the memory of Nicolas Classeau, 43, Director of IUT/Marne-la-Vallée,
killed in the Bataclan terrorist attack, November 13, 2015

Series Editor

Jean-Charles Pomerol

JavaScript and Open Data

Robert Jeansoulin


Introduction

I.1. Motivation

Two main facts motivated the writing of this book:

  • – Broad interest in the JavaScript language: the most used on Earth. It is run hundreds of millions of times every day¹: what web page does not use it?

Figure I.1. Programmers’ contributions to the most commonly used computer languages

This widespread use ensures regular maintenance of the code and has driven continuous performance improvements for more than 25 years. In recent years, all browser vendors have adopted several innovations of the JavaScript standard. Figure I.1, published in October 2017, counts programmers’ contributions in several computer languages.

  • – Free access to large volumes of data on the Web:

Besides “proprietary” restricted data, public open data (e.g. United Nations, INSEE, US Census Bureau, etc.) [ICS 16] and freely accessible data, via private providers (e.g. shared Google drive) or non-profits (Wikipedia), are an enormous reservoir of universal information.

These two facts have led us to write this book as a JavaScript programming manual, oriented toward open data, with insight into combining web data and displaying them. Since 2015, the wide adoption of recent JavaScript standards has encouraged their use and greatly facilitates new coding practices, as described in this manual.

Data represent the common heritage of humanity; everyone should have usable and free fetching tools. Every citizen who is curious about data and has a taste for technology may become a “data scientist”, an eager amateur, able to study the data, texts or figures of their own focused hobby field. In the professional field too, a political science student, a journalist [NICAR] or the person in charge of an association will find many useful and relevant tools in this book.

I.2. Organization of the book

The rest of this introduction presents a history of the language for a better understanding of its evolution, a demystification of prejudices, a list of prerequisites and useful tools, and a list of the main features of JavaScript to introduce what the following parts will detail.

  • – Part 1: This part presents the basics of the language: variables, instructions, tests, string processing, arrays, objects and functions. It also details the specific aspects of JavaScript, its originality in the world of object languages and how it can answer most of the data processing tasks that we highlight in this book. We conclude with some examples of programming by “patterns”.
  • – Part 2: This presents JavaScript in “the ecosystem of the web page”, which is composed of the HTTP protocol, the HTML code, the CSS rules and JavaScript as the scripting language. The interface with the elements and events of the HTML DOM (“document object model”) allows for the dynamic enrichment of the page on the “client side”. Ajax technology allows for the addition of data extracted from the Internet. We address the issues of asynchronous data processing.
  • – Part 3: This is dedicated to deploying applications:
    • - accessing open data, free data, combining data from multiple sources by asynchronous “join”;
    • - displaying digital data in graphical plots, animating vector data, cartographic representation;
    • - creating JSON files from spreadsheet software, using JSONP delivery tools or data directly accessible via different APIs, in order to convert much of the data found on the Internet into data ready for your applications.

I.3. The history of JavaScript

This historical note about the birth and life of JavaScript demonstrates another motivation for learning this language: on the computer science scale, it is an “old” language (more than 25 years old). How can we explain such longevity? What significant assets allowed it to last?

  • – 1993: Release of the web browser Mosaic (which made the World Wide Web popular) by the US NCSA (National Center for Supercomputing Applications).
  • – 1995: Mosaic becomes Netscape (then 90% of the market), which asks Brendan Eich to build a scripting language for its Navigator browser, mimicking Java, released a few months earlier by Sun. Within 2 weeks, the job is done: the language draws on Self (Xerox PARC) and relies on “prototypes” rather than “classes” as in Java (a choice that helped meet the deadline).
  • – 1996: Netscape submits JavaScript to the standards body ECMA. Microsoft reacts by developing JScript for Internet Explorer (version 3).
  • – 2006: W3C specification of the object XMLHttpRequest in order to standardize the use of the Ajax technology on the web.
  • – 2008: V8, the open source JavaScript engine of Google Chrome.
  • – 2009: ES5, first version to be adopted in all major browsers.
  • – 2009: Node.js (Ryan Dahl): JavaScript fully implemented server side.
  • – 2010: V8 optimizations spur a performance competition between browsers.
  • – 2015: ES6 brings important innovations: “let”, “const” declarations, Object.assign method, etc. supported by all the recent browsers.

Table I.1. History of the versions of JavaScript

Year | Name/alias | Description
1998 | ECMAScript 2 | Editorial changes only
1999 | ECMAScript 3/ES3 | Added regular expressions, try/catch statement
(none) | ECMAScript 4 | Never released
2009 | ECMAScript 5/ES5 | Added JSON support, Object.create
2015 | ECMAScript 6/ES6/ECMAScript2015 | Added let, const, Object.assign, arrow syntax, template syntax, spread and rest syntax. Also: classes, modules
2016 | ECMAScript 7 | Added exponentiation operator (**), Array.prototype.includes

I.3.1. Analyzing this biography of JavaScript

We can notice several very distinct periods:

  • – initial success, 1997–1999: evidence of the interest in enriching interactive and dynamic web pages (dynamic HTML, named DHTML);
  • – stagnation decade, 1999–2009: different versions developed by different browsers, unsuccessful attempts to develop JavaScript on the server side. The design and development of jQuery was the survival response: providing a single access gate to JavaScript (“code once, run everywhere”);
  • – revival, 2009: the release of Ajax technology, fast compilation with V8, the design of the “off-web” Node.js and the release of the JSON object: all these innovations revived JavaScript standardization;
  • – JavaScript as a generalist language, 2015 to present day: the broad adoption² of ES5, then ES6, makes it possible to overcome many coding pitfalls and brings performance gains, making JavaScript engines suitable even for video and animation.

I.4. To code without “var”, nor “for”, nor “new”

JavaScript is a living language, which has adapted to the evolving uses of the Internet (e.g. video, social networks, targeted advertising, etc.). More and more web applications make massive use of JavaScript on the client side, and even on the server side, since the release of Node.js and V8.

I.4.1. Comments

The foundations of JavaScript are unchanged, but two evolutions deeply modify todayʼs coding practices:

  • – the main uses are no longer the same: interactive HTML on the web page has become marginal, while processing data sources from the Internet via Ajax requests is now the main concern;
  • – recent innovations (ES5, ES6) in the “Core ECMAScript” allow us to better take advantage of the prototypal approach and the functional nature of the language.

I.4.2. Deliberate bias of this book

We no longer code JavaScript in 2018 as we did before 2015. Using ES6 allows JavaScript to unleash its qualities rather than its faults:

  • – JavaScript objects are prototype based. Use prototypes and avoid “new”;
  • – JavaScript functions = first-class objects: code functionally! Avoid “for” loops;
  • – use the better controlled variable declarations! Ban “var”.

And your code will be shorter, more readable, and much, much easier to debug.
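As a minimal sketch of these three rules applied together, consider the lines below. The data, the names (records, cityProto, city) and the figures are invented for illustration; they do not come from a real source.

```javascript
// Hypothetical sample data: yearly population (in millions) of a fictional city.
const records = [
  { year: 2015, population: 2.1 },
  { year: 2016, population: 2.2 },
  { year: 2017, population: 2.4 },
];

// No "new": a plain object serves as the prototype of other objects.
const cityProto = {
  describe() {
    return `${this.name}: ${this.latest} M inhabitants`;
  },
};

// No "for": array methods replace the index-based loop.
const latest = records
  .filter((r) => r.year >= 2016)
  .map((r) => r.population)
  .reduce((max, p) => Math.max(max, p), 0);

// No "var": "const" for every binding that is never reassigned.
const city = Object.assign(Object.create(cityProto), {
  name: "Sampleville",
  latest,
});

console.log(city.describe()); // "Sampleville: 2.4 M inhabitants"
```

Each of the three habits is detailed in Part 1; this is only a first taste of the style.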

I.4.3. Prerequisites

Big Data should not be the “preserve of the big actors”: everybody with a browser, an Internet connection and some self-training can work with Big Data. Everybody? Yes, as long as they have at least a few basic notions of:

  • – the Internet and the World Wide Web (WWW);
  • – HyperText Markup Language (HTML) and a minimal knowledge of Cascading Style Sheets (CSS);
  • – the “Developer” tools of any browser (this bookʼs examples are checked with the open-source Firefox).

I.4.4. Some useful, easy, and free programming tools

Your browser knows how to interpret JavaScript (it is the training tool) and most of the time without the need for an Internet connection:

  • – to display the result of some script (see “Part 2”): write a simple web page including that script and use the “Web Console” or “Browser Console”.
  • – the browser “ScratchPad” (checked with Firefox) is useful for quickly testing a few lines of code: it displays console.log results and variable values in comments.

There are also several “online tools” to help us:

  • – W3Schools: allows HTML and JavaScript code to be tested together, and provides a convenient tutorial and most API references;
  • – JSBin and JSFiddle: these are popular among developers and provide a similar context, in which you can save your code more easily;
  • – Thimble (Mozilla): makes it possible to build cooperative projects if you are coding in a team;
  • – JSLint: provides online static analysis of your JavaScript (see also ESLint).

Most of the code “bins” of this book have been tried with W3Schools.

I.5. Mechanisms and features of the script language

I.5.1. JavaScript is interpreted and run within an ecosystem

JavaScript is a scripting language, whose interpretation and execution depend on a script engine that requires a “host”, an environment providing basic objects, events and resources.

We must distinguish the “core JavaScript”, common foundations of the language, and the embedded JavaScript that includes the objects of a specific environment (e.g. “client-side JavaScript”).

Within the “web page ecosystem”, the environment is the browser (the “window” object). We may also find the “workers”, provided by the browser but independent of the web page, and the ecosystem of the Node.js modules, which is totally independent of any browser.


Figure I.2. JavaScript needs a host environment
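To make the distinction between core and host objects concrete, here is a rough sketch that guesses the host from the global objects it exposes. The function name detectHost and the returned labels are our own, and the tests are simple heuristics (for instance, importScripts only exists in classic workers), not an official detection method.

```javascript
// Guess the host environment from the global objects it provides.
// The labels returned here are illustrative, not standard identifiers.
function detectHost() {
  if (typeof window !== "undefined" && typeof document !== "undefined") {
    return "browser page";
  }
  if (typeof importScripts === "function") {
    return "web worker";
  }
  if (
    typeof process !== "undefined" &&
    process.versions &&
    process.versions.node
  ) {
    return "Node.js";
  }
  return "unknown host";
}

// Core JavaScript objects exist whatever the host is.
const coreObjects = [typeof Object, typeof Array, typeof JSON, typeof Math];

console.log(detectHost());
console.log(coreObjects); // [ 'function', 'function', 'object', 'object' ]
```

The same script gives a different first line in a web page and in Node.js, while the second line never changes: that second line is the “core JavaScript”.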

I.5.2. What does a JavaScript engine do?

The script engine analyzes then runs the code in the hosting environment, which is named the “Global Object”. As soon as the engine starts, it executes two successive tasks:

  • – the lexical analysis (“lexical-time” or “read-time”) and production of a machine bytecode;
  • – the execution of the machine code (“run-time”). The V8 engine compiles (full-codegen), then optimizes in real time (crankshaft) to improve performance.

From the programmer’s viewpoint, this means:

  • – the lexical declarations are parsed first, and the variable values are initialized to “undefined”, whose type is “undefined”. Try these lines:
    let x; // declaration without explicit definition (no assignment)
    console.log( x );        // -> undefined
    console.log( typeof x ); // -> "undefined"
  • – assignments are processed at run-time: definition of variables, typing, evaluation of expressions, etc., that is, whatever appears on the right-hand side of an assignment sign “=”.

I.5.3. Variables and instructions: the functionalities of an “imperative language”

The lexical analysis lists the variables and their “scope” (where they are known). The run-time defines the variables (gives them a defined value) and also determines their type: the type of value. We may name this “dynamic typing”, rather than “weak typing”. The values can be: primitive values (numbers, strings, boolean constants), evaluated expressions, references to objects, arrays, functions or regular expressions.
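The following lines give a small illustration of dynamic typing: the same variable changes type each time a new value is assigned at run-time. The variable names (measure, types) are chosen for this example only.

```javascript
let measure;                    // declared, not yet defined
const types = [typeof measure]; // "undefined"

measure = 42;                   // the assigned value decides the type
types.push(typeof measure);     // "number"

measure = "forty-two";          // same variable, now a string
types.push(typeof measure);     // "string"

measure = [4, 2];               // an array is an object
types.push(typeof measure);     // "object"

measure = (n) => n * 2;         // a function is a value too
types.push(typeof measure);     // "function"

console.log(types);
// [ 'undefined', 'number', 'string', 'object', 'function' ]
```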

The instructions can be assignments, function calls and classic control structures such as loops or conditional instructions.

I.5.4. Objects: functionalities of a “prototype-based object-oriented language”

The objects can be “built-in” (Object, Function, Array, Date, Math, JSON, etc.) or provided by the “host ecosystem”, such as the DOM elements (HTML document), or built by the application.

JavaScript makes no distinction between “class” and “instance” notions: there are only objects and any object may become the “prototype” used to create new objects. We may sum up the definition of a JavaScript object as “a set of properties plus one prototype”.
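The idea of “a set of properties plus one prototype” can be sketched with the built-in Object.create method. The objects below (dataset, census, extract) and their contents are hypothetical examples.

```javascript
// A plain object that will serve as a prototype.
const dataset = {
  source: "unknown",
  label() {
    return `${this.title} (source: ${this.source})`;
  },
};

// A new object whose prototype is "dataset": no class, no constructor.
const census = Object.create(dataset);
census.title = "Population 2017"; // own property
census.source = "INSEE";          // own property, hides the prototype's value

// Any object can in turn become the prototype of another one.
const extract = Object.create(census);
extract.title = "Population 2017, Paris only";

console.log(extract.label());
// "Population 2017, Paris only (source: INSEE)"
console.log(Object.getPrototypeOf(extract) === census); // true
console.log(Object.keys(extract));                      // [ 'title' ]
```

The object extract owns a single property; everything else (source, label) is found by walking up its chain of prototypes.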

I.5.5. Functions as “first-class objects”: the functionalities of a “functional language”

Functions and objects behave in the same way, except that only a function can be invoked, or called. Any operation supported by an object is supported by a function: this makes it possible to build a “higher order function” and underpins the functional nature of JavaScript:

const mult = function(f,g){return function(n){return f(n)*g(n);}},
     square = function(x) { return x * x; };
mult(square, square)(3);     // -> 81

I.6. Conclusion

JavaScript was born for the web, and because of the web, JavaScript is still alive 25 years later: this is the motivation for writing this book. A small language, designed within 2 weeks by only one person, has proven to be surprisingly flexible. This flexibility is both an advantage, which allowed evolution and compliance with successive versions, and a drawback, in that it does not protect enough against programming errors, which are hard to detect and debug. JavaScript has been described as the “most misunderstood language” by one of its main “gurus”, Doug Crockford.

Too much tolerance, too little control, no “classes” for the objects, a syntax much too similar to the style of imperative and procedural languages: these are all deceptive pitfalls. By contrast, the initial choice of functions as “first-class objects” gives JavaScript the abilities of a functional language: a strength that is neglected by many programmers.

Recent standardization efforts provide new ways to avoid most pitfalls: we can code without var declarations (avoiding many “hoisting” traps), without for loops (avoiding index troubles), and almost without the “new” operator (for a more direct use of prototypes). This leads to writing more readable and easier-to-debug code.

The surprising book “If Hemingway wrote JavaScript” (see [CRO 14]) shows how a piece of JavaScript code can be rendered unrecognizable, in several different ways, while still carrying out the same operation.

PART 1
Core JavaScript

Introduction to Part 1

This part deals with the fundamentals of JavaScript, whatever the environment in which it is hosted. Most of the time, JavaScript is associated with a browser, but it is a language by itself. We use the term “Core JavaScript” or “ECMAScript” to mean what is “pure JavaScript”, in contrast with what is added by the environment into which JavaScript is embedded.

There is no need to be a JavaScript expert: good applications can be quickly programmed, provided that you adopt some “good practices”. Code written without good practices (aka “anti-patterns”) may seem correct while being silently error prone, with errors that are difficult to spot. That is why you will find “Recommendations” paragraphs throughout the chapters, suggesting ways to encode “patterns”. ES5 and ES6 standards facilitate this type of coding.

Part 1 can be used as a manual: either going directly to the section that addresses an immediate problem, or studying a chapter in more depth, for instance for the sensitive issues of the language (e.g. prototypes, closures). Here is a quick tour of the features of this amazing language:

– Variables: declaration, definition, types (ban every “var”):

Here, we insist on the subtle distinction between the status of “declared” and “defined” variables, which deserves particular attention. In this chapter, the creation of the tree structure of “variable scopes”, the implicit “hoisting” and the (dangerous) implicit declaration of global variables are presented. The ES6 version provides new declaration keywords, which can avoid many (silent) causes of error.
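As a foretaste, the sketch below contrasts the silent behavior of “var” with the explicit errors raised by the newer keywords. The function and variable names are invented for this illustration; the third function relies on the “use strict” directive, which turns an implicit global declaration into an error.

```javascript
// With "var", the declaration is hoisted to the top of the function:
// the name already exists (as undefined) before the line that defines it.
function withVar() {
  const before = typeof counter; // "undefined", and no error
  var counter = 1;
  return [before, typeof counter];
}

// With "let", using the name before its declaration raises an error.
function withLet() {
  try {
    const before = typeof counter; // throws: "counter" is not initialized yet
    let counter = 1;
    return [before, typeof counter];
  } catch (err) {
    return err.name;
  }
}

// Assigning to a name that was never declared: in strict mode this is
// an error instead of the silent creation of a global variable.
function strictAssign() {
  "use strict";
  try {
    undeclaredTotal = 10; // hypothetical name, declared nowhere
    return "a global was created silently";
  } catch (err) {
    return err.name;
  }
}

console.log(withVar());      // [ 'undefined', 'number' ]
console.log(withLet());      // "ReferenceError"
console.log(strictAssign()); // "ReferenceError"
```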

– Controls: booleans, tests and loops (replace “for” loops by array methods):

JavaScript looks classically procedural when it comes to controlling the status of variables. But several operators may react surprisingly, due to their “polymorphism”: they behave differently depending on the type of their operands (e.g. number or string), and implicit (silent) type conversions may occur. We emphasize such traps, and suggestions are provided to avoid them.
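A few expressions are enough to show this polymorphism at work. The property names of the results object below are ours, chosen to label each case.

```javascript
const results = {
  addNumbers: 1 + 2,          // 3     numeric addition
  addMixed: 1 + "2",          // "12"  the number is silently converted to a string
  subtractMixed: "5" - 2,     // 3     "-" only applies to numbers: the string is converted
  multiplyStrings: "3" * "4", // 12    both strings are converted to numbers
  looseEqual: 0 == "",        // true  "==" converts before comparing
  strictEqual: 0 === "",      // false "===" compares type and value
};

console.log(results);
```

A simple way to stay out of trouble is to convert explicitly (for instance with Number or String) and to compare with “===”.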

– Data: characters used as numbers, and strings and dates:

An appropriate format is required to represent quantitative (numbers) or qualitative data (names, texts, dates). Any Unicode character can be used, and digits can appear within numbers, names or dates: we detail some issues related to type conversions and value comparisons, which are among the tricky points.
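The sketch below hints at two of these tricky points: digits stored as strings do not compare like numbers, and two dates representing the same instant are still two distinct objects. The values are invented for the example.

```javascript
// Figures read from a text source usually arrive as strings.
const raw = ["10", "9", "100"];

// Compared as strings, the order is character by character.
const asStrings = [...raw].sort();
console.log(asStrings);        // [ '10', '100', '9' ]
console.log("10" < "9");       // true (string comparison)

// Converted to numbers first, the order is the expected one.
const asNumbers = raw.map(Number).sort((a, b) => a - b);
console.log(asNumbers);        // [ 9, 10, 100 ]
console.log(10 < 9);           // false (numeric comparison)

// Two Date objects for the same instant are two different objects.
const first = new Date("2017-10-01T00:00:00Z");
const second = new Date("2017-10-01T00:00:00Z");
const sameObject = first === second;                      // false
const sameInstant = first.getTime() === second.getTime(); // true
console.log(sameObject, sameInstant);
```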

– Objects (restrict “new” to built-in or API objects):

The construction of specific objects is required to structure linked data into meaningful information and to assign it the appropriate methods. The objects in JavaScript are treated rather differently than in most object-oriented languages. We focus on the innovations of ES6 that provide better tools to build objects in the “prototype” way.

– Arrays (get rid of “for” loops):

To handle time series or relational data, tables and matrices are required. We detail the most useful methods, added to the JavaScript Array object since ES5 or ES6, that make it possible to avoid the loops, rewriting them in a “functional” code style.
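As an illustration of this “functional” style, here is a short series processed without a single loop. The series and its values are hypothetical.

```javascript
// Hypothetical monthly series.
const series = [
  { month: "Jan", value: 12 },
  { month: "Feb", value: 7 },
  { month: "Mar", value: 15 },
  { month: "Apr", value: 9 },
];

// reduce: fold the whole series into a single value.
const total = series.reduce((sum, p) => sum + p.value, 0);

// filter + map: keep some elements, then transform them.
const highMonths = series
  .filter((p) => p.value > 10)
  .map((p) => p.month);

// find (ES6): first element matching a condition.
const firstLow = series.find((p) => p.value < 10);

// some: does at least one element match?
const hasZero = series.some((p) => p.value === 0);

console.log(total);          // 43
console.log(highMonths);     // [ 'Jan', 'Mar' ]
console.log(firstLow.month); // "Feb"
console.log(hasZero);        // false
```

No index variable appears anywhere, which removes a whole family of off-by-one errors.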

– Functions (do program functionally, as often as possible):

Functions are “first-class objects” in JavaScript. This is probably the most important part of this language, and the key reason for its efficiency and longevity.