Wednesday, November 25, 2015

V8 Release 4.8

Roughly every six weeks, we create a new branch of V8 as part of our release process. Each version is branched from V8’s git master immediately before Chrome branches for a Chrome Beta milestone. Today we’re pleased to announce our newest branch, V8 version 4.8, which will be in beta until it is released in coordination with Chrome 48 Stable. V8 4.8 contains a handful of developer-facing features, so we’d like to give you a preview of some of the highlights in anticipation of the release in several weeks.

Improved ECMAScript 2015 (ES6) support

This release of V8 provides support for two well-known symbols, built-in symbols from the ES2015 spec that allow developers to leverage several low-level language constructs which were previously hidden.

@@isConcatSpreadable

The name for a Boolean valued property that if true indicates an object should be flattened to its array elements by Array.prototype.concat.

(function(){
"use strict";
class AutomaticallySpreadingArray extends Array {
   get [Symbol.isConcatSpreadable]() {
      return true;
   }
}
let first = [1];
let second = new AutomaticallySpreadingArray();
second[0] = 2;
second[1] = 3;
let all = first.concat(second);
// Outputs [1, 2, 3]
console.log(all);
}());

@@toPrimitive

The name for a method to invoke on an object for implicit conversions to primitive values.

(function(){
"use strict";
class V8 {
   [Symbol.toPrimitive](hint) {
      if (hint === 'string') {
         console.log('string');
         return 'V8';
      } else if (hint === 'number') {
         console.log('number');
         return 8;
      } else {
         console.log('default:' + hint);
         return 8;
      }
   }
}

let engine = new V8();
console.log(Number(engine));
console.log(String(engine));
}());

ToLength

The ES2015 spec adjusts the abstract operation for type conversion to convert an argument to an integer suitable for use as the length of an array-like object.  (While not directly observable, this change might be indirectly visible when dealing with array-like objects with negative length.)

V8 API

Please check out our summary of API changes. This document gets regularly updated a few weeks after each major release.

Developers with an active V8 checkout can use 'git checkout -b 4.8 -t branch-heads/4.8' to experiment with the new features in V8 4.8. Alternatively you can subscribe to Chrome's Beta channel and try the new features out yourself soon.

Posted by the V8 team

Friday, October 30, 2015

Jank Busters Part One

Jank, or in other words visible stutters, can be noticed when Chrome fails to render a frame within 16.66ms (disrupting 60 frames per second motion). As of today most of the V8 garbage collection work is performed on the main rendering thread, c.f. Figure 1, often resulting in jank when too many objects need to be maintained. Eliminating jank has always been a high priority for the V8 team [1, 2, 3]. In this blog post we will discuss a few optimizations that were implemented between M41 and M46 which significantly reduce garbage collection pauses resulting in better user experience.

Figure 1: Garbage collection performed on the main thread.

A major source of jank during garbage collection is processing various bookkeeping data structures. Many of these data structures enable optimizations that are unrelated to garbage collection. Two examples are the list of all ArrayBuffers, and each ArrayBuffer’s list of views. These lists allow for an efficient implementation of the DetachArrayBuffer operation without imposing any performance hit on access to an ArrayBuffer view. In situations, however, where a web page creates millions of ArrayBuffers, (e.g., WebGL-based games), updating those lists during garbage collection causes significant jank. In M46, we removed these lists and instead detect detached buffers by inserting checks before every load and store to ArrayBuffers. This amortizes the cost of walking the big bookkeeping list during GC by spreading it throughout program execution resulting in less jank. Although the per-access checks can theoretically slow down the throughput of programs that heavily use ArrayBuffers, in practice V8's optimizing compiler can often remove redundant checks and hoist remaining checks out of loops, resulting in a much smoother execution profile with little or no overall performance penalty.

Another source of jank is the bookkeeping associated with tracking the lifetimes of objects shared between Chrome and V8. Although the Chrome and V8 memory heaps are distinct, they must be synchronized for certain objects, like DOM nodes, that are implemented in Chrome's C++ code but accessible from JavaScript. V8 creates an opaque data type called a handle that allows Chrome to manipulate a V8 heap object without knowing any of the details of the implementation. The object's lifetime is bound to the handle: as long as Chrome keeps the handle around, V8's garbage collector won't throw away the object. V8 creates an internal data structure called a global reference for each handle it passes back out to Chrome through the V8 API, and these global references are what tell V8's garbage collector that the object is still alive. For WebGL games, Chrome may create millions of such handles, and V8, in turn, needs to create the corresponding global references to manage their lifecycle. Processing these huge amounts of global references in the main garbage collection pause is observable as jank. Fortunately, objects communicated to WebGL are often just passed along and never actually modified, enabling simple static escape analysis. In essence, for WebGL functions that are known to usually take small arrays as parameters the underlying data is copied on the stack, making a global reference obsolete. The result of such a mixed approach is a reduction of pause time by up to 50% for rendering-heavy WebGL games.

Most of V8’s garbage collection is performed on the main rendering thread. Moving garbage collection operations to concurrent threads reduces the waiting time for the garbage collector and further reduces jank. This is an inherently complicated task since the main JavaScript application and the garbage collector may simultaneous observe and modify the same objects. Until now, concurrency was limited to sweeping the old generation of the regular object JS heap. Recently, we also implemented concurrent sweeping of the code and map space of the V8 heap. Additionally, we implemented concurrent unmapping of unused pages to reduce the work that has to be performed on the main thread, c.f. Figure 2.

Figure 2: Some garbage collection operations performed on the concurrent garbage collection threads.

The impact of the discussed optimizations is clearly visible in WebGL-based games, for example Turbolenz' Oort Online demo. The following video compares Chrome M41 to M46


We are currently in the process of making more garbage collection components incremental, concurrent, and parallel, to shrink garbage collection pause times on the main thread even further. Stay tuned as we have some interesting patches in the pipeline.

Posted by the jank busters: Jochen Eisinger, Michael Lippautz, and Hannes Payer

Wednesday, October 14, 2015

V8 Release 4.7

Roughly every six weeks, we create a new branch of V8 as part of our release process. Each version is branched from V8’s git master immediately before Chrome branches for a Chrome Beta milestone. Today we’re pleased to announce our newest branch, V8 version 4.7, which will be in beta until it is released in coordination with Chrome 47 Stable. V8 4.7 is filled will all sorts of developer-facing goodies, so we’d like to give you a preview of some of the highlights in anticipation of the release in several weeks.

Improved ECMAScript 2015 (ES6) support

Rest operator

The Rest operator enables the developer to pass an indefinite number of arguments to a function. It is similar to the arguments object.

//Without Rest operator
function concat() {
 var args = Array.prototype.slice.call(arguments, 1);
 return args.join("");
} 

//With Rest operator
function concatWithRest(...strings) {
 return strings.join("");
}

Support for upcoming ES features

Array.prototype.includes

The new Array.prototype.includes method is a new feature that is currently a stage 3 proposal for inclusion in ES2016. It provides a terse syntax for determining whether or not an element is in a given array by returning a boolean value.

[1, 2, 3].includes(3) // true
["apple", "banana", "cherry"].includes("apple") // true
["apple", "banana", "cherry"].includes("peach") // false

Ease the pressure on memory while parsing

Recent changes to the V8 parser greatly reduce the memory consumed by parsing files with large nested functions. In particular, this allows V8 to run larger asm.js modules than previously possible.

V8 API

Please check out our summary of API changes. This document gets regularly updated a few weeks after each major release. Developers with an active V8 checkout can use 'git checkout -b 4.7 -t branch-heads/4.7' to experiment with the new features in V8 4.7. Alternatively you can subscribe to Chrome's Beta channel and try the new features out yourself soon.

Posted by the V8 team

Friday, September 25, 2015

Custom startup snapshots

The JavaScript specification includes a lot of built-in functionality, from math functions to a full-featured regular expression engine. Every newly-created V8 context has these functions available from the start. For this to work, the global object (for example, the window object in a browser) and all the built-in functionality must be set up and initialized into V8’s heap at the time the context is created. It takes quite some time to do this from scratch.

Fortunately, V8 uses a shortcut to speed things up: just like thawing a frozen pizza for a quick dinner, we deserialize a previously-prepared snapshot directly into the heap to get an initialized context. On a regular desktop computer, this can bring the time to create a context from 40 ms down to less than 2 ms. On an average mobile phone, this could mean a difference between 270 ms and 10 ms.

Applications other than Chrome that embed V8 may require more than vanilla Javascript. Many load additional library scripts at startup, before the “actual” application runs. For example, a simple TypeScript VM based on V8 would have to load the TypeScript compiler on startup in order to translate TypeScript source code into JavaScript on-the-fly.

As of the release of V8 4.3 two months ago, embedders can utilize snapshotting to skip over the startup time incurred by such an initialization.. The test case for this feature shows how this API works.

To create a snapshot, we can call v8::V8::CreateSnapshotDataBlob with the to-be-embedded script as a null-terminated C string. After creating a new context, this script is compiled and executed. In our example, we create two custom startup snapshots, each of which define functions on top of what JavaScript already has built in.

We can then use v8::Isolate::CreateParams to configure a newly-created isolate so that it initializes contexts from a custom startup snapshot. Contexts created in that isolate are exact copies of the one from which we took a snapshot. The functions defined in the snapshot are available without having to define them again.

There is an important limitation to this: the snapshot can only capture V8’s heap. Any interaction from V8 with the outside is off-limits when creating the snapshot.  Such interactions include:

  • defining and calling API callbacks (i.e. functions created via v8::FunctionTemplate)
  • creating typed arrays, since the backing store may be allocated outside of V8
And of course, values derived from sources such as Math.random or Date.now are fixed once the snapshot has been captured. They are no longer really random nor reflect the current time.

Limitations aside, startup snapshots remain a great way to save time on initialization. We can shave off 100ms from the startup spent on loading the TypeScript compiler in our example above (on a regular desktop computer). We're looking forward to seeing how you might put custom snapshots to use!

Posted by Yang Guo, Software Engineer and engine pre-heater supplier

Friday, August 28, 2015

V8 Release 4.6

Roughly every six weeks, we create a new branch of V8 as part of our release process. Each version is branched from V8’s git master immediately before Chrome branches for a Chrome Beta milestone. Today we’re pleased to announce our newest branch, V8 version 4.6, which will be in beta until it is released in coordination with Chrome 46 Stable. V8 4.6 is filled will all sorts of developer-facing goodies, so we’d like to give you a preview of some of the highlights in anticipation of the release in several weeks.

Improved ECMAScript 2015 (ES6) support

V8 4.6 adds support for several ECMAScript 2015 (ES6) features.

Spread operator

The spread operator makes it much more convenient to work with arrays. For example it makes imperative code obsolete when you simply want to merge arrays.

// Merging arrays
// Code without spread operator
let inner = [3, 4];
let merged = [0, 1, 2].concat(inner, [5]);

// Code with spread operator
let inner = [3, 4];
let merged = [0, 1, 2, ...inner, 5];
Another good use of the spread operator to replace apply().

// Function parameters stored in an array
// Code without spread operator
function myFunction(a, b, c) {
   console.log(a);
   console.log(b);
   console.log(c);
}
let argsInArray = ["Hi ", "Spread ", "operator!"];
myFunction.apply(null, argsInArray);

// Code with spread operator
function myFunction (a,b,c) {
   console.log(a);
   console.log(b);
   console.log(c);
}

let argsInArray = ["Hi ", "Spread ", "operator!"];
myFunction(...argsInArray);

new.target

new.target is one of ES6's features designed to improve working with classes. Under the hood it’s actually an implicit parameter to every function. If a function is called with the keyword new, then the parameter holds a reference to the called function. If new is not used the parameter is undefined.

In practice, this means that you can use new.target to figure out whether a function was called normally or constructor-called via the new keyword.

function myFunction() {
   if (new.target === undefined) {
      throw "Try out calling it with new.";
   }
   console.log("Works;");
}

// Will break
myFunction();

// Will work
let a = new myFunction();
When ES6 classes and inheritance are used, new.target inside the constructor of a super-class is bound to the derived constructor that was invoked with new. In particular, this gives super-classes access to the prototype of the derived class during construction.

Reduce the jank

Jank can be a pain, especially when playing a game. Often, it's even worse when the game features multiple players. oortonline.gl is a WebGL benchmark that tests the limits of current browsers by rendering a complex 3D scene with particle effects and modern shader rendering. The V8 team set off in a quest to push the limits of Chrome’s performance in these environments. We’re not done yet, but the fruits of our efforts are already paying off. Chrome 46 shows incredible advances in oortonline.gl performance which you can see yourself below.



Some of the optimizations include:

The good thing is that all changes related to oortonline.gl are general improvements that potentially affect all users of applications that make heavy use of WebGL.

V8 API

Please check out our summary of API changes. This document gets regularly updated a few weeks after each major release.

Developers with an active V8 checkout can use 'git checkout -b 4.6 -t branch-heads/4.6' to experiment with the new features in V8 4.6. Alternatively you can subscribe to Chrome's Beta channel and try the new features out yourself soon.


Posted by the V8 team

Friday, August 7, 2015

Getting Garbage Collection for Free

JavaScript performance continues to be one of the key aspects of Chrome's values, especially when it comes to enabling a smooth experience. Starting in Chrome 41, V8 takes advantage of a new technique to increase the responsiveness of web applications by hiding expensive memory management operations inside of small, otherwise unused chunks of idle time. As a result, web developers should expect smoother scrolling and buttery animations with much reduced jank due to garbage collection.

Many modern language engines such as Chrome's V8 JavaScript engine dynamically manage memory for running applications so that developers don't need to worry about it themselves. The engine periodically passes over the memory allocated to the application, determines which data is no longer needed, and clears it out to free up room. This process is known as garbage collection.

In Chrome, we strive to deliver a smooth, 60 frames per second (FPS) visual experience. Although V8 already attempts to perform garbage collection in small chunks, larger garbage collection operations can and do occur at unpredictable times -- sometimes in the middle of an animation -- pausing execution and preventing Chrome from hitting that 60 FPS goal.

Chrome 41 included a task scheduler for the Blink rendering engine which enables prioritization of latency-sensitive tasks to ensure Chrome remains responsive and snappy. As well as being able to prioritize work, this task scheduler has centralized knowledge of how busy the system is, what tasks need to be performed and how urgent each of these tasks are. As such, it can estimate when Chrome is likely to be idle and roughly how long it expects to remain idle.

An example of this occurs when Chrome is showing an animation on a web page. The animation will update the screen at 60 FPS, giving Chrome around 16.6 ms of time to perform the update. As such, Chrome will start work on the current frame as soon as the previous frame has been displayed, performing input, animation and frame rendering tasks for this new frame. If Chrome completes all this work in less than 16.6 ms, then it has nothing else to do for the remaining time until it needs to start rendering the next frame. Chrome’s scheduler enables V8 to take advantage of this idle time period by scheduling special idle tasks when Chrome would otherwise be idle.

Figure 1: Frame rendering with idle tasks.

Idle tasks are special low-priority tasks which are run when the scheduler determines it is in an idle period. Idle tasks are given a deadline which is the scheduler's estimate of how long it expects to remain idle. In the animation example in Figure 1, this would be the time at which the next frame should start being drawn. In other situations (e.g., when no on-screen activity is happening) this could be the time when the next pending task is scheduled to be run, with an upper bound of 50 ms to ensure that Chrome remains responsive to unexpected user input. The deadline is used by the idle task to estimate how much work it can do without causing jank or delays in input response.

Garbage collection done in the idle tasks are hidden from critical, latency-sensitive operations. This means that these garbage collection tasks are done for “free”. In order to understand how V8 does this, it is worth reviewing V8’s current garbage collection strategy.

Deep dive into V8’s garbage collection engine

V8 uses a generational garbage collector with the Javascript heap split into a small young generation for newly allocated objects and a large old generation for long living objects. Since most objects die young, this generational strategy enables the garbage collector to perform regular, short garbage collections in the smaller young generation (known as scavenges), without having to trace objects in the old generation.

The young generation uses a semi-space allocation strategy, where new objects are initially allocated in the young generation’s active semi-space. Once that semi-space becomes full, a scavenge operation will move live objects to the other semi-space. Objects which have been moved once already are promoted to the old generation and are considered to be long-living. Once the live objects have been moved, the new semi-space becomes active and any remaining dead objects in the old semi-space are discarded.

The duration of a young generation scavenge therefore depends on the size of live objects in the young generation. A scavenge will be fast (<1 ms) when most of the objects become unreachable in the young generation. However, if most objects survive a scavenge, the duration of the scavenge may be significantly longer.

A major collection of the whole heap is performed when the size of live objects in the old generation grows beyond a heuristically-derived limit. The old generation uses a mark-and-sweep collector with several optimizations to improve latency and memory consumption. Marking latency depends on the number of live objects that have to be marked, with marking of the whole heap potentially taking more than 100 ms for large web applications. In order to avoid pausing the main thread for such long periods, V8 has long had the ability to incrementally mark live objects in many small steps, with the aim to keep each marking steps below 5 ms in duration.

After marking, the free memory is made available again for the application by sweeping the whole old generation memory. This task is performed concurrently by dedicated sweeper threads. Finally, memory compaction is performed to reduce memory fragmentation in the old generation. This task may be very time-consuming and is only performed if memory fragmentation is an issue.

In summary, there are four main garbage collection tasks:
  1. Young generation scavenges, which usually are fast
  2. Marking steps performed by the incremental marker, which can be arbitrarily long depending on the step size
  3. Full garbage collections, which may take a long time
  4. Full garbage collections with aggressive memory compaction, which may take a long time, but clean up fragmented memory
In order to perform these operations in idle periods, V8 posts garbage collection idle tasks to the scheduler. When these idle tasks are run they are provided with a deadline by which they should complete. V8’s garbage collection idle time handler evaluates which garbage collection tasks should be performed in order to reduce memory consumption, while respecting the deadline to avoid future jank in frame rendering or input latency.

The garbage collector will perform a young generation scavenge during an idle task if the application’s measured allocation rate shows that the young generation may be full before the next expected idle period. Additionally, it calculates the average time taken by recent scavenge tasks in order to predict the duration of future scavenges and ensure that it doesn’t violate idle task deadlines.

When the size of live objects in the old generation is close to the heap limit, incremental marking is started. Incremental marking steps can be linearly scaled by the number of bytes that should be marked. Based on the average measured marking speed, the garbage collection idle time handler tries to fit as much marking work as possible into a given idle task.

A full garbage collection is scheduled during an idle tasks if the old generation is almost full and if the deadline provided to the task is estimated to be long enough to complete the collection. The collection pause time is predicted based on the marking speed multiplied by the number of allocated objects. Full garbage collections with additional compaction are only performed if the webpage has been idle for a significant amount of time.

Performance evaluation

In order to evaluate the impact of running garbage collection during idle time, we used Chrome’s Telemetry performance benchmarking framework to evaluate how smoothly popular websites scroll while they load. We benchmarked the top 25 sites on a Linux workstation as well as typical mobile sites on an Android Nexus 6 smartphone, both of which open popular webpages (including complex webapps such as Gmail, Google Docs and YouTube) and scroll their content for a few seconds. Chrome aims to keep scrolling at 60 FPS for a smooth user experience.

Figure 2 shows the percentage of garbage collection that was scheduled during idle time. The workstation’s faster hardware results in more overall idle time compared to the Nexus 6, thereby enabling a greater percentage of garbage collection to be scheduled during this idle time (43% compared to 31% on the Nexus 6) resulting in about 7% improvement on our jank metric.

Figure 2: The percentage of garbage collection that occurs during idle time. 


As well as improving the smoothness of page rendering, these idle periods also provide an opportunity to perform more aggressive garbage collection when the page becomes fully idle. Recent improvements in Chrome 45 take advantage of this to drastically reduce the amount of memory consumed by idle foreground tabs. Figure 3 shows a sneak peek at how memory usage of Gmail’s JavaScript heap can be reduced by about 45% when it becomes idle, compared to the same page in Chrome 43.

Figure 3: Memory usage for latest version of Chrome 45 (left) vs Chrome 43 on Gmail.

These improvements demonstrate that it is possible to hide garbage collection pauses by being smarter about when expensive garbage collection operations are performed. Web developers no longer have to fear the garbage collection pause, even when targeting silky smooth 60 FPS animations. Stay tuned for more improvements as we push the bounds of garbage collection scheduling.

Posted by Hannes Payer and Ross McIlroy, Idle Garbage Collectors

Monday, July 27, 2015

Code caching

V8 uses just-in-time compilation (JIT) to execute Javascript code. This means that immediately prior to running a script, it has to be parsed and compiled - which can cause considerable overhead. As we announced recently, code caching is a technique that lessens this overhead. When a script is compiled for the first time, cache data is produced and stored. The next time V8 needs to compile the same script, even in a different V8 instance, it can use the cache data to recreate the compilation result instead of compiling from scratch. As a result the script is executed much sooner.

Code caching has been available since V8 version 4.2 and not limited to Chrome alone. It is exposed through V8’s API, so that every V8 embedder can take advantage of it. The test case used to exercise this feature serves as an example of how to use this API.

When a script is compiled by V8, cache data can be produced to speed up later compilations by passing v8::ScriptCompiler::kProduceCodeCache as an option. If the compilation succeeds, the cache data is attached to the source object and can be retrieved via v8::ScriptCompiler::Source::GetCachedData. It can then be persisted for later, for example by writing it to disk.

During later compilations, the previously produced cache data can be attached to the source object and passed v8::ScriptCompiler::kConsumeCodeCache as an option. This time, code will be produced much faster, as V8 bypasses compiling the code and deserializes it from the provided cache data.

Producing cache data comes at a certain computational and memory cost. For this reason, Chrome only produces cache data if the same script is seen at least twice within a couple of days. This way Chrome is able to turn script files into executable code twice as fast on average, saving users valuable time on each subsequent page load.

Posted by Yang Guo, Software Engineer