Commit a7f37bc

joyeecheung authored and targos committed
src: add --heapsnapshot-near-heap-limit option
This patch adds a --heapsnapshot-near-heap-limit CLI option that takes heap snapshots when the V8 heap is approaching the heap size limit. It will try to write the snapshots to disk before the program crashes due to OOM.

PR-URL: #33010
Refs: #27552
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
Reviewed-By: Gireesh Punathil <[email protected]>
1 parent eff4498 commit a7f37bc

20 files changed, +496 -21 lines changed

doc/api/cli.md

+47
@@ -333,6 +333,52 @@ reference. Code may break under this flag.
 `--require` runs prior to freezing intrinsics in order to allow polyfills to
 be added.

+### `--heapsnapshot-near-heap-limit=max_count`
+<!-- YAML
+added: REPLACEME
+-->
+
+> Stability: 1 - Experimental
+
+Writes a V8 heap snapshot to disk when the V8 heap usage is approaching the
+heap limit. `max_count` should be a non-negative integer (in which case
+Node.js will write no more than `max_count` snapshots to disk).
+
+When generating snapshots, garbage collection may be triggered and bring
+the heap usage down, therefore multiple snapshots may be written to disk
+before the Node.js instance finally runs out of memory. These heap snapshots
+can be compared to determine what objects are being allocated during the
+time consecutive snapshots are taken. It's not guaranteed that Node.js will
+write exactly `max_count` snapshots to disk, but it will try
+its best to generate at least one and up to `max_count` snapshots before the
+Node.js instance runs out of memory when `max_count` is greater than `0`.
+
+Generating V8 snapshots takes time and memory (both memory managed by the
+V8 heap and native memory outside the V8 heap). The bigger the heap is,
+the more resources it needs. Node.js will adjust the V8 heap to accommodate
+the additional V8 heap memory overhead, and try its best to avoid using up
+all the memory available to the process. When the process uses
+more memory than the system deems appropriate, the process may be terminated
+abruptly by the system, depending on the system configuration.
+
+```console
+$ node --max-old-space-size=100 --heapsnapshot-near-heap-limit=3 index.js
+Wrote snapshot to Heap.20200430.100036.49580.0.001.heapsnapshot
+Wrote snapshot to Heap.20200430.100037.49580.0.002.heapsnapshot
+Wrote snapshot to Heap.20200430.100038.49580.0.003.heapsnapshot
+
+<--- Last few GCs --->
+
+[49580:0x110000000] 4826 ms: Mark-sweep 130.6 (147.8) -> 130.5 (147.8) MB, 27.4 / 0.0 ms (average mu = 0.126, current mu = 0.034) allocation failure scavenge might not succeed
+[49580:0x110000000] 4845 ms: Mark-sweep 130.6 (147.8) -> 130.6 (147.8) MB, 18.8 / 0.0 ms (average mu = 0.088, current mu = 0.031) allocation failure scavenge might not succeed
+
+
+<--- JS stacktrace --->
+
+FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
+....
+```
+
 ### `--heapsnapshot-signal=signal`
 <!-- YAML
 added: v12.0.0
@@ -1276,6 +1322,7 @@ Node.js options that are allowed are:
 * `--force-context-aware`
 * `--force-fips`
 * `--frozen-intrinsics`
+* `--heapsnapshot-near-heap-limit`
 * `--heapsnapshot-signal`
 * `--http-parser`
 * `--icu-data-dir`

doc/node.1

+4
@@ -182,6 +182,10 @@ Same requirements as
 .It Fl -frozen-intrinsics
 Enable experimental frozen intrinsics support.
 .
+.It Fl -heapsnapshot-near-heap-limit Ns = Ns Ar max_count
+Generate heap snapshot when the V8 heap usage is approaching the heap limit.
+No more than the specified number of snapshots will be generated.
+.
 .It Fl -heapsnapshot-signal Ns = Ns Ar signal
 Generate heap snapshot on specified signal.
 .

src/debug_utils.h

+1
@@ -41,6 +41,7 @@ void FWrite(FILE* file, const std::string& str);
 // from a provider type to a debug category.
 #define DEBUG_CATEGORY_NAMES(V) \
   NODE_ASYNC_PROVIDER_TYPES(V) \
+  V(DIAGNOSTICS) \
   V(HUGEPAGES) \
   V(INSPECTOR_SERVER) \
   V(INSPECTOR_PROFILER) \

src/env-inl.h

+16
@@ -614,6 +614,22 @@ inline const std::string& Environment::exec_path() const {
   return exec_path_;
 }

+inline std::string Environment::GetCwd() {
+  char cwd[PATH_MAX_BYTES];
+  size_t size = PATH_MAX_BYTES;
+  const int err = uv_cwd(cwd, &size);
+
+  if (err == 0) {
+    CHECK_GT(size, 0);
+    return cwd;
+  }
+
+  // This can fail if the cwd is deleted. In that case, fall back to
+  // exec_path.
+  const std::string& exec_path = exec_path_;
+  return exec_path.substr(0, exec_path.find_last_of(kPathSeparator));
+}
+
 #if HAVE_INSPECTOR
 inline void Environment::set_coverage_directory(const char* dir) {
   coverage_directory_ = std::string(dir);
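
The fallback branch of `GetCwd()` above derives a directory from `exec_path_` by cutting at the last path separator. A minimal standalone sketch of that step follows; `ParentDirectory` is a hypothetical name used only for illustration, and the separator is passed in explicitly instead of using `kPathSeparator`.

```cpp
#include <string>

// Hypothetical helper (not part of the commit): keeps everything that
// precedes the last separator, as the fallback in GetCwd() does.
std::string ParentDirectory(const std::string& exec_path, char separator) {
  // find_last_of() returns std::string::npos when no separator is present;
  // substr(0, npos) then yields the whole string unchanged.
  return exec_path.substr(0, exec_path.find_last_of(separator));
}
```

One consequence worth noting: if the executable path contains no separator at all, this approach returns the path itself rather than an empty string.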

src/env.cc

+146
@@ -3,6 +3,7 @@
 #include "async_wrap.h"
 #include "base_object-inl.h"
 #include "debug_utils-inl.h"
+#include "diagnosticfilename-inl.h"
 #include "memory_tracker-inl.h"
 #include "node_buffer.h"
 #include "node_context_data.h"
@@ -24,6 +25,7 @@
 #include <cinttypes>
 #include <cstdio>
 #include <iostream>
+#include <limits>
 #include <memory>

 namespace node {
@@ -479,6 +481,11 @@ Environment::~Environment() {
   // FreeEnvironment() should have set this.
   CHECK(is_stopping());

+  if (options_->heap_snapshot_near_heap_limit > heap_limit_snapshot_taken_) {
+    isolate_->RemoveNearHeapLimitCallback(Environment::NearHeapLimitCallback,
+                                          0);
+  }
+
   isolate()->GetHeapProfiler()->RemoveBuildEmbedderGraphCallback(
       BuildEmbedderGraph, this);

@@ -1402,6 +1409,25 @@ void Environment::DeserializeProperties(const EnvSerializeInfo* info) {
   CHECK_EQ(ctx_from_snapshot, ctx);
 }

+uint64_t GuessMemoryAvailableToTheProcess() {
+  uint64_t free_in_system = uv_get_free_memory();
+  size_t allowed = uv_get_constrained_memory();
+  if (allowed == 0) {
+    return free_in_system;
+  }
+  size_t rss;
+  int err = uv_resident_set_memory(&rss);
+  if (err) {
+    return free_in_system;
+  }
+  if (allowed < rss) {
+    // Something is probably wrong. Fall back to the free memory.
+    return free_in_system;
+  }
+  // There may still be room for swap, but we will just leave it here.
+  return allowed - rss;
+}
+
 void Environment::BuildEmbedderGraph(Isolate* isolate,
                                      EmbedderGraph* graph,
                                      void* data) {
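
The fallback order in `GuessMemoryAvailableToTheProcess()` can be easier to follow as a pure function. The sketch below restates it with the libuv results passed in as arguments; the function name, the parameter names, and the `rss_query_failed` flag are assumptions made for illustration and do not appear in the commit.

```cpp
#include <cstdint>

// Hypothetical restatement of the estimate above, with the values that the
// commit reads from libuv supplied by the caller.
uint64_t EstimateAvailableMemory(uint64_t free_in_system,
                                 uint64_t constrained_limit,
                                 uint64_t rss,
                                 bool rss_query_failed) {
  // The diff treats a constrained-memory value of 0 as "no usable limit".
  if (constrained_limit == 0) return free_in_system;
  // If the resident set size cannot be read, use system free memory.
  if (rss_query_failed) return free_in_system;
  // A limit below current usage looks inconsistent, so fall back as well.
  if (constrained_limit < rss) return free_in_system;
  // Otherwise the headroom under the limit is the estimate.
  return constrained_limit - rss;
}
```

Only the last branch uses the constrained limit; every failure or inconsistency falls back to the system-wide free memory figure.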
@@ -1414,6 +1440,126 @@ void Environment::BuildEmbedderGraph(Isolate* isolate,
   });
 }

+size_t Environment::NearHeapLimitCallback(void* data,
+                                          size_t current_heap_limit,
+                                          size_t initial_heap_limit) {
+  Environment* env = static_cast<Environment*>(data);
+
+  Debug(env,
+        DebugCategory::DIAGNOSTICS,
+        "Invoked NearHeapLimitCallback, processing=%d, "
+        "current_limit=%" PRIu64 ", "
+        "initial_limit=%" PRIu64 "\n",
+        env->is_processing_heap_limit_callback_,
+        static_cast<uint64_t>(current_heap_limit),
+        static_cast<uint64_t>(initial_heap_limit));
+
+  size_t max_young_gen_size = env->isolate_data()->max_young_gen_size;
+  size_t young_gen_size = 0;
+  size_t old_gen_size = 0;
+
+  v8::HeapSpaceStatistics stats;
+  size_t num_heap_spaces = env->isolate()->NumberOfHeapSpaces();
+  for (size_t i = 0; i < num_heap_spaces; ++i) {
+    env->isolate()->GetHeapSpaceStatistics(&stats, i);
+    if (strcmp(stats.space_name(), "new_space") == 0 ||
+        strcmp(stats.space_name(), "new_large_object_space") == 0) {
+      young_gen_size += stats.space_used_size();
+    } else {
+      old_gen_size += stats.space_used_size();
+    }
+  }
+
+  Debug(env,
+        DebugCategory::DIAGNOSTICS,
+        "max_young_gen_size=%" PRIu64 ", "
+        "young_gen_size=%" PRIu64 ", "
+        "old_gen_size=%" PRIu64 ", "
+        "total_size=%" PRIu64 "\n",
+        static_cast<uint64_t>(max_young_gen_size),
+        static_cast<uint64_t>(young_gen_size),
+        static_cast<uint64_t>(old_gen_size),
+        static_cast<uint64_t>(young_gen_size + old_gen_size));
+
+  uint64_t available = GuessMemoryAvailableToTheProcess();
+  // TODO(joyeecheung): get a better estimate about the native memory
+  // usage into the overhead, e.g. based on the count of objects.
+  uint64_t estimated_overhead = max_young_gen_size;
+  Debug(env,
+        DebugCategory::DIAGNOSTICS,
+        "Estimated available memory=%" PRIu64 ", "
+        "estimated overhead=%" PRIu64 "\n",
+        static_cast<uint64_t>(available),
+        static_cast<uint64_t>(estimated_overhead));
+
+  // This might be hit when the snapshot is being taken in another
+  // NearHeapLimitCallback invocation.
+  // When taking the snapshot, objects in the young generation may be
+  // promoted to the old generation, resulting in increased heap usage,
+  // but it should be no more than the young generation size.
+  // Ideally, this should be as small as possible - the heap limit
+  // can only be restored when the heap usage falls down below the
+  // new limit, so in a heap with unbounded growth the isolate
+  // may eventually crash with this new limit - effectively raising
+  // the heap limit to the new one.
+  if (env->is_processing_heap_limit_callback_) {
+    size_t new_limit = initial_heap_limit + max_young_gen_size;
+    Debug(env,
+          DebugCategory::DIAGNOSTICS,
+          "Not generating snapshots in nested callback. "
+          "new_limit=%" PRIu64 "\n",
+          static_cast<uint64_t>(new_limit));
+    return new_limit;
+  }
+
+  // Estimate whether the snapshot is going to use up all the memory
+  // available to the process. If so, just give up to prevent the system
+  // from killing the process for a system OOM.
+  if (estimated_overhead > available) {
+    Debug(env,
+          DebugCategory::DIAGNOSTICS,
+          "Not generating snapshots because it's too risky.\n");
+    env->isolate()->RemoveNearHeapLimitCallback(NearHeapLimitCallback,
+                                                initial_heap_limit);
+    return current_heap_limit;
+  }
+
+  // Take the snapshot synchronously.
+  env->is_processing_heap_limit_callback_ = true;
+
+  std::string dir = env->options()->diagnostic_dir;
+  if (dir.empty()) {
+    dir = env->GetCwd();
+  }
+  DiagnosticFilename name(env, "Heap", "heapsnapshot");
+  std::string filename = dir + kPathSeparator + (*name);
+
+  Debug(env, DebugCategory::DIAGNOSTICS, "Start generating %s...\n", *name);
+
+  // Remove the callback first in case it's triggered when generating
+  // the snapshot.
+  env->isolate()->RemoveNearHeapLimitCallback(NearHeapLimitCallback,
+                                              initial_heap_limit);
+
+  heap::WriteSnapshot(env->isolate(), filename.c_str());
+  env->heap_limit_snapshot_taken_ += 1;
+
+  // Don't take more snapshots than the number specified by
+  // --heapsnapshot-near-heap-limit.
+  if (env->heap_limit_snapshot_taken_ <
+      env->options_->heap_snapshot_near_heap_limit) {
+    env->isolate()->AddNearHeapLimitCallback(NearHeapLimitCallback, env);
+  }
+
+  FPrintF(stderr, "Wrote snapshot to %s\n", filename.c_str());
+  // Tell V8 to reset the heap limit once the heap usage falls down to
+  // 95% of the initial limit.
+  env->isolate()->AutomaticallyRestoreInitialHeapLimit(0.95);
+
+  env->is_processing_heap_limit_callback_ = false;
+  return initial_heap_limit;
+}
+
 inline size_t Environment::SelfSize() const {
   size_t size = sizeof(*this);
   // Remove non pointer fields that will be tracked in MemoryInfo()
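
`NearHeapLimitCallback` has three outcomes, and each returns a different heap limit to V8. The sketch below isolates that decision from the V8 and libuv calls so the branch order is visible at a glance. `Action`, `Decision`, and `DecideNearHeapLimit` are hypothetical names introduced for illustration; only the branch conditions and return values are taken from the diff.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical labels for the three paths through the callback.
enum class Action { kRaiseLimitNested, kGiveUp, kTakeSnapshot };

struct Decision {
  Action action;
  size_t new_heap_limit;  // the value the callback would return to V8
};

Decision DecideNearHeapLimit(bool already_processing,
                             size_t current_heap_limit,
                             size_t initial_heap_limit,
                             size_t max_young_gen_size,
                             uint64_t available_memory) {
  // Nested invocation while a snapshot is being written: grant room for
  // objects promoted out of the young generation, and do nothing else.
  if (already_processing) {
    return {Action::kRaiseLimitNested,
            initial_heap_limit + max_young_gen_size};
  }
  // The diff uses the maximum young generation size as the overhead estimate.
  uint64_t estimated_overhead = max_young_gen_size;
  if (estimated_overhead > available_memory) {
    // Too risky: leave the limit where it is and stop trying.
    return {Action::kGiveUp, current_heap_limit};
  }
  // Normal path: a snapshot is written and the initial limit is returned.
  return {Action::kTakeSnapshot, initial_heap_limit};
}
```

The nested check comes before the memory check in the commit, so a re-entrant call always raises the limit, even when available memory is low.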

src/env.h

+10
@@ -597,6 +597,7 @@ class IsolateData : public MemoryRetainer {
 #undef VP
   inline v8::Local<v8::String> async_wrap_provider(int index) const;

+  size_t max_young_gen_size = 1;
   std::unordered_map<const char*, v8::Eternal<v8::String>> static_str_map;

   inline v8::Isolate* isolate() const;
@@ -961,6 +962,9 @@ class Environment : public MemoryRetainer {
   void VerifyNoStrongBaseObjects();
   // Should be called before InitializeInspector()
   void InitializeDiagnostics();
+
+  std::string GetCwd();
+
 #if HAVE_INSPECTOR
   // If the environment is created for a worker, pass parent_handle and
   // the ownership if transferred into the Environment.
@@ -1319,6 +1323,9 @@ class Environment : public MemoryRetainer {
   inline void RemoveCleanupHook(void (*fn)(void*), void* arg);
   void RunCleanup();

+  static size_t NearHeapLimitCallback(void* data,
+                                      size_t current_heap_limit,
+                                      size_t initial_heap_limit);
   static void BuildEmbedderGraph(v8::Isolate* isolate,
                                  v8::EmbedderGraph* graph,
                                  void* data);
@@ -1437,6 +1444,9 @@ class Environment : public MemoryRetainer {
   std::vector<std::string> argv_;
   std::string exec_path_;

+  bool is_processing_heap_limit_callback_ = false;
+  int64_t heap_limit_snapshot_taken_ = 0;
+
   uint32_t module_id_counter_ = 0;
   uint32_t script_id_counter_ = 0;
   uint32_t function_id_counter_ = 0;

src/heap_utils.cc

+3 -3
@@ -313,7 +313,9 @@ inline void TakeSnapshot(Isolate* isolate, v8::OutputStream* out) {
   snapshot->Serialize(out, HeapSnapshot::kJSON);
 }

-inline bool WriteSnapshot(Isolate* isolate, const char* filename) {
+}  // namespace
+
+bool WriteSnapshot(Isolate* isolate, const char* filename) {
   FILE* fp = fopen(filename, "w");
   if (fp == nullptr)
     return false;
@@ -323,8 +325,6 @@ inline bool WriteSnapshot(Isolate* isolate, const char* filename) {
   return true;
 }

-}  // namespace
-
 void DeleteHeapSnapshot(const HeapSnapshot* snapshot) {
   const_cast<HeapSnapshot*>(snapshot)->Delete();
 }

src/inspector_profiler.cc

+2 -18
@@ -394,22 +394,6 @@ static void EndStartedProfilers(Environment* env) {
   }
 }

-std::string GetCwd(Environment* env) {
-  char cwd[PATH_MAX_BYTES];
-  size_t size = PATH_MAX_BYTES;
-  const int err = uv_cwd(cwd, &size);
-
-  if (err == 0) {
-    CHECK_GT(size, 0);
-    return cwd;
-  }
-
-  // This can fail if the cwd is deleted. In that case, fall back to
-  // exec_path.
-  const std::string& exec_path = env->exec_path();
-  return exec_path.substr(0, exec_path.find_last_of(kPathSeparator));
-}
-
 void StartProfilers(Environment* env) {
   AtExit(env, [](void* env) {
     EndStartedProfilers(static_cast<Environment*>(env));
@@ -427,7 +411,7 @@ void StartProfilers(Environment* env) {
   if (env->options()->cpu_prof) {
     const std::string& dir = env->options()->cpu_prof_dir;
     env->set_cpu_prof_interval(env->options()->cpu_prof_interval);
-    env->set_cpu_prof_dir(dir.empty() ? GetCwd(env) : dir);
+    env->set_cpu_prof_dir(dir.empty() ? env->GetCwd() : dir);
     if (env->options()->cpu_prof_name.empty()) {
       DiagnosticFilename filename(env, "CPU", "cpuprofile");
       env->set_cpu_prof_name(*filename);
@@ -442,7 +426,7 @@ void StartProfilers(Environment* env) {
   if (env->options()->heap_prof) {
     const std::string& dir = env->options()->heap_prof_dir;
     env->set_heap_prof_interval(env->options()->heap_prof_interval);
-    env->set_heap_prof_dir(dir.empty() ? GetCwd(env) : dir);
+    env->set_heap_prof_dir(dir.empty() ? env->GetCwd() : dir);
     if (env->options()->heap_prof_name.empty()) {
       DiagnosticFilename filename(env, "Heap", "heapprofile");
       env->set_heap_prof_name(*filename);

src/node.cc

+4
@@ -275,6 +275,10 @@ static void AtomicsWaitCallback(Isolate::AtomicsWaitEvent event,
 void Environment::InitializeDiagnostics() {
   isolate_->GetHeapProfiler()->AddBuildEmbedderGraphCallback(
       Environment::BuildEmbedderGraph, this);
+  if (options_->heap_snapshot_near_heap_limit > 0) {
+    isolate_->AddNearHeapLimitCallback(Environment::NearHeapLimitCallback,
+                                       this);
+  }
   if (options_->trace_uncaught)
     isolate_->SetCaptureStackTraceForUncaughtExceptions(true);
   if (options_->trace_atomics_wait) {
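
The callback is registered here, removed and conditionally re-added inside `NearHeapLimitCallback`, and removed again in `~Environment()` if snapshots remain in the budget. The sketch below models only that bookkeeping, assuming every invocation takes the normal snapshot path; `SnapshotBudget` and its members are hypothetical names, and no V8 API is involved.

```cpp
#include <cstdint>

// Hypothetical model of when the near-heap-limit callback is registered.
struct SnapshotBudget {
  int64_t max_count;        // value of --heapsnapshot-near-heap-limit
  int64_t taken = 0;        // mirrors heap_limit_snapshot_taken_
  bool registered = false;  // whether the callback is currently installed

  explicit SnapshotBudget(int64_t max) : max_count(max) {
    // InitializeDiagnostics(): register only when the option is positive.
    registered = max_count > 0;
  }

  // One non-nested invocation that writes a snapshot.
  void OnSnapshotWritten() {
    registered = false;  // removed before the snapshot is generated
    taken += 1;
    if (taken < max_count) registered = true;  // re-added while budget remains
  }

  // Condition checked in the destructor before removing the callback.
  bool NeedsRemovalOnTeardown() const { return max_count > taken; }
};
```

With a budget of 3, the callback stays registered after the first two snapshots and is left unregistered after the third, so the destructor has nothing to remove.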
