From e4caad2488a43d97ec6238d29ee52335a57285ff Mon Sep 17 00:00:00 2001 From: Marko Topolnik Date: Mon, 26 May 2025 12:50:24 +0200 Subject: [PATCH 1/7] Add new metrics to table --- documentation/third-party-tools/prometheus.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/documentation/third-party-tools/prometheus.md b/documentation/third-party-tools/prometheus.md index 0fae866f..2dadeeec 100644 --- a/documentation/third-party-tools/prometheus.md +++ b/documentation/third-party-tools/prometheus.md @@ -237,7 +237,10 @@ The following metrics are available: | `questdb_wal_apply_physically_written_rows_total` | counter | Total number of physically written rows during WAL apply. | | `questdb_wal_apply_rows_per_second` | gauge | Rate of rows applied per second during WAL apply. | | `questdb_wal_apply_written_rows_total` | counter | Total number of rows written during WAL apply. | +| `questdb_wal_apply_seq_txn_total` | counter | Sum of all committed transaction sequence numbers. | +| `questdb_wal_apply_writer_txn_total` | counter | Sum of all transaction sequence numbers applied to tables. | | `questdb_wal_written_rows_total` | counter | Total number of rows written to WAL. | +| `questdb_suspended_tables` | gauge | The number of tables currently in the suspended state. | | `questdb_workers_job_start_micros_max` | gauge | Maximum time taken to start a worker job in microseconds. | | `questdb_workers_job_start_micros_min` | gauge | Minimum time taken to start a worker job in microseconds. | @@ -275,7 +278,7 @@ docker inspect -f '{{range.NetworkSettings.Networks}}{{.IPAddress}}{{end}}' aler To run QuestDB and point it towards Alertmanager for alerting, first create a file `./conf/log.conf` with the following contents. `172.17.0.2` in this case is the IP address of the docker container for alertmanager that was discovered by -running the `docker inspect ` command above. +running the `docker inspect` command above. 
```ini title="./conf/log.conf" # Which writers to enable From 82a3001d84820a7eabf996310f83cd2269d786e4 Mon Sep 17 00:00:00 2001 From: Marko Topolnik Date: Mon, 26 May 2025 12:52:40 +0200 Subject: [PATCH 2/7] Simplify description of two items --- documentation/third-party-tools/prometheus.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/documentation/third-party-tools/prometheus.md b/documentation/third-party-tools/prometheus.md index 2dadeeec..1ba0a92e 100644 --- a/documentation/third-party-tools/prometheus.md +++ b/documentation/third-party-tools/prometheus.md @@ -144,8 +144,8 @@ The following metrics are available: | `questdb_jvm_major_gc_time_total` | counter | Total time spent on major JVM garbage collection in milliseconds. | | `questdb_jvm_minor_gc_count_total` | counter | Number of times minor JVM garbage collection pause was triggered. | | `questdb_jvm_minor_gc_time_total` | counter | Total time spent on minor JVM garbage collection pauses in milliseconds. | -| `questdb_jvm_unknown_gc_count_total` | counter | Number of times JVM garbage collection of unknown type was triggered. Non-zero values of this metric may be observed only on some, non-mainstream JVM implementations. | -| `questdb_jvm_unknown_gc_time_total` | counter | Total time spent on JVM garbage collection of unknown type in milliseconds. Non-zero values of this metric may be observed only on some, non-mainstream JVM implementations. | +| `questdb_jvm_unknown_gc_count_total` | counter | Number of times JVM garbage collection of unknown type was triggered. Usually zero, some non-mainstream JVM implementations may show non-zero. | +| `questdb_jvm_unknown_gc_time_total` | counter | Total time spent on JVM garbage collection of unknown type in milliseconds. Usually zero, some non-mainstream JVM implementations may show non-zero. | | `questdb_memory_tag_MMAP_DEFAULT` | gauge | Amount of memory allocated for mmaped files. 
| | `questdb_memory_tag_NATIVE_DEFAULT` | gauge | Amount of allocated untagged native memory. | | `questdb_memory_tag_MMAP_O3` | gauge | Amount of memory allocated for O3 mmapped files. | From 33daa8a8c0a394d955d367179a23f7de1dbc813b Mon Sep 17 00:00:00 2001 From: Marko Topolnik Date: Mon, 26 May 2025 13:09:37 +0200 Subject: [PATCH 3/7] Realign table --- documentation/third-party-tools/prometheus.md | 226 +++++++++--------- 1 file changed, 113 insertions(+), 113 deletions(-) diff --git a/documentation/third-party-tools/prometheus.md b/documentation/third-party-tools/prometheus.md index 1ba0a92e..9128e12e 100644 --- a/documentation/third-party-tools/prometheus.md +++ b/documentation/third-party-tools/prometheus.md @@ -130,119 +130,119 @@ can be used to graph QuestDB-specific metrics which are all prefixed with The following metrics are available: -| Metric | Type | Description | -| :--------------------------------------- | :------ | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `questdb_commits_total` | counter | Number of total commits of all types (in-order and out-of-order) executed on the database tables. | -| `questdb_o3_commits_total` | counter | Number of total out-of-order (O3) commits executed on the database tables. | -| `questdb_committed_rows_total` | counter | Number of total rows committed to the database tables. | -| `questdb_physically_written_rows_total` | counter | Number of total rows physically written to disk. Greater than `committed_rows` with [out-of-order ingestion. Write amplification is `questdb_physically_written_rows_total / questdb_committed_rows_total`. | -| `questdb_rollbacks_total` | counter | Number of total rollbacks executed on the database tables. | -| `questdb_json_queries_total` | counter | Number of total REST API queries, including retries. 
| -| `questdb_json_queries_completed_total` | counter | Number of successfully executed REST API queries. | -| `questdb_unhandled_errors_total` | counter | Number of total unhandled errors occurred in the database. Such errors usually mean a critical service degradation in one of the database subsystems. | -| `questdb_jvm_major_gc_count_total` | counter | Number of times major JVM garbage collection was triggered. | -| `questdb_jvm_major_gc_time_total` | counter | Total time spent on major JVM garbage collection in milliseconds. | -| `questdb_jvm_minor_gc_count_total` | counter | Number of times minor JVM garbage collection pause was triggered. | -| `questdb_jvm_minor_gc_time_total` | counter | Total time spent on minor JVM garbage collection pauses in milliseconds. | -| `questdb_jvm_unknown_gc_count_total` | counter | Number of times JVM garbage collection of unknown type was triggered. Usually zero, some non-mainstream JVM implementations may show non-zero. | -| `questdb_jvm_unknown_gc_time_total` | counter | Total time spent on JVM garbage collection of unknown type in milliseconds. Usually zero, some non-mainstream JVM implementations may show non-zero. | -| `questdb_memory_tag_MMAP_DEFAULT` | gauge | Amount of memory allocated for mmaped files. | -| `questdb_memory_tag_NATIVE_DEFAULT` | gauge | Amount of allocated untagged native memory. | -| `questdb_memory_tag_MMAP_O3` | gauge | Amount of memory allocated for O3 mmapped files. | -| `questdb_memory_tag_NATIVE_O3` | gauge | Amount of memory allocated for O3. | -| `questdb_memory_tag_NATIVE_RECORD_CHAIN` | gauge | Amount of memory allocated for SQL record chains. | -| `questdb_memory_tag_MMAP_TABLE_WRITER` | gauge | Amount of memory allocated for table writer mmapped files. | -| `questdb_memory_tag_NATIVE_TREE_CHAIN` | gauge | Amount of memory allocated for SQL tree chains. | -| `questdb_memory_tag_MMAP_TABLE_READER` | gauge | Amount of memory allocated for table reader mmapped files. 
| -| `questdb_memory_tag_NATIVE_COMPACT_MAP` | gauge | Amount of memory allocated for SQL compact maps. | -| `questdb_memory_tag_NATIVE_FAST_MAP` | gauge | Amount of memory allocated for SQL fast maps. | -| `questdb_memory_tag_NATIVE_LONG_LIST` | gauge | Amount of memory allocated for long lists. | -| `questdb_memory_tag_NATIVE_HTTP_CONN` | gauge | Amount of memory allocated for HTTP connections. | -| `questdb_memory_tag_NATIVE_PGW_CONN` | gauge | Amount of memory allocated for PostgreSQL Wire Protocol connections. | -| `questdb_memory_tag_MMAP_INDEX_READER` | gauge | Amount of memory allocated for index reader mmapped files. | -| `questdb_memory_tag_MMAP_INDEX_WRITER` | gauge | Amount of memory allocated for index writer mmapped files. | -| `questdb_memory_tag_MMAP_INDEX_SLIDER` | gauge | Amount of memory allocated for indexed column view mmapped files. | -| `questdb_memory_tag_NATIVE_REPL` | gauge | Amount of memory mapped for replication tasks. | -| `questdb_memory_free_count` | gauge | Number of times native memory was freed. | -| `questdb_memory_mem_used` | gauge | Current amount of allocated native memory. | -| `questdb_memory_malloc_count` | gauge | Number of times native memory was allocated. | -| `questdb_memory_realloc_count` | gauge | Number of times native memory was reallocated. | -| `questdb_memory_rss` | gauge | Resident Set Size (Linux/Unix) / Working Set Size (Windows). | -| `questdb_memory_jvm_free` | gauge | Current amount of free Java memory heap in bytes. | -| `questdb_memory_jvm_total` | gauge | Current size of Java memory heap in bytes. | -| `questdb_memory_jvm_max` | gauge | Maximum amount of Java heap memory that can be allocated in bytes. | -| `questdb_http_connections` | gauge | Number of currently active HTTP connections. | -| `questdb_json_queries_cached` | gauge | Number of current cached REST API queries. | -| `questdb_line_tcp_connections` | gauge | Number of currently active InfluxDB Line Protocol TCP connections. 
| -| `questdb_pg_wire_connections` | gauge | Number of currently active PostgreSQL Wire Protocol connections. | -| `questdb_pg_wire_select_queries_cached` | gauge | Number of current cached PostgreSQL Wire Protocol `SELECT` queries. | -| `questdb_pg_wire_update_queries_cached` | gauge | Number of current cached PostgreSQL Wire Protocol `UPDATE` queries. | -| `questdb_json_queries_cache_hits_total` | counter | Number of total cache hits for JSON queries. | -| `questdb_json_queries_cache_misses_total`| counter | Number of total cache misses for JSON queries. | -| `questdb_json_queries_completed_total` | counter | Total number of completed JSON queries. | -| `questdb_jvm_major_gc_count_total` | counter | Total number of major garbage collection events. | -| `questdb_jvm_major_gc_time_total` | counter | Total time spent on major garbage collection. | -| `questdb_jvm_minor_gc_count_total` | counter | Total number of minor garbage collection events. | -| `questdb_jvm_minor_gc_time_total` | counter | Total time spent on minor garbage collection. | -| `questdb_jvm_unknown_gc_count_total` | counter | Total number of unknown type garbage collection events. | -| `questdb_jvm_unknown_gc_time_total` | counter | Total time spent on unknown type garbage collection. | -| `questdb_memory_tag_MMAP_BLOCK_WRITER` | gauge | Amount of memory allocated for block writer mmapped files. | -| `questdb_memory_tag_MMAP_IMPORT` | gauge | Amount of memory allocated for import operations. | -| `questdb_memory_tag_MMAP_PARALLEL_IMPORT`| gauge | Amount of memory allocated for parallel import operations. | -| `questdb_memory_tag_MMAP_PARTITION_CONVERTER` | gauge | Amount of memory allocated for partition converter operations. | -| `questdb_memory_tag_MMAP_SEQUENCER_METADATA` | gauge | Amount of memory allocated for sequencer metadata. | -| `questdb_memory_tag_MMAP_TABLE_WAL_READER` | gauge | Amount of memory allocated for table WAL reader mmapped files. 
| -| `questdb_memory_tag_MMAP_TABLE_WAL_WRITER` | gauge | Amount of memory allocated for table WAL writer mmapped files. | -| `questdb_memory_tag_MMAP_TX_LOG` | gauge | Amount of memory allocated for transaction log mmapped files. | -| `questdb_memory_tag_MMAP_TX_LOG_CURSOR` | gauge | Amount of memory allocated for transaction log cursor mmapped files. | -| `questdb_memory_tag_MMAP_UPDATE` | gauge | Amount of memory allocated for update operations. | -| `questdb_memory_tag_NATIVE_CB1` | gauge | Amount of memory allocated for native circular buffer 1. | -| `questdb_memory_tag_NATIVE_CB2` | gauge | Amount of memory allocated for native circular buffer 2. | -| `questdb_memory_tag_NATIVE_CB3` | gauge | Amount of memory allocated for native circular buffer 3. | -| `questdb_memory_tag_NATIVE_CB4` | gauge | Amount of memory allocated for native circular buffer 4. | -| `questdb_memory_tag_NATIVE_CB5` | gauge | Amount of memory allocated for native circular buffer 5. | -| `questdb_memory_tag_NATIVE_CIRCULAR_BUFFER` | gauge | Amount of memory allocated for native circular buffers. | -| `questdb_memory_tag_NATIVE_DIRECT_BYTE_SINK` | gauge | Amount of memory allocated for native direct byte sink. | -| `questdb_memory_tag_NATIVE_DIRECT_CHAR_SINK` | gauge | Amount of memory allocated for native direct char sink. | -| `questdb_memory_tag_NATIVE_DIRECT_UTF8_SINK` | gauge | Amount of memory allocated for native direct UTF-8 sink. | -| `questdb_memory_tag_NATIVE_FAST_MAP_INT_LIST` | gauge | Amount of memory allocated for native fast map integer list. | -| `questdb_memory_tag_NATIVE_FUNC_RSS` | gauge | Amount of memory allocated for native function RSS. | -| `questdb_memory_tag_NATIVE_GROUP_BY_FUNCTION` | gauge | Amount of memory allocated for native group by function. | -| `questdb_memory_tag_NATIVE_ILP_RSS` | gauge | Amount of memory allocated for native ILP RSS. | -| `questdb_memory_tag_NATIVE_IMPORT` | gauge | Amount of memory allocated for native import operations. 
| -| `questdb_memory_tag_NATIVE_INDEX_READER` | gauge | Amount of memory allocated for native index reader. | -| `questdb_memory_tag_NATIVE_IO_DISPATCHER_RSS` | gauge | Amount of memory allocated for native IO dispatcher RSS. | -| `questdb_memory_tag_NATIVE_JIT` | gauge | Amount of memory allocated for native JIT. | -| `questdb_memory_tag_NATIVE_JIT_LONG_LIST`| gauge | Amount of memory allocated for native JIT long list. | -| `questdb_memory_tag_NATIVE_JOIN_MAP` | gauge | Amount of memory allocated for native join map. | -| `questdb_memory_tag_NATIVE_LATEST_BY_LONG_LIST` | gauge | Amount of memory allocated for native latest by long list. | -| `questdb_memory_tag_NATIVE_LOGGER` | gauge | Amount of memory allocated for native logger. | -| `questdb_memory_tag_NATIVE_MIG` | gauge | Amount of memory allocated for native MIG. | -| `questdb_memory_tag_NATIVE_MIG_MMAP` | gauge | Amount of memory allocated for native MIG mmapped files. | -| `questdb_memory_tag_NATIVE_OFFLOAD` | gauge | Amount of memory allocated for native offload. | -| `questdb_memory_tag_NATIVE_PARALLEL_IMPORT` | gauge | Amount of memory allocated for native parallel import. | -| `questdb_memory_tag_NATIVE_PATH` | gauge | Amount of memory allocated for native path. | -| `questdb_memory_tag_NATIVE_ROSTI` | gauge | Amount of memory allocated for native rosti. | -| `questdb_memory_tag_NATIVE_SAMPLE_BY_LONG_LIST` | gauge | Amount of memory allocated for native sample by long list. | -| `questdb_memory_tag_NATIVE_SQL_COMPILER` | gauge | Amount of memory allocated for native SQL compiler. | -| `questdb_memory_tag_NATIVE_TABLE_READER` | gauge | Amount of memory allocated for native table reader. | -| `questdb_memory_tag_NATIVE_TABLE_WAL_WRITER` | gauge | Amount of memory allocated for native table WAL writer. | -| `questdb_memory_tag_NATIVE_TABLE_WRITER` | gauge | Amount of memory allocated for native table writer. 
| -| `questdb_memory_tag_NATIVE_TEXT_PARSER_RSS` | gauge | Amount of memory allocated for native text parser RSS. | -| `questdb_memory_tag_NATIVE_TLS_RSS` | gauge | Amount of memory allocated for native TLS RSS. | -| `questdb_memory_tag_NATIVE_UNORDERED_MAP` | gauge | Amount of memory allocated for native unordered map. | -| `questdb_pg_wire_errors_total` | counter | Total number of errors in PostgreSQL wire protocol. | -| `questdb_pg_wire_select_cache_hits_total` | counter | Total number of cache hits for PostgreSQL wire protocol select queries. | -| `questdb_pg_wire_select_cache_misses_total` | counter | Total number of cache misses for PostgreSQL wire protocol select queries. | -| `questdb_wal_apply_physically_written_rows_total` | counter | Total number of physically written rows during WAL apply. | -| `questdb_wal_apply_rows_per_second` | gauge | Rate of rows applied per second during WAL apply. | -| `questdb_wal_apply_written_rows_total` | counter | Total number of rows written during WAL apply. | -| `questdb_wal_apply_seq_txn_total` | counter | Sum of all committed transaction sequence numbers. | -| `questdb_wal_apply_writer_txn_total` | counter | Sum of all transaction sequence numbers applied to tables. | -| `questdb_wal_written_rows_total` | counter | Total number of rows written to WAL. | -| `questdb_suspended_tables` | gauge | The number of tables currently in the suspended state. | -| `questdb_workers_job_start_micros_max` | gauge | Maximum time taken to start a worker job in microseconds. | -| `questdb_workers_job_start_micros_min` | gauge | Minimum time taken to start a worker job in microseconds. | +| Metric | Type | Description | +| :------------------------------------------------ | :------ | :-------------------------------------------------------------------------------------------------- | +| `questdb_commits_total` | counter | Number of total commits of all types (in-order and out-of-order) executed on the database tables. 
| +| `questdb_o3_commits_total` | counter | Number of total out-of-order (O3) commits executed on the database tables. | +| `questdb_committed_rows_total` | counter | Number of total rows committed to the database tables. | +| `questdb_physically_written_rows_total` | counter | Number of total rows physically written to disk. Greater than `committed_rows` with [out-of-order ingestion. Write amplification is `questdb_physically_written_rows_total / questdb_committed_rows_total`. | +| `questdb_rollbacks_total` | counter | Number of total rollbacks executed on the database tables. | +| `questdb_json_queries_total` | counter | Number of total REST API queries, including retries. | +| `questdb_json_queries_completed_total` | counter | Number of successfully executed REST API queries. | +| `questdb_unhandled_errors_total` | counter | Number of total unhandled errors occurred in the database. Such errors usually mean a critical service degradation in one of the database subsystems. | +| `questdb_jvm_major_gc_count_total` | counter | Number of times major JVM garbage collection was triggered. | +| `questdb_jvm_major_gc_time_total` | counter | Total time spent on major JVM garbage collection in milliseconds. | +| `questdb_jvm_minor_gc_count_total` | counter | Number of times minor JVM garbage collection pause was triggered. | +| `questdb_jvm_minor_gc_time_total` | counter | Total time spent on minor JVM garbage collection pauses in milliseconds. | +| `questdb_jvm_unknown_gc_count_total` | counter | Number of times JVM garbage collection of unknown type was triggered. Usually zero, some non-mainstream JVM implementations may show non-zero. | +| `questdb_jvm_unknown_gc_time_total` | counter | Total time spent on JVM garbage collection of unknown type in milliseconds. Usually zero, some non-mainstream JVM implementations may show non-zero. | +| `questdb_memory_tag_MMAP_DEFAULT` | gauge | Amount of memory allocated for mmaped files. 
| +| `questdb_memory_tag_NATIVE_DEFAULT` | gauge | Amount of allocated untagged native memory. | +| `questdb_memory_tag_MMAP_O3` | gauge | Amount of memory allocated for O3 mmapped files. | +| `questdb_memory_tag_NATIVE_O3` | gauge | Amount of memory allocated for O3. | +| `questdb_memory_tag_NATIVE_RECORD_CHAIN` | gauge | Amount of memory allocated for SQL record chains. | +| `questdb_memory_tag_MMAP_TABLE_WRITER` | gauge | Amount of memory allocated for table writer mmapped files. | +| `questdb_memory_tag_NATIVE_TREE_CHAIN` | gauge | Amount of memory allocated for SQL tree chains. | +| `questdb_memory_tag_MMAP_TABLE_READER` | gauge | Amount of memory allocated for table reader mmapped files. | +| `questdb_memory_tag_NATIVE_COMPACT_MAP` | gauge | Amount of memory allocated for SQL compact maps. | +| `questdb_memory_tag_NATIVE_FAST_MAP` | gauge | Amount of memory allocated for SQL fast maps. | +| `questdb_memory_tag_NATIVE_LONG_LIST` | gauge | Amount of memory allocated for long lists. | +| `questdb_memory_tag_NATIVE_HTTP_CONN` | gauge | Amount of memory allocated for HTTP connections. | +| `questdb_memory_tag_NATIVE_PGW_CONN` | gauge | Amount of memory allocated for PostgreSQL Wire Protocol connections. | +| `questdb_memory_tag_MMAP_INDEX_READER` | gauge | Amount of memory allocated for index reader mmapped files. | +| `questdb_memory_tag_MMAP_INDEX_WRITER` | gauge | Amount of memory allocated for index writer mmapped files. | +| `questdb_memory_tag_MMAP_INDEX_SLIDER` | gauge | Amount of memory allocated for indexed column view mmapped files. | +| `questdb_memory_tag_NATIVE_REPL` | gauge | Amount of memory mapped for replication tasks. | +| `questdb_memory_free_count` | gauge | Number of times native memory was freed. | +| `questdb_memory_mem_used` | gauge | Current amount of allocated native memory. | +| `questdb_memory_malloc_count` | gauge | Number of times native memory was allocated. 
| +| `questdb_memory_realloc_count` | gauge | Number of times native memory was reallocated. | +| `questdb_memory_rss` | gauge | Resident Set Size (Linux/Unix) / Working Set Size (Windows). | +| `questdb_memory_jvm_free` | gauge | Current amount of free Java memory heap in bytes. | +| `questdb_memory_jvm_total` | gauge | Current size of Java memory heap in bytes. | +| `questdb_memory_jvm_max` | gauge | Maximum amount of Java heap memory that can be allocated in bytes. | +| `questdb_http_connections` | gauge | Number of currently active HTTP connections. | +| `questdb_json_queries_cached` | gauge | Number of current cached REST API queries. | +| `questdb_line_tcp_connections` | gauge | Number of currently active InfluxDB Line Protocol TCP connections. | +| `questdb_pg_wire_connections` | gauge | Number of currently active PostgreSQL Wire Protocol connections. | +| `questdb_pg_wire_select_queries_cached` | gauge | Number of current cached PostgreSQL Wire Protocol `SELECT` queries. | +| `questdb_pg_wire_update_queries_cached` | gauge | Number of current cached PostgreSQL Wire Protocol `UPDATE` queries. | +| `questdb_json_queries_cache_hits_total` | counter | Number of total cache hits for JSON queries. | +| `questdb_json_queries_cache_misses_total` | counter | Number of total cache misses for JSON queries. | +| `questdb_json_queries_completed_total` | counter | Total number of completed JSON queries. | +| `questdb_jvm_major_gc_count_total` | counter | Total number of major garbage collection events. | +| `questdb_jvm_major_gc_time_total` | counter | Total time spent on major garbage collection. | +| `questdb_jvm_minor_gc_count_total` | counter | Total number of minor garbage collection events. | +| `questdb_jvm_minor_gc_time_total` | counter | Total time spent on minor garbage collection. | +| `questdb_jvm_unknown_gc_count_total` | counter | Total number of unknown type garbage collection events. 
| +| `questdb_jvm_unknown_gc_time_total` | counter | Total time spent on unknown type garbage collection. | +| `questdb_memory_tag_MMAP_BLOCK_WRITER` | gauge | Amount of memory allocated for block writer mmapped files. | +| `questdb_memory_tag_MMAP_IMPORT` | gauge | Amount of memory allocated for import operations. | +| `questdb_memory_tag_MMAP_PARALLEL_IMPORT` | gauge | Amount of memory allocated for parallel import operations. | +| `questdb_memory_tag_MMAP_PARTITION_CONVERTER` | gauge | Amount of memory allocated for partition converter operations. | +| `questdb_memory_tag_MMAP_SEQUENCER_METADATA` | gauge | Amount of memory allocated for sequencer metadata. | +| `questdb_memory_tag_MMAP_TABLE_WAL_READER` | gauge | Amount of memory allocated for table WAL reader mmapped files. | +| `questdb_memory_tag_MMAP_TABLE_WAL_WRITER` | gauge | Amount of memory allocated for table WAL writer mmapped files. | +| `questdb_memory_tag_MMAP_TX_LOG` | gauge | Amount of memory allocated for transaction log mmapped files. | +| `questdb_memory_tag_MMAP_TX_LOG_CURSOR` | gauge | Amount of memory allocated for transaction log cursor mmapped files. | +| `questdb_memory_tag_MMAP_UPDATE` | gauge | Amount of memory allocated for update operations. | +| `questdb_memory_tag_NATIVE_CB1` | gauge | Amount of memory allocated for native circular buffer 1. | +| `questdb_memory_tag_NATIVE_CB2` | gauge | Amount of memory allocated for native circular buffer 2. | +| `questdb_memory_tag_NATIVE_CB3` | gauge | Amount of memory allocated for native circular buffer 3. | +| `questdb_memory_tag_NATIVE_CB4` | gauge | Amount of memory allocated for native circular buffer 4. | +| `questdb_memory_tag_NATIVE_CB5` | gauge | Amount of memory allocated for native circular buffer 5. | +| `questdb_memory_tag_NATIVE_CIRCULAR_BUFFER` | gauge | Amount of memory allocated for native circular buffers. | +| `questdb_memory_tag_NATIVE_DIRECT_BYTE_SINK` | gauge | Amount of memory allocated for native direct byte sink. 
| +| `questdb_memory_tag_NATIVE_DIRECT_CHAR_SINK` | gauge | Amount of memory allocated for native direct char sink. | +| `questdb_memory_tag_NATIVE_DIRECT_UTF8_SINK` | gauge | Amount of memory allocated for native direct UTF-8 sink. | +| `questdb_memory_tag_NATIVE_FAST_MAP_INT_LIST` | gauge | Amount of memory allocated for native fast map integer list. | +| `questdb_memory_tag_NATIVE_FUNC_RSS` | gauge | Amount of memory allocated for native function RSS. | +| `questdb_memory_tag_NATIVE_GROUP_BY_FUNCTION` | gauge | Amount of memory allocated for native group by function. | +| `questdb_memory_tag_NATIVE_ILP_RSS` | gauge | Amount of memory allocated for native ILP RSS. | +| `questdb_memory_tag_NATIVE_IMPORT` | gauge | Amount of memory allocated for native import operations. | +| `questdb_memory_tag_NATIVE_INDEX_READER` | gauge | Amount of memory allocated for native index reader. | +| `questdb_memory_tag_NATIVE_IO_DISPATCHER_RSS` | gauge | Amount of memory allocated for native IO dispatcher RSS. | +| `questdb_memory_tag_NATIVE_JIT` | gauge | Amount of memory allocated for native JIT. | +| `questdb_memory_tag_NATIVE_JIT_LONG_LIST` | gauge | Amount of memory allocated for native JIT long list. | +| `questdb_memory_tag_NATIVE_JOIN_MAP` | gauge | Amount of memory allocated for native join map. | +| `questdb_memory_tag_NATIVE_LATEST_BY_LONG_LIST` | gauge | Amount of memory allocated for native latest by long list. | +| `questdb_memory_tag_NATIVE_LOGGER` | gauge | Amount of memory allocated for native logger. | +| `questdb_memory_tag_NATIVE_MIG` | gauge | Amount of memory allocated for native MIG. | +| `questdb_memory_tag_NATIVE_MIG_MMAP` | gauge | Amount of memory allocated for native MIG mmapped files. | +| `questdb_memory_tag_NATIVE_OFFLOAD` | gauge | Amount of memory allocated for native offload. | +| `questdb_memory_tag_NATIVE_PARALLEL_IMPORT` | gauge | Amount of memory allocated for native parallel import. 
| +| `questdb_memory_tag_NATIVE_PATH` | gauge | Amount of memory allocated for native path. | +| `questdb_memory_tag_NATIVE_ROSTI` | gauge | Amount of memory allocated for native rosti. | +| `questdb_memory_tag_NATIVE_SAMPLE_BY_LONG_LIST` | gauge | Amount of memory allocated for native sample by long list. | +| `questdb_memory_tag_NATIVE_SQL_COMPILER` | gauge | Amount of memory allocated for native SQL compiler. | +| `questdb_memory_tag_NATIVE_TABLE_READER` | gauge | Amount of memory allocated for native table reader. | +| `questdb_memory_tag_NATIVE_TABLE_WAL_WRITER` | gauge | Amount of memory allocated for native table WAL writer. | +| `questdb_memory_tag_NATIVE_TABLE_WRITER` | gauge | Amount of memory allocated for native table writer. | +| `questdb_memory_tag_NATIVE_TEXT_PARSER_RSS` | gauge | Amount of memory allocated for native text parser RSS. | +| `questdb_memory_tag_NATIVE_TLS_RSS` | gauge | Amount of memory allocated for native TLS RSS. | +| `questdb_memory_tag_NATIVE_UNORDERED_MAP` | gauge | Amount of memory allocated for native unordered map. | +| `questdb_pg_wire_errors_total` | counter | Total number of errors in PostgreSQL wire protocol. | +| `questdb_pg_wire_select_cache_hits_total` | counter | Total number of cache hits for PostgreSQL wire protocol select queries. | +| `questdb_pg_wire_select_cache_misses_total` | counter | Total number of cache misses for PostgreSQL wire protocol select queries. | +| `questdb_wal_apply_physically_written_rows_total` | counter | Total number of physically written rows during WAL apply. | +| `questdb_wal_apply_rows_per_second` | gauge | Rate of rows applied per second during WAL apply. | +| `questdb_wal_apply_written_rows_total` | counter | Total number of rows written during WAL apply. | +| `questdb_wal_apply_seq_txn_total` | counter | Sum of all committed transaction sequence numbers. | +| `questdb_wal_apply_writer_txn_total` | counter | Sum of all transaction sequence numbers applied to tables. 
| +| `questdb_wal_written_rows_total` | counter | Total number of rows written to WAL. | +| `questdb_suspended_tables` | gauge | The number of tables currently in the suspended state. | +| `questdb_workers_job_start_micros_max` | gauge | Maximum time taken to start a worker job in microseconds. | +| `questdb_workers_job_start_micros_min` | gauge | Minimum time taken to start a worker job in microseconds. | All of the above metrics are volatile, i.e. they're collected since the current database start. From cd56ecd0c9e7addb835ca6ce9bda6e55dcb44846 Mon Sep 17 00:00:00 2001 From: Marko Topolnik Date: Tue, 27 May 2025 15:08:39 +0200 Subject: [PATCH 4/7] Skeleton for Monitoring and Alerting page --- documentation/operations/monitoring-alerting.md | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) create mode 100644 documentation/operations/monitoring-alerting.md diff --git a/documentation/operations/monitoring-alerting.md b/documentation/operations/monitoring-alerting.md new file mode 100644 index 00000000..f7e9820a --- /dev/null +++ b/documentation/operations/monitoring-alerting.md @@ -0,0 +1,16 @@ +--- +title: Monitoring and alerting +description: Shows you how to set up to monitor your database for potential issues, and how to raise alerts +--- + +## Basic health check + +## Alert on critical errors + +## Detect suspended tables + +## Detect slow ingestion + +## Detect slow queries + +## Detect potential causes of performance issues From 4a30618483d077d2540daa39d765495780d40041 Mon Sep 17 00:00:00 2001 From: Marko Topolnik Date: Tue, 27 May 2025 16:30:34 +0200 Subject: [PATCH 5/7] Auto-style on logging-metrics page --- documentation/operations/logging-metrics.md | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/documentation/operations/logging-metrics.md b/documentation/operations/logging-metrics.md index 07ba5f56..cc4162af 100644 --- a/documentation/operations/logging-metrics.md +++ b/documentation/operations/logging-metrics.md 
@@ -6,7 +6,7 @@ description: Configure and understand QuestDB logging and metrics, including log import { ConfigTable } from "@theme/ConfigTable" import httpMinimalConfig from "./_http-minimal.config.json" -This page outlines logging in QuestDB. It covers how to configure logs via `log.conf` and expose metrics via Prometheus. +This page outlines logging in QuestDB. It covers how to configure logs via `log.conf` and expose metrics via Prometheus. - [Logging](/docs/operations/logging-metrics/#logging) - [Metrics](/docs/operations/logging-metrics/#metrics) @@ -48,10 +48,10 @@ QuestDB provides the following types of log information: For more information, see the [QuestDB source code](https://github.com/questdb/questdb/blob/master/core/src/main/java/io/questdb/log/LogLevel.java). - ### Example log messages Advisory: + ``` 2023-02-24T14:59:45.076113Z A server-main Config: 2023-02-24T14:59:45.076130Z A server-main - http.enabled : true @@ -60,23 +60,27 @@ Advisory: ``` Critical: + ``` 2022-08-08T11:15:13.040767Z C i.q.c.p.WriterPool could not open [table=`sys.text_import_log`, thread=1, ex=could not open read-write [file=/opt/homebrew/var/questdb/db/sys.text_import_log/_todo_], errno=13] ``` Error: + ``` 2023-02-24T14:59:45.059012Z I i.q.c.t.t.InputFormatConfiguration loading input format config [resource=/text_loader.json] 2023-03-20T08:38:17.076744Z E i.q.c.l.u.AbstractLineProtoUdpReceiver could not set receive buffer size [fd=140, size=8388608, errno=55] ``` Info: + ``` 2020-04-15T16:42:32.879970Z I i.q.c.TableReader new transaction [txn=2, transientRowCount=1, fixedRowCount=1, maxTimestamp=1585755801000000, attempts=0] 2020-04-15T16:42:32.880051Z I i.q.g.FunctionParser call to_timestamp('2020-05-01:15:43:21','yyyy-MM-dd:HH:mm:ss') -> to_timestamp(Ss) ``` Debug: + ``` 2023-03-31T11:47:05.723715Z D i.q.g.FunctionParser call cast(investmentMill,INT) -> cast(Li) 2023-03-31T11:47:05.723729Z D i.q.g.FunctionParser call rnd_symbol(4,4,4,2) -> rnd_symbol(iiii) @@ -206,10 
+210,10 @@ The following configuration options can be set in your `server.conf`: On systems with [8 Cores and less](/docs/operations/capacity-planning/#cpu-cores), contention -for threads might increase the latency of health check service responses. If you use -a load balancer thinks the QuestDB service is dead with nothing apparent in the -QuestDB logs, you may need to configure a dedicated thread pool for the health -check service. To do so, increase `http.min.worker.count` to `1`. +for threads might increase the latency of health check service responses. If you +use a load balancer, and it thinks the QuestDB service is dead with nothing +apparent in the QuestDB logs, you may need to configure a dedicated thread pool +for the health check service. To do so, increase `http.min.worker.count` to `1`. ::: From 0644a514a734ea7ea78cccf9173f56d33cef4cdb Mon Sep 17 00:00:00 2001 From: Marko Topolnik Date: Tue, 27 May 2025 16:30:55 +0200 Subject: [PATCH 6/7] Add Monitoring and Alerting content --- .../operations/monitoring-alerting.md | 67 +++++++++++++++++++ documentation/sidebars.js | 1 + 2 files changed, 68 insertions(+) diff --git a/documentation/operations/monitoring-alerting.md b/documentation/operations/monitoring-alerting.md index f7e9820a..08b467eb 100644 --- a/documentation/operations/monitoring-alerting.md +++ b/documentation/operations/monitoring-alerting.md @@ -5,12 +5,79 @@ description: Shows you how to set up to monitor your database for potential issu ## Basic health check +QuestDB comes with an out-of-the-box health check HTTP endpoint: + +```shell title="GET health status of local instance" +curl -v http://127.0.0.1:9003 +``` + +Getting an OK response means the QuestDB process is up and running. This method +provides no further information. + +If you allocate 8 vCPUs/cores or less to QuestDB, the HTTP server thread may not +be able to get enough CPU time to respodn in a timely manner. Your load balancer +may flag the instance as dead. 
In such a case, create an isolated thread pool +just for the health check service (the `min` HTTP server), by setting this +configuration option: + +```text +http.min.worker.count=1 +``` + ## Alert on critical errors +QuestDB includes a log writer that sends any message logged at critical level to +Prometheus Alertmanager over a TCP/IP socket. To configure this writer, add it +to the `writers` config alongside other log writers. This is the basic setup: + +```ini title="log.conf" +writers=stdout,alert +w.alert.class=io.questdb.log.LogAlertSocketWriter +w.alert.level=CRITICAL +``` + +For more details, see the +[Logging and metrics page](/docs/operations/logging-metrics/#prometheus-alertmanager). + ## Detect suspended tables +QuestDB exposes a Prometheus gauge called `questdb_suspended_tables`. You can set up +an alert that fires whenever this gauge shows an above-zero value. + ## Detect slow ingestion +QuestDB ingests data in two stages: first it records everything to the +Write-Ahead Log. This step is optimized for throughput and usually isn't the +bottleneck. The next step is inserting the data into the table, and this can +take longer if the data is out of order, or touches different time partitions. +You can monitor the overall performance of this process of applying the WAL +data to tables. QuestDB exposes two Prometheus counters for this: + +1. `questdb_wal_apply_seq_txn_total`: sum of all committed transaction sequence numbers +2. `questdb_wal_apply_writer_txn_total`: sum of all transaction sequence numbers applied to tables + +Both of these numbers are continuously growing as the data is ingested. When +they are equal, all WAL data has been applied to the tables. While data is being +actively ingested, the second counter will lag behind the first one. A steady +difference between them is a sign of healthy rate of WAL application, the +database keeping up with the demand.
However, if the difference continously +rises, this indicates that either a table has become suspended and WAL can't be +applied to it, or QuestDB is not able to keep up with the ingestion rate. All of +the data is still safely stored, but a growing portion of it is not yet visible +to queries. + +You can create an alert that detects a steadily increasing difference between +these two numbers. It won't tell you which table is experiencing issues, but it +is a low-impact way to detect there's a problem which needs further diagnosing. + ## Detect slow queries +QuestDB maintains a table called `_query_trace`, which records each executed +query and the time it took. You can query this table to find slow queries. + +Read more on query tracing on the +[Concepts page](/docs/concept/query-tracing/). + ## Detect potential causes of performance issues + +... mention interesting Prometheus metrics ... diff --git a/documentation/sidebars.js b/documentation/sidebars.js index b5c7cd9e..b85e47d3 100644 --- a/documentation/sidebars.js +++ b/documentation/sidebars.js @@ -468,6 +468,7 @@ module.exports = { ] }, "operations/logging-metrics", + "operations/monitoring-alerting", "operations/data-retention", "operations/design-for-performance", "operations/updating-data", From d8f679caf5c0770b3f251146f7e29699807fc92d Mon Sep 17 00:00:00 2001 From: Marko Topolnik Date: Fri, 27 Jun 2025 11:45:59 +0200 Subject: [PATCH 7/7] Fix typos Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --- documentation/operations/monitoring-alerting.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/documentation/operations/monitoring-alerting.md b/documentation/operations/monitoring-alerting.md index 08b467eb..345dcb52 100644 --- a/documentation/operations/monitoring-alerting.md +++ b/documentation/operations/monitoring-alerting.md @@ -15,7 +15,7 @@ Getting an OK response means the QuestDB process is up and running. This method provides no further information. 
If you allocate 8 vCPUs/cores or less to QuestDB, the HTTP server thread may not -be able to get enough CPU time to respodn in a timely manner. Your load balancer +be able to get enough CPU time to respond in a timely manner. Your load balancer may flag the instance as dead. In such a case, create an isolated thread pool just for the health check service (the `min` HTTP server), by setting this configuration option: @@ -60,7 +60,7 @@ Both of these numbers are continuously growing as the data is ingested. When they are equal, all WAL data has been applied to the tables. While data is being actively ingested, the second counter will lag behind the first one. A steady difference between them is a sign of healthy rate of WAL application, the -database keeping up with the demand. However, if the difference continously +database keeping up with the demand. However, if the difference continuously rises, this indicates that either a table has become suspended and WAL can't be applied to it, or QuestDB is not able to keep up with the ingestion rate. All of the data is still safely stored, but a growing portion of it is not yet visible