All OpenBao telemetry metrics
For completeness, we provide a full list of available metrics below in
alphabetical order by name.
Full metric list
database.Close
Metric type | Value | Description |
---|
summary | ms | Time required to close a database secrets engine (across all database secrets engines) |
database.Close.error
Metric type | Value | Description |
---|
counter | number | Number of errors encountered across all database secrets engines while closing database connections |
database.CreateUser
Metric type | Value | Description |
---|
summary | ms | Time required to create a user across all database secrets engines |
database.CreateUser.error
Metric type | Value | Description |
---|
counter | number | Number of errors encountered across all database secrets engines while creating users |
database.Initialize
Metric type | Value | Description |
---|
summary | ms | Time required to initialize a database secrets engine (across all database secrets engines) |
database.Initialize.error
Metric type | Value | Description |
---|
counter | number | Number of errors encountered across all database secrets engines while initializing the database |
database.{NAME}.Close
Metric type | Value | Description |
---|
summary | ms | Time required to close the database secrets engine {NAME} |
database.{NAME}.Close.error
Metric type | Value | Description |
---|
counter | number | Number of errors encountered for the named database secrets engine while closing database connections |
database.{NAME}.CreateUser
Metric type | Value | Description |
---|
summary | ms | Time required to create a user for the named database secrets engine |
database.{NAME}.CreateUser.error
Metric type | Value | Description |
---|
counter | number | Number of errors encountered for the named database secrets engine while creating users |
database.{NAME}.Initialize
Metric type | Value | Description |
---|
summary | ms | Time required to initialize the named database secrets engine |
database.{NAME}.Initialize.error
Metric type | Value | Description |
---|
counter | number | Number of errors encountered for the named database secrets engine while initializing the database |
database.{NAME}.RenewUser
Metric type | Value | Description |
---|
summary | ms | Time required to renew a user for the named database secrets engine |
database.{NAME}.RenewUser.error
Metric type | Value | Description |
---|
counter | number | Number of errors encountered for the named database secrets engine while renewing users |
database.{NAME}.RevokeUser
Metric type | Value | Description |
---|
summary | ms | Time required to revoke a user for the named database secrets engine |
database.{NAME}.RevokeUser.error
Metric type | Value | Description |
---|
counter | number | Number of errors encountered for the named database secrets engine while revoking users |
database.RenewUser
Metric type | Value | Description |
---|
summary | ms | Time required to renew a user across all database secrets engines |
database.RenewUser.error
Metric type | Value | Description |
---|
counter | number | Number of errors encountered across all database secrets engines while renewing users |
database.RevokeUser
Metric type | Value | Description |
---|
summary | ms | Time required to revoke a user across all database secrets engines |
database.RevokeUser.error
Metric type | Value | Description |
---|
counter | number | Number of errors encountered across all database secrets engines while revoking users |
secrets.pki.tidy.cert_store_current_entry
Metric type | Value | Description |
---|
gauge | number | Index of the certificate store entry currently being verified by the tidy operation |
secrets.pki.tidy.cert_store_deleted_count
Metric type | Value | Description |
---|
counter | number | Number of entries deleted from the certificate store |
secrets.pki.tidy.cert_store_total_entries_remaining
Metric type | Value | Description |
---|
gauge | number | Number of entries in the certificate store checked, but not removed, during the tidy operation |
secrets.pki.tidy.cert_store_total_entries
Metric type | Value | Description |
---|
gauge | number | Number of entries in the certificate store to verify during the tidy operation |
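As a rough illustration, the current-entry and total-entries gauges above can be combined into a progress estimate for a running tidy operation. The sketch below is not part of OpenBao; the function name is hypothetical, and it assumes the current entry index counts entries already visited.

```python
def tidy_progress(current_entry: int, total_entries: int) -> float:
    """Estimate the fraction of certificate store entries visited so far.

    current_entry: value of secrets.pki.tidy.cert_store_current_entry
    total_entries: value of secrets.pki.tidy.cert_store_total_entries
    """
    if total_entries <= 0:
        # Nothing to verify (or no tidy operation has reported totals yet).
        return 0.0
    # Clamp to 1.0 in case the gauges are sampled at slightly different times.
    return min(current_entry / total_entries, 1.0)


if __name__ == "__main__":
    print(f"tidy progress: {tidy_progress(250, 1000):.0%}")
```

The same calculation applies to the revoked_cert_current_entry and revoked_cert_total_entries gauges.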
secrets.pki.tidy.duration
Metric type | Value | Description |
---|
summary | ms | Time required to complete the PKI tidy operation |
secrets.pki.tidy.failure
Metric type | Value | Description |
---|
counter | number | Number of times the PKI tidy operation failed to finish due to errors |
secrets.pki.tidy.revoked_cert_current_entry
Metric type | Value | Description |
---|
gauge | number | Index of the revoked certificate store entry currently being verified by the tidy operation |
secrets.pki.tidy.revoked_cert_deleted_count
Metric type | Value | Description |
---|
counter | number | Number of entries deleted from the certificate store for revoked certificates |
secrets.pki.tidy.revoked_cert_total_entries_fixed_issuers
Metric type | Value | Description |
---|
gauge | number | Number of entries in the certificate store found to have incorrect issuer information that were fixed during the tidy operation |
secrets.pki.tidy.revoked_cert_total_entries_incorrect_issuers
Metric type | Value | Description |
---|
gauge | number | Total number of entries in the certificate store found to have incorrect issuer information |
secrets.pki.tidy.revoked_cert_total_entries_remaining
Metric type | Value | Description |
---|
gauge | number | Number of revoked certificates in the certificate store checked, but not removed, during the tidy operation |
secrets.pki.tidy.revoked_cert_total_entries
Metric type | Value | Description |
---|
gauge | number | Number of revoked certificate entries in the certificate store to be verified during the tidy operation |
secrets.pki.tidy.start_time_epoch
Metric type | Value | Description |
---|
gauge | seconds | Epoch time (seconds since 1970-01-01) when the PKI tidy operation began |
The start time metric reports a value of 0 if the PKI tidy operation is not
currently active.
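When consuming this gauge, it helps to treat 0 as "no tidy operation running" rather than as a real timestamp. The following is a minimal sketch of that handling; the function name is hypothetical and the value is assumed to have been scraped already.

```python
from datetime import datetime, timezone
from typing import Optional


def tidy_start_time(epoch_seconds: float) -> Optional[datetime]:
    """Convert secrets.pki.tidy.start_time_epoch to a UTC datetime.

    Returns None when the gauge is 0, which means no tidy operation
    is currently active.
    """
    if epoch_seconds == 0:
        return None
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


if __name__ == "__main__":
    print(tidy_start_time(0))
    print(tidy_start_time(1700000000))
```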
secrets.pki.tidy.success
Metric type | Value | Description |
---|
counter | number | Number of times the PKI tidy operation completed successfully |
vault.audit.{DEVICE}.log_request_failure
Metric type | Value | Description |
---|
counter | number | Number of audit log request failures |
vault.audit.{DEVICE}.log_request
Metric type | Value | Description |
---|
summary | ms | Time required to complete all audit log requests across the device |
vault.audit.{DEVICE}.log_response_failure
Metric type | Value | Description |
---|
counter | number | Number of audit log response failures |
vault.audit.{DEVICE}.log_response
Metric type | Value | Description |
---|
summary | ms | Time required to complete all audit log responses across the device |
vault.audit.log_request_failure
Metric type | Value | Description |
---|
counter | number | Number of audit log request failures across all devices |
The number of request failures is a crucial metric. A non-zero value for
vault.audit.log_request_failure indicates that all your configured audit
devices failed to log a request (or response). If OpenBao cannot properly audit
a request, or the response to a request, the original request will fail.
Refer to the OpenBao logs and any device-specific metrics to troubleshoot the
failing audit log device.
vault.audit.log_request
Metric type | Value | Description |
---|
summary | ms | Time required to complete all audit log requests across all audit log devices |
vault.audit.log_response_failure
Metric type | Value | Description |
---|
counter | number | Number of audit log response failures across all devices |
The number of response failures is a crucial metric. A non-zero value for
vault.audit.log_response_failure indicates that one of the configured audit log
devices failed to respond to OpenBao. If OpenBao cannot properly audit a
request, or the response to a request, the original request will fail.
Refer to the device-specific metrics and logs to troubleshoot the failing audit
log device.
vault.audit.log_response
Metric type | Value | Description |
---|
summary | ms | Time required to complete audit log responses across all audit log devices |
vault.autopilot.failure_tolerance
Metric type | Value | Description |
---|
gauge | nodes | The number of healthy nodes in excess of quorum |
The failure tolerance indicates how many currently healthy nodes can fail without losing quorum.
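To make the definition concrete, the sketch below computes a failure tolerance from a voter count, assuming the usual raft majority quorum of floor(n / 2) + 1. It illustrates the arithmetic only and is not Autopilot's implementation; the function and parameter names are hypothetical.

```python
def failure_tolerance(healthy_voters: int, total_voters: int) -> int:
    """Return how many healthy voters can fail before quorum is lost.

    Assumes a simple majority quorum: floor(total_voters / 2) + 1.
    """
    quorum = total_voters // 2 + 1
    # A cluster that is already below quorum has no tolerance left.
    return max(healthy_voters - quorum, 0)


if __name__ == "__main__":
    for healthy in (5, 4, 3):
        print(healthy, "healthy of 5 ->", failure_tolerance(healthy, 5))
```

Under this assumption, a five-node cluster with all nodes healthy tolerates two failures, and the tolerance drops to zero once only three nodes remain healthy.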
vault.autopilot.healthy
Metric type | Value | Description |
---|
gauge | boolean | Indicates whether all nodes are healthy |
- A value of 1 on the gauge means that Autopilot deems all nodes healthy.
- A value of 0 on the gauge means that Autopilot deems at least 1 node
unhealthy.
vault.autopilot.node.healthy
Metric type | Value | Description |
---|
gauge | boolean | Indicates whether the active node is healthy |
- A value of 1 on the gauge means that Autopilot deems the node indicated by
node_id healthy.
- A value of 0 on the gauge means that Autopilot cannot communicate with the
node indicated by node_id, or deems the node unhealthy.
vault.barrier.delete
Metric type | Value | Description |
---|
summary | ms | Time required to complete a DELETE operation at the barrier |
vault.barrier.get
Metric type | Value | Description |
---|
summary | ms | Time required to complete a GET operation at the barrier |
vault.barrier.list
Metric type | Value | Description |
---|
summary | ms | Time required to complete a LIST operation at the barrier |
vault.barrier.put
Metric type | Value | Description |
---|
summary | ms | Time required to complete a PUT operation at the barrier |
vault.cache.delete
Metric type | Value | Description |
---|
counter | number | Number of deletes from the LRU cache |
vault.cache.hit
Metric type | Value | Description |
---|
counter | number | Number of hits against the LRU cache that avoided a read from configured storage |
vault.cache.miss
Metric type | Value | Description |
---|
counter | number | Number of misses against the LRU cache that required a read from configured storage |
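The hit and miss counters are often easier to interpret as a single hit ratio. A minimal sketch, assuming both counters were sampled over the same window (the function name is hypothetical):

```python
def cache_hit_ratio(hits: int, misses: int) -> float:
    """Return the fraction of cache lookups served without a storage read.

    hits:   value of vault.cache.hit over the sampling window
    misses: value of vault.cache.miss over the same window
    """
    total = hits + misses
    if total == 0:
        # No lookups in the window; report 0.0 rather than dividing by zero.
        return 0.0
    return hits / total


if __name__ == "__main__":
    print(f"hit ratio: {cache_hit_ratio(90, 10):.2f}")
```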
vault.cache.write
Metric type | Value | Description |
---|
counter | number | Number of writes to the LRU cache |
vault.core.active
Metric type | Value | Description |
---|
gauge | boolean | Indicates whether the OpenBao node is active |
- A value of 1 indicates that the node is active.
- A value of 0 indicates that the node is in standby.
vault.core.check_token
Metric type | Value | Description |
---|
summary | ms | Time required to complete a token check |
vault.core.fetch_acl_and_token
Metric type | Value | Description |
---|
summary | ms | Time required to fetch ACL and token entries |
vault.core.handle_login_request
Metric type | Value | Description |
---|
summary | ms | Time required to complete a login request |
vault.core.handle_request
Metric type | Value | Description |
---|
summary | ms | Time required to complete a non-login request |
vault.core.in_flight_requests
Metric type | Value | Description |
---|
gauge | requests | Number of requests currently in progress |
vault.core.leadership_lost
Metric type | Value | Description |
---|
summary | ms | Total time that a high-availability cluster node last maintained leadership |
Leadership time updates occur whenever leadership changes. Frequent updates to
vault.core.leadership_lost with low leadership times indicate flapping as
leader status rotates between nodes.
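One way to act on this is to count how many recent leadership periods were short. The sketch below is an illustration only: the thresholds (three short periods, each under 60 seconds) are arbitrary assumptions, not OpenBao recommendations, and the function name is hypothetical.

```python
from typing import Iterable


def is_flapping(
    leadership_ms: Iterable[float],
    min_changes: int = 3,
    short_ms: float = 60_000.0,
) -> bool:
    """Flag flapping from recent vault.core.leadership_lost samples.

    leadership_ms: leadership durations (ms) observed in the window
    min_changes:   assumed number of short periods that counts as flapping
    short_ms:      assumed cutoff below which a leadership period is "short"
    """
    short_periods = [d for d in leadership_ms if d < short_ms]
    return len(short_periods) >= min_changes


if __name__ == "__main__":
    print(is_flapping([5_000, 8_000, 3_000, 900_000]))
    print(is_flapping([900_000, 1_200_000]))
```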
vault.core.leadership_setup_failed
Metric type | Value | Description |
---|
summary | ms | Time taken by the most recent leadership setup failure |
Setup failure time is an important health metric for your high-availability
OpenBao installation. We strongly recommend that you closely monitor
vault.core.leadership_setup_failed and set alerts that keep you informed of
the overall cluster leadership status.
vault.core.license.expiration_time_epoch
Metric type | Value | Description |
---|
gauge | timestamp | Epoch time (seconds since 1970-01-01) at which the license will expire |
vault.core.locked_users
Metric type | Value | Description |
---|
gauge | users | The number of users currently locked out of OpenBao |
The number of locked users refreshes every 15 minutes.
vault.core.mount_table.num_entries
Metric type | Value | Description |
---|
gauge | objects | Number of mounts in the given mount table |
Mountpoint count metrics include labels to indicate whether the relevant table
is an authentication table or a logical table and whether the table is
replicated or local.
vault.core.mount_table.size
Metric type | Value | Description |
---|
gauge | bytes | The current size of the relevant mount table. |
Table size metrics include labels to indicate whether the relevant table is an
authentication table or a logical table and whether the table is replicated or
local.
vault.core.post_unseal
Metric type | Value | Description |
---|
summary | ms | Time required to complete post-unseal operations |
vault.core.pre_seal
Metric type | Value | Description |
---|
summary | ms | Time required to complete pre-seal operations |
vault.core.seal-internal
Metric type | Value | Description |
---|
summary | ms | Time required to complete internal OpenBao seal operations |
vault.core.seal-with-request
Metric type | Value | Description |
---|
summary | ms | Time required to complete seal operations that were triggered by explicit request |
vault.core.step_down
Metric type | Value | Description |
---|
summary | ms | Time required to step down cluster leadership |
vault.core.unseal
Metric type | Value | Description |
---|
summary | ms | Time required to complete unseal operations |
vault.core.unsealed
Metric type | Value | Description |
---|
gauge | boolean | Indicates whether OpenBao is currently unsealed |
- A value of 1 indicates OpenBao is currently unsealed and clients can
read secrets.
- A value of 0 indicates OpenBao is currently sealed and clients cannot
read secrets.
vault.expire.fetch-lease-times-by-token
Metric type | Value | Description |
---|
summary | ms | Time taken to retrieve lease times by token |
vault.expire.fetch-lease-times
Metric type | Value | Description |
---|
summary | ms | Time taken to retrieve lease times |
vault.expire.job_manager.queue_length
Metric type | Value | Description |
---|
summary | leases | The total number of pending revocation jobs by queue_id |
The queue ID in the queue_id label indicates the mount accessor associated
with the expiring lease, for example, the secrets engine or authentication
method.
vault.expire.job_manager.total_jobs
Metric type | Value | Description |
---|
summary | leases | The total number of pending revocation jobs |
vault.expire.lease_expiration
Metric type | Value | Description |
---|
counter | number | The number of lease expirations to date |
vault.expire.lease_expiration.error
Metric type | Value | Description |
---|
counter | number | The total number of lease expiration errors |
vault.expire.lease_expiration.time_in_queue
Metric type | Value | Description |
---|
summary | ms | Time taken for a lease to get to the front of the revoke queue |
vault.expire.leases.by_expiration
Metric type | Value | Description |
---|
gauge | leases | The number of leases set to expire, grouped by the configured interval |
The relevant time intervals are defined in the telemetry stanza for your
OpenBao server configuration with the following parameters:
- lease_metrics_epsilon: 1 hour (default)
- num_lease_metrics_buckets: 168 hours (default)
- add_lease_metrics_namespace_labels: false (default)
OpenBao reports the number of leases due to expire every lease_metrics_epsilon
interval in the time period current_time + num_lease_metrics_buckets.
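The bucketing described above can be sketched as follows. This illustrates the grouping idea, not OpenBao's actual implementation: the function name is hypothetical, times are plain epoch seconds, and dropping leases that fall outside the reporting window is an assumption.

```python
from collections import Counter
from typing import Dict, Iterable


def bucket_leases(
    expire_times: Iterable[float],
    now: float,
    epsilon_s: int = 3600,   # lease_metrics_epsilon default: 1 hour
    num_buckets: int = 168,  # num_lease_metrics_buckets default
) -> Dict[int, int]:
    """Count leases per expiration bucket, keyed by bucket index.

    Bucket i covers [now + i * epsilon_s, now + (i + 1) * epsilon_s).
    """
    counts: Counter = Counter()
    for t in expire_times:
        offset = t - now
        if offset < 0:
            continue  # already expired; not part of the forward window
        idx = int(offset // epsilon_s)
        if idx < num_buckets:  # assumption: ignore leases beyond the window
            counts[idx] += 1
    return dict(counts)


if __name__ == "__main__":
    print(bucket_leases([10, 3599, 3600, 7300], now=0))
```

With the defaults, the window spans 168 one-hour buckets, which is one week of upcoming expirations.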
vault.expire.num_irrevocable_leases
Metric type | Value | Description |
---|
gauge | leases | The number of leases that cannot be automatically revoked |
vault.expire.num_leases
Metric type | Value | Description |
---|
gauge | leases | The total number of leases eligible for eventual expiry |
vault.expire.register-auth
Metric type | Value | Description |
---|
summary | ms | Time taken to register leases associated with new service tokens |
vault.expire.register
Metric type | Value | Description |
---|
summary | ms | Time taken for register operations |
vault.expire.renew-token
Metric type | Value | Description |
---|
summary | ms | Time taken to renew a token |
vault.expire.renew
Metric type | Value | Description |
---|
summary | ms | Time taken to renew a lease |
vault.expire.revoke-by-token
Metric type | Value | Description |
---|
summary | ms | Time taken to revoke all secrets issued with a given token |
vault.expire.revoke-force
Metric type | Value | Description |
---|
summary | ms | Time taken to forcibly revoke a token |
vault.expire.revoke-prefix
Metric type | Value | Description |
---|
summary | ms | Time taken to revoke all tokens on a prefix |
vault.expire.revoke
Metric type | Value | Description |
---|
summary | ms | Time taken to revoke a token |
vault.ha.rpc.client.echo
Metric type | Value | Description |
---|
summary | ms | Time taken to send an echo request from a standby to the active node (also emitted by perf standbys) |
vault.ha.rpc.client.echo.errors
Metric type | Value | Description |
---|
counter | number | Number of standby echo request failures (also emitted by perf standbys) |
vault.ha.rpc.client.forward
Metric type | Value | Description |
---|
summary | ms | Time taken to forward a request from a standby to the active node |
vault.ha.rpc.client.forward.errors
Metric type | Value | Description |
---|
counter | number | Number of standby request forwarding failures |
vault.identity.entity.alias.count
Metric type | Value | Description |
---|
gauge | aliases | The number of identity entity aliases (per authN mount) currently stored in OpenBao |
OpenBao updates the alias count every usage_gauge_period interval.
vault.identity.entity.count
Metric type | Value | Description |
---|
gauge | entities | The number of identity entities (per namespace) currently stored in OpenBao |
vault.identity.entity.creation
Metric type | Value | Description |
---|
counter | number | The number of identity entities created per namespace |
vault.identity.num_entities
Metric type | Value | Description |
---|
gauge | entities | The total number of identity entities currently stored in OpenBao |
vault.identity.upsert_entity_txn
Metric type | Value | Description |
---|
summary | ms | Time required to upsert an entity to the in-memory database and, on the active node, persist the data to storage |
vault.identity.upsert_group_txn
Metric type | Value | Description |
---|
summary | ms | Time required to upsert group membership to the in-memory database and, on the active node, persist the data to storage |
vault.logshipper.buffer.length
Metric type | Value | Description |
---|
gauge | buffer entries | Current length of the log shipper buffer |
vault.logshipper.buffer.max_length
Metric type | Value | Description |
---|
gauge | buffer entries | Maximum length of the log shipper buffer seen to date |
vault.logshipper.buffer.max_size
Metric type | Value | Description |
---|
gauge | bytes | Maximum allowable size of the log shipper buffer |
vault.logshipper.buffer.size
Metric type | Value | Description |
---|
gauge | bytes | Current size of the log shipper buffer |
vault.logshipper.streamWALs.guard_found
Metric type | Value | Description |
---|
counter | number | Number of times OpenBao began streaming WAL entries and found a starting index in the Merkle tree |
vault.logshipper.streamWALs.missing_guard
Metric type | Value | Description |
---|
counter | number | Number of times OpenBao began streaming WAL entries without finding a starting index in the Merkle tree |
vault.logshipper.streamWALs.scanned_entries
Metric type | Value | Description |
---|
summary | entries | Number of entries scanned in the buffer before OpenBao found the correct entry |
vault.merkle.flushDirty
Metric type | Value | Description |
---|
summary | ms | The average time required to flush dirty pages to storage |
vault.merkle.flushDirty.num_pages
Metric type | Value | Description |
---|
gauge | pages | Number of pages flushed |
vault.merkle.flushDirty.outstanding_pages
Metric type | Value | Description |
---|
gauge | pages | Number of dirty pages waiting to be flushed |
vault.merkle.saveCheckpoint
Metric type | Value | Description |
---|
summary | ms | The average time required to save a checkpoint |
vault.merkle.saveCheckpoint.num_dirty
Metric type | Value | Description |
---|
gauge | pages | Number of dirty pages at checkpoint |
vault.metrics.collection
Metric type | Value | Description |
---|
summary | ms | The average time required (per gauge type) to collect usage data |
vault.metrics.collection.error
Metric type | Value | Description |
---|
counter | number | The total number of errors (per gauge type) that OpenBao encountered while collecting usage data |
vault.metrics.collection.interval
Metric type | Value | Description |
---|
summary | time duration | The current value of usage_gauge_period |
vault.policy.delete_policy
Metric type | Value | Description |
---|
summary | ms | Time required to delete a policy |
vault.policy.get_policy
Metric type | Value | Description |
---|
summary | ms | Time required to read a policy |
vault.policy.list_policies
Metric type | Value | Description |
---|
summary | ms | Time required to list all policies |
vault.policy.set_policy
Metric type | Value | Description |
---|
summary | ms | Time required to set a policy |
vault.quota.lease_count.counter
Metric type | Value | Description |
---|
gauge | lease | Total number of leases associated with the named quota rule |
The number of leases reported is specific to the quota rule listed in the name
label, not the number of leases in general. For example, if the named rule
allows for 50 leases max and there are currently 40 leases in the scope of that
quota rule, the value of vault.quota.lease_count.counter is 40 even if there
are 1000 other leases that are unscoped or in the scope of other quota rules.
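Pairing this gauge with vault.quota.lease_count.max gives a per-rule utilization figure. A minimal sketch using the numbers from the example above (the function name is hypothetical):

```python
def quota_utilization(counter: int, max_leases: int) -> float:
    """Return the fraction of a lease count quota currently in use.

    counter:    value of vault.quota.lease_count.counter for the rule
    max_leases: value of vault.quota.lease_count.max for the same rule
    """
    if max_leases <= 0:
        raise ValueError("max_leases must be positive")
    return counter / max_leases


if __name__ == "__main__":
    # 40 leases in scope of a rule that allows 50; unrelated leases
    # elsewhere in the cluster do not affect the result.
    print(f"{quota_utilization(40, 50):.0%}")
```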
vault.quota.lease_count.max
Metric type | Value | Description |
---|
gauge | lease | Maximum number of leases allowed by the named quota rule |
vault.quota.lease_count.violation
Metric type | Value | Description |
---|
counter | number | Number of requests rejected due to exceeding the named lease count quota |
vault.quota.rate_limit.violation
Metric type | Value | Description |
---|
counter | number | Number of requests rejected due to exceeding the named rate limit quota rule |
vault.raft_storage.bolt.cursor.count
Metric type | Value | Description |
---|
gauge | number | Number of cursors created in the Bolt database |
vault.raft_storage.bolt.freelist.allocated_bytes
Metric type | Value | Description |
---|
gauge | bytes | Total space allocated for the freelist for the Bolt database |
vault.raft_storage.bolt.freelist.free_pages
Metric type | Value | Description |
---|
gauge | number | Number of free pages in the freelist for the Bolt database |
vault.raft_storage.bolt.freelist.pending_pages
Metric type | Value | Description |
---|
gauge | number | Number of pending pages in the freelist for the Bolt database |
vault.raft_storage.bolt.freelist.used_bytes
Metric type | Value | Description |
---|
gauge | bytes | Total space used by the freelist for the Bolt database |
vault.raft_storage.bolt.node.count
Metric type | Value | Description |
---|
gauge | number | Number of node allocations for the Bolt database |
vault.raft_storage.bolt.node.dereferences
Metric type | Value | Description |
---|
gauge | number | Total number of node dereferences by the Bolt database |
vault.raft_storage.bolt.page.bytes_allocated
Metric type | Value | Description |
---|
gauge | bytes | Total space allocated to the Bolt database |
vault.raft_storage.bolt.page.count
Metric type | Value | Description |
---|
gauge | number | Number of page allocations in the Bolt database |
vault.raft_storage.bolt.rebalance.count
Metric type | Value | Description |
---|
gauge | number | Number of node rebalances performed by the Bolt database |
vault.raft_storage.bolt.rebalance.time
Metric type | Value | Description |
---|
summary | ms | Time required by the Bolt database to rebalance nodes |
vault.raft_storage.bolt.spill.count
Metric type | Value | Description |
---|
gauge | number | Number of nodes spilled by the Bolt database |
vault.raft_storage.bolt.spill.time
Metric type | Value | Description |
---|
summary | ms | Total time spent spilling by the Bolt database |
vault.raft_storage.bolt.split.count
Metric type | Value | Description |
---|
gauge | number | Number of nodes split by the Bolt database |
vault.raft_storage.bolt.transaction.currently_open_read_transactions
Metric type | Value | Description |
---|
gauge | number | Number of in-process read transactions for the Bolt database |
vault.raft_storage.bolt.transaction.started_read_transactions
Metric type | Value | Description |
---|
gauge | number | Number of read transactions started by the Bolt database |
vault.raft_storage.bolt.write.count
Metric type | Value | Description |
---|
gauge | number | Number of writes performed by the Bolt database |
vault.raft_storage.bolt.write.time
Metric type | Value | Description |
---|
counter | ms | Total cumulative time the Bolt database has spent writing to disk. |
vault.raft_storage.follower.applied_index_delta
Metric type | Value | Description |
---|
gauge | number | The difference between the index applied by the leader and the index applied by the follower as reported by echoes |
vault.raft_storage.follower.last_heartbeat_ms
Metric type | Value | Description |
---|
gauge | ms | Time since the follower last received a heartbeat request |
vault.raft_storage.stats.applied_index
Metric type | Value | Description |
---|
gauge | number | Highest index of raft log last applied to the finite state machine or added to fsm_pending queue |
vault.raft_storage.stats.commit_index
Metric type | Value | Description |
---|
gauge | number | Index of the last raft log committed to disk on the node |
vault.raft_storage.stats.fsm_pending
Metric type | Value | Description |
---|
gauge | number | Number of raft logs queued by the node for the finite state machine to apply |
vault.raft-storage.delete
Metric type | Value | Description |
---|
timer | ms | Time required to insert log entry to delete path |
vault.raft-storage.entry_size
Metric type | Value | Description |
---|
summary | bytes | The total size of a raft entry during log application |
vault.raft-storage.get
Metric type | Value | Description |
---|
timer | ms | Time required to retrieve a value for the given path from the finite state machine |
vault.raft-storage.list
Metric type | Value | Description |
---|
timer | ms | Time required to list all entries under the prefix from the finite state machine |
vault.raft-storage.put
Metric type | Value | Description |
---|
timer | ms | Time required to insert a log entry to the persist path |
vault.raft-storage.transaction
Metric type | Value | Description |
---|
timer | ms | Time required to insert operations into a single log |
vault.raft.apply
Metric type | Value | Description |
---|
counter | number | Number of transactions in the configured interval |
The vault.raft.apply metric is generally a good indicator of the write load
on your raft internal storage.
vault.raft.barrier
Metric type | Value | Description |
---|
counter | number | Number of times the node started the barrier |
A node starts the barrier by issuing a blocking call when it wants to ensure
that all pending operations that need to be applied to the finite state machine
are properly queued.
vault.raft.candidate.electSelf
Metric type | Value | Description |
---|
summary | ms | Time required for a node to send a vote request to a peer |
vault.raft.commitNumLogs
Metric type | Value | Description |
---|
gauge | number | Number of logs processed for application to the finite state machine in a single batch |
vault.raft.commitTime
Metric type | Value | Description |
---|
summary | ms | Time required to commit a new entry to the raft log on the leader node |
vault.raft.compactLogs
Metric type | Value | Description |
---|
summary | ms | Time required to trim unnecessary logs |
vault.raft.fsm.apply
Metric type | Value | Description |
---|
summary | number | Number of logs committed by the finite state machine since the last interval |
vault.raft.fsm.applyBatch
Metric type | Value | Description |
---|
summary | ms | Time required by the finite state machine to apply the most recent batch of logs |
vault.raft.fsm.applyBatchNum
Metric type | Value | Description |
---|
counter | number | Number of logs applied in the most recent batch |
vault.raft.fsm.enqueue
Metric type | Value | Description |
---|
summary | ms | Time required to queue up a batch of logs for the finite state machine to apply |
vault.raft.fsm.restore
Metric type | Value | Description |
---|
summary | ms | Time required by the finite state machine to complete a restore operation from a snapshot |
vault.raft.fsm.snapshot
Metric type | Value | Description |
---|
summary | ms | Time required by the finite state machine to record state information for the current snapshot |
vault.raft.fsm.store_config
Metric type | Value | Description |
---|
summary | ms | Time required to store the most recent raft configuration |
vault.raft.get
Metric type | Value | Description |
---|
summary | ms | Time required to retrieve an entry from underlying storage |
vault.raft.leader.dispatchLog
Metric type | Value | Description |
---|
timer | ms | Time required for the leader node to write a log entry to disk |
vault.raft.leader.dispatchNumLogs
Metric type | Value | Description |
---|
gauge | number | Number of logs committed to disk in the most recent batch |
vault.raft.leader.lastContact
Metric type | Value | Description |
---|
summary | ms | Time since the leader was last able to contact the follower nodes when checking its leader lease |
vault.raft.list
Metric type | Value | Description |
---|
summary | ms | Time required to retrieve a list of keys from underlying storage |
vault.raft.peers
Metric type | Value | Description |
---|
gauge | number | The number of peers in the raft cluster configuration |
vault.raft.replication.appendEntries.log
Metric type | Value | Description |
---|
summary | number | Number of logs replicated to a node to establish parity with leader logs |
vault.raft.replication.appendEntries.rpc
Metric type | Value | Description |
---|
timer | ms | Time required to replicate leader node log entries to all follower nodes with appendEntries |
vault.raft.replication.heartbeat
Metric type | Value | Description |
---|
timer | ms | Time required to invoke appendEntries on a peer so the peer does not time out |
vault.raft.replication.installSnapshot
Metric type | Value | Description |
---|
timer | ms | Time required to process an installSnapshot RPC call |
Only nodes currently in the follower state report
vault.raft.replication.installSnapshot metrics.
vault.raft.restore
Metric type | Value | Description |
---|
counter | number | Number of times that the node performed a restore operation |
In the context of raft storage, a restore operation refers to the process where
raft consumes an external snapshot to restore its state.
vault.raft.restoreUserSnapshot
Metric type | Value | Description |
---|
timer | ms | Time required to restore the finite state machine from a user snapshot |
vault.raft.rpc.appendEntries
Metric type | Value | Description |
---|
timer | ms | Time required to process a remote appendEntries call from a node |
vault.raft.rpc.appendEntries.processLogs
Metric type | Value | Description |
---|
timer | ms | Time required to completely process the outstanding logs for the given node |
vault.raft.rpc.appendEntries.storeLogs
Metric type | Value | Description |
---|
timer | ms | Time required to record any outstanding logs since the last request to append entries for the given node |
vault.raft.rpc.installSnapshot
Metric type | Value | Description |
---|
timer | ms | Time required to process an installSnapshot RPC call |
Only nodes currently in the follower state report
vault.raft.rpc.installSnapshot metrics.
vault.raft.rpc.processHeartbeat
Metric type | Value | Description |
---|
timer | ms | Time required to process a heartbeat request |
vault.raft.rpc.requestVote
Metric type | Value | Description |
---|
summary | ms | Time required to complete a requestVote call |
vault.raft.snapshot.create
Metric type | Value | Description |
---|
timer | ms | Time required to capture a new snapshot |
vault.raft.snapshot.persist
Metric type | Value | Description |
---|
timer | ms | Time required to record snapshot meta information to disk while taking snapshots |
vault.raft.snapshot.takeSnapshot
Metric type | Value | Description |
---|
timer | ms | Total time required to create and persist the current snapshot |
In most cases, vault.raft.snapshot.takeSnapshot is approximately equal to
vault.raft.snapshot.create + vault.raft.snapshot.persist.
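The relationship above can be checked with a little arithmetic. The following is a minimal sketch, not part of OpenBao: the function names, the 10% tolerance, and the sample timings are all hypothetical and exist only to illustrate the comparison.

```python
def snapshot_overhead_ms(take_ms: float, create_ms: float, persist_ms: float) -> float:
    """Return the part of takeSnapshot not explained by create + persist."""
    return take_ms - (create_ms + persist_ms)


def is_consistent(take_ms: float, create_ms: float, persist_ms: float,
                  tolerance: float = 0.10) -> bool:
    """True when takeSnapshot is within `tolerance` (relative) of create + persist.

    The 10% default is an arbitrary illustrative choice, not a documented limit.
    """
    expected = create_ms + persist_ms
    if expected == 0:
        return take_ms == 0
    return abs(take_ms - expected) / expected <= tolerance


# Hypothetical timer samples in milliseconds, not measured values.
take, create, persist = 130.0, 80.0, 45.0
print(snapshot_overhead_ms(take, create, persist))
print(is_consistent(take, create, persist))
```

A large, persistent gap between the two sides would suggest time spent outside the create and persist steps, which may be worth investigating.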
vault.raft.state.candidate
Metric type | Value | Description |
---|
counter | number | Number of times the raft server initiated an election |
vault.raft.state.follower
Metric type | Value | Description |
---|
summary | number | Number of times in the configured interval that the raft server became a follower |
Nodes transition to follower state under the following conditions:
- when the node joins the cluster
- when a leader is elected, but the node was not elected leader
vault.raft.state.leader
Metric type | Value | Description |
---|
counter | number | Number of times the raft server became a leader |
vault.raft.transition.heartbeat_timeout
Metric type | Value | Description |
---|
summary | number | Number of times that the node transitioned to candidate state after not receiving a heartbeat message from the last known leader |
vault.raft.transition.leader_lease_timeout
Metric type | Value | Description |
---|
counter | number | The number of times the leader could not contact a quorum of nodes and therefore stepped down |
vault.raft.verify_leader
Metric type | Value | Description |
---|
counter | number | Number of times in the configured interval that the node confirmed it is still the leader |
vault.rollback.attempt.{MOUNTPOINT}
Metric type | Value | Description |
---|
summary | ms | Time required to perform a rollback operation on the given mount point |
vault.rollback.inflight
Metric type | Value | Description |
---|
gauge | number | Number of rollback operations inflight |
vault.rollback.queued
Metric type | Value | Description |
---|
gauge | number | The number of rollback operations waiting to be started |
vault.rollback.waiting
Metric type | Value | Description |
---|
summary | ms | Time between queueing a rollback operation and the operation starting |
vault.route.create.{MOUNTPOINT}
Metric type | Value | Description |
---|
summary | ms | Time required to send a create request to the backend and for the backend to complete the operation for the given mount point |
vault.route.delete.{MOUNTPOINT}
Metric type | Value | Description |
---|
summary | ms | Time required to send a delete request to the backend and for the backend to complete the operation for the given mount point |
vault.route.list.{MOUNTPOINT}
Metric type | Value | Description |
---|
summary | ms | Time required to send a list request to the backend and for the backend to complete the operation for the given mount point |
vault.route.read.{MOUNTPOINT}
Metric type | Value | Description |
---|
summary | ms | Time required to send a read request to the backend and for the backend to complete the operation for the given mount point |
vault.route.rollback.{MOUNTPOINT}
Metric type | Value | Description |
---|
summary | ms | Time required to send a rollback request to the backend and for the backend to complete the operation for the given mount point |
OpenBao automatically schedules and performs mount point rollback operations to
clean up partial errors.
vault.runtime.alloc_bytes
Metric type | Value | Description |
---|
gauge | bytes | Space currently allocated to OpenBao processes |
The number of allocated bytes may peak from time to time, but should
always return to a steady state value in a healthy OpenBao installation.
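One way to test the "returns to a steady state" expectation is to compare the most recent samples against the median of the series. The sketch below is illustrative only: the function name, the `tail` and `tolerance` parameters, and the sample values are assumptions, and the median is just one possible stand-in for the steady-state value.

```python
from statistics import median


def returns_to_steady_state(samples, tail=3, tolerance=0.20):
    """Check that the last `tail` samples sit within `tolerance` of the median.

    `samples` is a chronological list of vault.runtime.alloc_bytes readings.
    The median is used as a rough steady-state estimate because short peaks
    barely move it; this is an assumption of the sketch, not OpenBao behavior.
    """
    if len(samples) < tail:
        raise ValueError("need at least `tail` samples")
    baseline = median(samples)
    return all(abs(s - baseline) <= tolerance * baseline for s in samples[-tail:])


# Hypothetical readings (in MiB for readability), not measured values.
peak_then_recover = [100, 105, 98, 400, 390, 102, 99, 101]
steady_growth = [100, 120, 150, 190, 240, 300, 370, 450]

print(returns_to_steady_state(peak_then_recover))
print(returns_to_steady_state(steady_growth))
```

The first series peaks and recovers, so the check passes; the second grows without recovering, which is the pattern that would merit a closer look.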
vault.runtime.free_count
Metric type | Value | Description |
---|
gauge | number | Number of freed objects |
vault.runtime.gc_pause_ns
Metric type | Value | Description |
---|
summary | ns | Time required to complete the last garbage collection run |
vault.runtime.heap_objects
Metric type | Value | Description |
---|
gauge | number | Total number of objects on the heap in memory |
The vault.runtime.heap_objects metric is a good memory pressure indicator. We
recommend monitoring vault.runtime.heap_objects to establish an accurate
baseline and thresholds for alerting on the health of your OpenBao installation.
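As a rough illustration of establishing a baseline and threshold, the sketch below derives both from historical gauge samples using the mean plus a multiple of the standard deviation. This is one common heuristic, not a method prescribed by OpenBao; the function name, the multiplier `k`, and the sample values are hypothetical.

```python
from statistics import mean, stdev


def alert_threshold(samples, k=3.0):
    """Return (baseline, threshold) for a series of gauge readings.

    baseline  = arithmetic mean of the samples
    threshold = baseline + k * sample standard deviation
    `k` controls sensitivity and is an assumption of this sketch.
    """
    baseline = mean(samples)
    return baseline, baseline + k * stdev(samples)


# Hypothetical vault.runtime.heap_objects readings, not measured values.
history = [1000, 1100, 900, 1050, 950]
baseline, threshold = alert_threshold(history)
print(f"baseline={baseline}, alert above {threshold:.2f}")
```

The same calculation can be applied to other gauges once enough history has been collected to reflect normal load.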
vault.runtime.malloc_count
Metric type | Value | Description |
---|
gauge | number | Total number of allocated heap objects in memory |
vault.runtime.num_goroutines
Metric type | Value | Description |
---|
gauge | number | Total number of goroutines running in memory |
The vault.runtime.num_goroutines metric is a good system load indicator. We
recommend monitoring vault.runtime.num_goroutines to establish an accurate
baseline and thresholds for alerting on the health of your OpenBao installation.
vault.runtime.sys_bytes
Metric type | Value | Description |
---|
gauge | bytes | Total number of bytes allocated to OpenBao |
The total number of allocated system bytes includes space currently used by the
heap plus space that has been reclaimed by, but not returned to, the operating
system.
vault.runtime.total_gc_pause_ns
Metric type | Value | Description |
---|
gauge | ns | The total garbage collector pause time since OpenBao was last started |
vault.runtime.total_gc_runs
Metric type | Value | Description |
---|
gauge | number | The total number of garbage collection runs since OpenBao was last started |
vault.secret.kv.count
Metric type | Value | Description |
---|
gauge | number | Number of entries in each key-value secrets engine |
OpenBao organizes the key-value pair count by cluster, namespace, and mount point.
vault.secret.lease.creation
Metric type | Value | Description |
---|
counter | number | Number of leases created by secrets engines |
OpenBao organizes the lease count by cluster, namespace, secret engine, mount
point, and time to live (TTL).
vault.token.count
Metric type | Value | Description |
---|
gauge | number | Number of un-expired and un-revoked tokens available for use in the token store |
OpenBao updates the token count every 10 minutes and organizes the result by
cluster and namespace.
vault.token.count.by_auth
Metric type | Value | Description |
---|
gauge | number | Total number of service tokens created by a particular auth method |
OpenBao organizes the token count by cluster, namespace, and authentication
method.
vault.token.count.by_policy
Metric type | Value | Description |
---|
gauge | number | Total number of service tokens with a particular policy attached |
OpenBao organizes the token count by cluster, namespace, and policy. Tokens with
more than one policy attached appear in the gauge for each associated policy.
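Because a token with several policies is counted once per policy, summing the per-policy gauges can exceed the number of tokens. The sketch below demonstrates that effect with made-up data; the token identifiers, policy names, and the `count_by_policy` helper are hypothetical and do not reflect how OpenBao computes the gauge internally.

```python
from collections import Counter


def count_by_policy(tokens):
    """Count tokens per policy; a token contributes once to each of its policies.

    `tokens` maps a token identifier to the list of policies attached to it.
    """
    counts = Counter()
    for policies in tokens.values():
        for policy in set(policies):  # guard against duplicate policy names
            counts[policy] += 1
    return counts


# Hypothetical tokens and policies for illustration only.
tokens = {
    "t1": ["default"],
    "t2": ["default", "admin"],
    "t3": ["admin", "ops"],
}

per_policy = count_by_policy(tokens)
print(dict(per_policy))
print(sum(per_policy.values()), "policy attachments across", len(tokens), "tokens")
```

For an actual token total, vault.token.count is the appropriate metric rather than a sum over the per-policy gauges.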
vault.token.count.by_ttl
Metric type | Value | Description |
---|
gauge | number | Total number of service tokens assigned a particular time to live (TTL) |
OpenBao organizes the token count by cluster, namespace, and the TTL
range assigned at creation.
vault.token.create_root
Metric type | Value | Description |
---|
counter | number | Number of root tokens created |
The vault.token.create_root metric counts the total number of root tokens
created over time, not the number of root tokens currently in use. As a result,
the value of vault.token.create_root does not decrease when a root token is
revoked.
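Since the counter only records creations, a monitoring rule typically looks at increases between readings rather than at the absolute value. The sketch below assumes your metrics backend exposes the counter as a cumulative value; that assumption, the function name, and the sample readings are all hypothetical. A drop between readings is treated as a counter reset (for example, after a restart), which is also an assumption of this sketch.

```python
def root_tokens_created(readings):
    """Return the number of root tokens created between consecutive readings.

    `readings` is a chronological list of cumulative counter values.
    If a reading is lower than the previous one, the counter is assumed to
    have been reset, and the new value is taken as the increase.
    """
    deltas = []
    for earlier, later in zip(readings, readings[1:]):
        deltas.append(later - earlier if later >= earlier else later)
    return deltas


# Hypothetical cumulative readings, not measured values.
readings = [2, 2, 3, 3]
deltas = root_tokens_created(readings)
print(deltas)
print("root token created during window:", any(d > 0 for d in deltas))
```

Revoking a root token leaves these deltas unchanged, which matches the behavior described above.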
vault.token.create
Metric type | Value | Description |
---|
summary | ms | Time required to create a token in OpenBao |
vault.token.createAccessor
Metric type | Value | Description |
---|
summary | ms | Time required to create a token accessor in OpenBao |
vault.token.creation
Metric type | Value | Description |
---|
counter | number | Number of service or batch tokens created |
OpenBao organizes the creation count by cluster, namespace, authentication method,
mount point, time to live (TTL), and token type.
vault.token.lookup
Metric type | Value | Description |
---|
summary | ms | Time required to look up a token in OpenBao |
vault.token.revoke-tree
Metric type | Value | Description |
---|
summary | ms | Time required to fully revoke a token tree in OpenBao |
vault.token.revoke
Metric type | Value | Description |
---|
summary | ms | Time required to revoke a token in OpenBao |
vault.token.store
Metric type | Value | Description |
---|
summary | ms | Time required to store an updated token entry without writing to the secondary index |
vault.wal.deleteWALs
Metric type | Value | Description |
---|
summary | ms | Time required to fully delete a write-ahead log |
vault.wal.flushReady
Metric type | Value | Description |
---|
summary | ms | Time required to fully flush a write-ahead log that is ready for storage |
vault.wal.flushReady.queue_len
Metric type | Value | Description |
---|
summary | number | Current size of the write queue in the WAL system |
vault.wal.gc.deleted
Metric type | Value | Description |
---|
gauge | number | Number of write-ahead logs deleted during garbage collection |
vault.wal.gc.total
Metric type | Value | Description |
---|
gauge | number | Total number of write-ahead logs currently on disk |
vault.wal.loadWAL
Metric type | Value | Description |
---|
summary | ms | Time required to load a write-ahead log |
vault.wal.persistWALs
Metric type | Value | Description |
---|
summary | ms | Time required to persist a write-ahead log |