Getting Started

Installation

How to install IRONdb on a system.

System Requirements

IRONdb requires one of the following operating systems:

  • Ubuntu 22.04 LTS

  • Ubuntu 24.04 LTS

Additionally, IRONdb requires the ZFS filesystem. This is available natively on Ubuntu.

Hardware requirements will necessarily vary depending upon system scale and cluster size. An appendix with general guidelines for calculating cluster size is provided. Please contact us with questions regarding system sizing.

Apica recommends the following minimum system specification for the single-node, free, 25K-metrics option:

  • 1 CPU

  • 4 GB RAM

  • SSD-based storage, 20 GB available space

The following network protocols and ports are utilized. These are defaults and may be changed via configuration files.

  • 2003/tcp (Carbon plaintext submission)

  • 4242/tcp (OpenTSDB plaintext submission)

  • 8112/tcp (admin UI, HTTP REST API, cluster replication, request proxying)

  • 8112/udp (cluster gossip)

  • 8443/tcp (admin UI, HTTP REST API when TLS configuration is used)

  • 32322/tcp (admin console, localhost only)

System Tuning

IRONdb is expected to perform well on a standard installation of supported platforms, but to ensure optimal performance, there are a few tuning changes that should be made. This is especially important if you plan to push your IRONdb systems to the limit of your hardware.

Disable Swap

With systems dedicated solely to IRONdb, there is no need for swap space. Configuring no swap space during installation is ideal, but you can also run swapoff -a and comment out any swap lines in /etc/fstab.
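For example, the following commands disable swap immediately and keep it disabled across reboots. This is a minimal sketch for a dedicated node; adjust the sed pattern if your /etc/fstab formats its swap entries differently.

    # Turn off all active swap devices
    sudo swapoff -a
    # Comment out swap entries so they are not re-enabled at boot
    sudo sed -i '/\sswap\s/s/^/#/' /etc/fstab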

Disable Transparent Hugepages

Transparent hugepages (THP) can interact poorly with the ZFS ARC, causing reduced performance for IRONdb.

Disable THP by setting these two kernel options to never:

    echo never > /sys/kernel/mm/transparent_hugepage/enabled
    echo never > /sys/kernel/mm/transparent_hugepage/defrag

Making these changes persistent across reboot differs depending on distribution.

For Ubuntu, install the sysfsutils package and edit /etc/sysfs.conf, adding the following lines:

    kernel/mm/transparent_hugepage/enabled = never
    kernel/mm/transparent_hugepage/defrag = never

Note: the sysfs mount directory is automatically prepended to the attribute name.

Installation Steps

Follow these steps to get IRONdb installed on your system.

System commands must be run as a privileged user, such as root, or via sudo.

Configure Software Sources

Install the signing keys:

    sudo curl -s -o /etc/apt/trusted.gpg.d/circonus.asc \
      'https://keybase.io/circonuspkg/pgp_keys.asc?fingerprint=14ff6826503494d85e62d2f22dd15eba6d4fa648'

    sudo curl -s -o /etc/apt/trusted.gpg.d/backtrace.asc \
      https://updates.circonus.net/backtrace/ubuntu/backtrace_package_signing.key

Create the file /etc/apt/sources.list.d/circonus.list with the following contents, depending on the version:

For Ubuntu 22.04:

    deb https://updates.circonus.net/irondb/ubuntu/ jammy main
    deb https://updates.circonus.net/backtrace/ubuntu/ jammy main

For Ubuntu 24.04:

    deb https://updates.circonus.net/irondb/ubuntu/ noble main
    deb https://updates.circonus.net/backtrace/ubuntu/ noble main

Finally, run sudo apt-get update.

Install Package

There is a helper package that works around issues with dependency resolution, since IRONdb is very specific about the versions of dependent Apica packages, and apt-get is unable to cope with them. The helper package must be installed first, i.e., it cannot be installed in the same transaction as the main package.

    sudo apt-get install circonus-platform-irondb-apt-policy
    sudo apt-get install circonus-platform-irondb

Setup Process

Prepare site-specific information for setup. These values may be set via shell environment variables, or as arguments to the setup script. The environment variables are listed below.

NOTE: if you wish to use environment variables, you will need to run the install from a root shell, as sudo will clear the environment when it runs.

IRONDB_NODE_UUID

(required) The ID of the current node, which must be unique within a given cluster. You may use the uuidgen command that comes with your OS, or generate a UUID with an external tool or website. Note that this must be a well-formed, non-nil, lowercase UUID. The uuidgen tool on some systems, notably MacOS, produces uppercase. Setup will warn and convert the UUID to lowercase.

IRONDB_NODE_ADDR

(required) The IPv4 address or hostname of the current node, e.g., "192.168.1.100" or "host1.domain.com". Hostnames will be resolved to IP addresses once at service start. Failures in DNS resolution may cause service outages.

IRONDB_CHECK_UUID

(required) Check ID for Graphite, OpenTSDB, and Prometheus metric ingestion, which must be the same on all cluster nodes. You may use the uuidgen command that comes with your OS, or generate a UUID with an external tool or website. Note that this must be a well-formed, non-nil, lowercase UUID. The uuidgen tool on some systems, notably MacOS, produces uppercase. Setup will warn and convert the UUID to lowercase.

IRONDB_TLS

(optional) Configures listeners to require TLS where applicable. Default is "off". If set to "on", a second HTTPS listener will be created on port 8443, for external clients to use for metric submission and querying. Two SSL certificates will be required, utilizing different CNs. See TLS Configuration for details.

This is currently an alpha feature, for testing only.

Note that OpenTSDB does not support TLS. Even if this option is set to "on", the listener on port 4242 will not use TLS.

Because of the certificate requirement, the service will not automatically start post-setup.

IRONDB_CRASH_REPORTING

(optional) Controls enablement of automated crash reporting. Default is "on". IRONdb utilizes sophisticated crash tracing technology to help diagnose errors. Enabling crash reporting requires that the system be able to connect out to the Apica reporting endpoint: https://circonus.sp.backtrace.io:6098. If your site's network policy forbids this type of outbound connectivity, set the value to "off".

IRONDB_ZPOOL

(optional) The name of the zpool that should be used for IRONdb storage. If this is not specified and there are multiple zpools in the system, setup chooses the pool with the most available space.
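As an illustration, the required values could be exported from a root shell before running setup. The address and UUIDs below are placeholders; note that IRONDB_CHECK_UUID must be generated once and reused on every node of a cluster.

    # Generate lowercase UUIDs, as setup requires
    export IRONDB_NODE_UUID=$(uuidgen | tr '[:upper:]' '[:lower:]')
    export IRONDB_CHECK_UUID=$(uuidgen | tr '[:upper:]' '[:lower:]')
    export IRONDB_NODE_ADDR=192.168.1.100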

Run Installer

Run the setup script. All required options must be present, either as environment variables or via command-line arguments. A mix of environment variables and arguments is permitted, but environment variables take precedence over command-line arguments.

    /opt/circonus/bin/setup-irondb \
        -a <ip_or_hostname> \
        -n <node_uuid> \
        -u <integration_check_uuid>

Use the -h option to view a usage summary.

The setup script will configure your IRONdb instance and start the service. If you chose to turn on TLS support, the service will not automatically start. Once you have installed the necessary key and certificate files, enable and start the service.

Upon successful completion, it will print out specific information about how to submit Graphite, OpenTSDB, and Prometheus metrics. See the Integrations section for details.

Add License

(Optional)

IRONdb comes with an embedded license that allows all features with a limit of 25K active, unique metric streams. If you wish to obtain a more expansive license, please contact Apica Sales.

Add the <license> stanza from your purchased IRONdb license to the file /opt/circonus/etc/licenses.conf on your IRONdb instance, within the enclosing <licenses> tags. It should look something like this:

    <licenses>
      <license id="(number)" sig="(cryptographic signature)">
        <graphite>true</graphite>
        <max_streams>25000</max_streams>
        <company>MyCompany</company>
      </license>
    </licenses>

If you are running a cluster of IRONdb nodes, the license must be installed on all nodes.

Restart the IRONdb service:

  • /bin/systemctl restart circonus-irondb

For more on licensing, see Configuration/licenses.

Cluster Configuration

Additional configuration is required for clusters of more than one IRONdb node. The topology of a cluster describes the addresses and UUIDs of the participating nodes, as well as the desired number of write copies for stored data. Ownership of metric streams (deciding which node that stream's data should be written to) is determined by the topology.

The above setup script configures a single, standalone instance. If you have already been using such an instance, configuring it to be part of a cluster will cause your existing stored data to become unavailable. It is therefore preferable to complete cluster setup prior to ingesting any metric data into IRONdb.

Note for existing clusters: adding one or more nodes to an existing cluster requires a special "rebalance" operation to shift stored metric data to different nodes, as determined by a new topology. See Resizing Clusters for details.

Determine Cluster Parameters

The number and size of nodes you need is determined by several factors:

  • Frequency of measurement ingestion

  • Desired level of redundancy (write copies)

  • Minimum granularity of rollups

  • Retention period

The number of write copies determines the number of nodes that can be unavailable before metric data become inaccessible. A cluster with W write copies can survive W-1 node failures before data become inaccessible.

See the appendix on cluster sizing for details.

Topology Requirements

There are a few important considerations for IRONdb cluster topologies:

  • A specific topology is identified by a hash. IRONdb clusters always have an "active" topology, referenced by the hash.

  • The topology hash is determined using the values of id, port, and weight, as well as the ordering of the <node> stanzas. Changing any of these on a previously configured node will invalidate the topology and cause the node to refuse to start. This is a safety measure to guard against data loss.

  • UUIDs must be well-formed, non-nil, and lowercase.

  • The node address may be changed at any time without affecting the topology hash, but care should be taken not to change the ordering of any node stanzas.

  • If a node fails, its replacement should keep the same UUID, but it can have a different IP address or hostname.

Create Topology Layout

The topology layout describes the particular nodes that are part of the cluster as well as aspects of operation for the cluster as a whole, such as the number of write copies. The layout file is not read directly by IRONdb, rather it is used to create a canonical topology representation that will be referenced by the IRONdb config.

A helper script exists for creating the topology: /opt/circonus/bin/topo-helper:

    Usage: ./topo-helper [-h] -a <start address>|-A <addr_file> -w <write copies> [-i <uuid,uuid,...>|-n <node_count>] [-s]
      -a <start address> : Starting IP address (inclusive)
      -A <addr_file>     : File containing node IPs or hostnames, one per line
      -i <uuid,uuid,...> : List of (lowercased) node UUIDs
                           If omitted, UUIDs will be auto-generated
      -n <node_count>    : Number of nodes in the cluster (required if -i is omitted)
      -s                 : Create a sided configuration
      -w <write copies>  : Number of write copies
      -h                 : Show usage summary

This will create a temporary config, which you can edit afterward, if needed, before importing. There are multiple options for generating the list of IP addresses or hostnames, and for choosing the node UUIDs.

The simplest form is to give a starting IP address, a node count, and a write-copies value. For example, in a cluster of 3 nodes, where we want 2 write copies:

    /opt/circonus/bin/topo-helper -a 192.168.1.11 -n 3 -w 2

The resulting temporary config (/tmp/topology.tmp) looks like this:

    <nodes write_copies="2">
      <node id="7dffe44b-47c6-43e1-db6f-dc3094b793a8"
            address="192.168.1.11"
            apiport="8112"
            port="8112"
            weight="170"/>
      <node id="964f7a5a-6aa5-4123-c07c-8e1a4fdb8870"
            address="192.168.1.12"
            apiport="8112"
            port="8112"
            weight="170"/>
      <node id="c85237f1-b6d7-cf98-bfef-d2a77b7e0181"
            address="192.168.1.13"
            apiport="8112"
            port="8112"
            weight="170"/>
    </nodes>

The helper script auto-generated the node UUIDs. You may edit this file if needed, for example if your IP addresses are not sequential.

You may supply your own UUIDs in a comma-separated list, in which case the node count will be implied by the number of UUIDs:

    /opt/circonus/bin/topo-helper -a 192.168.1.11 -w 2 -i <uuid>,<uuid>,<uuid>

If you wish to use DNS names instead of IP addresses, you can provide them in a file, one per line:

    $ cat host_list.txt
    myhost1.example.com
    myhost2.example.com
    myhost3.example.com

Then pass the filename to the helper script:

    /opt/circonus/bin/topo-helper -A host_list.txt -n 3 -w 2

To configure a sided cluster, use the -s option. This will assign alternate nodes to side "a" or "b". If you wish to divide the list differently, you may edit the /tmp/topology.tmp file accordingly. If omitted, the cluster will be non-sided when the node count is less than 10. For clusters of 10 or more nodes, the helper script defaults to configuring a sided cluster, because there are significant operational benefits, described below.

When you are satisfied that it looks the way you want, copy /tmp/topology.tmp to /opt/circonus/etc/topology on each node, then proceed to the Import Topology step.

Sided Clusters

One additional configuration dimension is possible for IRONdb clusters. A cluster may be divided into two "sides", with the guarantee that at least one copy of each stored metric exists on each side of the cluster. For W values greater than 2, write copies will be assigned to sides as evenly as possible. Values divisible by 2 will have the same number of copies on each side, while odd-numbered W values will place the additional copy on the same side as the primary node for each metric. This allows for clusters deployed across typical failure domains such as network switches, rack cabinets or physical locations.

Even if the cluster nodes are not actually deployed across a failure domain, there are operational benefits to using a sided configuration, and as such it is highly recommended that clusters of 10 or more nodes be configured to be sided. For example, a 32-node, non-sided cluster with 2 write copies will have a partial outage of data availability if any 2 nodes are unavailable simultaneously. If the same cluster were configured with sides, then up to half the nodes (8 from side A and 8 from side B) could be unavailable and all data would still be readable.

Sided-cluster configuration is subject to the following restrictions:

  • Only 2 sides are permitted.

  • An active, non-sided cluster cannot be converted into a sided cluster as this would change the existing topology, which is not permitted. The same is true for conversion from sided to non-sided.

  • Both sides must be specified, and non-empty (in other words, it is an error to configure a sided cluster with all hosts on one side.)

To configure a sided topology, add the side attribute to each <node>, with a value of either a or b. If using the topo-helper tool in the previous section, use the -s option. A sided configuration looks something like this:

    <nodes write_copies="2">
      <node id="7dffe44b-47c6-43e1-db6f-dc3094b793a8"
            address="192.168.1.11"
            apiport="8112"
            port="8112"
            side="a"
            weight="170"/>
      <node id="964f7a5a-6aa5-4123-c07c-8e1a4fdb8870"
            address="192.168.1.12"
            apiport="8112"
            port="8112"
            side="a"
            weight="170"/>
      <node id="c85237f1-b6d7-cf98-bfef-d2a77b7e0181"
            address="192.168.1.13"
            apiport="8112"
            port="8112"
            side="b"
            weight="170"/>
    </nodes>

Import Topology

This step calculates a hash of certain attributes of the topology, creating a unique "fingerprint" that identifies this specific topology. It is this hash that IRONdb uses to load the cluster topology at startup. Import the desired topology with the following command:

    /opt/circonus/bin/snowthimport \
      -c /opt/circonus/etc/irondb.conf \
      -f /opt/circonus/etc/topology

If successful, the output of the command is compiling to <long-hash-string>.

Next, update /opt/circonus/etc/irondb.conf and locate the topology section, typically near the end of the file. Set the value of the topology's active attribute to the hash reported by snowthimport. It should look something like this:

    <topology path="/opt/circonus/etc/irondb-topo"
              active="742097e543a5fb8754667a79b9b2dc59e266593974fb2d4288b03e48a4cbcff2"
              next=""
              redo="/irondb/redo/{node}"
    />

Save the file and restart IRONdb:

  • /bin/systemctl restart circonus-irondb

Repeat the import process on each cluster node.

Verify Cluster Communication

Once all nodes have the cluster topology imported and have been restarted, verify that the nodes are communicating with one another by viewing the Replication Latency tab of the IRONdb Operations Dashboard on any node. You should see all of the cluster nodes listed by their IP address and port, and there should be a latency meter for each of the other cluster peers listed within each node's box.

The node currently being viewed is always listed in blue, with the other nodes listed in either green, yellow, or red, depending on when the current node last received a gossip message from that node. If a node is listed in black, then no gossip message has been received from that node since the current node started. Ensure that the nodes can communicate with each other via port 8112 over both TCP and UDP. See the Replication Latency tab documentation for details on the information visible in this tab.
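If a peer shows up black, a quick connectivity check from each node can help rule out firewall problems. This sketch uses nc, which is assumed to be installed; replace the peer address as appropriate, and note that the UDP probe is only a best-effort check.

    # TCP check against a peer's main listener
    nc -vz 192.168.1.12 8112
    # UDP gossip check
    nc -vzu 192.168.1.12 8112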

Updating

An installed node may be updated to the latest available version of IRONdb by following these steps:

Ubuntu:

We have a helper package on Ubuntu that works around issues with dependency resolution, since IRONdb is very specific about the versions of dependent Apica packages, and apt-get is unable to cope with them. The helper package must be upgraded first, i.e., it cannot be upgraded in the same transaction as the main package.

    /usr/bin/apt-get update && \
    /usr/bin/apt-get install circonus-platform-irondb-apt-policy && \
    /usr/bin/apt-get install circonus-platform-irondb && \
    /bin/systemctl restart circonus-irondb

In a cluster of IRONdb nodes, service restarts should be staggered so as not to jeopardize availability of metric data. An interval of 30 seconds between node restarts is considered safe.
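A rolling restart might be scripted along these lines. The hostnames are placeholders, and SSH access with sufficient privileges is assumed.

    for node in irondb1 irondb2 irondb3; do
      ssh "$node" sudo /bin/systemctl restart circonus-irondb
      sleep 30   # stagger restarts so replica copies stay available
    done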


    ZFS Guide

    In the following guide we will demonstrate a typical IRONdb installation on Linux, using ZFS.

    This guide assumes a server with the following storage configuration:

    • One or more OS drives with ext4 on LVM, md, etc., depending on installer choices and/or operator preference.

    • 12 data drives attached via a SAS or SATA HBA (non-RAID) that will be used exclusively for ZFS.

    ZFS Terminology

    If you are new to ZFS, there are some basic concepts that you should become familiar with to best utilize your server hardware with ZFS.

    References:

    • OpenZFS Administration

    • ZFS: The Last Word in Filesystems (an old but still largely relevant presentation introducing ZFS, from Sun Microsystems)

    Pools

    Pools are the basis of ZFS storage. They are constructed out of "virtual devices" (vdevs), which can be individual disks or groupings of disks that provide some form of redundancy for writes to the group.

    Review the zpool man page for details.

    Datasets

    Datasets are logical groupings of objects within a pool. They are accessed in one of two ways: as a POSIX-compliant filesystem, or as a block device. In this guide we will only be dealing with the filesystem type.

    Filesystem datasets are mounted in the standard UNIX hierarchy just as traditional filesystems are. The difference is that the "device" part of the mount is a hierarchical name, starting with the pool name, rather than a device name such as /dev/sdc1. The specific mountpoint of a given filesystem is determined by its mountpoint property. See the zfs man page for more information on ZFS dataset properties.

    Please note that IRONdb setup configures all necessary dataset properties. No pre-configuration is required.

    On Linux, ZFS filesystems are mounted at boot by the zfs-mount service. They are not kept in the traditional /etc/fstab file.
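    To see how ZFS datasets map to mountpoints on a running system, the standard ZFS tooling can be used. The dataset name below is only an example; substitute one that exists on your system.

    # List all datasets with their mountpoints
    zfs list -o name,mountpoint
    # Show the mountpoint property for a single dataset
    zfs get mountpoint data/irondb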

    Obtaining ZFS Packages

    Ubuntu

    Packages for ZFS are available from the standard Ubuntu repository:

    sudo apt-get update
    sudo apt-get install zfsutils-linux

    Creating a ZFS Pool

    IRONdb setup expects a zpool to exist, but will take care of creating all necessary filesystems and directories.

    For best performance with IRONdb, consider using mirror groups. These provide the highest number of write IOPS, but at a cost of 50% of available raw storage. Balancing the capacity of individual nodes with the number of nodes in your IRONdb cluster is something that Apica Support can help you with.

    In our example system we have 12 drives available for our IRONdb pool. We will configure six 2-way mirror groups, across which writes will be striped. This is similar to a RAID-10 setup. We will call our pool "data". To simplify the example command we are using the traditional sdX names, but it's recommended that you use different identifiers for your devices, ones that are less susceptible to change, to make the pool easier to maintain.

    zpool create data \
        mirror sdc sdd \
        mirror sde sdf \
        mirror sdg sdh \
        mirror sdi sdj \
        mirror sdk sdl \
        mirror sdm sdn

    Using the zpool status command we can see our new pool:

      pool: data
     state: ONLINE
      scan: none requested
    config:

        NAME        STATE     READ WRITE CKSUM
        data        ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sdc     ONLINE       0     0     0
            sdd     ONLINE       0     0     0
          mirror-1  ONLINE       0     0     0
            sde     ONLINE       0     0     0
            sdf     ONLINE       0     0     0
          mirror-2  ONLINE       0     0     0
            sdg     ONLINE       0     0     0
            sdh     ONLINE       0     0     0
          mirror-3  ONLINE       0     0     0
            sdi     ONLINE       0     0     0
            sdj     ONLINE       0     0     0
          mirror-4  ONLINE       0     0     0
            sdk     ONLINE       0     0     0
            sdl     ONLINE       0     0     0
          mirror-5  ONLINE       0     0     0
            sdm     ONLINE       0     0     0
            sdn     ONLINE       0     0     0

    errors: No known data errors

    At this point you may wish to reboot the system to ensure that the pool is present at startup.

    Proceed to IRONdb Setup

    This step is only required if using the standalone IRONdb product. If you are referring to this appendix as an on-premise Apica Inside user, there is no further manual setup required at this point. All IRONdb setup from this point is handled by the Apica Inside installer.

    Now that you have created a ZFS pool you may begin the IRONdb installation. If you have multiple pools configured and you want to use a specific pool for IRONdb, you can use the -z option to the setup script:

    /opt/circonus/bin/setup-irondb (other options) -z data

    The setup script takes care of creating the /irondb mountpoint and all other necessary filesystems, as well as setting the required properties on those filesystems. No other administrative action at the ZFS level should be required at this point.

    Cluster Sizing

    This is intended as a general guide to determining how many nodes and how much storage space per node you require for your workload. Please contact Apica if you have questions arising from your specific needs.

    Key Terminology

    • T is the number of unique metric streams.


    • N is the number of nodes participating in the cluster.

  • W is the number of times a given measurement is stored across the cluster.

    • For example, if you have 1 GB of metric data, you must have W GB of storage space across the cluster.

  • The value of W determines the number of nodes that can be unavailable before metric data become inaccessible. A cluster with W write copies can survive W-1 node failures before a partial data outage will occur.

    Metric streams are distributed approximately evenly across the nodes in the cluster. In other words, each node is responsible for storing approximately (T*W)/N metric streams. For example, a cluster of 4 nodes with 100K streams and W=2 would store about 50K streams per node.

    Rules of Thumb

    • Nodes should be operated at no more than 70% capacity.

    • Favor ZFS striped mirrors over other pool layouts. This provides the highest performance in IOPS.

    • W must be >= 2

    • N must be >= W

    • W should be >= 3 when N >= 6

    • W should be >= 4 when N >= 100

    Storage Space

    The system stores three types of data: text, numeric (statistical aggregates), and histograms. Additionally there are two tiers of data storage: near-term and long-term. Near-term storage is called the raw database and stores at full resolution (however frequently measurements were collected.) Long-term resolution is determined by the rollup configuration.

    The default configuration for the raw database is to collect data into shards (time buckets) of 1 week, and to retain those shards for 4 weeks before rolling them up into long-term storage. At 1-minute collection frequency, a single numeric stream would require approximately 118 KiB per 1-week shard, or 472 KiB total, before being rolled up to long-term storage.
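    As a rough cross-check of these figures, using the approximately 12 bytes per stored measurement noted under Hardware Choices below: 10,080 one-minute points per week * 12 bytes is about 118 KiB per 1-week shard, and 4 such shards come to roughly 472 KiB per stream before rollup.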

    These numbers represent uncompressed data. With our default LZ4 compression setting in ZFS, we see 3.5x-4x compression ratios for numeric data.

    The following modeling is based on an observed distribution of all data types, in long-term storage, across many clients and may be adjusted from time to time. This would be in addition to the raw database storage above.

    Minimum Resolution    Storage Space / Day    Storage Space / Year
    10 seconds            120,000 bytes          43,020,000 bytes
    1 minute              20,000 bytes           7,170,000 bytes
    5 minute              3,800 bytes            1,386,000 bytes

    All sizing above represents uncompressed data.

    Sizing Example

    Suppose we want to store 100,000 metric streams at 1-minute resolution for 5 years. We'd like to build a 4-node cluster with a W value of 2.

    T=100,000
    N=4
    W=2

    T * 7,170,000 (bytes/year/stream) * 5 years = 3,585,000,000,000 bytes

    3,585,000,000,000 bytes / (1024^3) = 3338 GiB

    T * 483,840 (bytes/4 weeks raw/stream) / (1024^3) = 45 GiB

    ( (3338+45) * W) / N = 1692 GiB per node

    1692 GiB / 70% utilization = 2417 GiB of usable space per node

    2417 GiB * 2 = 4834 GiB of raw attached storage in ZFS mirrors per node

    Hardware Choices

    Apica recommends server-class hardware for all production deployments. This includes, but is not limited to, features like ECC memory and hot-swappable hard drives.

    • See OpenZFS guidelines for general advice.

      • Specifically, hardware RAID should be avoided. ZFS should be given access to raw hard drive devices whenever possible.

    In addition to the overall storage space requirements above, consideration must be given to the IOPS requirements. The minimum IOPS required is the primary write load of ingesting metric data (approximately 12 bytes per measurement point), but there is additional internal work such as parsing and various database accounting operations that can induce disk reads beyond the pure writing of measurement data. After initial ingestion there are other operations, such as searching, rollups, and maintenance activity like reconstitution and ZFS scrubbing that require additional IOPS. Ensure that the hardware you choose for your nodes has the capacity to allow for these operations without significantly impacting ongoing ingestion.

    ZFS's ARC helps by absorbing some portion of the read load, so the more RAM available to the system, the better.

    Hardware Profiles

    The following are sample profiles to guide you in selecting the right combination of hardware and cluster topology for your needs.

    Assumptions:

    • 10-second collection frequency

    • 4 weeks of near-term (full-resolution) storage

    • 2 years of historical data at 1-minute resolution

    • striped-mirror ZFS pool layout

    Streams per 10sec    Write Copies    Total Streams    Node Count    Streams per Node    Physical CPU cores    RAM (GB)    7200rpm spindles
    1MM                  3               3MM              5             600K                12                    128         6x 2T
    10MM                 3               30MM             15            2MM                 24                    256         24x 4T
    100MM                3               300MM            75            4MM                 36                    384         45x 4T

    Command Line Options

    Reference to available options and arguments.

    • Synopsis

    • Process Control

    • Operating Modes

    • Loader Options

    • Maintenance Actions

    • Behavioral Options

    • Reconstitute Options

    To obtain the most current usage summary: /opt/circonus/sbin/snowthd -h

    Synopsis

    process control flags:
            -k start                start the process (default)
            -k stop                 stop a running process
            -k status               report the status via exit code

    mutually exclusive flags:
            -e                      boot this node ephemerally (compute node)
            -i <uuid>               identify this node

    standalone loader flags for use with -i
            -m                      merge text reconstitution files (deprecated)
            -H                      merge hist reconstitution files (deprecated)

    standalone maintenance flags for use with -i
            -r text/metrics         repair text inventory
            -r text/changelog       repair text datastore
            -r hist/metrics         repair hist inventory
            -r hist/<period>        repair hist rollup for configured <period>
            -j                      only write journal data to other nodes

    optional behavior flags:
            -c <file>               load config from <file> (full path)
                                            default: /opt/circonus/etc/snowth.conf
            -d                      debugging
            -D                      foreground operations (don't daemonize)
            -u <user>               run as <user>
            -g <group>              run as <group>
            -t <path>               chroot to <path>
            -l <logname>            enable <logname>
            -L <logname>            disable <logname>
            -q                      disable gossip on this node

    reconstitute parameters:
            -B                      Reconstitute mode
            -T <topo_hash>          Reconstitute new cluster from remote topology
            -O <ip>[:<port>]        Reconstitute from remote host
            -A <type>               Reconstitute type
                                    Acceptable values: nntbs,text,hist,raw,surrogate
                                    May be specified multiple times
                                    All if omitted
            -S <node_uuid>          Skip/ignore this node during reconstitute
                                    May be specified multiple times

    this usage message:
            -h                      usage

    Process Control Options

    • -k <start|stop|status>

    status will exit 0 if the process is running, non-zero otherwise.
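    For example, the exit code can drive a simple health check. This is a sketch; depending on your configuration, additional flags may be required.

    if /opt/circonus/sbin/snowthd -k status; then
      echo "IRONdb is running"
    else
      echo "IRONdb is not running"
    fi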

    Operating Mode Options

    These options are mutually exclusive of one another. One or the other is required.

    • -i <uuid>

    Identify this node with <uuid>. This is the normal mode of operation.

    • -e

    Boot the node in ephemeral mode. Ephemeral nodes are read-only participants in the cluster. They do not appear in the cluster topology, and do not accept incoming metrics, but may be used to read metric data from other nodes and perform intensive computation that would add unreasonable load to the main nodes.

    Loader Options

    These options imply foreground operation and perform a specific task, then exit. They are only valid in identified mode (-i).

    • -m

    Merge text reconstitution files. DEPRECATED

    • -H

    Merge histogram reconstitution files. DEPRECATED

    The above 2 options were used in a previous version of the reconstitute process and are no longer strictly required. They may be removed in a future version.

    Maintenance Options

    These options imply foreground operation and perform a specific task, then exit. They are only valid in identified mode (-i).

    • -r text/metrics

    Repair text inventory.

    • -r text/changelog

    Repair text datastore.

    • -r hist/metrics

    Repair histogram inventory.

    • -r hist/<rollup>

    Repair a histogram rollup. The value is one of the existing histogram rollup periods from the config file, e.g., hist/60 to repair the 1-minute histogram rollups.

    • -j

    Journal-drain mode. Does not start a network listener, so this node will appear "down" to its peers, but will send any pending journal data to them. This is useful if you are planning to retire and replace a cluster node, and want to ensure that it has sent all outgoing journal data without accepting any new input.

    Behavioral Options

    These determine optional behavior, and are not required.

    • -c <file>

    Load configuration from <file>. Must be a full path. If not specified, the default path is /opt/circonus/etc/snowth.conf.

    • -d

    Activate additional debug logging. Use with caution; can generate a large volume of logs.

    • -D

    Stay in the foreground, rather than daemonizing. If specified once, run as a single process with no watchdog. If specified twice, run as a parent/child pair, with the parent (watchdog) process in the foreground.

    See the libmtev documentation for details on foreground operation.

    • -u <user>

    Drop privileges after start and run as this user.

    • -g <group>

    Drop privileges after start and run as this group.

    • -t <path>

    Chroot to <path> for operation. Ensure that log file locations may be accessed within the chrooted environment.

    • -l <logname>

    Enable <logname>, even if it is disabled in the configuration file. The specified log stream must exist.

    • -L <logname>

    Disable <logname>, even if it is enabled in the configuration file. The specified log stream must exist.

    Reconstitute Options

    These operations are used when rebuilding a node.

    • -B

    Enable reconstitute mode.

    • -T <topo_hash>

    Reconstitute from this remote/foreign topology. Used when creating a new cluster from an existing one.

    • -O <ip>[:<port>]

    Bootstrap remote reconstitute from this node in the source cluster. Used when creating a new cluster from an existing one. The reconstituting node will fetch information about the source cluster's topology from this node, but actual metric data will be fetched from all source cluster nodes.

    • -A <type>

    Reconstitute one type of data, or all if the option is omitted. May be specified multiple times to reconstitute multiple data types.

    • -S <node_uuid>

    Skip the specified node(s) when pulling data for reconstitute. This is useful if a node is unavailable at the time a reconstitute is started. May be specified multiple times to skip more than one node. Use with caution. If the number of skipped nodes exceeds the number of data copies, the reconstitute may be incomplete.


    Configuration

    Configuration files and options.

    IRONdb is implemented using libmtev, a framework for building high-performance C applications. You may wish to review the libmtev configuration documentation for an overview of how libmtev applications are configured generally.

    This document deals with options that are specific to IRONdb, but links to relevant libmtev documentation where appropriate.

    Default values are those that are present in the default configuration produced during initial installation.

    Time periods are specified as second-resolution libmtev time durations.

    irondb.conf

    This is the primary configuration file that IRONdb reads at start. It includes additional configuration files which are discussed later.

    snowth

    IRONdb's libmtev application name. This is a required node and must not be changed.

    snowth lockfile

    Path to a file that prevents multiple instances of the application from running concurrently. You should not need to change this.

    Default: /irondb/logs/snowth.lock

    snowth text_size_limit

    The maximum length of a text-type metric value. Text metric values longer than this limit will be truncated.

    Default: 512

    Text-type metrics are supported in IRONdb but Graphite currently has no way to render these when using a Storage Finder plugin.

    cache

    An LRU cache of open filehandles for numeric metric rollups. This can improve rollup read latency by keeping the on-disk files for frequently-accessed streams open.

    cache cpubuckets

    The cache is divided up into the specified number of "buckets" to facilitate concurrent access by multiple threads. This parameter rarely requires tuning.

    Default: 128

    logs

    Libmtev logging configuration. See the libmtev logging documentation.

    By default, the following log files are written and automatically rotated, with the current file having the base name and rotated files having an epoch-timestamp suffix denoting when they were created:

    • /irondb/logs/errorlog: Output from the daemon process, including not just errors but also operational warnings and other information that may be useful to Apica Support.

      • Rotated: 24 hours

      • Retained: 1 week

    • /irondb/logs/startuplog

    Logging old data submission

    Sometimes it may be desirable to log data submissions that are older than some threshold, in order to identify the source. Submitting "old" data can cause issues with rollups being interrupted, as well as introducing unwanted changes to historical data. IRONdb has a debug-level logging facility for recording such submissions.

    Since version 0.20.2 a configuration to log such submissions has been available. It is not active by default, but can be activated by setting disabled="false" on the debug/old_data log:

    The threshold for what is considered "old" is controlled by metric_age_threshold. The value is a string representing an offset into the past from "now". The default is 7 days. Any data submitted with a timestamp that is further in the past will be logged.
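    As a sketch of what enabling this looks like, the log stream can be switched on in the logs stanza of the configuration. The exact placement of the log element and of the metric_age_threshold setting may differ in your generated configuration, which should be treated as authoritative.

    <log name="debug/old_data" disabled="false"/>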

    listeners

    Libmtev network listener configuration. See the libmtev listener documentation.

    Each listener below is configured within a <listener> node. Additional listeners may be configured if desired, or the specific address and/or port may be modified to suit your environment.

    Main listener

    The main listener serves multiple functions:

    • Cluster replication (TCP) and gossip (UDP)

    • JSON-formatted node statistics (http://thisnode:thisport/stats.json)

    Main listener address

    The IP address on which to listen, or the special * to listen on any local IP address.

    Default: *

    Main listener port

    The port number to listen on. For the main listener this will utilize both TCP and UDP.

    Default: 8112

    Main listener backlog

    The size of the queue of pending connections. This is used as an argument to the standard listen(2) system call. If a new connection arrives when this queue is full, the client may receive an error such as ECONNREFUSED.

    Default: 100

    Main listener type

    The type of libmtev listener this is. The main listener is configured to be only a REST API listener. This value should not be changed.

    Default: http_rest_api

    Main listener accept_thread

    If set to on, IRONdb will dedicate an eventer thread to handling incoming connections. This improves performance by ensuring that a new connection will be fully processed in blocking fashion, without preemption.

    Default: off

    Main listener fanout

    If set to true, new events from accepted connections will be fanned out across all threads in the event pool owning the listening socket (usually the default event pool).

    Default: false

    Main listener ssl

    When set to on, the listener will expect incoming connections to use Transport Layer Security (TLS), also known as "SSL". Additional TLS configuration is required. See TLS Configuration for details.

    Default: off
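    Putting the attributes above together, a main listener entry might look roughly like the following. This is a sketch based on the defaults listed here; the listener stanza generated in your own irondb.conf is authoritative and may differ in form.

    <listener address="*" port="8112" backlog="100" type="http_rest_api"
              accept_thread="off" fanout="false" ssl="off"/>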

    Graphite listener

    The Graphite listener operates a Carbon-compatible submission pathway using the Carbon plaintext protocol.

    Multiple Graphite listeners may be configured on unique ports and associated with different check UUIDs. See the section on for details.

    Graphite listener address

    The IP address on which to listen, or the special * to listen on any local IP address.

    Default: *

    Graphite listener port

    The TCP port number to listen on.

    Default: 2003

    Graphite listener type

    The type of listener. IRONdb implements a Graphite-compatible handler in libmtev, using the custom type "graphite".

    Default: graphite

    Graphite listener config

    These configuration items control which check UUID, name, and account ID are associated with this listener. The first Graphite listener is configured during initial setup.

    • check_uuid is a UUID that will be associated with all metrics ingested via this listener.

    • account_id is also part of namespacing, for disambiguation.
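    A Graphite listener entry might look roughly like the following sketch. The check UUID and account ID are placeholders, and the exact element layout in your generated configuration may differ.

    <listener address="*" port="2003" type="graphite">
      <config>
        <check_uuid>00000000-0000-4000-8000-000000000000</check_uuid>
        <account_id>1</account_id>
      </config>
    </listener>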

    Pickle listener

    The Pickle listener operates a Carbon-compatible submission pathway using the Carbon pickle protocol.

    Its configuration is identical to the plaintext listener, except the type is graphite_pickle.

    CLI listener

    The CLI listener provides a local telnet console for interacting with libmtev subsystems, including modifying configuration. As there is no authentication mechanism available for this listener, it is recommended that it only be operated on the localhost interface.

    CLI listener address

    The IP address on which to listen, or the special * to listen on any local IP address.

    Default: 127.0.0.1

    CLI listener port

    The TCP port number to listen on.

    Default: 32322

    CLI listener type

    The CLI listener uses the built-in libmtev type "mtev_console" to allow access to the telnet console.

    Default: mtev_console

    pools

    NOTE: As of version 0.20.0, resource configuration from this stanza is deprecated. Fresh installations will no longer contain this stanza.

    Values from these attributes will still be respected until a future release. Deprecation messages will be logged for each pools attribute encountered in the configuration, and will include the name of the jobq that corresponds to that attribute.

    The value of the "concurrency" attribute is the first value in jobq configuration. See for details.

    Resource pools within IRONdb are used for various functions, such as reading and writing metric data. Some aspects of pool behavior are configurable, typically to adjust the number of worker threads to spawn.

    The defaults presented are widely applicable to most workloads, but may be adjusted to improve throughput. Use caution when raising these values too high, as it could produce thrashing and decrease performance.

    If in doubt, contact Apica support.

    pools rollup concurrency

    Deprecated

    Use jobq_rollup_raw to preserve customizations.

    The number of unique metric names (UUID + metric name) to process in parallel when performing rollups. A higher number generally causes the rollup operation to finish more quickly, but has the potential to overwhelm the storage subsystem if set too high.

    Default: 1

    These tasks compete with other readers of the raw_database, so if rollup concurrency is set higher than 4x the raw_writer concurrency, that level of parallelism cannot actually be reached.

    pools nnt_put concurrency

    Deprecated

    This attribute is obsolete and may be removed from configuration files.

    The number of threads used for writing to numeric rollup files. Writes to a given rollup file will always occur in the same queue.

    Default: the number of physical CPU cores present during installation

    pools raw_writer concurrency

    Deprecated

    Use jobq_data_write to preserve customizations.

    The number of threads used for writing to the raw metrics database. Additionally, by default, IRONdb will use 4x this number of threads for reading from the raw metrics database.

    Default: 4

    pools raw_reader concurrency

    Deprecated

    Use jobq_data_read to preserve customizations.

    The number of threads used for reading from the raw metrics database.

    Default: (raw_writer concurrency * 4)

    pools rest_graphite_numeric_get concurrency

    Deprecated

    Use jobq_snowth_graphite_numeric_get to preserve customizations.

    The number of threads used for handling Graphite fetches. This is a general queue for all fetch operations, and there are two other thread pools for specific tasks within a fetch operation (see below.)

    Default: 4

    pools rest_graphite_find_metrics concurrency

    Deprecated

    Use jobq_snowth_graphite_find_metrics_local and jobq_snowth_graphite_find_metrics_remote to preserve customizations. The value for this pools attribute was interpreted as the remote concurrency, which was divided by 4 to get the local concurrency (minimum 1).

    The number of threads used for resolving metric names prior to fetch.

    Default: 4

    pools rest_graphite_fetch_metrics concurrency

    Deprecated

    Use jobq_snowth_graphite_fetch_metrics_local and jobq_snowth_graphite_fetch_metrics_remote to preserve customizations. The value for this pools attribute was interpreted as the remote concurrency, which was divided by 4 to get the local concurrency (minimum 1).

    The number of threads used for actually fetching Graphite metrics, including those local to the node and those residing on remote nodes.

    Default: 10

    REST Configuration

    This is the node under which REST API configuration items are organized.

    DELETE Configuration

    This is the node used to configure DELETE endpoint behavior.

    The max_advisory_limit="<val>" attribute configures how many deletes may be attempted by this operation; a request may lower the limit via the X-Snowth-Advisory-Limit header, but <val> may not be exceeded. Currently, this only affects the /full/tags endpoint.

    raw_database

    Raw numeric metrics database. This stores all ingested numeric metrics at full resolution for a configurable period of time, after which the values are rolled up and stored in one or more rollup databases.

    The location and data_db attributes should not be modified.

    raw_database granularity

    Granularity controls the sharding of the raw numeric database. A shard is the unit of data that will be rolled up and removed after a configurable age and period of quiescence (no new writes coming in for that shard.)

    Do not change granularity after starting to collect data, as this will result in data loss.

    Default: 1 week

    raw_database recordsize

    Recordsize controls the amount of data stored in an individual raw record.

    Do not change recordsize after starting to collect data, as this will result in data loss.

    Default: 1 hour

    raw_database min_delete_age

    The minimum age that a shard must be before it is considered for deletion.

    Default: 4 weeks

    raw_database delete_after_quiescent_age

    The period after which a shard, if it has been rolled up and not subsequently written to, may be deleted.

    Default: 1 day

    raw_database rollup_after_quiescent_age

    The period the system will delay after the last write to a raw shard before attempting to roll it up. New writes to the time period/shard will interrupt the rollup process and reset the quiescent timer which must again reach the rollup_after_quiescent_age before a re-roll will be attempted.

    Default: 8 hours

    raw_database startup_rollup_delay

    If an IRONdb instance is restarted while it is performing a rollup, it will resume that rollup after it finishes booting; however, it will wait startup_rollup_delay before doing so. This gives the node time to catch up on ingestion, populate caches, and perform other operations it may need to do after a restart.

    Default: 30 minutes

    raw_database max_clock_skew

    Allow the submission of metrics timestamped up to this amount of time in the future, to accommodate clients with incorrect clocks.

    Default: 1 week

    raw_database conflict_resolver

    When a metric is written more than once at the exact same millisecond offset, there is a conflict that must be resolved. All operations in IRONdb are commutative, which lets us avoid complicated consensus algorithms for data. Conflicts therefore need a winner, and this choice needs to be consistent across the cluster. IRONdb gives you the following choices for conflict resolution should a datapoint appear more than once at the same millisecond.

    • abs_biggest - save the largest by absolute value.

    • last_abs_biggest - if used with the aggregation capabilities the datapoints can track a generation counter. This resolver considers the generation of the datapoint and then uses the largest by absolute value if the generations collide. If you are not using the relay, this will fall back to the same behavior as abs_biggest.

    • abs_smallest - save the smallest by absolute value.

    This setting should be the same on all nodes of the IRONdb cluster.

    This value should never be changed when data is "in flight", that is, while a cluster is actively ingesting data, or there are nodes down, or nodes are suffering replication latency.

    If you wish to change this setting after beginning to collect data, the following conditions must be met:

    • All nodes must be running and available.

    • All ingestion must be stopped.

    • All replication journals from all nodes must be completely drained and applied on the destination node.

    Once these conditions are met:

    1. Bring down all nodes.

    2. Change the value of this option in the configuration file for each node.

    3. Restart all nodes.

    Default: "abs_biggest"

    raw_database rollup_strategy

    Control how rollups are performed. By default, all levels of rollup data are calculated from the raw database as it is iterated.

    Prior to version 0.12 the default if not specified was that the lowest level of rollup was computed and then IRONdb would read this lowest level data and compute higher level rollups. This rollup strategy has been removed.

    Default: "raw_iterator"

    raw_database sync_after_full_rollup_finishes

    Enables an LMDB sync to disk after each raw shard finishes rolling up. Each shard that the raw shard rolls up into will be synced.

    Default: "false"

    raw_database sync_after_column_family_rollup_finishes

    Enables an LMDB sync to disk after each column family within a raw shard finishes rolling up. Each shard that the raw shard rolls up into will be synced.

    Default: "false"

    raw_database suppress_rollup_filter

    Metrics that match this filter are never rolled up and only exist in the raw database. Raw-only metrics are supported for both numeric and histogram metric types. When raw shards are deleted, a verify step is done on any metric that matches the filter to determine if there is any remaining data for that metric. If there is no remaining data, the metric will be completely deleted from the surrogate database.

    Default: and(__rollup:false)

    Introduced in IRONdb version 0.19.2

    nntbs

    NNTBS is the rollup storage engine for data once it proceeds past the raw database.

    Each shard specifies a rollup using a given granularity in seconds (period).

    Shard size is the span of time included in one shard. The minimum size for a shard is 127 * period; for a 60-second period, this would be 7620 seconds. Whatever time span you provide here will be rounded up to that multiple. For example, if you provided 1d for the period=60 shard as in the defaults above, you would actually get 91440 seconds per shard instead of 86400.
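    To make the rounding concrete: 86,400 seconds (1d) is not a multiple of 7,620, so it rounds up to the next multiple, 12 * 7,620 = 91,440 seconds.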

    NOTE: for installations with a high cardinality of metric names you will want to reduce the size parameters to keep the shards small to ensure performance remains consistent.

    The retention setting for each shard determines how long to keep this data on disk before deleting it permanently. retention is optional and if you don't provide it, IRONdb will keep the data forever. When a timeshard is completely past the retention limit based on the current time, the entire shard is removed from disk. In the above example, 60-second rollups are retained for 52 weeks (1 year), 5- and 30-minute rollups are retained for 104 weeks (2 years), and 3-hour rollups are retained for 520 weeks (10 years). Retention uses the same time duration specifications as size above.

    Whatever settings are chosen here cannot be changed after the database starts writing data into NNTBS (except for retention). If you change your mind about sizing you will have to wipe and reconstitute each node in order to apply new settings.

    histogram_ingest

    Raw histogram metrics database. This stores all ingested histogram metrics at full resolution for a configurable period of time, after which the values are rolled up and stored in one or more histogram rollups.

    The location and data_db attributes should not be modified.

    histogram_ingest granularity

    Granularity controls the sharding of the raw histogram database. A shard is the unit of data that will be rolled up and removed after a configurable age and period of quiescence (no new writes coming in for that shard.)

    Do not change granularity after starting to collect data, as this will result in data loss.

    Default: 1 week

    histogram_ingest min_delete_age

    The minimum age that a shard must be before it is considered for deletion.

    Default: 4 weeks

    histogram_ingest delete_after_quiescent_age

    The period after which a shard, if it has been rolled up and not subsequently written to, may be deleted.

    Default: 1 day

    histogram_ingest rollup_after_quiescent_age

    The period the system will delay after the last write to a shard before attempting to roll it up. New writes to the time period/shard will interrupt the rollup process and reset the quiescent timer which must again reach the rollup_after_quiescent_age before a re-roll will be attempted.

    Default: 8 hours

    histogram_ingest max_clock_skew

    Allow the submission of metrics timestamped up to this amount of time in the future, to accommodate clients with incorrect clocks.

    Default: 1 week

    histogram

    The histogram rollup database for data once it proceeds past the histogram_ingest database. Rollups must be individually configured with a period, granularity, and optional retention period.

    Whatever settings are chosen here cannot be changed after the database starts writing data (except for retention). If you change your mind about sizing you will have to wipe and reconstitute each node in order to apply new settings.

    histogram rollup period

    The period defines the time interval, in seconds, for which histogram metrics will be aggregated into the rollup.

    histogram rollup granularity

    Shard granularity is the span of time included in one shard. The granularity must be divisible by the period and will be rounded up if not compatible.

    NOTE: for installations with a high cardinality of metric names you will want to reduce the granularity parameters to keep the shards small to ensure performance remains consistent.

    histogram rollup retention

    Shard retention is the length of time that determines how long to keep this rollup data on disk before deleting it permanently.

    retention is optional and the default behavior is to keep the rollup data forever.

    When a rollup timeshard is completely past the retention limit based on the current time, the entire shard is removed from disk.

    Introduced in IRONdb version 0.23.7
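
    As an illustration, assuming retention is expressed as an attribute on each rollup element, using the same duration syntax as NNTBS shard retention, a configuration might look like the following sketch. The retention values shown are examples only, not recommendations:

    <histogram location="/irondb/hist_rollup/{node}">
      <rollup period="60" granularity="7d" retention="52w"/>
      <rollup period="300" granularity="30d" retention="104w"/>
    </histogram>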

    surrogate_database

    The surrogate database contains bidirectional mappings between full metric names (including tags) and integer-based keys, which are used internally to refer to metrics. It also records collection activity periods on each metric.

    Data files are stored on disk and memory-mapped on demand when metrics are referenced by queries (read) or ingestion (write).

    surrogate_database location

    This is the location of the surrogate database on disk.

    This field is required; there is no default location if left unspecified.

    surrogate_database implicit_latest

    Toggle for maintaining an in-memory copy of the latest values for all newly seen metrics during ingestion. If set to false, it will only maintain latest values for metrics that have been specifically "asked for" via a tag search.

    Default: false

    surrogate_database latest_future_bound

    This is the upper bound on how far in the future a metric's timestamp may be for it to be considered as a "latest value" candidate. By default, if a metric timestamp is more than 4 hours in the future, it will be ignored for consideration as a replacement for the latest value. These values are only updated at ingestion time.

    This value can be from 0s (ignore any future timestamps) to 4h (maximum).

    Default: 4h

    surrogate_database runtime_concurrency

    This value allows users to set the number of concurrent surrogate database reader threads available.

    Default: IRONdb will retrieve a hint about the number of available hardware threads and use this value.

    surrogate_database max_page_size

    When performing surrogate lookups in batches, IRONdb uses individual "pages" of results to prevent the system from getting overloaded. This setting specifies the maximum number of results that can be returned in a single page.

    Default: 50,000

    surrogate_database capacity_per_reader_shard

    When looking up surrogates, readers store the results in both an id-to-metric-name and a metric-name-to-id lookup table on each lookup thread so that future lookups will be much faster. These tables pre-allocate space for entries so that new space does not need to be allocated on the fly as entries are added, improving lookup time. This field sets the amount of space to pre-allocate in each reader. Once this limit has been reached, further results will be allocated on the fly and may require internal rehashes, slowing the system down.

    Default: 96,000,000 divided by the number of threads specified in runtime_concurrency.
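
    As an illustrative sketch, the options above are assumed here to be attributes on the surrogate_database element alongside the required location; the values shown are examples, not recommendations:

    <surrogate_database location="/irondb/surrogate_db/{node}"
                        implicit_latest="true"
                        latest_future_bound="4h"
                        runtime_concurrency="16"
                        max_page_size="50000"
                        capacity_per_reader_shard="6000000"
    />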

    surrogate_database compaction

    compaction is a sub-field of surrogate_database. Within it, you can define compaction levels. There are two level types that can be configured: metadata (for basic metric information and mapping) and activity (for collection activity data). Each type may only be defined once, and any other type value is invalid. A sample configuration might look like this:
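
    The following is a condensed illustration with one level per type; the values are drawn from the fuller sample configuration and are not tuning recommendations:

    <surrogate_database location="/irondb/surrogate_db/{node}">
      <compaction>
        <levels type="metadata">
          <level level_name="level1"
                 min_file_size="1B"
                 max_file_size="512MiB"
                 min_number_file_budget="2"
                 max_number_file_budget="8"
                 selection_phase_scan_budget="200000"
                 compaction_phase_scan_budget="100000"
                 selection_phase_scan_skip="50"/>
        </levels>
        <levels type="activity">
          <level level_name="oil_micro"
                 min_file_size="1B"
                 max_file_size="32MiB"
                 min_number_file_budget="2"
                 max_number_file_budget="64"
                 selection_phase_scan_budget="1000"
                 compaction_phase_scan_budget="10000"
                 selection_phase_scan_skip="0"/>
        </levels>
      </compaction>
    </surrogate_database>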

    Each level for a type consists of a set of restrictions that determine when the individual files that make up the surrogate database are compacted; this allows, for example, small files to always compact with other small files, large files to only compact with large files, and so on. This reduces the strain on the system that could be caused by doing too frequent compactions or compacting files that do not need to be compacted.

    If a level is defined, all fields within it are required. An arbitrary number of level elements can be defined under levels. IRONdb has a sane set of default configurations that are used if no level data is provided; generally speaking, it is not recommended to define or adjust these fields unless you know exactly what you're doing and know why you're adjusting them.

    The fields within each level are as follows:

    level level_name

    The name of the level. This is used internally for debug logging.

    level min_file_size

    The minimum size of a single file to consider for compaction. Files smaller than this will not be considered for compaction at this level.

    level max_file_size

    The maximum size of a single file to consider for compaction. Files larger than this will not be considered for compaction at this level.

    level min_number_file_budget

    The minimum number of files to compact at a time for the level. If there are fewer files than this that match the criteria, a compaction will not run at this level.

    level max_number_file_budget

    The maximum number of files to compact at a time. If there are more files than this, then multiple compactions will run.

    level selection_phase_scan_budget

    The maximum number of files to scan in a single pass through the database.

    level compaction_phase_scan_budget

    The maximum number of surrogates to scan in a single pass through the database.

    level selection_phase_scan_skip

    The number of files to skip before starting the selection phase.

    metric_name_database

    This database stanza controls where IRONdb keeps certain aspects of its indexes.

    The database of stored metric names. This database is used to satisfy graphite /metrics/find queries. By default, this database will cache 1000 queries for 900 seconds. Any newly arriving metric names will invalidate the cache so subsequent queries are correct.

    metric_name_database enable_level_indexing

    Level indexing is used for graphite-style query acceleration. For large clusters that do not use graphite-style metrics, disabling this index may improve memory and CPU utilization.

    Default: true

    metric_name_database materialize_after

    The number of mutations that must occur before the system flushes to disk and triggers a compaction, draining the jlog of queued updates.

    Default: 100,000

    metric_name_database location

    The location on disk where the database files reside.

    metric_name_database query_cache_size

    The number of incoming graphite/find queries to cache the results for.

    Default: 1000

    metric_name_database query_cache_timeout

    The number of seconds that cached queries should remain in the cache before being expired.

    Default: 900

    metric_name_database enable_saving_bad_level_index_jlog_messages

    Enables saving of invalid jlog messages found when attempting to replay the jlog in the metric name database to build the indexes. The messages will be saved in a folder called bad_flatbuffer_messages within the metric name database location for the account on which the error occurred.

    Default: false

    journal

    Journals are write-ahead logs for replicating metric data to other nodes. Each node has one journal for each of its cluster peers.

    journal concurrency

    Establishes this number of concurrent threads for writing to each peer journal, improving ingestion throughput.

    Default: 4

    A concurrency of 4 is enough to provide up to 700K measurements/second throughput, and is not likely to require adjustment except in the most extreme cases.

    journal replicate_concurrency

    Attempt to maintain this number of in-flight HTTP transactions, per peer journal, for posting replication data to peers. Higher concurrency helps keep up with ingestion at scale.

    Each thread reads a portion of the journal log and is responsible for sending that portion to the peer. When it finishes its portion, and there are fewer than replicate_concurrency other jobs in flight for that peer, it skips ahead to the next "unclaimed" portion of the log and resumes sending.

    Default: 4

    Prior to version 0.15.3, the default was 1.

    journal max_bundled_messages

    Outbound journal messages will be sent in batches of up to this number, improving replication speed.

    Default: 50000

    journal max_total_timeout_ms

    A node sending replication journals to its peers will allow up to this amount of time, in milliseconds, for the remote node to receive and process a batch. If nodes are timing out while processing incoming journal batches, increasing this timeout may give them enough time, avoiding repeatedly sending the same batch.

    Default: 10000 (10 seconds)

    journal pre_commit_size

    An in-memory buffer of this number of bytes will be used to hold new journal writes, which will be flushed to the journal when full. This can improve ingestion throughput, at the risk of losing up to this amount of data if the system should fail before commit. To disable the pre-commit buffer, set this attribute to 0.

    Default: 131072 (128 KB)

    journal send_compressed

    When sending journal messages to a peer, compress the messages before sending to save bandwidth, at the cost of slightly more CPU usage. The bandwidth savings usually outweigh the cost of compression.

    Default: true

    journal use_indexer

    Spawn a dedicated read-ahead thread to build indexes of upcoming segments in the write-ahead log for each remote node. This is only needed in the most extreme cases where the highest replication throughput is required. Almost all other installations will not notice any slowdown from indexing "on demand", as new segments are encountered.

    Note that this will spawn one extra thread per journal (there is one journal for every remote node in the cluster). For example, activating this feature will spawn 15 additional threads on each node in a 16-node cluster.

    Default: false

    topology

    The topology node instructs IRONdb where to find its current cluster configuration. The path is the directory where the imported topology config lives, which was created during setup. active indicates the hash of the currently-active topology. next is currently unused. The redo path is where journals are located for this topology.

    No manual configuration of these settings is necessary.

    Module Config

    The integration modules that provide support for ingesting Graphite and/or OpenTSDB data have optional configuration, described below. These settings are placed in the main irondb.conf file, as children of the <snowth> node (i.e., peers of <logs>, <topology>, etc.) If omitted, the defaults shown below will be used.

    Graphite Config

    graphite max_ingest_age

    The maximum offset into the past from "now" that will be accepted. Value may be any valid libmtev time duration. If importing older data, it may be necessary to increase this value.

    Default: 1 year

    graphite min_rollup_span_ms

    The smallest rollup period that is being collected. This prevents gaps when requesting data at shorter intervals.

    Default: 1 minute

    graphite whisper

    The whisper entity configures read access to Whisper database files. Each entity refers to the top of a directory hierarchy containing Whisper database files. This directory may exist on a local filesystem, or on a shared network-filesystem mountpoint. Any Whisper databases discovered in scanning this directory hierarchy with the whisper_loader tool (see link above) will be indexed for searching and querying.

    Note that regardless of filesystem choice, it is highly desirable to mount it read-only on each cluster node. This becomes a requirement if using a shared storage volume in the cloud.

    Multiple whisper entities may be configured, each representing a logically distinct Graphite installation. Using different values for check_uuid and (potentially) account_id will segregate these metrics from others.

    graphite whisper directory

    The directory attribute is required, and indicates the start of a hierarchy of directories containing Whisper database files. This path may exist on the local filesystem, or on a network-mounted filesystem.

    For example, to locate a Whisper database stored at /opt/graphite/storage/whisper/foo/bar.wsp, set the directory attribute to "/opt/graphite/storage/whisper". The metric will be indexed as foo.bar.

    Each whisper entity must have a unique, non-overlapping directory value. For example, it is an error to configure one with /foo and another with /foo/bar.

    graphite whisper check_uuid

    The check_uuid attribute is required, and namespaces the contained metrics within IRONdb. This UUID may be arbitrarily chosen, but if the metrics in this collection are the same as those being currently ingested directly into IRONdb, it may be desirable to use the same check_uuid value as the corresponding listener.

    graphite whisper account_id

    The account_id attribute is required, and namespaces the contained metrics within IRONdb. This ID may be arbitrarily chosen, but if the metrics in this collection are the same as those being currently ingested directly into IRONdb, it may be desirable to use the same account_id value as the corresponding listener.

    graphite whisper end_epoch_time

    The end_epoch_time is optional and represents the last timestamp for which there is Whisper data. The timestamp is provided as an epoch timestamp, in seconds. If a fetch has a start time after the provided time, the node will skip reading the Whisper files, avoiding unnecessary work. If this field is not provided, the Whisper files will be checked regardless of the start time of the fetch.
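
    For example, a sketch of two whisper entities representing separate Graphite installations; the directories, UUIDs, and account IDs below are placeholders:

    <graphite min_rollup_span_ms="60000" max_ingest_age="365d">
      <whisper directory="/opt/graphite-prod/storage/whisper"
               check_uuid="11111111-1111-1111-1111-111111111111"
               account_id="1"/>
      <whisper directory="/opt/graphite-staging/storage/whisper"
               check_uuid="22222222-2222-2222-2222-222222222222"
               account_id="2"/>
    </graphite>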

    OpenTSDB Config

    opentsdb max_ingest_age

    The maximum offset into the past from "now" that will be accepted. Value may be any valid libmtev time duration. If importing older data, it may be necessary to increase this value.

    Default: 1 year

    TLS Configuration

    As of version 1.1.0, IRONdb supports TLS for both client and intra-cluster communications. This is currently an alpha feature, for testing only.

    Due to certificate verification requirements, two sets of cryptographic keys and associated certificates are required:

    1. Intra-cluster communication: cluster nodes exchange information and replicate metric data using port 8112, and they use the node UUID as the hostname for all requests. When TLS is used, the certificates for this listener must use the node UUID as the certificate CommonName (CN).

    2. External client connections: since it would be awkward for external clients to verify a CN that is just a UUID, a second listener is added, using port 8443 and having its certificate CN set to the host's FQDN. This matches the expectation of clients connecting to the node to submit metrics or run queries.

    The installer script will automatically configure TLS listeners on a fresh installation when the -t option is used or the IRONDB_TLS environment variable is set to on.

    The following files must be present on each node in order for the service to work properly with TLS. Place them in /opt/circonus/etc/ssl:

    • cluster.key - An RSA key for the intra-cluster listener.

    • cluster.crt - A certificate issued for the intra-cluster listener. Its commonName (CN) must be the node's UUID.

    • cluster-ca.crt - The Certificate Authority's public certificate, sometimes referred to as an intermediate or chain cert, that issued cluster.crt.

    • client.key - An RSA key for the external client listener.

    • client.crt - A certificate issued for the external client listener. Its commonName (CN) should match the hostname used to connect to the node, typically its FQDN.

    • client-ca.crt - The Certificate Authority's public certificate, sometimes referred to as an intermediate or chain cert, that issued client.crt.

    Converting To TLS

    To update an existing cluster to use TLS, several things need to change.

    1. A modified topology configuration that indicates TLS should be used for intra-cluster communication.

    2. Changes to listener configuration to specify locations for key, certificate, and CA chain certificate, add a new listener port for external clients, and to activate TLS.

    3. Changes to metric submission pipelines and any visualization tools to use the new, externally-verifiable listener. This could include tools such as graphite-web or Grafana, as well as IRONdb Relay.

    The first two items will be done on all IRONdb nodes. The third item will vary depending on the specifics of the metric submission pipeline(s) and visualization platforms.

    NOTE: because of the nature of this change, there will be disruption to cluster availability as the new configuration is rolled out. Nodes with TLS active will not be able to communicate with nodes that do not have TLS active, and vice versa.

    Update Topology

    The active topology for a cluster will be located in the /opt/circonus/etc/irondb-topo directory, as a file whose name matches the topology hash. This hash is recorded in /opt/circonus/etc/irondb.conf as the value for the active attribute within the <topology> stanza, e.g.:
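
    The hash shown below is a placeholder; use the value from your own irondb.conf:

    <topology path="/opt/circonus/etc/irondb-topo"
              active="(hash value)"
              next=""
              redo="/irondb/redo/{node}"
    />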

    Edit the /opt/circonus/etc/irondb-topo/<hash> file and add the use_tls="true" attribute to the nodes line:
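
    A diff-style illustration; keep whatever write_copies value your topology already has:

    -<nodes write_copies="2">
    +<nodes write_copies="2" use_tls="true">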

    Distribute the updated file to all nodes in the cluster.

    Update Listeners

    In /opt/circonus/etc/irondb.conf, locate the <listeners> stanza. The listeners that will be changing are the ones for port 8112 and, if used, the Graphite listener on port 2003.

    In a default configuration, the non-TLS listeners look like this:
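
    An abridged illustration; the check_uuid shown is only an example:

    <listener address="*" port="8112" backlog="100" type="http_rest_api" accept_thread="on" fanout="true" ssl="off">
      <config>
        <document_root>/opt/circonus/share/snowth-web</document_root>
      </config>
    </listener>
    <listener address="*" port="2003" type="graphite">
      <config>
        <check_uuid>6a07fd71-e94d-4b67-a9bc-29ac4c1739e9</check_uuid>
        <account_id>1</account_id>
      </config>
    </listener>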

    The Graphite check_uuid and account_id may differ from the above. Preserve those values in the new listener config.

    Replace the above listener configs with this, ensuring that it is within the opening and closing listeners tags, and substituting your Graphite check UUID and account ID from the original config:
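
    An abridged sketch of the TLS listener layout; the protocol and cipher settings within each sslconfig are omitted here for brevity, and GRAPHITE_CHECK_UUID and ACCOUNT_ID are placeholders for your original values:

    <cluster>
      <sslconfig>
        <!-- Certificate CN must match the node UUID in the current topology. -->
        <certificate_file>/opt/circonus/etc/ssl/cluster.crt</certificate_file>
        <key_file>/opt/circonus/etc/ssl/cluster.key</key_file>
        <ca_chain>/opt/circonus/etc/ssl/cluster-ca.crt</ca_chain>
      </sslconfig>
      <!-- Intra-cluster listener: gossip and replication. -->
      <listener address="*" port="8112" backlog="100" type="http_rest_api" accept_thread="on" fanout="true" ssl="on">
        <config>
          <document_root>/opt/circonus/share/snowth-web</document_root>
        </config>
      </listener>
    </cluster>

    <clients>
      <sslconfig>
        <!-- Certificate CN should be the FQDN of the node. -->
        <certificate_file>/opt/circonus/etc/ssl/client.crt</certificate_file>
        <key_file>/opt/circonus/etc/ssl/client.key</key_file>
        <ca_chain>/opt/circonus/etc/ssl/client-ca.crt</ca_chain>
      </sslconfig>
      <!-- External client listener: HTTP metric submission, queries, admin UI. -->
      <listener address="*" port="8443" backlog="100" type="http_rest_api" accept_thread="on" fanout="true" ssl="on">
        <config>
          <document_root>/opt/circonus/share/snowth-web</document_root>
        </config>
      </listener>
      <!-- Graphite plaintext listener, now with TLS. -->
      <listener address="*" port="2003" type="graphite" ssl="on">
        <config>
          <check_uuid>GRAPHITE_CHECK_UUID</check_uuid>
          <account_id>ACCOUNT_ID</account_id>
        </config>
      </listener>
    </clients>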

    Generate and/or obtain the above key and certificate files, ensuring they are placed in the correct location as set in the listener sslconfig configuration.

    Included Files

    circonus-watchdog.conf

    watchdog

    The watchdog configuration specifies a handler, known as a "glider", that is to be invoked when a child process crashes or hangs. See the libmtev watchdog documentation.

    If crash handling is turned on, the glider is what invokes the tracing, producing one or more files in the tracedir. Otherwise, it just reports the error and exits.
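
    A typical watchdog entry looks like this:

    <watchdog glider="/opt/circonus/bin/backwash" tracedir="/opt/circonus/traces"/>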

    irondb-eventer.conf

    The eventer configuration contains libmtev eventer settings.

    This file contains default settings for event loops and job queues. Overrides should be placed in irondb-eventer-site.conf.

    Event Loop Configuration

    Settings in here should generally not be changed unless directed by Apica Support.

    Job Queue Configuration

    Many parts of IRONdb's functionality are handled within pools of threads that form "job queues" (abbreviated as jobq). Any actions that may block for some period of time, such as querying for data, performing rollups, etc. are handled asynchronously via these queues.

    The value of each jobq_NAME is one or more comma-separated values:

    concurrency[,min[,max[,memory_safety[,backlog]]]]

    Concurrency is required; all others are optional, but position is significant. For example, overriding the backlog value will require min, max, and memory_safety to be filled in as well.

    As with event loop settings, the job queue defaults are suitable for a wide range of workloads, so changes should be carefully tested to ensure they do not reduce performance or cause instability.

    To override a jobq named foo, which might be defined by default as:

    <jobq_foo>4,1,24</jobq_foo>

    Place a line in the site configuration file with one or more different values, preserving the others:

    <jobq_foo>8,1,24</jobq_foo>

    The above would increase the desired concurrency from 4 to 8, keeping the minimum of 1 and maximum of 24.

    irondb-eventer-site.conf

    See the comment at the top of the file for how to override eventer settings. This file is included from irondb-eventer.conf.

    This file's contents will be preserved across package updates.

    irondb-modules.conf

    Contains options for vendor-supplied libmtev dynamically-loadable modules.

    Settings in this file should not be changed.

    irondb-modules-site.conf

    See the comment at the top of the file for how to configure optional modules. This file is included from irondb-modules.conf.

    This file's contents will be preserved across package updates.

    irondb-extensions-site.conf

    See the comment at the top of the file for how to add or override extension configuration. This file is included from irondb-modules.conf.

    This file's contents will be preserved across package updates.

    licenses.conf

    This file holds any and all licenses that apply to this IRONdb node. Refer to the installation steps for details on obtaining and installing licenses.

    In a cluster, the license configuration must be the same on all cluster nodes.

    If no license is configured, an embedded license is used, which enables all features described below with a limit of 25,000 active streams (max_streams).

    Licensed Features

    The IRONdb license governs the following functionality:

    License Term

    Name: <expiry>

    After this Unix timestamp, the license is invalid and will no longer enable any of the features below.

    Ingest Cardinality

    Name: <max_streams>

    How many unique time series (uniquely named streams of data) this installation can ingest in the most recent 5-minute period.

    This number applies to all nodes in the cluster, although each node applies this restriction individually. The count of unique streams is estimated over the past 5 minutes, and you are given a 15% overage before ingestion is affected. For example, with the embedded 25,000-stream license, ingestion is not affected until the estimate exceeds 28,750 unique streams.

    If this license is violated, ingestion will stop for the remainder of the 5-minute period that the violation was detected. After the 5-minute period ends, the counter will reset to test the new 5-minute period.

    Enablement of Lua Extensions

    Name: <lua_extension>

    Whether or not Lua extensions will operate.

    Stream Tags Support

    Name: <stream_tags>

    Whether or not stream-tag-related API calls and stream tag ingestion will work. If you do not have this license and stream-tagged data arrives, it will be silently discarded.

    Histogram Support

    Name: <histograms>

    Whether or not histograms can be ingested. If you do not have this license and attempt to ingest histogram data it will be silently discarded.

    Text Metric Support

    Name: <text>

    Whether or not text metrics can be ingested. If you do not have this license and attempt to ingest text data it will be silently discarded.

    Obtain A License

    If you are interested in any of the above functionality and do not currently have a license, please contact support to upgrade your license.

    : Additional non-error initialization output.
    • Rotated: 24 hours

    • Retained: 1 week

  • /irondb/logs/accesslog: Logs from the REST API, including metric writes and reads as well as inter-node communication.

    • Rotated: 1 hour

    • Retained: 1 week

  • last_abs_smallest - same as last_abs_biggest but smallest instead.

  • last_biggest - same as last_abs_biggest but uses the largest without absolute value.

  • last_smallest - same as last but smallest.

  • biggest - the larger value without absolute.

  • smallest - the smaller value without absolute.


    <snowth lockfile="/irondb/logs/snowth.lock" text_size_limit="512">
    <cache cpubuckets="128" size="0"/>
    <log name="debug/old_data" disabled="false"/>
    <old_data_logging metric_age_threshold="7d"/>
    <listener address="*" port="8112" backlog="100" type="http_rest_api" accept_thread="on" fanout="true" ssl="off">
      <config>
        <document_root>/opt/circonus/share/snowth-web</document_root>
      </config>
    </listener>
    <listener address="*" port="2003" type="graphite">
      <config>
        <check_uuid>3c253dac-7238-41a1-87d7-2e546f3b4318</check_uuid>
        <account_id>1</account_id>
      </config>
    </listener>
    <listener address="127.0.0.1" port="32322" type="mtev_console">
      <config>
        <line_protocol>telnet</line_protocol>
      </config>
    </listener>
    <pools>
      <rollup concurrency="1"/>
      <nnt_put concurrency="16"/>
      <raw_writer concurrency="4"/>
      <raw_reader concurrency="16"/>
      <rest_graphite_numeric_get concurrency="4"/>
      <rest_graphite_find_metrics concurrency="4"/>
      <rest_graphite_fetch_metrics concurrency="10"/>
    </pools>
    <rest>
      <acl>
        <rule type="allow" />
      </acl>
      <delete max_advisory_limit="10000" />
    </rest>
    <rest>
      <delete max_advisory_limit="<val>"/>
    </rest>
    <raw_database location="/irondb/raw_db/{node}"
                  data_db="nomdb"
                  granularity="1w"
                  recordsize="1h"
                  min_delete_age="4w"
                  delete_after_quiescent_age="1d"
                  rollup_after_quiescent_age="8h"
                  startup_rollup_delay="30m"
                  max_clock_skew="1w"
                  conflict_resolver="abs_biggest"
                  rollup_strategy="raw_iterator"
                  sync_after_full_rollup_finishes="false"
                  sync_after_column_family_rollup_finishes="false"
                  suppress_rollup_filter="and(__rollup:false)"
    />
    <nntbs path="/irondb/nntbs/{node}">
      <shard period="60" size="1d" retention="52w" />
      <shard period="300" size="5d" retention="104w" />
      <shard period="1800" size="30d" retention="104w" />
      <shard period="10800" size="180d" retention="520w" />
    </nntbs>
    <histogram_ingest location="/irondb/hist_ingest/{node}"
                  data_db="nomdb"
                  granularity="7d"
                  min_delete_age="4w"
                  delete_after_quiescent_age="1d"
                  rollup_after_quiescent_age="8h"
                  max_clock_skew="1w"
    />
    <histogram location="/irondb/hist_rollup/{node}">
      <rollup period="60" granularity="7d"/>
      <rollup period="300" granularity="30d"/>
      <rollup period="1800" granularity="12w"/>
      <rollup period="10800" granularity="52w"/>
      <rollup period="86400" granularity="260w"/>
    </histogram>
    <surrogate_database location="/irondb/surrogate_db/{node}"/>
    <surrogate_database location="/irondb/surrogate_db/{node}">
      <compaction>
        <levels type="metadata">
          <level
            level_name="level1"
            min_file_size="1B"
            max_file_size="512MiB"
            min_number_file_budget="2"
            max_number_file_budget="8"
            selection_phase_scan_budget="200000"
            compaction_phase_scan_budget="100000"
            selection_phase_scan_skip="50"
          />
          <level
            level_name="level2"
            min_file_size="10B"
            max_file_size="5120MiB"
            min_number_file_budget="2"
            max_number_file_budget="8"
            selection_phase_scan_budget="200000"
            compaction_phase_scan_budget="100000"
            selection_phase_scan_skip="100"
          />
        </levels>
        <levels type="activity">
          <level
            level_name="oil_micro"
            min_file_size="1B"
            max_file_size="32MiB"
            min_number_file_budget="2"
            max_number_file_budget="64"
            selection_phase_scan_budget="1000"
            compaction_phase_scan_budget="10000"
            selection_phase_scan_skip="0"/>
          <level
            level_name="oil_micro_l2"
            min_file_size="1B"
            max_file_size="64MiB"
            min_number_file_budget="2"
            max_number_file_budget="64"
            selection_phase_scan_budget="10000"
            compaction_phase_scan_budget="10000"
            selection_phase_scan_skip="64"/>
          <level
            level_name="oil_mini"
            min_file_size="64MiB"
            max_file_size="512MiB"
            min_number_file_budget="2"
            max_number_file_budget="8"
            selection_phase_scan_budget="1000"
            compaction_phase_scan_budget="100"
            selection_phase_scan_skip="128"/>
          <level
            level_name="oil_regular"
            min_file_size="512MiB"
            max_file_size="2GiB"
            min_number_file_budget="2"
            max_number_file_budget="4"
            selection_phase_scan_budget="1000"
            compaction_phase_scan_budget="100"
            selection_phase_scan_skip="128"/>
        </levels>
      </compaction>
    </surrogate_database>
    <metric_name_database location="/irondb/metric_name_db/{node}"
                  enable_level_indexing="true"
                  materialize_after="100000"
                  query_cache_size="1000"
                  query_cache_timeout="900"
                  enable_saving_bad_level_index_jlog_messages="false"
    />
    <journal concurrency="4"
             replicate_concurrency="4"
             max_bundled_messages="50000"
             max_total_timeout_ms="10000"
             pre_commit_size="131072"
             send_compressed="true"
             use_indexer="false"
    />
    <topology path="/opt/circonus/etc/irondb-topo"
              active="(hash value)"
              next=""
              redo="/irondb/redo/{node}"
    />
    <graphite min_rollup_span_ms="60000" max_ingest_age="365d">
      <whisper directory="/opt/graphite/storage/whisper"
               check_uuid="3c253dac-7238-41a1-87d7-2e546f3b4318"
               account_id="1"
               end_epoch_time="1780000000"
      />
    </graphite>
    <opentsdb max_ingest_age="365d"/>
      <!-- Cluster definition -->
      <topology path="/opt/circonus/etc/irondb-topo"
                active="98e4683192dca2a2c22b9a87c7eb6acecd09ece89f46ce91fd5eb6ba19de50fb"
                next=""
                redo="/irondb/redo/{node}"
      />
    -<nodes write_copies="2">
    +<nodes write_copies="2" use_tls="true">
        <listener address="*" port="8112" backlog="100" type="http_rest_api" accept_thread="on" fanout="true">
          <config>
            <document_root>/opt/circonus/share/snowth-web</document_root>
          </config>
        </listener>
    
       <listener address="*" port="2003" type="graphite">
          <config>
            <check_uuid>6a07fd71-e94d-4b67-a9bc-29ac4c1739e9</check_uuid>
            <account_id>1</account_id>
          </config>
        </listener>
        <!--
          Intra-cluster listener. Used for gossip and replication.
        -->
        <cluster>
          <sslconfig>
            <!-- Certificate CNs MUST match node UUIDs assigned in the current topology. -->
            <certificate_file>/opt/circonus/etc/ssl/cluster.crt</certificate_file>
            <key_file>/opt/circonus/etc/ssl/cluster.key</key_file>
            <ca_chain>/opt/circonus/etc/ssl/cluster-ca.crt</ca_chain>
            <layer_openssl_10>tlsv1.2</layer_openssl_10>
            <layer_openssl_11>tlsv1:all,>=tlsv1.2,cipher_server_preference</layer_openssl_11>
            <ciphers>ECDHE+AES128+AESGCM:ECDHE+AES256+AESGCM:DHE+AES128+AESGCM:DHE+AES256+AESGCM:!DSS</ciphers>
          </sslconfig>
          <listener address="*" port="8112" backlog="100" type="http_rest_api" accept_thread="on" fanout="true" ssl="on">
            <config>
              <document_root>/opt/circonus/share/snowth-web</document_root>
            </config>
          </listener>
        </cluster>
    
        <!-- Client-facing listeners. -->
        <clients>
          <sslconfig>
            <!-- Certificate CNs should be the FQDN of the node. -->
            <certificate_file>/opt/circonus/etc/ssl/client.crt</certificate_file>
            <key_file>/opt/circonus/etc/ssl/client.key</key_file>
            <ca_chain>/opt/circonus/etc/ssl/client-ca.crt</ca_chain>
            <layer_openssl_10>tlsv1.2</layer_openssl_10>
            <layer_openssl_11>tlsv1:all,>=tlsv1.2,cipher_server_preference</layer_openssl_11>
            <ciphers>ECDHE+AES128+AESGCM:ECDHE+AES256+AESGCM:DHE+AES128+AESGCM:DHE+AES256+AESGCM:!DSS</ciphers>
          </sslconfig>
    
          <!-- Used for HTTP metric submission, admin UI. -->
          <listener address="*" port="8443" backlog="100" type="http_rest_api" accept_thread="on" fanout="true" ssl="on">
            <config>
              <document_root>/opt/circonus/share/snowth-web</document_root>
            </config>
          </listener>
    
          <!--
            Graphite listener
              This installs a network socket graphite listener under the account
              specified by <account_id>.
          -->
          <listener address="*" port="2003" type="graphite" ssl="on">
            <config>
              <check_uuid>GRAPHITE_CHECK_UUID</check_uuid>
              <account_id>ACCOUNT_ID</account_id>
            </config>
          </listener>
        </clients>
    <watchdog glider="/opt/circonus/bin/backwash" tracedir="/opt/circonus/traces"/>
    concurrency[,min[,max[,memory_safety[,backlog]]]]
    <jobq_foo>4,1,24</jobq_foo>
    <jobq_foo>8,1,24</jobq_foo>