This document helps you understand the tools needed to maintain a Cassandra cluster. The following tools can be used to maintain the Cassandra ring.
Tools:
- nodetool: The nodetool utility is a command-line interface for managing a cluster. Refer to “nodetool help” for the full list of options. Listed below are some of the nodetool command options you are most likely to use.
| Short | Long | Description |
|-------|------|-------------|
| -h | --host | Hostname or IP address. |
| -p | --port | Port number. |
| -pwf | --password-file | Password file path. |
| -pw | --password | Password. |
| -u | --username | Remote JMX agent username. |
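These global options can be combined with any subcommand. A minimal sketch, assuming a node with JMX authentication enabled on the default port 7199 (the host, username, and password-file path below are placeholder values):

```
# Query cluster status on a remote node over authenticated JMX
nodetool -h 10.0.0.1 -p 7199 -u cassandra -pwf /etc/cassandra/jmxremote.password status
```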
- nodetool compact: Force a major compaction on one or more tables.
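A minimal sketch, using placeholder keyspace and table names:

```
# Major-compact every table in a keyspace
nodetool compact my_keyspace

# Major-compact specific tables only (keyspace followed by table names)
nodetool compact my_keyspace my_table
```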
- nodetool repair:
- When nodetool repair is run against a node, it initiates a repair for some range of tokens. The range being repaired depends on the options specified. The default options, i.e. just calling “nodetool repair”, initiate a repair of every token range owned by the node. The node you issue the call to becomes the coordinator for the repair operation, and it coordinates repairing those token ranges between all of the nodes that own them.
- When you use “nodetool repair -pr”, each node picks a subset of its token range to schedule for repair, such that if “-pr” is run on EVERY node in the cluster, every token range is repaired exactly once. What that means is, whenever you use -pr, you need to be repairing the entire ring (every node in every data center). If you use “-pr” on just one node, or just the nodes in one data center, you will only repair a subset of the data on those nodes.
- When running repair to fix a problem, such as a node being down for longer than the hint window, you need to repair the entire token range of that node. So you can’t just run “nodetool repair -pr” on it. You need to initiate a full “nodetool repair” on it, or do a full cluster repair with “nodetool repair -pr” on every node.
- If you have multiple data centers, by default when running repair all nodes in all data centers will sync with each other on the range being repaired.
- Repairs are important for every Cassandra cluster, especially when data is frequently deleted. Running the “nodetool repair” command initiates the repair process on a specific node, which in turn computes a Merkle tree for each range of data on that node. A Merkle tree is a binary tree of hashes used by Cassandra to calculate the differences in datasets between nodes in a cluster. Every time a repair is carried out, each node involved has to construct its Merkle tree from all of the SSTables it stores, which makes the calculation very expensive. In return, repairs are network efficient: only the rows the Merkle trees identify as inconsistent are sent across the network.
- Scanning every SSTable to build the Merkle trees is an expensive operation. To avoid the need for constant tree construction, incremental repairs were introduced in Cassandra 2.1. The idea is to persist the repaired state of data and only calculate Merkle trees for SSTables that haven’t previously undergone repair, allowing the repair process to stay performant and lightweight even as datasets grow, so long as repairs are run frequently.
- nodetool repair should be run at a frequency that is less than gc_grace_seconds, so that tombstones reach all replicas before they become eligible for purging.
- Common nodetool repair variants (see the sketch below):
- nodetool repair
- nodetool repair -pr
- nodetool repair -inc
- nodetool repair -snapshot
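A minimal sketch of these variants follows. Repair flags differ between Cassandra releases, so verify them against “nodetool help repair” on your version:

```
# Full repair of every token range owned by this node
nodetool repair

# Repair only this node's primary token ranges; to cover the whole
# ring, -pr must be run on every node in every data center
nodetool repair -pr

# Incremental repair (Cassandra 2.1+): skips SSTables already marked repaired
nodetool repair -inc

# Sequential repair against snapshots of the data (older releases)
nodetool repair -snapshot

# Repairs can also be scoped to a keyspace or table (placeholder names)
nodetool repair my_keyspace my_table
```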
- nodetool gcstats: Print garbage collection statistics.
- nodetool flush: Flush memtables to SSTables on disk.
- nodetool netstats: Print network information about the host, including active streaming operations.
- nodetool removenode: Remove a dead node from the cluster, identified by host ID.
- nodetool rebuild: Rebuild data on a node by streaming from another data center.
- nodetool snapshot: Take a snapshot of one or more keyspaces.
- nodetool tablestats: Print statistics about tables and keyspaces.
- nodetool tpstats: Print usage statistics of Cassandra’s thread pools.
- nodetool status: Print cluster information: node state (up/down), load, host ID, and token ownership.
- nodetool upgradesstables: Rewrite SSTables to the current Cassandra version.
- nodetool stop: Stop a running compaction operation.
- nodetool failuredetector: Print the failure detector information for the cluster.
- nodetool info: Print node information such as uptime, load, and heap usage.
- ssh admin@10.33.XXX.XX ‘nodetool status’: Run nodetool against a remote node over SSH.
- nodetool cfstats: Older name for nodetool tablestats.
- nodetool cfstats finance.custprofile: Restrict the statistics to a single table.
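To tie these commands together, here is a minimal sketch that checks ring health and takes a keyspace snapshot across a cluster; the hostnames, SSH user, snapshot tag, and keyspace name are placeholder values:

```
# Placeholder hosts and user; adjust for your cluster
for host in cass-node1 cass-node2 cass-node3; do
  echo "== $host =="
  ssh admin@"$host" 'nodetool status && nodetool tpstats'
  # Tagged snapshot of one keyspace as a lightweight backup
  ssh admin@"$host" 'nodetool snapshot -t nightly my_keyspace'
done
```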
Maintenance:
- Repairing Nodes
- Backing up Cassandra database
- Restoring from a snapshot
- Restoring a snapshot into a new cluster
- Recovering from a single disk failure using JBOD.
- Steps for recovering from a single disk failure in a disk array using JBOD (just a bunch of disks):
- Cassandra might not fail from the loss of one disk in a JBOD array, but some reads and writes may fail when:
- The operation’s consistency level is ALL.
- The data being requested or written is stored on the defective disk.
- The data to be compacted is on the defective disk.
- It’s possible that you can simply replace the disk, restart Cassandra, and run nodetool repair (see the sketch after this list). However, if the disk crash corrupted the Cassandra system table, you must remove the incomplete data from the other disks in the array. The procedure for doing this depends on whether the cluster uses vnodes or single-token architecture.
- These steps are supported for Cassandra versions 3.2 and later. If a disk fails on a node in a cluster using an earlier version of Cassandra, replace the node.
- Replacing a dead node or dead seed node.
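For the simple case where the system table survived the crash, a minimal sketch of the replace-and-repair path; the service manager, device name, and JBOD data directory below are assumptions about the deployment:

```
# Stop Cassandra, then replace and remount the failed disk
sudo systemctl stop cassandra
sudo mount /dev/sdX1 /var/lib/cassandra/data2   # hypothetical device and data directory

# Restart Cassandra and repair the node's full token range
sudo systemctl start cassandra
nodetool repair
```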