Building The Perfect Cassandra Test Environment
A month back, one of our clients asked us to set up 15 individual single-node Cassandra instances, each living in 64MB of RAM, all residing on the same machine. My first response was “Why!?”
Qualities Of An Ideal Cassandra Test Framework
So what are the qualities of an ideal Cassandra test framework?
Light-weight and available — A good test framework will take up as few resources as possible and be accessible right when you want it.
Parity with Production — The test environment should perfectly simulate the production environment. This is a no-brainer. After all, what good does it do you to pass a test only to wonder whether an error lurks in the differences between the test and production environments?
Stateless — Between test runs, there’s no reason to keep any information around. So why not just throw it all away?
Isolated — Most often there will be several developers on a team, and there’s a good chance they’ll be testing things at the same time. It’s important to keep each developer quarantined from the others.
Fault Resistant — Remember, we’re a little concerned here that Cassandra is going to be a resource hog or otherwise just not work. Being “fault resistant” means striking the right balance so that Cassandra takes up as few resources as possible without actually failing.
Implementing The Ideal Cassandra Test Framework
The first thing to do is to set up the test environment on a per-developer basis. This means changing a few paths. In cassandra.yaml, change:
data_file_directories:
- /home/jberryman/cassandra/data
commitlog_directory: /home/jberryman/cassandra/commitlog
saved_caches_directory: /home/jberryman/saved_caches
And then in log4j-server.properties change
log4j.appender.R.File=/home/jberryman/cassandra/system.log
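Since each of the 15 developers needs their own copy of these paths, it can help to script the substitution rather than edit by hand. Below is a minimal sketch of one way to do it, assuming PyYAML is installed and that each developer keeps a private copy of the config; the file locations are illustrative, not part of the original setup.

# Minimal sketch: stamp per-developer paths into a private copy of
# cassandra.yaml. Assumes PyYAML; file locations are illustrative.
import os
import yaml

base = os.path.expanduser("~/cassandra")  # e.g. /home/jberryman/cassandra

with open("conf/cassandra.yaml") as f:
    conf = yaml.safe_load(f)

conf["data_file_directories"] = [os.path.join(base, "data")]
conf["commitlog_directory"] = os.path.join(base, "commitlog")
conf["saved_caches_directory"] = os.path.join(base, "saved_caches")

with open("conf/cassandra.yaml", "w") as f:
    yaml.safe_dump(conf, f, default_flow_style=False)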
Next, it’s a good idea to create a wrapper around whatever client you’re using. This has several benefits. For one thing, a wrapper provides a guard against the client changing out from under you. This is especially important right now since so many clients are scrambling to be CQL3 compliant. The wrapper is also a great place to put safeguards against horking up your production data when you think you’re running a test. Perhaps the easiest way to safeguard against this is to issue the DESCRIBE CLUSTER statement and make sure that the cluster name is “TestCluster”. (If your CQL client doesn’t honor this statement, you can just create a keyspace called “Yes_ThisIsIndeedATestCluster” and test for its existence.) Once the wrapper is complete, it can be used with functional parity on both the test and production clusters.
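To make the safeguard concrete, here is a minimal sketch of such a wrapper, assuming the DataStax Python driver (cassandra-driver); the class name and guard logic are illustrative. Since not every client honors DESCRIBE CLUSTER, it reads the cluster name from the system.local table, which any CQL client can query.

# Minimal sketch of a client wrapper with a test-cluster guard.
# Assumes the DataStax Python driver; names are illustrative.
from cassandra.cluster import Cluster

class SafeTestClient:
    def __init__(self, hosts, expected_name="TestCluster"):
        self.cluster = Cluster(hosts)
        self.session = self.cluster.connect()
        # Guard: refuse to run unless we are pointed at the test cluster.
        row = self.session.execute(
            "SELECT cluster_name FROM system.local").one()
        if row.cluster_name != expected_name:
            raise RuntimeError(
                "Refusing to run tests against cluster %r" % row.cluster_name)

    def execute(self, statement, parameters=None):
        # Every test query funnels through here, behind the guard above.
        return self.session.execute(statement, parameters)

Tests would then construct SafeTestClient(["127.0.0.1"]) once and issue everything through its execute method, so the guard runs before any statement can touch a real cluster.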
The simplest way to make Cassandra lightweight is simply to declare it so! In cassandra-env.sh, change:
MAX_HEAP_SIZE="64M"
HEAP_NEWSIZE="12M"
However, just because you have now declared Cassandra to be lightweight doesn’t mean that it will JustWork™. Given this little heap space to move in, Cassandra will happily toss you an OutOfMemory error on its first SSTable flush, compaction, or garbage collection. To guard against this, we have a bit of work to do!
The first thing to do is to reduce the number of threads, especially for reading and writing. In cassandra.yaml there are several changes to make:
rpc_server_type: hsha
Here, hsha stands for “half synchronous, half asynchronous.” This makes sure that all Thrift clients are handled asynchronously, using a small number of threads that does not grow with the number of Thrift clients.
concurrent_reads: 2
concurrent_writes: 2
rpc_min_threads: 1
rpc_max_threads: 1
As stated, the first two lines limit the number of reads and writes that can happen at the same time; 2 is the minimum number allowed here. The second two lines limit how many threads are available for serving requests. Everything to this point serves to make sure that reads and writes cannot overpower Cassandra during flushes and compactions. Next up:
concurrent_compactors: 1
If you are using SSDs, this will limit the number of compactors to 1. If you’re using spinning magnets, then you’re already limited to a single concurrent compactor.
Next, we need to do everything we can to make sure that compaction is not hindered. One setting here:
compaction_throughput_mb_per_sec: 0
This disables compaction throttling completely, giving compaction free rein over other competing priorities.
Next we turn all the knobs on memory usage as low as possible:
in_memory_compaction_limit_in_mb: 1
This is the minimum limit for allowing compaction to take place in memory. With such a low setting, much of compaction will take place in a two-pass method that is I/O intensive. But I/O is not the thing we’re worried about!
key_cache_size_in_mb: 0
At the expense of read times, we can do away with key caches. But this may not even be necessary, because we can do even better:
reduce_cache_sizes_at: 0
reduce_cache_capacity_to: 0
The first line says, “As soon as you’ve used up this much memory, reduce cache capacity.” And since this is set to 0, cache capacity is reduced just about as soon as Cassandra starts being used. The second line then dictates that the caches should effectively not be used at all.
Finally, on a test cluster we’re not worried about data durability, so there are plenty of safeguards that we can simply do away with. For one, before starting the test cluster, go ahead and remove everything in the data and commitlog directories. Next, in cassandra.yaml set hinted_handoff_enabled: false. When creating a test keyspace, go ahead and set durable_writes = false so that the commit log is never even populated. Finally, when creating test tables, consider setting read_repair_chance = 0 and bloom_filter_fp_chance = 1. These keyspace and table modifications may be unnecessary, though, since I was able to get pretty good performance without them.
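For illustration, here is a minimal sketch of creating such a disposable keyspace and table, again assuming the DataStax Python driver; the keyspace and table names are made up for the example.

# Minimal sketch: a throwaway keyspace and table with durability
# safeguards turned off. Names are illustrative.
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect()

# durable_writes = false means the commit log is never even populated.
session.execute("""
    CREATE KEYSPACE test_ks
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
    AND durable_writes = false
""")

# read_repair_chance = 0 and bloom_filter_fp_chance = 1 shed two more
# background costs that a throwaway test table does not need.
session.execute("""
    CREATE TABLE test_ks.widgets (
        id text PRIMARY KEY,
        payload text
    ) WITH read_repair_chance = 0
      AND bloom_filter_fp_chance = 1
""")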
Testing The Test Framework
Now that all of our changes are in place, let’s fire up Cassandra and see how she performs!
$ rm -fr /home/jberryman/cassandra && bin/cassandra -f
So far so good. “Starting listening for CQL clients on localhost/127.0.0.1:9042” means that we’re alive and ready to service requests. Now it’s time to slam Cassandra:
$ bin/cassandra-stress
total,interval_op_rate,interval_key_rate,latency/95th/99th,elapsed_time
33287,3328,3328,8.0,54.6,277.0,10
85059,5177,5177,7.5,33.3,276.7,20
133153,4809,4809,7.4,34.0,274.8,30
183111,4995,4995,6.9,31.6,165.1,40
233177,5006,5006,6.8,32.0,123.5,51
288998,5582,5582,6.7,26.7,123.5,61
341481,5248,5248,6.7,26.3,129.7,71
391594,5011,5011,6.7,26.7,129.7,81
441645,5005,5005,6.5,29.0,122.5,92
494198,5255,5255,6.3,28.3,122.9,102
539406,4520,4520,6.4,24.4,122.9,112
591272,5186,5186,6.4,26.8,122.9,122
641202,4993,4993,6.6,27.9,122.9,132
696041,5483,5483,6.6,28.2,122.9,143
747078,5103,5103,6.5,26.1,274.4,153
797125,5004,5004,6.4,25.3,274.4,163
839887,4276,4276,6.1,23.9,273.6,173
880678,4079,4079,6.0,22.9,273.6,184
928384,4770,4770,5.8,21.7,273.6,194
979878,5149,5149,5.7,20.2,273.6,204
1000000,2012,2012,5.5,19.4,273.6,208
END
Wow… so not only does it not die, it’s actually pretty darn performant! Looking back at the logs, I see a couple of warnings:
WARN 17:15:57,030 Heap is 0.5260566963447822 full. You may need to reduce
memtable and/or cache sizes. Cassandra is now reducing cache sizes to free
up memory. Adjust reduce_cache_sizes_at threshold in cassandra.yaml if you
don't want Cassandra to do this automatically
Ah… this has to do with the reduce_cache_sizes_at and reduce_cache_capacity_to bit from earlier. After this warning hits, we know that the caches have been tossed out. I wonder how the lack of caches will affect read performance. Let’s see!
$ bin/cassandra-stress --operation READ
total,interval_op_rate,interval_key_rate,latency/95th/99th,elapsed_time
34948,3494,3494,8.4,39.9,147.0,10
95108,6016,6016,7.9,19.3,145.2,20
155830,6072,6072,7.8,15.4,144.7,30
213037,5720,5720,7.8,14.6,72.5,40
274021,6098,6098,7.8,13.7,56.8,51
335575,6155,6155,7.7,12.6,56.6,61
396074,6049,6049,7.7,12.6,56.6,71
455660,5958,5958,7.7,12.7,45.8,81
516840,6118,6118,7.7,12.3,45.8,91
576045,5920,5920,7.7,12.3,45.6,102
635237,5919,5919,7.7,12.7,45.6,112
688830,5359,5359,7.7,13.5,45.6,122
740047,5121,5121,7.7,15.1,45.8,132
796249,5620,5620,7.8,14.8,42.4,143
853788,5753,5753,7.9,14.1,37.1,153
906821,5303,5303,7.9,15.1,37.1,163
963981,5716,5716,7.9,14.1,37.1,173
1000000,3601,3601,7.9,13.3,37.1,180
END
Hooray, it works! And it’s still quite performant! I was concerned that the lack of caches would kill Cassandra’s read performance, but it seems to be just fine. Looking back at the log file again, there are several more warnings, each looking about like this:
WARN 17:16:25,082 Heap is 0.7914885099943694 full. You may need to reduce
memtable and/or cache sizes. Cassandra will now flush up to the two largest
memtables to free up memory. Adjust flush_largest_memtables_at threshold in
cassandra.yaml if you don't want Cassandra to do this automatically
WARN 17:16:25,083 Flushing CFS(Keyspace='Keyspace1', ColumnFamily='Standard1')
to relieve memory pressure
Despite the fact that we’re regularly having these emergency memtable flushes, Cassandra never died!
Popping open jconsole, we can make a couple more observations. The first is that while the unaltered Cassandra process takes up roughly 8GB of memory, this test Cassandra never goes over 64MB. Second, we also see that the number of threads on the unaltered Cassandra hovers around 120-130, while the test Cassandra remains somewhere between 40 and 50.
Conclusion
So you see, my client’s request was actually quite reasonable, and quite a good idea! Now they have a test framework that is able to support 15 developers on a single machine, each with their own isolated test environment. This is a good example of how consultants sometimes learn from the companies they’re consulting.