For example, suppose L2 is filled with 100 sstables, but L3 also holds just 100 sstables (rather than 1000).
All sstables have the same size, so roughly 90% of the sstables, and therefore 90% of the data, are in L3.
L3 is a run and therefore cannot have any duplicate data.
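The arithmetic behind these numbers can be sketched in a few lines. This is only an illustration under stated assumptions: all sstables have the same size, the last level is a run with no internal duplicates, and, pessimistically, every row in the lower levels is an overwritten copy of a row in the last level. The function names and the L1 size of 10 sstables in the partially filled example are assumptions made for the sketch.

```python
def last_level_fraction(level_sizes):
    """Fraction of all sstables (and so of all data) that sits in the last level."""
    return level_sizes[-1] / sum(level_sizes)


def space_amplification_bound(level_sizes):
    """Pessimistic bound: total data on disk divided by the data in the last level.

    Assumes the last level is a run (no duplicates inside it) and that
    everything in the lower levels duplicates rows of the last level.
    """
    return sum(level_sizes) / level_sizes[-1]


filled = [10, 100, 1000]       # L1, L2, L3 all filled
half_empty = [10, 100, 100]    # L3 holds only as much as L2

print(f"{last_level_fraction(filled):.0%}")            # 90%
print(f"{space_amplification_bound(filled):.2f}")      # 1.11
print(f"{last_level_fraction(half_empty):.0%}")        # 48%
print(f"{space_amplification_bound(half_empty):.2f}")  # 2.10
```

With filled levels the bound stays close to 1, while a last level that is only as large as the previous one pushes it to roughly 2.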
Each of the other levels, L1, L2, L3, etc., is a single run of an exponentially increasing size: L1 is a run of 10 sstables, L2 is a run of 100 sstables, L3 is a run of 1000 sstables, and so on.
(Factor 10 is the default setting in both Scylla and Apache Cassandra).
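As a small illustration of this exponential growth, the capacity of each level can be computed from the growth factor. The names `FANOUT` and `level_capacity` are hypothetical, chosen for this sketch, and are not taken from Scylla or Cassandra code.

```python
FANOUT = 10  # growth factor between levels; 10 is the default mentioned above


def level_capacity(level, fanout=FANOUT):
    """Number of sstables in a filled level L<level>, for level >= 1."""
    if level < 1:
        raise ValueError("L0 is not a run and has no fixed capacity in this sketch")
    return fanout ** level


for n in (1, 2, 3):
    print(f"L{n}: {level_capacity(n)} sstables")
```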
The Leveled Compaction Strategy was the second compaction strategy introduced in Apache Cassandra.
It was first introduced in Cassandra 1.0 in 2011, and was based on ideas from Google’s LevelDB.
LCS does have a worst case, in which we can get 2-fold space amplification. This happens when the last level is not filled, but rather only filled as much as the previous level.
This post and the rest of this series are based on a talk that I gave (with Raphael Carvalho) at the last annual Scylla Summit in San Francisco. The video and slides for the talk are available on our Tech Talk page.
In other words, a run is a collection of sstables with non-overlapping token ranges. The benefit of using a run of fragments (small sstables) instead of one huge sstable is that with a run, we can compact only parts of the huge sstable instead of all of it.
The job of the Leveled Compaction Strategy is to maintain this structure while keeping L0 empty.
Let’s now explain why LCS indeed fulfills its ambition to provide low space amplification, and therefore solves STCS’s main problem.
In the previous post, we saw that space amplification comes in two varieties: the first is temporary disk space use during compaction, and the second is space wasted by storing different values for the same overwritten rows.
LCS does not have the temporary disk space problem which plagued STCS: while STCS may need to do huge compactions and temporarily have both input and output on disk, LCS always does small compaction steps, involving roughly 11 input and output sstables of a fixed size. This means we may need roughly 11 × 160 MB, less than 2 GB, of temporary disk space, not half the disk as in STCS.
The reason is that most of the data is stored in the biggest level, and since this level is a run (with different sstables having no overlap), we cannot have any duplicates inside this run.
The best case for LCS is that the last level is filled.
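The run property that this argument relies on, namely that the sstables of a level have non-overlapping token ranges, can be sketched as a simple check. This is a minimal illustration, not how Scylla or Cassandra represent sstables: each sstable is modeled here as a hypothetical `(first_token, last_token)` pair of integers with inclusive bounds.

```python
from typing import List, Tuple

# Hypothetical model: an sstable is just its inclusive token range.
TokenRange = Tuple[int, int]


def is_run(sstables: List[TokenRange]) -> bool:
    """Return True if no two sstables have overlapping token ranges."""
    ordered = sorted(sstables)  # sort by first token
    for (_, prev_last), (next_first, _) in zip(ordered, ordered[1:]):
        # After sorting, an overlap exists exactly when some sstable
        # starts at or before the end of the one preceding it.
        if next_first <= prev_last:
            return False
    return True


print(is_run([(0, 9), (10, 19), (20, 29)]))  # True: disjoint ranges
print(is_run([(0, 10), (10, 19)]))           # False: token 10 is in both
```

Because the ranges are disjoint, any given key can live in at most one sstable of the level, which is why a run cannot contain two versions of the same row.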