Testing your use case is trivial. Use the `valkey-benchmark` utility to generate random data sets, then check the space used with the `INFO memory` command.
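As a minimal sketch of such a test (the key count and value size here are illustrative; adjust them to match your workload):

```sh
# Load 1 million random SET commands with 100-byte values
# (-r randomizes key names across a space of 1,000,000 keys).
valkey-benchmark -t set -n 1000000 -r 1000000 -d 100

# Inspect the resulting memory footprint.
valkey-cli INFO memory | grep used_memory_human
```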
In the past, developers experimented with Virtual Memory and other systems in order to allow larger-than-RAM datasets, but in the end we are happiest doing one thing well: data served from memory, disk used for storage. So for now there are no plans to create an on-disk backend for Valkey. Most of what Valkey is, after all, is a direct result of its current design.
If your real problem is not the total RAM needed, but the fact that you need to split your data set into multiple Valkey instances, please read the partitioning page in this documentation for more info.
Yes, a common design pattern involves keeping very write-heavy small data in Valkey (as well as data that needs Valkey's data structures to model the problem efficiently), while storing big blobs of data in an SQL or eventually consistent on-disk database. Similarly, Valkey is sometimes used to keep in memory another copy of a subset of the same data stored in the on-disk database. This may look similar to caching, but it is actually a more advanced model, since normally the Valkey dataset is updated together with the on-disk DB dataset rather than refreshed on cache misses.
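A minimal sketch of that update-together pattern might look like the following (the table, key, and function names are hypothetical, and `psql` stands in for whatever on-disk database you use):

```sh
# Hypothetical write path: both stores are updated together,
# so the in-memory copy never has to be refreshed on a cache miss.
update_user_score() {
  # 1. Write to the on-disk database (assumed table "users").
  psql -c "UPDATE users SET score = $2 WHERE id = '$1';"
  # 2. Update the in-memory copy, modeled with a Valkey sorted set.
  valkey-cli ZADD leaderboard "$2" "user:$1"
}

update_user_score 1000 42
```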
A good practice is to consider memory consumption when mapping your logical data model to the physical data model within Valkey. These considerations include using specific data types, key patterns, and normalization.
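For example, one such consideration (assuming a simple, hypothetical user record) is whether to store each field under its own key or to group the fields into a single hash; the `MEMORY USAGE` command lets you compare the two physical models directly:

```sh
# Physical model A: one key per field.
valkey-cli SET user:1000:name "Alice"
valkey-cli SET user:1000:email "alice@example.com"

# Physical model B: one hash per record.
valkey-cli HSET user:1001 name "Alice" email "alice@example.com"

# Compare the bytes attributed to each key and its value.
valkey-cli MEMORY USAGE user:1000:name
valkey-cli MEMORY USAGE user:1000:email
valkey-cli MEMORY USAGE user:1001
```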
Beyond data modeling, there is more info in the Memory Optimization page.
Valkey has built-in protections allowing users to set a maximum limit on memory usage, via the `maxmemory` option in the configuration file. If this limit is reached, Valkey will start to reply with an error to write commands (but will continue to accept read-only commands).
You can also configure Valkey to evict keys when the max memory limit is reached. See the eviction policy docs for more information on this.
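For example, a configuration along these lines (the values are illustrative) caps memory and enables eviction:

```
# valkey.conf
maxmemory 2gb
# Evict the least recently used keys (across all keys) when the limit is hit.
maxmemory-policy allkeys-lru
```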
Short answer: `echo 1 > /proc/sys/vm/overcommit_memory` :)

And now the long one:
The Valkey background saving schema relies on the copy-on-write semantics of the `fork` system call in modern operating systems: Valkey forks, creating a child process that is an exact copy of the parent. The child process dumps the DB to disk and finally exits. In theory the child should use as much memory as the parent, being a copy, but thanks to the copy-on-write semantics implemented by most modern operating systems, the parent and child processes share the common memory pages. A page is duplicated only when it changes in the child or in the parent. Since in theory all the pages may change while the child process is saving, Linux can't tell in advance how much memory the child will take, so if the `overcommit_memory` setting is set to zero, the fork will fail unless there is as much free RAM as required to really duplicate all the parent's memory pages. If you have a Valkey dataset of 3 GB and just 2 GB of free memory, it will fail.
Setting `overcommit_memory` to 1 tells Linux to relax and perform the fork in a more optimistic allocation fashion, and this is indeed what you want for Valkey.

You can refer to the `proc(5)` man page for explanations of the available values.
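To make the setting stick across reboots, a typical approach looks like this (the exact sysctl drop-in path may vary by distribution):

```sh
# Apply immediately without a reboot.
sudo sysctl -w vm.overcommit_memory=1

# Persist across reboots.
echo "vm.overcommit_memory = 1" | sudo tee /etc/sysctl.d/99-valkey.conf
```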
Yes, the Valkey background saving process is always forked when the server is outside of the execution of a command, so every command reported to be atomic in RAM is also atomic from the point of view of the disk snapshot.
Enable I/O threading to offload client communication to threads. In Valkey 8, the I/O threading implementation has been rewritten and greatly improved. Reading commands from clients and writing replies back uses considerable CPU time. By offloading this work to separate threads, the main thread can focus on executing commands.
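For example, I/O threading can be enabled with a setting like the following (the value is illustrative; tune it to the cores you have available):

```
# valkey.conf
io-threads 4
```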
You can also start multiple instances of Valkey on the same box and combine them into a cluster.
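A rough sketch of that setup (ports and file names are illustrative; Valkey Cluster needs at least three primaries):

```sh
# Start three cluster-enabled instances on one machine.
for port in 7000 7001 7002; do
  valkey-server --port $port --cluster-enabled yes \
    --cluster-config-file nodes-$port.conf --daemonize yes
done

# Join them into a single cluster (no replicas, for brevity).
valkey-cli --cluster create 127.0.0.1:7000 127.0.0.1:7001 127.0.0.1:7002 \
  --cluster-replicas 0
```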
Valkey can handle up to 2^32 keys, and was tested in practice to handle at least 250 million keys per instance.

Every hash, list, set, and sorted set can hold 2^32 elements.
In other words your limit is likely the available memory in your system.
If you use keys with a limited time to live (Valkey expires) this is normal behavior. This is what happens:

Keys that have logically expired on the primary are not removed from memory right away; until that memory is reclaimed, they are still advertised in the `INFO` output and in the `DBSIZE` command, even though they are no longer a logical part of the dataset. Expired keys are not transferred to the replica, so the replica can report a smaller key count. Because of this, it's common for users with many expired keys to see fewer keys in the replicas. However, logically, the primary and replica will have the same content.
Read about the history of Valkey.