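Redis: Persistence
Persistence
Reference: http://redis.io/topics/persistence
Data persistence means saving data durably by some means. Storing data on disk (rather than in memory) is one form of persistence: the data is not lost when the machine shuts down or restarts.
Redis keeps its data in memory, and memory is volatile: if the Linux host crashes or reboots, or if Redis itself crashes or restarts, everything held in memory is lost.
To address this, Redis offers several persistence options with different levels of durability: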
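- RDB persistence (analogous to a "physical log"): produces point-in-time snapshots of the dataset at specified intervals.
- AOF persistence (analogous to a "logical log"): logs every write command the server executes.
- When the server starts, the dataset is rebuilt by re-executing the commands recorded in the AOF.
- Commands in the AOF file are stored in the Redis protocol format, and new commands are appended to the end of the file. Redis can also rewrite the AOF in the background so that the file does not grow beyond the size actually needed to represent the dataset state.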
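To make "stored in the Redis protocol format" concrete, here is a minimal Python sketch of how one write command is encoded as a protocol array of bulk strings. The function name encode_command is an assumption made for illustration; it only shows the encoding and is not how Redis itself writes the file.

```python
def encode_command(*args: str) -> bytes:
    """Encode one command as a Redis protocol array of bulk strings."""
    # "*<n>\r\n" announces an array with n elements.
    parts = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode("utf-8")
        # Each element is "$<byte length>\r\n<bytes>\r\n".
        parts.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(parts)


# Appending this value to a file mimics adding one entry to the log.
print(encode_command("SET", "k1", "v1"))
# prints b'*3\r\n$3\r\nSET\r\n$2\r\nk1\r\n$2\r\nv1\r\n'
```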
Redis can also use AOF and RDB persistence together. In that case, Redis restores the dataset from the AOF file on restart, because the AOF usually holds a more complete dataset than the RDB file.
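RDB (Redis Database)
Redis Database (RDB) writes a snapshot of the in-memory dataset to disk at specified intervals; on recovery, the snapshot file is read straight back into memory. An RDB file is a very compact representation of the Redis dataset at a single point in time.
- By default, Redis saves the snapshot in a binary file named dump.rdb.
- Redis can be configured to save the dataset automatically when "at least M changes were made to the dataset within N seconds"; a save can also be started manually with SAVE or BGSAVE.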
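How it works
When Redis needs to write the dump.rdb file, the server does the following:
- Redis calls fork(), so there is now both a parent process and a child process.
- The child writes the dataset to a temporary RDB file.
- When the child has finished writing the new RDB file, Redis replaces the old RDB file with the new one, and the old file is removed.
- This approach lets Redis benefit from copy-on-write semantics.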
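The "write to a temporary file, then replace the old file" part of the steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: fork() is left out, JSON stands in for the binary RDB format, and save_snapshot / load_snapshot are hypothetical names.

```python
import json
import os
import tempfile


def save_snapshot(data: dict, path: str) -> None:
    """Write data to a temp file in the target directory, then swap it in."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(prefix="temp-", suffix=".rdb", dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(data, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes are on disk first
        # os.replace swaps the new file in; readers see the old or new file,
        # never a half-written one.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_snapshot(path: str) -> dict:
    """Read a snapshot written by save_snapshot back into memory."""
    with open(path) as snapshot:
        return json.load(snapshot)
```

Because the old file is only replaced once the new one is complete, a crash in the middle of a save leaves the previous snapshot intact.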
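Advantages and disadvantages
Advantages:
- Restoring a large dataset from RDB is faster than restoring it from AOF.
- RDB maximizes Redis performance: to save an RDB file, the only thing the parent process has to do is fork a child, which then does all the remaining work. The parent never performs any disk I/O.
- RDB is well suited to backups: for example, you can archive the RDB file every hour for the latest 24 hours and also keep one RDB file for each day of the month. If something goes wrong, you can restore the dataset to any of those versions.
- RDB is well suited to disaster recovery: it is a single, compact file that can be transferred (after encryption) to another data center.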
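Disadvantages:
- A snapshot does not contain changes made after it was taken, so data written since the last snapshot may be lost on recovery.
- Snapshotting can affect the Redis service: every save requires a fork() of a child process. With a large dataset, fork() can be time-consuming and may cause Redis to stop serving clients for a short time (typically milliseconds, and up to about a second when the dataset is very large).
- AOF rewrites also require fork(), but however long the interval between AOF rewrites, durability is not reduced.
In short, RDB is better suited to backing up, transferring, and recovering data, at the cost of possibly losing the most recent writes.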
Appendix: configuration options
RDB is configured by editing the configuration file "redis.conf":
################################ SNAPSHOTTING ################################
#
# Save the DB on disk:
#
# save <seconds> <changes>
#
# Will save the DB if both the given number of seconds and the given
# number of write operations against the DB occurred.
#
# In the example below the behaviour will be to save:
# after 900 sec (15 min) if at least 1 key changed
# after 300 sec (5 min) if at least 10 keys changed
# after 60 sec if at least 10000 keys changed
#
# Note: you can disable saving completely by commenting out all "save" lines.
#
# It is also possible to remove all the previously configured save
# points by adding a save directive with a single empty string argument
# like in the following example:
#
# save ""
save 900 1
save 300 10
save 60 10000
# By default Redis will stop accepting writes if RDB snapshots are enabled
# (at least one save point) and the latest background save failed.
# This will make the user aware (in a hard way) that data is not persisting
# on disk properly, otherwise chances are that no one will notice and some
# disaster will happen.
#
# If the background saving process will start working again Redis will
# automatically allow writes again.
#
# However if you have setup your proper monitoring of the Redis server
# and persistence, you may want to disable this feature so that Redis will
# continue to work as usual even if there are problems with disk,
# permissions, and so forth.
stop-writes-on-bgsave-error yes
# Compress string objects using LZF when dump .rdb databases?
# For default that's set to 'yes' as it's almost always a win.
# If you want to save some CPU in the saving child set it to 'no' but
# the dataset will likely be bigger if you have compressible values or keys.
rdbcompression yes
# Since version 5 of RDB a CRC64 checksum is placed at the end of the file.
# This makes the format more resistant to corruption but there is a performance
# hit to pay (around 10%) when saving and loading RDB files, so you can disable it
# for maximum performances.
#
# RDB files created with checksum disabled have a checksum of zero that will
# tell the loading code to skip the check.
rdbchecksum yes
# The filename where to dump the DB
dbfilename dump.rdb
# The working directory.
#
# The DB will be written inside this directory, with the filename specified
# above using the 'dbfilename' configuration directive.
#
# The Append Only File will also be created inside this directory.
#
# Note that you must specify a directory here, not a file name.
dir ./
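Where:
- "save <seconds> <changes>": defines a save point: "if at least <seconds> seconds have passed and at least <changes> changes were made since the last save, write a snapshot to disk". Several save points can be combined, as in the configuration above; a snapshot is taken when any one of them is met.
- "dbfilename": sets the RDB file name; the default is "dump.rdb".
- "dir": sets the directory that holds the RDB and AOF files.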
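The save-point rule can be written out as a small Python check. This is a sketch of the rule as described in the configuration comments, not Redis source code; SAVE_POINTS and should_save are names chosen for this example.

```python
# (seconds, changes) pairs matching the "save" lines in the sample config.
SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]


def should_save(seconds_since_last_save, changes_since_last_save,
                save_points=SAVE_POINTS):
    """Return True if any save point is satisfied.

    A save point is satisfied when enough time has passed AND enough
    changes have accumulated since the last successful save.
    """
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )


print(should_save(300, 10))  # True: 5 minutes passed and 10 keys changed
print(should_save(300, 9))   # False: no save point is satisfied yet
```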
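AOF (Append-only File)
With the append-only file (AOF), every time Redis receives a command that changes data, it appends that command to the AOF file (only writes are logged, not reads). When Redis restarts, it restores the data by executing all the commands in the AOF file.
Rewriting the AOF compacts its contents: redundant commands are dropped, so the file becomes smaller.
- For example, after "set k1 v1" followed by "set k1 v2", only the latter survives the rewrite; the former is removed because it no longer has any effect.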
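The effect of a rewrite can be illustrated with a toy Python function that replays a command log and emits the shortest log that yields the same state. This is a simplification under stated assumptions: only SET, INCR and DEL are handled, rewrite_log is a hypothetical name, and Redis itself builds the new file from the dataset in memory rather than by reading the old log.

```python
def rewrite_log(commands):
    """Replay (name, key, *args) tuples and return one SET per surviving key."""
    state = {}
    for name, key, *args in commands:
        name = name.upper()
        if name == "SET":
            state[key] = args[0]
        elif name == "INCR":
            state[key] = str(int(state.get(key, "0")) + 1)
        elif name == "DEL":
            state.pop(key, None)
        else:
            raise ValueError(f"unsupported command in this sketch: {name}")
    # One command per key is enough to rebuild the final state.
    return [("SET", key, value) for key, value in state.items()]


log = [("SET", "k1", "v1"), ("SET", "k1", "v2")] + [("INCR", "counter")] * 100
print(len(log))          # 102 commands before the rewrite
print(rewrite_log(log))  # [('SET', 'k1', 'v2'), ('SET', 'counter', '100')]
```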
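Advantages:
- AOF is the alternative persistence mode and offers much better durability: with the default fsync policy, only about one second of writes can be lost in an event such as a power outage.
Disadvantages:
- The AOF file keeps growing as commands are executed.
- For example, if you increment a counter 100 times, the database holds only the final value, but the AOF contains 100 entries, 99 of which are not needed to rebuild the current state.
- However, Redis can rebuild the AOF in the background without interrupting service, which compacts the file.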
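How much an append-only log can lose in a crash depends on when appended data is forced to disk, which is what the appendfsync setting in the configuration below controls. The following toy Python class sketches the three policies. It is an assumption-laden illustration: AppendOnlyLog and sync_count are invented for this example, and the once-per-second sync is done inline here rather than in a background thread.

```python
import os
import time


class AppendOnlyLog:
    """Toy append-only log with an fsync policy: 'always', 'everysec' or 'no'."""

    def __init__(self, path, policy="everysec"):
        if policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown fsync policy: {policy}")
        self.policy = policy
        self.file = open(path, "ab")
        self.last_sync = time.monotonic()
        self.sync_count = 0  # number of fsync calls, for illustration only

    def append(self, command: bytes) -> None:
        self.file.write(command + b"\n")
        self.file.flush()  # hand the bytes to the operating system
        now = time.monotonic()
        due = self.policy == "everysec" and now - self.last_sync >= 1.0
        if self.policy == "always" or due:
            os.fsync(self.file.fileno())  # force the OS to write to disk
            self.last_sync = now
            self.sync_count += 1

    def close(self) -> None:
        self.file.close()
```

With "always", every append pays for an fsync; with "no", the data reaches the disk whenever the operating system decides to flush it.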
Appendix: configuration options
AOF is configured by editing the configuration file "redis.conf":
############################## APPEND ONLY MODE ###############################
# By default Redis asynchronously dumps the dataset on disk. This mode is
# good enough in many applications, but an issue with the Redis process or
# a power outage may result into a few minutes of writes lost (depending on
# the configured save points).
#
# The Append Only File is an alternative persistence mode that provides
# much better durability. For instance using the default data fsync policy
# (see later in the config file) Redis can lose just one second of writes in a
# dramatic event like a server power outage, or a single write if something
# wrong with the Redis process itself happens, but the operating system is
# still running correctly.
#
# AOF and RDB persistence can be enabled at the same time without problems.
# If the AOF is enabled on startup Redis will load the AOF, that is the file
# with the better durability guarantees.
#
# Please check http://redis.io/topics/persistence for more information.
appendonly no
# The name of the append only file (default: "appendonly.aof")
appendfilename "appendonly.aof"
# The fsync() call tells the Operating System to actually write data on disk
# instead of waiting for more data in the output buffer. Some OS will really flush
# data on disk, some other OS will just try to do it ASAP.
#
# Redis supports three different modes:
#
# no: don't fsync, just let the OS flush the data when it wants. Faster.
# always: fsync after every write to the append only log. Slow, Safest.
# everysec: fsync only one time every second. Compromise.
#
# The default is "everysec", as that's usually the right compromise between
# speed and data safety. It's up to you to understand if you can relax this to
# "no" that will let the operating system flush the output buffer when
# it wants, for better performances (but if you can live with the idea of
# some data loss consider the default persistence mode that's snapshotting),
# or on the contrary, use "always" that's very slow but a bit safer than
# everysec.
#
# More details please check the following article:
# http://antirez.com/post/redis-persistence-demystified.html
#
# If unsure, use "everysec".
# appendfsync always
appendfsync everysec
# appendfsync no
# When the AOF fsync policy is set to always or everysec, and a background
# saving process (a background save or AOF log background rewriting) is
# performing a lot of I/O against the disk, in some Linux configurations
# Redis may block too long on the fsync() call. Note that there is no fix for
# this currently, as even performing fsync in a different thread will block
# our synchronous write(2) call.
#
# In order to mitigate this problem it's possible to use the following option
# that will prevent fsync() from being called in the main process while a
# BGSAVE or BGREWRITEAOF is in progress.
#
# This means that while another child is saving, the durability of Redis is
# the same as "appendfsync none". In practical terms, this means that it is
# possible to lose up to 30 seconds of log in the worst scenario (with the
# default Linux settings).
#
# If you have latency problems turn this to "yes". Otherwise leave it as
# "no" that is the safest pick from the point of view of durability.
no-appendfsync-on-rewrite no
# Automatic rewrite of the append only file.
# Redis is able to automatically rewrite the log file implicitly calling
# BGREWRITEAOF when the AOF log size grows by the specified percentage.
#
# This is how it works: Redis remembers the size of the AOF file after the
# latest rewrite (if no rewrite has happened since the restart, the size of
# the AOF at startup is used).
#
# This base size is compared to the current size. If the current size is
# bigger than the specified percentage, the rewrite is triggered. Also
# you need to specify a minimal size for the AOF file to be rewritten, this
# is useful to avoid rewriting the AOF file even if the percentage increase
# is reached but it is still pretty small.
#
# Specify a percentage of zero in order to disable the automatic AOF
# rewrite feature.
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
# An AOF file may be found to be truncated at the end during the Redis
# startup process, when the AOF data gets loaded back into memory.
# This may happen when the system where Redis is running
# crashes, especially when an ext4 filesystem is mounted without the
# data=ordered option (however this can't happen when Redis itself
# crashes or aborts but the operating system still works correctly).
#
# Redis can either exit with an error when this happens, or load as much
# data as possible (the default now) and start if the AOF file is found
# to be truncated at the end. The following option controls this behavior.
#
# If aof-load-truncated is set to yes, a truncated AOF file is loaded and
# the Redis server starts emitting a log to inform the user of the event.
# Otherwise if the option is set to no, the server aborts with an error
# and refuses to start. When the option is set to no, the user requires
# to fix the AOF file using the "redis-check-aof" utility before to restart
# the server.
#
# Note that if the AOF file will be found to be corrupted in the middle
# the server will still exit with an error. This option only applies when
# Redis will try to read more data from the AOF file but not enough bytes
# will be found.
aof-load-truncated yes
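Where:
- "appendonly": enables or disables AOF persistence (default: no);
- "appendfilename": sets the AOF file name; the default is "appendonly.aof";
- "dir": (shared with the RDB configuration) sets the directory that holds the RDB and AOF files;
- "appendfsync": sets the policy for syncing commands written to the AOF to disk:
- "no": never fsync explicitly and leave flushing entirely to the operating system (with default Linux settings, up to about 30 seconds of writes may still be unflushed); fastest but least safe;
- "always": fsync after every write; slowest but safest;
- "everysec": fsync once per second; a compromise between speed and safety (default);
- "auto-aof-rewrite-percentage": triggers an automatic rewrite when the AOF has grown by this percentage relative to its size after the last rewrite (with 100, when the file has doubled in size); 0 disables automatic rewriting;
- (if no rewrite has happened since the restart, the AOF size at startup is used as the base size)
- "auto-aof-rewrite-min-size": the minimum AOF size for an automatic rewrite to be triggered (64mb in the sample above; it is often set much larger, for example several GB).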
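The interplay of the two auto-aof-rewrite settings can be sketched as a small Python function. It follows the description in the configuration comments and is not copied from Redis; should_rewrite and its parameter names are assumptions for illustration.

```python
MB = 1024 * 1024


def should_rewrite(current_size, base_size, percentage=100, min_size=64 * MB):
    """Decide whether an automatic AOF rewrite would be triggered.

    current_size: size of the AOF now, in bytes.
    base_size:    size after the last rewrite (or at startup), in bytes.
    """
    if percentage == 0:
        return False  # a percentage of zero disables automatic rewrites
    if current_size <= min_size:
        return False  # file is still too small to bother rewriting
    base = base_size if base_size > 0 else 1  # avoid dividing by zero
    growth = current_size * 100 / base - 100  # growth over the base, in %
    return growth >= percentage


print(should_rewrite(128 * MB, 64 * MB))  # True: the file has doubled
print(should_rewrite(100 * MB, 64 * MB))  # False: grew by about 56%
print(should_rewrite(10 * MB, 1 * MB))    # False: below the minimum size
```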
Data backup and restore
The Redis SAVE and BGSAVE commands create a backup of the current database.
Backup
SAVE
SAVE is used as follows:
redis 127.0.0.1:6379> SAVE
OK
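- This command creates the dump.rdb file (the default RDB file name) in the Redis working directory, that is, the directory set by the "dir" option. SAVE runs synchronously, so the server does not serve other clients until it finishes.
BGSAVE
BGSAVE is used as follows: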
127.0.0.1:6379> BGSAVE
Background saving started
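- This command runs in the background.
How does BGSAVE work?
fork and COW:
- fork: Redis creates a child process to carry out the BGSAVE.
- COW: short for "copy on write". After the child is created, parent and child share the same memory pages. The parent keeps serving reads and writes, and only the pages it modifies are copied, so its memory gradually diverges from the child's view.
When BGSAVE completes, an RDB snapshot file has been produced.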
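The point-in-time behaviour that fork gives BGSAVE can be observed with a small Python experiment. It is a sketch that assumes a Unix-like system (os.fork is not available on Windows); snapshot_in_child is a hypothetical name, and a pipe with JSON stands in for writing an RDB file.

```python
import json
import os


def snapshot_in_child(data: dict) -> dict:
    """Fork, let the child serialize its view of data, return that view."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: its memory is a frozen copy of the parent at fork time.
        os.close(read_fd)
        try:
            with os.fdopen(write_fd, "w") as out:
                out.write(json.dumps(data))
        finally:
            os._exit(0)  # leave immediately, without running parent cleanup
    # Parent: keeps accepting "writes" while the child is saving.
    os.close(write_fd)
    data["k1"] = "changed-after-fork"
    with os.fdopen(read_fd) as inp:
        snapshot = json.loads(inp.read())
    os.waitpid(pid, 0)
    return snapshot


if __name__ == "__main__":
    dataset = {"k1": "v1"}
    print(snapshot_in_child(dataset))  # {'k1': 'v1'}: the state at fork time
    print(dataset)                     # {'k1': 'changed-after-fork'}
```

The child never sees the parent's later write, which is why a snapshot taken this way is consistent as of the moment of the fork.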
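Restoring data
To restore data, move the backup file (dump.rdb) into the Redis working directory (the "dir" setting) and start the server.
The directory can be obtained with the CONFIG command, as shown below: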
redis 127.0.0.1:6379> CONFIG GET dir
1) "dir"
2) "/usr/local/redis/bin"
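A backup and restore routine built on plain file copies could look like the Python sketch below. The function names and the timestamped naming scheme are assumptions for illustration, and the directories passed in are whatever CONFIG GET dir reports on your system. The restore copy is meant to be made while the Redis server is stopped.

```python
import os
import shutil
import time


def backup_rdb(redis_dir: str, backup_dir: str, filename: str = "dump.rdb") -> str:
    """Copy the RDB file into backup_dir under a timestamped name."""
    os.makedirs(backup_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = os.path.join(backup_dir, f"{stamp}-{filename}")
    shutil.copy2(os.path.join(redis_dir, filename), target)
    return target


def restore_rdb(backup_path: str, redis_dir: str, filename: str = "dump.rdb") -> str:
    """Copy a backup back into the Redis directory (server must be stopped)."""
    target = os.path.join(redis_dir, filename)
    shutil.copy2(backup_path, target)
    return target
```

For example, backup_rdb("/usr/local/redis/bin", "/backups/redis") would copy the file from the directory shown above into a hypothetical /backups/redis folder.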