Redis: Persistence

Persistence

Reference: http://redis.io/topics/persistence

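Data persistence means saving data durably by some means.
   For example, storing data on disk (rather than in memory) is a form of persistence: the data is not lost when the machine shuts down or restarts.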

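Redis keeps its data in memory, and memory is volatile: if the Linux host crashes or reboots, or if Redis itself crashes or restarts, all in-memory data is lost.

To address this, Redis provides several persistence options with different levels of durability: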

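RDB persistence (analogous to a "physical log"): produces point-in-time snapshots of the dataset at specified intervals.
AOF persistence (analogous to a "logical log"): records every write command the server executes.
- At server startup, the dataset is reconstructed by re-executing these commands.
- Commands in the AOF file are stored in the Redis protocol format, and new commands are appended to the end of the file. Redis can also rewrite the AOF file in the background so that it does not grow beyond the size actually needed to represent the dataset's state.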
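
To make "stored in the Redis protocol format" more concrete, the sketch below encodes a write command as a RESP array of bulk strings, which is the general shape of a command on the wire. The helper name encode_resp is hypothetical and only for illustration; the real AOF file is written by the server itself and may contain additional data.

```python
def encode_resp(*args: str) -> bytes:
    """Encode one command as a RESP array of bulk strings."""
    # "*<n>" announces an array of n elements.
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        # "$<len>" announces a bulk string of len bytes, followed by the bytes.
        out.append(b"$%d\r\n" % len(data))
        out.append(data + b"\r\n")
    return b"".join(out)


# Example: the command "SET k1 v1" as it would be appended to a log.
encoded = encode_resp("SET", "k1", "v1")
```

Because each entry is length-prefixed, a reader can tell when the last entry is incomplete, which is relevant to the truncated-file handling described in the AOF configuration later on.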

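Redis can also use AOF and RDB persistence at the same time.

    In that case, when Redis restarts it uses the AOF file to restore the dataset first, because the AOF file usually holds a more complete dataset than the RDB file.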

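RDB (Redis Database)

With Redis Database (RDB) persistence, a snapshot of the in-memory dataset is written to disk at specified intervals; on recovery, the snapshot file is read straight back into memory.

An RDB file is a very compact file that holds the Redis dataset as of a single point in time.

By default, Redis saves database snapshots in a binary file named dump.rdb.
Redis can be configured to save the dataset automatically when "at least M changes have been made to the dataset within N seconds"; you can also call SAVE or BGSAVE to start a save manually.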
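
The "N seconds, M changes" rule can be modelled roughly as follows. This is a simplified sketch, not the server's actual code: the function name should_snapshot and the dirty and last_save parameters are assumptions made for illustration.

```python
import time


def should_snapshot(save_points, dirty, last_save, now=None):
    """Return True if any (seconds, changes) save point is satisfied.

    save_points: list of (seconds, changes) pairs, as in "save <seconds> <changes>"
    dirty:       number of changes since the last save (hypothetical counter)
    last_save:   Unix time of the last save
    """
    if now is None:
        now = time.time()
    elapsed = now - last_save
    # A save point fires only when BOTH conditions hold:
    # enough time has passed and enough changes have accumulated.
    return any(dirty >= changes and elapsed >= seconds
               for seconds, changes in save_points)


# Save points equivalent to "save 900 1", "save 300 10", "save 60 10000".
DEFAULT_SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]
```

With these save points, a single change is saved after 15 minutes at the latest, while a burst of 10000 changes is saved after about one minute.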

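How it works

When Redis needs to write the dump.rdb file, the server does the following:

Redis calls fork(), so there is now a parent process and a child process.
The child process writes the dataset to a temporary RDB file.
When the child has finished writing the new RDB file, Redis replaces the old RDB file with the new one and deletes the old one.

This approach lets Redis benefit from the operating system's copy-on-write mechanism.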
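
The "write to a temporary file, then replace the old file" part of the steps above can be sketched as below. This is only an illustration of that pattern under simplifying assumptions: it does not fork, it does not produce the real RDB format, and the name write_snapshot is hypothetical.

```python
import os
import tempfile


def write_snapshot(data: bytes, target_path: str) -> None:
    """Write data to a temp file in the target's directory, then swap it in."""
    directory = os.path.dirname(os.path.abspath(target_path))
    # The temp file lives in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(prefix="temp-", suffix=".rdb", dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the swap
        # Replace the old file in one step; readers see either the old
        # or the new snapshot, never a half-written one.
        os.replace(tmp_path, target_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The point of the design is that a crash in the middle of a save leaves the previous snapshot untouched.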

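Advantages and disadvantages

Advantages:

RDB restores large datasets faster than AOF.
RDB maximizes Redis performance: to save an RDB file, the only thing the parent process has to do is fork a child, which then handles all the remaining work. The parent never performs disk I/O for the save.
RDB is well suited to backups: for example, you can archive an RDB file every hour for the last 24 hours, and also keep one RDB file for each day of the month. If something goes wrong, you can then restore the dataset to any of these versions.
RDB is well suited to disaster recovery: it is a single, compact file that can be transferred (after encryption) to another data center.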

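Disadvantages:

A snapshot does not contain changes made after it was taken, so writes since the last snapshot can be lost on recovery;
Snapshotting can affect the Redis service: RDB has to fork() a child process every time it persists to disk. With a large dataset, fork() can be time-consuming and may cause Redis to stop serving clients for a short time (on the order of milliseconds, longer for very large datasets).
- AOF rewriting also requires fork(), but no matter how long the interval between AOF rewrites is, durability is not reduced.

In short, RDB is better suited to backing up, transferring, and restoring data, at the cost of possible data loss.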

Appendix: configuration options

To configure RDB, edit the configuration file "redis.conf":

################################ SNAPSHOTTING  ################################
#
# Save the DB on disk:
#
#   save <seconds> <changes>
#
#   Will save the DB if both the given number of seconds and the given
#   number of write operations against the DB occurred.
#
#   In the example below the behaviour will be to save:
#   after 900 sec (15 min) if at least 1 key changed
#   after 300 sec (5 min) if at least 10 keys changed
#   after 60 sec if at least 10000 keys changed
#
#   Note: you can disable saving completely by commenting out all "save" lines.
#
#   It is also possible to remove all the previously configured save
#   points by adding a save directive with a single empty string argument
#   like in the following example:
#
#   save ""

save 900 1
save 300 10
save 60 10000

# By default Redis will stop accepting writes if RDB snapshots are enabled
# (at least one save point) and the latest background save failed.
# This will make the user aware (in a hard way) that data is not persisting
# on disk properly, otherwise chances are that no one will notice and some
# disaster will happen.
#
# If the background saving process will start working again Redis will
# automatically allow writes again.
#
# However if you have setup your proper monitoring of the Redis server
# and persistence, you may want to disable this feature so that Redis will
# continue to work as usual even if there are problems with disk,
# permissions, and so forth.
stop-writes-on-bgsave-error yes

# Compress string objects using LZF when dump .rdb databases?
# For default that's set to 'yes' as it's almost always a win.
# If you want to save some CPU in the saving child set it to 'no' but
# the dataset will likely be bigger if you have compressible values or keys.
rdbcompression yes

# Since version 5 of RDB a CRC64 checksum is placed at the end of the file.
# This makes the format more resistant to corruption but there is a performance
# hit to pay (around 10%) when saving and loading RDB files, so you can disable it
# for maximum performances.
#
# RDB files created with checksum disabled have a checksum of zero that will
# tell the loading code to skip the check.
rdbchecksum yes

# The filename where to dump the DB
dbfilename dump.rdb

# The working directory.
#
# The DB will be written inside this directory, with the filename specified
# above using the 'dbfilename' configuration directive.
#
# The Append Only File will also be created inside this directory.
#
# Note that you must specify a directory here, not a file name.
dir ./

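Where:

"save <seconds> <changes>": sets a save point: "if at least <changes> changes were made within <seconds> seconds, persist to disk once". (Several save points can be combined, as in the configuration above.)
"dbfilename": sets the RDB file name; the default is "dump.rdb".
"dir": specifies the directory for the RDB and AOF files.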

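AOF (Append-only File)

With Append-only File (AOF) persistence, every time Redis receives a command that changes data, it writes that command to an AOF file (only write operations are logged, reads are not). When Redis restarts, it restores the data by executing all the commands in the AOF file.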
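
The log-then-replay idea can be illustrated with a toy key-value store. This is a minimal sketch under stated assumptions, not how Redis is implemented: the class TinyAofStore is hypothetical, it uses a simple line-based text format instead of the Redis protocol, and it assumes keys contain no spaces and values contain no newlines.

```python
import os


class TinyAofStore:
    """Toy key-value store that logs every write command to an append-only file."""

    def __init__(self, path: str, fsync_always: bool = False):
        self.path = path
        self.fsync_always = fsync_always
        self.data = {}
        self._replay()  # rebuild state from the log on startup
        self._log = open(path, "a", encoding="utf-8")

    def _apply(self, cmd, key, value=None):
        if cmd == "SET":
            self.data[key] = value
        elif cmd == "DEL":
            self.data.pop(key, None)

    def _replay(self):
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                # Each line is "CMD key [value]"; re-execute it in order.
                self._apply(*line.rstrip("\n").split(" ", 2))

    def _append(self, *parts):
        self._log.write(" ".join(parts) + "\n")
        self._log.flush()
        if self.fsync_always:
            os.fsync(self._log.fileno())  # force the entry to disk right away

    def set(self, key, value):
        self._append("SET", key, value)
        self._apply("SET", key, value)

    def delete(self, key):
        self._append("DEL", key)
        self._apply("DEL", key)

    def get(self, key):
        return self.data.get(key)  # reads are never logged

    def close(self):
        self._log.close()
```

Opening a second TinyAofStore on the same file after closing the first one yields the same data, because the constructor replays the logged commands in order.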

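Rewriting the AOF file tidies up its contents, dropping redundant commands so that the file becomes smaller.

For example, after "set k1 v1" followed by "set k1 v2", only the latter remains after a rewrite; the former is removed because it no longer has any effect.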
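
The effect of a rewrite can be sketched as follows: compute the final state, then emit the shortest command list that rebuilds it. The function rewrite_log is a hypothetical illustration that handles only SET and DEL, and it derives the state by replaying a command list, which is a simplification of what the server does.

```python
def rewrite_log(commands):
    """Return a minimal command list that rebuilds the same final state."""
    state = {}
    for cmd in commands:
        op, key = cmd[0].upper(), cmd[1]
        if op == "SET":
            state[key] = cmd[2]  # a later SET overrides an earlier one
        elif op == "DEL":
            state.pop(key, None)  # a deleted key needs no entry at all
        else:
            raise ValueError(f"unsupported command: {op}")
    # One SET per surviving key is enough to reproduce the dataset.
    return [("SET", key, value) for key, value in state.items()]


compacted = rewrite_log([("set", "k1", "v1"), ("set", "k1", "v2")])
```

Here the two original commands collapse into a single SET of k1 to v2, matching the example in the text.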

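Advantages:

AOF provides much stronger durability than RDB: with the default "everysec" fsync policy at most about one second of writes is lost, and with "always" an fsync is performed after every write.

Disadvantages:

The AOF file keeps growing as operations are performed.
For example, if you increment a counter 100 times, the database ends up holding only the final value, but the AOF contains 100 entries, 99 of which are not needed to reproduce the final result;
- However, Redis can rebuild the AOF file in the background without interrupting service, so the file can be compacted.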

Appendix: configuration options

To configure AOF, edit the configuration file "redis.conf":

############################## APPEND ONLY MODE ###############################

# By default Redis asynchronously dumps the dataset on disk. This mode is
# good enough in many applications, but an issue with the Redis process or
# a power outage may result into a few minutes of writes lost (depending on
# the configured save points).
#
# The Append Only File is an alternative persistence mode that provides
# much better durability. For instance using the default data fsync policy
# (see later in the config file) Redis can lose just one second of writes in a
# dramatic event like a server power outage, or a single write if something
# wrong with the Redis process itself happens, but the operating system is
# still running correctly.
#
# AOF and RDB persistence can be enabled at the same time without problems.
# If the AOF is enabled on startup Redis will load the AOF, that is the file
# with the better durability guarantees.
#
# Please check http://redis.io/topics/persistence for more information.

appendonly no

# The name of the append only file (default: "appendonly.aof")
appendfilename "appendonly.aof"

# The fsync() call tells the Operating System to actually write data on disk
# instead of waiting for more data in the output buffer. Some OS will really flush
# data on disk, some other OS will just try to do it ASAP.
#
# Redis supports three different modes:
#
# no: don't fsync, just let the OS flush the data when it wants. Faster.
# always: fsync after every write to the append only log. Slow, Safest.
# everysec: fsync only one time every second. Compromise.
#
# The default is "everysec", as that's usually the right compromise between
# speed and data safety. It's up to you to understand if you can relax this to
# "no" that will let the operating system flush the output buffer when
# it wants, for better performances (but if you can live with the idea of
# some data loss consider the default persistence mode that's snapshotting),
# or on the contrary, use "always" that's very slow but a bit safer than
# everysec.
#
# More details please check the following article:
# http://antirez.com/post/redis-persistence-demystified.html
#
# If unsure, use "everysec".

# appendfsync always
appendfsync everysec
# appendfsync no

# When the AOF fsync policy is set to always or everysec, and a background
# saving process (a background save or AOF log background rewriting) is
# performing a lot of I/O against the disk, in some Linux configurations
# Redis may block too long on the fsync() call. Note that there is no fix for
# this currently, as even performing fsync in a different thread will block
# our synchronous write(2) call.
#
# In order to mitigate this problem it's possible to use the following option
# that will prevent fsync() from being called in the main process while a
# BGSAVE or BGREWRITEAOF is in progress.
#
# This means that while another child is saving, the durability of Redis is
# the same as "appendfsync none". In practical terms, this means that it is
# possible to lose up to 30 seconds of log in the worst scenario (with the
# default Linux settings).
#
# If you have latency problems turn this to "yes". Otherwise leave it as
# "no" that is the safest pick from the point of view of durability.
no-appendfsync-on-rewrite no

# Automatic rewrite of the append only file.
# Redis is able to automatically rewrite the log file implicitly calling
# BGREWRITEAOF when the AOF log size grows by the specified percentage.
#
# This is how it works: Redis remembers the size of the AOF file after the
# latest rewrite (if no rewrite has happened since the restart, the size of
# the AOF at startup is used).
#
# This base size is compared to the current size. If the current size is
# bigger than the specified percentage, the rewrite is triggered. Also
# you need to specify a minimal size for the AOF file to be rewritten, this
# is useful to avoid rewriting the AOF file even if the percentage increase
# is reached but it is still pretty small.
#
# Specify a percentage of zero in order to disable the automatic AOF
# rewrite feature.

auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb

# An AOF file may be found to be truncated at the end during the Redis
# startup process, when the AOF data gets loaded back into memory.
# This may happen when the system where Redis is running
# crashes, especially when an ext4 filesystem is mounted without the
# data=ordered option (however this can't happen when Redis itself
# crashes or aborts but the operating system still works correctly).
#
# Redis can either exit with an error when this happens, or load as much
# data as possible (the default now) and start if the AOF file is found
# to be truncated at the end. The following option controls this behavior.
#
# If aof-load-truncated is set to yes, a truncated AOF file is loaded and
# the Redis server starts emitting a log to inform the user of the event.
# Otherwise if the option is set to no, the server aborts with an error
# and refuses to start. When the option is set to no, the user requires
# to fix the AOF file using the "redis-check-aof" utility before to restart
# the server.
#
# Note that if the AOF file will be found to be corrupted in the middle
# the server will still exit with an error. This option only applies when
# Redis will try to read more data from the AOF file but not enough bytes
# will be found.
aof-load-truncated yes

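Where:

"appendonly": enables or disables AOF persistence. [default: no]
"appendfilename": specifies the AOF file name; the default is "appendonly.aof".
"dir": (shared with the RDB "dir" setting) specifies the directory for the RDB and AOF files.
"appendfsync": sets the policy for syncing command data written to the AOF file to disk:
- "no": Redis never calls fsync itself and leaves flushing entirely to the operating system (up to about 30 seconds with default Linux settings); fast, but the least safe.
- "always": fsync after every write; the slowest, but the safest.
- "everysec": fsync once per second; a balance between speed and safety. [default]
"auto-aof-rewrite-percentage": triggers another rewrite when the current AOF file size exceeds the size after the last rewrite by this percentage.
- (If no rewrite has happened since startup, the AOF size at startup is used as the baseline.)
"auto-aof-rewrite-min-size": the minimum AOF file size at which an automatic rewrite is allowed. (64mb in the configuration above; often set much larger, e.g. several GB.)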
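
How the two auto-rewrite settings combine can be sketched as below. This is a simplified model of the rule described in the configuration comments, not the server's code; the function name should_auto_rewrite and its parameters are assumptions made for illustration.

```python
MB = 1024 * 1024


def should_auto_rewrite(current_size, base_size, percentage=100, min_size=64 * MB):
    """Decide whether an automatic AOF rewrite would be triggered.

    current_size: current AOF size in bytes
    base_size:    AOF size after the last rewrite (or at startup) in bytes
    percentage:   auto-aof-rewrite-percentage (0 disables the feature)
    min_size:     auto-aof-rewrite-min-size in bytes
    """
    if percentage == 0:
        return False  # automatic rewrite disabled
    if current_size <= min_size:
        return False  # file is still too small to be worth rewriting
    base = base_size if base_size > 0 else 1  # avoid dividing by zero
    growth = current_size * 100 // base - 100  # growth over the base, in percent
    return growth >= percentage
```

With the settings shown above (100 and 64mb), an AOF that was 100 MB after the last rewrite is rewritten again once it reaches 200 MB, while a file that grew from 10 MB to 60 MB is left alone because it is still below the minimum size.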

Data backup and recovery

The Redis SAVE and BGSAVE commands create a backup of the current database.

Backup

SAVE

"SAVE" is used as follows:

redis 127.0.0.1:6379> SAVE 
OK

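This command creates the dump.rdb file (the default RDB file name) in the Redis working directory, i.e. the directory set by "dir". SAVE runs synchronously and blocks the server until the snapshot is complete.

BGSAVE

"BGSAVE" is used as follows: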

127.0.0.1:6379> BGSAVE

Background saving started

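This command runs in the background.

How does BGSAVE work?

fork and COW:

fork: Redis creates a child process to carry out the bgsave operation.
COW: stands for "copy on write". After the child is created, parent and child share the same memory pages. The parent keeps serving reads and writes, and pages it modifies are copied, so they gradually diverge from what the child sees.

When BGSAVE completes, an RDB snapshot file has been produced.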

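Restoring data

To restore data, move the backup file (dump.rdb) into the Redis working directory (the "dir" setting) and start the server. If AOF is enabled, Redis loads the AOF file at startup instead of the RDB file.

The Redis directory can be obtained with the CONFIG command, as shown below: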

redis 127.0.0.1:6379> CONFIG GET dir
1) "dir"
2) "/usr/local/redis/bin"
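
The file-copy step of a restore can be sketched as below. This is a minimal illustration, assuming the server is stopped while the file is copied; the function name restore_backup and its parameters are hypothetical, and redis_dir stands for the value reported by CONFIG GET dir.

```python
import os
import shutil


def restore_backup(backup_path: str, redis_dir: str, dbfilename: str = "dump.rdb") -> str:
    """Copy a backup RDB file into the Redis working directory.

    Returns the destination path. The server should be started only
    after the copy has finished.
    """
    if not os.path.isfile(backup_path):
        raise FileNotFoundError(backup_path)
    destination = os.path.join(redis_dir, dbfilename)
    shutil.copy2(backup_path, destination)  # keeps the backup itself intact
    return destination
```

Copying rather than moving leaves the original backup available in case the restore has to be repeated.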
