PostgreSQL pg_rman

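  • An online backup tool; the target database must be running in archive mode.
  • Supports online full backups, incremental backups, and archive (WAL) backups.
    • An incremental backup is based on an existing full database backup.
  • pg_rman itself follows the pg_start_backup(), copy, pg_stop_backup() backup procedure.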

The copy step itself is a plain file copy … cp/fwrite.
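
The copy step mentioned above can be sketched as a chunked fread/fwrite loop. This is a minimal illustration, not pg_rman's actual code: the function name `copy_file` and the 8192-byte buffer size are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: copy src to dst in fixed-size chunks.
 * Returns the number of bytes copied, or -1 on any error.
 */
long copy_file(const char *src, const char *dst)
{
    unsigned char buf[8192];
    size_t n;
    long total = 0;
    FILE *in;
    FILE *out;

    assert(src != NULL && dst != NULL);

    in = fopen(src, "rb");
    if (in == NULL)
        return -1;
    out = fopen(dst, "wb");
    if (out == NULL)
    {
        fclose(in);
        return -1;
    }

    /* Read a chunk, write the same chunk, until end of file. */
    while ((n = fread(buf, 1, sizeof(buf), in)) > 0)
    {
        if (fwrite(buf, 1, n, out) != n)
        {
            total = -1;
            break;
        }
        total += (long) n;
    }
    if (ferror(in))
        total = -1;

    fclose(in);
    if (fclose(out) != 0)
        total = -1;
    return total;
}
```

In a real backup this loop would run between pg_start_backup() and pg_stop_backup(), once per file under the data directory.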

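  • pg_start_backup()
    • text: a user-defined label, typically the name under which the backup dump file will be stored.
    • boolean (fast): start the backup as quickly as possible. This forces an immediate checkpoint, which causes a spike in I/O and slows down any concurrently running queries.
  • pg_stop_backup()
    • boolean (wait_for_archive): if false, pg_stop_backup returns as soon as the backup completes, without waiting for the WAL to be archived.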

Differential and cumulative backups

Overall architecture of pg_rman

[Figure: pg_rman overall architecture]

Default configuration parameters:

  1. PGDATA
  2. BACKUP_PATH
  3. ARCLOG_PATH
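
A minimal sketch of resolving these three parameters, assuming they are supplied through environment variables of the same names. The helper `param_or_default` and the fallback handling are hypothetical and only illustrate the lookup order; they are not taken from pg_rman.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return the value of an environment variable,
 * or the given fallback when the variable is unset or empty.
 */
const char *param_or_default(const char *name, const char *fallback)
{
    const char *value;

    assert(name != NULL);
    value = getenv(name);
    return (value != NULL && value[0] != '\0') ? value : fallback;
}

/* The three parameters listed above, gathered in one place. */
typedef struct RmanParams
{
    const char *pgdata;
    const char *backup_path;
    const char *arclog_path;
} RmanParams;

RmanParams load_params(const char *fallback)
{
    RmanParams p;

    p.pgdata = param_or_default("PGDATA", fallback);
    p.backup_path = param_or_default("BACKUP_PATH", fallback);
    p.arclog_path = param_or_default("ARCLOG_PATH", fallback);
    return p;
}
```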

pg_rman init

pg_rman show

pg_rman config --list

pg_rman backup -b full

  -b inc [incremental]

  -b arch [archive]

pg_rman restore

[New feature] pg_rman blockrecover --datafile tablespaceOid/databaseOid/relfilenode --block 0
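
To make the argument format concrete, the following sketch parses a `tablespaceOid/databaseOid/relfilenode` string and computes where a block starts in the file. The type `DatafileSpec` and both function names are hypothetical, and the offset calculation assumes the default 8192-byte page size and a relation that fits in a single segment file.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define BLCKSZ 8192  /* default PostgreSQL page size, assumed here */

/* Hypothetical container for the three parts of --datafile. */
typedef struct DatafileSpec
{
    unsigned int tablespace_oid;
    unsigned int database_oid;
    unsigned int relfilenode;
} DatafileSpec;

/* Parse "tablespaceOid/databaseOid/relfilenode"; returns 1 on success. */
int parse_datafile(const char *arg, DatafileSpec *out)
{
    char trailing;

    assert(arg != NULL && out != NULL);
    /* The extra %c rejects input with anything after the third number. */
    return sscanf(arg, "%u/%u/%u%c",
                  &out->tablespace_oid,
                  &out->database_oid,
                  &out->relfilenode,
                  &trailing) == 3;
}

/* Byte offset at which block number blkno starts inside the file. */
uint64_t block_offset(uint32_t blkno)
{
    return (uint64_t) blkno * BLCKSZ;
}
```

With `--block 0`, the page to recover starts at offset 0 of the file.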

Backup policy

  1. Recovery window: a number of days. Default: 7.
  2. Number of backups: the redundancy (number of backups) to retain. Default: 1.
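
One way to read the two settings together is sketched below. The rule used here, that a backup becomes obsolete only when it is older than the recovery window and at least `redundancy` newer full backups exist, is an assumption made for illustration; the text above does not state how the two conditions are combined. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical policy record holding the two settings above. */
typedef struct RetentionPolicy
{
    int window_days;  /* recovery window, default 7 */
    int redundancy;   /* number of backups to keep, default 1 */
} RetentionPolicy;

RetentionPolicy default_policy(void)
{
    RetentionPolicy p = {7, 1};
    return p;
}

/*
 * Assumed rule: a backup may be removed only if it falls outside the
 * recovery window AND enough newer full backups already exist.
 */
int backup_is_obsolete(const RetentionPolicy *policy,
                       double age_days, int newer_full_backups)
{
    assert(policy != NULL);
    return age_days > policy->window_days
        && newer_full_backups >= policy->redundancy;
}
```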

Source code layout:

.
├── backup.c
├── blockrecover.c
├── catalog.c
├── COPYRIGHT
├── data.c
├── delete.c
├── dir.c
├── docs
├── expected
├── idxpagehdr.h
├── init.c
├── Makefile
├── parray.c
├── parray.h
├── pg_rman.c
├── pg_rman.h
├── pgsql_src
├── pgut
├── README.md
├── restore.c
├── script
├── show.c
├── sql
├── util.c
├── validate.c
└── xlog.c

pg_rman: a brief source code analysis

Code reading

 * +----------------+---------------------------------+
 * | PageHeaderData | linp1 linp2 linp3 ...           |
 * +-----------+----+---------------------------------+
 * | ... linpN |                                      |
 * +-----------+--------------------------------------+
 * |           ^ pd_lower                             |
 * |                                                  |
 * |             v pd_upper                           |
 * +-------------+------------------------------------+
 * |             | tupleN ...                         |
 * +-------------+------------------+-----------------+
 * |       ... tuple3 tuple2 tuple1 | "special space" |
 * +--------------------------------+-----------------+

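When data is flushed, it is persisted to disk. The pd_lsn field in the page header records the end position, in the WAL file, of the REDO record generated by the most recent change to that page.

If the WAL flush LSN is greater than or equal to this pd_lsn, the change to the page is durable. In other words, every modification changes the block, including its LSN.

So by comparing the global LSN taken when the earlier backup started with the page LSN of the data currently being backed up, we can tell whether a page has been modified since then.

Modified pages are backed up and unmodified pages are skipped, which is how block-level incremental backup of the database is achieved.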
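
The comparison described above can be sketched as follows. This is an illustration under stated assumptions, not pg_rman's code: the helper names are hypothetical, the page is treated as a raw 8192-byte buffer, and pd_lsn is read as the two 32-bit halves stored at the very start of the page header in the machine's native byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLCKSZ 8192  /* default PostgreSQL page size, assumed here */

/* Read pd_lsn (high half, then low half) from the start of a page. */
uint64_t page_get_lsn(const unsigned char *page)
{
    uint32_t xlogid;
    uint32_t xrecoff;

    assert(page != NULL);
    memcpy(&xlogid, page, sizeof(xlogid));
    memcpy(&xrecoff, page + sizeof(xlogid), sizeof(xrecoff));
    return ((uint64_t) xlogid << 32) | xrecoff;
}

/* Counterpart used only to build test pages in this sketch. */
void page_set_lsn(unsigned char *page, uint64_t lsn)
{
    uint32_t xlogid = (uint32_t) (lsn >> 32);
    uint32_t xrecoff = (uint32_t) lsn;

    memcpy(page, &xlogid, sizeof(xlogid));
    memcpy(page + sizeof(xlogid), &xrecoff, sizeof(xrecoff));
}

/*
 * Hypothetical decision: back up the page unless its LSN is older
 * than the start LSN of the previous backup.
 */
int page_needs_backup(const unsigned char *page, uint64_t prev_start_lsn)
{
    return page_get_lsn(page) >= prev_start_lsn;
}
```

A page whose LSN is below the previous backup's start LSN has not changed since that backup and can be left out of the incremental copy.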

Code related to incremental backup:

pgBackupGetPath(prev_backup, prev_file_txt, lengthof(prev_file_txt),
                DATABASE_FILE_LIST);
prev_files = dir_read_file_list(pgdata, prev_file_txt);

/*
 * Do backup only pages having larger LSN than previous backup.
 */
lsn = &prev_backup->start_lsn;
xlogid = (uint32) (*lsn >> 32);
xrecoff = (uint32) *lsn;
elog(DEBUG, _("backup only the page updated after LSN(%X/%08X)"),
     xlogid, xrecoff);

/* Construct the directory for this backup within BACKUP_PATH. */
pgBackupGetPath(&current, path, lengthof(path), DATABASE_DIR);

/* Save the files listed above. */
backup_files(pgdata, path, files, prev_files, lsn, current.compress_data, NULL);
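
The fragment above splits the 64-bit LSN into two 32-bit halves before logging it. A small standalone sketch of that split and its inverse is shown here; `format_lsn` and `parse_lsn` are hypothetical names, and the `%X/%08X` format is the one used in the elog() call.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Format a 64-bit LSN as "XLOGID/XRECOFF" using the %X/%08X layout. */
int format_lsn(uint64_t lsn, char *buf, size_t buflen)
{
    unsigned int xlogid = (unsigned int) (lsn >> 32);         /* high 32 bits */
    unsigned int xrecoff = (unsigned int) (lsn & 0xFFFFFFFF); /* low 32 bits */

    assert(buf != NULL);
    return snprintf(buf, buflen, "%X/%08X", xlogid, xrecoff);
}

/* Parse "XLOGID/XRECOFF" back into a 64-bit LSN; returns 1 on success. */
int parse_lsn(const char *text, uint64_t *lsn)
{
    unsigned int xlogid;
    unsigned int xrecoff;

    assert(text != NULL && lsn != NULL);
    if (sscanf(text, "%X/%X", &xlogid, &xrecoff) != 2)
        return 0;
    *lsn = ((uint64_t) xlogid << 32) | xrecoff;
    return 1;
}
```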

[New] Block recovery code:

for (loop = 0; loop <= brc.base_index; loop++)
{
    backup = (pgBackup *) parray_get(backups, loop);

    /* don't use incomplete nor different timeline backup */
    if (backup->status != BACKUP_STATUS_OK || backup->tli != base_backup->tli)
        continue;
    if (-1 == brc.lastBackupIndex && HAVE_ARCLOG(backup) && brc.last_needed_index >= loop)
    {
        restore_archive_logs(backup, true);
    }
    /* use database backup only */
    if (BACKUP_MODE_INCREMENTAL > backup->backup_mode || brc.last_needed_index < loop)
        continue;

    elog(DEBUG, "found backup BK_KEY: \"%d\" can be used ", backup->backup_id);

    recoverBackup(backup, loop);
    /* => [[ body of recoverBackup(backup, backupindex), shown inline */
        for (loop = 0; loop < brc.rbNum; loop++)
        {
            /* If this block has find a page,skip it */
            if (brc.pageArray[loop])
            {
                elog(DEBUG, "block \'%u\' has find it's page,skip.", brc.recoverBlock[loop]);
                continue;
            }
            page = findPageInBackup(backup, brc.recoverBlock[loop]);
            if (page)
            {
                brc.pageArray[loop] = page;
                if (-1 == brc.lastBackupIndex)
                {
                    brc.lastBackupIndex = backupindex;
                    elog(DEBUG, "Find last backup can be used:BK_KEY \'%d\'", backup->backup_id);
                }
            }
        }
    /* ]] */
}
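
The search pattern in the loop above can be reduced to a small self-contained model: walk the backups in array order, skip unusable ones, and for every requested block remember the first backup that contains it. The `SketchBackup` type and the function names are hypothetical simplifications; real backups store pages in files rather than in an array of block numbers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a backup catalog entry. */
typedef struct SketchBackup
{
    int usable;              /* stands in for the status/timeline checks */
    const unsigned *blocks;  /* block numbers stored in this backup */
    size_t nblocks;
} SketchBackup;

/* Returns 1 if the backup contains the given block number. */
int backup_has_block(const SketchBackup *b, unsigned blkno)
{
    size_t i;

    for (i = 0; i < b->nblocks; i++)
        if (b->blocks[i] == blkno)
            return 1;
    return 0;
}

/*
 * For each wanted block, store in found_in[] the index of the first
 * usable backup that contains it, or -1 if no backup does.
 */
void locate_blocks(const SketchBackup *backups, size_t nbackups,
                   const unsigned *wanted, size_t nwanted, int *found_in)
{
    size_t b;
    size_t i;

    assert(backups != NULL && wanted != NULL && found_in != NULL);

    for (i = 0; i < nwanted; i++)
        found_in[i] = -1;

    for (b = 0; b < nbackups; b++)
    {
        if (!backups[b].usable)
            continue;
        for (i = 0; i < nwanted; i++)
        {
            /* A block that already has a page is skipped. */
            if (found_in[i] != -1)
                continue;
            if (backup_has_block(&backups[b], wanted[i]))
                found_in[i] = (int) b;
        }
    }
}
```

As in the original loop, a block that has already been found is never overwritten by a later backup in the array.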

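Issues:

  1. If a relfilenode file is enlarged arbitrarily so that its size is no longer a multiple of 8192 bytes, it is treated as having one more page, and that page is incomplete. PostgreSQL does not enable checksum verification by default, so it reports an invalid block number, and blockrecover cannot repair the page, because the relfilenode itself never contained a valid copy of it.
  2. When page data is modified arbitrarily, the displayed data is sometimes incomplete, i.e. the number of rows returned does not match the number of rows inserted. PostgreSQL itself cannot raise a proper corruption warning in this case. Enable checksums to verify the data.

Checksum failure warning: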
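
The situation in issue 1 can be detected from the file size alone. The sketch below assumes the default 8192-byte page size; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define BLCKSZ 8192  /* default PostgreSQL page size, assumed here */

/* Number of complete pages contained in a file of the given size. */
uint64_t complete_pages(uint64_t file_size)
{
    return file_size / BLCKSZ;
}

/* Returns 1 when the file ends with an incomplete (partial) page. */
int has_partial_page(uint64_t file_size)
{
    return file_size % BLCKSZ != 0;
}

/* Size in bytes of the trailing partial page, or 0 if there is none. */
uint64_t partial_page_bytes(uint64_t file_size)
{
    return file_size % BLCKSZ;
}
```

A file of 16385 bytes, for example, holds two complete pages plus one stray byte that cannot form a valid third page.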

WARNING:  01000: page verification failed, calculated checksum 11654 but expected 8293
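  1. Determine the number of tuples in the table.
  2. Determine the number of pages in the table.

Make sure checksums are enabled so that page data can be verified. This does not, however, resolve the issues described above.

Warning:

For notes on the data fields in backup_label, refer to the PostgreSQL WAL analysis.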