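I. WAL Archiving Parameters
The previous article covered WAL cleanup. Cleanup keeps the log from growing without bound, but once old segments are removed the database can no longer be restored to a consistent state earlier than the checkpoint. That is why WAL archiving is also needed. PostgreSQL's archiving works on the same principle as Oracle's, except that you have to configure the archive command yourself.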
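The main archiving parameters in PostgreSQL are:
- archive_mode: switches archiving on or off
- archive_command: the command used to archive a WAL segment
- archive_timeout: if no segment switch has happened for this long, forces a switch so that the current segment can be archived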
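1. The archive_mode parameter
archive_mode has three settings:
- off: archiving is disabled
- on: archiving is enabled, but the archiver does not run during recovery
- always: archiving is enabled, and the archiver also runs during recovery
The following code is from postmaster.c: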
/*
* Archiver is allowed to start up at the current postmaster state?
*
* If WAL archiving is enabled always, we are allowed to start archiver
* even during recovery.
*/
#define PgArchStartupAllowed() \
(((XLogArchivingActive() && pmState == PM_RUN) || \
(XLogArchivingAlways() && \
(pmState == PM_RECOVERY || pmState == PM_HOT_STANDBY))) && \
PgArchCanRestart())
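Besides turning archiving on, wal_level must not be minimal, because at that level some operations are not WAL-logged. Both archive_mode and wal_level are checked at server startup. The following code is also from postmaster.c (function PostmasterMain):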
if (XLogArchiveMode > ARCHIVE_MODE_OFF && wal_level == WAL_LEVEL_MINIMAL)
ereport(ERROR,(errmsg("WAL archival cannot be enabled when wal_level is \"minimal\"")));
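2. The archive_command parameter
PostgreSQL starts an auxiliary process (the archiver) that watches the WAL and, when it finds a segment that is ready to be archived, runs the command given in archive_command. The command can be almost anything; typically it is cp, optionally combined with compression. If the parameter is not set, or the command is wrong, nothing actually gets archived. Two placeholders are available in the command:
- %p: path of the file to archive
- %f: name of the file to archive, without the directory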
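To make the placeholders concrete, here is a minimal sketch of how a command template could be expanded before it is run. It is not the PostgreSQL implementation: expand_archive_command is a hypothetical helper, and the /mnt/archive directory in the usage note is only an assumed example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Expand %p (path of the WAL file to archive), %f (file name only) and
 * %% (a literal percent sign) in an archive_command template.
 * Returns 0 on success, -1 if the result does not fit into out.
 * Hypothetical helper for illustration, not a PostgreSQL function.
 */
int
expand_archive_command(const char *tmpl, const char *path, const char *fname,
                       char *out, size_t outlen)
{
    size_t used = 0;

    assert(tmpl != NULL && path != NULL && fname != NULL);
    assert(out != NULL && outlen > 0);

    for (const char *c = tmpl; *c != '\0'; c++)
    {
        const char *piece = c;  /* by default, copy the current character */
        size_t      len = 1;

        if (c[0] == '%' && c[1] == 'p')
        {
            piece = path;
            len = strlen(path);
            c++;                /* skip the 'p' */
        }
        else if (c[0] == '%' && c[1] == 'f')
        {
            piece = fname;
            len = strlen(fname);
            c++;                /* skip the 'f' */
        }
        else if (c[0] == '%' && c[1] == '%')
        {
            c++;                /* "%%" collapses to one '%' */
        }

        if (used + len >= outlen)
            return -1;          /* no room for the piece plus the terminator */
        memcpy(out + used, piece, len);
        used += len;
    }
    out[used] = '\0';
    return 0;
}
```

With the template "cp %p /mnt/archive/%f", the path pg_wal/0000000100000001000000C6 and the file name 0000000100000001000000C6, the helper produces "cp pg_wal/0000000100000001000000C6 /mnt/archive/0000000100000001000000C6".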
The following code is from pgarch.c (function pgarch_ArchiverCopyLoop):
/* can't do anything if no command ... */
if (!XLogArchiveCommandSet())
{
ereport(WARNING,(errmsg("archive_mode enabled, yet archive_command is not set")));
return;
}
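3. The archive_timeout parameter
If archiving only happens on a segment switch, then a crash while the current segment is still partly filled leaves that part out of the archive, which can mean data loss. In addition, on a system with few writes a segment may go unarchived for a long time. archive_timeout addresses both cases by forcing a segment switch, and therefore archiving, once the timeout has passed.
Note that a segment that is switched early is still archived at the full segment size (16MB by default), so setting this parameter too low makes archiving very frequent and wastes a lot of space.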
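The space cost is easy to estimate. The sketch below computes an upper bound on the archive volume produced per day by timeout-forced switches, assuming every interval ends in a forced switch and every segment is archived uncompressed at its full size. The function name is hypothetical; a forced switch only happens when important WAL has been written since the last switch, so real numbers can be lower.

```c
#include <stdint.h>

/*
 * Upper bound on bytes archived per day by timeout-forced switches:
 * one switch per archive_timeout interval, each archived at the full
 * segment size. Hypothetical helper for illustration only.
 */
uint64_t
max_forced_archive_bytes_per_day(uint32_t archive_timeout_secs,
                                 uint64_t wal_segment_bytes)
{
    if (archive_timeout_secs == 0)
        return 0;               /* 0 disables timeout-forced switches */

    /* 86400 seconds per day; integer division counts whole intervals */
    return (UINT64_C(86400) / archive_timeout_secs) * wal_segment_bytes;
}
```

For example, archive_timeout = 60 with 16MB segments gives at most 1440 segments per day, or 22.5 GiB, even if only a few kilobytes of WAL were written in each interval.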
-- Switch the WAL segment manually
SELECT pg_switch_wal();
The code is in checkpointer.c (function CheckArchiveTimeout). It is a bit long, so it is shown in section IV below.
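II. Main Steps of WAL Archiving
Whenever a WAL segment switch occurs, the archiver process can be told to archive the finished segment.
- The process that caused the switch creates a .ready file, named after the segment to be archived, under pg_wal/archive_status
- It signals the archiver process (older versions signaled the postmaster first, which then notified the archiver). The archiver only cares that the .ready file exists, not about its content
- The archiver archives the segment using archive_command
- When archiving is done, the .ready file is renamed to .done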
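The status-file part of these steps can be sketched in standard C. This is a simplified illustration, not the PostgreSQL code: notify_ready, mark_done and status_exists are hypothetical names, the status directory is passed in by the caller, and signaling and the archive command itself are left out.

```c
#include <stdio.h>

#define STATUS_PATH_MAX 1024

/* Build "<status_dir>/<xlog><suffix>"; returns 0 on success, -1 if it does not fit. */
static int
status_file_path(char *out, size_t outlen, const char *status_dir,
                 const char *xlog, const char *suffix)
{
    int n = snprintf(out, outlen, "%s/%s%s", status_dir, xlog, suffix);

    return (n < 0 || (size_t) n >= outlen) ? -1 : 0;
}

/* First step: create an empty <xlog>.ready file. Returns 0 on success. */
int
notify_ready(const char *status_dir, const char *xlog)
{
    char  path[STATUS_PATH_MAX];
    FILE *fp;

    if (status_file_path(path, sizeof(path), status_dir, xlog, ".ready") != 0)
        return -1;
    fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    return fclose(fp) == 0 ? 0 : -1;
}

/* Last step: rename <xlog>.ready to <xlog>.done. Returns 0 on success. */
int
mark_done(const char *status_dir, const char *xlog)
{
    char ready[STATUS_PATH_MAX];
    char done[STATUS_PATH_MAX];

    if (status_file_path(ready, sizeof(ready), status_dir, xlog, ".ready") != 0 ||
        status_file_path(done, sizeof(done), status_dir, xlog, ".done") != 0)
        return -1;
    return rename(ready, done) == 0 ? 0 : -1;
}

/* Returns 1 if <xlog><suffix> exists (can be opened for reading), else 0. */
int
status_exists(const char *status_dir, const char *xlog, const char *suffix)
{
    char  path[STATUS_PATH_MAX];
    FILE *fp;

    if (status_file_path(path, sizeof(path), status_dir, xlog, suffix) != 0)
        return 0;
    fp = fopen(path, "r");
    if (fp == NULL)
        return 0;
    fclose(fp);
    return 1;
}
```

Because only the existence and the name of the file matter, an empty file is enough to carry the message, and the rename from .ready to .done records completion in a single file system operation.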
III. Related Functions
1. XLogWrite
The WAL write function (an old acquaintance). When a segment fills up, a switch is needed.
static void
XLogWrite(XLogwrtRqst WriteRqst,bool flexible)
{
…
/* the segment is full */
if (finishing_seg)
{
/* flush the segment to disk so that the archived WAL data is complete */
issue_xlog_fsync(openLogFile,openLogSegNo);
/* wake up the WalSender processes so they send WAL to the standbys */
WalSndWakeupRequest();
LogwrtResult.Flush = LogwrtResult.Write; /* end of page */
/* send the archive notification */
if (XLogArchivingActive())
XLogArchiveNotifySeg(openLogSegNo);
…
}
}
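2. The XLogArchiveNotify function
Creates the .ready notification file (effectively a form of inter-process communication) that tells the archiver which segment to archive.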
/*
* Create an archive notification file
*
* The name of the notification file is the message that will be picked up
* by the archiver, e.g. we write 0000000100000001000000C6.ready
* and the archiver then knows to archive XLOGDIR/0000000100000001000000C6,
* then when complete, rename it to 0000000100000001000000C6.done
*/
void
XLogArchiveNotify(const char *xlog)
{
char archiveStatusPath[MAXPGPATH];
FILE *fd;
/* insert an otherwise empty file called <XLOG>.ready */
StatusFilePath(archiveStatusPath,xlog,".ready");
fd = AllocateFile(archiveStatusPath,"w");
if (fd == NULL)
{
ereport(LOG,(errcode_for_file_access(),errmsg("could not create archive status file \"%s\": %m",archiveStatusPath)));
return;
}
if (FreeFile(fd))
{
ereport(LOG,(errcode_for_file_access(),errmsg("could not write archive status file \"%s\": %m",archiveStatusPath)));
return;
}
/* Notify archiver that it's got something to do: signal the archiver process (older versions signaled the postmaster first) */
if (IsUnderPostmaster)
PgArchWakeup();
}
3. The PgArchWakeup function
Wakes up the archiver process by setting its latch.
/*
* Wake up the archiver
*/
void
PgArchWakeup(void)
{
int arch_pgprocno = PgArch->pgprocno;
/*
* We don't acquire ProcArrayLock here. It's actually fine because
* procLatch isn't ever freed, so we just can potentially set the wrong
* process' (or no process') latch. Even in that case the archiver will
* be relaunched shortly and will start archiving.
*/
if (arch_pgprocno != INVALID_PGPROCNO)
SetLatch(&ProcGlobal->allProcs[arch_pgprocno].procLatch);
}
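4. The pgarch_ArchiverCopyLoop function
The order of archiving matters too: the archiver picks the WAL file with the lowest segment number first. WAL cleanup also proceeds in segment order, so a low-numbered segment that is archived first can then be removed.
After a segment is archived, the archiver renames its .ready file to .done.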
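The "lowest segment first" rule can be illustrated with a short sketch. WAL file names are fixed-width hexadecimal strings, so on a single timeline a plain string comparison orders them by segment number. The function oldest_ready below is a hypothetical simplification; the real selection logic in pgarch.c has additional rules that are ignored here.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if name ends with ".ready", else 0. */
static int
has_ready_suffix(const char *name)
{
    size_t len = strlen(name);

    return len > 6 && strcmp(name + len - 6, ".ready") == 0;
}

/*
 * Among n status file names, return the ".ready" entry that sorts first
 * (the lowest segment on a given timeline), or NULL if there is none.
 * Hypothetical helper for illustration only.
 */
const char *
oldest_ready(const char *const *names, size_t n)
{
    const char *best = NULL;

    for (size_t i = 0; i < n; i++)
    {
        if (!has_ready_suffix(names[i]))
            continue;           /* .done files are already archived */
        if (best == NULL || strcmp(names[i], best) < 0)
            best = names[i];
    }
    return best;
}
```

Given 0000000100000001000000C8.ready, 0000000100000001000000C6.done and 0000000100000001000000C7.ready, the sketch returns the C7 entry: C6 is already done, and C7 sorts before C8.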
/*
* pgarch_ArchiverCopyLoop
*
* Archives all outstanding xlogs then returns
*/
static void
pgarch_ArchiverCopyLoop(void)
{
char xlog[MAX_XFN_CHARS + 1];
/*
* Loop over the .ready files
*/
while (pgarch_readyXlog(xlog))
{
int failures = 0;
int failures_orphan = 0;
for (;;)
{
struct stat stat_buf;
char pathname[MAXPGPATH];
/*
* On a shutdown request, or if the postmaster has died, skip the remaining work and return
*/
if (ShutdownRequestPending || !PostmasterIsAlive())
return;
/*
* Check for barrier events and config update. This is so that
* we'll adopt a new setting for archive_command as soon as
* possible,even if there is a backlog of files to be archived.
*/
HandlePgArchInterrupts();
/* if archive_command is not set, emit a warning and return */
if (!XLogArchiveCommandSet())
{
ereport(WARNING,(errmsg("archive_mode enabled, yet archive_command is not set")));
return;
}
/* handling of orphaned .ready files left behind by a crash, omitted */
…
/* archive the WAL file */
if (pgarch_archiveXlog(xlog))
{
/* successful: rename the .ready file to .done */
pgarch_archiveDone(xlog);
/*
* Tell the collector about the WAL file that we successfully archived
*/
pgstat_send_archiver(xlog,false);
/* move on to the next WAL file */
break; /* out of inner retry loop */
}
/* archiving failed */
else
{
/*
* Tell the collector about the WAL file that we failed to
* archive
*/
pgstat_send_archiver(xlog,true);
/* once the failure count reaches the retry limit, emit a warning and give up for now */
if (++failures >= NUM_ARCHIVE_RETRIES)
{
ereport(WARNING,(errmsg("archiving write-ahead log file \"%s\" failed too many times, will try again later",xlog)));
return; /* give up archiving for now */
}
pg_usleep(1000000L); /* wait a bit before retrying: sleep 1 second */
}
}
}
}
IV. Archive Timeout Check and Switch
/*
* CheckArchiveTimeout -- check for archive_timeout and switch xlog files
*/
static void
CheckArchiveTimeout(void)
{
pg_time_t now;
pg_time_t last_time;
XLogRecPtr last_switch_lsn;
/* archive_timeout is not set, or we are in recovery: return immediately */
if (XLogArchiveTimeout <= 0 || RecoveryInProgress())
return;
now = (pg_time_t) time(NULL);
/*
* First we do a quick check using possibly-stale local state: if the
* current time minus the locally saved last_xlog_switch_time has not
* reached the timeout, return.
*/
if ((int) (now - last_xlog_switch_time) < XLogArchiveTimeout)
return;
/*
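* Update local state ... note that last_xlog_switch_time is the last time a switch was performed *or requested*.
* Fetch the time of the last segment switch from shared memory, which is the time a switch was actually performed,
* together with the LSN of that switch.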
*/
last_time = GetLastSegSwitchData(&last_switch_lsn);
/* keep the more recent of the two times and update the local value */
last_xlog_switch_time = Max(last_xlog_switch_time,last_time);
/* Now we can do the real checks: if the timeout has been reached, go on to the checks below */
if ((int) (now - last_xlog_switch_time) >= XLogArchiveTimeout)
{
/*
* Switch segment only when "important" WAL has been logged since the
* last segment switch (last_switch_lsn points to end of segment
* switch occurred in).
* If the last "important" LSN is greater than the LSN of the last switch, important WAL
* has been written since that switch, so force a segment switch.
*/
if (GetLastImportantRecPtr() > last_switch_lsn)
{
XLogRecPtr switchpoint;
/* mark switch as unimportant, avoids triggering checkpoints; perform the segment switch */
switchpoint = RequestXLogSwitch(true);
/*
* If the returned pointer points exactly to a segment boundary,
* assume nothing happened. Otherwise log the switch at DEBUG1 level.
*/
if (XLogSegmentOffset(switchpoint,wal_segment_size) != 0)
elog(DEBUG1,"write-ahead log switch forced (archive_timeout=%d)",XLogArchiveTimeout);
}
/*
* Update state in any case, so we don't retry constantly when the
* system is idle. This updates the recorded switch time.
*/
last_xlog_switch_time = now;
}
}
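The decision made by CheckArchiveTimeout can be condensed into a small pure function, which makes the three possible outcomes easier to see. This is only a sketch: the enum, the function name and the parameter list are hypothetical, and last_switch_time stands for the already merged value (the newer of the local and the shared-memory time).

```c
#include <stdint.h>

typedef enum
{
    NO_ACTION,          /* timeout disabled, in recovery, or not yet reached */
    RESET_TIMER_ONLY,   /* timeout reached, but no important WAL since last switch */
    FORCE_SWITCH        /* timeout reached and important WAL was written */
} TimeoutAction;

/*
 * Condensed model of the archive_timeout decision.
 * Hypothetical helper for illustration only.
 */
TimeoutAction
archive_timeout_action(int64_t now, int64_t last_switch_time,
                       int timeout_secs, int in_recovery,
                       uint64_t last_important_lsn, uint64_t last_switch_lsn)
{
    if (timeout_secs <= 0 || in_recovery)
        return NO_ACTION;
    if (now - last_switch_time < timeout_secs)
        return NO_ACTION;
    if (last_important_lsn > last_switch_lsn)
        return FORCE_SWITCH;
    return RESET_TIMER_ONLY;
}
```

The RESET_TIMER_ONLY case is what keeps an idle system from producing a stream of empty archived segments: the timer is restarted, but no switch is requested.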
References
《PostgreSQL技术内幕:事务处理深度探索》, Chapter 4
Original article: https://blog.csdn.net/Hehuyi_In/article/details/126257457