mysqld dead but subsys locked
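I ran ps aux and mysql was nowhere to be found. Starting mysqld again with "service mysqld start" works fine, but trying to stop it produces the same problem.
Then I realized that /var/lock/subsys/mysqld still exists. While mysqld was running, I checked /var/run/mysqld/mysqld.pid, and it matches the PID of the running service.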
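For context, here is a simplified sketch of how an init-script status check can end up reporting "dead but subsys locked": the process is gone and the PID file is gone, but the lock file is still there. The function name describe_service_state and its arguments are hypothetical, and this is only an approximation of what the distribution's status helper does, not a copy of it.

```shell
# Simplified sketch of a status check. Names are hypothetical.
describe_service_state() {
    dss_pid=$1        # PID to probe (may be empty)
    dss_pidfile=$2    # e.g. /var/run/mysqld/mysqld.pid
    dss_lockfile=$3   # e.g. /var/lock/subsys/mysqld

    # kill -0 sends no signal; it only reports whether the PID can be signalled.
    if [ -n "$dss_pid" ] && kill -0 "$dss_pid" 2>/dev/null; then
        echo "running"
    elif [ -f "$dss_pidfile" ]; then
        echo "dead but pid file exists"
    elif [ -f "$dss_lockfile" ]; then
        echo "dead but subsys locked"
    else
        echo "stopped"
    fi
}

# Demo with temporary files standing in for the real paths.
demo_dir=$(mktemp -d)
touch "$demo_dir/mysqld.lock"
state=$(describe_service_state "" "$demo_dir/mysqld.pid" "$demo_dir/mysqld.lock")
echo "$state"
rm -rf "$demo_dir"
```

In the demo only the lock file exists, which is the situation described above: no process, no PID file, leftover lock file.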
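I tried reinstalling mysql and removing all of its files and configuration, to no avail.
What should I do?
Edit:
I added some echo statements to the /etc/init.d/mysqld file, specifically in the stop function: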
    stop(){
        if [ ! -f "$mypidfile" ]; then
            # not running; per LSB standards this is "ok"
            action $"Stopping $prog: " /bin/true
            return 0
        fi
        echo "beginning stop sequence"
        MYSQLPID=`cat "$mypidfile"`
        if [ -n "$MYSQLPID" ]; then
            /bin/kill "$MYSQLPID" >/dev/null 2>&1
            echo "killing pid $MYSQLPID"
            ret=$?
            if [ $ret -eq 0 ]; then
                echo "return code $ret after kill attempt"
                TIMEOUT="$STOPTIMEOUT"
                echo "timeout is set to $STOPTIMEOUT"
                while [ $TIMEOUT -gt 0 ]; do
                    /bin/kill -0 "$MYSQLPID" >/dev/null 2>&1 || break
                    sleep 1
                    let TIMEOUT=${TIMEOUT}-1
                    echo "timeout is now $TIMEOUT"
                done
                if [ $TIMEOUT -eq 0 ]; then
                    echo "Timeout error occurred trying to stop MySQL Daemon."
                    ret=1
                    action $"Stopping $prog: " /bin/false
                else
                    echo "attempting to del lockfile: $lockfile"
                    rm -f $lockfile
                    rm -f "$socketfile"
                    action $"Stopping $prog: " /bin/true
                fi
            else
                action $"Stopping $prog: " /bin/false
            fi
        else
            # failed to read pidfile, probably insufficient permissions
            action $"Stopping $prog: " /bin/false
            ret=4
        fi
        return $ret
    }
This is what I get when I try to stop the service:
    [root@server]# service mysqld stop
    beginning stop sequence
    killing pid 9145
    return code 0 after kill attempt
    timeout is set to 60
    timeout is now 59
    timeout is now 58
    timeout is now 57
    timeout is now 56
    timeout is now 55
    timeout is now 54
    timeout is now 53
    timeout is now 52
    timeout is now 51
    timeout is now 50
    timeout is now 49
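From reading the code, it looks like it never breaks out of the while loop, and so it never deletes the lock file. Am I interpreting this wrong? I checked the same file on my other servers and it uses the same code. I'm dumbfounded.
Edit:
In the while loop section,
    /bin/kill -0 "$MYSQLPID" >/dev/null 2>&1 || break
for some reason the return code is not picked up. When service mysqld stop is called the process does get killed, but I'm not sure why the loop is not allowed to break.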
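To make the dependency on the exit status explicit, here is a minimal sketch of the same wait-with-timeout idea, written with the shell's builtin kill rather than /bin/kill. The function name wait_for_exit is hypothetical, and this is an illustration of the loop's logic, not a replacement for the stock init script.

```shell
# Wait until a PID disappears or the timeout (in seconds) runs out.
# Returns 0 if the process is gone, 1 on timeout. Name is hypothetical.
wait_for_exit() {
    wfe_pid=$1
    wfe_timeout=$2
    while [ "$wfe_timeout" -gt 0 ]; do
        # kill -0 sends no signal. A non-zero status means the PID could
        # not be signalled, which normally means the process no longer exists.
        kill -0 "$wfe_pid" 2>/dev/null || return 0
        sleep 1
        wfe_timeout=$((wfe_timeout - 1))
    done
    return 1
}

# Demo: create a PID that has already exited and been reaped.
sleep 0 &
gone_pid=$!
wait "$gone_pid"

if wait_for_exit "$gone_pid" 5; then
    echo "process $gone_pid has exited"
else
    echo "timed out waiting for $gone_pid"
fi
```

The loop can only end early if the probe returns a non-zero status for a missing process. A kill that prints "No such process" but still exits with 0 would keep a loop like this running until the timeout, which matches the output shown above.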
Edit:
Further testing shows some strange behavior between calling /bin/kill and calling just kill. They apparently return different codes. Why?
    [root@server]# /bin/kill 25200
    kill 25200: No such process
    [user@server]# echo ${?}
    0
    [root@server]# kill 25200
    -bash: kill: (25200) - No such process
    [root@server]# echo ${?}
    1
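In bash, plain kill is a shell builtin, while /bin/kill is a separate binary on disk, so the two commands above are different programs. The following is a small sketch for comparing their exit statuses side by side on a PID that is known to be gone. The variable names are made up for the example.

```shell
# Create a PID that has already exited and been reaped.
sleep 0 &
dead_pid=$!
wait "$dead_pid"

# Show how the shell resolves the bare name (prints "kill" for a builtin).
command -v kill

# Probe with the shell builtin.
builtin_rc=0
kill -0 "$dead_pid" 2>/dev/null || builtin_rc=$?
echo "builtin kill exit status: $builtin_rc"

# Probe with the external binary, if this system has one at that path.
if [ -x /bin/kill ]; then
    external_rc=0
    /bin/kill -0 "$dead_pid" 2>/dev/null || external_rc=$?
    echo "/bin/kill exit status: $external_rc"
else
    echo "/bin/kill is not present on this system"
fi
```

If the two statuses differ for the same missing PID, as in the transcript above, the external binary is the one worth inspecting.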
Edit: I logged in as a non-root user and tried running "kill" and "/bin/kill". The results were surprising:
    [notroot@server ~]$ kill -0 23232
    -bash: kill: (23232) - No such process
    [notroot@server ~]$ echo $?
    1
    [notroot@server ~]$ /bin/kill -0 23232
    kill 23232: No such process (No info could be read for "-p": geteuid()=501 but you should be root.)
    [notroot@server ~]$ echo $?
    0
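The "No info could be read" error does not show up on my other servers when running kill and /bin/kill as a non-root user.
Edit: I added the logging that quanta described and checked the mysql log.
After a start and a stop, the mysql log shows: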
    110918 00:11:28 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
    110918 0:11:28 [Note] Plugin 'FEDERATED' is disabled.
    110918 0:11:28 InnoDB: Initializing buffer pool, size = 16.0M
    110918 0:11:28 InnoDB: Completed initialization of buffer pool
    110918 0:11:29 InnoDB: Started; log sequence number 0 44233
    110918 0:11:29 [Warning] 'user' entry 'root@server' ignored in --skip-name-resolve mode.
    110918 0:11:29 [Warning] 'user' entry '@server' ignored in --skip-name-resolve mode.
    110918 0:11:29 [Note] Event Scheduler: Loaded 0 events
    110918 0:11:29 [Note] /usr/libexec/mysqld: ready for connections.
    Version: '5.1.58-ius' socket: '/var/lib/mysql/mysql.sock' port: 3306 Distributed by The IUS Community Project
    110918 0:11:34 [Note] /usr/libexec/mysqld: Normal shutdown
    110918 0:11:34 [Note] Event Scheduler: Purging the queue. 0 events
    110918 0:11:34 InnoDB: Starting shutdown...
    110918 0:11:39 InnoDB: Shutdown completed; log sequence number 0 44233
    110918 0:11:39 [Note] /usr/libexec/mysqld: Shutdown complete
    110918 00:11:39 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
And then in tmp/mysql.log:
    kill 23080: No such process
    kill 23080: No such process
    kill 23080: No such process
    kill 23080: No such process
    kill 23080: No such process
    kill 23080: No such process
    kill 23080: No such process
    kill 23080: No such process
    kill 23080: No such process
    kill 23080: No such process
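I interrupted the stop partway through so I would not have to wait for the timeout. It looks like the process was killed. I think the problem is still the different return codes from "kill" and "/bin/kill".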
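Solution
On my RHEL 5.6 box I always get a return code of 1 if I try to kill a nonexistent PID. I tried as root and as an unprivileged user, both with the full path and with just the command name. I also get only the terse "kill XXX: No such process", with no verbose error message.
It might be a good idea to run rpm -Vv util-linux and see whether someone has replaced /bin/kill with a "new and improved" version. Even if rpm verification says the file is pristine, I would try renaming /bin/kill and copying over the binary from a working machine. If replacing the file helps and you do not find a legitimate source for the change, then regardless of the output of rpm verification I would assume the machine has been compromised.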
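As a rough sketch of that check, the commands below ask rpm which package owns /bin/kill, run a verbose verification of that package, and print a checksum that can be compared by hand with the same file on a known-good machine. The helper name file_digest is hypothetical, and the fallback to md5sum is an assumption for systems without sha256sum.

```shell
target=/bin/kill

# Print a checksum of a file; prefers sha256sum, falls back to md5sum.
file_digest() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | cut -d' ' -f1
    else
        md5sum "$1" | cut -d' ' -f1
    fi
}

if [ -e "$target" ]; then
    if command -v rpm >/dev/null 2>&1; then
        # Which package claims this file?
        rpm -qf "$target" || echo "no package claims $target"
        # Verbose verification of the owning package.
        rpm -V -v -f "$target" || echo "rpm reported differences or could not verify"
    fi
    echo "digest of $target: $(file_digest "$target")"
else
    echo "$target does not exist on this system"
fi
```

Run the same script on a working server with the same package version and compare the digests. A clean rpm result on its own proves little if the machine is compromised, since the rpm database and tools could have been altered as well.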