linux程序提示killed的原因是什么?
1. 查看Killed对应的日志
1.1 触发Killed常见原因
当系统资源不足时,Linux 内核也可以决定终止一个或多个进程。 一个非常常见的例子是内存不足 (OOM) killer,会在系统的物理内存耗尽时触发。
当内存不足时,内核会将相关信息记录到内核日志缓冲区中,该缓冲区可通过 /dev/kmsg 获得。
有几个工具/脚本/命令 可以更轻松地从该虚拟设备读取数据,其中最常见的是 dmesg 和 journalctl。
1.2 查看Killed日志
使用sudo dmesg | tail -7命令(任意目录下,不需要进入log目录,这应该是最简单的一种)
可以看到:
oom-kill之后,就是解释那个被killed的程序的pid和uid
Out of memory: Killed process 1138439 (python3) total-vm:8117956kB, anon-rss:5649844kB,内存不够
total_vm:总共使用的虚拟内存 Virtual memory use (in 4 kB pages)
8117956/1024(得到MB)/1024(得到GB)=7.741GB
rss:常驻内存使用Resident memory use (in 4 kB pages)
5649844/1024/1024=5.388GB
另外,网上还有几种方式,如下:
使用下面的这几行命令
journalctl --list-boots |
awk ‘{ print $1 }’ |
xargs -I{} journalctl --utc --no-pager -b {} -kqg ‘killed process’ -o verbose --output-fields=MESSAGE
就可以直接得到,最关键的信息
网上更常见的似乎是:
journalctl -xb | egrep -i 'killed process’
其中-B100,表示 'killed process’之前的100行内容
其中有一部分信息也许有用,如下所示,类似一个表格
重点关注其中最后两列:oom_score_adj和name
可以看到,有些服务常驻内存,所以占了不少内存,考虑清理出去。
同时killed之后,似乎有些内存,不会从内存中自动释放。需要手动释放。
另外,还有
egrep -i ‘killed process’ /var/log/messages 或
egrep -i -r ‘killed process’ /var/log
1.3 dmesg输出信息说明
1.1 触发Killed常见原因
当系统资源不足时,Linux 内核也可以决定终止一个或多个进程。 一个非常常见的例子是内存不足 (OOM) killer,会在系统的物理内存耗尽时触发。
当内存不足时,内核会将相关信息记录到内核日志缓冲区中,该缓冲区可通过 /dev/kmsg 获得。
有几个工具/脚本/命令 可以更轻松地从该虚拟设备读取数据,其中最常见的是 dmesg 和 journalctl。
1.2 查看Killed日志
使用sudo dmesg | tail -7命令(任意目录下,不需要进入log目录,这应该是最简单的一种)
- hs@hs:~$ sudo dmesg | tail -7
- [4772612.871775] [1125836] 1000 1125836 2539 911 61440 0 0 bash
- [4772612.871777] [1126560] 1000 1126560 1787 474 53248 0 0 tmux: client
- [4772612.871779] [1138439] 1000 1138439 2029489 1412461 12845056 0 0 python3
- [4772612.871781] [1141425] 1000 1141425 1809 38 49152 0 0 sleep
- [4772612.871783] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1000.slice/session-1549.scope,task=python3,pid=1138439,uid=1000
- [4772612.871840] Out of memory: Killed process 1138439 (python3) total-vm:8117956kB, anon-rss:5649844kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:12544kB oom_score_adj:0
- [4772612.968800] oom_reaper: reaped process 1138439 (python3), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
oom-kill之后,就是解释那个被killed的程序的pid和uid
Out of memory: Killed process 1138439 (python3) total-vm:8117956kB, anon-rss:5649844kB,内存不够
total_vm:总共使用的虚拟内存 Virtual memory use (in 4 kB pages)
8117956/1024(得到MB)/1024(得到GB)=7.741GB
rss:常驻内存使用Resident memory use (in 4 kB pages)
5649844/1024/1024=5.388GB
另外,网上还有几种方式,如下:
使用下面的这几行命令
journalctl --list-boots |
awk ‘{ print $1 }’ |
xargs -I{} journalctl --utc --no-pager -b {} -kqg ‘killed process’ -o verbose --output-fields=MESSAGE
就可以直接得到,最关键的信息
- hs@hs:~$ journalctl --list-boots | \
- > awk '{ print $1 }' | \
- > xargs -I{} journalctl --utc --no-pager -b {} -kqg 'killed process' -o verbose --output-fields=MESSAGE
- Mon 2022-02-14 08:48:43.684987 UTC [s=ed0a1ecfd36e41bda458e5e111c46e95;i=d4573;b=7bc379f894944dcd81ae32ff54afa009;m=456b0ad36d2;t=5d7f67bdee47b;x=5bfe01f8e2ca9b2c]
- MESSAGE=Out of memory: Killed process 1125888 (python3) total-vm:8530488kB, anon-rss:5653404kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:12552kB oom_score_adj:0
- Mon 2022-02-14 09:29:43.854158 UTC [s=ed0a1ecfd36e41bda458e5e111c46e95;i=d4785;b=7bc379f894944dcd81ae32ff54afa009;m=45743506aa5;t=5d7f70e82184e;x=9b55cfb2e7c081e7]
- MESSAGE=Out of memory: Killed process 1138439 (python3) total-vm:8117956kB, anon-rss:5649844kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:12544kB oom_score_adj:0
journalctl -xb | egrep -i 'killed process’
- hs@hs:~$ journalctl -xb | egrep -i 'killed process'
- Feb 14 08:48:43 hs kernel: Out of memory: Killed process 1125888 (python3) total-vm:8530488kB, anon-rss:5653404kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:12552kB oom_score_adj:0
- Feb 14 09:29:43 hs kernel: Out of memory: Killed process 1138439 (python3) total-vm:8117956kB, anon-rss:5649844kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:12544kB oom_score_adj:0
- Feb 15 03:42:08 hs sudo[1151639]: hs : TTY=pts/0 ; PWD=/home/hs ; USER=root ; COMMAND=/usr/bin/egrep -i -r killed process /var/log
其中有一部分信息也许有用,如下所示,类似一个表格
重点关注其中最后两列:oom_score_adj和name
可以看到,有些服务常驻内存,所以占了不少内存,考虑清理出去。
同时killed之后,似乎有些内存,不会从内存中自动释放。需要手动释放。
- [4770152.819134] Tasks state (memory values in pages):
- [4770152.819135] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
- [4770152.819138] [ 414] 0 414 70068 4518 90112 0 -1000 multipathd
- [4770152.819142] [ 704] 0 704 60330 698 106496 0 0 accounts-daemon
- [4770152.819144] [ 708] 0 708 2136 610 57344 0 0 cron
- [4770152.819775] [ 734] 0 734 7372 3870 94208 0 0 python3
- [4770152.819777] [ 735] 0 735 950 519 49152 0 0 atd
- [4770152.819790] [ 1033] 0 1033 82072 5590 131072 0 0 python3
- [4770152.819792] [ 1577] 0 1577 3046 915 61440 0 -1000 sshd
- [4770152.819795] [ 29492] 1000 29492 4663 1187 73728 0 0 systemd
- [4770152.819798] [ 29493] 1000 29493 42531 1196 102400 0 0 (sd-pam)
- [4770152.819800] [ 135845] 0 135845 457433 38533 835584 0 0 nebula-metad
- [4770152.819802] [ 135908] 0 135908 333667 24936 385024 0 0 nebula-graphd
- [4770152.819804] [ 135927] 0 135927 592451 139719 1687552 0 0 nebula-storaged
- [4770152.819818] [ 689086] 0 689086 292212 7132 286720 0 -900 snapd
- [4770152.819821] [ 694790] 1000 694790 1779 641 61440 0 0 dbus-daemon
- [4770152.819823] [ 699514] 1000 699514 407129 5193 1474560 0 0 node
- [4770152.819825] [ 699523] 1000 699523 416398 7757 1634304 0 0 node
- [4770152.819827] [ 699549] 1000 699549 489317 11361 2494464 0 0 node
- [4770152.819830] [ 699555] 1000 699555 488416 10836 2531328 0 0 node
- [4770152.819832] [ 699559] 1000 699559 488945 10316 2461696 0 0 node
- [4770152.819835] [1102457] 0 1102457 3491 1010 65536 0 0 sshd
- [4770152.819837] [1102573] 1000 1102573 3491 428 65536 0 0 sshd
- [4770152.819839] [1102574] 1000 1102574 2215 574 57344 0 0 bash
- [4770152.819841] [1102636] 1000 1102636 654 122 40960 0 0 sh
- [4770152.819843] [1102643] 1000 1102643 403507 180866 3731456 0 0 node
- [4770152.819845] [1102681] 1000 1102681 233944 6666 1265664 0 0 node
- [4770152.819847] [1102707] 1000 1102707 221411 13611 2768896 0 0 node
- [4770152.819849] [1102777] 1000 1102777 2535 952 49152 0 0 bash
- [4770152.819851] [1103566] 1000 1103566 207746 13759 720896 0 0 node
- [4770152.819853] [1110927] 113 1110927 2395391 627318 8884224 0 0 java
- [4770152.819856] [1121028] 1000 1121028 1488563 907652 8630272 0 0 python3
- [4770152.819858] [1125835] 1000 1125835 1983 659 49152 0 0 tmux: server
- [4770152.819860] [1125836] 1000 1125836 2539 944 61440 0 0 bash
- [4770152.819862] [1125888] 1000 1125888 2132622 1413351 12853248 0 0 python3
- [4770152.819864] [1126560] 1000 1126560 1787 569 53248 0 0 tmux: client
- [4770152.819867] [1137700] 1000 1137700 1809 38 49152 0 0 sleep
egrep -i ‘killed process’ /var/log/messages 或
egrep -i -r ‘killed process’ /var/log
- hs@hs:~$ sudo egrep -i -r 'killed process' /var/log
- Binary file /var/log/journal/482df032dea642279e0da48fa9dd4a1a/system.journal matches
- /var/log/auth.log:Feb 15 03:42:08 hs sudo: hs : TTY=pts/0 ; PWD=/home/hs ; USER=root ; COMMAND=/usr/bin/egrep -i -r killed process /var/log
- /var/log/syslog.1:Feb 14 08:48:43 hs kernel: [4770152.820348] Out of memory: Killed process 1125888 (python3) total-vm:8530488kB, anon-rss:5653404kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:12552kB oom_score_adj:0
- /var/log/syslog.1:Feb 14 09:29:43 hs kernel: [4772612.871840] Out of memory: Killed process 1138439 (python3) total-vm:8117956kB, anon-rss:5649844kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:12544kB oom_score_adj:0
- /var/log/kern.log.1:Feb 14 08:48:43 hs kernel: [4770152.820348] Out of memory: Killed process 1125888 (python3) total-vm:8530488kB, anon-rss:5653404kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:12552kB oom_score_adj:0
- /var/log/kern.log.1:Feb 14 09:29:43 hs kernel: [4772612.871840] Out of memory: Killed process 1138439 (python3) total-vm:8117956kB, anon-rss:5649844kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:12544kB oom_score_adj:0