Page generated: Tue May 8 16:59:32 CDT 2012
TOP sources of PNFS traffic in last 60 seconds:
0% cmssrv28-monitoring 1
0% cmslpc03 156
0% cmswn 4059
0% cmssrv139 4115
2% cmsstor 20995
10% cmssrv148 78943
11% cmssrv147 88869
12% cmssrv33 94671
13% cmsdcam5 108897
49% localhost-PnfsManager 387179
787885 PNFS tcpdump responses or 13131/sec
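The percentages and the per-second rate in the block above are consistent with truncated integer quotients over a 60-second window (for example 108897 of 787885 is shown as 13%, not 14%). The sketch below reproduces the block from the raw counts under that assumption; the names `SOURCES`, `WINDOW_SECONDS` and `summarize` are illustrative and are not taken from the script that generates this page.

```python
# Per-source response counts copied from the "TOP sources" block above.
SOURCES = {
    "cmssrv28-monitoring": 1,
    "cmslpc03": 156,
    "cmswn": 4059,
    "cmssrv139": 4115,
    "cmsstor": 20995,
    "cmssrv148": 78943,
    "cmssrv147": 88869,
    "cmssrv33": 94671,
    "cmsdcam5": 108897,
    "localhost-PnfsManager": 387179,
}

WINDOW_SECONDS = 60  # assumed sampling window, per the section heading


def summarize(counts, window=WINDOW_SECONDS):
    """Return (total, per-second rate, percent per source), all truncated."""
    total = sum(counts.values())
    percents = {
        name: (100 * n // total if total else 0) for name, n in counts.items()
    }
    return total, total // window, percents


total, per_sec, percents = summarize(SOURCES)

# Print in ascending order of count, as the page does.
for name, n in sorted(SOURCES.items(), key=lambda kv: kv[1]):
    print(f"{percents[name]}% {name} {n}")
print(f"{total} PNFS tcpdump responses or {per_sec}/sec")
```

Truncation rather than rounding explains why the percentage column sums to less than 100.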
TOP worker node sources of PNFS traffic in last 60 seconds:
0% cmswn1456 22
0% cmswn1940 24
0% cmswn1943 24
0% cmswn1956 24
0% cmswn1952 124
TOP storage node sources of PNFS traffic in last 60 seconds:
0% cmsstor320 1420
0% cmsstor281 2359
0% cmsstor251 2554
0% cmsstor150 2895
0% cmsstor346 3416
---------------------------------------------------------------------------------------------------------------------------
PNFS statistics during last 60 seconds:
0% null 0
98% getattr 457221
0% setattr 78
0% root 0
0% lookup 1606
0% readlink 0
0% read 3100
0% wrcache 0
0% write 78
0% create 58
0% remove 3
0% rename 0
0% link 0
0% symlink 0
0% mkdir 0
0% rmdir 0
0% readdir 0
0% fsstat 26
462170 PNFS operations or 7702/sec
---------------------------------------------------------------------------------------------------------------------------
UPTIME:
17:00:42 up 206 days, 1:56, 1 user, load average: 6.82, 7.29, 7.51
---------------------------------------------------------------------------------------------------------------------------
Cleaner failed files:
/diskb/pnfs/cleaner/archive:
total 0
/diskb/pnfs/cleaner/current:
total 484
-rw-r--r-- 1 root root 2500 May 8 14:52 failed.v-cmsstor211-11
-rw-r--r-- 1 root root 2500 May 8 14:52 failed.v-cmsstor211-12
-rw-r--r-- 1 root root 2500 May 8 14:52 failed.v-cmsstor211-13
-rw-r--r-- 1 root root 21225 May 8 14:52 failed.w-cmsdemo2-1
-rw-r--r-- 1 root root 21225 May 8 14:52 failed.w-cmsdemo2-2
-rw-r--r-- 1 root root 21225 May 8 14:52 failed.w-cmsdemo2-3
-rw-r--r-- 1 root root 21225 May 8 14:52 failed.w-cmsdemo2-4
-rw-r--r-- 1 root root 21225 May 8 14:52 failed.w-cmsdemo2-5
-rw-r--r-- 1 root root 21225 May 8 14:52 failed.w-cmsdemo2-6
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor108-1
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor129-1
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor133-2
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor138-2
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor144-1
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor144-2
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor147-1
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor147-2
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor147-3
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor150-3
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor168-2
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor169-1
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor169-2
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor176-2
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor191-2
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor192-1
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor201-2
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor202-2
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor216-1
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor216-2
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor218-1
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor221-1
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor225-1
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor238-1
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor239-2
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor246-2
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor283-1
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor284-1
-rw-r--r-- 1 root root 2500 May 8 14:52 failed.w-cmsstor304-1
-rw-r--r-- 1 root root 2500 May 8 14:52 failed.w-cmsstor304-2
-rw-r--r-- 1 root root 2500 May 8 14:52 failed.w-cmsstor304-3
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor310-2
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor314-1
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor317-2
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor325-1
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor328-3
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor339-1
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor344-1
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor344-3
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor348-3
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor350-3
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor356-2
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor365-1
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor377-3
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor42-3
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor88-3
-rw-r--r-- 1 root root 50 May 8 14:52 failed.w-cmsstor92-1
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor92-2
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor94-3
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor96-1
-rw-r--r-- 1 root root 25 May 8 14:52 failed.w-cmsstor98-3
-rw-r--r-- 1 root root 2500 May 8 14:52 failed.w-cmsstordb2-1
-rw-r--r-- 1 root root 2500 May 8 14:52 failed.w-cmsstordb2-2
-rw-r--r-- 1 root root 2500 May 8 14:52 failed.w-cmsstordb2-3
-rw-r--r-- 1 root root 38325 May 8 14:52 failed.w-cmswn32c-1
-rw-r--r-- 1 root root 34750 May 8 14:52 failed.w-cmswn32c-2
-rw-r--r-- 1 root root 35975 May 8 14:52 failed.w-cmswn32c-3
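A long listing like the one above is easier to scan when grouped by host. The following is a minimal sketch, assuming the file names follow the pattern `failed.<letter>-<host>-<index>` (an inference from the names shown, not a documented convention) and that the size is the fifth field of each `ls -l` line. `SAMPLE` holds only four of the lines above, and `summarize_failed` is a hypothetical helper.

```python
from collections import defaultdict

# Four lines copied from the listing above; a real run would read them all.
SAMPLE = """\
-rw-r--r-- 1 root root 2500 May 8 14:52 failed.v-cmsstor211-11
-rw-r--r-- 1 root root 2500 May 8 14:52 failed.v-cmsstor211-12
-rw-r--r-- 1 root root 21225 May 8 14:52 failed.w-cmsdemo2-1
-rw-r--r-- 1 root root 50 May 8 14:52 failed.w-cmsstor92-1
"""


def summarize_failed(listing):
    """Map host -> (file count, total bytes) for 'failed.*' entries."""
    totals = defaultdict(lambda: [0, 0])
    for line in listing.splitlines():
        fields = line.split()
        # Skip 'total N' lines, directory headers and anything unexpected.
        if len(fields) < 9 or not fields[-1].startswith("failed."):
            continue
        size = int(fields[4])
        stem = fields[-1].split(".", 1)[1]   # e.g. 'v-cmsstor211-11'
        stem = stem.rsplit("-", 1)[0]        # drop trailing index
        host = stem.split("-", 1)[1]         # drop leading letter
        totals[host][0] += 1
        totals[host][1] += size
    return {host: tuple(v) for host, v in totals.items()}


summary = summarize_failed(SAMPLE)
for host, (count, size) in sorted(summary.items()):
    print(f"{host}: {count} file(s), {size} bytes")
```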
---------------------------------------------------------------------------------------------------------------------------
VMSTAT 60 2:
procs ------------memory------------ ---swap-- -----io---- ---system--- -----cpu------
 r  b   swpd    free   buff    cache   si   so    bi    bo    in     cs us sy id wa st
 4  0   4852 4659980 211544 33609948    0    0   284   151     0      0 20 13 66  2  0
 1  0   4852 4637084 211632 33613860    0    0   303   864 17080 207348 17 19 63  0  0
---------------------------------------------------------------------------------------------------------------------------
atop -c 60 2:
ATOP - cmspnfs1 2012/05/08 17:00:32 60 seconds elapsed
PRC | sys 89.46s | user 90.39s | #proc 266 | #zombie 2 | #exit 818 |
CPU | sys 120% | user 148% | irq 34% | idle 494% | wait 4% |
cpu | sys 18% | user 27% | irq 5% | idle 49% | cpu001 w 1% |
cpu | sys 16% | user 21% | irq 3% | idle 59% | cpu003 w 1% |
cpu | sys 12% | user 22% | irq 3% | idle 62% | cpu002 w 1% |
CPL | avg1 6.53 | avg5 7.25 | avg15 7.50 | csw 12713523 | intr 1030150 |
MEM | tot 47.3G | free 4.4G | cache 32.1G | buff 206.8M | slab 9.6G |
SWP | tot 32.0G | free 32.0G | | vmcom 4.0G | vmlim 32.9G |
DSK | sdb | busy 5% | read 799 | write 5297 | avio 0 ms |
DSK | sda | busy 1% | read 298 | write 531 | avio 0 ms |
NET | transport | tcpi 108523 | tcpo 113363 | udpi 1182990 | udpo 1180786 |
NET | network | ipi 1292676 | ipo 1293167 | ipfrw 0 | deliv 1292e3 |
NET | bond0 ---- | pcki 1303197 | pcko 448724 | si 187 Mbps | so 13 Mbps |
NET | eth0 18% | pcki 1301185 | pcko 448724 | si 187 Mbps | so 13 Mbps |
NET | lo ---- | pcki 843754 | pcko 843754 | si 16 Mbps | so 16 Mbps |
PID CPU COMMAND-LINE 1/1
12826 48% /usr/java/jdk1.6.0_22/bin/java -server -Xmx2048m -XX:MaxDirectMemory
18417 24% flush-0:19
7772 23% postgres: enstore users [local] SELECT
7747 20% postgres: enstore resilient [local] idle
3999 14% /usr/sbin/tcpdump -i eth0 port 2049 and udp
4004 14% /usr/sbin/tcpdump -i lo port 2049 and udp
7774 9% ./dbserver users
7806 8% ./pnfsd
7804 8% ./pnfsd
7812 8% ./pnfsd
7800 8% ./pnfsd
7810 8% ./pnfsd
7805 8% ./pnfsd
7801 8% ./pnfsd
7808 8% ./pnfsd
7809 7% ./pnfsd
7803 7% ./pnfsd
7802 7% ./pnfsd
7807 7% ./pnfsd
7757 7% postgres: enstore migration [local] idle
7688 5% ./dbserver admin
7734 4% ./dbserver cms
7724 4% ./dbserver cms11
7749 3% ./dbserver resilient
7759 3% ./dbserver migration
29151 3% postgres: enstore companion 127.0.0.1(39979) idle
5687 2% postgres: enstore companion 127.0.0.1(55469) idle
16937 2% postgres: enstore companion 127.0.0.1(59596) idle
5685 2% postgres: enstore companion 127.0.0.1(55467) idle
29154 2% postgres: enstore companion 127.0.0.1(39982) idle
5684 2% postgres: enstore companion 127.0.0.1(55466) idle
7721 2% postgres: enstore cms11 [local] idle
3986 1% egrep IP localhost.nfs|IP localhost.localdomain.nfs
? 1%
7172 1% postgres: stats collector process
7686 1% postgres: enstore admin [local] idle
3982 1% grep IP cmspnfs1.fnal.gov.nfs
? 1%
? 1%
3983 1% cut -f2 -d>
3988 1% cut -f2 -d>
7782 1% postgres: enstore cms4 [local] idle
? 1%
10985 1% /usr/local/bin/zabbix_agentd
---------------------------------------------------------------------------------------------------------------------------
ipcs
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x0052e2c1 0 enstore 600 2247483392 61
0x00001122 32769 root 600 1672 34
0x00000000 65538 root 700 12288 2
0x00000000 98307 root 700 12288 22
0x00000000 131076 root 700 12288 22
0x00000000 163845 root 700 12288 22
0x00000000 196614 root 700 12288 22
0x00000000 229383 root 700 12288 22
0x00000000 262152 root 700 12288 22
0x00000000 294921 root 700 12288 22
0x00000000 327690 root 700 12288 22
0x00000000 360459 root 700 12288 22
0x00000000 393228 root 700 12288 22
0x00000000 425997 root 700 12288 22
0x00000000 458766 root 700 12288 22
0x7a017d22 491535 zabbix 666 1216860 11
------ Semaphore Arrays --------
key semid owner perms nsems
0x0052e2c1 2818048 enstore 600 17
0x0052e2c2 2850817 enstore 600 17
0x0052e2c3 2883586 enstore 600 17
0x0052e2c4 2916355 enstore 600 17
0x0052e2c5 2949124 enstore 600 17
0x0052e2c6 2981893 enstore 600 17
0x0052e2c7 3014662 enstore 600 17
0x002fa327 3047431 root 666 2
0x00000000 3080200 root 600 1
0x00000000 3112969 root 700 2
0x00000000 3145738 root 700 2
0x00000000 3178507 root 700 2
0x00000000 3211276 root 700 2
0x00000000 3244045 root 700 2
0x00000000 3276814 root 700 2
0x00000000 3309583 root 700 2
0x00000000 3342352 root 700 2
0x00000000 3375121 root 700 2
0x00000000 3407890 root 700 2
0x00000000 3440659 root 700 2
0x00000000 3473428 root 700 2
0x00000000 3506197 root 700 2
0x00000000 3538966 root 700 2
0x00000000 3571735 root 700 2
0x00000000 3604504 root 700 2
0x00000000 3637273 root 700 2
0x00000000 3670042 root 700 2
0x00000000 3702811 root 700 2
0x00000000 3735580 root 700 2
0x00000000 3768349 root 700 2
0x00000000 3801118 root 700 1
0x00000000 3833887 root 700 1
0x00000000 3866656 root 700 1
0x00000000 3899425 root 700 1
0x00000000 3932194 root 700 1
0x00000000 3964963 root 700 1
0x00000000 3997732 root 700 1
0x00000000 4030501 root 700 1
0x00000000 4063270 root 700 1
0x00000000 4096039 root 700 1
0x00000000 4128808 root 700 1
0x00000000 4161577 root 700 1
0x00000000 4194346 root 700 1
0x00000000 4554795 root 600 1
0x00000000 4587564 root 600 1
0x00000000 4620333 root 600 1
0x00000000 4653102 root 600 1
0x00000000 4685871 root 600 1
0x00000000 4948016 root 600 1
0x00000000 4915249 root 600 1
0x00000000 4784178 root 600 1
0x00000000 4816947 root 600 1
0x00000000 4849716 root 600 1
0x00000000 4882485 root 600 1
0x00000000 4980790 root 600 1
0x00000000 5013559 root 600 1
0x00000000 5046328 root 600 1
0x00000000 5079097 root 600 1
0x00000000 5111866 root 600 1
0x00000000 5144635 root 600 1
0x00000000 5177404 root 600 1
0x00000000 5210173 root 600 1
0x7a017d22 5603390 zabbix 666 5
0x00000000 5308479 root 600 1
0x00000000 5341248 root 600 1
0x00000000 5374017 root 600 1
0x00000000 5406786 root 600 1
0x00000000 5439555 root 600 1
0x00000000 5472324 root 600 1
0x00000000 5505093 root 600 1
0x00000000 5537862 root 600 1
0x00000000 5570631 root 600 1
------ Message Queues --------
key msqid owner perms used-bytes messages
---------------------------------------------------------------------------------------------------------------------------
time df /pnfs/fs:
Filesystem 1K-blocks Used Available Use% Mounted on
localhost:/fs 400000 80000 284000 22% /pnfs/fs
real 0m0.001s
user 0m0.000s
sys 0m0.000s
---------------------------------------------------------------------------------------------------------------------------
PNFSMANAGER INFO:
dCache Admin (VII) (user=enstore)
[cmssrv32.fnal.gov] (local) enstore > cd PnfsManager
[cmssrv32.fnal.gov] (PnfsManager) enstore > info
$Revision: 12742 $
NameSpace Provider:
diskCacheV111.namespace.provider.PermissionHandlerNameSpaceProvider@6d6564ae
CacheLocation Provider:
$Id: SQLNameSpaceProvider.java,v 1.19 2007-08-22 12:24:38 tigran Exp $
List operations queued: 0
Threads (32) Queue
[0] 0
[1] 0
[2] 0
[3] 0
[4] 0
[5] 0
[6] 0
[7] 0
[8] 0
[9] 0
[10] 0
[11] 0
[12] 0
[13] 0
[14] 0
[15] 0
[16] 0
[17] 0
[18] 0
[19] 0
[20] 0
[21] 0
[22] 0
[23] 0
[24] 0
[25] 0
[26] 0
[27] 0
[28] 0
[29] 0
[30] 0
[31] 0
Thread groups (1)
[0] 0
Cache Location Queues
[0] 0
[1] 0
[2] 0
[3] 0
Statistics:
PnfsManagerV3 requests failed
PnfsUpdateCacheStatisticsMessage 0 0
PnfsMapPathMessage 71676819 11381
PnfsCreateEntryMessage 6477807 22507
PnfsGetCacheLocationsMessage 269373574 341
PnfsGetCacheStatisticsMessage 0 0
PnfsGetFileAttributes 2912735 248033
PnfsGetParentMessage 148 0
PnfsSetStorageInfoMessage 0 0
PnfsFlagMessage 6468969 0
PnfsGetChecksumMessage 9230604 2
PnfsSetFileAttributes 11526743 69036
PnfsSetChecksumMessage 6447020 5
PnfsAddCacheLocationMessage 70621 0
PnfsDeleteEntryMessage 777585 435221
PnfsListDirectoryMessage 5932 1
PnfsGetChecksumAllMessage 3101 0
PnfsRenameMessage 4018 1
PnfsClearCacheLocationMessage 24933843 0
PnfsGetStorageInfoMessage 261688581 70075741
PnfsSetLengthMessage 0 0
PnfsGetFileMetaDataMessage 94181674 7014411
PoolFileFlushedMessage 3923531 5
PnfsCreateDirectoryMessage 2676305 2550444
PnfsSetFileMetaDataMessage 0 0
Total 772379610 80427129
PnfsManagerV3.Folded requests failed
Total 0 0
[cmssrv32.fnal.gov] (PnfsManager) enstore > ..
[cmssrv32.fnal.gov] (local) enstore >
[cmssrv32.fnal.gov] (local) enstore > logoff
dmg.util.CommandExitException: (0) Done
[cmssrv32.fnal.gov] (local) enstore >
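The Statistics table in the PnfsManager `info` output above gives cumulative request and failure counts per message type, which are easier to compare as failure percentages. The sketch below assumes each row has the form `name requests failed`; `SAMPLE` contains a handful of rows copied from the table, and `parse_stat_line` and `failure_percent` are illustrative helpers, not part of dCache.

```python
# A few rows copied from the Statistics table above.
SAMPLE = """\
PnfsGetStorageInfoMessage 261688581 70075741
PnfsCreateDirectoryMessage 2676305 2550444
PnfsDeleteEntryMessage 777585 435221
PnfsSetLengthMessage 0 0
Total 772379610 80427129
"""


def parse_stat_line(line):
    """Split one 'name requests failed' row into typed fields."""
    name, requests, failed = line.split()
    return name, int(requests), int(failed)


def failure_percent(requests, failed):
    """Failed requests as a percentage of all requests (0.0 if none)."""
    return 100.0 * failed / requests if requests else 0.0


stats = {
    name: failure_percent(requests, failed)
    for name, requests, failed in map(parse_stat_line, SAMPLE.splitlines())
}

# List the message types with the highest failure percentage first.
for name, pct in sorted(stats.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{pct:6.2f}% {name}")
```

Because the counters are cumulative, these percentages describe the whole period since the counters were last reset, not the last 60 seconds.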