February 25, 2010

Resolving a secondary rlink stuck in the needs_recovery state

    When the secondary host hits a disk error for some reason and the disk group (dg) gets deported, the VVR configuration can be left broken. In that case, work through the steps below (not every step is always required, but each one should be checked).

1. Import the disk group:

vxdg import datadg
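
A quick way to confirm the import worked (using the disk group name from this example):

vxdg list

The output should list datadg as enabled.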

2. Start all replicated volumes:

vxvol start <vol_name>
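
With the volumes shown in the vxprint output below, for example, that means starting each data volume and the SRL (a sketch; substitute your own disk group and volume names):

vxvol -g datadg start vol1lv
vxvol -g datadg start vol2lv
vxvol -g datadg start srllv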

Now check with vxprint on the secondary host: every rvg, rlink, and volume should be in the ENABLED/ACTIVE state. If anything shows DISABLED or a similar state, investigate and repair it before continuing.

[root@secondary datadg]# vxprint
Disk group: datadg

TY NAME         ASSOC        KSTATE   LENGTH   PLOFFS   STATE    TUTIL0  PUTIL0
dg datadg       datadg       -        -        -        -        -       -

dm datadg01     sdb          -        4094608  -        -        -       -
dm datadg02     sdc          -        4094608  -        -        -       -

rv primary_secondary_rvg -   ENABLED  -        -        ACTIVE   -       -
rl rlk_primary_primary_secondary_ primary_secondary_rvg CONNECT - - ACTIVE - -
v  vol1lv       primary_secondary_rvg ENABLED 1048576 - ACTIVE   -       -
pl vol1lv-01    vol1lv       ENABLED  1048576  -        ACTIVE   -       -
sd datadg01-01  vol1lv-01    ENABLED  1048576  0        -        -       -
pl vol1lv-02    vol1lv       ENABLED  1048576  -        ACTIVE   -       -
sd datadg02-01  vol1lv-02    ENABLED  1048576  0        -        -       -
pl vol1lv-03    vol1lv       ENABLED  LOGONLY  -        ACTIVE   -       -
sd datadg01-05  vol1lv-03    ENABLED  32       LOG      -        -       -
pl vol1lv-04    vol1lv       ENABLED  LOGONLY  -        ACTIVE   -       -
sd datadg02-05  vol1lv-04    ENABLED  32       LOG      -        -       -
v  vol2lv       primary_secondary_rvg ENABLED 1048576 - ACTIVE   -       -
pl vol2lv-01    vol2lv       ENABLED  1048576  -        ACTIVE   -       -
sd datadg01-02  vol2lv-01    ENABLED  1048576  0        -        -       -
pl vol2lv-02    vol2lv       ENABLED  1048576  -        ACTIVE   -       -
sd datadg02-02  vol2lv-02    ENABLED  1048576  0        -        -       -
pl vol2lv-03    vol2lv       ENABLED  LOGONLY  -        ACTIVE   -       -
sd datadg01-04  vol2lv-03    ENABLED  32       LOG      -        -       -
pl vol2lv-04    vol2lv       ENABLED  LOGONLY  -        ACTIVE   -       -
sd datadg02-04  vol2lv-04    ENABLED  32       LOG      -        -       -
v  srllv        primary_secondary_rvg ENABLED 262144 SRL ACTIVE  -       -
pl srllv-01     srllv        ENABLED  262144   -        ACTIVE   -       -
sd datadg01-03  srllv-01     ENABLED  262144   0        -        -       -
pl srllv-02     srllv        ENABLED  262144   -        ACTIVE   -       -
sd datadg02-03  srllv-02     ENABLED  262144   0        -        -       -

3. Run the recover operation on the host that needs recovery:

[root@secondary datadg]# vxrecover -s

After this step everything should be back to normal; you can verify with the commands below. Here is a comparison of the vxprint output before and after running vxrecover:

[root@secondary datadg]# vxprint -lP
Disk group: datadg

Rlink:    rlk_primary_primary_secondary_
info:     timeout=500 packet_size=8400 rid=0.1100
          latency_high_mark=10000 latency_low_mark=9950
          bandwidth_limit=none
state:    state=ACTIVE
          synchronous=off latencyprot=off srlprot=autodcm
assoc:    rvg=primary_secondary_rvg
          remote_host=172.111.100.10 IP_addr=172.111.100.10 port=4145
          remote_dg=datadg
          remote_dg_dgid=1265752821.7.localhost.localdomain
          remote_rvg_version=unknown
          remote_rlink=rlk_secondary_primary_secondar
          remote_rlink_rid=0.1106
          local_host=172.111.100.20 IP_addr=172.111.100.20 port=4145
protocol: UDP/IP
flags:    write disabled attached consistent disconnected needs_recovery

[root@secondary datadg]# vxrecover -s
[root@secondary datadg]# vxprint -lP
Disk group: datadg

Rlink:    rlk_primary_primary_secondary_
info:     timeout=500 packet_size=8400 rid=0.1100
          latency_high_mark=10000 latency_low_mark=9950
          bandwidth_limit=none
state:    state=ACTIVE
          synchronous=off latencyprot=off srlprot=autodcm
assoc:    rvg=primary_secondary_rvg
          remote_host=172.111.100.10 IP_addr=172.111.100.10 port=4145
          remote_dg=datadg
          remote_dg_dgid=1265752821.7.localhost.localdomain
          remote_rvg_version=21
          remote_rlink=rlk_secondary_primary_secondar
          remote_rlink_rid=0.1106
          local_host=172.111.100.20 IP_addr=172.111.100.20 port=4145
protocol: UDP/IP
flags:    write enabled attached consistent connected


[root@secondary datadg]# vxrlink -g datadg verify rlk_primary_primary_secondary_
RLINK                REMOTE HOST          LOCAL HOST           STATUS     STATE
rlk_primary_primary_secondary_ 172.111.100.10       172.111.100.20       OK         ACTIVE
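
Once the link is healthy again, replication progress can also be watched from the primary side: vxrlink status reports how far the secondary lags behind. Run it on the primary host, with the primary-side rlink name:

vxrlink -g datadg status <rlink_name>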



February 10, 2010

Installing WebSphere 6.1 on AIX 5.3 when the file system is VxFS

    When installing WebSphere 6.1 on AIX 5.3 with the target file system on VxFS, the installer cannot correctly determine the size of the destination file system: it detects it as 0 MB and therefore never passes the prerequisite check. You can install with the following option:

./install -OPT noPrereqChecks="true"
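
Before skipping the checks, it is still worth confirming by hand that the VxFS target really has enough free space; the installer misreads the size, but df should report it correctly (the mount point below is only an example):

df -k /opt/IBM/WebSphere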



February 2, 2010

Solaris 10 SVM root mirror steps

Environment: Solaris 10 with two disks, c1t0d0 and c1t1d0. c1t0d0 holds the original / file system; c1t1d0 is the newly attached disk. The goal is to turn c1t0d0 and c1t1d0 into a RAID-1 root mirror.

Be careful: for SVM volumes, there must be a free slice on each disk to hold the state database (metadb) replicas. In this environment, slice s6 with 170 MB of space is used for this.

1. Run fdisk on the new disk c1t1d0 to create a Solaris partition.

2. Copy the original volume table of contents to the second disk:

bash-3.00# prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t1d0s2
fmthard:  New volume table of contents now in place.
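
To double-check the copy, compare the two labels (only the device name in the header comments should differ):

prtvtoc /dev/rdsk/c1t0d0s2 > /tmp/vtoc0
prtvtoc /dev/rdsk/c1t1d0s2 > /tmp/vtoc1
diff /tmp/vtoc0 /tmp/vtoc1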

3. Create the state database replicas on both disks:
bash-3.00# metadb -af -c 3 /dev/dsk/c1t0d0s6 /dev/dsk/c1t1d0s6
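
Verify that all replicas were created; with -c 3 on two slices there should be six in total:

metadb -i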

4. Create the metadevices. At this point attach only a one-way mirror; the other half will be attached after rebooting:


bash-3.00# metainit -f d11 1 1 /dev/dsk/c1t0d0s0
d11: Concat/Stripe is setup
bash-3.00# metainit -f d21 1 1 /dev/dsk/c1t1d0s0
d21: Concat/Stripe is setup
bash-3.00# metainit -f d12 1 1 /dev/dsk/c1t0d0s1
d12: Concat/Stripe is setup
bash-3.00# metainit -f d22 1 1 /dev/dsk/c1t1d0s1
d22: Concat/Stripe is setup
bash-3.00# metainit -f d13 1 1 /dev/dsk/c1t0d0s7
d13: Concat/Stripe is setup
bash-3.00# metainit -f d23 1 1 /dev/dsk/c1t1d0s7
d23: Concat/Stripe is setup
bash-3.00# metainit -f d1 -m d11
d1: Mirror is setup
bash-3.00# metainit -f d2 -m d12
d2: Mirror is setup
bash-3.00# metainit -f d3 -m d13
d3: Mirror is setup
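
metastat -p prints the configuration in md.tab format and is a compact way to confirm that, at this stage, each mirror carries only its first submirror (d11, d12, d13):

metastat -p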

5. Use metaroot to update the root entry in vfstab, then manually edit vfstab so the other file systems use the newly created metadevices:


bash-3.00# metaroot d1
bash-3.00# vi /etc/vfstab
"/etc/vfstab" 13 lines, 461 characters
#device         device          mount           FS      fsck    mount   mount
#to mount       to fsck         point           type    pass    at boot options
#
fd      -       /dev/fd fd      -       no      -
/proc   -       /proc   proc    -       no      -
/dev/md/dsk/d2  -       -       swap    -       no      -
/dev/md/dsk/d1  /dev/md/rdsk/d1 /       ufs     1       no      -
/dev/md/dsk/d3  /dev/md/rdsk/d3 /export/home    ufs     2       yes     -
/devices        -       /devices        devfs   -       no      -
sharefs -       /etc/dfs/sharetab       sharefs -       no      -
ctfs    -       /system/contract        ctfs    -       no      -
objfs   -       /system/object  objfs   -       no      -
swap    -       /tmp    tmpfs   -       yes     -
~
"/etc/vfstab" 13 lines, 452 characters

6. Flush and lock the file systems to sync them, then reboot:

bash-3.00# lockfs -fa
bash-3.00# init 6

7. After the reboot we can attach the remaining half of each mirror. The metattach command returns quickly, but the actual resync can take a long time depending on the size of the underlying file systems.

bash-3.00# metattach  d1 d21
d1: submirror d21 is attached
bash-3.00# metattach d2 d22
d2: submirror d22 is attached
bash-3.00# metattach d3 d23
d3: submirror d23 is attached
bash-3.00#

8. Use "metastat -i" to check the sync status of each submirror until all of them report "Okay", for example:

d11: Submirror of d1
    State: Okay
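
The resync runs in the background, so a simple loop can be used to wait for completion (a rough sketch; it polls once a minute):

# wait until no mirror reports a resync in progress, then show the states
while metastat | grep -q Resync; do
    sleep 60
done
metastat -i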

9. Install the GRUB boot blocks on the new disk (on SPARC, use installboot instead):

bash-3.00# installgrub  -fm /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c1t1d0s0
stage1 written to partition 0 sector 0 (abs 4096)
stage2 written to partition 0, 265 sectors starting at 50 (abs 4146)
stage1 written to master boot sector
bash-3.00#

10. Now you can boot from both hd0 and hd1 to verify the mirror works. Manually edit the GRUB menu.lst to use "root (hd0,0,a)" and "root (hd1,0,a)" for testing. Do not use the "findroot (rootfs0,0,a)" entry: the findroot command needs a signature entry under /boot/grub, so defining hd0 and hd1 directly is easier to test with, as in the sketch below.
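
A minimal menu.lst sketch with two such test entries (kernel and module paths are the Solaris 10 x86 defaults; the titles are made up):

title Solaris 10 (first disk, hd0)
root (hd0,0,a)
kernel /platform/i86pc/multiboot
module /platform/i86pc/boot_archive

title Solaris 10 (second disk, hd1)
root (hd1,0,a)
kernel /platform/i86pc/multiboot
module /platform/i86pc/boot_archive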

11. Many documents call for the boot-device and altbootpath parameters, which can be set with the eeprom command. In my test this was not strictly necessary, since you can simply switch the boot disk in the BIOS. If you do want to set it, consult other documentation; the command looks like the following, and on x86 the value is actually stored in /boot/solaris/bootenv.rc:

bash-3.00# eeprom altbootpath=/pci@0,0/pci15ad,1976@10/sd@1,0:a
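
To read the value back, query eeprom again or look at the file directly:

eeprom altbootpath
grep altbootpath /boot/solaris/bootenv.rc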

12. Now everything is in place. You can destroy one of the root disks and try to boot from the other. Pick the entry in the GRUB menu and edit the boot line so it points at the surviving disk:

root (hd0,0,a)

or

root (hd1,0,a)

