Do right things is better than do things right: 2010

September 15, 2010

Hints about working with Xming

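1. Do NOT use su to switch to another user and then start X clients through Xming; it will fail as shown below (this example is from a PuTTY session). After su, the new user does not have the X authority cookie that was created for the login user, so the forwarded display refuses the connection. Log in directly as the user you want to work with.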

[root@ContentServer66 ~]# su - oracle
[oracle@ContentServer66 ~]$ xclock
Xlib: connection to "localhost:11.0" refused by server
Xlib: PuTTY X11 proxy: MIT-MAGIC-COOKIE-1 data did not match
Error: Can't open display: localhost:11.0

2. If you are using PuTTY, as I am, you need to turn on X11 forwarding and set the X display location to localhost. Both settings are on the Connection > SSH > X11 page of the PuTTY configuration.
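
A quick way to confirm that forwarding is active before starting an X client such as xclock is to check that DISPLAY is set in the session. This is only a minimal sketch; the function name x11_display_ready is my own and is not part of Xming or PuTTY.

```shell
#!/bin/sh
# Hypothetical helper: report whether this session has an X display configured.
# With PuTTY X11 forwarding enabled, sshd sets DISPLAY (e.g. localhost:11.0).
x11_display_ready() {
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set; enable X11 forwarding and log in again" >&2
        return 1
    fi
    echo "DISPLAY=$DISPLAY"
}
```

Run x11_display_ready right after logging in; if it fails, fix the PuTTY settings before trying any X client.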

September 8, 2010

(Oracle) How to recreate the control file (Linux)

1. Log in as sysdba and dump the control file definition to a trace file:

[oracle@cs66 orcl]$ sqlplus "/ as sysdba"

SQL*Plus: Release 10.2.0.4.0 - Production on Wed Sep 8 01:48:43 2010

Copyright (c) 1982, 2007, Oracle. All Rights Reserved.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> alter database backup controlfile to trace;

Database altered.

SQL>

2. On my system the new trace file is under $ORACLE_BASE/product/10.2.0/oradata/$ORACLE_SID (in general it is written to the directory set by the user_dump_dest parameter). Check the timestamps to find the newly created .trc file. For me it was:

-rw-rw---- 1 oracle oracle 6711 Sep 8 01:48 orcl_ora_32195.trc

3. Open orcl_ora_32195.trc and save everything from the line "-- Set #1. NORESETLOGS case" to the end of the file into another file; I named mine reCreate.sql. The comments in the trace explain what each statement does.

4. Log in as sysdba again, shut down the instance, delete all the control files, and then run reCreate.sql:

[oracle@cs66 orcl]$ sqlplus " / as sysdba"

SQL*Plus: Release 10.2.0.4.0 - Production on Wed Sep 8 01:57:42 2010

Copyright (c) 1982, 2007, Oracle. All Rights Reserved.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> @/home/oracle/oracle/product/10.2.0/db_1/admin/orcl/udump/reCreate.sql

5. Check your control files; they have been recreated.
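
Step 3 above can be done with sed instead of an editor. This is a minimal sketch under the assumption that the marker line appears in the trace exactly as quoted in step 3; the function name extract_recreate_sql is made up for illustration.

```shell
#!/bin/sh
# Copy everything from the "Set #1. NORESETLOGS case" marker line
# to the end of the trace file into a new SQL script.
# Usage: extract_recreate_sql <trace-file> <output-sql>
extract_recreate_sql() {
    trace=$1
    out=$2
    sed -n '/-- Set #1\. NORESETLOGS case/,$p' "$trace" > "$out"
}
```

For example: extract_recreate_sql orcl_ora_32195.trc reCreate.sql. It is still worth reading the result before running it.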

(Oracle) OEM unavailable error solution

Sometimes you get an error like the following when you access your Enterprise Manager console:

503 Service Unavailable

Service is not initialized correctly. The Em Key is not configured properly. Run "emctl status emkey" for more details.

This may be caused by wrong file ownership, after which you need to reconfigure your emkey. The solution is:

1. Check your $ORACLE_HOME directory (and its subdirectories) and make sure all files are owned by the user oracle (or whichever user you installed the Oracle software as).

2. Reconfigure your emkey with the following commands:

[oracle@cs66 oracle]$ emctl status emkey
TZ set to PRC
Oracle Enterprise Manager 10g Database Control Release 10.2.0.4.0
Copyright (c) 1996, 2007 Oracle Corporation. All rights reserved.
Please enter repository password:

The Em Key is configured properly, but is not secure. Secure the Em Key by running "emctl config emkey -remove_from_repos".
[oracle@cs66 oracle]$ emctl config emkey -remove_from_repos
TZ set to PRC
Oracle Enterprise Manager 10g Database Control Release 10.2.0.4.0
Copyright (c) 1996, 2007 Oracle Corporation. All rights reserved.
Please enter repository password:

The Em Key has been removed from the Management Repository.
Make a backup copy of OH/sysman/config/emkey.ora file and store it on another machine.
WARNING: Encrypted data in Enterprise Manager will become unusable if the emkey.ora file is lost or corrupted.
[oracle@cs66 oracle]$ emctl status emkey
TZ set to PRC
Oracle Enterprise Manager 10g Database Control Release 10.2.0.4.0
Copyright (c) 1996, 2007 Oracle Corporation. All rights reserved.
Please enter repository password:

The Em Key is configured properly.

3. Now log in to your EM console and check again; it should be back. :)

http://192.168.100.129:1158/em/console/logon/logon
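
The ownership check in step 1 can be sketched with find. The function name files_not_owned_by is hypothetical; the find options used (! -user, -print) are standard.

```shell
#!/bin/sh
# List every file or directory under <dir> that is NOT owned by <owner>.
# Empty output means the ownership is consistent.
# Usage: files_not_owned_by <dir> <owner>
files_not_owned_by() {
    dir=$1
    owner=$2
    find "$dir" ! -user "$owner" -print
}
```

For example: files_not_owned_by "$ORACLE_HOME" oracle. Anything it prints is a candidate for chown.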

September 6, 2010

Oracle Upgrade from 10.2.0.1 to 10.2.0.4

The following is from http://faruqueahmed.wordpress.com; his upgrade log is quite clear. Nice guy!

Before the database upgrade, it is recommended to back up the PRODUCTION database.

Step 1: Stop all Oracle services

[oracle@ittestdb ~]$ echo $ORACLE_BASE
/u01/app/oracle
[oracle@ittestdb ~]$ echo $ORACLE_HOME
/u01/app/oracle/product/10.2.0/db_1
[oracle@ittestdb ~]$ echo $ORACLE_SID
orcl
[oracle@ittestdb ~]$
[oracle@ittestdb ~]$ emctl stop dbconsole
TZ set to Asia/Baghdad
Oracle Enterprise Manager 10g Database Control Release 10.2.0.1.0
Copyright (c) 1996, 2005 Oracle Corporation. All rights reserved.
http://ittestdb.amardhaka.com:1158/em/console/aboutApplication
Stopping Oracle Enterprise Manager 10g Database Control …
… Stopped.
[oracle@ittestdb ~]$ isqlplusctl stop
iSQL*Plus 10.2.0.1.0
Copyright (c) 2003, 2005, Oracle. All rights reserved.
Stopping iSQL*Plus …
iSQL*Plus stopped.
[oracle@ittestdb ~]$

[oracle@ittestdb ~]$ lsnrctl stop

LSNRCTL for Linux: Version 10.2.0.1.0 - Production on 08-FEB-2010 13:17:18

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=EXTPROC1)))
The command completed successfully
[oracle@ittestdb ~]$ sqlplus "/as sysdba"

SQL*Plus: Release 10.2.0.1.0 - Production on Mon Feb 8 13:17:29 2010

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options
[oracle@ittestdb ~]$ ps -ef|grep oracle
root     13754 32094 0 13:13 pts/1    00:00:00 su - oracle
oracle 13755 13754 0 13:13 pts/1    00:00:00 -bash
oracle 14525 13755 0 13:18 pts/1    00:00:00 ps -ef
oracle 14526 13755 0 13:18 pts/1    00:00:00 grep oracle
[oracle@ittestdb ~]$

Step 2: Install the Database Patch Set

[oracle@ittestdb ~]$ export DISPLAY=10.13.5.95:0.0
[oracle@ittestdb ~]$ /u01/stage/patch/Disk1/runInstaller
Starting Oracle Universal Installer…

Checking installer requirements…

Checking operating system version: must be redhat-3, SuSE-9, SuSE-10, redhat-4, redhat-5, UnitedLinux-1.0, asianux-1, asianux-2 or asianux-3
Passed

All installer requirements met.

Preparing to launch Oracle Universal Installer from /tmp/OraInstall2010-02-08_01-22-19PM. Please wait …[oracle@ittestdb ~]$ Oracle Universal Installer, Version 10.2.0.4.0 Production
Copyright (C) 1999, 2008, Oracle. All rights reserved.

[oracle@ittestdb ~]$

Step 3: Upgrade Database

[oracle@ittestdb ~]$ ps -ef|grep oracle
root     13754 32094 0 13:13 pts/1    00:00:00 su - oracle
oracle 13755 13754 0 13:13 pts/1    00:00:00 -bash
oracle 18304 13755 0 13:28 pts/1    00:00:00 ps -ef
oracle 18305 13755 0 13:28 pts/1    00:00:00 grep oracle
[oracle@ittestdb ~]$ sqlplus "/as sysdba"

SQL*Plus: Release 10.2.0.4.0 - Production on Mon Feb 8 13:28:53 2010

Connected to an idle instance.

SQL> STARTUP UPGRADE
ORACLE instance started.

Total System Global Area 1224736768 bytes
Fixed Size                  1267188 bytes
Variable Size             318769676 bytes
Database Buffers          889192448 bytes
Redo Buffers               15507456 bytes
Database mounted.
Database opened.
SQL> SPOOL /u01/stage/patch/Disk1/upgrade_info.log
SQL> @?/rdbms/admin/utlu102i.sql
Oracle Database 10.2 Upgrade Information Utility    02-08-2010 13:30:50
.
**********************************************************************
Database:
**********************************************************************
--> name:       ORCL
--> version:    10.2.0.1.0
--> compatible: 10.2.0.1.0
--> blocksize:  8192
.
**********************************************************************
Tablespaces: [make adjustments in the current environment]
**********************************************************************
--> SYSTEM tablespace is adequate for the upgrade.
.... minimum required size: 488 MB
.... AUTOEXTEND additional space required: 8 MB
--> UNDOTBS1 tablespace is adequate for the upgrade.
.... minimum required size: 400 MB
.... AUTOEXTEND additional space required: 370 MB
--> SYSAUX tablespace is adequate for the upgrade.
.... minimum required size: 246 MB
.... AUTOEXTEND additional space required: 16 MB
--> TEMP tablespace is adequate for the upgrade.
.... minimum required size: 58 MB
.... AUTOEXTEND additional space required: 38 MB
--> EXAMPLE tablespace is adequate for the upgrade.
.... minimum required size: 69 MB
.
**********************************************************************
Update Parameters: [Update Oracle Database 10.2 init.ora or spfile]
**********************************************************************
-- No update parameter changes are required.
.
**********************************************************************
Renamed Parameters: [Update Oracle Database 10.2 init.ora or spfile]
**********************************************************************
-- No renamed parameters found. No changes are required.
.
**********************************************************************
Obsolete/Deprecated Parameters: [Update Oracle Database 10.2 init.ora or spfile]
**********************************************************************
-- No obsolete parameters found. No changes are required
.
**********************************************************************
Components: [The following database components will be upgraded or installed]
**********************************************************************
--> Oracle Catalog Views         [upgrade] VALID
--> Oracle Packages and Types    [upgrade] VALID
--> JServer JAVA Virtual Machine [upgrade] VALID
--> Oracle XDK for Java          [upgrade] VALID
--> Oracle Java Packages         [upgrade] VALID
--> Oracle Text                  [upgrade] VALID
--> Oracle XML Database          [upgrade] VALID
--> Oracle Workspace Manager     [upgrade] VALID
--> Oracle Data Mining           [upgrade] VALID
--> OLAP Analytic Workspace      [upgrade] VALID
--> OLAP Catalog                 [upgrade] VALID
--> Oracle OLAP API              [upgrade] VALID
--> Oracle interMedia            [upgrade] VALID
--> Spatial                      [upgrade] VALID
--> Expression Filter            [upgrade] VALID
--> EM Repository                [upgrade] VALID
--> Rule Manager                 [upgrade] VALID
.

PL/SQL procedure successfully completed.

SQL> SPOOL OFF;

SQL> SPOOL /u01/stage/patch/Disk1/patch.log

SQL> @?/rdbms/admin/catupgrd.sql

……………………………………………

…………………………..

177    PROCEDURE selectTablespace( tsname IN varchar2 );
178
179    -- This procedure informs this package that the caller intends to do
180    -- point-in-time recovery on the specified tablespace. This procedure must
181    -- be called once for each tablespace in the recovery set.
182    -- It alter selected tablespace read only, also checks datafiles in the
183    -- selected tablespace.
184    --
185    -- Input parameters:
186    --   tsname
187    --     The tablespace name.
188    --
189    -- Exceptions:
190    --   WRONG_ORDER (ORA-29301)
191    --     wrong dbms_pitr package functions/procedure order.
192    --   WRONG_TSNAME (ORA-29304)
193    --     select tablespace does not exist
194    --   NOT_READ_ONLY (ORA-29305)
195    --     cannot alter the tablespace read only
196    --   FILE_OFFLINE (ORA-29306)
197    --     datafile is not online

SQL>
SQL>
SQL>
SQL> SPOOL OFF

SQL> SHUTDOWN IMMEDIATE
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> STARTUP
ORACLE instance started.

Total System Global Area 1224736768 bytes
Fixed Size                  1267188 bytes
Variable Size             335546892 bytes
Database Buffers          872415232 bytes
Redo Buffers               15507456 bytes
Database mounted.
Database opened.
SQL>

Recompile the invalid objects:

SQL> @?/rdbms/admin/utlrp.sql

……………………………..

…………………………………

TIMESTAMP
--------------------------------------------------------------------------------
COMP_TIMESTAMP UTLRP_END 2010-02-08 15:02:23

DOC> The following query reports the number of objects that have compiled
DOC> with errors (objects that compile with errors have status set to 3 in
DOC> obj$). If the number is higher than expected, please examine the error
DOC> messages reported with each object (using SHOW ERRORS) to see if they
DOC> point to system misconfiguration or resource constraints that must be
DOC> fixed before attempting to recompile these objects.
DOC>#

OBJECTS WITH ERRORS
-------------------

DOC> The following query reports the number of errors caught during
DOC> recompilation. If this number is non-zero, please query the error
DOC> messages in the table UTL_RECOMP_ERRORS to see if any of these errors
DOC> are due to misconfiguration or resource constraints that must be
DOC> fixed before objects can compile successfully.
DOC>#

ERRORS DURING RECOMPILATION
---------------------------

SQL>
SQL> select comp_name, version, status from sys.dba_registry;

COMP_NAME                           VERSION     STATUS
----------------------------------- ----------- ------
Oracle Database Catalog Views       10.2.0.4.0  VALID
Oracle Database Packages and Types  10.2.0.4.0  VALID
Oracle Workspace Manager            10.2.0.4.3  VALID
JServer JAVA Virtual Machine        10.2.0.4.0  VALID
Oracle XDK                          10.2.0.4.0  VALID
Oracle Database Java Packages       10.2.0.4.0  VALID
Oracle Expression Filter            10.2.0.4.0  VALID
Oracle Data Mining                  10.2.0.4.0  VALID
Oracle Text                         10.2.0.4.0  VALID
Oracle XML Database                 10.2.0.4.0  VALID
Oracle Rule Manager                 10.2.0.4.0  VALID
Oracle interMedia                   10.2.0.4.0  VALID
OLAP Analytic Workspace             10.2.0.4.0  VALID
Oracle OLAP API                     10.2.0.4.0  VALID
OLAP Catalog                        10.2.0.4.0  VALID
Spatial                             10.2.0.4.0  VALID
Oracle Enterprise Manager           10.2.0.4.0  VALID

SQL> exit

Now start the other services (listener, EM, iSQL*Plus, and so on).
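
The dba_registry check above is easy to do by eye for 17 rows, but it can also be filtered. The sketch below assumes you have spooled the query output to a text file and that each row ends with a version number followed by the status, as in the listing above. The function name invalid_components is my own.

```shell
#!/bin/sh
# Read "COMP_NAME VERSION STATUS" rows on stdin and print only the rows
# whose status is not VALID. Header, separator and blank lines are skipped
# because their second-to-last field does not look like a version number.
invalid_components() {
    awk 'NF >= 3 && $(NF-1) ~ /^[0-9]+(\.[0-9]+)+$/ && $NF != "VALID"'
}
```

For example: invalid_components < registry.log (registry.log is a hypothetical spool file name). No output means every component is VALID.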

August 27, 2010

Oracle component start/stop

Dir: $ORACLE_HOME/bin

Listener: lsnrctl

lsnrctl start/stop/status

EM: emctl, emdctl

emctl status/start/stop dbconsole

iSQL*Plus: isqlplusctl

Usage:
isqlplusctl start|stop
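
The three commands above can be wrapped in one small script. This is only a sketch: the function names run and oracle_components and the DRY_RUN variable are my own, and the ordering (listener first on start, last on stop) is an assumption rather than an Oracle requirement.

```shell
#!/bin/sh
# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Start or stop the listener, EM dbconsole and iSQL*Plus in one go.
# Usage: oracle_components start|stop
oracle_components() {
    action=$1
    case "$action" in
        start)
            run lsnrctl start
            run emctl start dbconsole
            run isqlplusctl start
            ;;
        stop)
            run isqlplusctl stop
            run emctl stop dbconsole
            run lsnrctl stop
            ;;
        *)
            echo "usage: oracle_components start|stop" >&2
            return 2
            ;;
    esac
}
```

Try DRY_RUN=1 first to see what would be executed, and make sure $ORACLE_HOME/bin is in your PATH.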

May 5, 2010

An addition example

#!/bin/sh
# Print the numbers 1 to 33, adding 1 to the counter on each pass.

i=1

while [ "$i" -lt 34 ]
do
    echo "$i"
    i=`expr $i + 1`
done

April 28, 2010

Increasing the number of open file descriptors – LINUX

Increasing the number of open file descriptors on Linux, taking 131072 as an example.

1. Raise the limit to 131072 for all users:

vi /etc/security/limits.conf

* soft nofile 131072

* hard nofile 131072

2. Raise the limit in the current shell:

ulimit -n 131072

3. You may also have to enable pam_limits for sshd and login under /etc/pam.d; otherwise the new limits do not take effect for sessions opened over ssh.

vi /etc/pam.d/sshd

Add

session required /lib/security/pam_limits.so

vi /etc/pam.d/login

Add

session required /lib/security/pam_limits.so

4. Restart sshd:

service sshd restart
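
After logging in again over ssh, it is worth confirming that the new limit is really in effect. A minimal sketch; the function name nofile_at_least is hypothetical.

```shell
#!/bin/sh
# Succeed if the soft open-file limit of the current shell is at least <want>.
# Usage: nofile_at_least <want>
nofile_at_least() {
    want=$1
    have=$(ulimit -n)
    # Some systems report "unlimited" instead of a number.
    if [ "$have" = "unlimited" ]; then
        return 0
    fi
    [ "$have" -ge "$want" ]
}
```

For example: nofile_at_least 131072 && echo ok. If it fails in a fresh ssh session, recheck the pam_limits lines from step 3.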

April 21, 2010

Replacing a failed drive in a Sun Server

Per our friend George Pagan we should be using the following procedure to replace failed/failing disks in Sun Servers:

Replacing failed sun disk

sudo vxdiskadm

Choose option 4 to remove the disk.

Then, from the OS:

sudo cfgadm -f -y -c unconfigure controller::dsk/target

For example if c1t0d0 is the disk to be removed:

sudo cfgadm -f -y -c unconfigure c1::dsk/c1t0d0

This will turn on either an amber light on older machines or a blue light on newer machines. Then it is safe to remove the drive.

After you insert the new drive, run, for example:

sudo cfgadm -c configure c1::dsk/c1t0d0

Then run:

devfsadm

vxdctl enable

Finally, choose option 5 in vxdiskadm to replace the disk.
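
The controller::dsk/target argument for cfgadm follows directly from the disk name, so it can be derived instead of typed. A minimal sketch, assuming the usual cXtYdZ disk naming; the function name cfgadm_ap_id is my own.

```shell
#!/bin/sh
# Build the cfgadm attachment point ID for a disk named like c1t0d0.
# The controller is the part of the name before the first "t".
# Usage: cfgadm_ap_id <disk>
cfgadm_ap_id() {
    disk=$1
    controller=${disk%%t*}
    echo "${controller}::dsk/${disk}"
}
```

For c1t0d0 this prints c1::dsk/c1t0d0, which is the value used in the unconfigure and configure commands above.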

April 16, 2010

ERROR: V-3-21268: /dev/vx/dsk/dgD280silo1/orgvol is corrupted. needs checking

Sometimes when you migrate from primary to secondary, the volume gets flagged as corrupted. Just run fsck to fix it, then mount it again:

# fsck -F vxfs /dev/vx/rdsk/dgD280silo1/orgvol
log replay in progress
replay complete - marking super-block as CLEAN
# mount -F vxfs /dev/vx/dsk/dgD280silo1/orgvol /orgvol

March 15, 2010

Recovering when the RVG is disabled

Sometimes your RVG cannot be started at all, and "vxprint -Vl" shows something like

"kernel=DISABLED"

In my case the SRL volume was in the NEEDSYNC state, which prevented the RVG from starting, so I repaired the SRL volume with the following steps:

1. Dissociate SRL:

# vxvol -g vrdg -f dis vvrsrl

vxvm:vxvol: WARNING: Rvg rvg1 needs recovery. Volume being dissociated may not be up-to-date

2. Stop the SRL:

# vxvol -g vrdg stop vvrsrl

3. Start the SRL. It will take some time for the SRL to resynchronize. You can run vxtask list from another window to monitor progress.

# vxvol -g vrdg start vvrsrl

4. Re-associate the SRL:

# vxvol -g vrdg aslog rvg1 vvrsrl

5. Recover the RVG:

# vxrecover -g vrdg -s

6. Start the RVG:

# vxrvg -g vrdg start rvg1

Reference from Symantec:

http://seer.entsupport.symantec.com/docs/268215.htm
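
Since the six steps always use the same three names, the sequence can be generated for review before anything is run. This sketch only prints the commands and never executes them. The function name srl_repair_commands is hypothetical, and passing -g on every command is my own choice for consistency.

```shell
#!/bin/sh
# Print the SRL repair sequence for a disk group, RVG and SRL volume.
# Nothing is executed; review the output, then run it by hand.
# Usage: srl_repair_commands <diskgroup> <rvg> <srl-volume>
srl_repair_commands() {
    dg=$1
    rvg=$2
    srl=$3
    cat <<EOF
vxvol -g $dg -f dis $srl
vxvol -g $dg stop $srl
vxvol -g $dg start $srl
vxvol -g $dg aslog $rvg $srl
vxrecover -g $dg -s
vxrvg -g $dg start $rvg
EOF
}
```

For the example in this post: srl_repair_commands vrdg rvg1 vvrsrl.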

March 8, 2010

(Repost) Recovering an unstartable volume

root@com00biiacc002:~ #> vxprint -g COM2_DATA_DG -hvt
V NAME         RVG/VSET/CO KSTATE   STATE    LENGTH   READPOL   PREFPLEX UTYPE
PL NAME         VOLUME       KSTATE   STATE    LENGTH   LAYOUT    NCOL/WID MODE
SD NAME         PLEX         DISK     DISKOFFS LENGTH   [COL/]OFF DEVICE   MODE
SV NAME         PLEX         VOLNAME NVOLLAYR LENGTH   [COL/]OFF AM/NM    MODE
SC NAME         PLEX         CACHE    DISKOFFS LENGTH   [COL/]OFF DEVICE   MODE
DC NAME         PARENTVOL    LOGVOL
SP NAME         SNAPVOL      DCO
EX NAME         ASSOC        VC                       PERMS    MODE     STATE

v COM_DATA_2   -            ENABLED ACTIVE   312475648 SELECT   -        fsgen
pl COM_DATA_2-01 COM_DATA_2 ENABLED STALE    312475648 CONCAT   -        WO
sd COM2_DATA_DG02-01 COM_DATA_2-01 COM2_DATA_DG02 0 312475648 0   AMS_WMS1_6 RLOC
pl COM_DATA_2-02 COM_DATA_2 ENABLED ACTIVE   312475648 CONCAT   -        RW
sd COM2_DATA_DG01-01 COM_DATA_2-02 COM2_DATA_DG01 0 312475648 0   AMS_WMS0_6 ENA
dc COM_DATA_2_dco COM_DATA_2 COM_DATA_2_dcl
v COM_DATA_2_dcl -          DETACHED DETACH   21888    SELECT    -        gen
pl COM_DATA_2_dcl-01 COM_DATA_2_dcl DISABLED RECOVER 21888 CONCAT -        RW
sd COM2_DATA_DG02-02 COM_DATA_2_dcl-01 COM2_DATA_DG02 312475648 21888 0 AMS_WMS1_6 ENA

The volume state has become DETACHED, so try to start the volume:

root@com00biiacc002:~ #> vxvol -g COM2_DATA_DG start COM_DATA_2_dcl
VxVM vxvol ERROR V-5-1-1198 Volume COM_DATA_2_dcl has no CLEAN or non-volatile ACTIVE plexes

Starting the volume fails because it has no plex in the CLEAN or ACTIVE state. Next, check the status of the disk group:

root@com00biiacc002:~ #> vxinfo -g COM2_DATA_DG
COM_DATA_2 fsgen Started
COM_DATA_2_dcl gen Unstartable

COM_DATA_2_dcl turns out to be an unstartable volume. Try to set its plex to the CLEAN state:

root@com00biiacc002:~ #> vxmend -g COM2_DATA_DG fix clean COM_DATA_2_dcl-01
VxVM vxmend ERROR V-5-1-854 Plex COM_DATA_2_dcl-01 not in STALE state

As the error message suggests, set the plex to the STALE state first:
root@com00biiacc002:~ #> vxmend -g COM2_DATA_DG fix stale COM_DATA_2_dcl-01
root@com00biiacc002:~ #> vxprint -g COM2_DATA_DG -hvt COM_DATA_2_dcl
V NAME         RVG/VSET/CO KSTATE   STATE    LENGTH   READPOL   PREFPLEX UTYPE
PL NAME         VOLUME       KSTATE   STATE    LENGTH   LAYOUT    NCOL/WID MODE
SD NAME         PLEX         DISK     DISKOFFS LENGTH   [COL/]OFF DEVICE   MODE
SV NAME         PLEX         VOLNAME NVOLLAYR LENGTH   [COL/]OFF AM/NM    MODE
SC NAME         PLEX         CACHE    DISKOFFS LENGTH   [COL/]OFF DEVICE   MODE
DC NAME         PARENTVOL    LOGVOL
SP NAME         SNAPVOL      DCO
EX NAME         ASSOC        VC                       PERMS    MODE     STATE

v COM_DATA_2_dcl - DETACHED DETACH 21888 SELECT - gen
pl COM_DATA_2_dcl-01 COM_DATA_2_dcl DISABLED STALE 21888 CONCAT - RW
sd COM2_DATA_DG02-02 COM_DATA_2_dcl-01 COM2_DATA_DG02 312475648 21888 0 AMS_WMS1_6 ENA
root@com00biiacc002:~ #>

Try to start the volume again:
root@com00biiacc002:~ #> vxvol -g COM2_DATA_DG start COM_DATA_2_dcl
VxVM vxvol ERROR V-5-1-1198 Volume COM_DATA_2_dcl has no CLEAN or non-volatile ACTIVE plexes

The volume still cannot be started because the plex is neither CLEAN nor ACTIVE. Now set the plex to the CLEAN state:
root@com00biiacc002:~ #>
root@com00biiacc002:~ #> vxmend -g COM2_DATA_DG fix clean COM_DATA_2_dcl-01

Start the volume:
root@com00biiacc002:~ #> vxvol -g COM2_DATA_DG start COM_DATA_2_dcl
root@com00biiacc002:~ #> vxprint -g COM2_DATA_DG -hvt COM_DATA_2_dcl
V NAME         RVG/VSET/CO KSTATE   STATE    LENGTH   READPOL   PREFPLEX UTYPE
PL NAME         VOLUME       KSTATE   STATE    LENGTH   LAYOUT    NCOL/WID MODE
SD NAME         PLEX         DISK     DISKOFFS LENGTH   [COL/]OFF DEVICE   MODE
SV NAME         PLEX         VOLNAME NVOLLAYR LENGTH   [COL/]OFF AM/NM    MODE
SC NAME         PLEX         CACHE    DISKOFFS LENGTH   [COL/]OFF DEVICE   MODE
DC NAME         PARENTVOL    LOGVOL
SP NAME         SNAPVOL      DCO
EX NAME         ASSOC        VC                       PERMS    MODE     STATE

v COM_DATA_2_dcl - ENABLED ACTIVE 21888 SELECT - gen
pl COM_DATA_2_dcl-01 COM_DATA_2_dcl ENABLED ACTIVE 21888 CONCAT - RW
sd COM2_DATA_DG02-02 COM_DATA_2_dcl-01 COM2_DATA_DG02 312475648 21888 0 AMS_WMS1_6 ENA

The volume and its plex are back to normal.
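
In a disk group with many volumes, the unstartable ones can be picked out of the vxinfo output with awk. A minimal sketch that assumes the three-column format shown above (name, usage type, state); the function name unstartable_volumes is my own.

```shell
#!/bin/sh
# Read "vxinfo -g <dg>" output on stdin and print the names of
# volumes whose state (third column) is Unstartable.
unstartable_volumes() {
    awk '$3 == "Unstartable" { print $1 }'
}
```

For example: vxinfo -g COM2_DATA_DG | unstartable_volumes. Each name it prints is a candidate for the fix stale, fix clean, start sequence above.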

February 25, 2010

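Fixing a secondary rlink in the needs_recovery state

When a disk error on the secondary host causes the disk group to be deported, the VVR configuration ends up in an error state. In that case, go through the following steps (not every step is always necessary, but each one should be checked):

1. Import the disk group:

vxdg import datadg

2. Start all replicated volumes:

vxvol start <vol_name>

If you now check with vxprint on the secondary host, every RVG, rlink and volume should be ENABLED and ACTIVE. If anything shows DISABLED or a similar state, check again and repair it.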

[root@secondary datadg]# vxprint
Disk group: datadg

TY NAME ASSOC KSTATE LENGTH PLOFFS STATE TUTIL0 PUTIL0
dg datadg datadg - - - - - -

dm datadg01 sdb - 4094608 - - - -
dm datadg02 sdc - 4094608 - - - -

rv primary_secondary_rvg -   ENABLED -        -        ACTIVE   -       -
rl rlk_primary_primary_secondary_ primary_secondary_rvg CONNECT - - ACTIVE - -
v vol1lv       primary_secondary_rvg ENABLED 1048576 - ACTIVE   -       -
pl vol1lv-01    vol1lv       ENABLED 1048576 -        ACTIVE   -       -
sd datadg01-01 vol1lv-01    ENABLED 1048576 0        -        -       -
pl vol1lv-02    vol1lv       ENABLED 1048576 -        ACTIVE   -       -
sd datadg02-01 vol1lv-02    ENABLED 1048576 0        -        -       -
pl vol1lv-03    vol1lv       ENABLED LOGONLY -        ACTIVE   -       -
sd datadg01-05 vol1lv-03    ENABLED 32       LOG      -        -       -
pl vol1lv-04    vol1lv       ENABLED LOGONLY -        ACTIVE   -       -
sd datadg02-05 vol1lv-04    ENABLED 32       LOG      -        -       -
v vol2lv       primary_secondary_rvg ENABLED 1048576 - ACTIVE   -       -
pl vol2lv-01    vol2lv       ENABLED 1048576 -        ACTIVE   -       -
sd datadg01-02 vol2lv-01    ENABLED 1048576 0        -        -       -
pl vol2lv-02    vol2lv       ENABLED 1048576 -        ACTIVE   -       -
sd datadg02-02 vol2lv-02    ENABLED 1048576 0        -        -       -
pl vol2lv-03    vol2lv       ENABLED LOGONLY -        ACTIVE   -       -
sd datadg01-04 vol2lv-03    ENABLED 32       LOG      -        -       -
pl vol2lv-04    vol2lv       ENABLED LOGONLY -        ACTIVE   -       -
sd datadg02-04 vol2lv-04    ENABLED 32       LOG      -        -       -
v srllv        primary_secondary_rvg ENABLED 262144 SRL ACTIVE -       -
pl srllv-01     srllv        ENABLED 262144   -        ACTIVE   -       -
sd datadg01-03 srllv-01     ENABLED 262144   0        -        -       -
pl srllv-02     srllv        ENABLED 262144   -        ACTIVE   -       -
sd datadg02-03 srllv-02     ENABLED 262144   0        -        -       -

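3. Run the recovery on the host that needs it:

[root@secondary datadg]# vxrecover -s

After this step everything should be fine, and you can check with the commands below. Here is the vxprint output before and after running vxrecover, for comparison: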

[root@secondary datadg]# vxprint -lP
Disk group: datadg

Rlink:    rlk_primary_primary_secondary_
info:     timeout=500 packet_size=8400 rid=0.1100
          latency_high_mark=10000 latency_low_mark=9950
          bandwidth_limit=none
state:    state=ACTIVE
          synchronous=off latencyprot=off srlprot=autodcm
assoc:    rvg=primary_secondary_rvg
          remote_host=172.111.100.10 IP_addr=172.111.100.10 port=4145
          remote_dg=datadg
          remote_dg_dgid=1265752821.7.localhost.localdomain
          remote_rvg_version=unknown
          remote_rlink=rlk_secondary_primary_secondar
          remote_rlink_rid=0.1106
          local_host=172.111.100.20 IP_addr=172.111.100.20 port=4145
protocol: UDP/IP
flags:    write disabled attached consistent disconnected needs_recovery

[root@secondary datadg]# vxrecover -s
[root@secondary datadg]# vxprint -lP
Disk group: datadg

Rlink:    rlk_primary_primary_secondary_
info:     timeout=500 packet_size=8400 rid=0.1100
          latency_high_mark=10000 latency_low_mark=9950
          bandwidth_limit=none
state:    state=ACTIVE
          synchronous=off latencyprot=off srlprot=autodcm
assoc:    rvg=primary_secondary_rvg
          remote_host=172.111.100.10 IP_addr=172.111.100.10 port=4145
          remote_dg=datadg
          remote_dg_dgid=1265752821.7.localhost.localdomain
          remote_rvg_version=21
          remote_rlink=rlk_secondary_primary_secondar
          remote_rlink_rid=0.1106
          local_host=172.111.100.20 IP_addr=172.111.100.20 port=4145
protocol: UDP/IP
flags:    write enabled attached consistent connected

[root@secondary datadg]# vxrlink -g datadg verify rlk_primary_primary_secondary_
RLINK REMOTE HOST LOCAL HOST STATUS STATE
rlk_primary_primary_secondary_ 172.111.100.10 172.111.100.20 OK ACTIVE
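
The difference between the two vxprint -lP listings is in the flags line, so a script can test for it. A minimal sketch; the function name rlink_needs_recovery is hypothetical, and it assumes the flags line starts at the beginning of the line as in the output above.

```shell
#!/bin/sh
# Read "vxprint -lP" output on stdin.
# Succeeds (exit 0) if any rlink has the needs_recovery flag set.
rlink_needs_recovery() {
    grep -q '^flags:.*needs_recovery'
}
```

For example: vxprint -lP | rlink_needs_recovery && echo "run vxrecover -s".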

February 10, 2010

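Installing WebSphere 6.1 on AIX 5.3 with a VxFS file system

When you install WebSphere 6.1 on AIX 5.3 onto a VxFS file system, the installer cannot determine the size of the target file system correctly. It reports 0 MB, so the prerequisite check never passes. You can install with the following option instead: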

./install -OPT noPrereqChecks="true"

February 2, 2010

Solaris 10 SVM root mirror steps

Environment: Solaris 10 with two disks, c1t0d0 and c1t1d0. c1t0d0 holds the original / file system and c1t1d0 is the newly attached disk. The goal is to make c1t0d0 and c1t1d0 a RAID 1 root mirror.

Note: for SVM volumes, each disk must have a free slice to hold the state database replicas (metadb). In this environment, slice s6 with 170 MB of space is used for this.

1. Run fdisk on the new disk c1t1d0 to create a Solaris partition.

2. Copy the volume table of contents from the original disk to the second disk:

bash-3.00# prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t1d0s2
fmthard: New volume table of contents now in place.

3. Create the state database replicas on the two disks:
bash-3.00# metadb -af -c 3 /dev/dsk/c1t0d0s6 /dev/dsk/c1t1d0s6

4. Create the metadevices. At this point, set up each mirror with only one submirror (a one-way mirror); the other half is attached after the reboot:

bash-3.00# metainit -f d11 1 1 /dev/dsk/c1t0d0s0
d11: Concat/Stripe is setup
bash-3.00# metainit -f d21 1 1 /dev/dsk/c1t1d0s0
d21: Concat/Stripe is setup
bash-3.00# metainit -f d12 1 1 /dev/dsk/c1t0d0s1
d12: Concat/Stripe is setup
bash-3.00# metainit -f d22 1 1 /dev/dsk/c1t1d0s1
d22: Concat/Stripe is setup
bash-3.00# metainit -f d13 1 1 /dev/dsk/c1t0d0s7
d13: Concat/Stripe is setup
bash-3.00# metainit -f d23 1 1 /dev/dsk/c1t1d0s7
d23: Concat/Stripe is setup
bash-3.00# metainit -f d1 -m d11
d1: Mirror is setup
bash-3.00# metainit -f d2 -m d12
d2: Mirror is setup
bash-3.00# metainit -f d3 -m d13
d3: Mirror is setup

5. Use metaroot to set up the root metadevice (it updates /etc/vfstab and /etc/system), then edit vfstab manually so that the other file systems also use the newly created metadevices:

bash-3.00# metaroot d1
bash-3.00# vi /etc/vfstab
"/etc/vfstab" 13 lines, 461 characters
#device         device          mount           FS      fsck    mount   mount
#to mount       to fsck         point           type    pass    at boot options
#
fd      -       /dev/fd fd      -       no      -
/proc   -       /proc   proc    -       no      -
/dev/md/dsk/d2 -       -       swap    -       no      -
/dev/md/dsk/d1 /dev/md/rdsk/d1 /       ufs     1       no      -
/dev/md/dsk/d3 /dev/md/rdsk/d3 /export/home    ufs     2       yes     -
/devices        -       /devices        devfs   -       no      -
sharefs -       /etc/dfs/sharetab       sharefs -       no      -
ctfs    -       /system/contract        ctfs    -       no      -
objfs   -       /system/object objfs   -       no      -
swap    -       /tmp    tmpfs   -       yes     -
~
"/etc/vfstab" 13 lines, 452 characters

6. Flush all file systems to disk with lockfs, then reboot:

bash-3.00# lockfs -fa
bash-3.00# init 6

7. After the reboot, attach the other half to each mirror. The metattach command returns quickly, but the actual resync may take a long time, depending on the size of the underlying file system.

bash-3.00# metattach d1 d21
d1: submirror d21 is attached
bash-3.00# metattach d2 d22
d2: submirror d22 is attached
bash-3.00# metattach d3 d23
d3: submirror d23 is attached
bash-3.00#

8. Use "metastat -i" to check the sync status of each submirror until all of them report "Okay", for example:

d11: Submirror of d1
State: Okay

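9. Install the GRUB boot sector (on SPARC, use installboot instead). The example below is for c1t0d0; the second disk, c1t1d0, needs it too if you want to boot from it: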

bash-3.00# installgrub -fm /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c1t0d0s0
stage1 written to partition 0 sector 0 (abs 4096)
stage2 written to partition 0, 265 sectors starting at 50 (abs 4146)
stage1 written to master boot sector
bash-3.00#

10. Now you can boot from both hd0 and hd1 to test that the mirror works. Edit the GRUB menu.lst manually to use "root (hd0,0,a)" and "root (hd1,0,a)" for the test. Do not use the "findroot (rootfs0,0,a)" item: the findroot command needs an entry under /boot/grub, so naming hd0 and hd1 directly is easier for testing.

11. Many documents say that the boot-device and altbootpath parameters are needed; they can be defined with the eeprom command. In my test this was not strictly required, because you can switch the boot disk in the BIOS. If you want to set them, check other documentation for the details. The command looks like the following, and the value is actually stored in /boot/solaris/bootenv.rc:

bash-3.00# eeprom altbootpath=/pci@0,0/pci15ad,1976@10/sd@1,0:a

12. Now everything is done. You can destroy one of your root disks and try to boot from the other one. Select the entry in the GRUB menu and edit the boot line so that it points to the right disk:

root (hd0,0,a) or

root (hd1,0,a)
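
While waiting for the resync in step 8, the submirror states can be filtered out of the metastat output instead of reading the whole listing. This is a sketch that assumes each submirror section starts with a line like "d11: Submirror of d1" followed by a "State:" line, as in the example in step 8. The function name submirrors_not_okay is my own.

```shell
#!/bin/sh
# Read metastat output on stdin and print the name of every submirror
# whose first State line is anything other than "Okay".
submirrors_not_okay() {
    awk '
        /^d[0-9]+: Submirror of/ { name = $1; sub(/:$/, "", name) }
        $1 == "State:" && name != "" {
            if ($2 != "Okay") print name
            name = ""
        }
    '
}
```

For example: metastat | submirrors_not_okay. Empty output means every submirror reports Okay and it is safe to move on to the boot test.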