Friday, 23 September 2011

Solaris Network configuration


###Network Configuration Overview###

Two modes:
 1. Local Files Mode - config is defined statically via key files
 2. Network Client Mode - DHCP is used to auto-config interface(s)

Current Dell PE server has 3 NICs:
 1. e1000g0 - plumbed (configured for network client mode)
 2. iprb0 - unplumbed
 3. iprb1 - unplumbed

One mandatory virtual interface: lo0 (loopback)

Determine physical interfaces using 'dladm show-dev' or 'dladm show-link'
Determine plumbed and loopback interfaces using 'ifconfig -a'

NIC naming within Solaris OS: i.e. e1000g0 - e1000g(driver name) 0(instance)
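
The naming rule above (driver name followed by instance number) can be sketched in Python. This is only an illustration: the function name split_nic_name is hypothetical, and the sketch assumes the instance is the trailing run of digits in the name.

```python
import re

def split_nic_name(name):
    """Split a NIC name such as 'e1000g0' into (driver, instance)."""
    # The driver part must end in a non-digit so that digits inside the
    # driver name (the '1000' in 'e1000g') are not mistaken for the instance.
    match = re.fullmatch(r"(.*\D)(\d+)", name)
    if match is None:
        raise ValueError(f"not a driver+instance name: {name!r}")
    return match.group(1), int(match.group(2))

print(split_nic_name("e1000g0"))  # ('e1000g', 0)
print(split_nic_name("iprb1"))    # ('iprb', 1)
```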

Layers 2 & 3 info. - ifconfig -a, or ifconfig e1000g0
Layer 1 info. - dladm show-dev, or dladm show-link

###Key network configuration files###
svcs -a | grep physical - state of the network/physical SMF service
svcs -a | grep loopback - state of the network/loopback SMF service

1. IP Address - /etc/hostname.e1000g0, /etc/hostname.iprb0, /etc/hostname.iprb1
2. Domain name - /etc/defaultdomain - linuxcbt.internal
3. Netmask - /etc/inet/netmasks - 192.168.1.0 255.255.255.0
4. Hosts database - /etc/hosts, /etc/inet/hosts - loopback & ALL interfaces
5. Client DNS resolver file - /etc/resolv.conf
6. Default Gateway - /etc/defaultrouter - 192.168.1.1, 172.16.20.1, 10.0.0.1
7. Node name - /etc/nodename
8. Name service configuration file - /etc/nsswitch.conf

netstat -D - returns DHCP configuration for ALL interfaces
ifconfig -a - returns configuration for ALL interfaces


To transition from network client (DHCP) mode to local files (static) mode:
 1. Rename or remove /etc/dhcp.e1000g0 so that the DHCP agent is NOT invoked
 2. echo "linuxcbtsun1" > /etc/nodename
 3. Reboot the system


###Plumb/enable the iprb0 100Mb/s interface###
Plumbing interfaces is analogous to enabling interfaces
Note: 172.16.20.11 is a Linux host waiting to communicate with iprb0 interface
Steps:
 1. ifconfig iprb0 plumb up - this will enable iprb0 interface
 2. ifconfig iprb0 172.16.20.10 netmask 255.255.255.0 - this will assign the layer-3 IPv4 address

Steps to UNplumb an interface:
 1. ifconfig iprb0 down unplumb - bring the interface down first, then unplumb it

###Ensure that newly-plumbed interface settings persist across reboots###
Steps include updating/creating the following files:
 1. echo "172.16.20.10" > /etc/hostname.iprb0
 2. create entry in /etc/hosts - 172.16.20.10 linuxcbtsun1
 3. echo "172.16.20.0 255.255.255.0" >> /etc/inet/netmasks
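
The three entries above can be derived from one address and netmask. The Python sketch below only builds the text; it does not write any file. The function name persistent_config is hypothetical, and the standard ipaddress module is used to compute the network number for /etc/inet/netmasks.

```python
import ipaddress

def persistent_config(interface, address, netmask, hostname):
    """Return {path: line} for the three files listed in the steps above."""
    # strict=False lets us pass a host address and still get its network.
    network = ipaddress.ip_network(f"{address}/{netmask}", strict=False)
    return {
        f"/etc/hostname.{interface}": f"{address}\n",
        "/etc/hosts": f"{address} {hostname}\n",
        "/etc/inet/netmasks": f"{network.network_address} {netmask}\n",
    }

for path, line in persistent_config(
        "iprb0", "172.16.20.10", "255.255.255.0", "linuxcbtsun1").items():
    print(path, "<-", line.strip())
```

The lines for /etc/hosts and /etc/inet/netmasks are meant to be appended, matching the '>>' used in step 3.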

Note: To bring an interface down, execute:
ifconfig interface_name down
ifconfig iprb0 down && ifconfig iprb0 - brings iprb0 down, then displays its state


###Sub-interfaces/Logical Interfaces###
e1000g0(physical interface) - 192.168.1.50(Primary Apache website)
                              192.168.1.51(Secondary Apache website)
                              192.168.1.52(Used for SSH)

iprb0 - 172.16.20.10
iprb1

Use 'ifconfig interface_name addif ip_address'
ifconfig e1000g0 addif 192.168.1.51 (RFC 1918 address; netmask defaults to the classful /24)

Note: This will automatically create an 'e1000g0:1' logical interface
Note: Solaris places new logical interface in DOWN mode by default
Note: use 'ifconfig e1000g0:1 up' to bring the interface up

Note: logical/sub-interfaces are contingent upon physical interfaces
Note: if the physical interface is down, so are its logical interface(s)
Note: outbound connections are sourced using the IP address of the physical interface

###Save logical/sub-interface configuration for persistence across reboots###

1. gedit /etc/hostname.e1000g0:1 - 192.168.1.51
2. gedit /etc/hostname.e1000g0:2 - 192.168.1.52
3. Optionally update /etc/hosts - /etc/inet/hosts
4. Optionally update /etc/inet/netmasks - when subnetting
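
The file naming used in steps 1 and 2 follows a simple pattern, sketched here in Python. The function name logical_interface_files is hypothetical, and the sketch assumes logical interface numbers start at 1 with none already in use.

```python
def logical_interface_files(physical, addresses):
    """Map /etc/hostname.<physical>:<n> to each extra address, n = 1, 2, ..."""
    return {
        f"/etc/hostname.{physical}:{n}": address
        for n, address in enumerate(addresses, start=1)
    }

files = logical_interface_files("e1000g0", ["192.168.1.51", "192.168.1.52"])
for path, address in files.items():
    print(path, "-", address)
```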

Note: To remove logical interface execute the following:
ifconfig physical_interface_name removeif ip_address
ifconfig iprb0 removeif 172.16.20.20


###/etc/nsswitch.conf - name service configuration information ###
Functions as a policy/rules file for various name-resolution lookups:
 1. DNS
 2. passwd(/etc/passwd,/etc/shadow),group(/etc/group)
 3. protocols(/etc/inet/protocols)
 4. ethers or mac-to-IP mappings
 5. hosts - where to look for hostname resolution: files(/etc/hosts) dns(/etc/resolv.conf)
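
To illustrate how the file expresses lookup order, the Python sketch below parses nsswitch.conf-style text into an ordered list of sources per database. The function name parse_nsswitch and the SAMPLE text are hypothetical, not copied from a real system. Action items such as [NOTFOUND=return] are kept as plain tokens rather than interpreted.

```python
def parse_nsswitch(text):
    """Return {database: [sources in lookup order]} from nsswitch.conf text."""
    policy = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line or ":" not in line:
            continue
        database, sources = line.split(":", 1)
        policy[database.strip()] = sources.split()
    return policy

# Illustrative sample only.
SAMPLE = """\
# name service switch sample
passwd: files
hosts:  files dns
"""

print(parse_nsswitch(SAMPLE)["hosts"])  # ['files', 'dns']
```

With "hosts: files dns", /etc/hosts is consulted first and DNS second.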

============================== NETSTAT ========================

###NETSTAT###

Lists connections for ALL protocols & address families to and from machine
Address Families (AF) include:
 INET - ipv4
 INET6 - ipv6
 UNIX - Unix Domain Sockets(Solaris/FreeBSD/Linux/etc.)

Protocols Supported in INET/INET6 include:
 TCP, IP, ICMP(PING(echo/echo-reply)), IGMP, RAWIP, UDP(DHCP,TFTP,etc.)

Lists routing table
Lists DHCP status for various interfaces
Lists net-to-media table - network to MAC(network card) table

###NETSTAT Usage###
netstat - returns sockets by protocol using /etc/services for lookup
/etc/nsswitch.conf is consulted by netstat to resolve names for IPs

netstat -a - returns ALL protocols for ALL address families (TCP/UDP/UNIX)

netstat -an - -n option disables name resolution of hosts & ports

netstat -i - returns the state of interfaces. Pay attention to the errors/collisions/queue columns when troubleshooting performance
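
As a rough aid for reading 'netstat -i' output, the Python sketch below flags interfaces whose error, collision, or queue counters are non-zero. The function name interfaces_with_problems is hypothetical, the SAMPLE counters are invented for illustration, and the sketch assumes the usual column header (Name, Mtu, Net/Dest, Address, Ipkts, Ierrs, Opkts, Oerrs, Collis, Queue) with one whitespace-separated value per column.

```python
WATCHED = ("Ierrs", "Oerrs", "Collis", "Queue")

def interfaces_with_problems(netstat_i_output):
    """Return {interface: counters} for rows with any non-zero watched column."""
    lines = [l for l in netstat_i_output.splitlines() if l.strip()]
    header = lines[0].split()
    flagged = {}
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        counters = {name: int(row[name]) for name in WATCHED}
        if any(counters.values()):
            flagged[row["Name"]] = counters
    return flagged

# Invented sample data, shaped like 'netstat -i' output.
SAMPLE = """\
Name    Mtu  Net/Dest    Address      Ipkts Ierrs Opkts Oerrs Collis Queue
lo0     8232 loopback    localhost    1200  0     1200  0     0      0
e1000g0 1500 192.168.1.0 192.168.1.50 90210 3     70455 0     12     0
"""

print(interfaces_with_problems(SAMPLE))
```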

netstat -m - returns STREAMS memory statistics

netstat -p - returns net-to-media info (MAC/layer-2 info.) i.e. arp

netstat -P protocol (ip|ipv6|icmp|icmpv6|tcp|udp|rawip|raw|igmp) - returns active sockets for selected protocol

netstat -r - returns routing table

netstat -D - returns DHCP configuration (lease duration/renewal/etc.)

netstat -an -f address_family
netstat -an -f inet|inet6|unix
netstat -an -f inet - returns ipv4 only information

netstat -n -f inet
netstat -anf inet -P tcp
netstat -anf inet -P udp


Thursday, 1 September 2011

ORA-26040/ORA-01578: ORACLE data block corrupted

Do you know which segment was affected by the corrupted block?


Execute the SQL below to see which segment was affected. If the segment is an index, for example, you just need to drop it and re-create it.

SELECT SEGMENT_TYPE, OWNER||'.'||SEGMENT_NAME
FROM DBA_EXTENTS
WHERE FILE_ID = 9
AND 25759 BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1;
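
The BETWEEN condition in the query selects the extent whose block range covers the corrupted block. The Python sketch below shows the same arithmetic. The function name extent_contains is hypothetical, and the extent boundaries in the example (start block 25728, 128 blocks) are invented for illustration.

```python
def extent_contains(block_id, blocks, target_block):
    """True when target_block lies in [block_id, block_id + blocks - 1]."""
    return block_id <= target_block <= block_id + blocks - 1

# A made-up extent starting at block 25728 with 128 blocks covers
# blocks 25728 through 25855, so it contains block 25759.
print(extent_contains(25728, 128, 25759))  # True
# The extent that would follow it (starting at 25856) does not.
print(extent_contains(25856, 128, 25759))  # False
```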
======================================================================================

ORA-1578 / ORA-26040 Corrupt blocks by NOLOGGING - Error explanation and solution

When a segment is defined with the NOLOGGING attribute and if a NOLOGGING/UNRECOVERABLE operation updates the segment, the online redo log file is updated with minimal information to invalidate the affected blocks when a RECOVERY is later performed.  If the associated redo/archived log file is used to RECOVER the data files, Oracle invalidates such blocks and the error ORA-26040 along with error ORA-1578 are reported by SQL statements in the next block reads.  Errors Example:
SQL> select * from test_nologging;
ORA-01578: ORACLE data block corrupted (file # 11, block # 84)
ORA-01110: data file 4: '/oradata/users.dbf'
ORA-26040: Data block was loaded using the NOLOGGING option
The NOLOGGING attribute is stored in column LOGGING in data dictionary views like:  DBA_TABLES, DBA_INDEXES, DBA_LOBS, DBA_TAB_PARTITIONS, DBA_LOB_PARTITIONS, DBA_TAB_SUBPARTITIONS, etc.  LOGGING='NO' indicates NOLOGGING. 
The way for Oracle to identify that the block was previously invalidated due to NOLOGGING is by updating most of the bytes in that block with 0xff only if that "invalidate" redo is applied in a Recovery.  The block is then marked as Soft Corrupt meaning that the next block read will report the ORA-1578/ORA-26040 errors. 
The SCN in the block corresponds to the SCN in the REDO RECORD for when the "INVALIDATE" change was applied in a recovery. This is useful to know the timestamp for when the block was marked as soft corrupt due to NOLOGGING.

RMAN/DBV and Corrupted Blocks by NOLOGGING

DBV prints the generic message DBV-200 in RDBMS versions lower than 10.2.0.4 and error DBV-201 in RDBMS versions greater than or equal to 10.2.0.4 (Note 5031712.8):
DBV-00200: Block, dba 46137428, already marked corrupted
DBV-00201: Block, DBA 46137428, marked corrupt for invalid redo application
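
The DBA (data block address) in the DBV messages packs a relative file number and a block number into one 32-bit value. The Python sketch below decodes it, assuming the standard smallfile layout (upper 10 bits for the relative file number, lower 22 bits for the block number). The function name decode_dba is hypothetical.

```python
def decode_dba(dba):
    """Split a 32-bit data block address into (relative file#, block#)."""
    relative_file = dba >> 22   # upper 10 bits
    block = dba & 0x3FFFFF      # lower 22 bits
    return relative_file, block

print(decode_dba(46137428))  # (11, 84)
```

The result matches the "file # 11, block # 84" reported by ORA-01578 in the earlier example. Inside the database, DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE and DBMS_UTILITY.DATA_BLOCK_ADDRESS_BLOCK perform the same split.
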
In RDBMS versions lower than 10.2.0.5 RMAN reports it with a generic message like:
RMAN reports it in v$database_block_corruption with CORRUPTION_TYPE=LOGICAL
When there is a generic message besides the error ORA-26040, a block dump can be taken to check whether the byte 0xff fills the block. Alternatively, if the block is associated with a segment, try to read it with a SQL statement; errors ORA-1578/ORA-26040 will be produced if the block is corrupt due to a recovery with a NOLOGGING operation. For RMAN to identify whether the block is corrupt by NOLOGGING, an enhancement has been provided in Bug 7396077. See Note 7396077.8.
RMAN backups don't fail due to NOLOGGING corrupt blocks. In general RMAN does not fail with soft corrupt blocks, so the MAXCORRUPT clause is not necessary in such cases.

Important change in 11g

FORCE LOGGING is irrelevant in NOARCHIVELOG mode; this was a change introduced in 11g. Reference Note 1071869.1

SOLUTION

Note that the data inside the affected blocks is not salvageable. Methods like "Media Recovery" or "RMAN blockrecover" will not fix the problem unless the data file was backed up after the NOLOGGING operation was registered in the Redo Log. To resolve the errors when the segment is not an INDEX, the segment can be recovered from a backup such as an export dump, or from another source. If backups are not available, the segment might be recreated using the following steps:
  • Identify the object as described in Note 819533.1 
  • If it is an INDEX, drop/create the index. 
  • If it is a TABLE then procedure DBMS_REPAIR.SKIP_CORRUPT_BLOCKS can be used to skip the corrupt block in SQL statements and decide to re-create the table. Note 556733.1 has an example of DBMS_REPAIR. 
  • If it is a LOB segment associated to a LOB column in a Table, use Note 293515.1
  • If the error is produced in a Physical STANDBY database, the option is to restore the affected file from the PRIMARY database (only if the problem is not present in the PRIMARY).
Run script provided in Note 472231.1 to identify any additional corrupted objects.

References

NOTE:1071869.1 - ORA-1578 ORA-26040 in 11g for DIRECT PATH with NOARCHIVELOG even if LOGGING is enabled
NOTE:290161.1 - The Gains and Pains of Nologging Operations
NOTE:293515.1 - ORA-1578 ORA-26040 in a LOB segment - Script to solve the errors
NOTE:472231.1 - How to identify all the Corrupted Objects in the Database reported by RMAN
NOTE:556733.1 - DBMS_REPAIR SCRIPT
NOTE:7396077.8 - Bug 7396077 - RMAN does not differentiate NOLOGGING corrupt blocks that produce ORA-1578/ORA-26040
NOTE:819533.1 - How to identify the corrupt Object reported by ORA-1578 / RMAN / DBVERIFY