Showing posts with label Solaris 9.

Thursday, October 31, 2019

How to Kill a Zombie in Solaris


Abstract:

When a parent spawns a child process, the child signals the parent when it exits or is terminated. If the parent dies first, the init process inherits the children and receives the signals when they die. Collecting these exit signals is called "reaping". Sometimes, things do not go as planned. It is a good topic for Halloween.

[artwork for "ZombieLoad" malware, courtesy zombieloadattack]

When things do not go as planned:

It may take a few minutes for the exit signal to be reaped by a parent or init process, which is quite normal.

If child processes are dying and the parent is not reaping their exit signals, each child remains in the process table and becomes a Zombie, taking no Memory or CPU but consuming a process slot. Under a modern OS like Solaris, the process table can hold millions of entries, yet zombies still consume kernel and userland resources whenever the process table needs to be parsed.

Identifying Zombies

Zombies are most easily identified as "defunct" processes.
# ps -ef | grep defunct
root 1260 1 0 - ? 0:00 <defunct>
This defunct process would normally be reaped by its parent process, which here is "1" or init, but in this case we can clearly see that the process is not disappearing.
# ps -ef | grep init
root 1 0 0 Oct 25 ? 8:51 /sbin/init
But why call them Zombies and not just Defunct?
$ ps -elf | egrep '(UID|defunct)'
F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
0 Z root 125 4549 0 0 - - 0 - - ? 0:00 <defunct>
The "S" or "State" flag identifies the defunct process with a "Z" for Zombie, and all can see them.

(Plus, this is being published on Halloween, or All Hallows' Eve, the day before All Hallows' Day or All Saints' Day... this is when people remember the deaths of the "hallows" or Saints & Martyrs, who had passed on before. So, let's also remember the deaths of the processes!)


[The Grim Reaper, courtesy Encyclopedia Britannica]

To Kill a Zombie:

How does one kill a Zombie?
Well, they are already dead... in the movies, they are shot in the head.
In the modern operating system world of Solaris, we seek the reaper, we Don't Fear The Reaper.

The tool is called Process Reap or "preap" - the manual page is wonderfully descriptive!
# preap 1260
1260: exited with status 0
It should be noted that processes being traced cannot be reaped, that forcibly reaping a child can damage the parent process, and that the OS may also restrict the reaping of recently terminated processes.

To force a reaping, one can place a proverbial "bullet in the head" of the zombie.
# preap -F 125
125: exited with status 0
So, there we go, two dead zombies, see how they no longer run.
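If an application has left many zombies behind, they can also be reaped in bulk. A minimal sketch, reusing the ps keywords from the sketch above; exercise it on a test box first, since reaping children out from under a parent that is still alive may confuse that parent:

# for pid in `ps -eo pid,s | nawk '$2 == "Z" { print $1 }'` ; do preap $pid ; done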

Conclusion:

This administrator has personally seen poorly written C code leave thousands of zombies behind daily. The application development team no longer had any C programmers on staff, so preap was a good option. It should be exercised carefully on a development or test box, to evaluate the effect on the application, before conducting the procedure in production.

Wednesday, February 21, 2018

Solaris: ISO or USB Media with Update Releases

Occasionally, managing infrastructure requires having original images of various releases on hand, for activities such as booting a system or maintaining compatibility with various file systems.

Oracle Support maintains a list of media with update releases for Solaris, dating back to Solaris 8.



One use case is to create a Solaris 11 OS image readable by the latest versions of Solaris 10.

Monday, April 1, 2013

SunFire 280R: 3737 Days of Uptime

[SunFire 280R, courtesy codigounix.blogspot.com]

For anyone who has cared for & fed systems - 10 years of uptime is phenomenal.

[embedded video]

Background:
This platform was located in Hungary. I say was, since it was relocated. This video was taken during the last hours before its relocation, thus ending 10 years of uptime. The platform was involved in processing outbound internet-facing traffic; the last of its production-facing traffic load had been removed a number of months earlier. Mid-way through the video, a short interlude shows a Solaris 11 platform with a ZFS kernel dump - note, that was not the tribute platform in question, since ZFS was not around when this 280R was first powered up; the 280R is a Solaris 9 platform. The music used in this tribute video is "Born to Die", performed by Lana Del Rey.
A short post and discussion on Slashdot surrounding the shutdown and relocation of this system has been noted.

Wednesday, May 30, 2012

Ops Center: Manage Mission Critical Apps in the Cloud



Abstract:
This short video demonstrates how Oracle Ops Center, included in all Oracle hardware service contracts, manages a private cloud hosting applications.
[embedded object: http://c.brightcove.com/services/viewer/federated_f9?isVid=1&isUI=1]

Wednesday, March 28, 2012

SSH Debugging: Public and Private Keys




Abstract:

There have been several articles published on forwarding ports with SSH over an encrypted tunnel and setting up automatic SSH auto-login using an encrypted ssh tunnel. This is the third in the series, discussing a particular problem where different clients experience different login symptoms while trying to log into a common server.

Solaris 10 Client Symptom:

If a Solaris 10 client cannot get a password prompt from a server, you might see the following error:

solaris10/user$ ssh badserver
no common kex alg: client
'diffie-hellman-group-exchange-sha1,diffie-hellman-group1-sha1', server
'gss-group1-sha1-toWM5Slw5Ew8Mqkay+al2g=='
Solaris 9 Client Symptom:

If a Solaris 9 client cannot get a password prompt from a server, you might see the following error:

solaris9/user$ ssh badserver
no kex alg

Solaris Server Root Cause:

If the Solaris 9 and Solaris 10 clients are trying to attach to the same server, check whether the private and public ssh host keys are missing from the /etc/ssh directory:

badserver/root# ls -al /etc/ssh
-rwxr-xr-x 1 root sys 88301 Jan 21 2005 moduli
-rwxr-xr-x 1 root sys 861 Jan 21 2005 ssh_config
-rwxr-xr-x 1 root sys 5025 Aug 6 2010 sshd_config
The /etc/ssh directory should look more like the following:

goodserver/root# ls -al /etc/ssh
-rw-r--r-- 1 root sys 88301 Jan 21 2005 moduli
-rw-r--r-- 1 root sys 861 Jan 21 2005 ssh_config
-rw------- 1 root root 668 Apr 10 2009 ssh_host_dsa_key
-rw-r--r-- 1 root root 602 Apr 10 2009 ssh_host_dsa_key.pub
-rw------- 1 root root 887 Apr 10 2009 ssh_host_rsa_key
-rw-r--r-- 1 root root 222 Apr 10 2009 ssh_host_rsa_key.pub
-rw-r--r-- 1 root sys 5372 Feb 12 21:49 sshd_config
-rw-r--r-- 1 root sys 5106 Dec 15 12:30 sshd_config.orig
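A quick way to test for this root cause on a suspect server is to check whether any host key files exist at all; a minimal sketch:

badserver/root# ls /etc/ssh/ssh_host_*key* > /dev/null 2>&1 || echo "no ssh host keys on this server"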
Creating Server Keys:

Log into the server that is refusing connections with errors and is missing its ssh host keys, and create the keys.

badserver/root# cd /etc/ssh
badserver/root# /lib/svc/method/sshd -c
Creating new rsa public/private host key pair
Creating new dsa public/private host key pair

badserver/root# ls -al ssh_host*key*
-rw------- 1 root root 668 Mar 28 22:26 ssh_host_dsa_key
-rw-r--r-- 1 root root 602 Mar 28 22:26 ssh_host_dsa_key.pub
-rw------- 1 root root 887 Mar 28 22:26 ssh_host_rsa_key
-rw-r--r-- 1 root root 222 Mar 28 22:26 ssh_host_rsa_key.pub
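The /lib/svc/method/sshd -c step above relies on the Solaris 10 SMF method script. As an alternative sketch, the same host keys can be generated directly with ssh-keygen, which is handy on a host that lacks the method script (a Solaris 9 server, for instance); host keys use empty passphrases:

badserver/root# ssh-keygen -t rsa -f /etc/ssh/ssh_host_rsa_key -N ""
badserver/root# ssh-keygen -t dsa -f /etc/ssh/ssh_host_dsa_key -N ""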
Restarting SSH Service:

Once the SSH server public and private keys have been created, the ssh service needs to be restarted, in order to leverage the new private keys.

badserver/root# /usr/bin/svcs ssh
STATE STIME FMRI
online May_21 svc:/network/ssh:default
badserver/root# /usr/sbin/svcadm restart ssh
Validating Repair:

The final step in any repair is validation. In this case, an ssh login is attempted.

solaris10/user$ ssh badserver
Last login: Wed Mar 28 22:48:57 2012 from solaris10
Oracle Corporation SunOS 5.10 Generic Patch January 2005
INTR=Ctrl-C ERASE=Ctrl-H KILL=Ctrl-U
badserver/user$
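If a login still fails after the keys are in place, the client-side verbose flag shows each step of the key exchange; a minimal sketch that trims the debug output down to the interesting lines:

solaris10/user$ ssh -v badserver 2>&1 | egrep -i 'kex|key'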

Friday, December 2, 2011

X Tab: OpenWindows Augmented Compatibility Environment



The following has been added to the X Tab for Solaris 9 and Solaris 10.
OpenWindows Augmented Compatibility Environment

owacomp - [http|ix86|sparc|src|readme] - OpenWindows acomp Project
olvwm4.4p4 - [http|pkg|src|readme] - Solaris 8 SPARC OpenLook Virtual Window Manager
olvwm4.4p4 - [http|pkg|src|readme] - Solaris 8 ix86 OpenLook Virtual Window Manager

Friday, April 16, 2010

Solaris 9: Missing dladm show-dev



Abstract:
Solaris 10 includes a new feature referred to as the Data Link Administration tool (dladm). This tool provides a simple way to configure and check the status of layer 2 ethernet interfaces. Some of the information commonly retrieved with dladm under Solaris 10 can be derived under Solaris 9.

Solaris 10: dladm show-dev
The Data Link Administration tool under Solaris 10 has some very nice features, including quickly seeing the interface name, speed, and duplex.

sunt2000# dladm show-dev
ipge0 link: unknown speed: 100 Mbps duplex: full
ipge1 link: unknown speed: 100 Mbps duplex: full
ipge2 link: unknown speed: 100 Mbps duplex: half
ipge3 link: unknown speed: 0 Mbps duplex: unknown


Solaris 9: kstat & nawk
A simple nawk script can be used on a Solaris 9 platform to produce similar output.

sunt2000# kstat -p | nawk '/duplex/ || /speed/ { split($1,Array,":") ; Dev=Array[3] } /link_duplex/ && $2=="2" { Duplex[Dev]="full" } /link_duplex/ && $2=="1" { Duplex[Dev]="half" } /link_speed/ { if ( Duplex[Dev] == "" ) Duplex[Dev]="unknown" ; Speed[Dev]=$2 ; print Dev "\tlink: unknown\tspeed: " Speed[Dev] "\tMbit\tduplex: " Duplex[Dev] }'
ce0 link: unknown speed: 100 Mbit duplex: full
ce1 link: unknown speed: 1000 Mbit duplex: full
ce2 link: unknown speed: 1000 Mbit duplex: full
ce3 link: unknown speed: 1000 Mbit duplex: full
ce4 link: unknown speed: 0 Mbit duplex: unknown
ce5 link: unknown speed: 0 Mbit duplex: unknown
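For repeated use, the same logic is easier to read and maintain as a small nawk script file; a minimal sketch, where the file name linkstat.nawk is simply our own choice:

sunt2000# cat linkstat.nawk
# derive per-interface speed and duplex from "kstat -p" output
/duplex/ || /speed/ { split($1, Part, ":") ; Dev = Part[3] }
/link_duplex/ && $2 == "2" { Duplex[Dev] = "full" }
/link_duplex/ && $2 == "1" { Duplex[Dev] = "half" }
/link_speed/ {
    if (Duplex[Dev] == "") Duplex[Dev] = "unknown"
    print Dev "\tlink: unknown\tspeed: " $2 " Mbit\tduplex: " Duplex[Dev]
}
sunt2000# kstat -p | nawk -f linkstat.nawk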

Wednesday, March 4, 2009

Partitioning: Oracle Licensing Terms & Agreements

When questions about Oracle licensing come up, things can get rather puzzling.

How is one to determine license liability?

Single-Core vs Multi-Core Processors

Sometimes, there are web pages which help to determine liability with single and multi-core processors.
http://www.orafaq.com/wiki/Oracle_Licensing

Multi-core processors are priced as (number of cores)*(multi-core factor) processors, where the multi-core factor is:

  • 0.25 for SUN's UltraSPARC T1 processors (1.0 GHz or 1.2 GHz)
  • 0.50 for SUN's other UltraSPARC T1 processors (e.g. 1.4 GHz)
  • 0.50 for Intel and AMD processors
  • 0.75 for SUN's UltraSPARC T2 processors
  • 0.75 for all other multi-core processors
  • 1.00 for single-core processors
This may help guide a decision (i.e. if you need half of a T2 processor for an application, there is a 50% discount when one purchases a system with a T1 processor for running Oracle.) For example:
  • 8 (T1 cores) * .25 (multi-core 1.2 GHz T1 factor) = 2 (pricing factor)
  • 8 (T2 cores) * .75 (multi-core 1.2 GHz T2 factor) = 6 (pricing factor)
Low-end T1 based systems make a GREAT platform for deploying a basic development or test environment, or a development or test clustering environment, where full-fledged performance is not required but binary compatibility with SPARC applications is desired.

Applications built to scale well on a T1 platform will offer excellent performance when they later need to scale up to a larger number of cores or processors, since the CoolThreads cores scale nearly linearly in performance.

Partitioning Technologies

There are three primary partitioning technologies with Open platforms:
  • Dynamic System Domains
    Available for mid-range to high-end SUN and Fujitsu systems
    Allows for Solaris 8, Solaris 9, Solaris 10, and Solaris Express operating systems
    M4000 for up to 2 Dynamic System Domains
    M5000 for up to 5 Dynamic System Domains
    M8000 for up to 16 Dynamic System Domains
    M9000 for up to 24 Dynamic System Domains

  • Logical Domains or LDOM's
    Available for low-end to mid-range SUN and Fujitsu systems
    T1 Processors for up to 32 LDOM's
    T2 Processors for up to 64 LDOM's
    T2+ Processors for up to 256 LDOM's

  • Solaris 10 (capped) Containers (see the zonecfg sketch after this list)
    Solaris 10 Containers are available across all SUN & Fujitsu platforms
    Using BrandZ - Linux, Solaris 8 and Solaris 9 Operating Systems can run in Solaris Branded Zones
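As a point of reference for the capped Container option above, the CPU cap is set in the zone configuration. This is only a minimal sketch: the zone name oradb01 and the four-CPU cap are hypothetical, and whether a particular configuration qualifies as a hard partition should always be confirmed against the Oracle partitioning document referenced below.

# zonecfg -z oradb01
zonecfg:oradb01> add capped-cpu
zonecfg:oradb01:capped-cpu> set ncpus=4
zonecfg:oradb01:capped-cpu> end
zonecfg:oradb01> verify
zonecfg:oradb01> commit
zonecfg:oradb01> exit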
Partitioning with Oracle

There is a lot of misinformation about partitioning flooding the internet. The best place to go for information regarding partitioning is Oracle's web site. The following document is dated from 2002 and is still posted as current, as of the publishing of this blog entry.
http://www.oracle.com/corporate/pricing/partitioning.pdf

(page 1)
Soft Partitioning
...
As a result, soft partitioning is not permitted as a means to determine or limit the number of software licenses required for any given server.

(page 2)
Hard Partitioning

Hard partitioning physically segments a server, by taking a single large server and separating it into distinct smaller systems. Each separated system acts as a physically independent, self-contained server, typically with its own CPUs, operating system, separate boot area, memory, input/output subsystem and network resources.
Examples of such partitioning type include: Dynamic System Domains (DSD) -- enabled by Dynamic Reconfiguration (DR), Solaris 10 Containers (capped Containers only)...

Partitioning Examples:
A server has 32 CPUs installed, but it is hard partitioned and only 16 CPUs are made available to run Oracle. The customer is required to license Oracle for only 16 CPUs.

Very clearly, costs can be reduced by using Dynamic System Domains of high-end SPARC systems as well as Solaris 10 (capped) Containers on low-end to mid-range systems.

Other Helpful Oracle Guides
Partitioning & Architecture for Disaster Recovery and Development

With a T2 system being roughly twice the throughput of a T1 system, a low-end T2 makes a good production system which can scale up with a lower initial cost, leveraging hard partitioning options like LDOM's or Solaris 10 (capped) Containers.

For example, the following systems offer similar performance (omitting floating point applications):
  • 8 (T1 cores) * .25 (multi-core 1.2 GHz T1 factor) = 2 (pricing factor)
    A full system, no partitioning
  • 4 (T2 cores) * .75 (multi-core 1.2 GHz T2 factor) = 3 (pricing factor)
    Solaris 10 (capped) Container used to provide half the number of cores, leaving half the cores for later expansion
  • 4 (T2 cores) * .75 (multi-core 1.2 GHz T2 factor) = 3 (pricing factor)
    SPARC CoolThreads LDOM's used to provide half the number of cores, leaving half the cores for later expansion
As greater performance is needed with applications, the appropriate number of cores can be added to a T2 system, in order to provide higher capacity in affordable quantities.
  • 1 (T2 cores) * .75 (multi-core 1.2 GHz T2 factor) = 0.75 (pricing factor) = 1 (rounded up)
  • 2 (T2 cores) * .75 (multi-core 1.2 GHz T2 factor) = 1.50 (pricing factor) = 2 (rounded up)
  • 3 (T2 cores) * .75 (multi-core 1.2 GHz T2 factor) = 2.25 (pricing factor) = 3 (rounded up)
  • 4 (T2 cores) * .75 (multi-core 1.2 GHz T2 factor) = 3 (pricing factor)
  • 5 (T2 cores) * .75 (multi-core 1.2 GHz T2 factor) = 3.75 (pricing factor) = 4 (rounded up)
  • 6 (T2 cores) * .75 (multi-core 1.2 GHz T2 factor) = 4.50 (pricing factor) = 5 (rounded up)
  • 7 (T2 cores) * .75 (multi-core 1.2 GHz T2 factor) = 5.25 (pricing factor) = 6 (rounded up)
  • 8 (T2 cores) * .75 (multi-core 1.2 GHz T2 factor) = 6.00 (pricing factor)
As you can see, scaling an LDOM or Solaris 10 (capped) Container up or down to isolate Oracle license costs can be very effective for controlling business costs according to capacity need... though some choices (i.e. 3 or 7 cores) are less economical, because the pricing factor is rounded up.
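The rounding arithmetic above is easy to reproduce when comparing core counts; a minimal nawk sketch, using the 0.75 T2 factor from the table:

$ nawk 'BEGIN { for (c = 1; c <= 8; c++) { f = c * 0.75 ; lic = int(f) ; if (f > lic) lic++ ; print c " cores -> pricing factor " f " -> " lic " licenses" } }'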

The T2 may make a great consolidation platform for a Disaster Recovery platform, that could double as a Development Platform, by "scaling down" the number of cores in a container.
