Wednesday, November 28, 2012
LDOM 3.0
- Enhances the resource affinity capability to address memory affinity.
- Retrieves a service processor (SP) configuration from the SP into the bootset, which is on the control domain.
- Enhances the domain shutdown process. This feature enables you to specify whether to perform a full shutdown (default), a quick kernel stop, or force a shutdown.
- Adds Oracle Solaris 11 support for Oracle VM Server for SPARC Management Information Base (MIB).
- Enables a migration process to be initiated without specifying a password on the target system.
- Enables the live migration feature while the source machine, target machine, or both, have the power management (PM) elastic policy in effect.
- Enables the dynamic resource management (DRM) feature while the host machine has the PM elastic policy in effect.
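As a rough sketch of the new shutdown control, only the force flag is shown below; check the 3.0 ldm(1M) man page for the exact option that selects a quick kernel stop. ldg1 is a hypothetical domain name:

```
# Default: ask the guest OS for a full, graceful shutdown
primary# ldm stop-domain ldg1

# Force the domain to stop if the graceful shutdown does not complete
primary# ldm stop-domain -f ldg1
```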
I hope this release removes the restriction that prevented dynamic reconfiguration of resources after a live migration.
This release also seems to have been tested on upcoming SPARC processors: "7159011 M4/T5: Migration fails initialization on Logical Domains Manager startup" including Fujitsu Athena: "RFE: Cross CPU Migration support between Athena server and T series".
Friday, October 26, 2012
Solaris 11.1 available for download
The usual SPARC/X86 [text|live|usb|AI] images are available as well as repository images. There are also pre-upgrade repository images available for those of you who are upgrading from 11/11 and have not upgraded to a recent SR or do not have a support contract.
Wednesday, October 10, 2012
Solaris 11.1 announced
General
- New virtual memory subsystem (VM 2.0, or parts of it)
Scales beyond 100TB, predicts and adapts to page demand, higher performance
- RSyslog
- USB3 support
- Install on UEFI/4K
- Interactive install on iSCSI
- FedFS
- Parallel zone update
- Much faster LOFI
- FS statistics per zone
- Physical to virtual Solaris 10 migration
- Better support for shared storage
- Remote Administration Daemon
Secure zone administration with C, Java and Python APIs
- VNIC config in zone XML
- Faster install and attach
- ASLR
- OpenSCAP Security Compliance Checking and reporting tool
- Audit remote server
- Data Center Bridging (DCB) IEEE 802.1Qaz
- Link aggregation span across switches
- VNIC migration
- Edge Virtual Bridging (EVB) support
- High performance SSH
- GRUB2
- UEFI support
- Improved hardware support
Solaris 11.1 What's New
Tuesday, October 9, 2012
SPARC T5, M4 and SPARC64-X
SPARC T5 (early next year)
- 16 Cores 128 threads
- 28nm
- 25% increased thread performance compared to the T4
- 2.5x throughput compared to T4
- Scales from 1 to 8 processors
- PCIe Gen 3
- 8MB L3 cache
- LDOM virtualization (as with all previous T-series)
- Solaris 10 update 11 or Solaris 11.1
SPARC M4
- 6 cores 48 threads
- Scales to 32+ sockets
- 48MB L2 cache
- 28nm
- 3.6GHz
- 5-6x performance per socket compared to M-series
- LDOM virtualization
- 32TB+ memory configurations
- Solaris 11 only (but S10 support in LDOM)
SPARC64-X
- LDOM virtualization
- 16 cores 32 threads
- 24MB L2 Cache
- 3 GHz
- On Chip DB floating point
- Crypto acceleration
- Runs both S10 and S11 in lab
Wednesday, October 3, 2012
Oaktable world and OpenWorld 2012
It was great to talk and listen to the Joyent/Nexenta/illumos guys at Oaktable world.
Thursday, September 13, 2012
Oracle forces wesunsolve to close
In the last two years it has been of tremendous value for administrators who find the official support site hard to navigate and want a good overview of patches and updates.
I see nothing Oracle can gain by doing this: no patches were available, only metadata with links to their own support site for downloads (if you have an account with access to the patches).
Since they have already done a much, much worse thing by closing the OpenSolaris source, this comes as no surprise. The way companies treat innovation and community efforts is important when you choose an operating system or database engine. Tell your sales representative what you think about these things.
A huge thanks to the people who put time and effort into making wesunsolve.net, it was very useful and will be missed.
Sunday, August 19, 2012
Solaris 10 10/12 and T5-8
Not much is known about this release other than that it will probably feature a ZFS tech refresh, fully integrated OCM, and live upgrade enhancements. It could possibly also support the new T5 processors, since they might be quite similar to the current T4 processors except for the doubling of cores.
I have also seen fragments of information indicating that Oracle was running Solaris 11 Update 1 on SPARC T5-8 machines as early as March.
Friday, August 17, 2012
Upcoming SPARC CPUs
"SPARC64 X; Fujitsu’s new generation 16 core processor for the next generation UNIX servers
16-core SPARC T5 CMT Processor with glueless 1-hop scaling to 8-sockets"
The SPARC T5 is expected to be built using 28nm technology and double the number of cores compared to the current T4 processor. The Sun Oracle server line should also include an 8-processor version, the T5-8, which will then have four times the number of cores (128) compared to the current T4-4 (32).
This session will be held August 29, hopefully more information will surface afterwards. Otherwise it would be a safe bet to say that we will know more about the SPARC T5 after Oracle OpenWorld in October.
The Register has an article about both the T5 and the M4: Drilling into Oracle's performance boasts.
Saturday, July 28, 2012
Joyent presentations @ FISL 13
Bryan Cantrill, Corporate Open Source Anti-Patterns: Doing It Wrong
video, slides
Bryan speaks his mind about corporate open source patterns with insights from Sun, the OpenSolaris project and Joyent. He does a bad job hiding what he thinks of Oracle ;)
Brendan Gregg, Performance analysis, the USE method
slides, video
Brendan on performance analysis using the USE method with good examples.
Update: Added Brendan's video.
Friday, July 20, 2012
Lots of packages for SmartOS, soon for OpenIndiana
9000 packages available for SmartOS and illumos
The packages include a current PostgreSQL (9.1.3), MySQL, Apache, Ruby 1.9.3 and Python 3.2.3, the latter two with lots of modules, plus many other useful packages.
All should work on SmartOS, and once fixed for OpenIndiana this slightly modified procedure (without sudo, and installing gtar first) should work, as root:
# pkg install gnu-tar
# curl http://pkgsrc.smartos.org/packages/SmartOS/2012Q2/bootstrap.tar.gz | (cd /; gtar -zxpf - )
# pkgin -y update
# pkgin avail | wc -l
9401
# pkgin search ...
# pkgin -y install <package>
I'll update this entry as soon as it works for OpenIndiana.
Good summary of enhancements in illumos ZFS
Also well worth a read is Matt Ahrens' post about the performance of the new async destroy: Performance of zfs destroy.
Thursday, July 5, 2012
OpenIndiana updated (oi_151a5)
- ZFS feature flags
- Asynchronous destruction of ZFS file systems
- ZFS send progress output
There have also been quite a few userland updates, all is documented in the release notes including a list of CVE-fixes:
OI_151a_prestable5 Release Notes
Update or download images here.
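A quick sketch of poking at the new ZFS features from the command line (the pool and dataset names are made up; behavior as in illumos ZFS):

```
# Feature flags state for a pool
# zpool get all tank | grep feature@

# Progress of an asynchronous destroy (bytes left to free)
# zpool get freeing tank

# Verbose progress output while sending a snapshot
# zfs send -v tank/data@snap > /backup/data.snap
```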
Monday, May 28, 2012
LDOM 2.2 released
From ldm(1M):
"cpu-arch=generic|native
Specifies one of the following values:
generic uses common CPU hardware features to enable a guest domain to perform a CPU-type-independent migration.
native uses CPU-specific hardware features to enable a guest domain to migrate only between platforms that have the same CPU type. native is the default value.
Using the generic value might result in reduced performance compared to the native value. This occurs because the guest domain does not use some features that are only present in newer CPU types. By not using these features, the generic setting enables the flexibility of migrating the domain between systems that use newer and older CPU types."
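Based on the man page text above, switching a domain to the generic class would look roughly like this (a sketch; ldg1 is a hypothetical domain name, and the domain is assumed to need stopping and unbinding before the change):

```
primary# ldm stop-domain ldg1
primary# ldm unbind-domain ldg1
# Use only generic CPU features so the domain can migrate across CPU types
primary# ldm set-domain cpu-arch=generic ldg1
primary# ldm bind-domain ldg1
primary# ldm start-domain ldg1
```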
Another major feature is SR-IOV support, which can enable bare metal I/O performance for logical domains, read more here: SR-IOV feature in OVM Server for SPARC 2.2
There is a new set of patches available to update the system firmware to 8.2.0, which is needed for the new features.
Sunday, May 27, 2012
illumian available for download
The next version of NexentaStore (4.0) should also be built upon illumian; previous versions were built on NCP.
There is currently one image available for server text install on X86 hosts.
Thursday, May 24, 2012
ZFS feature flags and async destroy
Tuesday, May 22, 2012
Solaris 11 / SPARC News
Solaris 11 Update 1 (late this year)
- Updated virtual memory subsystem (probably what has been known as VM 2.0 earlier)
- Faster Solaris 11 updates with improved python performance
- Already running on the upcoming T5/M4 SPARC(R) chips
- VNIC configuration moves between hosts together with its zone
- Hotpatching similar to KSplice (Remember DUKS in Solaris 8?)
- Offloading of compression and Oracle number arithmetic to the CPU, in addition to crypto
- Schedulers for DB or JVM workloads
Wednesday, May 2, 2012
ZFS feature flags update
Hopefully we will see other new features soon after this is in place.
ZFS Feature Flags Presentation (PDF)
Feature flags webrev
Sunday, April 15, 2012
OmniOS
It contains the features you expect like Crossbow, ZFS, DTrace, IPS and Comstar but also includes KVM and updates in userland (Python, GCC, Perl, OpenSSL etc.)
"OmniOS is our vision of what OpenSolaris could have been had it remained in the open. It runs better, faster and has more innovations,” continued Schlossnagle. “OmniTI did not want to lose the benefits that OpenSolaris technologies brought to customers, so we decided to pursue the continuation of the OS on our own. We've been running OmniOS in our data centers for six months and have seen tremendous results. We’re excited to announce our news at the DTrace conference because of its importance and relevance to this community."
- Theo Schlossnagle, CEO of OmniTI
More information, install images and source repositories are available here: omnios.omniti.com
I have only installed the image into VirtualBox, which was painless and quick. I might post an update when I've had time for some exploring.
OmniTI Debuts OmniOS, an Open Source Operating System for the Solaris Community
Tuesday, February 21, 2012
S11 and S10 inside LDOM 2.1 on T4
An interesting note is that I've used Solaris 10 as I/O and Control domain for the T4 servers while the LDOM is installed with Solaris 11 11/11. The disks for the LDOM are on LUNs over FC and MPxIO is used for multipathing from the I/O domain:
t42-01# dskinfo list-long
disk size lun use p spd type lb
c0t5000CBA015B85D98d0 279G - rpool - - disk y
c0t5000CBA015B93B90d0 279G - - - - disk y
c0t50002870000254901593534030832420d0 33G 0x0 - 4 4Gb fc y
c0t50002870000254901593534030832420d0 33G 0x1 - 4 4Gb fc y
When performing a live migration between the two hosts, running processes and open network connections remain intact as expected; only a small delay in the network traffic is visible. In my initial tests the delay was about 10 ms.
Examples of migrating and reconfiguring the LDOM while it is running:
t42-01# ldm list
NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME
primary active -n-cv- UART 16 16G 0.1% 12d 6h 37m
ldms11-01 active -n---- 5000 16 8G 0.0% 24m
t42-02# ldm list
NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME
primary active -n-cv- UART 16 16G 0.1% 12d 1h 26m
ldms11-01:~$ uptime
5:11pm up 19 min(s), 1 user, load average: 0.00, 0.00, 0.01
henrikj@ldms11-01:~$ prtconf -v |grep Mem
Memory size: 8192 Megabytes
henrikj@ldms11-01:~$ psrinfo | wc -l
16
t42-02# ldm set-vcpu 96 ldms11-01
t42-02# ldm set-memory 200G ldms11-01
t42-02# ldm list
NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME
primary active -n-cv- UART 16 16G 0.1% 12d 6h 50m
ldms11-01 active -n---- 5000 96 200G 0.1% 24m
ldms11-01:~$ prtconf -v |grep Mem
Memory size: 204800 Megabytes
ldms11-01:~$ psrinfo | wc -l
96
The live migration seems to work very well, and the T4 seems to perform several times faster than the T2/T3 for general workloads. The only thing missing is that LDOM 2.1 is unable to dynamically reconfigure memory and CPU resources for a domain after migration; a reboot is then required. Hopefully this will be fixed in the 3.0 release, which people at Oracle OpenWorld said would focus on removing current limitations (including migration between different types of sun4v processors).
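For completeness, the migration itself (not shown in the transcript above) is started from the source control domain roughly like this, per ldm(1M); it prompts for a password on the target host:

```
t42-01# ldm migrate-domain ldms11-01 t42-02
```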
Tuesday, January 3, 2012
The all-seeing eye of DTrace
This turned out to be a good opportunity to put my DTrace skills to work together with a few finished scripts. Once again it struck me how amazing this tool is: you can really see everything that is going on in your system and, as it turns out, you can even see problems that do not even exist. Since this was so much fun and a good example, I will walk through the steps again:
The thing that caught my eye was the output from errinfo of the DTrace toolkit. There was a very high rate of system calls returning errors, namely close() failing with error 9, "Bad file number":
whoami ioctl 22 13 Invalid argument
init ioctl 25 212 Inappropriate ioctl for device
awk stat64 2 520 No such file or directory
java lwp_cond_wait 62 3492 timer expired
processx close 9 102073391 Bad file number
Syscall errors are in themselves normal and can be seen on any system, but usually not several thousand per second. As the error message indicates, this happens when close() is issued on a file descriptor (an integer) that does not represent an open file for that process, which at first look seems like a quite useless operation.
We can also see that close() is by far the most used system call here:
# dtrace -q -n 'BEGIN { close=0;total=0 } syscall::close:entry \
{ close = close + 1 } syscall:::entry { total = total + 1 } END \
{ printf("%d close calls of %d total calls\n",close,total) }'
309530 close calls of 426212 total calls
Looking at which file descriptors the process is trying to close shows an even distribution between 0 and 65536, while the only successful calls were to numbers lower than 1024, the range normally used unless a process has a very high number of open files.
# dtrace -n 'syscall::close:entry { this->fd = arg0 } syscall::close:return \
/ errno != 0 / { @failed = lquantize(this->fd,0,65536,16384) } syscall::close:return \
/errno == 0/ { @good = lquantize(this->fd,0,65535,1024) }'
dtrace: description 'syscall::close:entry ' matched 3 probes
value ------------- Distribution ------------- count
< 0 | 7
0 |@@@@@@@@@@@@@ 414811
16384 |@@@@@@@@@ 294912
32768 |@@@@@@@@@ 294912
49152 |@@@@@@@@@ 294912
>= 65536 | 0
value ------------- Distribution ------------- count
< 0 | 0
0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 12459
1024 | 0
The processes responsible only live a short while, but by using dtruss I could trace system calls based on the process name. The process issues close() on every number from 0x3 to 0xFFFF in a loop; as expected, the first few descriptors are actually open and close correctly, but the vast majority return error 9:
15889/1: fork1() = 0 0
15889/1: lwp_sigmask(0x3, 0x0, 0x0) = 0xFFFF 0
11385/1: getpid(0x0, 0x1, 0x1CD0) = 15889 0
11385/1: lwp_self(0x0, 0x1, 0x40) = 1 0
11385/1: lwp_sigmask(0x3, 0x0, 0x0) = 0xFFFF 0
11385/1: fcntl(0xA, 0x9, 0x0) = 0 0
11385/1: schedctl(0xFFFFFFFF7F7361B8, 0xFFFFFFFF7F738D60, 0x11A340)
= 2139062272 0
11385/1: lwp_sigmask(0x3, 0x0, 0x0) = 0xFFFF 0
11385/1: sigaction(0x12, 0xFFFFFFFF7FFDEE20, 0x0) = 0 0
11385/1: sigaction(0x2, 0xFFFFFFFF7FFDEE20, 0x0) = 0 0
11385/1: sigaction(0xF, 0xFFFFFFFF7FFDEE20, 0x0) = 0 0
11385/1: getrlimit(0x5, 0xFFFFFFFF7FFDED90, 0x0) = 0 0
11385/1: close(0x3) = 0 0
11385/1: close(0x4) = 0 0
11385/1: close(0x5) = 0 0
11385/1: close(0x6) = 0 0
....
11397/1: close(0xFFFF) = -1 Err#9
If we look at the beginning of the trace we can see a fork, followed a little later by getrlimit(0x5,...). Looking at what that argument to getrlimit means:
# egrep "RLIMIT.*5" /usr/include/sys/resource.h
#define RLIMIT_NOFILE 5 /* file descriptors */
# plimit $(pgrep processx|head -1) | grep nofiles
nofiles(descriptors) 65536 65536
The process is checking its file descriptor limit and then closing the whole possible range, which seems a little unnecessary since almost none of the descriptors are open. But this happens just after a fork, and a forked process inherits all the open files of its parent; that might not be what you want, so a close is in order. There is, however, no easy way of getting a list of all used file descriptors, so what we see here is a brute-force approach to making sure none are open before continuing. This would probably not have been noticed if it weren't for the unusually high limit on file descriptors. Perhaps an iteration with close() over the contents of /proc/${PID}/fd would have been less resource consuming in this scenario.
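That /proc-based alternative can be sketched in shell. A minimal sketch: it assumes a /proc/<pid>/fd directory (as on Solaris or Linux) and uses the shell's own PID, since processx is long gone:

```shell
# Visit only the descriptors that are actually open for a process,
# instead of calling close() on every number up to the rlimit.
pid=$$
for fd in /proc/$pid/fd/*; do
    echo "open fd: ${fd##*/}"   # a real cleanup would close these instead
done
```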
All of this was done on a production system without impact on applications, which is crucial: you must be able to trust that the tool will never bring your system down. This is something DTrace can be trusted with, where some platforms that lack it but try to provide somewhat similar observability fail; read Brendan's posts: using systemtap, or the older but entertaining DTrace knockoffs.
Download the DTracetoolkit here.
ZFSSA/S7000 major update
- ZFS RAIDZ read performance improvements
- Significant fairness improvements during ZFS resilver operations
- Significant zpool import speed improvements
- Replication enhancements - including self-replication
- Several more, including bug fixes.
The improved RAIDZ performance comes from the hybrid raidz/mirror allocator in zpool version 29.
The ZFSSA is a fantastic product with probably the best interface and analytics available. But development seems to have stagnated a bit over the last year, and so have the blog posts with useful information and performance comparisons by the people behind it. And I still miss one feature badly: synchronous replication of datasets; continuous replication is not always good enough.
Release Notes