Study of AIA and HMI high-rate telemetry compression efficiency
Images and histograms showing the data from TRACE and MDI used in this
test are linked in the table below. The colorscale in some of the
images was clipped to enhance visual presentation. The full range can
be seen in the histogram plots.
- The MDI data used are taken from 256x256 linescan images acquired in hires mode without square root transformation. All input data are 12 bits.
- I have little specific information about the TRACE data beyond the wavelengths and dates suggested by the filenames. The input data accuracy seems to be 12 bits, although the effective dynamic range of the TRACE data is more like 9-10 bits when discounting small isolated peaks caused by cosmic ray hits (or the interesting features in the images around active regions!). The noise (entropy) level is quite low (except in the continuum image), so I assume that most of the shot noise has been removed by discarding the LSBs after analog-to-digital conversion onboard.
Measured compression efficiency
Procedure
- For each test image, a 4096x4096 pixel image was created by
repetition / tiling. For the MDI data this makes some sense since the
MDI hires resolution is comparable to the expected HMI resolution. For
TRACE this is not the case, but the expansion was done anyway to
properly simulate the effect of partially filled telemetry packets on
the overall efficiency.
- The MDI data were multiplied by an analytical limb darkening function of the form 1 - sum(c_i*log(mu)^i), i = 1,...,5, mu = cos(theta), with c1 = 0.41708, c2 = 0.12512, c3 = 0.02825, c4 = 0.00528, c5 = 0.00051. The image was then cropped to contain only the central disk; see, as an example, the linked L1 composite image. (A sketch of this step follows the list.)
- The 4096x4096 image was compressed according to the high-rate
telemetry interface spec, into a stream of high-rate science data
packets of size 1776 bytes, of which 38 bytes contain header
information.
- The compression was done for all possible combinations of the compression parameter values FSMAX = 1, 2, ..., 15 and K = 1, 2, ..., 8. (A sketch of the coder also follows the list.)
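As an illustration of the limb darkening step, here is a minimal numpy sketch that applies the curve and the circular crop. It assumes the curve is a polynomial in log(mu) with the five coefficients quoted above; the function name and the guard at the extreme limb are illustrative, not part of the original pipeline.

```python
import numpy as np

# Limb darkening coefficients quoted in the procedure; the curve is read
# here (an assumption) as L(mu) = 1 - sum_{i=1..5} c_i * log(mu)^i.
COEFFS = [0.41708, 0.12512, 0.02825, 0.00528, 0.00051]

def apply_limb_darkening(img):
    """Multiply a tiled test image by the limb darkening curve and
    zero everything outside the central disk (illustrative sketch)."""
    n = img.shape[0]
    y, x = np.mgrid[:n, :n]
    r2 = ((x - n / 2 + 0.5) ** 2 + (y - n / 2 + 0.5) ** 2) / (n / 2) ** 2
    disk = r2 < 1.0
    mu = np.sqrt(np.where(disk, 1.0 - r2, 1.0))  # mu = cos(theta) on the disk
    mu = np.maximum(mu, 1e-3)                    # guard: log(mu) diverges at the limb
    curve = 1.0 - sum(c * np.log(mu) ** (i + 1) for i, c in enumerate(COEFFS))
    return np.where(disk, np.rint(img * curve), 0).astype(np.int32)

# Example: tile a 256x256 MDI linescan image to 4096x4096, then darken/crop:
#   sim = apply_limb_darkening(np.tile(mdi256, (16, 16)))
```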
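The coder itself is defined by the high-rate telemetry interface spec and is not reproduced here; the sketch below is only a plausible reconstruction of how the two parameters interact in a Golomb/Rice-style coder: each first difference is folded to a non-negative integer, the top bits form a unary "fundamental sequence" whose length is capped at FSMAX (capped samples escape to a raw encoding), and K low bits are sent verbatim. The 13-bit escape width (12-bit data plus sign) is an assumption.

```python
def zigzag(d):
    """Fold a signed first difference into a non-negative integer."""
    return (d << 1) if d >= 0 else (((-d) << 1) - 1)

def code_length(d, K, FSMAX, raw_bits=13):
    """Bit cost of one sample: unary fundamental-sequence quotient plus
    K split bits; a quotient reaching FSMAX escapes to a raw sample."""
    q = zigzag(d) >> K
    if q >= FSMAX:
        return FSMAX + raw_bits      # capped FS doubles as the escape marker
    return (q + 1) + K               # q zeros, a terminating one, K LSBs

def best_parameters(diffs):
    """Exhaustive search over K = 1..8 and FSMAX = 1..15, as in the study."""
    best = min(((K, F) for K in range(1, 9) for F in range(1, 16)),
               key=lambda p: sum(code_length(d, *p) for d in diffs))
    bpp = sum(code_length(d, *best) for d in diffs) / len(diffs)
    return best, bpp
```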
Results
Table 2 below summarizes the results. The compression efficiency is given as the average bits per pixel (bpp) over the entire 4096x4096 image. The size in bits of the compressed image, excluding headers, can be found by multiplying the numbers in the table by 4096^2.
- "entropy of 1st diff" is the raw entropy of the data after
applying a first difference operation. Entropy is measured in bits per
pixel (bpp) and is calculated from the histogram of first differences
shown in Table 1 as sum(p*log2(p)), where p(i) = count(i)/sum(count),
i=-4095,-4094,...,0,...,4095. For the MDI, where cropping was applied,
the average bpp is equal to the entropy for the pixels in the central
disk times pi/4 (to account for the fact that no bits are spent on
encoding the zeroes outside the disk).
- "best compression" is the lowest bpp achieved.
- "(K_best,FSMAX_best)" is the pair of compression parameters yielding the lowest bpp.
- "compression (K_best,8)" is the bpp achieved for the best K
but with the length of the fundamental sequence limited to 8. This
illustrates that hardly any compression efficiency is gained by
increasing FSMAX beyond 8.
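The entropy column can be computed directly from the definition above. A minimal numpy sketch follows; the mask handling for the cropped MDI images is an illustrative reading of the pi/4 correction.

```python
import numpy as np

def first_diff_entropy(img, disk=None):
    """Entropy in bits per pixel of the first differences of an image.

    disk: optional boolean mask selecting the central disk for the
    cropped MDI images; the in-disk entropy is then scaled by pi/4,
    since no bits are spent on the zeros outside the disk."""
    d = np.diff(img.astype(np.int64), axis=1)
    if disk is not None:
        d = d[disk[:, 1:] & disk[:, :-1]]   # differences fully inside the disk
    counts = np.bincount(d.ravel() - d.min())
    p = counts[counts > 0] / counts.sum()
    h = -(p * np.log2(p)).sum()             # H = -sum p * log2(p)
    return h * (np.pi / 4) if disk is not None else h
```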
Table 2: Compression efficiency.
| Image | Entropy of 1st diff | Best compression | (K_best,FSMAX_best) | Compression (K_best,8) |
| TRACE: | | | | |
| sample1600a_23aug1998 | 4.2260 bpp | 4.5221 bpp | (1,15) | 4.6277 bpp |
| sample171a_29aug1998 | 3.5504 bpp | 4.2043 bpp | (1,15) | 4.3075 bpp |
| sample195a_16aug1998 | 5.8434 bpp | 6.2999 bpp | (3,7) | 6.3048 bpp |
| sample195b_16aug1998 | 3.5706 bpp | 4.0070 bpp | (1,15) | 4.0541 bpp |
| sampleconta_23aug1998 | 6.7705 bpp | 6.9288 bpp | (4,8) | 6.9288 bpp |
| MDI: | | | | |
| L1 | 5.8920 bpp | 6.0192 bpp | (5,9) | 6.0200 bpp |
| R1 | 5.8881 bpp | 6.0158 bpp | (5,12) | 6.0167 bpp |
| R2 | 5.5787 bpp | 5.6562 bpp | (4,14) | 5.6686 bpp |
| L2 | 5.5713 bpp | 5.6520 bpp | (4,14) | 5.6603 bpp |
| L3 | 5.9960 bpp | 6.0971 bpp | (5,10) | 6.0979 bpp |
| R3 | 5.9798 bpp | 6.0846 bpp | (5,9) | 6.0855 bpp |
| R4 | 6.1090 bpp | 6.1949 bpp | (5,9) | 6.1957 bpp |
| L4 | 6.1205 bpp | 6.2056 bpp | (5,9) | 6.2065 bpp |
Comments
- The overhead due to the final packet being only partially filled is taken into account in this simulation. It amounts to 1776/2 = 888 bytes per image on average.
- The overhead due to the high-rate science packet headers is not included in the numbers above. It amounts to 38/1776, or approximately 2.1%.
- If this sequence were representative of the HMI framelist, a cadence of 2 seconds would give rise to an average data rate of 51.35 Mbit/s, including headers. (A sketch of this accounting follows.)
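A minimal sketch of how these overheads enter the simulated data rate, using the packet geometry given earlier (1776-byte packets with 38-byte headers). The per-image bpp values would come from Table 2; the uniform one-image-per-cadence-step framelist is an assumption for illustration.

```python
import math

PACKET_BYTES = 1776                               # high-rate science packet
HEADER_BYTES = 38                                 # header portion of each packet
PAYLOAD_BITS = (PACKET_BYTES - HEADER_BYTES) * 8  # science bits per packet

def packets_needed(compressed_bits):
    """Packets for one image; on average the final packet is only half
    filled, i.e. 1776/2 = 888 bytes of padding overhead per image."""
    return math.ceil(compressed_bits / PAYLOAD_BITS)

def datarate_mbit_s(bpp_list, npix=4096 * 4096, cadence_s=2.0):
    """Average downlink rate, headers included, for a framelist emitting
    one compressed image per cadence step (illustrative)."""
    total_bits = sum(packets_needed(bpp * npix) * PACKET_BYTES * 8
                     for bpp in bpp_list)
    return total_bits / (len(bpp_list) * cadence_s) / 1e6
```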
Overview plots
Plots of bpp as a function of K and FSMAX. Crosses show the bpp for
the original input data. Circles (TRACE only) show bpp after replacing
1% of the pixels with random values to simulate cosmic ray hits. The
horizontal lines indicate raw entropy of the first differences of the
original data.
Square root transformation
The tables below list the results obtained with simple square root
transformation of the simulated 12-bit HMI data used above. We used a
transformation of the form f(x) = floor(sqrt(C*x)+0.5), finv(y) =
floor((y^2)/C+0.5). The multiplier C was chosen as a power of 2 for
simplicity. The table below lists e(x) = x - finv(f(x)) and other
values of interest for various values of C:
Table 4: Errors in sqrt compression.
| Multiplier | e_max = max |e(x)| | min x where |e(x)| = e_max | min x where |e(x)| > 0 |
| 128 | -6 | 2661 | 41 |
| 1024 | 2 | 2373 | 280 |
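The error statistics follow directly from the definitions above. A small sketch; the results should match Table 4 up to the tie-breaking convention used for half-integers in the rounding.

```python
import math

def f(x, C):      # forward square root transformation, as defined above
    return math.floor(math.sqrt(C * x) + 0.5)

def finv(y, C):   # approximate inverse
    return math.floor(y * y / C + 0.5)

def error_stats(C, xmax=4095):
    """Scan e(x) = x - finv(f(x)) over the full 12-bit input range and
    return the extreme (signed) error, the smallest x attaining it, and
    the smallest x with a nonzero error."""
    errs = [(x - finv(f(x, C), C), x) for x in range(xmax + 1)]
    e_ext = max(errs, key=lambda t: abs(t[0]))[0]
    x_ext = min(x for e, x in errs if abs(e) == abs(e_ext))
    x_first = min(x for e, x in errs if e != 0)
    return e_ext, x_ext, x_first

# error_stats(128) and error_stats(1024) correspond to the rows of Table 4.
```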
The table below lists the optimal compression parameter and
the compression efficiency obtained for the simulated HMI images.
Table 5: Compression efficiency, MDI, circular crop,
limb darkening curve applied.
| Image | Multiplier | Optimal K | Compression | RMS error |
| L1 | none | 5 | 6.0192 bpp | 0 |
| L1 | 1024 | 3 | 4.7023 bpp | 1.06 |
| L1 | 128 | 2 | 3.6390 bpp | 2.95 |
Alternative image compression algorithms for the JSOC archive
Here we compare the onboard compression algorithm with two standard compression algorithms that are candidates for use as an internal compressed storage format in the JSOC. The two candidates are JPEG-LS (lossless JPEG) and the Rice compression extension to the FITS standard, denoted "FITZ" below. For JPEG-LS we use code based on the LOCO implementation by Hewlett-Packard; for FITZ we use a Rice coder based on ???.
All three algorithms are based on Golomb entropy coding (sometimes incorrectly referred to as Rice coding), the simplicity of which ensures high processing speed. They differ considerably, however, in their prediction filters (a "median edge detector" for JPEG-LS, simple first differencing for FITZ), in their parameter estimation/adaptation, and in their handling of very low entropy symbols (run-length coding in JPEG-LS, special zero-block symbols in FITZ).
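The two predictors are easy to contrast in code. The JPEG-LS median edge detector below follows the standard LOCO-I definition; the FITZ comment reflects the first differencing described above.

```python
def med_predict(a, b, c):
    """JPEG-LS median edge detector; a = left, b = above, c = above-left.

    Falls back to min(a, b) or max(a, b) when c suggests an edge,
    otherwise uses the planar prediction a + b - c."""
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c

# FITZ, by contrast, predicts each pixel from its left neighbour only,
# so its residual is the simple first difference also used onboard.
```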
Compression efficiency
Table 6: Compression efficiency, TRACE.
| Image | hmicomp | JPEG-LS | FITZ(16) | FITZ(64) |
| sample1600a_23aug1998 | 4.7270 bpp | 3.332 bpp | 4.2882 bpp | 4.2001 bpp |
| sample171a_29aug1998 | 4.3999 bpp | 2.371 bpp | 3.3602 bpp | 3.4075 bpp |
| sample195a_16aug1998 | 6.4400 bpp | 5.516 bpp | 5.9800 bpp | 6.2913 bpp |
| sample195b_16aug1998 | 4.1411 bpp | 2.722 bpp | 3.5912 bpp | 3.5764 bpp |
| sampleconta_23aug1998 | 7.0774 bpp | 6.214 bpp | 6.7985 bpp | 6.6771 bpp |
Table 7: Compression efficiency, MDI, circular crop,
limb darkening curve applied.
| Image | hmicomp | JPEG-LS | FITZ(16) | FITZ(64) |
| L1 | 6.0200 bpp | 5.5730 bpp | 6.1977 bpp | 6.1140 bpp |
| R1 | 6.0167 bpp | 5.5720 bpp | 6.1925 bpp | 6.1099 bpp |
| R2 | 5.6736 bpp | 5.2687 bpp | 5.8574 bpp | 5.7654 bpp |
| L2 | 5.6645 bpp | 5.2603 bpp | 5.8552 bpp | 5.7655 bpp |
| L3 | 6.0979 bpp | 5.5859 bpp | 6.2985 bpp | 6.2076 bpp |
| R3 | 6.0855 bpp | 5.5739 bpp | 6.2843 bpp | 6.1932 bpp |
| R4 | 6.1957 bpp | 5.7619 bpp | 6.4171 bpp | 6.3239 bpp |
| L4 | 6.2065 bpp | 5.7738 bpp | 6.4291 bpp | 6.3358 bpp |
(De)compression speed / throughput
Table 8 lists compression times measured on a Dell Dimension 4600 workstation (3.06 GHz Intel P4, RedHat Linux 9), reading from and writing to local disk. Decompression speeds are comparable to the compression speeds listed in Table 8.
Table 8: Compression time (Usr+Sys) / throughput (MB/s).
| Image | JPEG-LS | FITZ(16) | FITZ(64) |
| sample1600a_23aug1998 | 1.07 s / 31.4 MB/s | 0.55 s / 61.0 MB/s | 0.49 s / 68.5 MB/s |
| L1 | 1.24 s / 27.1 MB/s | 0.60 s / 55.9 MB/s | 0.55 s / 61.0 MB/s |
| sample1600a_23aug1998 (cropped) | 0.86 s / 39.0 MB/s | 0.49 s / 68.5 MB/s | 0.43 s / 78.0 MB/s |
| L1 (cropped) | 1.13 s / 29.7 MB/s | 0.53 s / 63.3 MB/s | 0.46 s / 72.9 MB/s |
Comments
- "throughput" is (uncompressed size)/(compression time). If
this number is equal to or larger than the smallest effective disk and
network bandwidth in the system, then (de)compressing files on write
(read) does not slow down processing speed if properly pipelined. Of
course this comes at the cost of increased CPU load.
Square root compression
The numbers below illustrate the effect of square root transformation with the standard compression algorithms. The second and third rows of Table 9 show the compression of square root transformed data by the standard algorithms. The last two rows show the compression obtained for data that were square root transformed, transformed back, and then compressed with a standard algorithm. As expected, no benefit is obtained from the sparser histogram; in fact, a slight degradation is observed in most cases, since Golomb coding is optimal for a (smooth) geometric distribution.
Table 9: Compression efficiency, MDI, circular crop,
limb darkening curve applied.
| Image | Transformation | hmicomp | JPEG-LS | FITZ(16) | FITZ(64) |
| L1 | y = x | 6.0200 bpp | 5.5730 bpp | 6.1977 bpp | 6.1140 bpp |
| L1 | y = round(sqrt(1024*x)) | 4.7023 bpp | 4.2853 bpp | 4.8993 bpp | 4.8132 bpp |
| L1 | y = round(sqrt(128*x)) | 3.6390 bpp | 3.1110 bpp | 3.7374 bpp | 3.6409 bpp |
| L1 | y = round((round(sqrt(1024*x))^2)/1024) | -- | 5.5746 bpp | 6.1982 bpp | 6.1138 bpp |
| L1 | y = round((round(sqrt(128*x))^2)/128) | -- | 5.5973 bpp | 6.2091 bpp | 6.1233 bpp |
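The transformation in the last two rows is simply the forward map composed with its approximate inverse; in terms of the f and finv sketched in the square root section above:

```python
def back_transform(x, C):
    """y = finv(f(x)): square root transform followed by its approximate
    inverse, producing the sparse-histogram data of the last two rows."""
    return finv(f(x, C), C)
```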