Study of AIA and HMI high-rate telemetry compression efficiency
Images and histograms showing the data from TRACE and MDI used in this
test are linked in the table below. The colorscale in some of the
images was clipped to enhance visual presentation. The full range can
be seen in the histogram plots.
- The MDI data used are taken from 256x256 linescan images acquired in hires mode without square root transformation. All input data are 12 bits.
- I have little specific information about the TRACE data beyond the wavelengths and dates suggested by the filenames. The input data accuracy seems to be 12 bits, although the effective dynamic range of the TRACE data is more like 9-10 bits when discounting small isolated peaks caused by cosmic ray hits (or the interesting features in the images around active regions!). The noise (entropy) level is quite low (except in the continuum image), so I assume that most of the shot noise has been removed by discarding the LSBs after analog-to-digital conversion onboard.
Measured compression efficiency
Procedure
- For each test image, a 4096x4096 pixel image was created by
repetition / tiling. For the MDI data this makes some sense since the
MDI hires resolution is comparable to the expected HMI resolution. For
TRACE this is not the case, but the expansion was done anyway to
properly simulate the effect of partially filled telemetry packets on
the overall efficiency.
- The MDI data were multiplied by an analytical limb darkening function of the form 1 - sum(c_i*log(mu)^i), i = 1,...,5, mu = cos(theta), with c1 = 0.41708, c2 = 0.12512, c3 = 0.02825, c4 = 0.00528, c5 = 0.00051. The image was then cropped to contain only the central disk; see, as an example, the linked L1 composite image. (A sketch of this step follows the list.)
- The 4096x4096 image was compressed according to the high-rate
telemetry interface spec, into a stream of high-rate science data
packets of size 1776 bytes, of which 38 bytes contain header
information.
- The compression was done for all possible combinations of the compression parameter values FSMAX = 1, 2, ..., 15 and K = 1, 2, ..., 8. (A sketch of the coder also follows the list.)
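As an illustration of the limb darkening step, here is a minimal numpy sketch that applies the curve and the circular crop. It assumes the curve is a polynomial in log(mu) with the five coefficients quoted above; the function name and the guard at the extreme limb are illustrative, not part of the original pipeline.

```python
import numpy as np

# Limb darkening coefficients quoted in the procedure; the curve is read
# here (an assumption) as L(mu) = 1 - sum_{i=1..5} c_i * log(mu)^i.
COEFFS = [0.41708, 0.12512, 0.02825, 0.00528, 0.00051]

def apply_limb_darkening(img):
    """Multiply a tiled test image by the limb darkening curve and
    zero everything outside the central disk (illustrative sketch)."""
    n = img.shape[0]
    y, x = np.mgrid[:n, :n]
    r2 = ((x - n / 2 + 0.5) ** 2 + (y - n / 2 + 0.5) ** 2) / (n / 2) ** 2
    disk = r2 < 1.0
    mu = np.sqrt(np.where(disk, 1.0 - r2, 1.0))  # mu = cos(theta) on the disk
    mu = np.maximum(mu, 1e-3)                    # guard: log(mu) diverges at the limb
    curve = 1.0 - sum(c * np.log(mu) ** (i + 1) for i, c in enumerate(COEFFS))
    return np.where(disk, np.rint(img * curve), 0).astype(np.int32)

# Example: tile a 256x256 MDI linescan image to 4096x4096, then darken/crop:
#   sim = apply_limb_darkening(np.tile(mdi256, (16, 16)))
```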
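The coder itself is defined by the high-rate telemetry interface spec and is not reproduced here; the sketch below is only a plausible reconstruction of how the two parameters interact in a Golomb/Rice-style coder: each first difference is folded to a non-negative integer, the top bits form a unary "fundamental sequence" whose length is capped at FSMAX (capped samples escape to a raw encoding), and K low bits are sent verbatim. The 13-bit escape width (12-bit data plus sign) is an assumption.

```python
def zigzag(d):
    """Fold a signed first difference into a non-negative integer."""
    return (d << 1) if d >= 0 else (((-d) << 1) - 1)

def code_length(d, K, FSMAX, raw_bits=13):
    """Bit cost of one sample: unary fundamental-sequence quotient plus
    K split bits; a quotient reaching FSMAX escapes to a raw sample."""
    q = zigzag(d) >> K
    if q >= FSMAX:
        return FSMAX + raw_bits      # capped FS doubles as the escape marker
    return (q + 1) + K               # q zeros, a terminating one, K LSBs

def best_parameters(diffs):
    """Exhaustive search over K = 1..8 and FSMAX = 1..15, as in the study."""
    best = min(((K, F) for K in range(1, 9) for F in range(1, 16)),
               key=lambda p: sum(code_length(d, *p) for d in diffs))
    bpp = sum(code_length(d, *best) for d in diffs) / len(diffs)
    return best, bpp
```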
Results
Table 2 below summarizes the results. The compression efficiency is given as the average bits per pixel (bpp) over the entire 4096x4096 image. The size in bits of the compressed image, excluding headers, can be found by multiplying the numbers in the table by 4096^2.
- "entropy of 1st diff" is the raw entropy of the data after
applying a first difference operation. Entropy is measured in bits per
pixel (bpp) and is calculated from the histogram of first differences
shown in Table 1 as sum(p*log2(p)), where p(i) = count(i)/sum(count),
i=-4095,-4094,...,0,...,4095. For the MDI, where cropping was applied,
the average bpp is equal to the entropy for the pixels in the central
disk times pi/4 (to account for the fact that no bits are spent on
encoding the zeroes outside the disk).
- "best compression" is the lowest bpp achieved.
- "(K_best,FSMAX_best)" is the pair of compression parameters yielding the lowest bpp.
- "compression (K_best,8)" is the bpp achieved for the best K
but with the length of the fundamental sequence limited to 8. This
illustrates that hardly any compression efficiency is gained by
increasing FSMAX beyond 8.
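The entropy column can be computed directly from the definition above. A minimal numpy sketch follows; the mask handling for the cropped MDI images is an illustrative reading of the pi/4 correction.

```python
import numpy as np

def first_diff_entropy(img, disk=None):
    """Entropy in bits per pixel of the first differences of an image.

    disk: optional boolean mask selecting the central disk for the
    cropped MDI images; the in-disk entropy is then scaled by pi/4,
    since no bits are spent on the zeros outside the disk."""
    d = np.diff(img.astype(np.int64), axis=1)
    if disk is not None:
        d = d[disk[:, 1:] & disk[:, :-1]]   # differences fully inside the disk
    counts = np.bincount(d.ravel() - d.min())
    p = counts[counts > 0] / counts.sum()
    h = -(p * np.log2(p)).sum()             # H = -sum p * log2(p)
    return h * (np.pi / 4) if disk is not None else h
```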
Table 2: Compression efficiency.
| Image | Entropy of 1st diff | Best compression | (K_best,FSMAX_best) | Compression (K_best,8) |
| TRACE: | | | | |
| sample1600a_23aug1998 | 4.2260 bpp | 4.5221 bpp | (1,15) | 4.6277 bpp |
| sample171a_29aug1998 | 3.5504 bpp | 4.2043 bpp | (1,15) | 4.3075 bpp |
| sample195a_16aug1998 | 5.8434 bpp | 6.2999 bpp | (3,7) | 6.3048 bpp |
| sample195b_16aug1998 | 3.5706 bpp | 4.0070 bpp | (1,15) | 4.0541 bpp |
| sampleconta_23aug1998 | 6.7705 bpp | 6.9288 bpp | (4,8) | 6.9288 bpp |
| MDI: | | | | |
| L1 | 5.8920 bpp | 6.0192 bpp | (5,9) | 6.0200 bpp |
| R1 | 5.8881 bpp | 6.0158 bpp | (5,12) | 6.0167 bpp |
| R2 | 5.5787 bpp | 5.6562 bpp | (4,14) | 5.6686 bpp |
| L2 | 5.5713 bpp | 5.6520 bpp | (4,14) | 5.6603 bpp |
| L3 | 5.9960 bpp | 6.0971 bpp | (5,10) | 6.0979 bpp |
| R3 | 5.9798 bpp | 6.0846 bpp | (5,9) | 6.0855 bpp |
| R4 | 6.1090 bpp | 6.1949 bpp | (5,9) | 6.1957 bpp |
| L4 | 6.1205 bpp | 6.2056 bpp | (5,9) | 6.2065 bpp |
Comments
- The overhead due to the final packet being only partially filled is taken into account in this simulation. It amounts to 1776/2 = 888 bytes per image on average.
- The overhead due to the high-rate science packet headers is not included in the numbers above. It amounts to 38/1776, or approximately 2.1%.
- If this sequence were representative of the HMI framelist, a cadence of 2 seconds would give rise to an average data rate of 51.35 Mbit/s, including headers. (A sketch of this accounting follows.)
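A minimal sketch of how these overheads enter the simulated data rate, using the packet geometry given earlier (1776-byte packets with 38-byte headers). The per-image bpp values would come from Table 2; the uniform one-image-per-cadence-step framelist is an assumption for illustration.

```python
import math

PACKET_BYTES = 1776                               # high-rate science packet
HEADER_BYTES = 38                                 # header portion of each packet
PAYLOAD_BITS = (PACKET_BYTES - HEADER_BYTES) * 8  # science bits per packet

def packets_needed(compressed_bits):
    """Packets for one image; on average the final packet is only half
    filled, i.e. 1776/2 = 888 bytes of padding overhead per image."""
    return math.ceil(compressed_bits / PAYLOAD_BITS)

def datarate_mbit_s(bpp_list, npix=4096 * 4096, cadence_s=2.0):
    """Average downlink rate, headers included, for a framelist emitting
    one compressed image per cadence step (illustrative)."""
    total_bits = sum(packets_needed(bpp * npix) * PACKET_BYTES * 8
                     for bpp in bpp_list)
    return total_bits / (len(bpp_list) * cadence_s) / 1e6
```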
Overview plots
Plots of bpp as a function of K and FSMAX. Crosses show the bpp for
the original input data. Circles (TRACE only) show bpp after replacing
1% of the pixels with random values to simulate cosmic ray hits. The
horizontal lines indicate raw entropy of the first differences of the
original data.
Square root transformation
The tables below list the results obtained with simple square root
transformation of the simulated 12-bit HMI data used above. We used a
transformation of the form f(x) = floor(sqrt(C*x)+0.5), finv(y) =
floor((y^2)/C+0.5). The multiplier C was chosen as a power of 2 for
simplicity. The table below lists e(x) = x - finv(f(x)) and other
values of interest for various values of C:
Table 4: Errors in sqrt compression.
| Multiplier | e_max = max |e(x)| | min x where |e(x)| = e_max | min x where |e(x)| > 0 |
| 128 | -6 | 2661 | 41 |
| 1024 | 2 | 2373 | 280 |
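The error statistics follow directly from the definitions above. A small sketch; the results should match Table 4 up to the tie-breaking convention used for half-integers in the rounding.

```python
import math

def f(x, C):      # forward square root transformation, as defined above
    return math.floor(math.sqrt(C * x) + 0.5)

def finv(y, C):   # approximate inverse
    return math.floor(y * y / C + 0.5)

def error_stats(C, xmax=4095):
    """Scan e(x) = x - finv(f(x)) over the full 12-bit input range and
    return the extreme (signed) error, the smallest x attaining it, and
    the smallest x with a nonzero error."""
    errs = [(x - finv(f(x, C), C), x) for x in range(xmax + 1)]
    e_ext = max(errs, key=lambda t: abs(t[0]))[0]
    x_ext = min(x for e, x in errs if abs(e) == abs(e_ext))
    x_first = min(x for e, x in errs if e != 0)
    return e_ext, x_ext, x_first

# error_stats(128) and error_stats(1024) correspond to the rows of Table 4.
```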
The table below lists the optimal compression parameter and
the compression efficiency obtained for the simulated HMI images.
Table 5: Compression efficiency, MDI, circular crop,
limb darkening curve applied.
| Image | Multiplier | Optimal K | Compression | RMS error |
| L1 | none | 5 | 6.0192 bpp | 0 |
| L1 | 1024 | 3 | 4.7023 bpp | 1.06 |
| L1 | 128 | 2 | 3.6390 bpp | 2.95 |
Alternative image compression algorithms for the JSOC archive
Here we compare the onboard compression algorithm with two standard compression algorithms that are candidates for use as an internal compressed storage format in the JSOC. The two candidates are JPEG-LS (lossless JPEG) and the Rice compression extension to the FITS standard, denoted "FITZ" below. For JPEG-LS we use code based on the LOCO implementation by Hewlett-Packard; for FITZ we use a Rice coder based on ???.
All three algorithms are based on Golomb entropy coding (sometimes incorrectly referred to as Rice coding), the simplicity of which ensures high processing speed. They differ considerably, however, in their prediction filters (a "median edge detector" for JPEG-LS, simple first differencing for FITZ), in their parameter estimation/adaptation, and in their handling of very low entropy symbols (run-length coding in JPEG-LS, special zero-block symbols in FITZ).
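The two predictors are easy to contrast in code. The JPEG-LS median edge detector below follows the standard LOCO-I definition; the FITZ comment reflects the first differencing described above.

```python
def med_predict(a, b, c):
    """JPEG-LS median edge detector; a = left, b = above, c = above-left.

    Falls back to min(a, b) or max(a, b) when c suggests an edge,
    otherwise uses the planar prediction a + b - c."""
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c

# FITZ, by contrast, predicts each pixel from its left neighbour only,
# so its residual is the simple first difference also used onboard.
```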
Compression efficiency
Table 6: Compression efficiency, TRACE.
| Image | hmicomp | JPEG-LS | FITZ(16) | FITZ(64) |
| sample1600a_23aug1998 | 4.7270 bpp | 3.332 bpp | 4.2882 bpp | 4.2001 bpp |
| sample171a_29aug1998 | 4.3999 bpp | 2.371 bpp | 3.3602 bpp | 3.4075 bpp |
| sample195a_16aug1998 | 6.4400 bpp | 5.516 bpp | 5.9800 bpp | 6.2913 bpp |
| sample195b_16aug1998 | 4.1411 bpp | 2.722 bpp | 3.5912 bpp | 3.5764 bpp |
| sampleconta_23aug1998 | 7.0774 bpp | 6.214 bpp | 6.7985 bpp | 6.6771 bpp |
Table 7: Compression efficiency, MDI, circular crop,
limb darkening curve applied.
| Image | hmicomp | JPEG-LS | FITZ(16) | FITZ(64) |
| L1 | 6.0200 bpp | 5.5730 bpp | 6.1977 bpp | 6.1140 bpp |
| R1 | 6.0167 bpp | 5.5720 bpp | 6.1925 bpp | 6.1099 bpp |
| R2 | 5.6736 bpp | 5.2687 bpp | 5.8574 bpp | 5.7654 bpp |
| L2 | 5.6645 bpp | 5.2603 bpp | 5.8552 bpp | 5.7655 bpp |
| L3 | 6.0979 bpp | 5.5859 bpp | 6.2985 bpp | 6.2076 bpp |
| R3 | 6.0855 bpp | 5.5739 bpp | 6.2843 bpp | 6.1932 bpp |
| R4 | 6.1957 bpp | 5.7619 bpp | 6.4171 bpp | 6.3239 bpp |
| L4 | 6.2065 bpp | 5.7738 bpp | 6.4291 bpp | 6.3358 bpp |
(De)compression speed / throughput
Table 8 lists compression times measured on a Dell Dimension 4600 workstation (3.06 GHz Intel P4, RedHat Linux 9), reading from and writing to local disk. Decompression speeds are comparable to the compression speeds listed in Table 8.
Table 8: Compression time (Usr+Sys) / throughput (MB/s).
| Image | JPEG-LS | FITZ(16) | FITZ(64) |
| sample1600a_23aug1998 | 1.07 s / 31.4 MB/s | 0.55 s / 61.0 MB/s | 0.49 s / 68.5 MB/s |
| L1 | 1.24 s / 27.1 MB/s | 0.60 s / 55.9 MB/s | 0.55 s / 61.0 MB/s |
| sample1600a_23aug1998 (cropped) | 0.86 s / 39.0 MB/s | 0.49 s / 68.5 MB/s | 0.43 s / 78.0 MB/s |
| L1 (cropped) | 1.13 s / 29.7 MB/s | 0.53 s / 63.3 MB/s | 0.46 s / 72.9 MB/s |
Comments
- "throughput" is (uncompressed size)/(compression time). If
this number is equal to or larger than the smallest effective disk and
network bandwidth in the system, then (de)compressing files on write
(read) does not slow down processing speed if properly pipelined. Of
course this comes at the cost of increased CPU load.
Square root compression
The numbers below illustrate the effect of square root transformation with the standard compression algorithms. The second and third rows of Table 9 show the compression of square root transformed data by the standard algorithms. The last two rows show the compression obtained for data that were square root transformed, transformed back, and then compressed with a standard algorithm. As expected, no benefit is obtained from the sparser histogram; in fact, a slight degradation is observed in most cases, since Golomb coding is optimal for a (smooth) geometric distribution.
Table 9: Compression efficiency, MDI, circular crop,
limb darkening curve applied.
| Image | Transformation | hmicomp | JPEG-LS | FITZ(16) | FITZ(64) |
| L1 | y = x | 6.0200 bpp | 5.5730 bpp | 6.1977 bpp | 6.1140 bpp |
| L1 | y = round(sqrt(1024*x)) | 4.7023 bpp | 4.2853 bpp | 4.8993 bpp | 4.8132 bpp |
| L1 | y = round(sqrt(128*x)) | 3.6390 bpp | 3.1110 bpp | 3.7374 bpp | 3.6409 bpp |
| L1 | y = round((round(sqrt(1024*x))^2)/1024) | -- | 5.5746 bpp | 6.1982 bpp | 6.1138 bpp |
| L1 | y = round((round(sqrt(128*x))^2)/128) | -- | 5.5973 bpp | 6.2091 bpp | 6.1233 bpp |
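The transformation in the last two rows is simply the forward map composed with its approximate inverse; in terms of the f and finv sketched in the square root section above:

```python
def back_transform(x, C):
    """y = finv(f(x)): square root transform followed by its approximate
    inverse, producing the sparse-histogram data of the last two rows."""
    return finv(f(x, C), C)
```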