Compress a sparse matrix

Question 1

Compress a sparse matrix using Compressed sparse row (CSR, CRS or Yale format).

These are all the same form of compression (ignore new Yale).

Input may be any 2d data structure (list of lists, etc): e.g

[[0 0 0 0],
 [5 8 0 0],
 [0 0 3 0],
 [0 6 0 0]]

And the output should be three 1d data structures (list etc), that denote the outputs A, IA and JA, for example

[5, 8, 3, 6]
[0, 0, 2, 3, 4]
[0, 1, 2, 1,]

The process is described by wikipedia:

The array A is of length NNZ and holds all the nonzero entries of M in left-to-right top-to-bottom ("row-major") order.

The array IA is of length m + 1. It is defined by this recursive definition:

IA[0] = 0

IA[i] = IA[i − 1] + (number of nonzero elements on the (i − 1)-th row in the original matrix)

Thus, the first m elements of IA store the index into A of the first nonzero element in each row of M, and the last element IA[m] stores NNZ, the number of elements in A, which can be also thought of as the index in A of first element of a phantom row just beyond the end of the matrix M. The values of the i-th row of the original matrix is read from the elements A[IA[i]] to A[IA[i + 1] − 1] (inclusive on both ends), i.e. from the start of one row to the last index just before the start of the next.[5]

The third array, JA, contains the column index in M of each element of A and hence is of length NNZ as well.

If your language doesn't support actual data structures, input and output may be text.

Test cases

Input 1:

[[0 0 0 0],
 [5 8 0 0],
 [0 0 3 0],
 [0 6 0 0]]

Output 1:

[ 5, 8, 3, 6 ]
[ 0, 0, 2, 3, 4 ]
[ 0, 1, 2, 1, ]

Input 2

[[10 20 0 0 0 0],
 [0 30 0 40 0 0],
 [0 0 50 60 70 0],
 [0 0 0 0 0 80]]

Output 2:

[ 10 20 30 40 50 60 70 80 ]
[ 0 2 4 7 8 ]
[ 0 1 1 3 2 3 4 5 ]

Input 3:

[[0 0 0],
 [0 0 0],
 [0 0 0]]

Output 3:

[ ]
[ 0 0 0 0 ]
[ ]

Input 4:

[[1 1 1],
 [1 1 1],
 [1 1 1]]

Output 4:

[ 1 1 1 1 1 1 1 1 1 ]
[ 0 3 6 9 ]
[ 0 1 2 0 1 2 0 1 2 ]

Input 5:

[[0 0 0 0],
 [5 -9 0 0],
 [0 0 0.3 0],
 [0 -400 0 0]]

Output 5:

[ 5, -9, 0.3, -400 ]
[ 0, 0, 2, 3, 4 ]
[ 0, 1, 2, 1, ]

Assume inputs may contain any real number, you need not consider mathematical symbols or exponential representation (e.g. 5,000 will never be entered as 5e3). You will not need to handle inf, -inf, NaN or any other 'pseudo-numbers'. You may output a different representation of the number (5,000 may be output as 5e3 if you so choose).

Scoring

This is a code-golf, fewest bytes wins.

Question 2

Could 1-based indices be used for the last row?

Question 3

@Leo for JA? No.

Question 4

Isn't IA[0] = 0 completely unnecessary? It's only needed to define IA[i] = IA[i − 1]..., yet we could simply state that if i-1 < 0 to use 0. That is, IA[0] is always equal to 0, therefor it can be compressed out (yes, I realize that this is a critique of the algorithm, not this challenge).

Question 5

Will we have the inverse challenge too?

Question 6

Neat! Hadn't run into either format before, but I'm glad to see someone else did see that before (I shouldn't be the kind of person who spots trivial optimizations in algorithms this old).

Question 7

MATL, 19 bytes

!3#f!Dx0Gg!XsYshDq!

Input uses ; as row separator.

Try it online! Or verify all test cases: 1, 2, 3, 4, 5.

Explanation

! % Implicit input. Transpose
3#f % 3-output version of find: it takes all nonzero values and pushes
 % their column indices, row indices, and values, as column vectors
! % Transpose into a row vector
D % Display (and pop) vector of values
x % Delete vector of row values
0 % Push 0
G % Push input
g % Convert to logical: nonzeros become 1
! % Transpose
Xs % Sum of columns. Gives a row vector
Ys % Cumulative sum
h % Prepend the 0 that's below on the stack
D % Display (and pop) that vector
q % Subtract 1 from the vector of row indices
! % Transpose into a row vector. Implicitly display

Question 8

Mathematica, 78 bytes

{a=SparseArray@#;a@"NonzeroValues",a@"RowPointers",Join@@a@"ColumnIndices"-1}&

See this answer on mathematica.stackexchange.com.

Question 9

Jelly, 24 bytes

n0S0ドル;;\S,ドルT€Ẏ$’$
Ẏḟ0W;Ç

Try it online!

Question 10

APL (Dyalog), (削除) 31 (削除ここまで) 28 chars or (削除) 36 (削除ここまで) 33 bytes*

Requires ⎕IO←0 for zero based indexing. I/O is list of lists.

{(∊d)(0,+\≢ ̈d←⍵~ ̈0)(∊⍸ ̈⍵≠0)}

Try it online!

{...} anonymous function where the argument is represented by ⍵

(...)(...)(...) return a list of three things:

⍵≠0 Boolean where the argument differs from 0
⍸ ̈ ɩndices of those for each sub-list
∊ εnlist (flatten) to combine into single list

⍵~ ̈0 remove zeros from each sub-list of the argument
d← store that as d
≢ ̈ tally each
+\ cumulative sum
0, prepend a zero

∊d εnlist (flatten) d to combine into single list

^{* To run in Dyalog Classic, simply replace ⍸ with ⎕U2378.}

Question 11

Nice, I don't understand the input format though? f 4 4⍴ and then the values?

Question 12

@Pureferret the Code defines the function f. The Input is really a REPL, which calls f on the result of 4 4⍴... which reshapes the data into a 4×4 matrix.

Question 13

Rho for reshapes. I get it!

Question 14

@Pureferret I've updated the Try it online! link to better show test cases.

Question 15

Haskell, 87 bytes

f s|a<-filter(/=0)<$>s=(id=<<a,scanl(+)0$length<$>a,s>>= \t->[i|(i,e)<-zip[0..]t,e/=0])

Try it online!

How it works:

a<-filter(/=0)<$>s -- let a be the list of lists with all 0 removed]
 -- e.g. [[1,0,0],[0,3,4]] -> [[1],[3,4]]
 -- return a triple of
id=<<a -- a concatenated into a single list -> A 
scanl(+)0$length<$>a -- partial sums of the length of the sublists of a
 -- strating with an additional 0 -> IA
s>>= -- map the lambda over the sublists of s and concatenate
 -- into a single list
 \t->[i|(i,e)<-zip[0..]t,e/=0] -- the indices of the non-zero elements -> JA

Question 16

Python+SciPy, 79 bytes

_{i guess built-ins were not forbidden}

from scipy.sparse import*
A=csr_matrix(input())
print A.data,A.indptr,A.indices

Accepts input in the format [[0, 0, 0, 0],[5, 8, 0, 0],[0, 0, 3, 0],[0, 6, 0, 0]]

Question 17

R, 70 bytes

function(m,M=t(m))list(M[x<-!!M],diffinv(colSums(x)),which(x,T)[,1]-1)

Try it online!

This is an old, matrix challenge that didn't get an R answer for 3 years! Having to do this in row-major rather than column-major order costs 8 bytes (,M=t(m))).

Question 18

Python 3, 96 bytes

lambda m:list(map(list,zip(*[[e,i,j] for i,r in enumerate(m) for j,e in enumerate(r) if e!=0])))

Try it online!

Question 19

PHP, 107 bytes

<?for($y=[$c=0];$r=$_GET[+$l++];)foreach($r as$k=>$v)!$v?:[$x[]=$v,$z[]=$k,$y[$l]=++$c];var_dump($x,$y,$z);

Try it online!

PHP, 109 bytes

<?$y=[$c=0];foreach($_GET as$r){foreach($r as$k=>$v)if($v){$x[]=$v;$z[]=$k;$c++;}$y[]=$c;}var_dump($x,$y,$z);

Try it online!

Question 20

Does this need the numbers to be strings?

Question 21

@Pureferret Any Input in PHP is a string or an array of strings. I have not casted the input so if you wish that the output is purely int replace $x[]=$v with $x[]=+$v

Question 22

JavaScript (ES6), 117 bytes

a=>[a.map((b,i)=>(b=b.filter((x,c)=>x&&o.push(c)),m[i+1]=m[i]+b.length,b),m=[0],o=[]).reduce((x,y)=>x.concat(y)),m,o]

Input is a 2D array of numbers and output is an array of [A, IA, JA].

Explained

a=>[
 a.map((b,i) => ( // map each matrix row
 b = b.filter((x,c) => x // filter to only non-zero elements
 && o.push(c) // and add this index to JA
 )
 m[i+1] = m[i] + b.length, // set next value of IA
 b // and return filtered row
 ),
 m=[0],o=[] // initialize IA (m) and JA (o)
 ).reduce((x,y) => x.concat(y)), // flatten the non-zero matrix
m,o] // append IA and JA

Tests

let f=
a=>[a.map((b,i)=>(b=b.filter((x,c)=>x&&o.push(c)),m[i+1]=m[i]+b.length,b),m=[0],o=[]).reduce((x,y)=>x.concat(y)),m,o]
let run=x=>O.innerHTML+=f(x).map(s=>`[${s.join`, `}]`).join`\n`+"\n\n"
run([[0,0,0,0],[5,8,0,0],[0,0,3,0],[0,6,0,0]])
run([[10,20,0,0,0,0],[0,30,0,40,0,0],[0,0,50,60,70,0],[0,0,0,0,0,80]])
run([[0,0,0],[0,0,0],[0,0,0]])
run([[1,1,1],[1,1,1],[1,1,1]])
run([[0,0,0,0],[5,-9,0,0],[0,0,0.3,0],[0,-400,0,0]])

<pre id=O></pre>

Question 23

Japt, (削除) 31 (削除ここまで) (削除) 27 (削除ここまで) 17 bytes

Hopefully this output format is OK.

cf pUmè iT å+ Ucð

Try it

cf pUmè iT å+ Ucð :Implicit input of 2D array U
c :Flat map
 f : Filter
 p :Push the following two elements
 Um : First element: Map U
 è : Count the truthy elements in each
 i : Prepend
 T : Zero
 å+ : Cumulatively reduce by addition
 Uc : Second element: Flat map U
 ð : 0-based indices of truthy elements

Question 24

I just ran the other examples though and it works

Question 25

Python 2, 115 bytes

lambda m:zip(*[[v,i]for k in m for i,v in enumerate(k)if v])+[reduce(lambda a,b:a+[len(b)-b.count(0)+a[-1]],m,[0])]

Try it online!

Output is [A, JA, IA]

Question 26

Perl 6, 84 bytes

{.flatmap(*.grep(+*)),(0,|[\+] .map(+*.grep(+*))),.flat.kv.flatmap:{$^a%.[0]xx?$^b}}

Try it online!

The single matrix argument is in $_.

.flatmap(*.grep(+*)) selects the nonzero elements of the entire matrix.
[\+] .map(+*.grep(+*)) is the triangular reduction of the number of elements in each row (which some languages call scan). (0,|...) prepends a zero to that list.
.flat.kv produces an indexed list of all elements of the matrix. .flatmap: { $^a % .[0] xx ?$^b } flat-maps over the modulus of each index by the number of columns in the array (.[0], the number of elements in the first row), replicated by the element itself, interpreted as a boolean. That is, nonzero elements are replicated once, and zero elements are replicated zero times (ie, removed).

Question 27

Julia, ^{(削除) 66 (削除ここまで)} 63 bytes

using SparseArrays
n*a=getfield(sparse(a'),n)
!a=5a,3a.-1,4a.-1

SparseArrays is a standard library that stores with CSC instead of CSR (column instead of row or something), that's why we need to transpose (a').

-6 bytes if 1-indexing was allowed

overloads * with getfield to save a few bytes

Try it online!

Question 28

Scala, 82 bytes

x=>x.flatMap(_.zipWithIndex).filter(_._1!=0).unzip->x.scanLeft(0)(_+_.count(_!=0))

Try it in Scastie!

Returns ((A, JA), IA)

Explanation:

x => //The sparse matrix to be smooshed
x.flatMap( //Do the following operation to each row, then flatten the result
 _.zipWithIndex //Zip each element with its column (makes tuples)
).filter( //Keep the tuples where
 _._1!=0 //the first element isn't 0
).unzip //Unzip into a tuple of two lists, A and JA
-> //Make that the first element of another 2-tuple with IA:
x.scanLeft(0)( //Starting with 0
 _+_.count(_!=0) //Add the number of nonzero elements in each row
) //Keep the intermediate results, giving us IA

Luis Mendo Luis Mendo 107k10 gold badges139 silver badges381 bronze badges · Accepted Answer · 2017-07-05 14:13:17Z

MATL, 19 bytes

!3#f!Dx0Gg!XsYshDq!

Input uses ; as row separator.

Try it online! Or verify all test cases: 1, 2, 3, 4, 5.

Explanation

! % Implicit input. Transpose
3#f % 3-output version of find: it takes all nonzero values and pushes
 % their column indices, row indices, and values, as column vectors
! % Transpose into a row vector
D % Display (and pop) vector of values
x % Delete vector of row values
0 % Push 0
G % Push input
g % Convert to logical: nonzeros become 1
! % Transpose
Xs % Sum of columns. Gives a row vector
Ys % Cumulative sum
h % Prepend the 0 that's below on the stack
D % Display (and pop) that vector
q % Subtract 1 from the vector of row indices
! % Transpose into a row vector. Implicitly display

Stack Exchange Network

Compress a sparse matrix

Test cases

Scoring

15 Answers 15

MATL, 19 bytes

Explanation

Mathematica, 78 bytes

Jelly, 24 bytes

APL (Dyalog), (削除) 31 (削除ここまで) 28 chars or (削除) 36 (削除ここまで) 33 bytes*

Haskell, 87 bytes

Python+SciPy, 79 bytes

R, 70 bytes

Python 3, 96 bytes

PHP, 107 bytes

PHP, 109 bytes

JavaScript (ES6), 117 bytes

Explained

Tests

Japt, (削除) 31 (削除ここまで) (削除) 27 (削除ここまで) 17 bytes

Python 2, 115 bytes

Perl 6, 84 bytes

Julia, ^{(削除) 66 (削除ここまで)} 63 bytes

Scala, 82 bytes

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Compress a sparse matrix

Test cases

Scoring

15 Answers 15

MATL, 19 bytes

Explanation

Mathematica, 78 bytes

Jelly, 24 bytes

APL (Dyalog), (削除) 31 (削除ここまで) 28 chars or (削除) 36 (削除ここまで) 33 bytes*

Haskell, 87 bytes

Python+SciPy, 79 bytes

R, 70 bytes

Python 3, 96 bytes

PHP, 107 bytes

PHP, 109 bytes

JavaScript (ES6), 117 bytes

Explained

Tests

Japt, (削除) 31 (削除ここまで) (削除) 27 (削除ここまで) 17 bytes

Python 2, 115 bytes

Perl 6, 84 bytes

Julia, (削除) 66 (削除ここまで) 63 bytes

Scala, 82 bytes

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions

Julia, ^{(削除) 66 (削除ここまで)} 63 bytes