Ragged list pattern matching

Question 1

Given a pattern and a ragged list of positive integers, your task is to decide whether the pattern matches the ragged list.

The pattern is also represented by a ragged list. But in addition to positive integers, it may contain a wildcard value.

Here is the rule for matching:

A positive integer matches the same positive integer.
The wildcard matches anything: either a positive integer, or a ragged list.
A ragged list matches a ragged list if they have the same length, and each pair of items at the same position matches.

For example, if we write the wildcard as 0, then the pattern [0, [4, [5], 0]] matches the ragged list [[1, 2, [3]], [4, [5], [6, [7]]]]: here the first 0 matches [1, 2, [3]], while the second 0 matches [6, [7]].

Note that the wildcard cannot match a subsequence of more than 1 items in a ragged list.

You may choose any fixed value as the wildcard, as long as it is consistent.

This is code-golf, so the shortest code in bytes wins.

This is decision-problem. You may use your language's convention for truthy/falsy, or use two distinct, fixed values to represent true or false.

Testcases

Here I use 0 to represent the wildcard. The input here are given in the order pattern, ragged list.

Thanks @pxeger, @pajonk, @DLosc, @Deadcode for providing some interesting testcases.

Truthy

[], []
[0], [1]
[1, 0], [1, [2, 3]]
[1, 0], [1, [[2, 3]]]
[1, [2, 0], 4], [1, [2, 3], 4]
[0, [4, [5], 0]], [[1, 2, [3]], [4, [5], [6, [7]]]]

Falsy

[1], []
[0], []
[[]], []
[[]], [3]
[[4]], [4]
[0], [1, 2]
[1, 0], [1]
[0], [[1], [2]]
[1, 0, [2, 3]], [1, [2, 3]]
[1, [0, 2], 4], [1, [2, 3], 4]
[[0], [4, [5], 0]], [[1, 2, [3]], [4, [5], [6, [7]]]]

Question 2

"Note that the wildcard cannot match a subsequence of more than 2 items in a ragged list." The 0 can match an element which is a list of more than two items, but it also can't be skipped or used to match two elements, so where does the "more than 2" come in?

Question 3

@Neil Oh, that is a typo. It should be "more than 1".

Question 4

Suggested truthy test-case: [1, 0], [1, [[2, 3]]] - my solution was failing that one (although I think it may be R-specific).

Question 5

Suggested falsey test case: [[]], []

Question 6

@Deadcode Of course you can.

Question 7

Elixir, 1 byte

Yes, Elixir's equality/pattern match operator functions exactly as requested when input is provided as plain code with wildcards indicated by underscore_. The output is by presence/absence of error (passed = match, error = no match, see the Test Suite

Since validity of this way of taking input has been questioned in comments, here is also the conventional function doing the same, with all the boilerplate counted in code...

Elixir, 28 bytes

&Code.eval_string&1<>"="<>&2

Try it online!

Question 8

I don't think this should count. The thing on the left hand side is not an actual value, so this only works when written directly in the code and can't be used as a function or full program, which means it doesn't match any allowed I/O method. You could maybe have something like &(Code.eval_string"#{&1=&2}"), which is anonymous function.

Question 9

I edited your suggestion in, but let's see how the others react. Honestly, I'd be very frustrated to learn that exotic I/O methods are only tolerated with exotic languages...

Question 10

Haskell, 44 bytes

W#_=1
p#a|p==a=1
L(h:p)#L(f:r)|h#f>0=L p#L r

Attempt This Online!

Haskell, 45 bytes

W==_=1>0;N p==N a=p==a;L p==L a=p==a;_==_=1<0

Attempt This Online!

Although now I'm not sure if I have to include the instance declaration in my score.

tinylisp, (削除) 71 (削除ここまで) (削除) 81 (削除ここまで) (削除) 70 (削除ここまで) 79 bytes

Being a language whose syntax is entirely made of ragged lists, you'd expect tinylisp to be good at parsing them, and it indeed doesn't do that bad of a job at it.

(d F(q((A B)(i(a(e A 0)(e A B))0(i(a(e()B)(e(v B)B))1(a(F(h A)(h B))(F(t A)(t B

Takes two lists as arguments, and outputs falsy (zero) if they match, or truthy (positive int) if they don't.

Edit +10 bytes: @DLosc pointed out that my code actually failed for some cases containing an empty list as an element.

Edit -11 bytes: @DLosc suggested (e(v A)A) for the troublesome conditional, and indeed it does work, and saves a byte off my original size.

Edit +9 bytes: Two more fixes for problems pointed out by alephalpha and DLosc, changing (e(v A)A) to (e(v B)B) and adding a test if B is nil to the second if, I think this actually works perfectly now

Try it online!

Question 14

Hmmm good thoughts @DLosc, I am going to have to tinker with this a bit and figure out what makes it work for that test case. The (e()(h A)) is indeed supposed to make sure that a 1 heads back up the recursion if it hits mismatched integers, as otherwise it would head into an infinite recursion, I did try using your suggested (a 1 A), but that seems to also not work right

Question 15

@DLosc some tinkering and I've realized that I have to check if A is either an integer or an empty list, and the only infallible method of doing that I've found is a(e()A)(e(q Int)(type A)), which is painfully long

Question 16

(a(e()A)(e()(c()A)) works!!

Question 17

I think you can use (e(v A)A): ints evaluate to themselves, and so does nil, but any nonempty ragged list of ints will fail to evaluate and thus return nil.

Question 18

Oh good one @DLosc! I had came up with (h(c A A)) on my own, because appending A to itself will either produce nil if A is int, (()) if A is nil, or ((foo) foo) if A is a non empty list

Question 19

R, (削除) 111 (削除ここまで) (削除) 110 (削除ここまで) (削除) 104 (削除ここまで) 82 bytes

Edit: -22 bytes thanks to @Giuseppe.

f=\(p,l,`+`=is.list,`-`=length)`if`(+p,+l&-p==-l&&all(mapply(f,p,l)),!p|!+l&&p==l)

Attempt This Online!

Straightforward recursion.

Question 20

Can you just do a Map(f,p,l)? The +l&-p==-l will short-circuit so you're guaranteed that p and l will have the same length if you're evaluating the Map, and it passes the test cases.

Question 21

@Giuseppe, thaks a lot! I was designing it gradually and the complicated Map was needed before adding those conditions at the front - and I didn't think of checking straightforward Map at the end - shame on me!

Question 22

No worries; I surmised that was the case but wasn't sure if there was another edge case I wasn't thinking of; I've been a little less active on CGCC lately.

Question 23

Python 2, 34 bytes

f=lambda p,a:a+a!=p+p>0<map(f,p,a)

Attempt This Online!

Outputs via presence or absence of an error. If there's no error, the pattern matches. If there is an error, the pattern doesn't match.

~~(削除) -16 (削除ここまで)~~ -13 bytes thanks to @loopy walt

Question 24

-16 I think. Using that in Python 2 map doesn't truncate but pads with None.

Question 25

@loopywalt Oh, I'd forgotten that! I originally just had f=lambda p,a:a!=p>0!=[*map([f][len(p)^len(a)],p,a)] in Py3, and made a lazy translation to Py2 for -4 bytes without thinking

Question 26

A bug introduced in your last edit (47 → 31 bytes): This returns True for [0], [] and [5, 0], [], which, as I understand, should return False, although clarification from @alephalpha would be helpful.

Question 27

Oops! This f=lambda p,a:a+a!=p+p>0<map(f,p,a) seems to fix it at 34 bytes: ato.pxeger.com/…

Question 28

Retina, (削除) 65 (削除ここまで) (削除) 63 (削除ここまで) (削除) 57 (削除ここまで) 56 bytes

~(L$`^.+
\n$&$
_
(\d+|<((?<_><)|(?<-_>>)|\d|,)+(?(_)^)>)

Try it online! Takes two newline-separated <>-wrapped lists but link is to test suite that deletes spaces, splits on semicolons and converts other types of brackets for convenience. Uses _ to represent the wild card. Edit: Saved ~~(削除) 6 (削除ここまで)~~ 7 bytes thanks to @Deadcode pointing out that I don't need to use []-wrapped lists or support spaces. Explanation:

~(`

Evaluate the result of the transformations below on the original input. Since they result in a single line, this makes Retina attempt to match the input against the result.

L$`^.+
\n$&$

Take only the first line of the input and wrap it in \n and $ so that it will match against the second line of the input.

_
(\d+|<((?<_><)|(?<-_>>)|\d|,)+(?(_)^)>)

Replace each _ with a match against either a non-negative integer or a list, using a .NET balancing group to ensure that the list is properly balanced. A named capturing group ?<_> is used here because .NET allows capturing groups to be reused in this way and it's golfier than calculating the capturing group number.

Question 29

You can drop 6 bytes by redefining your input specification to use different bracket characters that don't need to be escaped.

Question 30

What I don't understand is why you can't drop an additional 4 bytes by leaving the balanced group unnamed. Can you explain this?

Question 31

@Deadcode It won't work if there is more than one wildcard, because the group's number will be different for every instance.

Question 32

You can save 1 byte by dropping spaces from the input and using \d|, instead of [^<>].

Question 33

You can then save an additional 7 bytes by removing (?(_)^) – it's not needed as long as the input has balanced brackets, since the regex can only match unbalanced brackets in one direction, i.e., with multiple wildcards it wouldn't be able to match too few followed by too many.

Question 34

Jelly, 13 bytes

ßȯŒḊ?"ḣL{
ç9=

A dyadic Link that accepts the pattern (with 0 as the wildcard) on the left and the potential match on the right and yields 1 for a match or 0 for no match.

Try it online!

How?

ßȯŒḊ?"ḣL{ - Helper Link: pattern P; potential A
 " - P zip A applying:
 ? - if...
 ŒḊ - ...condition: depth (of element of P)
ß - ...then: call this Helper Link (with element of P and element of A)
 ȯ - ...else: (element of P) logical OR (element of A)
 L{ - get length of P
 ḣ - head (the zip result) to index (length of P)
ç9= - Link: pattern P; potential A
ç9 - P Helper Link A
 = - equals A?

Question 35

Charcoal, 56 bytes

1≔⟦A⟧θFθ«≔§ι0η≔§ι1ζ¿=+ηυη¿∧=+ζυζ=LηLζFLη⊞θEι§λκ⎚¿∧η¬=ηζ⎚

Try it online! Link is to verbose version of code. Takes input as a pair of two ragged lists with 0 indicating a wildcard and outputs a Charcoal boolean, i.e. - for matching, nothing for mismatching. Explanation:

Start by assuming that the lists match.

≔⟦A⟧θFθ«

Start a breadth-first search of the lists.

≔§ι0η≔§ι1ζ

Extract the corresponding sublists that are being searched.

¿=+ηυη

If the pattern list element is a sublist, ...

¿∧=+ζυζ=LηLζ

... then if the list element under test is a sublist of the same length, then...

FLη

... for each pair of elements of the sublists...

⊞θEι§λκ

... push the corresponding pair to the search list.

⎚

If the element under test wasn't a sublist of the same length then clear the canvas.

¿∧η¬=ηζ⎚

Otherwise, when the pattern element isn't a sublist, then if it is truthy but not equal to the element under test then clear the canvas.

Support for zeros by using empty strings as wildcard is possible at a cost of 2 bytes:

1≔⟦A⟧θFθ«≔§ι0η≔§ι1ζ¿=+ηυη¿∧=+ζυζ=LηLζFLη⊞θEι§λκ⎚¿∧+ηω¬=ηζ⎚

Try it online! Link is to verbose version of code.

Question 36

Neither the 56 nor 58 byte versions accept the test case [], [].

Question 37

@Deadcode Thanks, it was a typo in the code (I had ι when I needed η).

Question 38

A typo is now in the first TIO link.

Question 39

@Deadcode Whoops, deleted the old link, but failed to paste the new link in properly...

Question 40

Ruby, 57 bytes

f=->t,a{a!=t&&t&&t.zip(a).map{|r|[f][t.size^a.size][*r]}}

Try it online!

Uses nil as filler. (false will also work)

Ports @pxeger's Python answer.

Question 41

Is it possible to get a shorter Ruby answer by porting loopy's fix to pxeger's shortened Python answer?

Question 42

@Deadcode I can't figure out anything that would be shorter than this, especially since Ruby's arrays don't respond to > and <.

Question 43

Regex (Perl / PCRE), (削除) 63 (削除ここまで) (削除) 60 (削除ここまで) (削除) 56 (削除ここまで) 53 bytes

((_?)(2円.)?(?=.*
(4円?(?(?=2円)3円|(\d+,?|{(?5)*})))))+

Try it online! (Perl)
Try it on regex101 (PCRE)

The wildcard is _ and the delimiter between the pattern and the ragged list is a newline.

This regex works by going through the pattern one character at a time; any character other than _ must be matched with the identical character in the pattern, and _ is matched with any positive integer or ragged list. This is done while cumulatively building up a backreference of the full ragged list, as each piece of it is matched.

Due to the optimization of requiring the input to be well-formed, only the ragged lists matched with _ need to be checked for matched brackets. The regex uses { and } (curly braces) as its brackets to avoid having to \-escape them. Similarly, spaces must not be present in the input.

Explanation

( ... )+
Match what is inside any nonzero number of times.

(_?)(2円.)?
Consume the current pattern character. If it is _, set 2円 to contain it. Otherwise, set 2円 to be empty and 3円 to contain the character. Note that the latter can also happen if the character is _, but this cannot lead to a subsequent match, because _ cannot be in the input ragged list.

(?=.*
... )
Process the input ragged list in a lookahead, so that outside this lookahead, the input pattern can still be processed picking up where it left off. The .* skips over what remains of the input pattern, and the newline skips over the newline delimiter.

(4円?(?(?=2円)3円|(\d+,?|{(?5)*})))
Validate that the appropriate section of the input ragged list matches the current character of the input pattern, capturing the entire portion of it matched so far (starting from its beginning) in 4円.

4円?
If this is the first character being processed, 4円 will be unset, and this will consume zero characters of the input ragged list. Otherwise, this can either match with what 4円 has been set to, or match an empty string. Since the regex is designed only to work on well-formed input, it will in practice always match 4円 (the portion of the input ragged list already matched) when it is set, skipping to the appropriate part of the input ragged list to match against the corresponding current input pattern character. The unoptimized version of this would be (?(4)4円) or 4円?+.

(?(?=2円) ... | ... )
If 2円 matches (without consuming it), match against the first part, otherwise match against the second part. 2円 will contain _ if that (a wildcard) is the current pattern character, otherwise it will be empty. Since well-formed input can never contain _ in the input ragged list, 2円 will only match if the current pattern character is not a wildcard.

3円
Match against the current non-wildcard character.

(\d+,?|{(?5)*})
Match any single positive integer or ragged list. Balancing of opening and closing braces is done here using recursion; (?5) recursively calls the entire expression above. Since the input is assumed to be well-formed, we can get away with the optimization of not enforcing that there is an integer directly before and after every comma.

‍
Once the entire input pattern has been processed, the newline delimiter is skipped, enforcing that the entire input pattern has been matched against. There is no need to anchor at the beginning or end of the input string, because as long as the input is well-formed (with balanced braces), there is no way that the entire input pattern can be matched without also matching the entire input ragged list.

Regex (.NET), 71 bytes

(?<=(.)+);(?<-1>1円|(?<=(?=1円)_.*)(\d+|<((<)|(?<-4>>)|\d|,)+(?(4)^)>))+$

Try it online!

The wildcard is _ and the delimiter between the pattern and the ragged list is ;. The brackets are < and >.

This regex works by capturing the input pattern character by character in reverse, pushing it onto the Balanced Group stack of Group 1. Then the input ragged list is tested against this by popping the stack in a loop, and for each popped character, matching directly against it if possible, and otherwise, checking if the popped character is _, and if it is, matching against any ragged list with balanced brackets.

Explanation

(?<=(.)+);
Push the input pattern onto the Group 1 stack, one character at a time in reverse order, then skip to the input ragged list.

(?<-1> ... )+$
Pop the Group 1 stack in a loop, executing the following for each popped character, and then assert that the end of the string (and thus the end of the input ragged list) has been reached.

1円|(?<=(?=1円)_.*)(\d+|<((<)|(?<-4>>)|\d|,)+>)
What to execute for each character popped from the Group 1 stack.

1円
If the character matches, consume it. This can never happen if the character is _, because a well-formed input ragged list can't contain that.

|
Otherwise, execute the following.

(?<=(?=1円)_.*)
Assert that the pattern character is _.

(\d+|<((<)|(?<-4>>)|\d|,)+(?(4)^)>)
Match either a single positive integer, or a ragged list with balanced brackets.

(<) - push an opening bracket onto the Group 4 stack
(?<-4>>) - pop an entry from the Group 4 stack, and match a closing bracket
\d|, - match a numeral or comma
(?(4)^) - assert that the Group 4 stack is empty (literally, assert that if the Group 4 stack is not empty, we're at the beginning of the input string)

Question 44

Python3, 114 bytes:

T=type
f=lambda x,y:len(x)==len(y)and all(a==0 or(T(a)==T(b)and(f(a,b)if list==T(a)else a==b))for a,b in zip(x,y))

Try it online!

Question 45

Wolfram Language (Mathematica), 6 bytes

MatchQ

Try it online!

Built-in. Uses _ as a wildcard. Input [list, pattern] or [pattern][list].

Non-built-in:

Wolfram Language (Mathematica), 33 bytes

##=!=0&&Or@@#0@@@Check[{##},]&

Try it online!

Uses 0 as a wildcard. Input in any order. Returns False if they match, or Null otherwise.

Wolfram Language (Mathematica), 23 bytes

##=!=0&&#0@@@({##})&

Try it online!

Uses 0 as a wildcard. Input in any order. Returns by error presence/absence.

Question 46

Factor + `match`, 7 bytes

(match)

Try it online!

Question 47

Pip `-x`, 25 bytes

aN[0b]|$&0*g&a#=b&$&fMZab

Takes the pattern and the list as command-line arguments in Pip list syntax, using 0 as the wildcard value. Outputs 1 for truthy, empty string or 0 for falsey. Attempt This Online! Or, verify all test cases.

Explanation

This is a recursive full program. The -x flag evaluates the program's arguments, which allows us to take lists as inputs.

aN[0b]|$&0*g&a#=b&$&fMZab
 ; a is 1st arg (pattern); b is 2nd arg (list)
 ; g is list of both args; f is main function
 [0b] ; List containing 0 and b
aN ; Is a in that list? (I.e. is a 0 or does a equal b?)
 | ; If that result is truthy, return it; if not:
 0*g ; 0 times g (applies to ints, maps over lists)
 $& ; Fold on AND (truthy iff both elements are truthy):
 ; - If a or b is an int (and we know they're not equal
 ; because we already checked), we get a falsey result
 ; - If a or b is an empty list (and we know they're not
 ; both empty because we already checked), that's falsey
 ; - Otherwise, both are nonempty lists and we get truthy
 & ; If that result is falsey, return it; if not:
 a#=b ; Are a and b the same length?
 & ; If that result is falsey, return it; if not:
 f ; Apply the main function recursively
 MZab ; to corresponding items from a and b
 $& ; Fold on AND (truthy iff all elements are truthy)

Kirill L. 14.3k2 gold badges17 silver badges54 bronze badges · Accepted Answer · 2022-04-16 11:41:45Z

12

\$\begingroup\$

Elixir, 1 byte

Yes, Elixir's equality/pattern match operator functions exactly as requested when input is provided as plain code with wildcards indicated by underscore_. The output is by presence/absence of error (passed = match, error = no match, see the Test Suite

Since validity of this way of taking input has been questioned in comments, here is also the conventional function doing the same, with all the boilerplate counted in code...

Elixir, 28 bytes

&Code.eval_string&1<>"="<>&2

Try it online!

Share

Improve this answer

edited Apr 16, 2022 at 16:59

answered Apr 16, 2022 at 11:41

Kirill L.'s user avatar

Kirill L.

14.3k2 gold badges17 silver badges54 bronze badges

\$\endgroup\$

2

1

\$\begingroup\$ I don't think this should count. The thing on the left hand side is not an actual value, so this only works when written directly in the code and can't be used as a function or full program, which means it doesn't match any allowed I/O method. You could maybe have something like &(Code.eval_string"#{&1=&2}"), which is anonymous function. \$\endgroup\$

pxeger
– pxeger

2022年04月16日 13:47:50 +00:00
Commented Apr 16, 2022 at 13:47
\$\begingroup\$ I edited your suggestion in, but let's see how the others react. Honestly, I'd be very frustrated to learn that exotic I/O methods are only tolerated with exotic languages... \$\endgroup\$

Kirill L.
– Kirill L.

2022年04月16日 17:00:52 +00:00
Commented Apr 16, 2022 at 17:00

Add a comment |

Stack Exchange Network

Ragged list pattern matching

Testcases

Truthy

Falsy

14 Answers 14

Elixir, 1 byte

Elixir, 28 bytes

Haskell, 44 bytes

Haskell, 45 bytes

tinylisp, (削除) 71 (削除ここまで) (削除) 81 (削除ここまで) (削除) 70 (削除ここまで) 79 bytes

R, (削除) 111 (削除ここまで) (削除) 110 (削除ここまで) (削除) 104 (削除ここまで) 82 bytes

Python 2, 34 bytes

Retina, (削除) 65 (削除ここまで) (削除) 63 (削除ここまで) (削除) 57 (削除ここまで) 56 bytes

Jelly, 13 bytes

How?

Charcoal, 56 bytes

Ruby, 57 bytes

Regex (Perl / PCRE), (削除) 63 (削除ここまで) (削除) 60 (削除ここまで) (削除) 56 (削除ここまで) 53 bytes

Explanation

Regex (.NET), 71 bytes

Explanation

Python3, 114 bytes:

Wolfram Language (Mathematica), 6 bytes

Wolfram Language (Mathematica), 33 bytes

Wolfram Language (Mathematica), 23 bytes

Factor + `match`, 7 bytes

Pip `-x`, 25 bytes

Explanation

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Ragged list pattern matching

Testcases

Truthy

Falsy

14 Answers 14

Elixir, 1 byte

Elixir, 28 bytes

Haskell, 44 bytes

Haskell, 45 bytes

tinylisp, (削除) 71 (削除ここまで) (削除) 81 (削除ここまで) (削除) 70 (削除ここまで) 79 bytes

R, (削除) 111 (削除ここまで) (削除) 110 (削除ここまで) (削除) 104 (削除ここまで) 82 bytes

Python 2, 34 bytes

Retina, (削除) 65 (削除ここまで) (削除) 63 (削除ここまで) (削除) 57 (削除ここまで) 56 bytes

Jelly, 13 bytes

How?

Charcoal, 56 bytes

Ruby, 57 bytes

Regex (Perl / PCRE), (削除) 63 (削除ここまで) (削除) 60 (削除ここまで) (削除) 56 (削除ここまで) 53 bytes

Explanation

Regex (.NET), 71 bytes

Explanation

Python3, 114 bytes:

Wolfram Language (Mathematica), 6 bytes

Wolfram Language (Mathematica), 33 bytes

Wolfram Language (Mathematica), 23 bytes

Factor + match, 7 bytes

Pip -x, 25 bytes

Explanation

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions

Factor + `match`, 7 bytes

Pip `-x`, 25 bytes