How to make this faster? I want to remove elements that satisfy the condition below but it's way too slow.
tup1 = Tuples[{{0, 1}, {-1, 0, 1}, {-1, 0, 1}, {0, 1}, {-1, 0,
1}, {-1, 0, 1}, {0, 1}, {-1, 0, 1}, {-1, 0, 1}, {0, 1}, {-1, 0,
1}, {-1, 0, 1}}];
tup2 = DeleteDuplicates[
tup1, (#1[[1 ;; 3]] == #2[[4 ;;
6]]) && (#1[[4 ;; 6]] == #2[[1 ;; 3]] && #1[[7 ;;
9]] == #2[[10 ;; 12]]) && (#1[[10 ;; 12]] == #2[[7 ;;
9]]) &];
3 Answers 3
Here is a semi-imperative way that gives the same result as tup2 from the original question, but much faster:
tup2b = Module[{keep}
, keep[t_] :=
( keep[t] = False
; keep[t[[{4,5,6,1,2,3,10,11,12,7,8,9}]]] = False
; True
)
; Select[tup1, keep]
];
tup2b === tup2
(* True *)
Length[tup2b] === Length[tup2] === 52650
(* True *)
On my machine, the calculation of tup2 takes a couple of hours whereas the approach shown here is subsecond. This approach also makes it easy to add other equivalence criteria if desired.
How It Works
The function keep is used as a predicate to Select to determine whether to keep each entry of the list. The first time each element is encountered, keep returns True. But as a side-effect it also adds a new definition to keep that records that the entry and its equivalent permutation are no longer to be kept. The new definition will return False if either entry is encountered later in the list. In this way keep effectively maintains a set of entries seen so far (along with their equivalents).
For the example list given in the question, it is not strictly necessary to record that we have seen each unpermuted entry since there are no duplicates. But in the general case that might not be so.
This method scans each entry of the list once, so it runs in time roughly proportional to the length $n$ of the list. Technically, the time is more on the order of $n \ln(n)$ due to the hashing involved in saving and testing entries, but for small $n$ the difference is not that noticeable.
By contrast, and as Leonid Shifrin points out, DeleteDuplicates must in principle compare every pair of elements, so the run time is proportional to $n^2$ -- a much larger number of iterations.
-
1$\begingroup$ wondering if
Pickmight be slightly more readable here. $\endgroup$AccidentalFourierTransform– AccidentalFourierTransform2021年01月06日 01:24:22 +00:00Commented Jan 6, 2021 at 1:24 -
$\begingroup$ @AccidentalFourierTransform It is probably just a matter of taste, but I think I prefer the
Selectform over something likePick[tup1, keep /@ tup1]. $\endgroup$WReach– WReach2021年01月06日 03:32:48 +00:00Commented Jan 6, 2021 at 3:32 -
$\begingroup$ Sure, it is definitely a matter of taste. My point was that
Picktogether with a minor modification ofkeepmight be ever so slightly cleaner/more readable. Or using the third argument. Or, being optimistic, perhaps there is a way to avoid having to definekeepaltogether. But I'm really just thinking out loud so don't take me too seriously :-) $\endgroup$AccidentalFourierTransform– AccidentalFourierTransform2021年01月06日 12:48:29 +00:00Commented Jan 6, 2021 at 12:48 -
1$\begingroup$ +1. It might be worth mentioning that
DeleteDuplicateswith explicit predicate uses quadratic complexity algorithm based on pairwise comparisons, which explains the timing difference, as well as the existence ofDeleteDuplicatesByas a separate function. $\endgroup$Leonid Shifrin– Leonid Shifrin2021年01月07日 02:26:58 +00:00Commented Jan 7, 2021 at 2:26 -
1$\begingroup$ I have added the new section How It Works. (also @LeonidShifrin) $\endgroup$WReach– WReach2021年01月08日 12:32:54 +00:00Commented Jan 8, 2021 at 12:32
You can also get tup2 from tup1 using:
1. Union
ClearAll[fA]
fA = Union[Sort[{#, #[[{4, 5, 6, 1, 2, 3, 10, 11, 12, 7, 8, 9}]]}] & /@ #][[All, 1]] &;
tup2A = fA @ tup1; // AbsoluteTiming // First
0.213222
Length @ tup2A
52650
2. DeleteDuplicates
ClearAll[fB]
fB = DeleteDuplicates[
Sort[{#, #[[{4, 5, 6, 1, 2, 3, 10, 11, 12, 7, 8, 9}]]}] & /@ #][[All, 1]] &;
tup2B = fB @ tup1; // AbsoluteTiming // First
0.257217
3. GroupOrbits + PermutationGroup
ClearAll[fC]
fC = GroupOrbits[PermutationGroup[{{4, 5, 6, 1, 2, 3, 10, 11, 12, 7, 8, 9}}], #,
Permute][[All, 1]] &
tup2C = fC @ tup1; // AbsoluteTiming // First
0.640413
4. Memoization
ClearAll[fD]
fD = Module[{f0},
f0[x_] := (f0[x] = f0[x[[{4, 5, 6, 1, 2, 3, 10, 11, 12, 7, 8, 9}]]] = Sequence[]; x);
f0 /@ #] &;
tup2D = fD @ tup1; // AbsoluteTiming // First
0.794055
5. DeleteDuplicatesBy
ClearAll[fE]
fE = DeleteDuplicatesBy[Sort[{#, #[[{4, 5, 6, 1, 2, 3, 10, 11, 12, 7, 8, 9}]]}]&]
tup2E = fE @ tup1; // AbsoluteTiming // First
1.13389
6. GroupBy
ClearAll[fF]
fF = Values @ GroupBy[#,
Sort[{#, #[[{4, 5, 6, 1, 2, 3, 10, 11, 12, 7, 8, 9}]]}] &,
First] &;
tup2F = fF @ tup1; // AbsoluteTiming // First
1.28655
All six results match tup2b from WReach's answer:
tup2b == tup2A == tup2B == tup2C == tup2D == tup2E == tup2F
True
In comparison, tup2b takes about a second:
tup2b = Module[{keep},
keep[t_] := (keep[t] = False;
keep[t[[{4, 5, 6, 1, 2, 3, 10, 11, 12, 7, 8, 9}]]] = False;
True); Select[tup1, keep]]; // AbsoluteTiming // First
1.06063
A tuple is deleted if both its first two triples and its second two triples are identical when exchanged (but not when they're the same when not swapped). That we can construct the undeleted tuples directly like so:
tup2 =
Module[{
pair = Tuples[Tuples[{{0, 1}, {0, -1, 1}, {0, -1, 1}}], 2],
unique
},
unique = DeleteDuplicates[pair, #1 === Reverse@#2 &];
Flatten /@
DeleteDuplicates@
Join[Tuples[{unique, pair}], Tuples[{pair, unique}]]];
Takes about a tenth of a second on my laptop.
UPDATE: If you want to use the existing tup1 and delete the duplicates, you can keep the ones that appear in tup2 like so:
tup3 = Intersection[tup1, tup2];
which is very fast. If for some reason you need to keep tup1 in the original order, you can do something a bit slower:
tup4 = Select[tup1, AssociationThread[tup2 -> True]];
This takes another 0.2 seconds on my laptop.
Given the way the problem is stated, it's much easier to construct tuples you know are unique than to delete duplicates after the fact.
-
1$\begingroup$ I don't get this. Your
pairstructure is different from mine. Can this method be used if I already havetup1as above? $\endgroup$emnha– emnha2021年01月05日 15:35:00 +00:00Commented Jan 5, 2021 at 15:35
Explore related questions
See similar questions with these tags.