I have implemented Python-style generators within Java. I might be reinventing the wheel here, but I feel that the ability to define a Generator with an anonymous class is the most flexible approach. Here's the relevant code:
generator/Generator.java
public abstract class Generator<T> implements Iterable<T>, Iterator<T>
{
private Lock lock = null;
private Lock lock2 = null;
private Semaphore semaphore = null;
private T curr;
private Thread execution;
private Runnable onTermination = null;
private Consumer<? super T> forEachAction = null;
public Generator(Object... params)
{
execution = new Thread(() ->
{
try
{
T t = get(params);
onTermination = ThrowingRunnable.of(execution::join);
yield(t);
getPermit();
}
catch(Exception unexpected)
{
onTermination = ThrowingRunnable.of(() ->
{
Exception e = new NoSuchElementException("Failed to retrieve element!");
e.initCause(unexpected);
throw e;
});
semaphore.release();
}
});
execution.setDaemon(true);
}
@Override
public final Iterator<T> iterator()
{
return this;
}
@Override
public final boolean hasNext()
{
return onTermination == null;
}
@Override
public final T next()
{
if(!hasNext())
throw new NoSuchElementException(
"There are no more elements to be generated by this Generator!");
if(semaphore == null && lock == null)
{
lock = new ReentrantLock();
lock2 = new ReentrantLock();
lock.lock();
semaphore = new Semaphore(0);
execution.start();
getPermit();
return curr;
}
lock2.lock();
lock.unlock();
getPermit();
lock.lock();
lock2.unlock();
getPermit();
if(onTermination != null)
{
lock.unlock();
onTermination.run();
}
return curr;
}
protected final void yield(T t)
{
if(forEachAction != null)
{
forEachAction.accept(t);
return;
}
curr = t;
semaphore.release();
lock.lock();
lock.unlock();
semaphore.release();
lock2.lock();
lock2.unlock();
}
private final void getPermit()
{
try
{
if(semaphore != null)
semaphore.acquire();
}
catch(InterruptedException e)
{
e.printStackTrace();
}
}
/**
* Consumes all remaining elements possible. Obviously, don't use on
* infinite Generators.
*/
@Override
public void forEach(Consumer<? super T> action)
{
Objects.requireNonNull(action);
if(!hasNext())
throw new IllegalStateException("Exhausted elements before calling forEach!");
forEachAction = action;
if(execution.isAlive())
{
lock.unlock();
}
else
{
try
{
execution.start();
}
catch(IllegalThreadStateException itse)
{
itse.initCause(// double checking
new IllegalStateException("Can't exhaust elements and then call forEach!"));
throw itse;
}
}
ThrowingRunnable.of(execution::join).run();
onTermination.run();
}
protected abstract T get(Object... objs);
}
This is the code that I use for ignoring compile-time exceptions from lambdas (which should be thrown at runtime, with the default handler).
throwing/ThrowingRunnable.java
@FunctionalInterface
public interface ThrowingRunnable extends Runnable, ExceptionFlowController
{
public abstract void run_() throws Exception;
@Override
default void run()
{
try
{
run_();
}
catch (Exception e)
{
handle(e);
}
}
static Runnable of(ThrowingRunnable tr, Consumer<Exception> h)
{
return new ThrowingRunnable()
{
public void run_() throws Exception
{
tr.run_();
}
public void handle(Exception e)
{
h.accept(e);
}
};
}
static Runnable of(ThrowingRunnable tr)
{
return tr;
}
}
throwing/ExceptionFlowController.java
/**
* Controls exception flow by piping it into a handler.
*/
public interface ExceptionFlowController
{
public default void handle(Exception e)
{
ThrowingUtil.raise(e);
}
}
throwing/ThrowingUtil.java
public class ThrowingUtil
{
@SuppressWarnings("unchecked")
static <E extends Exception> void raise(Exception e) throws E
{
throw (E) e;// sneakyThrow if you google it, restricted to exceptions only
}
}
Here's an example of using a Generator to print the first 92 Fibonacci numbers (until 64 bits is no longer enough):
Main.java
public class Main
{
public static void main(String[] args)
{
Generator<Number> g = new Generator<>()
{
public Number get(Object[] o)
{
return get(0, 1);
}
private Number get(long fib0, long fib1)
{
yield(fib0);
fib0 += fib1;
yield(fib1);
fib1 += fib0;
if(fib0 < 0 || fib1 < 0)
return null;
return get(fib0, fib1);
}
};
StreamSupport.stream(g.spliterator(),false)
.takeWhile(Objects::nonNull)
.forEach(System.out::println);
}
}
Output:
0
1
1
2
...
4660046610375530309
I was a bit disappointed with the amount of concurrent primitive vomit that I had to use in order to ensure Generators were synchronized properly. Keeping this in mind, here's what I'd like to know:
- Any generic code quality suggestions/opinions/revisions.
- How can I cut down on the number of Locks and Semaphores/usages of Locks and Semaphores (maybe using well-named condition variables)?
Edit:
Here's an example where using a Generator is massively convenient compared to creating a stateful Supplier/Iterator to do the same thing:
public static void main(String[] args)
{
// 0
// 1 2
// _ 6 3 4
//_ _ 8 9 5 _ 7 _
Node n0 = new Node(), n1 = new Node(), n2 = new Node(),
n3 = new Node(), n4 = new Node(), n5 = new Node(),
n6 = new Node(), n7 = new Node(), n8 = new Node(),
n9 = new Node();
n0.left = n1;
n0.right = n2;
n2.left = n3;
n2.right = n4;
n3.left = n5;
n1.right = n6;
n4.left = n7;
n6.left = n8;
n6.right = n9;
Generator<Node> g = new Generator<>()
{
public Node get(Object[] o)
{
return get(n0);
}
private Node get(Node n)
{
if(n.left != null)
get(n.left);
yield(n);
if(n.right != null)
get(n.right);
return null;
}
};
Generator<Node> rightMost7Nodes = new Generator<Node>()
{
int count = 0;
int target = 7;
public Node get(Object[] o)
{
return get(n0);
}
private Node get(Node n)
{
if(n.right != null)
get(n.right);
if(count++ >= target)
return null;
yield(n);
if(n.left != null)
get(n.left);
return null;
}
};
System.out.println("Nodes in-order:");
StreamSupport.stream(g.spliterator(),false)
.takeWhile(o -> g.hasNext()) //ignore last element
.forEach(System.out::println);
System.out.println("Right-most 7 nodes in reverse order:");
StreamSupport.stream(rightMost7Nodes.spliterator(),false)
.takeWhile(o -> g.hasNext()) //ignore last element
.forEach(System.out::println);
}
Output:
Nodes in-order:
Node(1)
Node(8)
Node(6)
...
Node(2)
Node(7)
Node(4)
Right-most 7 nodes in reverse order:
Node(4)
Node(7)
Node(2)
Node(3)
Node(5)
Node(0)
Node(9)
The flexibility provided by being able to yield
mid-logic makes it extremely simple for the programmer to modify the ordering and stored state during stream creation. If the same were to be done with a Supplier/Iterator, the need to correctly modify stored state during the traversal could be unnecessarily complex (using Morris Traversal or a stack-based approach is fine for full iteration, but can get complicated when you stop midway). Furthermore, the code would be inflexible to modify for other types of traversals (pre-order/post-order). For this reason, I plan to use my Generator implementation relatively frequently - which is why I'd like for it to be reviewed as per questions 1 and 2 :)
2 Answers 2
Implementing both Iterable and Iterator at the same time is a bit weird choice and while the API documentation for Iterator makes no claims about the implementation, I think most people would assume that subsequent calls to iterator() return a different object each time and those objects, if used for reading only, do not interfere with each other.
-
\$\begingroup\$ Thanks! You're right that it seems weird. I'll probably end up removing the Iterable functionality (
iterator()
method) and renameforEach
toforEachRemaining
as per the Iterator API (which makes more sense per my comment "Consumes all remaining elements possible"). \$\endgroup\$Avi– Avi2019年08月29日 16:16:06 +00:00Commented Aug 29, 2019 at 16:16
Ok, so, per @markspace, you've made this a lot more complicated than it has to be. In Python, yield
is being used to save the state of a function. In Java, you'd just use a stateful object that executes the desired function. I'm pretty sure you could either create a FibonacciIterator
or a FibonacciSupplier
and meet your requirements.
In either case, every time you call get()
/next()
, the code runs until it hits a return
(yield). Then state is preserved and control flow returns to the calling code. The next time get()
/next()
is called, execution continues from the preserved state. Both classes provide an infinite, ordered, stateful stream of elements. Supplier
can be plugged into Stream::generate
, while Iterable
can be iterated over.
It is my (limited) understanding that a Python Generator function
is just syntactic sugar that creates a Python Iterator
which tracks execution state. This is a convenience so you can work with a function instead of an object. In your Java code, you're already tracking state yourself in your client method - by creating an infinitely deep call stack recursing on get
with the new arguments.
If you think I'm mistaken, can you please provide a specific case that the classes below do not solve?
public final class FibonnaciSupplier implements Supplier<Integer> {
private int currentNumber = 0;
private int nextNumber = 1;
@Override
public Integer get() {
final int result = this.currentNumber;
final int sum = this.currentNumber + this.nextNumber;
this.currentNumber = this.nextNumber;
this.nextNumber = sum;
return Integer.valueOf(result);
}
}
public final class FibonnaciIterator implements Iterator<Integer> {
private int currentNumber = 0;
private int nextNumber = 1;
@Override
public boolean hasNext() {
return true;
}
@Override
public Integer next() {
final int result = this.currentNumber;
final int sum = this.currentNumber + this.nextNumber;
this.currentNumber = this.nextNumber;
this.nextNumber = sum;
return Integer.valueOf(result);
}
}
To see an example of the stack overflow issue, try the following generator.
Generator<Number> g = new Generator<Number>() {
public Number get(Object[] o) {
return get(0);
}
private Number get(long currentNumber) {
yield(currentNumber);
currentNumber += 1;
if (currentNumber < 0)
return null;
return get(currentNumber);
}
};
You can also put a breakpoint on the line yield(currentNumber)
, run your debugger through a few calls to get()
, and look at the call stack. It'll look something like:
Daemon Thread [Thread-0] (Suspended (breakpoint at line 18 in Main1ドル))
Main1ドル.get(long) line: 18
Main1ドル.get(long) line: 22
Main1ドル.get(long) line: 22
Main1ドル.get(long) line: 22
Main1ドル.get(Object[]) line: 14
Main1ドル.get(Object[]) line: 1
Main1ドル(Generator).lambda0ドル(Object[]) line: 25
232824863.run() line: not available Thread.run() line: 745
Those repeated get()
calls on line 22 are you stepping into a new stack frame every time get()
is invoked recursively.
-
\$\begingroup\$ I now realize that using
Supplier
andStream::generate
is the best way to go about simple element stream generation (possibly in conjunction withlimit(n)
to make the stream finite). However, I would argue that persisting state explicitly would be increasingly difficult for increasingly complex states (unlike Fibonacci). For example, providing an reverse in-order stream of elements from a binary tree (e.g. right-most n elements in a tree) would be simple using a recursive generator, whereas an iterative state-based approach would be far more complicated. \$\endgroup\$Avi– Avi2019年08月29日 16:26:26 +00:00Commented Aug 29, 2019 at 16:26 -
\$\begingroup\$ I'm not saying don't use recursion. I'm saying don't wrap a recursive method call in an unboundedly-increasing call stack with totally unnecessary concurrency checking. The
Supplier
/Iterator
implementation can use recursion as necessary, and still remember state as part of the object. \$\endgroup\$Eric Stein– Eric Stein2019年08月29日 16:47:04 +00:00Commented Aug 29, 2019 at 16:47 -
\$\begingroup\$ @Avi Would you mind adding a new generator implementation to your question which shows the issue you're talking about, and how you think your generator solves a problem that a simple iterator/supplier cannot? \$\endgroup\$Eric Stein– Eric Stein2019年08月29日 16:54:42 +00:00Commented Aug 29, 2019 at 16:54
-
\$\begingroup\$ I'm saying that the
Supplier/Iterator
implementation makes recursion unnecessarily complicated. The Fibonacci generator could have easily contained this:for(int i = 0; i < 46; i++) { yield(fib0); fib0 += fib1; yield(fib1); fib1 += fib0; } return null;
in itsget(int, int)
method and produced the same results. Iterative implementations with state are relatively easy in allSupplier/Iterator/Generator
approachs. w.r.t your latest request, I'll post a generator for a binary tree once I have it complete. \$\endgroup\$Avi– Avi2019年08月29日 16:56:06 +00:00Commented Aug 29, 2019 at 16:56
Explore related questions
See similar questions with these tags.
Stream.generate()
which seems a good deal simpler than your implementation: docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/… \$\endgroup\$yield
statements will pause a function, save all state (context), and continue from the lastyield
on successive calls tonext()
.Stream::generate
is insufficiently powerful for my desired use cases, as it provides only an infinite, unordered stream of elements, whereas my Generator intends to provide (in/)finite, ordered, stateful stream of elements. According to Stream's Javadoc: behavioral parameters "in most cases must be stateless (their result should not depend on any state that might change ...)". \$\endgroup\$next()
to switch to the Generator'sget
logic until such time as the Generatoryield
s orreturns
, at which point control flow should resume in the thread that callednext()
with the correct return value. This sounds extremely similar to a context switch to me, which is why I used threads. \$\endgroup\$