Porting to Dart and benchmarking a bit #638
-
Hi! As you may have seen, I've been porting agent-device to Dart and trying to maintain some parity with upstream agent-device.
One challenge I'm seeing for both Flutter and React Native apps is flakiness. Do you see it in your tests as well? What I mean is that sometimes even the original agent-device misses some taps or is unable to scroll.
I've also noticed that the iOS runner does not execute gestures like pan/rotate/fling on the RN test app. Is that expected?
I decided to build some benchmarks to see whether this issue comes from vibe-coding agent-device-dart or is inherent to the runner approach. The main difference between agent-device and agent-device-dart is that the latter does not use a daemon.
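To make "flakiness" concrete, here is a rough sketch of how a repeated-run measurement could look. It is not the harness from either project: `Attempt`, `repeat`, and `summarize` are hypothetical names, and the `run` callback stands in for whatever replays a single tap or scroll and checks that it had the expected effect.

```typescript
// One attempt at a replayed command (e.g. a tap or a scroll).
interface Attempt {
  ok: boolean;        // did the command have the expected effect?
  durationMs: number; // wall-clock time for this attempt
}

interface Summary {
  attempts: number;
  successRate: number; // fraction in [0, 1]; NaN when there are no attempts
  medianMs: number;    // median duration over all attempts
}

function median(values: number[]): number {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

function summarize(attempts: Attempt[]): Summary {
  const passed = attempts.filter((a) => a.ok).length;
  return {
    attempts: attempts.length,
    successRate: attempts.length === 0 ? NaN : passed / attempts.length,
    medianMs: median(attempts.map((a) => a.durationMs)),
  };
}

// Runs a hypothetical command `n` times; a thrown error counts as a failure.
async function repeat(
  n: number,
  run: () => Promise<boolean>,
): Promise<Attempt[]> {
  const out: Attempt[] = [];
  for (let i = 0; i < n; i++) {
    const start = Date.now();
    let ok = false;
    try {
      ok = await run();
    } catch {
      ok = false;
    }
    out.push({ ok, durationMs: Date.now() - start });
  }
  return out;
}
```

Running the same fixture step through both tools with the same `n` and comparing the two summaries would show whether missed taps are specific to one implementation or shared by both.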
Things that made the port look good (credit to your design):
- After a clean runner build, snapshot/interaction parity is close. Your daemon keeps per-command device ops fast; the Dart port (no daemon) is a bit faster at cold open and process startup, but each command is slightly slower.
- The fixture + replay format made a rigorous A/B benchmark possible at all.
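The startup vs. per-command trade-off from the first point can be sketched as a simple linear cost model. This is only an illustration: `CostModel`, `totalMs`, and `breakEven` are hypothetical names, and the numbers in the usage lines are made up for the example, not taken from the reports.

```typescript
// Linear cost model: a fixed startup cost plus a constant cost per command.
interface CostModel {
  startupMs: number;
  perCommandMs: number;
}

function totalMs(model: CostModel, commands: number): number {
  return model.startupMs + commands * model.perCommandMs;
}

// Command count at which both models cost the same, or null when the
// two lines never cross for a positive number of commands.
function breakEven(a: CostModel, b: CostModel): number | null {
  const slopeDiff = b.perCommandMs - a.perCommandMs;
  if (slopeDiff === 0) return null;
  const n = (a.startupMs - b.startupMs) / slopeDiff;
  return n > 0 ? n : null;
}

// Illustrative values only (NOT measured): slower start with cheaper
// commands vs. faster start with costlier commands.
const withDaemon: CostModel = { startupMs: 1000, perCommandMs: 100 };
const withoutDaemon: CostModel = { startupMs: 400, perCommandMs: 150 };

console.log(breakEven(withDaemon, withoutDaemon)); // 12
```

Under this model, sessions shorter than the break-even count favor the no-daemon approach and longer ones favor the daemon; the measured values would have to come from the reports below.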
Results:
- https://github.com/orestesgaolin/agent-device-dart/blob/main/benchmark/report/REPORT-flutter-fixture-ios.md
- https://github.com/orestesgaolin/agent-device-dart/blob/main/benchmark/report/REPORT-flutter-fixture-android.md
- https://github.com/orestesgaolin/agent-device-dart/blob/main/benchmark/report/REPORT-expo-test-app-ios.md
Full disclosure: my port is fully vibe-coded and I'm treating it mostly as an experiment ;) I have no intention of building a competitor.
Replies: 2 comments
-
Thanks so much for flagging that! Reliability issues do happen, and they're often a bug on my end. I'll work on some perf and reliability fixes for that :)
-
Btw, you've also made a good point that doing such ports can be beneficial for the original tool as well, so keep on vibing please!
I've also recently added some lightweight perf reporting on the JS side, which shows that initial commands complete in well under 100ms. I'm also adding an end-to-end perf suite covering all commands here: #630.
The reality is that the slow part is the XCUITest instrumentation. Node.js is pretty quick where it needs to be at this point, and I'm sure we can make it quicker with some tweaks.