This is the result of a three week yak shave that began with trying to use the visitor utility to mutate the AST.
A lot has changed to make things work with my use case, so this PR may not be suitable for this library, but the gist is:
- The AST now uses value semantics (
struct
s not class
es)
- Use keypaths instead of strings (
get
/set
) for traversing the AST
- Existential types throughout the AST (instance fields that are of a protocol type, e.g.
Value
, Definition
) have been removed and replaced with enum
s
Visitor
s are now implemented as protocols
- The
enter
and leave
methods are now type safe: You can only return new nodes that are of the same type
- In general, more functions that took in existential types have been replaced with generic functions
- As a bonus, have AST nodes implement the
TextOutputStreamable
protocol so they can be printed out again
Using structs
I first ran into the fact that the editing part of the visit
function wasn't yet implemented: returning a new node in the visit function didn't actually change anything.
So I began to translate over the graphql-js
reference implementation when I realised a fundamentally limiting factor of the current implementation:
The visit
function should return a new copy of the AST with the modifications applied, so in the javascript implementation they do a deep copy of each node whenever it's mutated. However in Swift, class
es have no practical way of doing such a deep copy.
In Swift, types that need to be copied about with value semantics are typically struct
s: Copying them then becomes a matter of using doing a var astCopy = ast
.
And although this is a significant semantic change, changing this didn't seem to affect much of the existing code nor the test suite.
The next step was to finish implementing the set
functions on the Node
protocol, which uncovered two more problems:
Using KeyPath
s to traverse the AST
The set
functions on the nodes were not implemented yet so I began to fill them out, only to quickly realise that it wasn't possible to assign arbitrary Node
values to the concrete instance variables in a type-safe manner. For reference, the types of the get
and set
functions for traversing the AST were:
protocol Node {
func get(key: String) -> NodeResult?
func set(value: Node?, key: String)
}
public enum NodeResult {
case node(Node)
case array([Node])
}
The value passed to set
is just any arbitrary Node
, and so would need to be downcasted before being set. On top of this, it would require a lot of boilerplate, matching up each string
to each variable.
extension Variable {
func set(value: Node?, key: String) {
switch key {
case "name":
name = value as! Name
default:
fatalError()
}
}
This is less than ideal, since it's using arbitrary strings and duplicates most of the boilerplate that get
is doing anyway. This seemed like a good use for key paths in Swift, which allows for strongly typed paths to the nested fields in each struct.
The immediately obvious solution would be to have each Node
return an array of WritableKeyPath
s to its various children. However we would end up with a heterogenous array, which if we were to type erase would prevent us from actually being able to write to the nodes. And even if we were to try this, there is no type-erased version of WritableKeyPath
: you simply need to know the type of the field you are writing to at compile time, and no you can't just make it a WritableKeyPath<Self, Node>
– Using Self
within the Node
protocol definition prevents it from being used as an existential type i.e. let x: Node
(more on this later), and you can't assign a concrete type Variable
to a Node
either.
So instead the resulting function on each Node
looks like this now:
extension Field {
public mutating func descend(descender: inout Descender) {
descender.descend(&self, \.name)
descender.descend(&self, \.variableDefinitions)
descender.descend(&self, \.directives)
descender.descend(&self, \.selectionSet)
}
}
The Descender
struct contains three generic functions:
struct Descender {
mutating func descend<T, U: Node>(_ node: inout T, _ kp: WritableKeyPath<T, U>)
mutating func descend<T, U: Node>(_ node: inout T, _ kp: WritableKeyPath<T, U?>)
mutating func descend<T, U: Node>(_ node: inout T, _ kp: WritableKeyPath<T, [U]>)
}
And this works great for allowing us to traverse the children in the AST, handling nullable children as well as any lists that may occur.
However, some of the Node
s' children were existentially typed: For example, SelectionSet
has an array of Selection
s:
struct SelectionSet {
var definitions: [Selection]
}
protocol Selection {}
extension Field : Definition {}
extension FragmentSpread : Definition {}
extension InlineFragment : Definition {}
And because of this, we get the "Protocol 'Selection' as a type cannot conform to 'Node'" error when trying to use the key path.
Replacing existential types with enums
Existential types in Swift – that is, naked uses of protocols in type signatures like var definitions: [Definition]
– are a bit of a foot-gun, and carry with them some restrictions as mentioned above. There's an evolution proposal to make this syntactically explicit, and a lot of the time generics are a better fit.
But until we get generalised existential types, I had to look around for an alternative.
I decided to replace the existential types with enums, essentially turning the AST into an algebraic data type:
public enum Selection: EnumNode, Equatable {
case field(Field)
case fragmentSpread(FragmentSpread)
case inlineFragment(InlineFragment)
}
Using enums to represent the sum types has sum extra benefits like not needing to downcast to extract the inner types and being able to exhaustively pattern match.
AnyKeyPath
Because we're using key paths to access the fields within the nodes, we could represent the path
that gets passed to the Visitor
by chaining them together with appending(path: KeyPath)
. This would have been a cool use of key paths, but unfortunately, Swift does not currently have support for key paths on enums.
So instead we have to use just an array of AnyKeyPath
, and the descend
method is implemented a bit differently on the enum nodes so that they don't actually leave a path.
Type safe enter and leave functions
The key paths passed to the Descender
still need to be strongly typed i.e. WritableKeyPath
with a known type, since Swift needs to be able to type check that we are assigning a value of the type to the path. But if we look at the type signature for the enter
and leave
functions, we find ourselves with two more existential types:
struct Visitor {
typealias Visit = (Node, IndexPathElement?, NodeResult?, [IndexPathElement], [NodeResult]) -> VisitResult
}
enum VisitResult {
case `continue`
case skip
case `break`
case node(Node?)
}
The existential type passed as an argument, and the existential type returned from the function. As mentioned earlier, existential types come with a lot of restrictions, one of them being that we can't use a value of Node
to write to a key path of type WritableKeyPath<Field, Name>
: We simply need the guarantee that the Node
value inside the VisitResult
actually is of the correct type.
And with the current API, there's nothing stopping the user from returning a Field
to replace a Name
node in the tree.
Since we need to work with concrete types again, lets start by making VisitResult
generic so we can make sure that users are replacing the node with the same type of node.
enum VisitResult<T: Node> {
case `continue`, skip, `break`, node(T?)
}
Then, we just need to make the closure generic so it takes in a node of type T
and returns a result of type VisitResult<T>
– but as it turns out, closures cannot be generic in Swift.
Functions can be generic though, and you can define generic function requirements on protocols. So why not try adopting a more protocol-orientated design and make Visitor
a protocol that users of the library can adopt?
protocol Visitor {
func enter<T: Node>(node: T, key: AnyKeyPath?, parent: VisitorParent?, path: [AnyKeyPath], ancestors: [VisitorParent]) -> VisitResult<T>
func leave<T: Node>(node: T, key: AnyKeyPath?, parent: VisitorParent?, path: [AnyKeyPath], ancestors: [VisitorParent]) -> VisitResult<T>
}
This looks like a nice design, now we can just do a quick cast at runtime to differentiate the concrete type:
struct MyVisitor: Visitor {
func enter<T: Node>(node: T, key: AnyKeyPath?, parent: VisitorParent?, path: [AnyKeyPath], ancestors: [VisitorParent]) -> VisitResult<T> {
switch node {
case let name as Name:
return .node(Name(loc: name.loc, value: "hello")) // Cannot convert value of type 'Name' to expected argument type 'T?'
default:
return .continue
}
}
}
But we run into a type error when trying to return a VisitResult
of a concrete type. We know for a fact that Name
is the same type asT
because we matched on it, so unfortunately we have to force downcast it. But we can hide this away from users by doing the down casting within the library, and delegating to functions with concrete types for each node.
protocol Visitor {
func enter(name: Name, key: AnyKeyPath?, parent: VisitorParent?, ancestors: [VisitorParent]) -> VisitResult<Name>
func leave(name: Name, key: AnyKeyPath?, parent: VisitorParent?, ancestors: [VisitorParent]) -> VisitResult<Name>
func enter(document: Document, key: AnyKeyPath?, parent: VisitorParent?, ancestors: [VisitorParent]) -> VisitResult<Document>
func leave(document: Document, key: AnyKeyPath?, parent: VisitorParent?, ancestors: [VisitorParent]) -> VisitResult<Document>
func enter(definition: Definition, key: AnyKeyPath?, parent: VisitorParent?, ancestors: [VisitorParent]) -> VisitResult<Definition>
func leave(definition: Definition, key: AnyKeyPath?, parent: VisitorParent?, ancestors: [VisitorParent]) -> VisitResult<Definition>
}
extension Visitor {
func enter(name: Name, key: AnyKeyPath?, parent: VisitorParent?, ancestors: [VisitorParent]) -> VisitResult<Name> { .continue }
func leave(name: Name, key: AnyKeyPath?, parent: VisitorParent?, ancestors: [VisitorParent]) -> VisitResult<Name> { .continue }
func enter(document: Document, key: AnyKeyPath?, parent: VisitorParent?, ancestors: [VisitorParent]) -> VisitResult<Document> { .continue }
func leave(document: Document, key: AnyKeyPath?, parent: VisitorParent?, ancestors: [VisitorParent]) -> VisitResult<Document> { .continue }
func enter(definition: Definition, key: AnyKeyPath?, parent: VisitorParent?, ancestors: [VisitorParent]) -> VisitResult<Definition> { .continue }
func leave(definition: Definition, key: AnyKeyPath?, parent: VisitorParent?, ancestors: [VisitorParent]) -> VisitResult<Definition> { .continue }
func enter<T: Node>(node: T, key: AnyKeyPath?, parent: VisitorParent?, path: [AnyKeyPath], ancestors: [VisitorParent]) -> VisitResult<T> {
switch node {
case let name as Name:
return enter(name: name, key: key, parent: parent, ancestors: ancestors) as! VisitResult<T>
case let document as Document:
return enter(document: document, key: key, parent: parent, ancestors: ancestors) as! VisitResult<T>
case let definition as Definition:
return enter(definition: definition, key: key, parent: parent, ancestors: ancestors) as! VisitResult<T>
...
}
}
This is the same approach that SwiftSyntax takes to their visitor:
https://github.com/apple/swift-syntax/blob/7117363d54ed5dfdbcb5eaa4c81697923c9768db/Sources/SwiftSyntax/SyntaxVisitor.swift.gyb#L40-L53
But they are using gyb, and it leaves us with a lot of boilerplate. If anyone has any ideas for a better design/way to improve this, I'd love to hear them.
In any case, the finished design gives us a nice API surface for editing the AST:
struct IntIncrementer: Visitor {
func enter(intValue: IntValue, key: AnyKeyPath?, parent: VisitorParent?, ancestors: [VisitorParent]) -> VisitResult<IntValue> {
let newVal = Int(intValue.value)! + 1
return .node(IntValue(loc: nil, value: String(newVal)))
}
}
let document = try! parse(source: """
query foo {
bar(a: 1, b: 2, c: 3)
}
""")
let visitor = IntIncrementer()
let newDocument = visit(root: document, visitor: visitor)
In a way thats type-safe, and plays well with value semantics.
Uh oh!
There was an error while loading. Please reload this page.
This is the result of a three week yak shave that began with trying to use the visitor utility to mutate the AST.
A lot has changed to make things work with my use case, so this PR may not be suitable for this library, but the gist is:
struct
s notclass
es)get
/set
) for traversing the ASTValue
,Definition
) have been removed and replaced withenum
sVisitor
s are now implemented as protocolsenter
andleave
methods are now type safe: You can only return new nodes that are of the same typeTextOutputStreamable
protocol so they can be printed out againUsing structs
I first ran into the fact that the editing part of the
visit
function wasn't yet implemented: returning a new node in the visit function didn't actually change anything.So I began to translate over the
graphql-js
reference implementation when I realised a fundamentally limiting factor of the current implementation:The
visit
function should return a new copy of the AST with the modifications applied, so in the javascript implementation they do a deep copy of each node whenever it's mutated. However in Swift,class
es have no practical way of doing such a deep copy.In Swift, types that need to be copied about with value semantics are typically
struct
s: Copying them then becomes a matter of using doing avar astCopy = ast
.And although this is a significant semantic change, changing this didn't seem to affect much of the existing code nor the test suite.
The next step was to finish implementing the
set
functions on theNode
protocol, which uncovered two more problems:Using
KeyPath
s to traverse the ASTThe
set
functions on the nodes were not implemented yet so I began to fill them out, only to quickly realise that it wasn't possible to assign arbitraryNode
values to the concrete instance variables in a type-safe manner. For reference, the types of theget
andset
functions for traversing the AST were:The value passed to
set
is just any arbitraryNode
, and so would need to be downcasted before being set. On top of this, it would require a lot of boilerplate, matching up eachstring
to each variable.This is less than ideal, since it's using arbitrary strings and duplicates most of the boilerplate that
get
is doing anyway. This seemed like a good use for key paths in Swift, which allows for strongly typed paths to the nested fields in each struct.The immediately obvious solution would be to have each
Node
return an array ofWritableKeyPath
s to its various children. However we would end up with a heterogenous array, which if we were to type erase would prevent us from actually being able to write to the nodes. And even if we were to try this, there is no type-erased version ofWritableKeyPath
: you simply need to know the type of the field you are writing to at compile time, and no you can't just make it aWritableKeyPath<Self, Node>
– UsingSelf
within theNode
protocol definition prevents it from being used as an existential type i.e.let x: Node
(more on this later), and you can't assign a concrete typeVariable
to aNode
either.So instead the resulting function on each
Node
looks like this now:The
Descender
struct contains three generic functions:And this works great for allowing us to traverse the children in the AST, handling nullable children as well as any lists that may occur.
However, some of the
Node
s' children were existentially typed: For example,SelectionSet
has an array ofSelection
s:And because of this, we get the "Protocol 'Selection' as a type cannot conform to 'Node'" error when trying to use the key path.
Replacing existential types with enums
Existential types in Swift – that is, naked uses of protocols in type signatures like
var definitions: [Definition]
– are a bit of a foot-gun, and carry with them some restrictions as mentioned above. There's an evolution proposal to make this syntactically explicit, and a lot of the time generics are a better fit.But until we get generalised existential types, I had to look around for an alternative.
I decided to replace the existential types with enums, essentially turning the AST into an algebraic data type:
Using enums to represent the sum types has sum extra benefits like not needing to downcast to extract the inner types and being able to exhaustively pattern match.
AnyKeyPath
Because we're using key paths to access the fields within the nodes, we could represent the
path
that gets passed to theVisitor
by chaining them together withappending(path: KeyPath)
. This would have been a cool use of key paths, but unfortunately, Swift does not currently have support for key paths on enums.So instead we have to use just an array of
AnyKeyPath
, and thedescend
method is implemented a bit differently on the enum nodes so that they don't actually leave a path.Type safe enter and leave functions
The key paths passed to the
Descender
still need to be strongly typed i.e.WritableKeyPath
with a known type, since Swift needs to be able to type check that we are assigning a value of the type to the path. But if we look at the type signature for theenter
andleave
functions, we find ourselves with two more existential types:The existential type passed as an argument, and the existential type returned from the function. As mentioned earlier, existential types come with a lot of restrictions, one of them being that we can't use a value of
Node
to write to a key path of typeWritableKeyPath<Field, Name>
: We simply need the guarantee that theNode
value inside theVisitResult
actually is of the correct type.And with the current API, there's nothing stopping the user from returning a
Field
to replace aName
node in the tree.Since we need to work with concrete types again, lets start by making
VisitResult
generic so we can make sure that users are replacing the node with the same type of node.Then, we just need to make the closure generic so it takes in a node of type
T
and returns a result of typeVisitResult<T>
– but as it turns out, closures cannot be generic in Swift.Functions can be generic though, and you can define generic function requirements on protocols. So why not try adopting a more protocol-orientated design and make
Visitor
a protocol that users of the library can adopt?This looks like a nice design, now we can just do a quick cast at runtime to differentiate the concrete type:
But we run into a type error when trying to return a
VisitResult
of a concrete type. We know for a fact thatName
is the same type asT
because we matched on it, so unfortunately we have to force downcast it. But we can hide this away from users by doing the down casting within the library, and delegating to functions with concrete types for each node.This is the same approach that SwiftSyntax takes to their visitor:
https://github.com/apple/swift-syntax/blob/7117363d54ed5dfdbcb5eaa4c81697923c9768db/Sources/SwiftSyntax/SyntaxVisitor.swift.gyb#L40-L53
But they are using gyb, and it leaves us with a lot of boilerplate. If anyone has any ideas for a better design/way to improve this, I'd love to hear them.
In any case, the finished design gives us a nice API surface for editing the AST:
In a way thats type-safe, and plays well with value semantics.