Algorithm to group connected graphs

Question 1

I am working on an algorithm, which should group connected Arcs. The situation is described via code examples below.

I have Arcs:

public class Arc
{
 public Point StartPoint { get; set; }
 public Point EndPoint { get; set; }
 public bool Visited { get; set; }
 public Arc Successor { get; set; }
 public Arc Predecessor { get; set; }
}

that consist of Points:

public class Point
{
 public int X { get; set; }
 public int Y { get; set; }
 public override bool Equals(Object obj)
 {
 if (ReferenceEquals(null, obj)) return false;
 if (ReferenceEquals(this, obj)) return true;
 return obj.GetType() == GetType() && Equals((Point)obj);
 }
 protected bool Equals(Point other)
 {
 return X.Equals(other.X) && Y.Equals(other.Y);
 }
 public override int GetHashCode()
 {
 unchecked
 {
 return (X.GetHashCode() * 397) ^ Y.GetHashCode();
 }
 }
}

Given these Points and resulting Arcs:

var p1 = new Point {X = 0, Y = 0};
var p2 = new Point { X = 0, Y = 1 };
var p3 = new Point { X = 0, Y = 2 };
var p4 = new Point { X = 1, Y = 0 };
var p5 = new Point { X = 1, Y = 1 };
var arc1 = new Arc { StartPoint = p1, EndPoint = p2};
var arc2 = new Arc { StartPoint = p4, EndPoint = p5};
var arc3 = new Arc { StartPoint = p2, EndPoint = p3};

I would like to end up with groups of connected Arcs as a list of list:

var arcListOfList = new List<List<Arc>>();

This is the crude algorithm I came up with so far:

foreach (var arc in arcList)
{
 var successor = arcList.FirstOrDefault(x => x.StartPoint.Equals(arc.EndPoint));
 var predecessor = arcList.FirstOrDefault(x => x.EndPoint.Equals(arc.StartPoint));
 arc.Successor = successor;
 arc.Predecessor = predecessor;
}
Arc next;
do
{
 next = arcList.FirstOrDefault(x => x.Visited == false);
 var temp = new List<Arc>();
 if (next == null) continue;
 do
 {
 next.Visited = true;
 temp.Add(next);
 if (next.Successor != null)
 {
 next = next.Successor;
 }
 else
 {
 break;
 }
 } while (true);
 arcListOfList.Add(temp);
} while (next != null);

I would appreciate any help/suggestions to improve it.

Question 2

Try writing a description of your problem and algorithm at the top of your post. That will likely catch the eye of potential reviewers more easily. It will also look nicer on tag pages (right now it just says " I have Arcs: ... ").

Question 3

return (X.GetHashCode() * 397) ^ Y.GetHashCode(); Could someone give a quick reason why this is unchecked, and if 397 is just a magic number, why pick exactly 397?

Question 4

And for clarification, is there only 1 arc in and 1 arc out of any given point? If this is a rule it needs to be clear in code, if there can be multiple in/out then the code is incorrect...

Question 5

Answering my first comment: stackoverflow.com/questions/102742/…

Question 6

I'm going to leave off suggesting anything about the algorithm for now as I don't fully understand what you want from it. Instead, I'm going to talk about your Point class in laborious depth...

public class Point
{
 public int X { get; set; }
 public int Y { get; set; }

I would argue that a point should be immutable - i.e. it doesn't change after you've created it. I would implement with full readonly properties:

private readonly int x;
public int X { get { return x; } }
private readonly int y;
public int Y { get { return y; } }
public Point(int x, int y)
{
 this.x = x;
 this.y = y
}

However, most people would let you off with private set auto properties:

public int X { get; private set; }

Let's look at your Equals implementations

public override bool Equals(Object obj)
{
 if (ReferenceEquals(null, obj)) return false;
 if (ReferenceEquals(this, obj)) return true;
 return obj.GetType() == GetType() && Equals((Point)obj);
}
protected bool Equals(Point other)
{
 return X.Equals(other.X) && Y.Equals(other.Y);
}

Firstly, bool Equals(Point other) -> that's IEquatable<Point> if made public so your class should implement that:

public class Point : IEquatable<Point>

Now, simplify the object.Equals override:

public override bool Equals(Object obj)
{
 return Equals(obj as Point);
}

And sharpen up the IEquatable<Point>.Equals:

public bool Equals(Point other)
{
 return other != null
 && other.X == X
 && other.Y == Y;
}

We should also override the == and != operators so that we're sure we are consistent with out equality:

public static bool operator ==(Point left, Point right)
{
 if (object.ReferenceEquals(left, null))
 {
 return object.ReferenceEquals(right, null);
 }
 return left.Equals(right);
}
public static bool operator !=(Point left, Point right)
{
 if (object.ReferenceEquals(left, null))
 {
 return !object.ReferenceEquals(right, null);
 }
 return !left.Equals(right);
}

Then you have a nice immutable class for your Point with consistent equality behaviour... Phew!

Edit

As EBrown pointed out in the comments, Point is a perfect fit for being a struct. It's less than 16 bytes, immutable and (0, 0) is a perfectly valid value (i.e. unitialized).

Question 7

It could be argued that Point should be a struct as well.

Question 8

@EBrown - Yes it almost certainly should be. I forgot to mention that as I was going through. I've updated now.

Question 9

Talking about this

Arc next;
do
{
 next = arcList.FirstOrDefault(x => x.Visited == false);
 var temp = new List<Arc>();
 if (next == null) continue;
 do
 {
 next.Visited = true;
 temp.Add(next);
 if (next.Successor != null)
 {
 next = next.Successor;
 }
 else
 {
 break;
 }
 } while (true);
 arcListOfList.Add(temp);
} while (next != null);

temp is almost always a bad name for a variable, at least if it stands on its own. Naming it tempArcList would be better but not ideal.

You should try to come up with a good and meaningful name.
if (next == null) continue; this continue should be a break because the do..while will just end if next != null.
Using braces {} for single statement if, else` will help you to make your code less error prone. At least you have put the statement at the same line, which is a good start. Using braces would also (IMHO) improve the readability because it makes the statement more prominent.
If you invert the if (next.Successor != null) condition, you would reduce the horizontal spacing, hence improving readability.
you should always declare your variables as near as possible to their usage.

Implementing the above points would lead to this

Arc next;
do
{
 next = arcList.FirstOrDefault(x => x.Visited == false);
 if (next == null) { break; }
 var currentArcs= new List<Arc>();
 do
 {
 next.Visited = true;
 currentArcs.Add(next);
 if (next.Successor == null) { break; }
 next = next.Successor;
 } while (true);
 arcListOfList.Add(currentArcs);
} while (next != null);

Question 10

I'm going to assume there's no rule against having multiple arcs going in/out of a single point, as there's nothing stopping it currently.

If it's possible to have multiple arcs entering/ exiting a single point, we need to change:

public class Arc
{
 ...
 public Arc Successor { get; set; }
 public Arc Predecessor { get; set; }
}

to this:

public class Arc
{
 ...
 public List<Arc> Successors { get; set; }
 public List<Arc> Predecessors { get; set; }
}

From there you'll need to do something like:

foreach (var arc in arcList){
 if (!arcAlreadyInAnyGroup(arc, arcListOfList)) {
 var arcGroup = new List<Arc>(); //empty the group each time
 getGroupOfArcs(ref arc, ref arcGroup);
 arcListOfList.Add(arcGroup);
 }
}

I'm not going to make arcAlreadyInAnyGroup, it can just be some linq to check if arc is anywhere in arcListOfList.

//Traverse outwards from single arc recursively
private void getGroupOfArcs(ref Arc startArc, ref List<Arc> arcGroup){
 startArc.visited = true;
 arcGroup.add(startArc);
 foreach (var forwardArc in startArc.Successors){
 if (!forwardArc.visited){
 getGroupOfArcs(ref forwardArc, ref arcGroup);
 }
 }
 foreach (var backwardsArc in startArc.Predecessors){
 if (!backwardsArc.visited){
 getGroupOfArcs(ref backwardsArc, ref arcGroup);
 }
 }
}

I'm sure there's a few code smells in there, but you get the idea. Just traverse out from one arc, and add any connected arcs to the group. The runtime here could also be improved a lot I think.

RobH RobH 17.1k6 gold badges38 silver badges73 bronze badges · Answer 1 · 2015-07-16 14:16:19Z

I'm going to leave off suggesting anything about the algorithm for now as I don't fully understand what you want from it. Instead, I'm going to talk about your Point class in laborious depth...

public class Point
{
 public int X { get; set; }
 public int Y { get; set; }

I would argue that a point should be immutable - i.e. it doesn't change after you've created it. I would implement with full readonly properties:

private readonly int x;
public int X { get { return x; } }
private readonly int y;
public int Y { get { return y; } }
public Point(int x, int y)
{
 this.x = x;
 this.y = y
}

However, most people would let you off with private set auto properties:

public int X { get; private set; }

Let's look at your Equals implementations

public override bool Equals(Object obj)
{
 if (ReferenceEquals(null, obj)) return false;
 if (ReferenceEquals(this, obj)) return true;
 return obj.GetType() == GetType() && Equals((Point)obj);
}
protected bool Equals(Point other)
{
 return X.Equals(other.X) && Y.Equals(other.Y);
}

Firstly, bool Equals(Point other) -> that's IEquatable<Point> if made public so your class should implement that:

public class Point : IEquatable<Point>

Now, simplify the object.Equals override:

public override bool Equals(Object obj)
{
 return Equals(obj as Point);
}

And sharpen up the IEquatable<Point>.Equals:

public bool Equals(Point other)
{
 return other != null
 && other.X == X
 && other.Y == Y;
}

We should also override the == and != operators so that we're sure we are consistent with out equality:

public static bool operator ==(Point left, Point right)
{
 if (object.ReferenceEquals(left, null))
 {
 return object.ReferenceEquals(right, null);
 }
 return left.Equals(right);
}
public static bool operator !=(Point left, Point right)
{
 if (object.ReferenceEquals(left, null))
 {
 return !object.ReferenceEquals(right, null);
 }
 return !left.Equals(right);
}

Then you have a nice immutable class for your Point with consistent equality behaviour... Phew!

Edit

As EBrown pointed out in the comments, Point is a perfect fit for being a struct. It's less than 16 bytes, immutable and (0, 0) is a perfectly valid value (i.e. unitialized).

It could be argued that Point should be a struct as well.
@EBrown - Yes it almost certainly should be. I forgot to mention that as I was going through. I've updated now.

Heslacher Heslacher 50.9k5 gold badges83 silver badges177 bronze badges · Answer 2 · 2015-07-16 14:18:50Z

Talking about this

Arc next;
do
{
 next = arcList.FirstOrDefault(x => x.Visited == false);
 var temp = new List<Arc>();
 if (next == null) continue;
 do
 {
 next.Visited = true;
 temp.Add(next);
 if (next.Successor != null)
 {
 next = next.Successor;
 }
 else
 {
 break;
 }
 } while (true);
 arcListOfList.Add(temp);
} while (next != null);

temp is almost always a bad name for a variable, at least if it stands on its own. Naming it tempArcList would be better but not ideal.

You should try to come up with a good and meaningful name.
if (next == null) continue; this continue should be a break because the do..while will just end if next != null.
Using braces {} for single statement if, else` will help you to make your code less error prone. At least you have put the statement at the same line, which is a good start. Using braces would also (IMHO) improve the readability because it makes the statement more prominent.
If you invert the if (next.Successor != null) condition, you would reduce the horizontal spacing, hence improving readability.
you should always declare your variables as near as possible to their usage.

Implementing the above points would lead to this

Arc next;
do
{
 next = arcList.FirstOrDefault(x => x.Visited == false);
 if (next == null) { break; }
 var currentArcs= new List<Arc>();
 do
 {
 next.Visited = true;
 currentArcs.Add(next);
 if (next.Successor == null) { break; }
 next = next.Successor;
 } while (true);
 arcListOfList.Add(currentArcs);
} while (next != null);

RHok RHok 1064 bronze badges · Answer 3 · 2015-07-16 21:01:05Z

I'm going to assume there's no rule against having multiple arcs going in/out of a single point, as there's nothing stopping it currently.

If it's possible to have multiple arcs entering/ exiting a single point, we need to change:

public class Arc
{
 ...
 public Arc Successor { get; set; }
 public Arc Predecessor { get; set; }
}

to this:

public class Arc
{
 ...
 public List<Arc> Successors { get; set; }
 public List<Arc> Predecessors { get; set; }
}

From there you'll need to do something like:

foreach (var arc in arcList){
 if (!arcAlreadyInAnyGroup(arc, arcListOfList)) {
 var arcGroup = new List<Arc>(); //empty the group each time
 getGroupOfArcs(ref arc, ref arcGroup);
 arcListOfList.Add(arcGroup);
 }
}

I'm not going to make arcAlreadyInAnyGroup, it can just be some linq to check if arc is anywhere in arcListOfList.

//Traverse outwards from single arc recursively
private void getGroupOfArcs(ref Arc startArc, ref List<Arc> arcGroup){
 startArc.visited = true;
 arcGroup.add(startArc);
 foreach (var forwardArc in startArc.Successors){
 if (!forwardArc.visited){
 getGroupOfArcs(ref forwardArc, ref arcGroup);
 }
 }
 foreach (var backwardsArc in startArc.Predecessors){
 if (!backwardsArc.visited){
 getGroupOfArcs(ref backwardsArc, ref arcGroup);
 }
 }
}

I'm sure there's a few code smells in there, but you get the idea. Just traverse out from one arc, and add any connected arcs to the group. The runtime here could also be improved a lot I think.

Stack Exchange Network

Algorithm to group connected graphs

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Algorithm to group connected graphs

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions