Return to Answer

replaced http://codereview.stackexchange.com/ with https://codereview.stackexchange.com/

edited Apr 13, 2017 at 12:40

Veedrac's answer Veedrac's answer has a lot of good points about the quality of the code, so in this answer I'm just going to discuss the performance.

Veedrac's answer has a lot of good points about the quality of the code, so in this answer I'm just going to discuss the performance.

deleted 5901 characters in body

Source Link

edited Aug 19, 2016 at 10:42

Gareth Rees

edited Aug 19, 2016 at 10:42

Gareth Rees

50.1k
3
130
210

1. Introduction

Veedrac's answer has a lot of good points about the quality of the code, so in this answer I'm just going to discuss the performance problems.

There are two aspects to performance: speed, and algorithmic complexity.

2. Speed

Drop the calls to strip: there's no need for them, because split with no arguments has the same effect.
There's no need to construct the list a of edges: it's simpler just to construct the adjacency matrix directly from the input.
There's no need to deduplicate the edges (duplicate edges don't affect the correctness of Dikjstra's algorithm), but if you are going to do it, it's simpler to make adj into a list of sets, and call the add method instead of append so that the deduplication happens as you go along.

3. Algorithmic complexity

Algorithmic complexity tells you the amount of resources (typically time or space) used by a program, expressed as a function of the size of its input.

Let's do a detailed complexity analysis of your code. In Dijkstra's algorithm, once the data structures have been initialized, it enters a loop that carries out the following series of steps:

If the unvisited set is empty, exit.
From the unvisited set, find the node with the minimum distance from the start. Set this as the current node.
Remove the current node from the unvisited set.
For all the neighbours of the current node, update their distance from the start, then go to step 1.

How long does it take your code to execute each step, assuming that there are \$V\$ nodes and \$E\$ edges in the graph? Well,

Looking at the TimeComplexity page on the Python wiki, we can see that len(list) takes constant time, so the total contribution to the runtime due to this step is \$O(V)\$ (since the loop runs once for each node).

This step is implemented like this:

 minDist = float('inf')
 for i in unique:
 if (distance[i-1] <= minDist):
 minDist = distance[i-1]
 distIndex = i
 current = distIndex

which iterates over the entries of unique, so this step takes \$O(V)\$, and the total contribution to the runtime due to this step is \$O(V^2)\$.

This step is implemented like this:
```
 unique.remove(current)
```

where unique is a list. Looking at the TimeComplexity page on the Python wiki, we can see that list.remove takes time proportional to the length of the list, so this step takes \$O(V)\$, and the total contribution to the runtime due to this step is \$O(V^2)\$.

This step loops over the nodes that are adjacent to the current node, so takes time proportional to the number of adjacent nodes. Over the whole run of the algorithm, each edge in the graph will be followed at most once, so the total contribution to the runtime due to this step is \$O(E)\$

The total runtime is therefore \$O(V) + O(V^2) + O(V^2) + O(E)\$ which is \$O(V^2)\$ since there can be at most \$O(V^2)\$ edges.

This kind of analysis is something that you should get in the habit of doing each time you implement an algorithm that's new to you. It looks a bit long-winded when written out in full like this, but once you get into the habit you'll find that it's quick to sketch out the important details, and eventually you'll be able to do it in your head and immediately spot the step that has the wrong complexity.

4. Improving the complexity

Now, if you look at Wikipedia's description of Dijkstra's algorithm , you'll see that the runtime can be reduced to \$O(E + V \log V)\$ by using a min-priority queue to store the unvisited set, so that the removal of the current node in step 3, and the discovery of the univisited node with the smallest distance in step 2, can run in \$O(\log V)\$.

How can you implement an efficient min-priority queue in Python? There isn't one that's built-in, but the documentation for the heapq module has an explanation with example code showing how to implement one using a heap. Using this approach results in code like this:

from heapq import heapify, heappop, heappush
# Distance to unreachable nodes.
UNREACHABLE = float('inf')
def shortest_distances(graph, start):
 """Given a graph in adjacency list form and a start node, return a
 list of the distance of each node from start, or infinity for a
 node that is not reachable from the start.
 """
 # Mapping from node to its entry in the priority queue, which is a
 # list [distance, node].
 entries = {n: [UNREACHABLE, n] for n in range(len(graph))}
 # Min-priority queue of [distance, node] pairs.
 unvisited = list(entries.values())
 heapify(unvisited)
 # Placeholder for removed priority queue entries
 REMOVED = -1
 # Update the distance to a node in the priority queue.
 def update(node, distance):
 entries[node][1] = REMOVED
 entry = [distance, node]
 entries[node] = entry
 heappush(unvisited, entry)
 # Single-source, all paths using Dijkstra's algorithm.
 update(start, 0)
 while unvisited:
 dist, node = heappop(unvisited)
 if node == REMOVED:
 continue
 new_dist = dist + 1
 for neighbour in graph[node]:
 neighbour_dist, _ = entries[neighbour]
 if new_dist < neighbour_dist:
 update(neighbour, new_dist)
 return [entries[node][0] for node in range(len(graph))]

Note that this code works with nodes starting at 0, to avoid having to add and subtract 1, and uses an edge length of 1 (not 6). So the input and output need adjusting accordingly, for example:

import sys
def main():
 readline = lambda:map(int, next(sys.stdin).split())
 n, = readline()
 for _ in range(n):
 nodes, edges = readline()
 graph = [set() for _ in range(nodes + 1)]
 for _ in range(edges):
 a, b = readline()
 graph[a].add(b)
 graph[b].add(a)
 start, = readline()
 distances = shortest_distances(graph, start)
 print(*((-1 if d == UNREACHABLE else d * 6)
 for i, d in enumerate(distances)
 if i != 0 and i != start))
if __name__ == '__main__':
 main()

1. Introduction

Veedrac's answer has a lot of good points about the quality of the code, so in this answer I'm just going to discuss the performance problems.

There are two aspects to performance: speed, and algorithmic complexity.

2. Speed

Drop the calls to strip: there's no need for them, because split with no arguments has the same effect.
There's no need to construct the list a of edges: it's simpler just to construct the adjacency matrix directly from the input.
There's no need to deduplicate the edges (duplicate edges don't affect the correctness of Dikjstra's algorithm), but if you are going to do it, it's simpler to make adj into a list of sets, and call the add method instead of append so that the deduplication happens as you go along.

3. Algorithmic complexity

Algorithmic complexity tells you the amount of resources (typically time or space) used by a program, expressed as a function of the size of its input.

Let's do a detailed complexity analysis of your code. In Dijkstra's algorithm, once the data structures have been initialized, it enters a loop that carries out the following series of steps:

If the unvisited set is empty, exit.
From the unvisited set, find the node with the minimum distance from the start. Set this as the current node.
Remove the current node from the unvisited set.
For all the neighbours of the current node, update their distance from the start, then go to step 1.

How long does it take your code to execute each step, assuming that there are \$V\$ nodes and \$E\$ edges in the graph? Well,

Looking at the TimeComplexity page on the Python wiki, we can see that len(list) takes constant time, so the total contribution to the runtime due to this step is \$O(V)\$ (since the loop runs once for each node).

This step is implemented like this:

 minDist = float('inf')
 for i in unique:
 if (distance[i-1] <= minDist):
 minDist = distance[i-1]
 distIndex = i
 current = distIndex

which iterates over the entries of unique, so this step takes \$O(V)\$, and the total contribution to the runtime due to this step is \$O(V^2)\$.

This step is implemented like this:
```
 unique.remove(current)
```

This step loops over the nodes that are adjacent to the current node, so takes time proportional to the number of adjacent nodes. Over the whole run of the algorithm, each edge in the graph will be followed at most once, so the total contribution to the runtime due to this step is \$O(E)\$

The total runtime is therefore \$O(V) + O(V^2) + O(V^2) + O(E)\$ which is \$O(V^2)\$ since there can be at most \$O(V^2)\$ edges.

4. Improving the complexity

from heapq import heapify, heappop, heappush
# Distance to unreachable nodes.
UNREACHABLE = float('inf')
def shortest_distances(graph, start):
 """Given a graph in adjacency list form and a start node, return a
 list of the distance of each node from start, or infinity for a
 node that is not reachable from the start.
 """
 # Mapping from node to its entry in the priority queue, which is a
 # list [distance, node].
 entries = {n: [UNREACHABLE, n] for n in range(len(graph))}
 # Min-priority queue of [distance, node] pairs.
 unvisited = list(entries.values())
 heapify(unvisited)
 # Placeholder for removed priority queue entries
 REMOVED = -1
 # Update the distance to a node in the priority queue.
 def update(node, distance):
 entries[node][1] = REMOVED
 entry = [distance, node]
 entries[node] = entry
 heappush(unvisited, entry)
 # Single-source, all paths using Dijkstra's algorithm.
 update(start, 0)
 while unvisited:
 dist, node = heappop(unvisited)
 if node == REMOVED:
 continue
 new_dist = dist + 1
 for neighbour in graph[node]:
 neighbour_dist, _ = entries[neighbour]
 if new_dist < neighbour_dist:
 update(neighbour, new_dist)
 return [entries[node][0] for node in range(len(graph))]

Note that this code works with nodes starting at 0, to avoid having to add and subtract 1, and uses an edge length of 1 (not 6). So the input and output need adjusting accordingly, for example:

import sys
def main():
 readline = lambda:map(int, next(sys.stdin).split())
 n, = readline()
 for _ in range(n):
 nodes, edges = readline()
 graph = [set() for _ in range(nodes + 1)]
 for _ in range(edges):
 a, b = readline()
 graph[a].add(b)
 graph[b].add(a)
 start, = readline()
 distances = shortest_distances(graph, start)
 print(*((-1 if d == UNREACHABLE else d * 6)
 for i, d in enumerate(distances)
 if i != 0 and i != start))
if __name__ == '__main__':
 main()

Veedrac's answer has a lot of good points about the quality of the code, so in this answer I'm just going to discuss the performance.

Drop the calls to strip: there's no need for them, because split with no arguments has the same effect.
There's no need to construct the list a of edges: it's simpler just to construct the adjacency matrix directly from the input.
There's no need to deduplicate the edges (duplicate edges don't affect the correctness of Dikjstra's algorithm), but if you are going to do it, it's simpler to make adj into a list of sets, and call the add method instead of append so that the deduplication happens as you go along.

Post Undeleted by Gareth Rees

occurred Aug 19, 2016 at 10:13

profiling

Source Link

edited Aug 19, 2016 at 10:13

Gareth Rees

edited Aug 19, 2016 at 10:13

Gareth Rees

50.1k
3
130
210

Veedrac's answer has a lot of good points about the quality of the code, so in this answer I'm just going to discuss the performance problemproblems.

There are two aspects to performance: speed, and algorithmic complexity.

2. Complexity analysisSpeed

Let's figure out where the time goes by profiling the program using Python's built-in profiler :

$ python -m cProfile -s tottime cr138989.py < testcase.txt
...
 1492053 function calls in 3.915 seconds
 Ordered by: internal time
 ncalls tottime percall cumtime percall filename:lineno(function)
 1 1.767 1.767 3.915 3.915 cr138989.py:1(<module>)
 246223 1.346 0.000 1.348 0.000 {built-in method builtins.input}
 6 0.343 0.057 0.364 0.061 cr138989.py:1(shortest)
 246216 0.221 0.000 0.221 0.000 {method 'split' of 'str' objects}
 741011 0.106 0.000 0.106 0.000 {method 'append' of 'list' objects}
 246216 0.062 0.000 0.062 0.000 {method 'strip' of 'str' objects}
 6 0.043 0.007 0.043 0.007 {built-in method builtins.print}
 2387 0.020 0.000 0.020 0.000 {method 'remove' of 'list' objects}
 3403 0.002 0.000 0.002 0.000 cr138989.py:53(<genexpr>)
 6 0.001 0.000 0.001 0.000 cr138989.py:39(<listcomp>)
 258 0.001 0.000 0.001 0.000 {built-in method _codecs.ascii_decode}
 6 0.001 0.000 0.002 0.000 {method 'join' of 'str' objects}
 5796 0.001 0.000 0.001 0.000 {built-in method builtins.len}
 258 0.001 0.000 0.002 0.000 ascii.py:25(decode)
 258 0.000 0.000 0.000 0.000 codecs.py:280(getstate)
 1 0.000 0.000 3.915 3.915 {built-in method builtins.exec}
 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}

It takes some practice to interpret the results, but basically this is telling you that the total runtime was 3.9 seconds, of which 1.7 seconds was spent in the code at the top level of the module, 1.3 seconds was spent in input , 0.3 in shortest, and a small amount in other functions.

So the obvious performance problem here is input. If you read the documentation , you'll see that this function has several features that you don't use (prompting, interactivity, newline stripping, EOF handling), and all this can easily be avoided by replacing input with sys.stdin.readline:

import sys
input = sys.stdin.readline

Now the profile results look like this:

 ncalls tottime percall cumtime percall filename:lineno(function)
 1 1.500 1.500 2.387 2.387 cr138989.py:1(<module>)
 6 0.357 0.059 0.378 0.063 cr138989.py:1(shortest)
 246216 0.151 0.000 0.151 0.000 {method 'split' of 'str' objects}
 246223 0.140 0.000 0.142 0.000 {method 'readline' of '_io.TextIOWrapper' objects}
 741011 0.103 0.000 0.103 0.000 {method 'append' of 'list' objects}
 246216 0.058 0.000 0.058 0.000 {method 'strip' of 'str' objects}
 6 0.052 0.009 0.052 0.009 {built-in method builtins.print}
 2387 0.020 0.000 0.020 0.000 {method 'remove' of 'list' objects}

This is already fast enough to pass the HackerRank time limit. But you can save another 0.6 seconds or so by making these changes:

Drop the calls to strip : there's no need for them, because split with no arguments has the same effect.
There's no need to construct the list a of edges: it's simpler just to construct the adjacency matrix directly from the input.
There's no need to deduplicate the edges (duplicate edges don't affect the correctness of Dikjstra's algorithm), but if you are going to do it, it's simpler to make adj into a list of sets, and call the add method instead of append so that the deduplication happens as you go along.

3. Algorithmic complexity

Algorithmic complexity tells you the amount of resources (typically time or space) used by a program, expressed as a function of the size of its input.

34. Improving the complexity

Using this approach, it is easy to pass the time limit at HackerRank results in code like this:

import sys
def main():
 readline = lambda:map(int, next(sys.stdin).split())
 n, = readline()
 for _ in range(n):
 nodes, edges = readline()
 graph = [list[set() for _ in range(nodes + 1)]
 for _ in range(edges):
 a, b = readline()
 graph[a].appendadd(b)
 graph[b].appendadd(a)
 start, = readline()
 distances = shortest_distances(graph, start)
 print(*((-1 if d == UNREACHABLE else d * 6)
 for i, d in enumerate(distances)
 if i != 0 and i != start))
if __name__ == '__main__':
 main()

Veedrac's answer has a lot of good points about the quality of the code, so in this answer I'm just going to discuss the performance problem.

2. Complexity analysis

3. Improving the complexity

Using this approach, it is easy to pass the time limit at HackerRank:

import sys
def main():
 readline = lambda:map(int, next(sys.stdin).split())
 n, = readline()
 for _ in range(n):
 nodes, edges = readline()
 graph = [list() for _ in range(nodes + 1)]
 for _ in range(edges):
 a, b = readline()
 graph[a].append(b)
 graph[b].append(a)
 start, = readline()
 distances = shortest_distances(graph, start)
 print(*((-1 if d == UNREACHABLE else d * 6)
 for i, d in enumerate(distances)
 if i != 0 and i != start))

Veedrac's answer has a lot of good points about the quality of the code, so in this answer I'm just going to discuss the performance problems.

There are two aspects to performance: speed, and algorithmic complexity.

2. Speed

Let's figure out where the time goes by profiling the program using Python's built-in profiler :

$ python -m cProfile -s tottime cr138989.py < testcase.txt
...
 1492053 function calls in 3.915 seconds
 Ordered by: internal time
 ncalls tottime percall cumtime percall filename:lineno(function)
 1 1.767 1.767 3.915 3.915 cr138989.py:1(<module>)
 246223 1.346 0.000 1.348 0.000 {built-in method builtins.input}
 6 0.343 0.057 0.364 0.061 cr138989.py:1(shortest)
 246216 0.221 0.000 0.221 0.000 {method 'split' of 'str' objects}
 741011 0.106 0.000 0.106 0.000 {method 'append' of 'list' objects}
 246216 0.062 0.000 0.062 0.000 {method 'strip' of 'str' objects}
 6 0.043 0.007 0.043 0.007 {built-in method builtins.print}
 2387 0.020 0.000 0.020 0.000 {method 'remove' of 'list' objects}
 3403 0.002 0.000 0.002 0.000 cr138989.py:53(<genexpr>)
 6 0.001 0.000 0.001 0.000 cr138989.py:39(<listcomp>)
 258 0.001 0.000 0.001 0.000 {built-in method _codecs.ascii_decode}
 6 0.001 0.000 0.002 0.000 {method 'join' of 'str' objects}
 5796 0.001 0.000 0.001 0.000 {built-in method builtins.len}
 258 0.001 0.000 0.002 0.000 ascii.py:25(decode)
 258 0.000 0.000 0.000 0.000 codecs.py:280(getstate)
 1 0.000 0.000 3.915 3.915 {built-in method builtins.exec}
 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}

import sys
input = sys.stdin.readline

Now the profile results look like this:

 ncalls tottime percall cumtime percall filename:lineno(function)
 1 1.500 1.500 2.387 2.387 cr138989.py:1(<module>)
 6 0.357 0.059 0.378 0.063 cr138989.py:1(shortest)
 246216 0.151 0.000 0.151 0.000 {method 'split' of 'str' objects}
 246223 0.140 0.000 0.142 0.000 {method 'readline' of '_io.TextIOWrapper' objects}
 741011 0.103 0.000 0.103 0.000 {method 'append' of 'list' objects}
 246216 0.058 0.000 0.058 0.000 {method 'strip' of 'str' objects}
 6 0.052 0.009 0.052 0.009 {built-in method builtins.print}
 2387 0.020 0.000 0.020 0.000 {method 'remove' of 'list' objects}

This is already fast enough to pass the HackerRank time limit. But you can save another 0.6 seconds or so by making these changes:

Drop the calls to strip : there's no need for them, because split with no arguments has the same effect.
There's no need to construct the list a of edges: it's simpler just to construct the adjacency matrix directly from the input.
There's no need to deduplicate the edges (duplicate edges don't affect the correctness of Dikjstra's algorithm), but if you are going to do it, it's simpler to make adj into a list of sets, and call the add method instead of append so that the deduplication happens as you go along.

3. Algorithmic complexity

Algorithmic complexity tells you the amount of resources (typically time or space) used by a program, expressed as a function of the size of its input.

4. Improving the complexity

import sys
def main():
 readline = lambda:map(int, next(sys.stdin).split())
 n, = readline()
 for _ in range(n):
 nodes, edges = readline()
 graph = [set() for _ in range(nodes + 1)]
 for _ in range(edges):
 a, b = readline()
 graph[a].add(b)
 graph[b].add(a)
 start, = readline()
 distances = shortest_distances(graph, start)
 print(*((-1 if d == UNREACHABLE else d * 6)
 for i, d in enumerate(distances)
 if i != 0 and i != start))
if __name__ == '__main__':
 main()

Post Deleted by Gareth Rees

occurred Aug 18, 2016 at 12:27

added 2364 characters in body

Source Link

edited Aug 18, 2016 at 12:18

Gareth Rees

edited Aug 18, 2016 at 12:18

Gareth Rees

50.1k
3
130
210

Source Link

answered Aug 18, 2016 at 11:49

Gareth Rees

answered Aug 18, 2016 at 11:49

Gareth Rees

50.1k
3
130
210

lang-py