I've tried everything to make my program faster but with 250 nodes my code takes around 9 seconds to print the result, and with 5000 nodes it took around 260 seconds.
Is there a way to make my program faster?
I took the BFS function from here.
The site(connections) should consist of three towns each of which is connected to directly the other one by a road. The distance between two towns A and B is the minimum number of pairs of directly connected towns on the way from A to B. Two towns are connected directly by a road R if there is no other town on R between A and B. The distance of any possible site from the capital city is equal to the sum of the distances from each of the three towns representing the site from the capital city.
Output: the maximum possible distance of a site from the capital city, the number of all sites which are located at the maximum distance from the capital city.
there are two most distant sites from the capital in node 9, the sites are {0, 1, 3} and {2, 3, 5} and their distance from the capital is 10
Input first line (nodes_count, pairs_number, capital) Next all the pairs
Input
10 17 9
0 1
0 3
1 3
1 4
2 3
3 4
2 5
3 5
3 6
4 6
4 7
5 6
6 7
6 8
7 8
7 9
8 9
Output
10 2
My code
from collections import defaultdict
from queue import Queue
def read_nodes(pairs_number):
for _ in range(pairs_number):
yield map(int, input().split())
def parse_input(tree):
nodes_count, pairs_number, capital = map(int, input().split())
for node1, node2 in read_nodes(pairs_number):
tree[node1].append(node2)
tree[node2].append(node1)
return tree, capital, nodes_count
def traverse_path(fromNode, toNode, nodes):
def getNeighbours(current, nodes):
return nodes[current] if current in nodes else []
def make_path(toNode, graph):
result = []
while 'Root' != toNode:
result.append(toNode)
toNode = graph[toNode]
result.reverse()
return result
q = Queue()
q.put(fromNode)
graph = {fromNode: 'Root'}
while not q.empty():
current = q.get()
for neighbor in getNeighbours(current, nodes):
if neighbor not in graph:
graph[neighbor] = current
q.put(neighbor)
if current == toNode:
return make_path(toNode, graph)
return []
def distant_sites(graph_dic, capital, nodes_count):
distance_with_connections = defaultdict(list)
node_distance = {}
for towns in range(nodes_count):
towns_distance = len(traverse_path(capital, towns, graph_dic)) - 1
node_distance[towns] = (towns_distance)
for town1, neighbours in graph_dic.items():
for town2 in neighbours:
if town1 > town2:
for neighbour_of_town2 in graph_dic[town2]:
if town2 > neighbour_of_town2:
if neighbour_of_town2 in neighbours:
town_connection = (town1, town2, neighbour_of_town2)
distance = (node_distance[town1]) + (node_distance[town2]) + (node_distance[neighbour_of_town2])
distance_with_connections[distance].append(town_connection)
max_distance = max(distance_with_connections)
return max_distance, len(distance_with_connections[max_distance])
if __name__ == '__main__':
graph_dic, capital, nodes_count = parse_input(defaultdict(list))
max_distance, number_of_connections = distance_sites(graph_dic, capital, nodes_count)
print(max_distance, number_of_connections)
-
\$\begingroup\$ So you consider all non-capital cities, and for each of them you compute the maximum possible distance to the capital, is it correct? \$\endgroup\$coderodde– coderodde2016年05月23日 08:25:28 +00:00Commented May 23, 2016 at 8:25
-
\$\begingroup\$ Yes and they should form a connection of three towns , also the number of sites(connections) in the maximum possible distance to the capital \$\endgroup\$Joe– Joe2016年05月23日 09:09:19 +00:00Commented May 23, 2016 at 9:09
-
\$\begingroup\$ I am not quite sure about your algorithm, yet, in general, searching for longest path is NP-hard, which implies that there is no exact polytime algorithm for it unless \$P = NP\$. \$\endgroup\$coderodde– coderodde2016年05月23日 09:13:00 +00:00Commented May 23, 2016 at 9:13
-
\$\begingroup\$ yeah but I'm also searching for all the possible connections between the towns(three towns), The site(connections) should consist of three towns each of which is connected to directly the other one by a road. \$\endgroup\$Joe– Joe2016年05月23日 09:20:03 +00:00Commented May 23, 2016 at 9:20
-
\$\begingroup\$ and my implementation there takes more time \$\endgroup\$Joe– Joe2016年05月23日 09:20:36 +00:00Commented May 23, 2016 at 9:20
1 Answer 1
The code does not work because of a typo — there is a call to
distance_sites
but there is no such function.There are no docstrings. What do these functions do? How do I call them?
In order to improve the performance, we need to measure it, and in order to do that, it's helpful to be able to make test cases of arbitrary sizes. So let's write a test case generator:
from itertools import product def test_case(n): """Construct a graph with n**2 nodes and O(n**2) triangles, and return a tuple (graph, capital, number of nodes). """ graph = defaultdict(list) for i, j in product(range(n), repeat=2): k = i * n + j if i < n - 1: graph[k].append(k + n) graph[k + n].append(k) if j < n - 1: graph[k].append(k + 1) graph[k + 1].append(k) if i < n - 1 and j < n - 1: graph[k].append(k + n + 1) graph[k + n + 1].append(k) return graph, 0, n * n
Then we can easily measure the performance of the code using
timeit.timeit
:>>> from timeit import timeit >>> timeit(lambda:distant_sites(*test_case(50)), number=1) 51.7171316598542
with \$n=50\$ the graph has 2,500 nodes and 4,802 triangles.
queue.Queue
is a thread-safe data structure intended for use by multi-threaded programs. It has to take and release a lock for every operation, so it is overkill to use this in a single-threaded program. It is more than ten times faster to usecollections.deque
instead:>>> timeit(lambda:distant_sites(*test_case(50)), number=1) 4.451224883086979
The code computes the distance from the capital to each town by running a separate breadth-first search for each town. But this repeats a lot of work: in the course of finding the distance to town A, the breadth-first search will have to visit towns B, C, D, and so on. It would make sense to remember the distance to each town as we visit it, and so compute the distances from the capital to all the towns in one go:
from collections import deque def distances(graph, origin): """Return a dictionary mapping each node in graph to its distance from the origin. """ result = {origin: 0} visited = set([origin]) queue = deque([origin]) while queue: node = queue.popleft() distance = result[node] + 1 for neighbour in graph[node]: if neighbour not in visited: result[neighbour] = distance visited.add(neighbour) queue.append(neighbour) return result
This gives a couple of orders of magnitude speedup on the test case:
>>> timeit(lambda:distant_sites(*test_case(50)), number=1) 0.030185375828295946
The code for finding the most distant sites looks at all triangles. But this likely involves a lot of wasted effort. For example, suppose we find a triangle whose nodes have distance 48, 49 and 50 from the origin (with sum 147). Now there is no need to look at any triangle unless it contains a node with distance \${147\over 3} = 49\$ or more. So if we sort the nodes in reverse order by their distance from the origin and just remember the best-scoring site, then we may not have to consider very many sites before we know that we have found the most distant one.
def distant_sites(graph, origin): """Return the pair (max_site_dist, site_count), where max_site_dist is the maximum distance of any site in the graph from the origin, and site_count is the number of sites at that distance. A "site" is a triangle of nodes, and its distance from the origin is the sum of the distances of the three nodes. """ distance = distances(graph, origin) nodes = sorted(((d, n) for n, d in distance.items()), reverse=True) max_site_dist = 0 site_count = 0 for dist1, node1 in nodes: if dist1 * 3 < max_site_dist: break neighbours = graph[node1] for node2 in neighbours: dist2 = distance[node2] if (dist2, node2) >= (dist1, node1): continue for node3 in graph[node2]: if node3 not in neighbours: continue dist3 = distance[node3] if (dist3, node3) >= (dist2, node2): continue site_dist = dist1 + dist2 + dist3 if site_dist > max_site_dist: max_site_dist = site_dist site_count = 1 elif site_dist == max_site_dist: site_count += 1 return max_site_dist, site_count
The speedup you get from this optimization depends on the kinds of graph you feed it (if there are few sites then it won't make much difference). For my test case we get about 40% speedup:
>>> timeit(lambda:distant_sites(*test_case(50)), number=1) 0.017997111193835735
Explore related questions
See similar questions with these tags.