data = ['HTTP/1.1 200 OK', 'CACHE-CONTROL: max-age=1810', 'DATE: Wed, 14 May 2014 12:15:19 GMT', 'EXT:', 'LOCATION: http://192.168.94.57:9000/DeviceDescription.xml', 'SERVER: Windows NT/5.0, UPnP/1.0, pvConnect UPnP SDK/1.0', 'ST: uuid:7076436f-6e65-1063-8074-78542e239ff5', 'USN: uuid:7076436f-6e65-1063-8074-78542e239ff5', 'Content-Length: 0', '', '']
From the above list, I have to extract the .xml link.
My code:
    for element in data:
        if 'LOCATION' in element:
            xmllink = element.split(': ')[1]
It's taking too much time. How can I make this faster?
Actually, I am doing SSDP discovery to find devices on a network. After sending the M-SEARCH command, the devices respond with a datagram packet, which I have stored in the data variable. From this I have to extract the device's file link for further processing.
When I used direct indexing to extract it, it was done quickly.
- I cannot understand how a split and a small array like that can take too much time. Have you used some sort of profiling to make sure that this part is the problem? (Marc-Andre, May 14, 2014 at 14:07)
- @Marc-Andre Actually, I am doing SSDP discovery for devices on a network. After sending the M-SEARCH command, devices respond with a datagram packet, which I have stored in "data", and it is taking a long time to process this. Earlier I used direct indexing to find "LOCATION" and it was done quickly. (Patrick, May 14, 2014 at 14:15)
- That information is important for a review! You should edit the question. (Marc-Andre, May 14, 2014 at 14:27)
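One way to follow up on the profiling question is to time the parsing step in isolation. The sketch below wraps the loop from the question in a function (the name extract_location is made up for this illustration) and measures it with timeit on a shortened copy of the sample response. If the reported time per call is tiny, the delay is probably somewhere other than this loop.

```python
import timeit

# Sample response lines taken from the question (shortened for brevity).
data = [
    'HTTP/1.1 200 OK',
    'CACHE-CONTROL: max-age=1810',
    'EXT:',
    'LOCATION: http://192.168.94.57:9000/DeviceDescription.xml',
    'ST: uuid:7076436f-6e65-1063-8074-78542e239ff5',
    'Content-Length: 0',
    '',
    '',
]


def extract_location(lines):
    """Same logic as the loop in the question, wrapped in a function.

    The function name is hypothetical; it is not part of the original code.
    """
    xmllink = None
    for element in lines:
        if 'LOCATION' in element:
            xmllink = element.split(': ')[1]
    return xmllink


# Run the extraction many times and report the average cost of one call.
runs = 10_000
total = timeit.timeit(lambda: extract_location(data), number=runs)
print(f"average per call: {total / runs * 1e6:.2f} microseconds")
```

The absolute number depends on the machine, so no expected output is shown here.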
3 Answers
You want to test for element.startswith('LOCATION: '). You are doing 'LOCATION' in element, which is not only slower, since it has to check at every position of every element, but might also lead to false matches.
Also, element and data are poor names. I suggest header and headers, respectively.
My suggestion:
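    LOC = 'LOCATION: '
    # Keep only the text after the prefix for every header that starts with it.
    xmllinks = [header[len(LOC):] for header in headers if header.startswith(LOC)]
    if xmllinks:
        # Use the first match; xmllink stays unset if no header matched.
        xmllink = xmllinks[0]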
- Is it safe to assume there will be a single space after 'LOCATION:' based on his original implementation? Searching for simply 'LOCATION:' and then using strip() to remove any additional whitespace would probably be more secure. (BeetDemGuise, May 14, 2014 at 15:15)
I am not sure what is causing the problem. This piece of code should not take a lot of time. But here are some suggestions to speed up your code:
- Instead of a list, create a set. The lookup time in a set is constant.
- The main drawback of a set is that it uses more memory than a list. So another option is to keep a sorted list (if possible) and use the bisect module to search the list.
Now some style suggestions:
    for element in data:
        if 'LOCATION' in element:
            xmllink = element.split(': ')[1]
Rewrite it as:
    for element in data:
        if 'LOCATION' in element and ':' in element:
            # Split only at the first colon so the colons inside the URL are kept.
            xmllink = element.split(':', 1)[1].strip()
This ensures that a string like 'LOCATION something', which contains no colon, does not raise an error if it is in the list.
Using .strip() is a more robust way to remove leading and trailing whitespace.
- Using a set only works if you want to check the whole element. set.contains("LOCATION: http://www.example.com/test.xml") would be fast, but what is needed here is something like set.startsWith("LOCATION: "), which doesn't exist. (Simon Forsberg, May 14, 2014 at 16:10)
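The point of this comment can be checked with a short sketch. Membership in a set matches whole strings only, so a prefix search still has to visit every element. If constant-time lookup by header name is really wanted, one alternative (not proposed in the thread, shown here only as an assumption about what could work) is to parse the lines once into a dict keyed by header name.

```python
lines = [
    'HTTP/1.1 200 OK',
    'EXT:',
    'LOCATION: http://192.168.94.57:9000/DeviceDescription.xml',
    'Content-Length: 0',
]

header_set = set(lines)

# A set answers "is this exact string present?" quickly...
full_line = 'LOCATION: http://192.168.94.57:9000/DeviceDescription.xml'
print(full_line in header_set)      # True
# ...but it cannot answer "does any element start with this prefix?".
print('LOCATION: ' in header_set)   # False

# Alternative sketch: build a dict keyed by upper-cased header name.
by_name = {}
for line in lines:
    name, sep, value = line.partition(':')
    if sep:  # skip lines without a colon, such as the status line
        by_name[name.strip().upper()] = value.strip()

print(by_name.get('LOCATION'))
```

Building the dict still costs one pass over the lines, so for a single lookup it is no faster than a plain loop.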
With such a small input it should be lightning fast. Anyway, shouldn't you break after finding the element? I'd write:
    xmllink = next(s.split(":", 1)[1].strip() for s in data if s.startswith("LOCATION:"))
- How does this code ensure a speedup? (Pranav Raj, May 14, 2014 at 14:26)
- Because it breaks after finding the match. Not sure how the OP is having speed problems here, though. (tokland, May 14, 2014 at 14:26)
- @tokland Read the comments under the question; he gives a bit more information about the speed problem. (Marc-Andre, May 14, 2014 at 14:28)
- While I agree with the break suggestion, I think using regex is totally overkill. (Simon Forsberg, May 14, 2014 at 16:12)
- Some refactors. (tokland, May 14, 2014 at 20:14)
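One caveat with next() on a generator expression: if no line matches, it raises StopIteration. A small sketch, assuming a missing header should yield None rather than an exception, passes a default as the second argument. The variable names here are illustrative.

```python
data = [
    'HTTP/1.1 200 OK',
    'LOCATION: http://192.168.94.57:9000/DeviceDescription.xml',
    'Content-Length: 0',
]

# The generator is consumed lazily, so iteration stops at the first match.
# The second argument to next() is returned when nothing matches.
xmllink = next(
    (s.split(':', 1)[1].strip() for s in data if s.startswith('LOCATION:')),
    None,
)
print(xmllink)

# With no LOCATION header present, the default is returned instead of raising.
missing = next(
    (s.split(':', 1)[1].strip() for s in ['EXT:', ''] if s.startswith('LOCATION:')),
    None,
)
print(missing)
```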