\$\begingroup\$

My problem starts with a list A of about n = 100 long strings (around 10000 characters each). I also have q = 10000 query strings of length 100. For each query string, I want to check whether it is a substring of any element of A.

I've tried doing this with the in operator or any(), but it still takes too long: there are 10000 iterations, and each one checks whether a string s of length 100 occurs in a string of length 10000.

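# Read the number of descriptions (n) and the number of queries (q).
n, q = [int(item) for item in input().split()]
desc = []
for i in range(n):
    desc.append(input())
# Join on a tab, presumably so that a query cannot match across two descriptions.
desc = "\t".join(desc)
for j in range(q):
    quest = input().strip()
    if quest in desc:
        print("It's in !")
    else:
        print("It's not in ..")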

Is there a faster way to do this?

Note: The numbers I've given are upper bounds, not exact lengths.

edited Oct 23, 2016 at 16:48 by Jamal

asked Oct 23, 2016 at 10:07 by Adam

\$\endgroup\$

It looks like you're overcomplicating something. Can you add example input to your question so we can see and understand why you did what you did?
– Mast ♦, Oct 23, 2016 at 10:59

Is quest always going to be exactly 100 characters?
– TheBlackCat, Oct 23, 2016 at 12:30

1 Answer

\$\begingroup\$

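The problem of finding matches for multiple fixed search strings in a corpus is solved by the Aho–Corasick algorithm. After building an automaton from the search strings in time proportional to their total length, it scans the corpus in time proportional to the length of the corpus plus the number of matches.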

Python doesn't come with an implementation of the Aho–Corasick algorithm (as far as I know), but the Python Package Index has the pyahocorasick package. Or you could write your own.
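For the "write your own" route, here is a minimal pure-Python sketch of the idea, not a tuned implementation. The names build_automaton and find_present are hypothetical helpers invented for this illustration, and the sketch assumes that all queries are non-empty and are read before the corpus is scanned. As in the question's code, the descriptions are joined on a tab so that no match spans two of them.

```python
from collections import deque


def build_automaton(patterns):
    """Build a trie over the patterns and add failure links (hypothetical helper)."""
    goto = [{}]    # goto[state][char] -> next state
    fail = [0]     # fail[state] -> state for the longest proper suffix in the trie
    out = [set()]  # out[state] -> indices of patterns that end at this state

    # Insert every pattern into the trie.
    for index, pattern in enumerate(patterns):
        state = 0
        for ch in pattern:
            nxt = goto[state].get(ch)
            if nxt is None:
                nxt = len(goto)
                goto[state][ch] = nxt
                goto.append({})
                fail.append(0)
                out.append(set())
            state = nxt
        out[state].add(index)

    # Breadth-first pass: shallower states get their failure links first.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Patterns ending at the suffix state also end here.
            out[nxt] |= out[fail[nxt]]
    return goto, fail, out


def find_present(patterns, text):
    """Return a list of booleans: does patterns[i] occur in text?"""
    goto, fail, out = build_automaton(patterns)
    found = [False] * len(patterns)
    state = 0
    for ch in text:
        # Follow failure links until some state accepts ch (or we reach the root).
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for index in out[state]:
            found[index] = True
    return found


# Made-up sample data, standing in for the question's input.
descriptions = ["the quick brown fox", "jumps over the lazy dog"]
queries = ["brown", "lazy dog", "fox jumps", "cat"]
corpus = "\t".join(descriptions)
for query, hit in zip(queries, find_present(queries, corpus)):
    print(query, "->", "It's in !" if hit else "It's not in ..")
```

The corpus is scanned once for all queries together, instead of once per query as in the original loop. The cost is that answers cannot be printed until every query has been read.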

Alternatively, if you are on a Unix system, you can use the -F (fixed strings) option to grep and avoid Python altogether.

answered Oct 23, 2016 at 11:13 by Gareth Rees

\$\endgroup\$

Check if strings are substrings of another string in Python
