Checking for any character common to two strings — Go is 50× slower than Python

Question 1

I decided to learn Go, knowing Python. My approach is by solving easy problems like this one:

You are given two strings, A and B. Find if there is a substring that appears in both A and B. Several test cases will be given to you in a single file. The first line of the input will contain a single integer T, the number of test cases.

Then there will be T descriptions of the test cases. Each description contains two lines. The first line contains the string A and the second line contains the string B. For each test case, display YES (in a newline), if there is a common substring. Otherwise, display NO.

The problem is straightforward: check if two sets intersect.

The real problem is that for the biggest testcase (20 strings of length 10⁵ each) my Go code takes 2.6 seconds, which is unacceptable, knowing that my Python code solves the problem in 0.08 seconds. Based on the fact that Go is claimed to be near C speed, I assume that I am doing something super inefficiently.

package main
import "fmt"
func subst(s1 string, s2 string) bool {
 m := map[uint8]bool{}
 for i := 0; i < len(s1); i++ {
 m[s1[i]] = true
 }
 for i := 0; i < len(s2); i++ {
 if m[s2[i]] {
 return true
 }
 }
 return false
}
func main() {
 var n int
 var s1, s2 string
 fmt.Scanf("%d", &n)
 for i := 0; i < n; i++ {
 fmt.Scanf("%s", &s1)
 fmt.Scanf("%s", &s2)
 if subst(s1, s2) {
 fmt.Println("YES")
 } else {
 fmt.Println("NO")
 }
 }
}

Python implementation:

def twoStrings(a, b):
 if len(set(a) & set(b)):
 print 'YES'
 else:
 print 'NO'
for i in xrange(input()):
 twoStrings(raw_input(), raw_input())

Nonetheless, two snippets look really different, they are actually doing similar things. In go I create a hash-map and put boolean value for each character from the first string (this is how sets are implemented in a lot of languages) and then for each character from the second string I am checking whether it is in a map and if no - exiting. It's obvious in the Python code.

Question 2

Just wondering if you can profile your go code to see where it is slow. Comparing with the python code I cannot see the file handling code. maybe the actual slowness is there. BTW, the python code is not exactly the same as your go code. You do not iterate each string to create a map. In go there is a Split function that makes the array. Try using it

Question 3

Nice question. I have confirmed your observations about the speed of the two programs.

Question 4

@200_success I assume I have some obvious problem with reading/writing to stdin (heard that some functions in C are way slower when reading input). The problem is that I decided to learn Go just a few days ago, so I do not know how to properly use it.

Question 5

Not really sure whether it is nice to post someones else code as an answer, but in a chat I got a nicely working code with some explanation

Question 6

FWIW, for CPython any(char in a and char in b for char in ascii_lowercase) is 5x faster than using sets.

Question 7

This entire answer refers to my rewrite of the asker's code on the Go Playground and is mostly a summary of our discussion on the chat.so Go/Golang room. For reference, I'm reproducing the code below:

package main
import (
 "bufio"
 "io"
 "os"
 "strconv"
)
// charSet is a limited bitset to contain all lowercase latin characters (a-z).
type charSet struct {
 bits uint32
}
// Set switches a bit to 1 for any given character i that is in the range 'a'
// to 'z'. Characters outside this range have undefined behavior.
func (c *charSet) Set(i uint8) {
 c.bits |= 1 << (i - 'a')
}
// Intersects returns whether two charSets intersect (i.e., share one or more
// on bits).
func (c charSet) Intersects(o charSet) bool {
 return c.bits&o.bits != 0
}
// set returns a charSet for all bytes in s. Bytes in s must be in the range of
// 'a' to 'z'. Anything outside that range is regarded as undefined behavior.
func set(s []byte) charSet {
 var c charSet
 for i := 0; i < len(s); i++ {
 c.Set(s[i])
 }
 return c
}
// subst returns whether two strings share any characters. l and r are assumed
// to only contain characters in the range of 'a' to 'z'.
func subst(l, r []byte) bool {
 return set(l).Intersects(set(r))
}
func main() {
 var n int
 r := bufio.NewReader(os.Stdin)
 if l, err := r.ReadString('\n'); err == nil && len(l) > 1 {
 if n, err = strconv.Atoi(l[:len(l)-1]); err != nil {
 panic(err)
 }
 } else if err != nil {
 panic(err)
 }
 for i := 0; i < n; i++ {
 var s1, s2 []byte
 var err error
 s1, err = r.ReadBytes('\n')
 if err == nil {
 s2, err = r.ReadBytes('\n')
 }
 if err != nil && err != io.EOF {
 panic(err)
 }
 if len(s1) > 0 && len(s2) > 0 && subst(s1[:len(s1)-1], s2[:len(s2)-1]) {
 io.WriteString(os.Stdout, "YES\n")
 } else {
 io.WriteString(os.Stdout, "NO\n")
 }
 }
}

There are two major changes here to get this to the point that it at least beats Python in my own test runs on HackerRank:

Replace fmt.Scanf with just reading and parsing where necessary. The only input that requires actual parsing is the first integer, so there's no need for fmt.Scanf beyond that. Even initially we don't need it, however, because we're really only interested in parsing the first line. As such, I just replaced the entire thing with a buffered reader and read up to a delimiter, assuming that all lines end with '\n' (they should for this case). After that, I just run the first line through strconv.Atoi.

This chops off a reasonable amount of time in reading because it avoids parsing a format string and then modifying the input read to conform to the format string, as well as possible time spent in reflection depending on the data. I don't think it'll actually hit the reflect package here, but I haven't looked at the fmt.Fscanf implementation, so take that with a grain of salt.
Use a limited bitset (i.e., just one integer) instead of map[uint8]bool. This is more obvious as to why it's faster: there's no need to allocate more than four bytes on the stack to use an integer, and because the problem constrains the input strings to lowercase characters in the range of 'a' to 'z', we know it'll fit in the bounds of a 32-bit integer.

The charSet type and set function (which is just me being lazy and sort of mimicking the Python code) covers the entire implementation of this. All it does is, given a byte, update the underlying bits integer and test for an intersection using a bitwise and. To get a complete bitset for a string, or byte slice, just iterate over its length and set those bits.

This results in pretty simple code, an acceptable interface for working with a bitset that makes changes to the implementation easier, and it's much faster than allocating memory for a map, creating new cells in the map, and lookups in the map. (I think it's likely that some improvement would be seen by giving the map enough initial capacity for the cardinality of its key, but I doubt if it'd be significant — haven't tested it, so that's something to look into.) These are all fast enough for day to day things, but if we're trying to beat the time on some program, data structures are low-hanging fruit.

With those changes, I get, at worst, 0.01s on HackerRank for test #4 and #5 (possibly both — I'm guessing it's rounding at this point) and 0s for all other tests. This is also fairly easy to benchmark using the testing package, but I don't have a way to compare those results with Python's, so I consider HackerRank's authoritative for now.

Question 8

I have something to say about your Python implementation too:

You are mixing Input/Output and Logic

The function should return a boolean value, only after you should convert it to yes/no.

def twoStrings(a, b):
 if len(set(a) & set(b)):
 return True
 else:
 return False
def bool_to_yes_no(bool_):
 return "YES" if bool_ else "NO"

You are being too verbose

Your function can be written in one line thanks to bool

def twoStrings(a, b):
 return bool(len(set(a) & set(b)))

Indeed further inspection reveals that you are really checking if it is empty (i.e. has length 0), so we can just:

def twoStrings(a, b):
 return set(a) & set(b)

This may return an entire set but the bool_to__yes_no function will handle that just fine. If you really want booleans just add bool around your expression.

Also the name is not so good, I would call it intersection

def have_intersection(a: str, b: str) -> bool:
 return bool(set(a) & set(b))

Please note that this sintax only works in Python 3.x and that a and b in this particular case are strings but could anything else.

You are giving importance to the unecessary

Maybe this is nitpicking but programming is about precision:

for i in xrange(input()):

should be:

for _ in range(input()): #(`range` because I use Python 3).

because an ignored variable is conventionally called _

You have a security issue

People usually behave correctly but what if the file started with:

__import__(os).call('rm -rf') # Pseudocodish but you get the idea

You guessed it, all your files would be destroyed becaue input in old Python works like eval

Final version

Putting all my advice together:

def have_intersection(a, b):
 return bool(set(a) & set(b))
def bool_to_yes_no(bool_):
 return "YES" if bool_ else "NO"
for _ in range(int(input())):
 print(bool_to_yes_no(have_intersection((input(), input())))

Please note that I am using input because I am on Python 3 (that I reccomend over Python 2)

user50859user50859 · Accepted Answer · 2015-04-29 07:02:44Z

This entire answer refers to my rewrite of the asker's code on the Go Playground and is mostly a summary of our discussion on the chat.so Go/Golang room. For reference, I'm reproducing the code below:

package main
import (
 "bufio"
 "io"
 "os"
 "strconv"
)
// charSet is a limited bitset to contain all lowercase latin characters (a-z).
type charSet struct {
 bits uint32
}
// Set switches a bit to 1 for any given character i that is in the range 'a'
// to 'z'. Characters outside this range have undefined behavior.
func (c *charSet) Set(i uint8) {
 c.bits |= 1 << (i - 'a')
}
// Intersects returns whether two charSets intersect (i.e., share one or more
// on bits).
func (c charSet) Intersects(o charSet) bool {
 return c.bits&o.bits != 0
}
// set returns a charSet for all bytes in s. Bytes in s must be in the range of
// 'a' to 'z'. Anything outside that range is regarded as undefined behavior.
func set(s []byte) charSet {
 var c charSet
 for i := 0; i < len(s); i++ {
 c.Set(s[i])
 }
 return c
}
// subst returns whether two strings share any characters. l and r are assumed
// to only contain characters in the range of 'a' to 'z'.
func subst(l, r []byte) bool {
 return set(l).Intersects(set(r))
}
func main() {
 var n int
 r := bufio.NewReader(os.Stdin)
 if l, err := r.ReadString('\n'); err == nil && len(l) > 1 {
 if n, err = strconv.Atoi(l[:len(l)-1]); err != nil {
 panic(err)
 }
 } else if err != nil {
 panic(err)
 }
 for i := 0; i < n; i++ {
 var s1, s2 []byte
 var err error
 s1, err = r.ReadBytes('\n')
 if err == nil {
 s2, err = r.ReadBytes('\n')
 }
 if err != nil && err != io.EOF {
 panic(err)
 }
 if len(s1) > 0 && len(s2) > 0 && subst(s1[:len(s1)-1], s2[:len(s2)-1]) {
 io.WriteString(os.Stdout, "YES\n")
 } else {
 io.WriteString(os.Stdout, "NO\n")
 }
 }
}

There are two major changes here to get this to the point that it at least beats Python in my own test runs on HackerRank:

Replace fmt.Scanf with just reading and parsing where necessary. The only input that requires actual parsing is the first integer, so there's no need for fmt.Scanf beyond that. Even initially we don't need it, however, because we're really only interested in parsing the first line. As such, I just replaced the entire thing with a buffered reader and read up to a delimiter, assuming that all lines end with '\n' (they should for this case). After that, I just run the first line through strconv.Atoi.

This chops off a reasonable amount of time in reading because it avoids parsing a format string and then modifying the input read to conform to the format string, as well as possible time spent in reflection depending on the data. I don't think it'll actually hit the reflect package here, but I haven't looked at the fmt.Fscanf implementation, so take that with a grain of salt.
Use a limited bitset (i.e., just one integer) instead of map[uint8]bool. This is more obvious as to why it's faster: there's no need to allocate more than four bytes on the stack to use an integer, and because the problem constrains the input strings to lowercase characters in the range of 'a' to 'z', we know it'll fit in the bounds of a 32-bit integer.

The charSet type and set function (which is just me being lazy and sort of mimicking the Python code) covers the entire implementation of this. All it does is, given a byte, update the underlying bits integer and test for an intersection using a bitwise and. To get a complete bitset for a string, or byte slice, just iterate over its length and set those bits.

This results in pretty simple code, an acceptable interface for working with a bitset that makes changes to the implementation easier, and it's much faster than allocating memory for a map, creating new cells in the map, and lookups in the map. (I think it's likely that some improvement would be seen by giving the map enough initial capacity for the cardinality of its key, but I doubt if it'd be significant — haven't tested it, so that's something to look into.) These are all fast enough for day to day things, but if we're trying to beat the time on some program, data structures are low-hanging fruit.

With those changes, I get, at worst, 0.01s on HackerRank for test #4 and #5 (possibly both — I'm guessing it's rounding at this point) and 0s for all other tests. This is also fairly easy to benchmark using the testing package, but I don't have a way to compare those results with Python's, so I consider HackerRank's authoritative for now.

Stack Exchange Network

Checking for any character common to two strings — Go is 50× slower than Python

2 Answers 2

You are mixing Input/Output and Logic

You are being too verbose

You are giving importance to the unecessary

You have a security issue

Final version

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Checking for any character common to two strings — Go is 50× slower than Python

2 Answers 2

You are mixing Input/Output and Logic

You are being too verbose

You are giving importance to the unecessary

You have a security issue

Final version

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions