Return to Answer

replaced http://stackoverflow.com/ with https://stackoverflow.com/

edited May 23, 2017 at 12:40

Overall, the code is quite clean and understandable. Though, I have never done anything like you are doing, but here are some notes about the code:

the Token and TokenTypes classes may use "__slots__" may use "__slots__" for faster attribute access and memory savings
the char in builtin_map.keys() can and should be replaced with just char in builtin_map to avoid creating the list of keys and looking up in the list instead of in the hashtable - O(n) vs O(1)
the char in ('?', '!', '.') should be replaced with something like char in ADDITIONAL_BUILTIN_CHARS where ADDITIONAL_BUILTIN_CHARS = {'?', '!', '.'} (note - Python 2.7+ syntax) - it is a set that you should define on the module level - or, may be even move to _builtins module to keep all the related constants together
all the other places where you use in to check if character is one of multiple characters can benefit from a similar improvement - defining as a set, giving a meaningful name and moving to the module level
the Lexer should probably support the "iterator" protocol "iterator" protocol (advancing to the next token continuously), like for example this or this one does
the PEP8 import guidelines suggest to put a new line between the different types of imports, replace:
```
 from collections import namedtuple
 from _builtins import builtin_map
```

with:

 from collections import namedtuple
 from _builtins import builtin_map

one-line doc strings can stay on a single line (PEP257)

As a side note, I think you should not be putting the "modified date" into the module itself - let the github and git handle it.

Overall, the code is quite clean and understandable. Though, I have never done anything like you are doing, but here are some notes about the code:

the Token and TokenTypes classes may use "__slots__" for faster attribute access and memory savings
the char in builtin_map.keys() can and should be replaced with just char in builtin_map to avoid creating the list of keys and looking up in the list instead of in the hashtable - O(n) vs O(1)
the char in ('?', '!', '.') should be replaced with something like char in ADDITIONAL_BUILTIN_CHARS where ADDITIONAL_BUILTIN_CHARS = {'?', '!', '.'} (note - Python 2.7+ syntax) - it is a set that you should define on the module level - or, may be even move to _builtins module to keep all the related constants together
all the other places where you use in to check if character is one of multiple characters can benefit from a similar improvement - defining as a set, giving a meaningful name and moving to the module level
the Lexer should probably support the "iterator" protocol (advancing to the next token continuously), like for example this or this one does
the PEP8 import guidelines suggest to put a new line between the different types of imports, replace:
```
 from collections import namedtuple
 from _builtins import builtin_map
```

with:

 from collections import namedtuple
 from _builtins import builtin_map

one-line doc strings can stay on a single line (PEP257)

As a side note, I think you should not be putting the "modified date" into the module itself - let the github and git handle it.

Overall, the code is quite clean and understandable. Though, I have never done anything like you are doing, but here are some notes about the code:

the Token and TokenTypes classes may use "__slots__" for faster attribute access and memory savings
the char in builtin_map.keys() can and should be replaced with just char in builtin_map to avoid creating the list of keys and looking up in the list instead of in the hashtable - O(n) vs O(1)
the char in ('?', '!', '.') should be replaced with something like char in ADDITIONAL_BUILTIN_CHARS where ADDITIONAL_BUILTIN_CHARS = {'?', '!', '.'} (note - Python 2.7+ syntax) - it is a set that you should define on the module level - or, may be even move to _builtins module to keep all the related constants together
all the other places where you use in to check if character is one of multiple characters can benefit from a similar improvement - defining as a set, giving a meaningful name and moving to the module level
the Lexer should probably support the "iterator" protocol (advancing to the next token continuously), like for example this or this one does
the PEP8 import guidelines suggest to put a new line between the different types of imports, replace:
```
 from collections import namedtuple
 from _builtins import builtin_map
```

with:

 from collections import namedtuple
 from _builtins import builtin_map

one-line doc strings can stay on a single line (PEP257)

As a side note, I think you should not be putting the "modified date" into the module itself - let the github and git handle it.

added 219 characters in body

Source Link

edited Feb 7, 2017 at 19:32

alecxe

edited Feb 7, 2017 at 19:32

alecxe

17.5k
8
52
93

Overall, the code is quite clean and understandable. Though, I have never done anything like you are doing, but here are some notes about the code:

the Token and TokenTypes classes may use "__slots__" for faster attribute access and memory savings
the char in builtin_map.keys() can and should be replaced with just char in builtin_map to avoid creating the list of keys and looking up in the list instead of in the hashtable - O(n) vs O(1)
the char in ('?', '!', '.') should be replaced with something like char in ADDITIONAL_BUILTIN_CHARS where ADDITIONAL_BUILTIN_CHARS = {'?', '!', '.'} (note - Python 2.7+ syntax) - it is a set that you should define on the module level - or, may be even move to _builtins module to keep all the related constants together
all the other places where you use in to check if character is one of multiple characters can benefit from a similar improvement - defining as a set, giving a meaningful name and moving to the module level
the Lexer should probably support the "iterator" protocol (advancing to the next token continuously), like for example this or this one does
the PEP8 import guidelines suggest to put a new line between the different types of imports, replace:
```
 from collections import namedtuple
 from _builtins import builtin_map
```

with:

 from collections import namedtuple
 from _builtins import builtin_map

one-line doc strings can stay on a single line (PEP257 )

As a side note, I think you should not be putting the "modified date" into the module itself - let the github and git handle it.

Overall, the code is quite clean and understandable. Though, I have never done anything like you are doing, but here are some notes about the code:

the Token and TokenTypes classes may use "__slots__" for faster attribute access and memory savings
the char in builtin_map.keys() can and should be replaced with just char in builtin_map to avoid creating the list of keys and looking up in the list instead of in the hashtable - O(n) vs O(1)
the char in ('?', '!', '.') should be replaced with something like char in ADDITIONAL_BUILTIN_CHARS where ADDITIONAL_BUILTIN_CHARS = {'?', '!', '.'} (note - Python 2.7+ syntax) - it is a set that you should define on the module level - or, may be even move to _builtins module to keep all the related constants together
all the other places where you use in to check if character is one of multiple characters can benefit from a similar improvement - defining as a set, giving a meaningful name and moving to the module level
the Lexer should probably support the "iterator" protocol (advancing to the next token continuously), like for example this or this one does
the PEP8 import guidelines suggest to put a new line between the types of imports, replace:
```
 from collections import namedtuple
 from _builtins import builtin_map
```

with:

 from collections import namedtuple
 from _builtins import builtin_map

As a side note, I think you should not be putting the "modified date" into the module itself - let the github and git handle it.

Overall, the code is quite clean and understandable. Though, I have never done anything like you are doing, but here are some notes about the code:

the Token and TokenTypes classes may use "__slots__" for faster attribute access and memory savings
the char in builtin_map.keys() can and should be replaced with just char in builtin_map to avoid creating the list of keys and looking up in the list instead of in the hashtable - O(n) vs O(1)
the char in ('?', '!', '.') should be replaced with something like char in ADDITIONAL_BUILTIN_CHARS where ADDITIONAL_BUILTIN_CHARS = {'?', '!', '.'} (note - Python 2.7+ syntax) - it is a set that you should define on the module level - or, may be even move to _builtins module to keep all the related constants together
all the other places where you use in to check if character is one of multiple characters can benefit from a similar improvement - defining as a set, giving a meaningful name and moving to the module level
the Lexer should probably support the "iterator" protocol (advancing to the next token continuously), like for example this or this one does
the PEP8 import guidelines suggest to put a new line between the different types of imports, replace:
```
 from collections import namedtuple
 from _builtins import builtin_map
```

with:

 from collections import namedtuple
 from _builtins import builtin_map

one-line doc strings can stay on a single line (PEP257 )

As a side note, I think you should not be putting the "modified date" into the module itself - let the github and git handle it.

added 219 characters in body

Source Link

edited Feb 7, 2017 at 19:27

alecxe

edited Feb 7, 2017 at 19:27

alecxe

17.5k
8
52
93

Overall, the code is quite clean and understandable. Though, I have never done anything like you are doing, but here are some notes about the code:

the Token and TokenTypes classes may use "__slots__" for faster attribute access and memory savings
the Token and TokenTypes classes may use "__slots__" for faster attribute access and memory savings
the char in builtin_map.keys() can and should be replaced with just char in builtin_map to avoid creating the list of keys and looking up in the list instead of in the hashtable - O(n) vs O(1)
the char in builtin_map.keys() can and should be replaced with just char in builtin_map to avoid creating the list of keys and looking up in the list instead of in the hashtable - O(n) vs O(1)
the char in ('?', '!', '.') should be replaced with something like char in ADDITIONAL_BUILTIN_CHARS where ADDITIONAL_BUILTIN_CHARS = {'?', '!', '.'} (note - Python 2.7+ syntax) - it is a set that you should define on the module level - or, may be even move to _builtins module to keep all the related constants together
the char in ('?', '!', '.') should be replaced with something like char in ADDITIONAL_BUILTIN_CHARS where ADDITIONAL_BUILTIN_CHARS = {'?', '!', '.'} (note - Python 2.7+ syntax) - it is a set that you should define on the module level - or, may be even move to _builtins module to keep all the related constants together
all the other places where you use in to check if character is one of multiple characters can benefit from a similar improvement - defining as a set, giving a meaningful name and moving to the module level
all the other places where you use in to check if character is one of multiple characters can benefit from a similar improvement - defining as a set, giving a meaningful name and moving to the module level
the Lexer should probably support the "iterator" protocol (advancing to the next token continuously), like for example this or this one does
the PEP8 import guidelines suggest to put a new line between the types of imports, replace:
```
 from collections import namedtuple
 from _builtins import builtin_map
```

with:

 from collections import namedtuple
 from _builtins import builtin_map

As a side note, I think you should not be putting the "modified date" into the module itself - let the github and git handle it.

Overall, the code is quite clean and understandable. Though, I have never done anything like you are doing, but here are some notes about the code:

the Token and TokenTypes classes may use "__slots__" for faster attribute access and memory savings
the char in builtin_map.keys() can and should be replaced with just char in builtin_map to avoid creating the list of keys and looking up in the list instead of in the hashtable - O(n) vs O(1)
the char in ('?', '!', '.') should be replaced with something like char in ADDITIONAL_BUILTIN_CHARS where ADDITIONAL_BUILTIN_CHARS = {'?', '!', '.'} (note - Python 2.7+ syntax) - it is a set that you should define on the module level - or, may be even move to _builtins module to keep all the related constants together
all the other places where you use in to check if character is one of multiple characters can benefit from a similar improvement - defining as a set, giving a meaningful name and moving to the module level

Overall, the code is quite clean and understandable. Though, I have never done anything like you are doing, but here are some notes about the code:

the Token and TokenTypes classes may use "__slots__" for faster attribute access and memory savings
the char in builtin_map.keys() can and should be replaced with just char in builtin_map to avoid creating the list of keys and looking up in the list instead of in the hashtable - O(n) vs O(1)
the char in ('?', '!', '.') should be replaced with something like char in ADDITIONAL_BUILTIN_CHARS where ADDITIONAL_BUILTIN_CHARS = {'?', '!', '.'} (note - Python 2.7+ syntax) - it is a set that you should define on the module level - or, may be even move to _builtins module to keep all the related constants together
all the other places where you use in to check if character is one of multiple characters can benefit from a similar improvement - defining as a set, giving a meaningful name and moving to the module level
the Lexer should probably support the "iterator" protocol (advancing to the next token continuously), like for example this or this one does
the PEP8 import guidelines suggest to put a new line between the types of imports, replace:
```
 from collections import namedtuple
 from _builtins import builtin_map
```

with:

 from collections import namedtuple
 from _builtins import builtin_map

As a side note, I think you should not be putting the "modified date" into the module itself - let the github and git handle it.

Source Link

answered Feb 7, 2017 at 19:17

alecxe

answered Feb 7, 2017 at 19:17

alecxe

17.5k
8
52
93

default