Skip to main content
Code Review

Return to Answer

replaced http://stackoverflow.com/ with https://stackoverflow.com/
Source Link

Overall, the code is quite clean and understandable. Though, I have never done anything like you are doing, but here are some notes about the code:

  • the Token and TokenTypes classes may use "__slots__" may use "__slots__" for faster attribute access and memory savings

  • the char in builtin_map.keys() can and should be replaced with just char in builtin_map to avoid creating the list of keys and looking up in the list instead of in the hashtable - O(n) vs O(1)

  • the char in ('?', '!', '.') should be replaced with something like char in ADDITIONAL_BUILTIN_CHARS where ADDITIONAL_BUILTIN_CHARS = {'?', '!', '.'} (note - Python 2.7+ syntax) - it is a set that you should define on the module level - or, may be even move to _builtins module to keep all the related constants together

  • all the other places where you use in to check if character is one of multiple characters can benefit from a similar improvement - defining as a set, giving a meaningful name and moving to the module level

  • the Lexer should probably support the "iterator" protocol "iterator" protocol (advancing to the next token continuously), like for example this or this one does

  • the PEP8 import guidelines suggest to put a new line between the different types of imports, replace:

     from collections import namedtuple
     from _builtins import builtin_map
    

with:

 from collections import namedtuple
 from _builtins import builtin_map
  • one-line doc strings can stay on a single line (PEP257)

As a side note, I think you should not be putting the "modified date" into the module itself - let the github and git handle it.

Overall, the code is quite clean and understandable. Though, I have never done anything like you are doing, but here are some notes about the code:

  • the Token and TokenTypes classes may use "__slots__" for faster attribute access and memory savings

  • the char in builtin_map.keys() can and should be replaced with just char in builtin_map to avoid creating the list of keys and looking up in the list instead of in the hashtable - O(n) vs O(1)

  • the char in ('?', '!', '.') should be replaced with something like char in ADDITIONAL_BUILTIN_CHARS where ADDITIONAL_BUILTIN_CHARS = {'?', '!', '.'} (note - Python 2.7+ syntax) - it is a set that you should define on the module level - or, may be even move to _builtins module to keep all the related constants together

  • all the other places where you use in to check if character is one of multiple characters can benefit from a similar improvement - defining as a set, giving a meaningful name and moving to the module level

  • the Lexer should probably support the "iterator" protocol (advancing to the next token continuously), like for example this or this one does

  • the PEP8 import guidelines suggest to put a new line between the different types of imports, replace:

     from collections import namedtuple
     from _builtins import builtin_map
    

with:

 from collections import namedtuple
 from _builtins import builtin_map
  • one-line doc strings can stay on a single line (PEP257)

As a side note, I think you should not be putting the "modified date" into the module itself - let the github and git handle it.

Overall, the code is quite clean and understandable. Though, I have never done anything like you are doing, but here are some notes about the code:

  • the Token and TokenTypes classes may use "__slots__" for faster attribute access and memory savings

  • the char in builtin_map.keys() can and should be replaced with just char in builtin_map to avoid creating the list of keys and looking up in the list instead of in the hashtable - O(n) vs O(1)

  • the char in ('?', '!', '.') should be replaced with something like char in ADDITIONAL_BUILTIN_CHARS where ADDITIONAL_BUILTIN_CHARS = {'?', '!', '.'} (note - Python 2.7+ syntax) - it is a set that you should define on the module level - or, may be even move to _builtins module to keep all the related constants together

  • all the other places where you use in to check if character is one of multiple characters can benefit from a similar improvement - defining as a set, giving a meaningful name and moving to the module level

  • the Lexer should probably support the "iterator" protocol (advancing to the next token continuously), like for example this or this one does

  • the PEP8 import guidelines suggest to put a new line between the different types of imports, replace:

     from collections import namedtuple
     from _builtins import builtin_map
    

with:

 from collections import namedtuple
 from _builtins import builtin_map
  • one-line doc strings can stay on a single line (PEP257)

As a side note, I think you should not be putting the "modified date" into the module itself - let the github and git handle it.

added 219 characters in body
Source Link
alecxe
  • 17.5k
  • 8
  • 52
  • 93

Overall, the code is quite clean and understandable. Though, I have never done anything like you are doing, but here are some notes about the code:

  • the Token and TokenTypes classes may use "__slots__" for faster attribute access and memory savings

  • the char in builtin_map.keys() can and should be replaced with just char in builtin_map to avoid creating the list of keys and looking up in the list instead of in the hashtable - O(n) vs O(1)

  • the char in ('?', '!', '.') should be replaced with something like char in ADDITIONAL_BUILTIN_CHARS where ADDITIONAL_BUILTIN_CHARS = {'?', '!', '.'} (note - Python 2.7+ syntax) - it is a set that you should define on the module level - or, may be even move to _builtins module to keep all the related constants together

  • all the other places where you use in to check if character is one of multiple characters can benefit from a similar improvement - defining as a set, giving a meaningful name and moving to the module level

  • the Lexer should probably support the "iterator" protocol (advancing to the next token continuously), like for example this or this one does

  • the PEP8 import guidelines suggest to put a new line between the different types of imports, replace:

     from collections import namedtuple
     from _builtins import builtin_map
    

with:

 from collections import namedtuple
 from _builtins import builtin_map
  • one-line doc strings can stay on a single line (PEP257 )

As a side note, I think you should not be putting the "modified date" into the module itself - let the github and git handle it.

Overall, the code is quite clean and understandable. Though, I have never done anything like you are doing, but here are some notes about the code:

  • the Token and TokenTypes classes may use "__slots__" for faster attribute access and memory savings

  • the char in builtin_map.keys() can and should be replaced with just char in builtin_map to avoid creating the list of keys and looking up in the list instead of in the hashtable - O(n) vs O(1)

  • the char in ('?', '!', '.') should be replaced with something like char in ADDITIONAL_BUILTIN_CHARS where ADDITIONAL_BUILTIN_CHARS = {'?', '!', '.'} (note - Python 2.7+ syntax) - it is a set that you should define on the module level - or, may be even move to _builtins module to keep all the related constants together

  • all the other places where you use in to check if character is one of multiple characters can benefit from a similar improvement - defining as a set, giving a meaningful name and moving to the module level

  • the Lexer should probably support the "iterator" protocol (advancing to the next token continuously), like for example this or this one does

  • the PEP8 import guidelines suggest to put a new line between the types of imports, replace:

     from collections import namedtuple
     from _builtins import builtin_map
    

with:

 from collections import namedtuple
 from _builtins import builtin_map

As a side note, I think you should not be putting the "modified date" into the module itself - let the github and git handle it.

Overall, the code is quite clean and understandable. Though, I have never done anything like you are doing, but here are some notes about the code:

  • the Token and TokenTypes classes may use "__slots__" for faster attribute access and memory savings

  • the char in builtin_map.keys() can and should be replaced with just char in builtin_map to avoid creating the list of keys and looking up in the list instead of in the hashtable - O(n) vs O(1)

  • the char in ('?', '!', '.') should be replaced with something like char in ADDITIONAL_BUILTIN_CHARS where ADDITIONAL_BUILTIN_CHARS = {'?', '!', '.'} (note - Python 2.7+ syntax) - it is a set that you should define on the module level - or, may be even move to _builtins module to keep all the related constants together

  • all the other places where you use in to check if character is one of multiple characters can benefit from a similar improvement - defining as a set, giving a meaningful name and moving to the module level

  • the Lexer should probably support the "iterator" protocol (advancing to the next token continuously), like for example this or this one does

  • the PEP8 import guidelines suggest to put a new line between the different types of imports, replace:

     from collections import namedtuple
     from _builtins import builtin_map
    

with:

 from collections import namedtuple
 from _builtins import builtin_map
  • one-line doc strings can stay on a single line (PEP257 )

As a side note, I think you should not be putting the "modified date" into the module itself - let the github and git handle it.

added 219 characters in body
Source Link
alecxe
  • 17.5k
  • 8
  • 52
  • 93

Overall, the code is quite clean and understandable. Though, I have never done anything like you are doing, but here are some notes about the code:

  • the Token and TokenTypes classes may use "__slots__" for faster attribute access and memory savings

    the Token and TokenTypes classes may use "__slots__" for faster attribute access and memory savings

  • the char in builtin_map.keys() can and should be replaced with just char in builtin_map to avoid creating the list of keys and looking up in the list instead of in the hashtable - O(n) vs O(1)

    the char in builtin_map.keys() can and should be replaced with just char in builtin_map to avoid creating the list of keys and looking up in the list instead of in the hashtable - O(n) vs O(1)

  • the char in ('?', '!', '.') should be replaced with something like char in ADDITIONAL_BUILTIN_CHARS where ADDITIONAL_BUILTIN_CHARS = {'?', '!', '.'} (note - Python 2.7+ syntax) - it is a set that you should define on the module level - or, may be even move to _builtins module to keep all the related constants together

    the char in ('?', '!', '.') should be replaced with something like char in ADDITIONAL_BUILTIN_CHARS where ADDITIONAL_BUILTIN_CHARS = {'?', '!', '.'} (note - Python 2.7+ syntax) - it is a set that you should define on the module level - or, may be even move to _builtins module to keep all the related constants together

  • all the other places where you use in to check if character is one of multiple characters can benefit from a similar improvement - defining as a set, giving a meaningful name and moving to the module level

    all the other places where you use in to check if character is one of multiple characters can benefit from a similar improvement - defining as a set, giving a meaningful name and moving to the module level

  • the Lexer should probably support the "iterator" protocol (advancing to the next token continuously), like for example this or this one does

  • the PEP8 import guidelines suggest to put a new line between the types of imports, replace:

     from collections import namedtuple
     from _builtins import builtin_map
    

with:

 from collections import namedtuple
 from _builtins import builtin_map

As a side note, I think you should not be putting the "modified date" into the module itself - let the github and git handle it.

Overall, the code is quite clean and understandable. Though, I have never done anything like you are doing, but here are some notes about the code:

  • the Token and TokenTypes classes may use "__slots__" for faster attribute access and memory savings
  • the char in builtin_map.keys() can and should be replaced with just char in builtin_map to avoid creating the list of keys and looking up in the list instead of in the hashtable - O(n) vs O(1)
  • the char in ('?', '!', '.') should be replaced with something like char in ADDITIONAL_BUILTIN_CHARS where ADDITIONAL_BUILTIN_CHARS = {'?', '!', '.'} (note - Python 2.7+ syntax) - it is a set that you should define on the module level - or, may be even move to _builtins module to keep all the related constants together
  • all the other places where you use in to check if character is one of multiple characters can benefit from a similar improvement - defining as a set, giving a meaningful name and moving to the module level

Overall, the code is quite clean and understandable. Though, I have never done anything like you are doing, but here are some notes about the code:

  • the Token and TokenTypes classes may use "__slots__" for faster attribute access and memory savings

  • the char in builtin_map.keys() can and should be replaced with just char in builtin_map to avoid creating the list of keys and looking up in the list instead of in the hashtable - O(n) vs O(1)

  • the char in ('?', '!', '.') should be replaced with something like char in ADDITIONAL_BUILTIN_CHARS where ADDITIONAL_BUILTIN_CHARS = {'?', '!', '.'} (note - Python 2.7+ syntax) - it is a set that you should define on the module level - or, may be even move to _builtins module to keep all the related constants together

  • all the other places where you use in to check if character is one of multiple characters can benefit from a similar improvement - defining as a set, giving a meaningful name and moving to the module level

  • the Lexer should probably support the "iterator" protocol (advancing to the next token continuously), like for example this or this one does

  • the PEP8 import guidelines suggest to put a new line between the types of imports, replace:

     from collections import namedtuple
     from _builtins import builtin_map
    

with:

 from collections import namedtuple
 from _builtins import builtin_map

As a side note, I think you should not be putting the "modified date" into the module itself - let the github and git handle it.

Source Link
alecxe
  • 17.5k
  • 8
  • 52
  • 93
Loading
default

AltStyle によって変換されたページ (->オリジナル) /