  • The docstrings at the beginning of the functions are used by the help system of Python, e.g. when using the help function in an interactive Python session, or when running the pydoc command from the command line. In addition, they can contain doctests, which not only provide examples for how to use a function, but can also be automatically executed in order to test the function.

  • Using the with statement ensures that the input file is closed as soon as the block is left, even if an exception is raised inside it (h5py.File supports it).

  • Using izip in generate_coordinates prevents loading the entire file contents at once. If you are using Python 3, the regular zip function already behaves like this.

  • Inside the format_coordinate function there is a local function format_parts generating the two parts of the result for the latitude and the longitude, which are then joined together. It uses the str.format function to format each angle of the coordinate as a prefix and a three-digit zero-filled number.

  • I have taken the make_sure_path_exists function from this post on StackOverflow and modified it slightly.

  • The if __name__ == '__main__' idiom prevents the program from being executed when the file is imported as a module rather than run as a script.

  • os.path.join is the preferred way to combine paths, as opposed to manual string concatenation, because it is platform-independent and removes the possibility of errors such as combining A and B to AB instead of A/B.
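
As a minimal sketch of the doctest remark above (written for Python 3; format_angle and run_examples are hypothetical helpers, not part of the reviewed code), the examples in a docstring can be executed with the standard doctest module:

```python
import doctest


def format_angle(angle, directions):
    """Format one angle as a direction letter plus a three-digit number.

    >>> format_angle(12.3, 'NS')
    'N012'
    >>> format_angle(-40.23, 'EW')
    'W040'
    """
    # Pick the first letter for non-negative angles, the second otherwise.
    prefix = directions[0] if angle >= 0 else directions[1]
    return '{prefix}{value:03d}'.format(prefix=prefix, value=int(abs(angle)))


def run_examples(func):
    """Run the doctests in func's docstring and return the result summary."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # Pass the function itself as the only global the examples need.
    for test in finder.find(func, name=func.__name__,
                            globs={func.__name__: func}):
        runner.run(test)
    return runner.summarize(verbose=False)


result = run_examples(format_angle)
print(result.failed, result.attempted)
```

For a whole script, running python -m doctest on the file from the command line has the same effect without any helper code.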

mkrieger1

Side note

letter = set('-')
if letter & set(lat_check):
 ...

This is actually a clever idea if you want to check whether any of several letters is contained in a string. However, for a single letter, you can simply write the following:

if '-' in lat_check:
 ...

(But checking for negative values can be done more easily, see below.)
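
To illustrate the difference, here is a small sketch in Python 3. The helper contains_any and the sample value of lat_check are assumptions made for the example only:

```python
def contains_any(text, letters):
    """Return True if at least one of the given letters occurs in text."""
    # A non-empty intersection means text shares a character with letters.
    return bool(set(letters) & set(text))


lat_check = '-40.23'  # hypothetical sample value

# Several candidate letters: the set intersection is a reasonable tool.
print(contains_any(lat_check, '-+'))

# A single letter: the plain membership test says the same thing more directly.
print('-' in lat_check)
```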

You can just use the original name of the list wherever the *_copy_1 or *_copy_2 names are used.

This does create a new list, as you intended:

lat_list = []
for lat in meta_df['latitude']:
 lat_list.append(lat)

However, it could more simply be written as

lat_list = list(meta_df['latitude'])

(But it turns out that creating this list wasn't necessary either, see below.)

If you are astonished by how assignments don't make copies in Python, you can read a good explanation in this blog post.
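
The following sketch (Python 3, with made-up sample values) shows the behaviour in question: a plain assignment only binds a second name to the same list, whereas list(...) builds a new one.

```python
lat_list = [12.3, -40.23]   # made-up sample values

alias = lat_list            # no copy: both names refer to the same list object
copy = list(lat_list)       # a new list with the same items

alias.append(0.5)           # the change is visible through lat_list as well

print(alias is lat_list)    # the two names are one object
print(lat_list)             # now has three items
print(copy)                 # still has the original two items
```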

from itertools import izip
import errno
import h5py
import os


def generate_coordinates(input_filename):
    """Generate all (longitude, latitude) pairs contained in the input file."""
    with h5py.File(input_filename, 'r') as hf:
        for coordinate in izip(hf['longitude'], hf['latitude']):
            yield coordinate


def format_coordinate(coordinate):
    """Format a pair of angles (longitude, latitude) as an 8-character string:

    >>> format_coordinate((0.5, 12.3))
    'E000N012'
    >>> format_coordinate((-40.23, -138.652))
    'W040S138'
    """
    def format_parts():
        fmt = '{prefix}{value:03d}'
        for angle, directions in zip(coordinate, ['EW', 'NS']):
            if angle >= 0:
                yield fmt.format(prefix=directions[0], value=int(angle))
            else:
                yield fmt.format(prefix=directions[1], value=int(-angle))
    return ''.join(format_parts())


def make_sure_path_exists(path, verbose=True):
    """Create the directory at the specified path if it doesn't exist.

    If verbose, print the name of the created directory.
    """
    try:
        os.makedirs(path)
    except OSError as exception:
        if exception.errno != errno.EEXIST:
            raise
    else:
        if verbose:
            print 'created directory {}'.format(path)


if __name__ == '__main__':
    for c in generate_coordinates('/path/to/input_file'):
        dir_name = format_coordinate(c)
        make_sure_path_exists(os.path.join('/output/dir', dir_name))
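
The script above is written for Python 2 (izip and the print statement). As a rough sketch of how the directory helper might look in Python 3, assuming the same behaviour is wanted: FileExistsError is the exception raised for errno.EEXIST there, and the boolean return value is an addition for illustration that the original function does not have.

```python
import os
import tempfile


def make_sure_path_exists(path, verbose=True):
    """Create the directory at path if it doesn't exist.

    Return True if the directory was created, False if it already existed
    (the return value is an illustrative addition, not in the original).
    """
    try:
        os.makedirs(path)
    except FileExistsError:
        # Corresponds to the errno.EEXIST check in the Python 2 version.
        return False
    if verbose:
        print('created directory {}'.format(path))
    return True


# Usage with a throwaway base directory instead of a real output path.
with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, 'E000N012')
    make_sure_path_exists(target)   # creates the directory and reports it
    make_sure_path_exists(target)   # second call is a silent no-op
```

If the message is not needed, os.makedirs(path, exist_ok=True) does the same job in a single call.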


Inside the for loop you want to format only the current latitude value, but instead the whole list is rewritten each time. If you have a list with n = 1000 items, the inner code (i.replace('-', 'S') or 'N' + lat_addchar) is executed n² = 1,000,000 times, of which the first 999,000 executions are wasted effort.
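
The arithmetic can be checked with a small sketch in Python 3. The formatter and the sample data are hypothetical stand-ins for the code under review; only the loop structure matters here.

```python
def format_latitude(value):
    """Hypothetical formatter: hemisphere letter plus three-digit degrees."""
    prefix = 'N' if value >= 0 else 'S'
    return '{}{:03d}'.format(prefix, int(abs(value)))


def rewrite_everything(latitudes):
    """Rewrite the whole list on every pass of the outer loop."""
    calls = 0
    formatted = []
    for _ in latitudes:
        formatted = []  # the previous pass's work is thrown away
        for lat in latitudes:
            formatted.append(format_latitude(lat))
            calls += 1
    return formatted, calls


def format_each_once(latitudes):
    """Format every value exactly once."""
    formatted = [format_latitude(lat) for lat in latitudes]
    return formatted, len(formatted)


latitudes = [float(i % 90) for i in range(1000)]  # n = 1000 sample values
slow, slow_calls = rewrite_everything(latitudes)
fast, fast_calls = format_each_once(latitudes)
print(slow == fast, slow_calls, fast_calls)
```

Both functions produce the same list, but the first one calls the formatter n² times and the second only n times.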
