Often, when I am initializing something I have to use a temporary variable, for example:
file_str = "path/to/file"
file_file = open(file_str)
or
regexp_parts = ['foo', 'bar']
regexp = new RegExp( regexp_parts.join('|') )
However, I like to reduce the scope of my variables to the smallest scope possible so there are fewer places where they can be (mis-)used. For example, I try to use for (int i ...) in C++ so the loop variable is confined to the loop body.
In these initialization cases, if I am using a dynamic language, I am often tempted to reuse the same variable in order to prevent the initial (and now useless) value from being used later in the function.
file = "path/to/file"
file = open(file)
regexp = ['...', '...']
regexp = new RegExp( regexp.join('|') )
The idea is that by reducing the number of variables in scope I reduce the chances to misuse them. However this sometimes makes the variable names look a little weird, as in the first example, where "file" refers to a "filename".
I think perhaps this would be a non-issue if I could use non-nested scopes:
begin scope1
filename = ...
begin scope2
file = open(filename)
end scope1
//use file here
//can't use filename by accident
end scope2
but I can't think of any programming language that supports this.
What rules of thumb should I use in this situation?
- When is it best to reuse the variable?
- When is it best to create an extra variable?
- What other ways do we solve this scope problem?
5 Answers
Short answer: No, don't repurpose variables.
The idea is that by reducing the number of variables in scope I reduce the chances to misuse them.
It sounds more like you're misusing variables so that you can reduce the number of them.
Variables are cheap -- use as many as you need. Reusing variables to represent different things at different times will not be cheap in the long run; it will make your code more difficult to understand, and you or someone after you will be much more likely to create bugs trying to maintain such code.
What other ways do we solve this scope problem?
The best way is to choose descriptive names for your variables and never try to reuse one variable for two or more different concepts. The problem is not scope, the problem (as far as I can tell) is accidentally using a variable for the wrong thing. Descriptive names help in that respect.
You can always set a variable that you no longer need to a value that's harmless or invalid. This is common in languages with manual resource management, where you free/delete/release pointers when you're done with them. After doing so, it's often a good idea to set them to nil to prevent future use of that pointer. You can do similar things with other types, like setting a loop counter to -1, but in practice it's not usually necessary.
Also, don't write functions/methods that are so large that you have a hard time keeping track of all the variables you're using. If you've got dozens of variables floating around, there's a good chance that your code is too complicated; break it down into smaller independent tasks.
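The suggestion of invalidating a variable once it is no longer needed might look like this in Python. This is only a sketch: the function names first_line_of and stale_name_is_rejected are made up for illustration. In Python, del goes one step further than assigning a harmless value, because it unbinds the name, so a later use fails loudly instead of silently reading the stale value.

```python
import os


def first_line_of(directory, name):
    """Return the first line of directory/name (hypothetical helper)."""
    path = os.path.join(directory, name)
    handle = open(path)
    del path  # the path has served its purpose; later reads would raise NameError
    try:
        return handle.readline().rstrip("\n")
    finally:
        handle.close()
        handle = None  # the "set it to nil" variant: name stays, value is harmless


def stale_name_is_rejected():
    """Show that a deleted local can no longer be read by accident."""
    path = "path/to/file"
    del path
    try:
        return path  # raises UnboundLocalError, a subclass of NameError
    except NameError:
        return "path is gone"
```

As the answer says, this is rarely necessary in practice; short functions and descriptive names usually make it redundant.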
- I'm not worried about reusing variables for performance. I'm worried about the bug risk of leaving variables (and values) in scope after they cease to be useful to me. – hugomg, Nov 11, 2011 at 13:06
- It's fine to use blocks to limit the scope of variables; people do that all the time. The only limitation is that you can't have blocks that partly overlap as you've described. In practice, this is a good thing: having to remember which blocks had started and which ones had ended would probably cause more bugs than it would prevent. And, of course, you'd need syntax to name blocks so you could say which one was starting or ending, because there'd be no way to tell just by looking at the braces. It would also cause problems with the stack. I wouldn't want to work in that language. – Caleb, Nov 11, 2011 at 14:22
- I can imagine it would be horribly complicated, but I don't think it would cause any runtime problems. Scoping can be determined statically by the compiler. – hugomg, Nov 11, 2011 at 17:15
I believe having one variable with multiple meanings is much more confusing than having many variables in scope.
At any rate, if you feel overwhelmed by the number of variables in scope, chances are your method/function is doing too much and should be split.
So the answer is: No, do not reuse variables, particularly not in this way.
reducing the number of variables in scope I reduce the chances to misuse them
My experience says mostly the other way.
- Multiple use (which eventually becomes abuse) forces people to attach multiple meanings to a variable and makes it inconsistent. This is generally bad. One of the classic examples (your examples are still quite a departure from it) is this:
int err;
err = some_function1();
//... some other code
if(new_condition)
    err = some_function2();
return err;
Note that when err is returned, it may hold the return value of either function, depending on new_condition. This might work. However, when someone modifies the code around either function, disaster follows.
- The other inconsistency I see in your idea is that even while the value of a variable is not modified or used for some time, it still needs to be retained until the end of the scope. So in your case above:
file = "path/to/file"
file = open(file)
//
if(file != NULL)
The last line, if(file != NULL), has an ambiguous meaning, so now I don't know what you are referring to. Ideally I would like to treat them as file.path and file.handle, which helps remove top-level clutter while still preserving the sanity of the work.
- Last but most important: variables are most often abused only when they are not properly named; if they are strongly named, they are much less likely to be used wrongly. This doesn't sound obvious to many, but once you have seen code evolve over a few years, you would agree.
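The file.path / file.handle idea from the second point could be sketched in Python as a small record type. OpenedFile and open_with_path are hypothetical names, not anything from the question. (Python's built-in file objects already expose the path they were opened with as .name, so in that language the grouping often comes for free.)

```python
from dataclasses import dataclass
from typing import IO


@dataclass(frozen=True)
class OpenedFile:
    """Keeps the path and the handle together under one top-level name."""
    path: str
    handle: IO[str]


def open_with_path(path: str) -> OpenedFile:
    # The raw path string never needs its own long-lived variable in the caller.
    return OpenedFile(path=path, handle=open(path))
```

A caller would then write something like file = open_with_path("path/to/file") and refer to file.path or file.handle, so each use says which of the two meanings is intended.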
- The open(file) call returns a file handle which is then assigned to the file variable. So file starts out with "path/to/file," but then gets a new value that's a file handle. I don't think it's ambiguous, but it is potentially confusing and not a good plan. – Caleb, Nov 11, 2011 at 18:15
You are quite correct to limit the scope of variables if you can. It's best to simply use short functions, but if you have several variables only used for a few lines, it can make sense to make a scope around them. However, it's a bit ugly in most languages I know, and the benefits are comparatively small if you follow the advice below, so most people don't bother most of the time.
However, you definitely should:
If you have a one-off variable, give it a longer name you're unlikely to use again accidentally, eg:
path_to_config_file = "blah/blah/blah";
cfg_fileh = open path_to_config_file;
Always, always, avoid reusing variables (apart from using the same variable in multiple iterations of a loop):
// Do not do this:
val = get_value_from_user();
val = val / 100; // Convert from cm to m
val = normalise_val_for_calc(val);
// Instead try this:
len_cm = get_value_from_user();
len_m = len_cm / 100;
len_normalised = normalise_val_for_calc(len_m);
I've had many, many more mistakes from doing the first of these and forgetting which individual value was which than from accidentally re-using a variable later in a function.
I find this more important than limiting scope; reusing variables is certainly not a good way to limit it!
I think most languages that use block scope (C and C++, and I think C# and Perl, although I don't know for sure) let you open a new block. Python is an exception: its blocks do not introduce a new scope. A nested block is especially useful in C++, where a variable going out of scope automatically cleans up any memory, file handles, etc. it owns (RAII).
For instance, in C++:
int main() {
    int len_m;
    {
        // long calculation with many temporary variables
        // consider moving this to a separate function
        // but it's ok here under some circumstances
        len_m = blah;
    }
    // temporary variables have gone away
    // do stuff with len_m
}
(That's not perfect because you have to declare the variable before the block and initialise it inside, but it's sometimes useful. Some scripting languages will let you use an ad-hoc function instead, len_m = { /* calculation */ } ...)
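In a language without block scope, such as Python, the ad-hoc function mentioned above can be approximated with a nested helper. This is a sketch only; the names total_length_m and _sum_in_metres, and the idea of summing centimetre values, are invented for illustration.

```python
def total_length_m(lengths_cm):
    """Sum lengths given in centimetres and return the total in metres."""

    def _sum_in_metres():
        # Temporaries such as total_cm live only inside this helper.
        total_cm = 0
        for length_cm in lengths_cm:
            total_cm += length_cm
        return total_cm / 100

    len_m = _sum_in_metres()
    # total_cm and length_cm are not visible here; only len_m survives.
    return len_m
```

Unlike the C++ block, the result can be initialised at the point of declaration, at the cost of a slightly noisier definition.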
If you have too many variables in a scope, it's probably an indication that too much is happening in that scope.
Take this:
regexp_parts = ['foo', 'bar']
regexp = new RegExp( regexp_parts.join('|') )
Make it this (assuming JS here):
function matchAnyOf() {
    var parts = [];
    for (var i = 0; i < arguments.length; i++)
        parts = parts.concat(arguments[i]);
    return new RegExp(parts.join('|'));
}
//And any of these self-explanatory oneliners will do what you want
regexp = matchAnyOf('foo', 'bar');
regexp = matchAnyOf(['foo', 'bar']);
regexp = matchAnyOf('foo|bar');//granted, this won't work if you choose to escape '|' in matchAnyOf (which probably makes sense)
Before, you had a helper variable, which cluttered your scope; now you have an isolated, reusable function instead (notice how parts no longer requires a prefix, in contrast to regexp_parts, because the context is now very clear).
I assume such a thing is possible for the file example, although more context would be necessary, because as it stands I would say that a variable whose sole purpose is to store a literal that is used exactly once is a bit of an overkill ;)
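For the file example, a comparable sketch in Python might wrap the open call in a small function, so the path exists only as a parameter. read_text_at is a hypothetical helper, and whether reading the whole file is what the caller wants is an assumption, since the question does not say what happens to the file afterwards.

```python
def read_text_at(path):
    """Open the file at `path`, return its contents, and close it again."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()
```

The caller then writes text = read_text_at("path/to/file"), and neither the path nor the handle lingers in the calling scope.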
- ... {} would do mostly the same thing, as far as scoping goes...
- ... file = open(file) example is the kind that people are talking about when they say "Write code as if the next person maintaining it is a violent psychopath"