void removeForbiddenChar(string* s)
{
string::iterator it;
for (it = s->begin() ; it < s->end() ; ++it){
switch(*it){
case '/':case '\\':case ':':case '?':case '"':case '<':case '>':case '|':
*it = ' ';
}
}
}
I used this function to remove a string that has any of the following character: ,円 /, :, ?, ", <,>, |. This is for a file's name. This program runs fine. It simply change a character of the string to a blank when the respective character is the forbidden character. However, I have a feeling against this use of switch statement. I simply exploit the case syntax here, but this, somehow nags me. I just don't like it. Anybody else got a better suggestion of a better implementation in this case?
-
\$\begingroup\$ If a char isn't forbidden, then we leave it be. \$\endgroup\$Karl– Karl2011年01月27日 17:22:02 +00:00Commented Jan 27, 2011 at 17:22
-
1\$\begingroup\$ Name it replaceForbiddenChars, as it replaces instead of removes and handles multiple characters. \$\endgroup\$Fred Nurk– Fred Nurk2011年01月27日 17:30:09 +00:00Commented Jan 27, 2011 at 17:30
-
\$\begingroup\$ @Fred: If none of the cases in a switch-statement match, the control flow continues after the switch statement. That behavior is perfectly defined. \$\endgroup\$sepp2k– sepp2k2011年01月27日 17:33:42 +00:00Commented Jan 27, 2011 at 17:33
-
\$\begingroup\$ @sepp2k: Thanks, it is well-defined. I'm not sure why I thought that, but I'll blame it on articles I've been reading about (micro-)optimizing. \$\endgroup\$Fred Nurk– Fred Nurk2011年01月27日 17:37:52 +00:00Commented Jan 27, 2011 at 17:37
-
\$\begingroup\$ Seems you forgot * symbol (asterisk). It is forbidden also. \$\endgroup\$user4124– user41242011年05月06日 23:28:05 +00:00Commented May 6, 2011 at 23:28
7 Answers 7
Declare a string containing the illegal characters: "\\/:?"<>|"
. All you need to do is check if the char is in the array, so use a native function for that, or write a method CharInString(char* needle, string* haystack)
which loops through the contents of the provided haystack to check if the needle is inside it.
Your loop should end up looking like this:
string illegalChars = "\\/:?\"<>|"
for (it = s->begin() ; it < s->end() ; ++it){
bool found = illegalChars.find(*it) != string::npos;
if(found){
*it = ' ';
}
}
It's more maintainable and readable. You can tell if you've duplicated a character quite easily and since you can do it with any target string and any string of illegalChars you've just created for yourself a generic RemoveIllegalChars(string* targetString, string* illegalChars)
method usable anywhere in your program.
I may be using those pointers wrong. My C++fu is weak... for now.
-
\$\begingroup\$ +1, I was also going to recommend using a string to store the forbidden characters. I would add that this change makes it very easy to add the forbidden characters as a parameter to the
removeForbiddenChars
function, so that if the need should ever arise, it can be used in situations where different sets of characters are forbidden. Also you can use thefind
method to find out whether a character is in a string, so you don't necessarily need to write aCharInString
function (or you could write is a simple wrapper aroundfind
). \$\endgroup\$sepp2k– sepp2k2011年01月27日 16:24:15 +00:00Commented Jan 27, 2011 at 16:24 -
\$\begingroup\$ @sepp2k: We seem to be on the same wavelength here! :) I'll update my answer with the
find
method. \$\endgroup\$doppelgreener– doppelgreener2011年01月27日 16:25:33 +00:00Commented Jan 27, 2011 at 16:25 -
\$\begingroup\$ Maybe the file names are short and we don't call this function much, but please be aware that the proposed solution is O(n*m) on the number of characters in the string (n) and the number of illegal characters in the string (m). \$\endgroup\$WilliamKF– WilliamKF2011年02月09日 02:04:41 +00:00Commented Feb 9, 2011 at 2:04
you could always use transform
#include <algorithm>
#include <string>
#include <iostream>
const std::string forbiddenChars = "\\/:?\"<>|";
static char ClearForbidden(char toCheck)
{
if(forbiddenChars.find(toCheck) != string::npos)
{
return ' ';
}
return toCheck;
}
int main()
{
std::string str = "EXAMPLE:";
std::transform(str.begin(), str.end(), str.begin(), ClearForbidden);
std::cout << str << std::endl;
return 0;
}
-
\$\begingroup\$ Didn't even see this when I was just posting my answer. Yet another way to do it with a different STL algorithm :) \$\endgroup\$Mark Loeser– Mark Loeser2011年01月28日 00:31:23 +00:00Commented Jan 28, 2011 at 0:31
-
2\$\begingroup\$ Same thing with lambda:
std::transform(str.begin(), str.end(), str.begin(), [&forbidden](char c) { return forbidden.find(c) != std::string::npos ? ' ' : c; }
\$\endgroup\$Jon Purdy– Jon Purdy2011年01月28日 02:37:59 +00:00Commented Jan 28, 2011 at 2:37
Or, here's yet another way you could do it by using all stuff from the STL:
#include <algorithm>
#include <string>
#include <iostream>
bool isForbidden( char c )
{
static std::string forbiddenChars( "\\/:?\"<>|" );
return std::string::npos != forbiddenChars.find( c );
}
int main()
{
std::string myString( "hell?o" );
std::replace_if( myString.begin(), myString.end(), isForbidden, ' ' );
std::cout << "Now: " << myString << std::endl;
}
One thing that I would change about your function (in addition to Jonathan's recommendation of using a string to store the forbidden characters), is the argument type of removeForbiddenChar
to string&
instead of string*
. It is generally considered good practice in C++ to use references over pointers where possible (see for example this entry in the C++ faq-lite).
One further, minor cosmetic change I'd recommend is renaming the function to removeForbiddenChars
(plural) as that is more descriptive of what it does.
-
\$\begingroup\$ You are never checking the validity of the string * s, so if a nullptr is passed in the removeForbiddenChar function will attempt to dereference a nullptr. This implies that the caller of removeForbiddenChar should check for nullptr before calling removeForbiddenChar, but the caller won't necessarily be aware of this unless they view the internals of removeForbiddenChar. Requiring the reference to be passed in instead of a pointer conveys that your intention is: "You MUST have a valid string in order to call removeForbiddenChar." \$\endgroup\$YoungJohn– YoungJohn2013年12月12日 23:06:19 +00:00Commented Dec 12, 2013 at 23:06
C comes with a helpful function size_t strcspn(const char *string, const char *delimiters)
that you can implement this on top of. The ASCII version is pretty fast; it uses a bit vector to test for the delimiter characters.
-
\$\begingroup\$ If you are looking for performance, this one is hard to beat. \$\endgroup\$EvilTeach– EvilTeach2011年01月28日 14:45:20 +00:00Commented Jan 28, 2011 at 14:45
Solution with no conditional branching.
Swapping space for time optimization.
Simplified algorithm:
void removeForbiddenChar(string* s)
{
for (string::iterator it = s->begin() ; it < s->end() ; ++it)
{
// replace element with their counterpart in the map
// This replaces forbidden characters with space.
(*it) = charMap[*it];
}
}
Or the C++0x version:
void removeForbiddenChar(std::string* s)
{
std::transform(s->begin(), s->end(), [](char c) => {return charMap[c];});
}
Just need the data:
char charMap[] =
// The majority of characters in this array
// map the poistion to the same character code.
// charMap['A'] == 'A'
// For forbidden characters a space is in the position
// charMap['<'] == ' '
// Note: \xxx is an octal escape sequence
"000円001円002円003円004円005円006円007円"
"010円011円012円013円014円015円016円017円"
"020円021円022円023円024円025円026円027円"
"030円031円032円033円034円035円036円037円"
"040円041円 043円044円045円046円047円" // replaced 042円(") with space
"050円051円052円053円054円055円056円 " // replaced 057円(/) with space
"060円061円062円063円064円065円066円067円"
"070円071円 073円 075円 " // replaced 072円(:)074円(<)076円(>)077円(?) with space
"100円101円102円103円104円105円106円107円"
"110円111円112円113円114円115円116円117円"
"120円121円122円123円124円125円126円127円"
"130円131円132円133円 135円136円137円" // replaced 134円(\)
"140円141円142円143円144円145円146円147円"
"150円151円152円153円154円155円156円157円"
"160円161円162円163円164円165円166円167円"
"170円171円172円173円174円175円176円177円"
"200円201円202円203円204円205円206円207円"
"210円211円212円213円214円215円216円217円"
"220円221円222円223円224円225円226円227円"
"230円231円232円233円234円235円236円237円"
"240円241円242円243円244円245円246円247円"
"250円251円252円253円254円255円256円257円"
"260円261円262円263円264円265円266円267円"
"270円271円272円273円274円275円276円277円"
"300円301円302円303円304円305円306円307円"
"310円311円312円313円314円315円316円317円"
"320円321円322円323円324円325円326円327円"
"330円331円332円333円334円335円336円337円"
"340円341円342円343円344円345円346円347円"
"350円351円352円353円354円355円356円357円"
"360円361円362円363円364円365円366円367円"
"370円371円372円373円374円375円376円377円";
Similar to strcspn
is strpbrk
, but instead of returning offsets, it returns a pointer to the next match and NULL if no more matches. This makes the replacement as simple as:
while ((filename = strpbrk(filename , "\\/:?\"<>|")) != NULL)
*filename++ = '_';