I'm trying to verify whether a subdomain entered by a user is valid, but whatever I pass in, it's never valid. I know the regex is ok, so the problem must be my "if" logic, but I'm new to shell/bash.
#!/bin/bash
#
echo Enter the subdomain\'s name to configure.
read SUBDOMAIN
if [[ ! $SUBDOMAIN =~ [A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])? ]]; then
echo "$SUBDOMAIN is not a valid domain"
fi
Examples:
Would be accepted (regular subdomain names): test
Would not be accepted (invalid subdomain name): -
Would not be accepted (invalid subdomain name): (Empty)
Would not be accepted (invalid subdomain name): #$??&@#&?$##$
I would prefer using shell, but the parentheses in the regex make the script throw an error.
I'm not sure if it can be done with grep, but I never understood how to use grep and it always confused me.
- Likely related: Bash =~ regex and https://regex101.com/ – steeldriver, Apr 30, 2018 at 15:51
- @roaima DONE! :) – NaturalBornCamper, Apr 30, 2018 at 16:04
- @steeldriver I checked it out but "set -o rematchpcre" doesn't work – NaturalBornCamper, Apr 30, 2018 at 16:06
- @roaima Because subdomains can contain dashes for example, but cannot start with a dash – NaturalBornCamper, Apr 30, 2018 at 16:12
1 Answer
If you're trying to match "alphanumeric" followed by "alphanumeric or dash", ensuring there's not a dash at the end, such that there is a total of 1..63 characters, this RE will work for you:
^[[:alnum:]](([[:alnum:]]|-){0,61}[[:alnum:]])?$
This binds to the beginning and end of the string, so the RE must match the string in its entirety.
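- Start of line: ^
- A single alphanumeric, any case: [[:alnum:]]
- An optional block (bracketed with ( ... ) and terminated with ?) containing:
  - either an alphanumeric [[:alnum:]] or a dash -, repeated 0..61 times
  - a single closing alphanumeric [[:alnum:]], so the string cannot end with a dash
- End of line: $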
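As a rough sketch of how this RE could slot into the script from the question: bash's =~ operator uses POSIX extended regular expressions, which have no (?:...) non-capturing group, so a plain ( ... ) group is used instead. The function name is_valid_subdomain and the variable subdomain_re are illustrative names, not part of the original post.

```bash
#!/bin/bash
# Sketch only: is_valid_subdomain and subdomain_re are hypothetical names.

# Keep the pattern in a variable and expand it unquoted on the right-hand
# side of =~ so that bash treats it as a regex, not a literal string.
subdomain_re='^[[:alnum:]](([[:alnum:]]|-){0,61}[[:alnum:]])?$'

is_valid_subdomain() {
    [[ $1 =~ $subdomain_re ]]
}

# Sample inputs from the question, plus a dashed name and a trailing-dash case.
for candidate in 'test' 'my-site' '-' '' 'abc-' '#$??&@#&?$##$'; do
    if is_valid_subdomain "$candidate"; then
        echo "valid:   '$candidate'"
    else
        echo "invalid: '$candidate'"
    fi
done
```

Only 'test' and 'my-site' are reported as valid here. In the original script the check would then read: if ! is_valid_subdomain "$SUBDOMAIN"; then echo "$SUBDOMAIN is not a valid domain"; fi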
As has been recommended in the comments under this answer, I should point out that the [[:alnum:]] character class is affected by the current locale. If you want to ensure that it matches only "ASCII" A-Z, a-z and 0-9 you need to ensure it's running with LANG=C (or LC_ALL=C, which overrides LANG and the other LC_* variables). Otherwise you may find that additional characters are accepted, such as á é ø ß and others.
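One possible way to pin the locale for just this check, as a sketch: the function body below is a subshell, so the LC_ALL assignment stays local to the test. The name is_ascii_subdomain is hypothetical, and LC_ALL=C is used here instead of LANG=C on the assumption that LC_ALL may already be set in the caller's environment.

```bash
#!/bin/bash
# Sketch only: is_ascii_subdomain is a hypothetical helper name.
subdomain_re='^[[:alnum:]](([[:alnum:]]|-){0,61}[[:alnum:]])?$'

# The body is written as a subshell ( ... ) rather than { ... }, so the
# locale change does not leak into the rest of the script.
is_ascii_subdomain() (
    LC_ALL=C
    [[ $1 =~ $subdomain_re ]]
)

for candidate in 'cafe' 'café'; do
    if is_ascii_subdomain "$candidate"; then
        echo "valid:   '$candidate'"
    else
        echo "invalid: '$candidate'"
    fi
done
```

With the C locale forced, 'cafe' should be accepted and 'café' rejected, whatever locale the calling shell runs in.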
- Thanks friend! Your regex looks much better! I just have to change the regex a bit so subdomains can't end with a dash as well and it's all good :) – NaturalBornCamper, Apr 30, 2018 at 16:21
- @NaturalBornCamper that's actually a little more complicated than it sounds – Chris Davies, Apr 30, 2018 at 16:23
- Nope, what you gave me got me started, I just changed your answer a bit and it's working: if [[ ! $SUBDOMAIN =~ ^[[:alnum:]]([[:alnum:]]|-){0,61}[[:alnum:]]$ ]]; – NaturalBornCamper, Apr 30, 2018 at 16:26
- @NaturalBornCamper that will fail with a single character entry. It will also accept a 63 character string. Please see the amended answer for my suggestion. – Chris Davies, Apr 30, 2018 at 16:27
- @roaima Since you are writing an answer about what you do know it follows that it is reasonable that you should make a note about the a-z ranges matching many UNICODE characters and not leave that hidden. – user232326, May 1, 2018 at 7:40