Getting the Twitter account from various input formats

Question 1

I have an input field where users can enter a Twitter account in several different formats, and I want to extract the username.

For example:

twitters = [
 "www.twitter.com/twitteruser1",
 "@twitteruser2",
 "twitteruser3",
 "https://twitter.com/twitteruser4",
 "https://www.twitter.com/twitteruser5",
 "www.twitter.com/twitteruser6",
 "http://www.twitter.com/twitteruser7",
 "http://www.twitter.com/twitteruser8",
 "twitter.com/twitteruser9"
]

The script that I've written to extract the data is the following:

twitters.each do |twitter|
  # URL forms, e.g. "www.twitter.com/twitteruser1"
  twitter_user = twitter.match(/twitter.com\/([^\/.]*)$/)
  if twitter_user != nil
    puts twitter_user[1]
    next
  end
  # forms beginning with @, e.g. "@twitteruser2"
  twitter_user = twitter.match(/^@([^\/.]*)$/)
  if twitter_user != nil
    puts twitter_user[1]
    next
  end
  # if we get here, nothing matched, so print the input as-is
  puts twitter
end

It actually works, outputting the following:

twitteruser1
twitteruser2
twitteruser3
twitteruser4
twitteruser5
twitteruser6
twitteruser7
twitteruser8
twitteruser9

But as I'm really new to Ruby, I wanted to check for possible improvements.

Question 2

I would put the regular expressions into a list:

TWITTER_PATTERNS = [
 /twitter.com\/([^\/.]*)$/, # "www.twitter.com/twitteruser1"
 /^@([^\/.]*)$/, # "@twitteruser2"
]

This function can iterate over that list:

def twitter_user(twitter)
  TWITTER_PATTERNS.each do |pattern|
    return $1 if twitter =~ pattern
  end
  twitter
end

This clearly separates the "policy" (what patterns are used to extract twitter users) from the "mechanism" (the loop we use to apply the policy). A change to the function's implementation won't require a change to the patterns, and vice versa.
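As a rough sketch of how the two pieces fit together, the snippet below runs the function over a few sample inputs. The escaped dot and the + quantifier are small assumptions of mine that tighten the patterns slightly, and the sample list is only illustrative.

```ruby
# Policy: the patterns are plain data. Supporting a new input format
# means adding a line here, not touching the method below.
# Assumption: escaped dot and + instead of *, which is stricter than
# the patterns shown above.
TWITTER_PATTERNS = [
  /twitter\.com\/([^\/.]+)$/, # "www.twitter.com/twitteruser1"
  /^@([^\/.]+)$/              # "@twitteruser2"
].freeze

# Mechanism: return the first capture of the first pattern that matches,
# or the input unchanged when no pattern matches.
def twitter_user(twitter)
  TWITTER_PATTERNS.each do |pattern|
    return $1 if twitter =~ pattern
  end
  twitter
end

samples = ["www.twitter.com/twitteruser1", "@twitteruser2", "twitteruser3"]
samples.each { |sample| puts twitter_user(sample) }
```

Each sample should print as a bare username, with the third one falling through to the unchanged input.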

Question 3

You can use a Ruby case statement to check for different regex matches. If a match is found, it is by definition not nil, so it removes the need for a nil check as well.

twitters.each do |twitter|
  case twitter
  # for the url
  when /twitter.com\/([^\/.]*)$/
    puts $1
  # for @ beginning
  when /^@([^\/.]*)$/
    puts $1
  # if we get here, nothing matched
  else
    puts twitter
  end
end

When Ruby does a regex match, any capture groups are assigned to the global variables $1, $2, $3, and so on. See this question on Stack Overflow for more details about the mechanism.
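Since both when branches do the same thing with $1, they could arguably share one branch. Here is a minimal sketch of that idea, wrapped in a method so the result can be returned instead of printed; the name handle_from is made up for illustration.

```ruby
# Hypothetical helper: returns the extracted username, or the input
# unchanged when neither pattern matches.
def handle_from(twitter)
  case twitter
  when /twitter.com\/([^\/.]*)$/, /^@([^\/.]*)$/
    # case/when tested the patterns with Regexp#===, which performs the
    # match, so $1 holds the first capture group of whichever pattern
    # succeeded. Regexp.last_match(1) is an equivalent spelling.
    $1
  else
    twitter
  end
end

puts handle_from("https://twitter.com/twitteruser4") # twitteruser4
puts handle_from("@twitteruser2")                    # twitteruser2
puts handle_from("twitteruser3")                     # twitteruser3
```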

Note on your regex: If you want the capture groups to actually contain anything, you should change the asterisk to a plus: ([^\/.]+). With the asterisk, "www.twitter.com/" is matched by the first regex and the capture group is an empty string. With the plus, that string will fall into the default case.
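To make the difference concrete, here is a small comparison of the two quantifiers on that input. The helper name capture_with is hypothetical and exists only for this demonstration.

```ruby
# Hypothetical helper: returns capture group 1, or nil when the
# pattern does not match at all.
def capture_with(pattern, input)
  match = pattern.match(input)
  match && match[1]
end

input = "www.twitter.com/"

# With *, the pattern matches and captures an empty string.
p capture_with(/twitter.com\/([^\/.]*)$/, input) # ""

# With +, the pattern does not match, so the default case would run.
p capture_with(/twitter.com\/([^\/.]+)$/, input) # nil
```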

Question 4

twitters.each do |twitter|
  puts twitter.match(/twitter.com\/([^\/.]*)$/) ||
       twitter.match(/^@([^\/.]*)$/) ||
       twitter
end

It's shorter, and I'd say at least equally readable. A chain of || operators returns the first truthy value.
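One thing worth checking with this approach is what the first truthy value actually is. The sketch below wraps the same chain in a method (the name first_truthy is made up) and inspects the result: when a pattern matches, the value is a MatchData object, and puts prints its to_s, which is the whole matched text rather than the capture group.

```ruby
# Hypothetical wrapper around the same || chain, returning the value
# instead of printing it.
def first_truthy(twitter)
  twitter.match(/twitter.com\/([^\/.]*)$/) ||
    twitter.match(/^@([^\/.]*)$/) ||
    twitter
end

result = first_truthy("@twitteruser2")
p result.class # MatchData
puts result    # @twitteruser2 (the full match, including the @)
puts result[1] # twitteruser2 (the capture group)
```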

Question 5

Hi @Borsunho, thanks for the quick response. The problem is that in the first two cases I need the value of twitter.match(...)[1], and if I access [1] without checking whether twitter.match(...) is nil, then for accounts that don't match the first case the script fails with: undefined method `[]' for nil:NilClass

Question 6

I suspect there is a bug here: puts receives the whole MatchData rather than the capture group. If I'm right, it would be fixed by appending &.[](1) to each of the two match expressions, but the fix makes the code pretty ugly. There's probably a better fix.
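One candidate for a tidier fix, offered as a sketch rather than the definitive answer, is String#[] with a regex and a capture index. It returns the captured group directly, or nil when the regex does not match, so the || chain keeps working without any nil check. The method name extract_handle is hypothetical.

```ruby
# str[regex, 1] returns capture group 1, or nil when the regex does not
# match, so there is never a [1] call on a possibly-nil MatchData.
def extract_handle(twitter)
  twitter[/twitter.com\/([^\/.]*)$/, 1] ||
    twitter[/^@([^\/.]*)$/, 1] ||
    twitter
end

puts extract_handle("http://www.twitter.com/twitteruser7") # twitteruser7
puts extract_handle("@twitteruser2")                       # twitteruser2
puts extract_handle("twitteruser3")                        # twitteruser3
```

Note that an empty capture such as the one from "www.twitter.com/" is still truthy in Ruby, so this variant returns an empty string for that input unless the asterisk is changed to a plus.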

score 2 · Accepted Answer · 2016-12-17 22:41:22Z (the answer above that puts the regular expressions into a list)
