What is the most efficient way to remove every Nth character from a string?
For instance,
n = 5;
str = '1234A1234B1234C';
result: "123412341234"
This is my approach:
def delete_each_n(str, n)
  i = n
  str.length/n.times do
    str.slice!(i-1)
    i += (n - 1)
  end
  str
end
4 Answers
If you're interested in performance, and your particular circumstances restrict the characters in the string to those in the ASCII range, then there is something to be said for avoiding the overhead of multibyte character operations and assuming single byte operations will work for you.
The benchmark below was run with Ruby 2.4.2 on a MacBook Pro. Method 4, which uses bytesize and byteslice and avoids regexp and map, is twice as fast as the next fastest on longer strings, and five times as fast on the original example.
2.4.2 :001 > def method1(string, n)
2.4.2 :002?> string.gsub(/.{#{n}}/){ |sub| sub.chop }
2.4.2 :003?> end
=> :method1
2.4.2 :004 >
2.4.2 :005 > def method2(string, index)
2.4.2 :006?> # Here I use a regular expression to split the string every n characters
2.4.2 :007 > substrings = string.split(%r{(.{#{index}})})
2.4.2 :008?> .reject(&:empty?) # And cut out any empty strings that appear
2.4.2 :009?>
2.4.2 :010 > # Then we can merge the substrings together, without the last character in each substring
2.4.2 :011 > substrings.map do |substring|
2.4.2 :012 > substring.length < index ? substring : substring[0..-2]
2.4.2 :013?> end.join
2.4.2 :014?> end
=> :method2
2.4.2 :015 >
2.4.2 :016 > def method3(string, index)
2.4.2 :017?> string.gsub(/(.{#{index-1}})./, '\1')
2.4.2 :018?> end
=> :method3
2.4.2 :019 >
2.4.2 :020 > def method4(string, n)
2.4.2 :021?> length = n - 1
2.4.2 :022?> (0..(string.bytesize / n)).each_with_object("") do |x, new_string|
2.4.2 :023 > new_string << string.byteslice(x*n, length)
2.4.2 :024?> end
2.4.2 :025?> end
=> :method4
2.4.2 :026 >
2.4.2 :027 > require 'benchmark'
=> true
2.4.2 :028 >
2.4.2 :029 > runs = 100000
=> 100000
2.4.2 :030 > Benchmark.bm(7) do |x|
2.4.2 :031 > string = '1234A1234B1234C'
2.4.2 :032?> n = 5
2.4.2 :033?> x.report("method 0") { runs.times {}}
2.4.2 :034?> x.report("method 1") { runs.times {method1(string, n)}}
2.4.2 :035?> x.report("method 2") { runs.times {method2(string, n)}}
2.4.2 :036?> x.report("method 3") { runs.times {method3(string, n)}}
2.4.2 :037?> x.report("method 4") { runs.times {method4(string, n)}}
2.4.2 :038?> end ; ""
user system total real
method 0 0.000000 0.000000 0.000000 ( 0.003366)
method 1 0.570000 0.000000 0.570000 ( 0.572950)
method 2 0.670000 0.000000 0.670000 ( 0.666871)
method 3 0.750000 0.000000 0.750000 ( 0.763856)
method 4 0.120000 0.000000 0.120000 ( 0.118647)
=> ""
2.4.2 :039 >
2.4.2 :040 > runs = 50000
=> 50000
2.4.2 :041 > Benchmark.bm(7) do |x|
2.4.2 :042 > string = '1234A1234B1234C'*50
2.4.2 :043?> n = 2
2.4.2 :044?> x.report("method 0") { runs.times {}}
2.4.2 :045?> x.report("method 1") { runs.times {method1(string, n)}}
2.4.2 :046?> x.report("method 2") { runs.times {method2(string, n)}}
2.4.2 :047?> x.report("method 3") { runs.times {method3(string, n)}}
2.4.2 :048?> x.report("method 4") { runs.times {method4(string, n)}}
2.4.2 :049?> end ; ""
user system total real
method 0 0.000000 0.000000 0.000000 ( 0.001685)
method 1 7.110000 0.010000 7.120000 ( 7.131064)
method 2 11.450000 0.010000 11.460000 ( 11.475658)
method 3 9.640000 0.070000 9.710000 ( 9.721599)
method 4 3.750000 0.010000 3.760000 ( 3.758784)
=> ""
2.4.2 :050 >
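The ASCII caveat above is worth seeing concretely. The following is only a sketch, and the method names are my own rather than anything from the benchmark: it compares a byte-oriented version in the style of method 4 with a character-oriented one built on each_char and each_slice. On plain ASCII the two agree, but on a string of two-byte characters the byte version cuts characters in half.

```ruby
# Byte-oriented, in the style of method 4: only safe for single-byte characters.
# (+"" gives an unfrozen empty string to append to.)
def delete_every_nth_byte(string, n)
  (0..(string.bytesize / n)).each_with_object(+"") do |x, out|
    out << string.byteslice(x * n, n - 1)
  end
end

# Character-oriented: groups characters (not bytes) into chunks of n
# and keeps at most the first n - 1 characters of each chunk.
def delete_every_nth_character(string, n)
  string.each_char.each_slice(n).map { |chunk| chunk.first(n - 1).join }.join
end

ascii    = '1234A1234B1234C'
accented = "\u00e9" * 5 # five characters, two bytes each in UTF-8

puts delete_every_nth_byte(ascii, 5)
puts delete_every_nth_character(ascii, 5)
# The byte version leaves half-characters behind on multibyte input:
puts delete_every_nth_byte(accented, 2).valid_encoding?
puts delete_every_nth_character(accented, 2).length
```

So the speed of the byte-based approach is only worth having if you can guarantee single-byte input.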
Your code
You have a sneaky bug in your code!
1000/2.times {|i| puts i }
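You seem to think that this code would display the 500 numbers from 0 to 499.
It doesn't. Instead, it displays 0 and 1, and returns 500. The method call binds more tightly than the division, so the expression is parsed as 1000/(2.times { ... }), and 2.times returns 2.
You need to replace str.length/n.times do with (str.length/n).times do.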
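Here is a small sketch of the difference, followed by the original method with only the parentheses added. The counter variables are just for illustration, and note that the method still modifies the string it is given.

```ruby
# The method call binds first, so this is 1000 / (2.times { ... }).
calls = 0
result = 1000 / 2.times { calls += 1 }
puts calls  # the block ran twice
puts result # Integer#times returns its receiver (2), so this is 1000 / 2

# With parentheses the division happens first.
fixed_calls = 0
(1000 / 2).times { fixed_calls += 1 }
puts fixed_calls

# The original method with only that change applied.
# It still mutates its argument, hence the dup in the call below.
def delete_each_n(str, n)
  i = n
  (str.length / n).times do
    str.slice!(i - 1)
    i += (n - 1)
  end
  str
end

puts delete_each_n('1234A1234B1234C'.dup, 5)
```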
Alternative
You can use gsub to look for the substrings, chop them and replace them:
def delete_every_nth_char(string, n)
  string.gsub(/.{#{n}}/) { |sub| sub.chop }
end
delete_every_nth_char('1234A1234B1234C', 5)
# "123412341234"
delete_every_nth_char('ABAB', 2)
# "AA"
delete_every_nth_char('ABA', 2)
# "AA"
delete_every_nth_char('ABA', 1)
# ""
delete_every_nth_char('ABA', 5)
# "ABA"
It is concise and probably faster than splitting and joining the strings manually.
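One caveat: in Ruby regular expressions, . does not match a newline unless the m flag is given, so the version above skips over line breaks and restarts its count after each one. If the string may span several lines, a variant along these lines might be closer to what you want. The method name is made up for this example, and I've swapped chop for a slice because chop removes a trailing "\r\n" as a single unit.

```ruby
# Same idea as the gsub version, but /m lets . match "\n" as well,
# so characters are counted across line breaks.
def delete_every_nth_char_multiline(string, n)
  # sub[0...-1] drops exactly one character, even if the chunk ends in "\r\n".
  string.gsub(/.{#{n}}/m) { |sub| sub[0...-1] }
end

puts delete_every_nth_char_multiline('1234A1234B1234C', 5)
puts delete_every_nth_char_multiline("ab\ncd", 2).inspect
```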
One way to do it would be to split the string into substrings of length n, then remove the last character from each substring. So, for example:
def delete_each_n(string, index)
  # Here I use a regular expression to split the string every n characters
  substrings = string.split(%r{(.{#{index}})})
                     .reject(&:empty?) # And cut out any empty strings that appear
  # Then we can merge the substrings together, without the last character in each substring
  substrings.map do |substring|
    substring.length < index ? substring : substring[0..-2]
  end.join
end
Another way would be:
str.gsub(/(.{#{n-1}})./, '\1')
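To spell out how that one-liner works, as far as I can tell: the group captures the first n - 1 characters of each run of n, the trailing . consumes the nth character, and the replacement puts back only the captured part. Wrapped in a method (the name is mine, not the answerer's), it can be checked against the examples from the other answers:

```ruby
def delete_every_nth_char_with_capture(str, n)
  # (.{n-1}) captures the characters to keep; the final . matches the one to drop.
  # '\1' in single quotes is a backreference to the captured group.
  str.gsub(/(.{#{n - 1}})./, '\1')
end

puts delete_every_nth_char_with_capture('1234A1234B1234C', 5)
puts delete_every_nth_char_with_capture('ABA', 2)
```

Unlike the slice! approach, gsub returns a new string and leaves the original untouched.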
I think you should explain why your alternative is better. - Billal BEGUERADJ, Feb 12, 2018 at 7:15
delete_each_n("abcabc", 2) #=> "acbc", when I expected that you wanted "acb".