Memory/Time usage on substring search code

Question 1

I have implemented a substring search, which works as expected and provides the right output. On some edge cases tho, I get either Memory or Time limit exceeded because:

the string length is too big
there are too many queries

The code asks for a few inputs:

a string
the number of queries
query

Info about the string and the queries:

string length is not bigger than 300.000 and consists of English lowercase characters.
max number of queries is 300.000
sum of the queries length is not bigger than 300.000

I get MLE if I use HashMap caching, and I get TLE without it.

Where can my code be improved, so it also handles those edge cases?

use std::collections::HashMap;
use std::io::{stdin, stdout, BufWriter, Write};
fn main() {
 let out = &mut BufWriter::new(stdout());
 let mut s = String::new();
 stdin().read_line(&mut s).expect("Failed to read input");
 let s = s.trim();
 let mut q = String::new();
 stdin()
 .read_line(&mut q)
 .expect("Failed to read number of queries");
 let q: u32 = q.trim().parse().unwrap();
 if q == 0 {
 writeln!(out, "No").ok();
 }
 let sorted_suffixes = suffix_array(&s);
 let mut saved_idx = HashMap::with_capacity(q as usize);
 let mut saved_results = HashMap::with_capacity(q as usize);
 for idx in 0..q {
 let mut i = String::new();
 stdin()
 .read_line(&mut i)
 .unwrap_or_else(|_| panic!("Failed to read input {}", idx));
 let i = i.trim();
 let result = saved_results.entry(i.to_string()).or_insert_with(|| {
 let sorted_suffixes = saved_idx
 .entry(i.len())
 .or_insert_with(|| transform(&sorted_suffixes, s, i.len()));
 match sorted_suffixes.binary_search(&i) {
 Ok(_) => "Yes",
 _ => "No",
 }
 });
 writeln!(out, "{}", result).ok();
 }
}
fn transform<'a>(prep: &[usize], s: &'a str, max_len: usize) -> Vec<&'a str> {
 let str_len = s.len();
 let mut result = Vec::with_capacity(prep.len());
 for &p in prep {
 if p + max_len > str_len {
 result.push(&s[p..])
 } else {
 result.push(&s[p..p + max_len])
 }
 }
 result
}
fn suffix_array(s: &str) -> Vec<usize> {
 let s = format!("{}{}", s, "$");
 let str_len = s.len();
 let (mut prep, mut class) = (vec![0; str_len], vec![0; str_len]);
 let mut arr: Vec<(char, usize)> = Vec::with_capacity(str_len);
 for (idx, ch) in s.chars().enumerate() {
 arr.push((ch, idx));
 }
 arr.sort_by(|a, b| a.0.cmp(&b.0));
 for (idx, a) in arr.iter().enumerate() {
 prep[idx] = a.1;
 }
 for (idx, _) in s.chars().enumerate().skip(1) {
 if arr[idx].0 == arr[idx - 1].0 {
 class[prep[idx]] = class[prep[idx - 1]];
 } else {
 class[prep[idx]] = class[prep[idx - 1]] + 1;
 }
 }
 let mut k = 0;
 while (1 << k) < str_len {
 prep = prep
 .iter()
 .map(|p| (p + str_len - (1 << k)) % str_len)
 .collect();
 count_sort(&mut prep, &class);
 let mut c_new = vec![0; str_len];
 for idx in 1..str_len {
 let prev = (
 class[prep[idx - 1]],
 class[(prep[idx - 1] + (1 << k)) % str_len],
 );
 let cur = (class[prep[idx]], class[(prep[idx] + (1 << k)) % str_len]);
 if cur == prev {
 c_new[prep[idx]] = c_new[prep[idx - 1]];
 } else {
 c_new[prep[idx]] = c_new[prep[idx - 1]] + 1;
 }
 }
 class = c_new;
 k += 1;
 }
 prep
}
fn count_sort(p: &mut Vec<usize>, c: &[i32]) {
 let n = &p.len();
 let mut cnt = vec![0; *n];
 for idx in 0..c.len() {
 cnt[c[idx] as usize] += 1;
 }
 let mut p_new = vec![0; *n];
 let mut pos = vec![0; *n];
 for idx in 1..*n {
 pos[idx] = pos[idx - 1] + cnt[idx - 1];
 }
 p.iter().for_each(|&item| {
 let i = c[item] as usize;
 p_new[pos[i]] = item;
 pos[i] += 1;
 });
 *p = p_new;
}

Question 2

Memory / time usage aside, is your code producing correct results, to the best of your knowledge? If that is the case, your question might be on-topic; please edit your question to include the clarification. Also, please include a more detailed description of what your code does and/or some test cases, if possible.

Question 3

Thanks. The code is correct and produces the right result. I'll make it clearer.

Question 4

Can you re-open it?

Question 5

It has been reopened.

Question 6

Some questions: 1. are the strings guaranteed to consist of a fixed set of bytes? 2. what is the rough scale of the lengths of the string and queries, as well as the number of queries?

Question 7

suffix_array

First major issue I found while checking code was this cycle

let mut k = 0;
while (1 << k) < str_len {
 prep = prep
 .iter()
 .map(|p| (p + str_len - (1 << k)) % str_len)
 .collect(); // what exactly should've happened here
 count_sort(&mut prep, &class); // nvm just sort it back
 let mut c_new = vec![0; str_len];
 for idx in 1..str_len {
 let prev = (
 class[prep[idx - 1]],
 class[(prep[idx - 1] + (1 << k)) % str_len],
 );
 let cur = (class[prep[idx]], class[(prep[idx] + (1 << k)) % str_len]);
 if cur == prev {
 c_new[prep[idx]] = c_new[prep[idx - 1]];
 } else {
 c_new[prep[idx]] = c_new[prep[idx - 1]] + 1;
 }
 }
 class = c_new;
 k += 1;
}

it does something to prep, than calls count_sort on it, and then messes up with class. After second iteration looks like prep doesn't change at all as well as class, so it basically waste time on calling count_sortand allocating/deallocating memory buffers up to 300k times. It is easily replaceable with single call of count_sort.

second minor issue - for (idx, _) in s.chars().enumerate().skip(1) it's really complicated way of saying for idx in 1..str_len. Compiler probably will optimize unused things, but it is still potential slowdown.

Also

let (mut prep, mut class) = (vec![0; str_len], vec![0; str_len]);

is harder to read and make really no sense to unbind tuple you've just created. So better is separate them and make them Vec<usize> because there's no negative values in the algorithm.

let mut prep = vec![0usize; str_len];
let mut class = vec![0usize; str_len];

Also requires to fix count_sort second parameter type and remove unnecessary cast as usize.

count_sort

not sure why you've decided to make n as a pointer, it isn't really helps anywhere and doesn't affect performance as well. might be a good idea to change function from

fn count_sort(p: &mut Vec<usize>, class: &[usize])

to

fn count_sort(p: Vec<usize>, class: &[usize]) -> Vec<usize>

because function will discard previous values anyway. Probably it will help compiler to optimize some calls away and make it easier to read.

Question 8

About the mapping inside the while loop: it sorts prep, first it will sort by the first char, then the first 2 chars, first 4... doubling each time and comparing the values of both sides. The code doesn't work without it. So let's say before the iteration prep is (using the equivalent chars): [,ドル a, a, a, b, b, b] This doesn't mean that prep is sorted because the 2nd chars of the first iteration could be something like: [$a, ab, ab, a,ドル ba, bb, ba]. We need the 2nd iteration to sort it to: [$a, a,ドル ab, ab, ba, ba, bb]. And without this sorting, the binary search won't work.

Question 9

if you check content of prep on every cycle of the loop you will notice nothing really changes. It doesn't change content of prep after first cycle. Same stands for class - it changes only once and then it produces the same output. So it cycle just waste time or lacks some extra logic. You can check output on playground. Comment cycle and uncomment sort.

Question 10

Looks like you are correct. but for some reason, this is returning "No" when I uncomment count and comment cycle (even tho fla is there):

fla in ["", "ajf", "ajs", "a", "fhl", "fla", "fka", "hkf", "hla", "jhk", "jfl", "jsf", "klj", "kfh", "ka", "ljh", "laj", "laj", "sfk"]

Question 11

Ah, without the cycle prep is not properly sorted, in the case above fla comes before fka and binary search fails to find it

Question 12

Hmmm, looks like i was a bit wrong - cycle repeats itself after second loop, so if make it while (1 << k) < 2 it pops proper array and succeeds on finding right answer.

Sugar SugarSugar 3462 silver badges8 bronze badges · Accepted Answer · 2020-09-08 19:26:29Z

suffix_array

First major issue I found while checking code was this cycle

let mut k = 0;
while (1 << k) < str_len {
 prep = prep
 .iter()
 .map(|p| (p + str_len - (1 << k)) % str_len)
 .collect(); // what exactly should've happened here
 count_sort(&mut prep, &class); // nvm just sort it back
 let mut c_new = vec![0; str_len];
 for idx in 1..str_len {
 let prev = (
 class[prep[idx - 1]],
 class[(prep[idx - 1] + (1 << k)) % str_len],
 );
 let cur = (class[prep[idx]], class[(prep[idx] + (1 << k)) % str_len]);
 if cur == prev {
 c_new[prep[idx]] = c_new[prep[idx - 1]];
 } else {
 c_new[prep[idx]] = c_new[prep[idx - 1]] + 1;
 }
 }
 class = c_new;
 k += 1;
}

it does something to prep, than calls count_sort on it, and then messes up with class. After second iteration looks like prep doesn't change at all as well as class, so it basically waste time on calling count_sortand allocating/deallocating memory buffers up to 300k times. It is easily replaceable with single call of count_sort.

second minor issue - for (idx, _) in s.chars().enumerate().skip(1) it's really complicated way of saying for idx in 1..str_len. Compiler probably will optimize unused things, but it is still potential slowdown.

Also

let (mut prep, mut class) = (vec![0; str_len], vec![0; str_len]);

is harder to read and make really no sense to unbind tuple you've just created. So better is separate them and make them Vec<usize> because there's no negative values in the algorithm.

let mut prep = vec![0usize; str_len];
let mut class = vec![0usize; str_len];

Also requires to fix count_sort second parameter type and remove unnecessary cast as usize.

count_sort

not sure why you've decided to make n as a pointer, it isn't really helps anywhere and doesn't affect performance as well. might be a good idea to change function from

fn count_sort(p: &mut Vec<usize>, class: &[usize])

to

fn count_sort(p: Vec<usize>, class: &[usize]) -> Vec<usize>

because function will discard previous values anyway. Probably it will help compiler to optimize some calls away and make it easier to read.

About the mapping inside the while loop: it sorts prep, first it will sort by the first char, then the first 2 chars, first 4... doubling each time and comparing the values of both sides. The code doesn't work without it. So let's say before the iteration prep is (using the equivalent chars): [,ドル a, a, a, b, b, b] This doesn't mean that prep is sorted because the 2nd chars of the first iteration could be something like: [$a, ab, ab, a,ドル ba, bb, ba]. We need the 2nd iteration to sort it to: [$a, a,ドル ab, ab, ba, ba, bb]. And without this sorting, the binary search won't work.
if you check content of prep on every cycle of the loop you will notice nothing really changes. It doesn't change content of prep after first cycle. Same stands for class - it changes only once and then it produces the same output. So it cycle just waste time or lacks some extra logic. You can check output on playground. Comment cycle and uncomment sort.
Looks like you are correct. but for some reason, this is returning "No" when I uncomment count and comment cycle (even tho fla is there): fla in ["", "ajf", "ajs", "a", "fhl", "fla", "fka", "hkf", "hla", "jhk", "jfl", "jsf", "klj", "kfh", "ka", "ljh", "laj", "laj", "sfk"]
Ah, without the cycle prep is not properly sorted, in the case above fla comes before fka and binary search fails to find it
Hmmm, looks like i was a bit wrong - cycle repeats itself after second loop, so if make it while (1 << k) < 2 it pops proper array and succeeds on finding right answer.

Stack Exchange Network

Memory/Time usage on substring search code

1 Answer 1

suffix_array

count_sort

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Memory/Time usage on substring search code

1 Answer 1

suffix_array

count_sort

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions