12
\$\begingroup\$

You will be given a block of text and a cursor position at the end of a possibly incomplete word.

You must output a suggestion for what this word (let's call it w) may be, based on existing words in the text (let's call the collection of words e). The suggestion is decided as follows (choosing the earliest matched rule):

  1. if there are words in e which start with w, return the shortest one.
  2. if there are words in e which contain the letters of w in the order they appear in w (but not necessarily adjacent), return the shortest one.
  3. return the word in e which contains the most* letters of w in the order they appear in w (but not necessarily adjacent)

*Note that when counting letters for rule 3, you should proceed left to right in w when matching the word in e. So if you have w = "abcdef" and a word in e "cdeab", you would match ab. If you have w = "abcdef" and a word in e "cdefbcd" you would match bcd. In other words, iterate through the letters of w, and for each letter, jump to the next position that it appears in the word being checked (or stay at the current position if it's not present).

If there are any ties within a rule, follow the below until a tie is broken:

  1. return the shortest word
  2. return the word which appears the least number of words before the cursor. (skip this rule if none of the matches are before the cursor)
  3. return the most frequently appearing word
  4. return the earliest appearing word.

If no rules are matched, return the current word which is being typed.

Input

The text and the cursor. This could be:

  • A string and an integer
  • A string with a # in it representing the cursor
  • An array of characters and an integer
  • Etc

Output

The resulting suggestion, as a string / array of characters etc.

Rules

  • The text will only contain lowercase letters and spaces (and a character for the cursor if you choose to take input in this way).
  • The cursor will always be followed by a space or the end of the text.
  • Code golf rules and scoring
  • There will always be at least one letter directly before the cursor
  • You must show some evidence that your code passes all the test cases. This could be through a link to a tio.run page, or through some other mechanism.

Test Cases

(# represents the cursor position)

"the early bird gets t# worm" -> "the" (1)
"peter piper pick# a peck of pickled peppers a peck of pickled peppers peter piper picked" -> "picked" (1)
"the# quick brown fox jumps over the lazy dog" -> "the" (1)
"dichlorodiphenyltrichloroethane is a chemical originally intended for insecticides and is commonly known as ddt#" -> "dichlorodiphenyltrichloroethane" (2)
"many runners run for years before they rn# a marathon" -> "run" (2)
"she sells es# shells by the sea shore" -> "sells" (2)
"how much wood would a woodch# chuck" -> "wood" (3)
"bad abcdefg dbf#" -> "abcdefg" (3)
"aaaabbbb ba# ab" -> "ab" (3)
"cdefbcd abcdef# ab" -> "cdefbcd" (3)
"water wafer wa#" -> "wafer" (1, tiebreak 1)
"wafer wa# water" -> "wafer" (1, tiebreak 1)
"wa# water wafer wafer" -> "wafer" (1, tiebreak 2)
"wa# water wafer" -> "water" (1, tiebreak 3)
"abcd arcs ac#" -> "arcs" (2, tiebreak 1)
"ac# abcd arcs arcs" -> "arcs" (2, tiebreak 2)
"ac# abcd arcs" -> "abcd" (2, tiebreak 3)
"abcd arcs ca#" -> "arcs" (3, tiebreak 1)
"ca# abcd arcs arcs" -> "arcs" (3, tiebreak 2)
"ca# abcd arcs" -> "abcd" (3, tiebreak 3)
"cup# of tea" -> "cup" (no match)
asked Sep 6 at 7:07
\$\endgroup\$
2
  • 1
    \$\begingroup\$ Lovely attention to detail to provide the rules/ties relevant for each test case! \$\endgroup\$ Commented Sep 6 at 16:14
  • \$\begingroup\$ @JonathanAllan thanks :) \$\endgroup\$ Commented Sep 6 at 17:26

2 Answers 2

3
\$\begingroup\$

Jelly, 45 bytes

RFḟ@ḣ18;
ẹⱮŻç/=9Jݤ,LƲʋÐṀ9fƇLÐṂfṚȯÆṃ'{ʋ8ḣ5¤Ḣȯ

A full program that accepts the existing words, e, the word at the cursor, w, and the zero index of w, i, and prints the result to stdout.

Try it online! Or see the test-suite (employs the register, ®, in place of 5 to allow multiple tests.).

How?

ẹⱮŻç/=9Jݤ,LƲʋÐṀ9fƇ... - Main Link: e, w
 ʋÐṀ9 - keep those x of e maximal under f(x, w):
ẹⱮ - all 1-indices of each c of w in x
 Ż - prefix this list of lists with a zero
 ç/ - reduce by last Link (RFḟ@ḣ18;) as a dyad, see below
 -> MatchIndices
 Ʋ - last four links as a monad - f(MatchIndices)
 = - {MatchIndices} equals...
 9Jݤ - ...indices of w prefixed with a zero? -> StartsWithW
 , - paired with...
 L - length -> [StartsWithW, MatchCount]
 fƇ - only keep those with any match
 -> PotentialMatches (ready for tie-breaks)
RFḟ@ḣ18; - Helper Link, get MatchIndices: CurrentMatchIndices, NextIndices = [e.indices_of(c)]
R - {CurrentMatchIndices} Range -> [[1..v] for v in CurrentMatchIndices]
 F - flatten
 ḟ@ - filter from {NextIndices}
 ḣ1 - head to 1-index 1 (no-op given an empty list)
 8; - concatenate to {CurrentMatchIndices}
LÐṂfṚȯÆṃ'{ʋ8ḣ5¤Ḣȯ - ...Main Link continued:PotentialMatches
LÐṂ - keep those of minimal length -> ShortestMatches
 8ḣ5¤ - get WordsLeftOfCursor
 ʋ - last four links as a monad -> f(ShortestMatches, WordsLeftOfCursor):
 f - {ShortestMatches} filter keep {WordsLeftOfCursor}
 Ṛ - reverse
 ȯ - logical OR... (i.e. if empty)
 Æṃ'{ - ...mode of {ShortestMatches} -> Most frequent(s), in left-right order
 Ḣ - head -> WinningMatch or 0 if none found
 ȯ - logical OR {w}
 - implicit print
answered Sep 7 at 12:46
\$\endgroup\$
2
\$\begingroup\$

Charcoal, 95 bytes

≔⪪S θ≔⊟ΦθNoι#η⊞υ⟦0χ−η#⟧F−θ⟦η⟧«≔¬⌕ι−η#ζ≔ιεFη«≧+¬¬Noεκζ≔✂ε⊕⌕εκLε1ε»⊞υ⟦ζ±Lι±⌕⊞O⮌...θ⌕θηιιNoθι±⌕θιι⟧»⊟⌈υ

Try it online! Link is to verbose version of code. Explanation:

≔⪪S θ≔⊟ΦθNoι#η

Split the input into words and find the cursor word w.

⊞υ⟦0χ−η#⟧

w counts as the shortest word having no matching letters. This prioritises it over words with no matching letters but words with at least one matching letter take priority over w.

F−θ⟦η⟧«

Loop over the other words.

≔¬⌕ι−η#ζ

Prioritise a word that w is a prefix of. (This word will already have all of the letters of w in order, but it just needs a little extra boost.)

≔ιεFη«≧+¬¬Noεκζ≔✂ε⊕⌕εκLε1ε»

Count how many letters of w appear in the word in order.

⊞υ⟦ζ±Lι±⌕⊞O⮌...θ⌕θηιιNoθι±⌕θιι⟧

Push this and all of the tiebreak values to the list of results.

»⊟⌈υ

Output the word that is the best match.

answered Sep 8 at 11:04
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.