Caesar cipher in F#

Question 1

I'm trying to broaden my horizon by learning functional programming (coming from OO, specficially C#). For this, I'm implementing some small problems to get a feeling for the language. One of my first is the Caesar cipher.

Formal problem statement:

Implement a Caesar cipher, both encoding and decoding. The key is an integer from 1 to 25. This cipher rotates the letters of the alphabet (A to Z). The encoding replaces each letter with the 1st to 25th next letter in the alphabet (wrapping Z to A). So key 2 encrypts "HI" to "JK", but key 20 encrypts "HI" to "BC". This simple "monoalphabetic substitution cipher" provides almost no security, because an attacker who has the encoded message can either use frequency analysis to guess the key, or just try all 25 keys.

One change I did implement was that the shift is not constrained to 1 - 25, but can be any (positive) number. Note also that I'm okay with converting the input/output to upper case.

What I'm most interested in is whether what I wrote is idiomatic F# or if there are better (more functional) ways of doing things.

(*
Caesar cipher - 
Implement a Caesar cipher, both encoding and decoding. 
The key is an integer from 1 to 25. This cipher rotates the letters of the alphabet (A to Z). 
The encoding replaces each letter with the 1st to 25th next letter in the alphabet (wrapping Z to A). 
So key 2 encrypts "HI" to "JK", but key 20 encrypts "HI" to "BC". 
This simple "monoalphabetic substitution cipher" provides almost no security, because an attacker who has the encoded message 
can either use frequency analysis to guess the key, or just try all 25 keys.
*)
open System
let alphabet = [| 'A' .. 'Z' |]
let tryGetIndex c =
 let idx = alphabet
 |> Array.tryFindIndex (fun t -> t = (Char.ToUpper c))
 (c, idx)
// own modulo operation because f# % doesn't work right for us if 
// we're handling negative numbers
// f#: -3 % 26 -> -3
// 'ours': modulo -3 26 -> 23
// taken from here: 
// http://gettingsharper.de/2012/02/28/how-to-implement-a-mathematically-correct-modulus-operator-in-f/
let modulo n m = ((n % m) + m) % m
let encryptIdx idx shift =
 (idx + shift) % alphabet.Length
let decryptIdx idx shift = 
 let newIdx = idx - shift
 modulo newIdx alphabet.Length
let getShifted shift shiftOperation charIdx =
 match charIdx with
 | _, Some idx -> alphabet.[shiftOperation idx shift]
 | c, None -> c
let operateCaesar operation shift text =
 if shift < 0 then
 raise (ArgumentException "only positive shifts are supported")
 (Seq.map tryGetIndex 
 >> Seq.map (getShifted shift operation)
 >> Seq.map string) text
 |> String.concat "" 
let encryptcaesar shift text = operateCaesar encryptIdx shift text
let decryptcaesar shift text = operateCaesar decryptIdx shift text
let enc = "hello world" |> encryptcaesar 500
let dec = enc |> decryptcaesar 500

Question 2

^{Disclaimer: I know almost no F#. But I know Haskell. So I think I should be able to give some insight. An F# expert can give you more, of course.}

Re-use of functions

One thing that strikes me as odd is that you have two shift functions, encryptIdx and decryptIdx. Well, the number of shift functions isn't what I'm concerned with, to be honest, but the different requirements on the shift value.

Where decryptIdx can take any shift (even a negative one), encryptIdx will return a value out of [| 0 .. 25 |] if you supply a negative one due to the property of % you've displayed nicely in the comment. That's not immediately obvious for someone, so you should mention in the documentation/comments that encryptIdx may only use a non-negative shift.

However, the bigger problem is that both implementations can get out of sync. Let's say that at some point you decide to use another alphabet, e.g.

let alphabet2 = [| '0' .. 'z' |]

You change your encryptIdx, but you forget to change it in decryptIdx. Ouch. So instead, let us write one in terms of the other, or rather in terms of another function:

let shiftIdx shift idx = 
 let newIdx = idx + shift
 modulo newIdx alphabet.Length
let encryptIdx shift idx =
 shiftIdx shift idx
let decryptIdx shift idx = 
 shiftIdx (-shift) idx

Note that I've changed the order of arguments. Why? Because it enables us to use currying:

let encryptIdx shift =
 shiftIdx shift
let decryptIdx shift = 
 shiftIdx (-shift)

Combining functions

Either way, let us now look at getShifted. I'm kidding. getShifted is perfectly fine with your current combination of tryGetIndex returning a pair.^{[remark 1]}

But operateCaesar could use a little polish. First of all, we can get rid of the shift < 0 exception, since now both shifts can handle negative numbers. Next, instead of repeatedly mapping over our text, we can map once:

let operateCaesar operation shift text =
 text 
 |> Seq.map (tryGetIndex >> getShifted shift operation >> string)
 |> String.concat ""

That's because Seq.map f >> Seq.map g is the same as Seq.map (f >> g), so we have only one Seq.map call.^{[remark 2]}

Before we leave that function, let us think about types. operation will be an int -> int. shift will be an int. And text will be a seq<char>. The result will be a String. However, if we restrict text to be a String, we can use String.map instead of Seq.map and make the function easier again:

let operateCaesar operation shift text =
 text 
 |> String.map (tryGetIndex >> getShifted shift operation)

Now for a short test. Remember my remark about currying? That

let f bar baz = g magic bar baz

is the same as

let f = g magic

With that in mind, how can you rewrite operateCaesar above or the next two lines?

let encryptcaesar shift text = operateCaesar encryptIdx shift text
let decryptcaesar shift text = operateCaesar decryptIdx shift text

The solution will be in the code below, so make sure to think about it before you scroll further.

All in all, we end up with:

open System
let alphabet = [| 'A' .. 'Z' |]
let tryGetIndex c =
 let idx = alphabet
 |> Array.tryFindIndex ((=) (Char.ToUpper c ))
 (c, idx)
let modulo n m = ((n % m) + m) % m
let shiftIdx shift idx = 
 let newIdx = idx + shift
 modulo newIdx alphabet.Length
let encryptIdx shift = shiftIdx shift
let decryptIdx shift = shiftIdx (-shift)
let getShifted shiftOperation shift charIdx =
 match charIdx with
 | _, Some idx -> alphabet.[shiftOperation shift idx]
 | c, None -> c
let operateCaesar operation shift = String.map (tryGetIndex >> getShifted operation shift)
let encryptcaesar = operateCaesar encryptIdx
let decryptcaesar = operateCaesar decryptIdx

I took the courtesy to change the order of arguments in getShifted so that it reflects the order of arguments in operateCaesar better.

Remarks on your original code in a nutshell

So what were my major critique points? Function re-use. encryptIdx and decryptIdx are well suited to be implemented in terms of each other.

Another approach

Let us change tryGetIndex slightly and make it true to its name:

let tryGetIndex c =
 alphabet
 |> Array.tryFindIndex ((=) (Char.ToUpper c))

It now returns an Option<int>, so just the index. Now we need another function, fromIndex:

let fromIndex idx = alphabet.[idx]

Now we can write a combined variant of tryGetIndex >> getShifted:

let characterShift shiftFunc c =
 tryGetIndex c |> Option.fold (fun _ v -> fromIndex (shiftFunc v)) c

The Option.fold uses the given function if the Option is Some and returns the second argument otherwise. So we either had an index, shift it and return the corresponding character, or we didn't, and therefore return the original character. Note that shiftFunc here is a function that shifts the character accordingly, e.g. encryptIdx 10. That's all in the article linked above, though.

We end up with the following operateCaesar:

let operateCaesar op shift = 
 String.map (characterShift (op shift))

_{Remark 1: We will revisit that soon enough.}
_{Remark 2: I sincerely hope that F#'s compiler does this optimization by default.}

Question 3

Thank you very much for the thorough analysis! Especially currying is something that I have to make more use of (coming from C# I'm used to supplying all arguments all the time ). Regarding the different encryptIdx and decryptIdx: You're totally correct, I should have changed encryptIdx as well; don't really know why I didn't. I'll have to take a closer look at your characterShift to fully comprehend it.

Question 4

@germi Keep in mind that I'm not a F#-person, so the review should be taken with a grain of salt. Now to the Option.fold. Option.fold takes three arguments: a combining function, a "state", and an Option. If the Option is None, the state (the original character c) is returned. If the Option is Some the combining function gets both the state and the Option's value. The state isn't interesting for us, so we ignore it (the _) and just shift the value and return the correct character from the now shifted index.

Question 5

@germi also, just in case you come across that term: the process of going from let foo bar = goo bar to let foo = goo is called eta reduction or eta conversion.

Question 6

Zeta gives a thorough answer; I'll just add the following.

The operateCaesar function could be written as:

let operateCaesar operation shift text =
 if shift < 0 then
 raise (ArgumentException "only positive shifts are supported")
 snd (text |> Seq.mapFold (fun str ch -> ch, (sprintf "%s%c" str (getShifted shift operation (tryGetIndex ch)))) "")

Which may not be clearer but just another way to do the same thing, and in one operation.

Shifting 500 positions is essentially the same as shifting 500 % alphabet.Length = 6 (signed), so you could consider any shift to be done in two consecutive alphabets (A - Z A - Z):

let alphabet = [ 'A'..'Z'] @ [ 'A'..'Z']

The position of a char can be found as:

let pos = ((int)ch - (int)alphabet.[0])

While encrypting the new position is pos + shift.

While decrypting the new position is alphatbet.Length + pos - shift

All in all my solution would be something like this:

let alphabet = [ 'A'..'Z']
let shiftText shift text =
 let alphabetLength = alphabet |> Seq.length
 let actualShift = shift % alphabetLength
 let prePos = if shift >= 0 then 0 else alphabetLength
 let offset ch = 
 let pos = ((int)ch - (int)alphabet.[0])
 match pos with
 | x when ch >= alphabet.[0] && ch <= alphabet.[alphabetLength-1] -> alphabet.[(prePos + pos + actualShift) % alphabetLength]
 | _ -> ch
 snd (text |> Seq.mapFold (fun str ch -> ch, (sprintf "%s%c" str (offset ch))) "")
let encrypt shift text = shiftText shift text
let decrypt shift text = shiftText -shift text

Question 7

Why Seq.mapFold instead of String.map? Why not Seq.fold? After all, you don't use the result of the map part at all. Also, I'm not sure whether the inline isValid (aka ch >= ... && ch <= ... is good practice. But then again, I'm not a F# developer.

Question 8

Also, given that Seq.mapFold's documentation is currently missing from the official (online) documents, I'd add some remarks there. Other than that, I think all your points are valid.

Question 9

@Zeta, Of cause you should use fold instead of mapFold. That'll make the line much more readable.

Zeta Zeta 19.6k2 gold badges57 silver badges90 bronze badges · Accepted Answer · 2017-04-30 11:31:36Z

^{Disclaimer: I know almost no F#. But I know Haskell. So I think I should be able to give some insight. An F# expert can give you more, of course.}

Re-use of functions

One thing that strikes me as odd is that you have two shift functions, encryptIdx and decryptIdx. Well, the number of shift functions isn't what I'm concerned with, to be honest, but the different requirements on the shift value.

Where decryptIdx can take any shift (even a negative one), encryptIdx will return a value out of [| 0 .. 25 |] if you supply a negative one due to the property of % you've displayed nicely in the comment. That's not immediately obvious for someone, so you should mention in the documentation/comments that encryptIdx may only use a non-negative shift.

However, the bigger problem is that both implementations can get out of sync. Let's say that at some point you decide to use another alphabet, e.g.

let alphabet2 = [| '0' .. 'z' |]

You change your encryptIdx, but you forget to change it in decryptIdx. Ouch. So instead, let us write one in terms of the other, or rather in terms of another function:

let shiftIdx shift idx = 
 let newIdx = idx + shift
 modulo newIdx alphabet.Length
let encryptIdx shift idx =
 shiftIdx shift idx
let decryptIdx shift idx = 
 shiftIdx (-shift) idx

Note that I've changed the order of arguments. Why? Because it enables us to use currying:

let encryptIdx shift =
 shiftIdx shift
let decryptIdx shift = 
 shiftIdx (-shift)

Combining functions

Either way, let us now look at getShifted. I'm kidding. getShifted is perfectly fine with your current combination of tryGetIndex returning a pair.^{[remark 1]}

But operateCaesar could use a little polish. First of all, we can get rid of the shift < 0 exception, since now both shifts can handle negative numbers. Next, instead of repeatedly mapping over our text, we can map once:

let operateCaesar operation shift text =
 text 
 |> Seq.map (tryGetIndex >> getShifted shift operation >> string)
 |> String.concat ""

That's because Seq.map f >> Seq.map g is the same as Seq.map (f >> g), so we have only one Seq.map call.^{[remark 2]}

Before we leave that function, let us think about types. operation will be an int -> int. shift will be an int. And text will be a seq<char>. The result will be a String. However, if we restrict text to be a String, we can use String.map instead of Seq.map and make the function easier again:

let operateCaesar operation shift text =
 text 
 |> String.map (tryGetIndex >> getShifted shift operation)

Now for a short test. Remember my remark about currying? That

let f bar baz = g magic bar baz

is the same as

let f = g magic

With that in mind, how can you rewrite operateCaesar above or the next two lines?

let encryptcaesar shift text = operateCaesar encryptIdx shift text
let decryptcaesar shift text = operateCaesar decryptIdx shift text

The solution will be in the code below, so make sure to think about it before you scroll further.

All in all, we end up with:

open System
let alphabet = [| 'A' .. 'Z' |]
let tryGetIndex c =
 let idx = alphabet
 |> Array.tryFindIndex ((=) (Char.ToUpper c ))
 (c, idx)
let modulo n m = ((n % m) + m) % m
let shiftIdx shift idx = 
 let newIdx = idx + shift
 modulo newIdx alphabet.Length
let encryptIdx shift = shiftIdx shift
let decryptIdx shift = shiftIdx (-shift)
let getShifted shiftOperation shift charIdx =
 match charIdx with
 | _, Some idx -> alphabet.[shiftOperation shift idx]
 | c, None -> c
let operateCaesar operation shift = String.map (tryGetIndex >> getShifted operation shift)
let encryptcaesar = operateCaesar encryptIdx
let decryptcaesar = operateCaesar decryptIdx

I took the courtesy to change the order of arguments in getShifted so that it reflects the order of arguments in operateCaesar better.

Remarks on your original code in a nutshell

So what were my major critique points? Function re-use. encryptIdx and decryptIdx are well suited to be implemented in terms of each other.

Another approach

Let us change tryGetIndex slightly and make it true to its name:

let tryGetIndex c =
 alphabet
 |> Array.tryFindIndex ((=) (Char.ToUpper c))

It now returns an Option<int>, so just the index. Now we need another function, fromIndex:

let fromIndex idx = alphabet.[idx]

Now we can write a combined variant of tryGetIndex >> getShifted:

let characterShift shiftFunc c =
 tryGetIndex c |> Option.fold (fun _ v -> fromIndex (shiftFunc v)) c

The Option.fold uses the given function if the Option is Some and returns the second argument otherwise. So we either had an index, shift it and return the corresponding character, or we didn't, and therefore return the original character. Note that shiftFunc here is a function that shifts the character accordingly, e.g. encryptIdx 10. That's all in the article linked above, though.

We end up with the following operateCaesar:

let operateCaesar op shift = 
 String.map (characterShift (op shift))

_{Remark 1: We will revisit that soon enough.}
_{Remark 2: I sincerely hope that F#'s compiler does this optimization by default.}

Thank you very much for the thorough analysis! Especially currying is something that I have to make more use of (coming from C# I'm used to supplying all arguments all the time ). Regarding the different encryptIdx and decryptIdx: You're totally correct, I should have changed encryptIdx as well; don't really know why I didn't. I'll have to take a closer look at your characterShift to fully comprehend it.
@germi Keep in mind that I'm not a F#-person, so the review should be taken with a grain of salt. Now to the Option.fold. Option.fold takes three arguments: a combining function, a "state", and an Option. If the Option is None, the state (the original character c) is returned. If the Option is Some the combining function gets both the state and the Option's value. The state isn't interesting for us, so we ignore it (the _) and just shift the value and return the correct character from the now shifted index.
@germi also, just in case you come across that term: the process of going from let foo bar = goo bar to let foo = goo is called eta reduction or eta conversion.

Stack Exchange Network

Caesar cipher in F#

2 Answers 2

Re-use of functions

Combining functions

Remarks on your original code in a nutshell

Another approach

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Caesar cipher in F#

2 Answers 2

Re-use of functions

Combining functions

Remarks on your original code in a nutshell

Another approach

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions