This question is part of a series solving the Rosalind challenges. For the previous question in this series, see Wascally wabbits Wascally wabbits. The repository with all my up-to-date solutions so far can be found here.
This question is part of a series solving the Rosalind challenges. For the previous question in this series, see Wascally wabbits. The repository with all my up-to-date solutions so far can be found here.
This question is part of a series solving the Rosalind challenges. For the previous question in this series, see Wascally wabbits. The repository with all my up-to-date solutions so far can be found here.
The Genetic Code
This question is part of a series solving the Rosalind challenges. For the previous question in this series, see Wascally wabbits. The repository with all my up-to-date solutions so far can be found here.
Problem: PROT
The 20 commonly occurring amino acids are abbreviated by using 20 letters from the English alphabet (all letters except for B, J, O, U, X, and Z). Protein strings are constructed from these 20 symbols. Henceforth, the term genetic string will incorporate protein strings along with DNA strings and RNA strings.
The RNA codon table dictates the details regarding the encoding of specific codons into the amino acid alphabet.
Given:
An RNA string \$s\$ corresponding to a strand of mRNA (of length at most 10 kbp).
Return:
The protein string encoded by \$s\$.
Sample Dataset:
AUGGCCAUGGCGCCCAGAACUGAGAUCAAUAGUACCCGUAUUAACGGGUGA
Sample Output:
MAMAPRTEINSTRING
My solution solves the sample dataset and the actual dataset given.
Dataset:
AUGCGCCCUUGGUCGCUCCUUGGAUCAGAGCAUAUUCUAUCACGGCGCGUCGAAGGAAUAACCCACGACAUCUCUCUAUAUUGGAUUCCCUUUUUUUCGGUUCAGGUAGAUCAUUUGGCUACUGGACUUUCUAAGAUUUACUCCGCCAUGUUCCUAUAUGUUACAUUCUCAGCCGAAGUCGCUGUAUAUCACGUUAAGGUAGACGGUUCCUUGACUACCAGCGACGCCUGUAGGGAGAAUUCCAUCCAUCAUGCAUGUAUGGGCAGUGCGCACUUACAGCGCCAUAGGCAACGGACGGACAGACCUCCUUUCCUGUCGGACGGUAAGCCGCGAUCCAAUACAGAGCAAAGUCCCACGCCCUCCUAUAGACUCACGCCAAGAUUGUAUUCCCCGUUAACCGCUCUCUCAGGGAAGUUGUAUCUACUCGGAUCGGGAUGUCCUUGGAAAUGUAGGAAAAUGGCUCAAACUACGAUUGUAUACCGUGCGAGACGUUGGAUCCCGCUUAUCACUGAUACCAUAAUCUGUGUGGCCCCCUUACCACAACCUAACCAUGGAGUAGUAGCCCUGGCCGUCCCUUCAAGGGCAAGACCUCAUUGUCUUGUACGCUUAUCACAAGGGCCAUCUAACAAUGUGUACCGGUAUAAUUUUACGUGGUAUUGUCCAGACGGCGGUACGGCCGAUCCGUUGCCAUUUCGUCAUGGCAUAACCUCGGUCUAUCUUCCUUCCUACUCGGGAAUAGUUCGCAGUACACCAUACCUCAUCGGCACUUACGCUGUUCCAACACAAAAUUCUGAUCCCUUCGCUACCACCCGCUGGGUAUCUGUCAGGUUACUGGCCUCCACUACCGGAGAGGGCGAUACGGGGGACCGCGGAACACUUUCUACAUUUUUGGACUGCUGUAUGUCUACUUCGGCUCUUCCUCCCGCGAGAUUUAUAUCGGCAUACAGAGUAAACGCUACCCAUGGCGACGACACUCGUCUCACCGUUAAGAAGCUGUGCACACCUUAUAAAAGCCUUUGGUCAGGUAAUGUAUGUAGACCUCUAUCGGUCUAUAAGUUGAAGGAAAGAUAUAUAGUCCUCACCGUUGGUUAUCAUUCCCUCCUAUCCCGCAUGUCCACCCUAAGGAAAACUUACUCCGUGGACCAAGCGGGACCGGCAAGGCCCUUGGUUGCCGAAAAAUCAGGAGUUUGCGAGUGCGACUAUGAUCCCAGGUUUGUAAUCGUUAUCGCGACUGUUCAUUUCGGGAAGGCUACUGGGCUGCAUUCAAAGGCGGACCGCUGCUCUCGCUGGGUCGGGUCCAGUGGACGGGUCGAGGCUGCUCAGCUCCAUCAUCAAAGAAAUAAGAAGCCCACCGGAAGAGACAGUGAUGCCGAGAAGAUGUCCGGCAAAGGCAGACUCUGGUCCCAAGUUAGACUGGGUGGUUUAGAGAUCCAGAUAGGUCGUCGUUACCAGUUCUACGAAUACUUUCUUGGUAAACUGAUUAUCCGUCAAAGGCCGACUCACGAGAGGACACAGGGUGCAAAGAAACCCUCACAGAAGGGAACGCGAUUGGCGACACGUAUGGCGCUCGCGUGGUGUGACAGGUUUGGUAGGAUCUAUAUCCCCCAGUGCAACAUAUUAGUUCACACUAUAAUGAAGGUCCGAUUGCACCAAACAGCCUGCGAUGAUAACACGUGGACUGCUGGAGAGUAUGACUUGUACGACUGCACGCGCGAUGUACCCAACAUCGUCCUGUCCCUACCGCACCAUCUUUAUGAACUGCUGGUCUCUGAUGCACUCCCCGCCCCCCACCUCCUCUUUCUGGGGGAAUCUGGUUCCGCCAGGCAUAGGACACGUUCGGUUGGACUAACUAUUGCUAACUACACGAUCAUUCGUGAUAGGUGUCGGUCCACAUGUAUAAGGUUGGAUGAACCUAACCCCUACCGACGAUUUGAUGUAUUGUGGCCACUACUAAUGACCCCCGCUCGUAACACCCAAACACGCGCAUUUCUCUGCUCCGUGGCUGGCUGGAAUUGCCAGUAUUACAGACCCCCUGACAGAUCCAGUAGUAGGACGUACUUGAUCGCACUACAUUUAGCAAUCAAAUUGCGUUCACGCUACCCCAAUCGUUCAGAUCUGGCUUAUCCCACAUCUGUAAACAGGAACACGGUGUAUAUCACCGUAGUCUCUCUACGCACAGCGAAACUAAGAUACAACAGUUACACACCCAGCAAACUGCGGCCCGAUCAAGGCAAUGACCCCAGCUACCGAACUUCGGAAAGAGGGAAUUUCGUCCGGAACUUGCCAUCAGUGAAACCGUACCGAGACUUCAUGAAAACGAUCAGCAUUUCCUUUACAGGAUUCCGGACCAAAUUUGAUCGUCAAAUUGGGAAUUCCAUAGGCCGAGGAUUCACGGGAGCAAGGCCGACCUCAUUGAAGCGUCAUAGUCGCUUUUCGCUCACCGUACAUUCUAGCAAGCGCUAUCUCUCCCGCCUCAACGCCUACGUCUCUUUUACAAUAAAACAUCGGAUAACGAGAUUCGACGUGGCUACGCGCGAAGUUAAGGCUUCCGCCGUGGUACCUAAUGCGGAAAUAGUCCGGAAAAGUACGAAGGUGUGCUGGUUUAUGUGCAUCAUCAGACUCCAAACUGUCCGAGCUACCAAACCGACCAUUCAGAAGCAGUUGUUGAGAUUAAGUGGCCCGUUUCAAUGCGGGGAUUCCACCAAUUACGUACAACCUGGUUACUUCCACAGUUCAAACGCGCCCCGCCCGGCGUGUGUCAGUGUGUGUAUCAGCCCGGGGUUAUGGGACCUGUUGGUAAAAACCCGGAAUGCUUUCUCCCGUGUGCACGGGGGUACUAUCCUUACUUUAGUUCAGCAUGACAUUCAUAAAGUAGAAUUAUCGUCAGCAUGCACUCGCGAGCGGGCUACCAACCUGGCAAUGACUGAAAGCGUAACGUCAUACUCUCAGUGCGAGGCCUCGACUCGCCAUACCGAAAUACAAGCUGUUAGCUCAAUUGUGUAUCUCCACUUGACUGCGGCCCGCAGGGAGAAACACACAGAGAAGAGGGCCGACGCGAAGCAUCAUGUGUCUACUUGGCGCGAGGGUAAAACCGAAAACGUCAUUGGAAGGCUCAGAUCGUCACAUACAUUAACCCUUAGGUUACAUCCAUCCUCGUUGGACAACUGGCCGUUCAUUCUUGGGGAAUGCCAGAGAGGAACGGAUAUCGAGGAACGCAUCCCGGCACAUGCGGAAUGUACAGACAAGGCUGUAGGCGUAGCCUUUCAGUCGACGCUAUGGGCAGAUUCGGCGAAUCCGCGAGGUGGAUCUCGCUUGAGAAGAGGGAUUAGGGGCCCCAACGCGAUGAAUAUUGAAUGCGGAUAUUAUCUGGCGAGACAGCUUCUUGACCGCUCUUGUAGUCGCAAGAUAGGCGAGACUCUAAGACAAACUAGUUCCCGCACGCCAUUGCCAUGCAAGCGAGGCCGCGUCCCAAAACCCCUGGAACCUAAAGAGUCGAACAGGUCAGGAGCAAGUAGCGUAUGGAUACCAGUCGGAGUUAGGCUGGGUCCCUCCGCUGCGAAGACUCCGCCCUGGCGACAUGGUCGCCCGCGACAACUUCUAAUCUCUCCUCAAGUUAUUCCGUUAAGACACCCGAGCAAACGAGCUAGUCAAAGGGAUCAGUGCGAGCUCCCAUCAUGUCCUGAGUACAAGACCCCAGUGUGCCGACUUGCUUUGGGUAGCUCCAGAAUGGUUCACGAAUUAGCCCUUAAGAUGCUCUCCCCUGUUCCGGAGUUCGUGUGGAGGGUCGGAGGCGGGAAAGUCUAUUUAACGGCGGACCCACCUAGGGUAAAUCUGACGCAUAUGUCUGAACACGCCCUGGUGGUACCAGGAGUUUCCCUAUGGGCCCUUUUUCUUUUAAGACCUUUAUUUUUCCAUCUCCAACCUCGAUUAUCGACAACAUACCGUUGGCGCAGACACUUAUGCUUACCCGUUCAACUGCAUUCGUACAGGCUGGGUGAUAUGCAAUUAGGAGCAUCUCGACGUUGGGUAGGCCCCCGAAAUAUAUGGAGACAGGGUGUGUAUGCGUGGGAGAUAUAUGAGAUUCGAACUGUACCGGCUCCAAGGGUGUCACUGUUCCGCCGUUGGAAGGAAAACACUUACACCCUCUUUGGAUCAGGGGAAAUUACAGCGAAUGUUAAGACGGCUAUGUAUCGGAUAACGUCACAUCCGUUUAGACUGUAUGCGGGCGCCCGAGAAUUCAGUCCAUUCCGACUAAACGAGAAAAAGUUUGCCCCCGCGGGGAUUACGUACAAAACCGGAUGCGAUUAUAGCCGUUCUGGAGAACUCUGUGAGGGGCGGGGAAGGAAAAAUAGUUUUAUGUACCAUUGGGCGGCCCUUCCUCUCCAUGCACAUAAAACAAACAGCCUCAUUGAUUUCUACACUCCGUGCAACCCAAAUGCGGCUGCUGACAUGCUAGUGCGUAGUACAGAAGAUGCCCGAGCUCGAAUACAUUGCAGAUAUUGGGUUAACAAUUCGUUUUGCAAAUAUAUAUGUUGGCACUCCAGGUUCUUGAUAGAACCGAUUCAGAAGAAAUGGUGUACACCCAUUGAGAGGCGUCGCCCCGUAAUUAACGGGGAUGUCUUAAACGGGUCAGAGGUAACUACUAAGACGCGGUGCUGUUUCAGAUGGGCAAGUCAUACGGGCCGUUCUUACGGAAGAGAUCGUGCUAAUACAAACCUGCUUGUUAUGGGCGACGCAUCGCCCGAGGGGGGCGCGAAUCGGAGACUUGCAAGUACGACCGGUGGAUUCGUGCAAUUUAAGGUAUACAUUUCACGCGGAGACCCCGGGAAGGAGCUCCCCUACAUAACACGAAUAUCACCCGGCCGAAUUAGGGCUCGACGGUCCUUCCGCCUAAUGUGUGCAGUGAACGAGUUGGUGCCUGAAGAUGGUUUCACUCACAGGCGCCAAAGUACGACUCCUCCUUCCCGAUCAGUUUGCGACGGGCCUGUCCGCUUUAGAAUAAAGACCCACUUCCAAACUUCGACCGGCUGGGGGAAACAUUGGAGCAGCUUCCAAUGUGAUCAAAACUGUAGCAGAUCUCUAGCAUACUUCAAACAGGAAGUUACUGUAUGUGGUGCACGACCUGGCCAAGGUAGUUUCCUCCCCCUAGGACUGGUGAACGAUGGCGGGUGGAUUGUCAUUCAUAGUGAGAGAUUAGCCGUGCCUGCUUAUGACGGGAUCGGCGAUCUAGUAAGCGAUUCCAAAUCGAACAGCAUGCGCCGAGGGGACACUUACUUGGAGGUACUUAUCCGAGCGAAAAGGAGGGAGCCCAUAUCCAAAUGCGCUAGUAGAGGAGCACUGUCGAGUCAUGACCGCGGCCAUUCACUCGUAAGUACAGGGACACACUUCCAUAUCCUCCGGGGACUGAUGGGUAUUCGUACGAUUCGGCUGAGCGGGUCGCGGGACCCUACGGUCCGAACGUCUCGCGAGGGGUGUCAGGCUCCUCAUUUCAUGGUGCAGUCUGUCUGGAAGCCGACUACACUACGGGGUAGUCCUGCCCUUGAUAAUGCUAAUGAGAGUAACUCACGCCCCGCCCAUAACAAAGGCCGGGGGCCCUCUCAAUCAAACAGGCGGACGGGGAAUGUCAGGCAGGUUGUCGUUGGCAGAGUUACGCUCUCGCAGGAGAUUAAUCCUUUUGUAAAGCAUUUGGAACUAGUCCCCGGCUAUUAUUUAGCUGAGUAUCCAAUGCCUAGAAGCCUUGCGUCCCGUUCUAACCUGCGCGUAAUUCAUACAUCGCAUGAGAGAGCAAGGCAAACAAUCCAUUCGCCUGGCAAGAGAAACCGAGGAGCAAGUCACCGAACGCCCGCCGGGAAGCACCGCGAGUACCCACGACAAAACAGCUGCUUGGACUAUUAUGAACCCUCCAUACGUAGGAAGGAGGCCUAUGGGUGCGUCAAUAACGCACUCCCUGAUUGUCCUGACAAGGACGAUCGCGAAUGGACGCGCUCGCAAUCCAUGAUUGAAAUGUCCAGACCAACCGAGUCCCUGCUCAGUGCCUCCUGGCAUCGGCCAUUGGUUCUUGGAAGCCUCAACUACGGAUUCAUCACUGACCCCGUGGCGCUCACUGGUCAAAGAAAACUAGGAUGCCGUGGAAUGAUGAACACGUUAAUGUUAAUAUGGAACCAUCAUUUCGGCCCCUAUGGUUCAACCCCAAGAUUAGUUUUCGUUUGUGAAGCCAAGCGGCACCGGGGAUCUUGGGCAAACUACACUGAAGCAAAACUCCCUUCCUAUUAUGUAAUAACACUGGCACAGGGUCUUGGCCCGCGCGCUGGGCUCCACCACAACGUGUACUGUCUUCACCCUCAACGAGUUUUCCACUUCUGUCCCUUCGUAUCAGUUCACUUGCAAUUCCUAUCCCAUGUUUCGACUAGCCCAAGCGCUAAGUGUGCCCGCCUAGAUCCAGUCCAUCUUCCGGCUGAGGUGGGGAUCGUCAAACCUGCGGGGCGGAUUAAGAAGUCAUUUGUUGGUGGCGCGGGGCCUCUCAGAAUGUUAAAUAGGCAACGUGUAUGUUCGGUGGGGCCUGGAAGUGGACCGCCGUCCGUGGCGGAUUGUGCCAAAUUAACGGCUGAAGUGGAGUGGACUUCCAUCCACCCAGCUGCAGCAGAUCGGGGGUUAUCCCAAAGCACCAUCCAUGCCAGCAUGAUGCUGACUCACCAAAUAAGCUUCACUGAAUGCGACAAGUUCGCGCAAAUGGCUCAGAGCAGCGUGUCCCACACCGUGGGACAAAGGGUAUACUCGACUUCUCCACCUUGCGCGAAACCUGGCCCCGCUGGAUACAGACUGAUCAGUUCUAUCGAAUGUACCGUGCACAAAUGUAAACGUCGACAUAUGGCCGGCGCGCUACUGCGGCCCCGGGAACCUGGCCUCCUACCCGAUGACAAUGUAUCACCCGUCCCUCUCCGGUACGGCGAUAAUAUAUUGGCUCAUCGGGUGGCAUCUUACUCCCGGCGUACUUCCGACCCGUCGCAUCAAGUCCGAUCUGACCAAUUCUGGGACAUUAAUGUACAACCACCUAUACCCUUCUUCGCACCUCUGCUGAAUUUGGCUUCACGAACCUAUGGGCGAGGAGCGCUGCUGUCCCCGCCGGAACCACAGAUUCACGCUGCCACUAUGGCUUCAGCUAGAUGUGAGUCAAAUAAUAGAUCAGUACUCGUUAUGCGGCAUGAUCAUGAAGGUAAACUGCCCUUGCACCGAUCCAAGCUAAGCGGGCUAGCUGUAAUCCUUAGCCGGGGAUCUUCCGAUGUAUGUGCCCCCUCGGACAUGAAACACAUCCACAGUGGAGAUAGACAAAUGACUGAGGAGCUUAGAUUUCUGGAGAACAAAAACUUGAUGGGCUUAAGAUAUGGUCUAUACUCAUUAACAUCGAGAUGUGCUCGAAACGUCGAUAGACUCAUUCCUUUUAUUCGCCUACAGCAAGUGUUCGGGGAAUCAAAGUUGGAGUCACUUGCCCCAGGGGUCAAGCCGCUCCCGAUUUUCGUCGAGCGUCGUAGGAUGUGGCCGCCGGUUAUAUGGAUAAGUAUACGUUGCGGACACCAGACCAGACCCUAUAUACGAGACCGUUCUGCAGCUAAAUGUCGGGGGGGCCAGUCGCGGCCCGCCCUCUCUAAACAACUUAUUUACGUUCGGCGGGUAAGGCAGUGGCGGUUACAUCCAGGCAGACAGAUGGUCCUUGGUCAUACGUUCGCGAGCUCUUUCCGUCAGGAACUCUCCGCACAGCAUACUGCAACUCGCCGGAUUACAAGACCCCUUAGUGCUCCCUUUUAUGUACGUCCCCGGCCCUGGACUGGCGGUACUGUCGAGCUUUGCAUUUUUAGAGGCGCCUCAUGCAGGACUUCAGAAUUCGGCAAGGGAGCUACCCCCAAAGAGCUCCUCGUAAUGAACGGGUUCCUAGUGGUGUAUUACCCAGCCGGACAAAGGCCCGGUCUAACGUCUUUUGUCCGUUCGCAUUCCAUACGUCCCGUGUACGCCGAGCUACUCGGUAGUGAAACUAGGCGAGACUUGCGGAGGUCUUUUUGGUCAGUAAACGUAGUACUUGGUGUAUAUCGUCACUUACGCCACGGCACAAGGCAAAGGAGUGCGUCAUCCGGAUUGAAGGGCACUCUCAAGGUUGAUUCGCCAAUGGGUGUUGUUCGUCGCAAACCGAACCCGAUCCACUUUUUACCCUGGAAAGGGGUGUCAAGGGCGGACUUCGUGGCUCUAUCCGUCCAUGGAGUAUACUCGUCCUCAGUAAGUAGUGUAGGAUGGUUCACCGGAUGGAAAGGUAACGUUAAAAGACCGCUUCGUUGUUUAAUUGCGCAAGACUUCAAGUGCUCGAGCUUAGGUCUUCCCAUUAUGUUUAGGGAUGUAUUCUCACAAAUGCCUUAUUGUAGAUUGAGACAAGCUCCAUACGUAGUAGCACCCUUUGACUCGGGCGUUCUAUGGAUAGCUCGCAAGACGUGGAUCGCAUUCAGUCACUUACGAAAAUCCAGAUUCUGCCCUGCCUGGCUGUCAACAGACAACACCUUCGAUCAAUACGGAUCUAUCUUGGUGAGCGAAUUUUCUCCCACCCCGCGGGGAAUCGCACUGGUGGUCUGUGUGCCCCGAUCCAUUGUCUGCCGGAGCCACGGGAAAAAUUUUAAAUUCUGUAUCCUACUCCCCCGUGUGGCUGUAGCCCAGCUGAGGUCAGUAUGUCACCUUGUCGCUAUUAGGUGUUUCACCAUCCUAAUUGGCAAACUGUUUCAACCCUGCCAGAUAAGGUCAGAGCAGCCCCUUCGCUGGUAUUUAUCCCACAGCCCCCCUUCGAAGCGUUCCGCUAAGGCAAUACCAGCUCCGUACAGAGCGCCGGGUACCUUCCUCAUCUACUCCUGGAUCUACUUCUUACUUUGUAGGUCCACGGAUCAAGGCUGUUACUUUUGCAUAGUUCAUCGUGCCAUUACGCAGAGGACUGGAUGUCCCAGAAUACUUCUUGGAUUCACACUUGUCUCAAAUGAGCUUACGGUGGCGCACGGGAUUCAAGCUCCCGUGUUAGAGCCUCGGGCGCUGCCGUACAAUAGGGCAACUCCCAGAACCGAUCACGGAGUUUCUCCGGUGCGUAGACGUAGGUGCAGUAAUAUUCCUAUAAAUGUUGGAGAGUACCGCUGGUUGUUUACUUUUUCGGUAUGCAUACCUACCGACAGUCGCAAGGCAGUACAUGCCACGCAAGUUAGUUGUUUAAUGGUUUUGCCGCGCACAGCUCGUGCCUAUCAUAGGGUAAGGUACACCAGCUUCGGGCUUGCUUCUGAGCAGACCCAAACUAUUUUUCUGAUCCACAUAUCAUCAGACAACAAUUUUGCUCGAAAAGUAUGCAUACCCCCAUUAGUCCUUCUCUGA
Output:
MRPWSLLGSEHILSRRVEGITHDISLYWIPFFSVQVDHLATGLSKIYSAMFLYVTFSAEVAVYHVKVDGSLTTSDACRENSIHHACMGSAHLQRHRQRTDRPPFLSDGKPRSNTEQSPTPSYRLTPRLYSPLTALSGKLYLLGSGCPWKCRKMAQTTIVYRARRWIPLITDTIICVAPLPQPNHGVVALAVPSRARPHCLVRLSQGPSNNVYRYNFTWYCPDGGTADPLPFRHGITSVYLPSYSGIVRSTPYLIGTYAVPTQNSDPFATTRWVSVRLLASTTGEGDTGDRGTLSTFLDCCMSTSALPPARFISAYRVNATHGDDTRLTVKKLCTPYKSLWSGNVCRPLSVYKLKERYIVLTVGYHSLLSRMSTLRKTYSVDQAGPARPLVAEKSGVCECDYDPRFVIVIATVHFGKATGLHSKADRCSRWVGSSGRVEAAQLHHQRNKKPTGRDSDAEKMSGKGRLWSQVRLGGLEIQIGRRYQFYEYFLGKLIIRQRPTHERTQGAKKPSQKGTRLATRMALAWCDRFGRIYIPQCNILVHTIMKVRLHQTACDDNTWTAGEYDLYDCTRDVPNIVLSLPHHLYELLVSDALPAPHLLFLGESGSARHRTRSVGLTIANYTIIRDRCRSTCIRLDEPNPYRRFDVLWPLLMTPARNTQTRAFLCSVAGWNCQYYRPPDRSSSRTYLIALHLAIKLRSRYPNRSDLAYPTSVNRNTVYITVVSLRTAKLRYNSYTPSKLRPDQGNDPSYRTSERGNFVRNLPSVKPYRDFMKTISISFTGFRTKFDRQIGNSIGRGFTGARPTSLKRHSRFSLTVHSSKRYLSRLNAYVSFTIKHRITRFDVATREVKASAVVPNAEIVRKSTKVCWFMCIIRLQTVRATKPTIQKQLLRLSGPFQCGDSTNYVQPGYFHSSNAPRPACVSVCISPGLWDLLVKTRNAFSRVHGGTILTLVQHDIHKVELSSACTRERATNLAMTESVTSYSQCEASTRHTEIQAVSSIVYLHLTAARREKHTEKRADAKHHVSTWREGKTENVIGRLRSSHTLTLRLHPSSLDNWPFILGECQRGTDIEERIPAHAECTDKAVGVAFQSTLWADSANPRGGSRLRRGIRGPNAMNIECGYYLARQLLDRSCSRKIGETLRQTSSRTPLPCKRGRVPKPLEPKESNRSGASSVWIPVGVRLGPSAAKTPPWRHGRPRQLLISPQVIPLRHPSKRASQRDQCELPSCPEYKTPVCRLALGSSRMVHELALKMLSPVPEFVWRVGGGKVYLTADPPRVNLTHMSEHALVVPGVSLWALFLLRPLFFHLQPRLSTTYRWRRHLCLPVQLHSYRLGDMQLGASRRWVGPRNIWRQGVYAWEIYEIRTVPAPRVSLFRRWKENTYTLFGSGEITANVKTAMYRITSHPFRLYAGAREFSPFRLNEKKFAPAGITYKTGCDYSRSGELCEGRGRKNSFMYHWAALPLHAHKTNSLIDFYTPCNPNAAADMLVRSTEDARARIHCRYWVNNSFCKYICWHSRFLIEPIQKKWCTPIERRRPVINGDVLNGSEVTTKTRCCFRWASHTGRSYGRDRANTNLLVMGDASPEGGANRRLASTTGGFVQFKVYISRGDPGKELPYITRISPGRIRARRSFRLMCAVNELVPEDGFTHRRQSTTPPSRSVCDGPVRFRIKTHFQTSTGWGKHWSSFQCDQNCSRSLAYFKQEVTVCGARPGQGSFLPLGLVNDGGWIVIHSERLAVPAYDGIGDLVSDSKSNSMRRGDTYLEVLIRAKRREPISKCASRGALSSHDRGHSLVSTGTHFHILRGLMGIRTIRLSGSRDPTVRTSREGCQAPHFMVQSVWKPTTLRGSPALDNANESNSRPAHNKGRGPSQSNRRTGNVRQVVVGRVTLSQEINPFVKHLELVPGYYLAEYPMPRSLASRSNLRVIHTSHERARQTIHSPGKRNRGASHRTPAGKHREYPRQNSCLDYYEPSIRRKEAYGCVNNALPDCPDKDDREWTRSQSMIEMSRPTESLLSASWHRPLVLGSLNYGFITDPVALTGQRKLGCRGMMNTLMLIWNHHFGPYGSTPRLVFVCEAKRHRGSWANYTEAKLPSYYVITLAQGLGPRAGLHHNVYCLHPQRVFHFCPFVSVHLQFLSHVSTSPSAKCARLDPVHLPAEVGIVKPAGRIKKSFVGGAGPLRMLNRQRVCSVGPGSGPPSVADCAKLTAEVEWTSIHPAAADRGLSQSTIHASMMLTHQISFTECDKFAQMAQSSVSHTVGQRVYSTSPPCAKPGPAGYRLISSIECTVHKCKRRHMAGALLRPREPGLLPDDNVSPVPLRYGDNILAHRVASYSRRTSDPSHQVRSDQFWDINVQPPIPFFAPLLNLASRTYGRGALLSPPEPQIHAATMASARCESNNRSVLVMRHDHEGKLPLHRSKLSGLAVILSRGSSDVCAPSDMKHIHSGDRQMTEELRFLENKNLMGLRYGLYSLTSRCARNVDRLIPFIRLQQVFGESKLESLAPGVKPLPIFVERRRMWPPVIWISIRCGHQTRPYIRDRSAAKCRGGQSRPALSKQLIYVRRVRQWRLHPGRQMVLGHTFASSFRQELSAQHTATRRITRPLSAPFYVRPRPWTGGTVELCIFRGASCRTSEFGKGATPKELLVMNGFLVVYYPAGQRPGLTSFVRSHSIRPVYAELLGSETRRDLRRSFWSVNVVLGVYRHLRHGTRQRSASSGLKGTLKVDSPMGVVRRKPNPIHFLPWKGVSRADFVALSVHGVYSSSVSSVGWFTGWKGNVKRPLRCLIAQDFKCSSLGLPIMFRDVFSQMPYCRLRQAPYVVAPFDSGVLWIARKTWIAFSHLRKSRFCPAWLSTDNTFDQYGSILVSEFSPTPRGIALVVCVPRSIVCRSHGKNFKFCILLPRVAVAQLRSVCHLVAIRCFTILIGKLFQPCQIRSEQPLRWYLSHSPPSKRSAKAIPAPYRAPGTFLIYSWIYFLLCRSTDQGCYFCIVHRAITQRTGCPRILLGFTLVSNELTVAHGIQAPVLEPRALPYNRATPRTDHGVSPVRRRRCSNIPINVGEYRWLFTFSVCIPTDSRKAVHATQVSCLMVLPRTARAYHRVRYTSFGLASEQTQTIFLIHISSDNNFARKVCIPPLVLL
PROT.rb:
def abbreviate(str)
list = ""
str.scan(/.../) do |sub|
case sub
when "UUU", "UUC"
list += "F"
when "UUA", "UUG"
list += "L"
when "UCU", "UCC", "UCA", "UCG", "AGU", "AGC"
list += "S"
when "UAU", "UAC"
list += "Y"
when "UGU", "UGC"
list += "C"
when "UGG"
list += "W"
when "CUU", "CUC", "CUA", "CUG"
list += "L"
when "CCU", "CCC", "CCA", "CCG"
list += "P"
when "CAU", "CAC"
list+= "H"
when "CAA", "CAG"
list += "Q"
when "CGU", "CGC", "CGA", "CGG", "AGA", "AGG"
list += "R"
when "AUU", "AUC", "AUA"
list += "I"
when "AUG"
list += "M"
when "ACU", "ACC", "ACA", "ACG"
list += "T"
when "AAU", "AAC"
list += "N"
when "AAA", "AAG"
list += "K"
when "GUU", "GUC", "GUA", "GUG"
list += "V"
when "GCU", "GCC", "GCA", "GCG"
list += "A"
when "GAU", "GAC"
list += "D"
when "GAA", "GAG"
list += "E"
when "GGU", "GGC", "GGA", "GGG"
list += "G"
else
return list
end
end
end
user_input = gets.chomp
abbreviate(user_input)
Yea, that's a very long switch.
Basically it's a translator. Take 3 characters, output 1 other. I've thought about implementing this using a key-value map, but that wouldn't solve the repetitiveness.
The readability is quite good and so is the speed. However, I'm quite sure it isn't idiomatic. I have no clue how this would score on maintainability.
There's one thing striking me as odd: return
should be implicit. But simply stating list
instead of return list
modifies the behaviour and I'm not sure why.