Fishy: An ASCII Programming Language

Question 1

I've decided to write a very simple output-only programming language. All the user does is write ASCII values inside ASCII fish, and the interpreter pieces the values together and outputs them.

I'm mainly looking for feedback on the interpreter, as the language itself is very easy to understand.

Here's what a Hello, World! program looks like in Fishy:

><72> ><101> ><108> ><108> ><111> ><44> ><32> ><87> ><111> ><114> ><108> ><100> ><33>
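To make the encoding concrete, here is a minimal sketch that goes the other way, from text to a line of Fishy. The helper name `to_fishy` is hypothetical and is not part of the interpreter; it only assumes the rule stated above, one ASCII value per fish with single spaces between fish.

```python
def to_fishy(text: str) -> str:
    """Encode ASCII text as one line of Fishy (hypothetical helper, not part of the interpreter)."""
    for char in text:
        if ord(char) > 127:
            raise ValueError(f"Not an ASCII character: {char!r}")
    # Each character becomes one fish; fish are separated by single spaces
    return " ".join(f"><{ord(char)}>" for char in text)


# Reproduces the Hello, World! program shown above
print(to_fishy("Hello, World!"))
```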

All the rules of the language are listed in the module docstring of the program.

"""
Fishy (.fishy extension)
><> Frontfish
Implementation is simple:
You enter ASCII values between the facing signs <>
Commands on separate lines will have output separated by a new line
Example:
><98> ><112> ><113> ><107>
bpqk
><97>
><108>
><101>
a
l
e
NO TRAILING WHITESPACE!
Trailing whitespace after the last fish on the line will result in a syntax error
"""
import argparse
import os
import sys
from typing import List


def run_code(code: List[str]):
    """
    Runs the passed Fishy Code
    """
    for line in code:
        # Clean up code and separate commands#
        line = line.strip("\n")
        commands = line.split(" ")
        # Check if line has multiple statements in it
        if len(commands) > 1:
            if correct_syntax(commands):
                output = "".join(chr(get_number(fish)) for fish in commands)
                print(output)
        else:
            if correct_syntax(commands):
                print(chr(get_number(commands[0])))


def correct_syntax(pond: List[str]) -> bool:
    """
    Checks the syntax of the passed list of commands on the following criteria:
    Is a fish ><..>
    Correct Example:
    ><98> ><108> ><56>
    Incorrect Example:
    ><98> >><<76>> ><[108>
    """
    for fish in pond:
        if not is_fish(fish):
            sys.exit(f"Incorrect Syntax: {fish}")
    return True


def is_fish(fish: str) -> bool:
    """
    Returns if the passed fish is the fish or not
    Fish: Starts with >< ends with >
    A fish like so ><98g> will be caught by "get_number()" function
    """
    return fish.startswith("><") and fish.endswith(">")


def get_number(fish: str) -> int:
    """
    Returns the number in the fish
    """
    # Check font fish first #
    try:
        number = int(fish[2:-1])
    except ValueError:
        sys.exit(f"Incorrect Syntax: {fish}")
    return number


def get_content(file: str) -> List[str]:
    """
    Returns all the content in the passed file path
    :param file -> str: File to read content
    :return List[str]: Content in file
    """
    with open(file, "r") as file:
        return [line for line in file]


def main() -> None:
    """
    Sets up argparser and runs main program
    """
    parser = argparse.ArgumentParser(description="Enter path to .fishy program file")
    parser.add_argument("Path", metavar="path", type=str, help="path to .fishy program file")
    args = parser.parse_args()
    file_path = args.Path
    if not os.path.isfile(file_path):
        sys.exit("The file does not exist")
    content = get_content(file_path)
    run_code(content)


if __name__ == "__main__":
    main()
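The trailing-whitespace rule from the docstring follows directly from how each line is split. As a small sketch (the variable names are illustrative only), splitting on a single space leaves an empty token after the last fish, and that token fails the `is_fish` check:

```python
def is_fish(fish: str) -> bool:
    # Same check as in the interpreter above
    return fish.startswith("><") and fish.endswith(">")


clean = "><97> ><98>".split(" ")
trailing = "><97> ><98> ".split(" ")  # note the space after the last fish

print(clean)                                   # ['><97>', '><98>']
print(trailing)                                # ['><97>', '><98>', '']
print([is_fish(token) for token in trailing])  # [True, True, False]
```

The same thing happens with two spaces between fish, since `split(" ")` does not collapse consecutive separators.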

Question 2

Restructuring and optimization

The initial approach processes the file inefficiently: the get_content function reads every line of the input file into a list at once and keeps that list in memory for the whole run. The run_code function then traverses those same lines a second time.
A more efficient approach is to turn get_content into a generator function that yields one line of the file at a time, on demand.

The optimized get_content function:

from typing import Iterator


def get_content(file: str) -> Iterator[str]:
    """
    Yields lines from the passed file path
    :param file -> str: File to read content
    :return Iterator[str]: Lines of the file, one at a time
    """
    with open(file, "r") as file:
        for line in file:
            yield line.rstrip()
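As a rough illustration of the on-demand behaviour, the sketch below writes a throwaway program to a temporary file (created only for this demo) and pulls lines from the generator one at a time. It also shows a side effect worth being aware of: `rstrip()` with no argument removes trailing spaces as well as the newline, so a line that the original interpreter would reject for trailing whitespace is accepted here. Whether that is desirable is a design decision; `rstrip("\n")` would keep the original rule.

```python
import os
import tempfile
from typing import Iterator


def get_content(path: str) -> Iterator[str]:
    """Yield lines lazily, as in the generator version above."""
    with open(path, "r") as handle:
        for line in handle:
            yield line.rstrip()


# Throwaway demo file; the first line has a trailing space after the fish
with tempfile.NamedTemporaryFile("w", suffix=".fishy", delete=False) as tmp:
    tmp.write("><97> \n><98>\n")
    demo_path = tmp.name

lines = get_content(demo_path)  # nothing has been read yet
first = next(lines)             # reads only the first line
rest = list(lines)              # drains the remaining lines and closes the file
os.remove(demo_path)

print(repr(first))                     # '><97>'  (trailing space silently removed)
print(rest)                            # ['><98>']
print(repr("><97> \n".rstrip("\n")))   # '><97> ' (trailing space preserved)
```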

The run_code function is renamed to parse_code.

Inefficiency of validating and traversing commands

In the parse_code (formerly run_code) function, the commands sequence may be traversed twice: once in the correct_syntax(commands) call, and again when extracting the numbers with chr(get_number(fish)) for fish in commands.
Moreover, running the two validations one after the other can lead to redundant work.
Consider the following situation: commands contains 10 items and all of them pass the correct_syntax check, but the 9th item then fails the get_number check. The 10 syntax checks that came before were wasted.

To optimize the validation, note that is_fish and get_number both operate on the same "fish" and are both meant to validate it.
The two validations can therefore reasonably be consolidated into a single validation function, is_fish:

def is_fish(fish: str) -> bool:
    """
    Validates "fish" item
    Fish: Starts with >< ends with > and has number inside
    A fish like so ><98g> will fail the check
    """
    return fish.startswith("><") and fish.endswith(">") and fish[2:-1].isdigit()
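One subtlety with relying on `str.isdigit()` alone: it returns True for some non-ASCII characters, such as the superscript "²", that `int()` cannot convert, so such a fish would pass the check and then raise later. The variant below is a stricter sketch, not the code from this answer. It assumes Python 3.7+ (for `str.isascii()`) and assumes that only codes up to 127, written with at most three digits, should count as valid, since the language is described as taking ASCII values.

```python
def is_fish(fish: str) -> bool:
    """Stricter variant (an assumption, not the answer's code): ASCII digits, value 0..127."""
    if not (fish.startswith("><") and fish.endswith(">")):
        return False
    body = fish[2:-1]
    # str.isdigit() alone accepts characters such as "²" that int() rejects
    if not (body.isascii() and body.isdigit()):
        return False
    # At most three digits, and within the ASCII range
    return len(body) <= 3 and int(body) <= 127


print("²".isdigit())       # True, although int("²") raises ValueError
print(is_fish("><98>"))    # True
print(is_fish("><98g>"))   # False
print(is_fish("><²>"))     # False
print(is_fish("><128>"))   # False
```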

The get_number function is now removed.
The correct_syntax function is renamed to get_fish_numbers, and its responsibility is now to collect the fish numbers from valid fish:

def get_fish_numbers(pond: List[str]) -> List[int]:
    """
    Collects fish numbers with checking the syntax of the passed list of commands on the following criteria:
    Is a fish ><..>
    Correct Example:
    ><98> ><108> ><56>
    Incorrect Example:
    ><98> >><<76>> ><[108>
    """
    fish_numbers = []
    for fish in pond:
        if not is_fish(fish):
            sys.exit(f"Incorrect Syntax: {fish}")
        fish_numbers.append(int(fish[2:-1]))
    return fish_numbers

And finally the optimized parse_code function:

def parse_code(code: Iterable[str]) -> None:  # requires: from typing import Iterable
    """
    Parse and output the passed Fishy Code
    """
    for line in code:
        # Separate commands (get_content has already stripped the line)
        commands = line.split(" ")
        # Validate the fish and collect their numbers in a single pass
        fish_numbers = get_fish_numbers(commands)
        # Check if line has multiple statements in it
        if len(fish_numbers) > 1:
            output = "".join(chr(num) for num in fish_numbers)
            print(output)
        else:
            print(chr(fish_numbers[0]))
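A possible further simplification, offered as a sketch rather than as part of the answer: `"".join(...)` gives the same result for one fish as for many, so the `len()` branch is not strictly needed. The helper name `decode_line` is hypothetical; returning the text instead of printing it also makes the logic easier to test.

```python
import sys
from typing import Iterable, List


def decode_line(line: str) -> str:
    """Return the text for one line of Fishy (hypothetical helper name)."""
    codes: List[int] = []
    for fish in line.split(" "):
        valid = fish.startswith("><") and fish.endswith(">") and fish[2:-1].isdigit()
        if not valid:
            sys.exit(f"Incorrect Syntax: {fish}")
        codes.append(int(fish[2:-1]))
    # join() handles a single fish and many fish alike, so no len() branch is needed
    return "".join(chr(code) for code in codes)


def parse_code(code: Iterable[str]) -> None:
    """Print the decoded text of each line."""
    for line in code:
        print(decode_line(line))


# The two examples from the language docstring
parse_code(["><98> ><112> ><113> ><107>", "><97>"])
```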

Question 3

Here is a potential solution, minimized from a finite automaton. To make this solution more maintainable, a parse tree (or an explicit finite automaton) could be built instead, so that the syntax can be modified in the future.

Note: this answer is a bit academic in that its practical use is limited; however, it provides a starting point for converting this program into a parse tree.

It doesn't have the file-reading or argparse capabilities, but it has the core of the solution: it checks whether the program is valid and, if so, runs it.

import re
input_program = "><72> ><101> ><108> ><108> ><111> ><44> ><32> ><87> ><111> ><114> ><108> ><100> ><33>"
regex = r"(?:^\>\<((1|2|3|4|5|6|7|8|9|10|1{2}|12|13|14|15|16|17|18|19|20|21|2{2}|23|24|25|26|27|28|29|30|31|32|3{2}|34|35|36|37|38|39|40|41|42|43|4{2}|45|46|47|48|49|50|51|52|53|54|5{2}|56|57|58|59|60|61|62|63|64|65|6{2}|67|68|69|70|71|72|73|74|75|76|7{2}|78|79|80|81|82|83|84|85|86|87|8{2}|89|90|91|92|93|94|95|96|97|98|9{2}|10{2}|101|102|103|104|105|106|107|108|109|1{2}0|1{3}|1{2}2|1{2}3|1{2}4|1{2}5|1{2}6|1{2}7|1{2}8|1{2}9|120|121|12{2}|123|124|125|126|127))\> )+(?:\>\<(1|2|3|4|5|6|7|8|9|10|1{2}|12|13|14|15|16|17|18|19|20|21|2{2}|23|24|25|26|27|28|29|30|31|32|3{2}|34|35|36|37|38|39|40|41|42|43|4{2}|45|46|47|48|49|50|51|52|53|54|5{2}|56|57|58|59|60|61|62|63|64|65|6{2}|67|68|69|70|71|72|73|74|75|76|7{2}|78|79|80|81|82|83|84|85|86|87|8{2}|89|90|91|92|93|94|95|96|97|98|9{2}|10{2}|101|102|103|104|105|106|107|108|109|1{2}0|1{3}|1{2}2|1{2}3|1{2}4|1{2}5|1{2}6|1{2}7|1{2}8|1{2}9|120|121|12{2}|123|124|125|126|127)\>)$"
pattern = re.compile(regex)
def extract_ascii_codes(input_text):
    """
    Yields the ASCII codes found in the text as integers
    """
    for match in re.finditer(r"\d+", input_text):
        yield int(match.group())


def parse_line(input_program):
    """
    Checks if the line in the program is syntactically valid; returns the decoded text if it is
    """
    if pattern.match(input_program) is not None:
        return ''.join(map(chr, extract_ascii_codes(input_program)))


parsed_program = list(map(parse_line, input_program.split("\n")))
if all(parsed_program):
    for a_line in parsed_program:
        print(a_line)
else:
    print("Syntax error")
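One observation about the pattern as it appears here: the `^` anchor sits inside the repeated group. Without `re.MULTILINE`, `^` only matches at the very start of the string, so the group can match at most once, while the `+` requires it to match at least once. That appears to limit valid lines to exactly two fish. The sketch below first shows the effect on a miniature pattern, then gives a more compact check under stated assumptions: the anchoring is handled by `fullmatch`, a single fish is allowed, and the 1..127 range (mirroring the alternation above) is checked in Python rather than spelled out in the regex. `FISH_LINE` and `decode_line` are illustrative names.

```python
import re
from typing import Optional

# Miniature version of the anchor-inside-the-group shape
print(re.match(r"(?:^a )+b$", "a b") is not None)    # True: the group matches once
print(re.match(r"(?:^a )+b$", "a a b") is not None)  # False: ^ cannot match a second time

# One or more fish separated by single spaces; no leading zeros, no zero
FISH_LINE = re.compile(r"><[1-9][0-9]{0,2}>(?: ><[1-9][0-9]{0,2}>)*")


def decode_line(line: str) -> Optional[str]:
    """Return the decoded text, or None if the line is not valid Fishy."""
    if FISH_LINE.fullmatch(line) is None:
        return None
    codes = [int(digits) for digits in re.findall(r"[0-9]+", line)]
    # Range check done in Python instead of a 127-way alternation
    if not all(code <= 127 for code in codes):
        return None
    return "".join(map(chr, codes))


hello = "><72> ><101> ><108> ><108> ><111> ><44> ><32> ><87> ><111> ><114> ><108> ><100> ><33>"
print(decode_line(hello))
```

This trades the automaton-derived pattern for something easier to modify by hand, at the cost of no longer being a single regular expression.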

Finite automaton (condensed): [image not included]

Question 4

Why 1{2} instead of 11 in the regex?

Question 5

I guess if it's automatically generated, the regex is simplified anywhere the generator finds repetition.
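For what it's worth, the two spellings should be interchangeable as far as matching goes. A quick check, with the sample strings chosen only for illustration:

```python
import re

samples = ("1", "11", "111")
results = {
    pattern: [s for s in samples if re.fullmatch(pattern, s)]
    for pattern in (r"1{2}", r"11")
}
# Both patterns accept exactly the same sample
print(results)
```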

score 11 · Accepted Answer · 2019-12-02 09:49:12Z
