i3 config file parser

Question 1

So I am using i3 an Linux window manager to manage my windows. In addition this is run on a laptop that is frequently mounted to several output-displays. This is handled by the following lines in my i3 config file

set $firstMonitor DP-2-1
set $secondMonitor DP-1-2
set $laptop eDP-1
workspace 1 output $firstMonitor $laptop
workspace 2 output $firstMonitor $laptop
workspace 3 output $firstMonitor $laptop
workspace 4 output $firstMonitor $laptop
workspace 5 output $secondMonitor $laptop
workspace 6 output $secondMonitor $laptop
workspace 7 output $secondMonitor $laptop
workspace 8 output $secondMonitor $laptop
workspace 9 output $secondMonitor $laptop
workspace 10 output $laptop $laptop

For convience we can define variables in the config file using set and prefix them with $. The lines above tells my window manager I would like workspace 1 to 4 on $firstMonitor if it exists. Otherwise fall back to $laptop.

I am using this for various scripts and therefore need to extract this information from my config file, and store it somewhere. For instance it could look like this

[
 {
 'DP-2-1': ['1', '2', '3', '4'],
 'DP-1-2': ['5', '6', '7', '8', '9'],
 'eDP-1': ['10']
 },
 {
 'eDP-1': ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10'
 }
]

parsed. To do so I wrote the following Python code to extract which output each workspace is assigned to. Note that the i3 file is picky about whitespaces, meaning the line set$firstMonitor DP-2-1 will not compile.

from pathlib import Path
import collections
CONFIG = Path.home() / ".config" / "i3" / "config"
def read_config(config=CONFIG):
 with open(config, "r") as f:
 return f.readlines()
def read_workspace_outputs(lines=read_config()):
 """Reads an i3 config, returns which output each workspaces is assigned to
 Example:
 set $firstMonitor DP-2-1
 set $secondMonitor DP-1-2
 set $laptop eDP-1
 set $browser 1
 workspace $browser output $firstMonitor $laptop
 workspace 2 output $firstMonitor $laptop
 workspace 3 output $firstMonitor $laptop
 workspace 4 output $firstMonitor $laptop
 workspace 5 output $secondMonitor $laptop
 workspace 6 output $secondMonitor $laptop
 workspace 7 output $secondMonitor $laptop
 workspace 8 output $secondMonitor $laptop
 workspace 9 output $secondMonitor $laptop
 workspace 10 output $laptop $laptop
 Will return
 [
 {
 'DP-2-1': ['1', '2', '3', '4'],
 'DP-1-2': ['5', '6', '7', '8', '9'],
 'eDP-1': ['10']
 },
 {
 'eDP-1': ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10'
 }
 ]
 >>> read_workspace_outputs(lines=['workspace 1 output eDP-1'])
 [defaultdict(<class 'list'>, {'eDP-1': ['1']})]
 >>> read_workspace_outputs(lines=['set $laptop eDP-1','workspace 1 output eDP-1'])
 [defaultdict(<class 'list'>, {'eDP-1': ['1']})]
 >>> read_workspace_outputs(lines=['set $browser 1','workspace $browser output eDP-1'])
 [defaultdict(<class 'list'>, {'eDP-1': ['1']})]
 >>> read_workspace_outputs(lines=[
 ... "set $firstMonitor DP-2-1",
 ... "set $secondMonitor DP-1-2",
 ... "set $laptop eDP-1",
 ... "",
 ... "workspace 1 output $firstMonitor $laptop",
 ... "workspace 3 output $secondMonitor $laptop",
 ... "workspace 5 output $laptop $laptop",
 ... ])
 [defaultdict(<class 'list'>, {'DP-2-1': ['1'], 'DP-1-2': ['3'], 'eDP-1': ['5']}), defaultdict(<class 'list'>, {'eDP-1': ['1', '3', '5']})]
 """
 # Extract workspaces and variables [set $name command] from file
 def get_workspaces_and_variables(lines):
 workspaces = []
 variables_2_commands = dict()
 for line in lines:
 if line.startswith("workspace"):
 workspaces.append(line.strip())
 elif line.startswith("set"):
 _, variable, *command = line.split()
 variables_2_commands[variable] = " ".join(command)
 return workspaces, variables_2_commands
 # Convert back $name to command for outputs and workspaces
 def workspaces_without_variables(workspaces, variables):
 workspace_outputs = []
 for workspace in workspaces:
 workspace_str, output_str = workspace.split("output")
 workspace, variable = workspace_str.split()
 workspace_number = (
 variables[variable] if variable.startswith("$") else variable
 )
 outputs = [
 variables[output] if output.startswith("$") else output
 for output in output_str.split()
 ]
 workspace_outputs.append([workspace_number, outputs])
 return workspace_outputs
 # Currently things are stored as workspaces = [[output1, output 2], ...]
 # This flips the order and stores it as a dict with outputs as keys and values workspaces
 def workspaces_2_outputs(workspaces):
 output_workspaces = [
 collections.defaultdict(list)
 for _ in range(len(max((x[1] for x in workspaces), key=len)))
 ]
 for (workspace_number, outputs) in workspaces:
 for j, output in enumerate(outputs):
 output_workspaces[j][output].append(workspace_number)
 return output_workspaces
 workspaces_w_variables, variables = get_workspaces_and_variables(lines)
 variable_free_workspaces = workspaces_without_variables(
 workspaces_w_variables, variables
 )
 return workspaces_2_outputs(variable_free_workspaces)
if __name__ == "__main__":
 import doctest
 import yaml
 from yaml.representer import Representer
 doctest.testmod()
 OUTPUT_WORKSPACE_CONFIG = (
 Path.home() / ".config" / "i3" / "oisov-scripts" / "i3-output-workspace-2.yaml"
 )
 yaml.add_representer(collections.defaultdict, Representer.represent_dict)
 with open(OUTPUT_WORKSPACE_CONFIG, "w") as file:
 yaml.dump(read_workspace_outputs(), file)

The output.yaml file looks like this

- DP-1-2:
 - '5'
 - '6'
 - '7'
 - '8'
 - '9'
 DP-2-1:
 - '1'
 - '2'
 - '3'
 - '4'
 eDP-1:
 - '10'
- eDP-1:
 - '1'
 - '2'
 - '3'
 - '4'
 - '5'
 - '6'
 - '7'
 - '8'
 - '9'
 - '10'

I will add some typing hints in the future, but for now I was wondering if this is the best approach to extract this information. The code feels a bit clunky, even if it does that I want it to.

For testing one can use the first code block in this example. A longer i3 example file can for instance be https://github.com/sainathadapa/i3-wm-config/blob/master/i3-default-config-backup, due note that this does not set the workspaces to specific outputs.

Question 2

As far as coding in Python, I think this is nicely done, with good style, comments, and testing included. I agree with you it's a bit clunky, and I have some other minor comments too.

What's clunky

I have some ideas why it might feel a bit clunky.

get_workspaces_and_variables could return more useful objects. The workspaces are raw lines, not yet parsed. The variables is already good, it's a dictionary ready to use to substitute symbols with actual values. As I try to build a mental model of the program in my head, I feel a heavy burden here, having to remember about two kinds of objects, where one of them needs more work. I would prefer a parse_workspaces function that returns workspaces in an intuitive format.

workspaces_without_variables takes raw workspace lines and variable mappings, and returns workspaces in an intuitive format, with all variables resolved. And that's fine, it just looks kind of a lot of code. It would be good to extract to a function the duplicated variables[name] if name.startswith("$") else name. Also, I think the parsing could be done a bit simpler. In a code comment, an example raw line would be helpful too.

Finally, workspaces_2_outputs creates the final desired mapping, I just find this line a little bit cryptic:

for _ in range(len(max((x[1] for x in workspaces), key=len)))

That is, it finds the max entry by length, and then takes the length as the limit of the range. I would find it more natural to take the lengths, and then find the max of them. And instead of numeric indexing, I would use descriptive names. Like this:

for _ in range(max(len(outputs) for _, outputs in workspaces))]

Putting the above comments together, consider this alternative implementation:

def parse_variable(line):
 # example line: "set $firstMonitor DP-2-1"
 _, name, value = line.split()
 return name, value
def parse_workspace(line, variables):
 def resolve(name):
 return variables[name] if name.startswith('$') else name
 # example line: "workspace 1 output $firstMonitor $laptop"
 _, workspace, _, *devices = line.split()
 return resolve(workspace), [resolve(device) for device in devices]
def outputs_to_workspaces(workspaces):
 mappings = [collections.defaultdict(list)
 for _ in range(max(len(outputs) for _, outputs in workspaces))]
 for workspace, outputs in workspaces:
 for index, output in enumerate(outputs):
 mappings[index][output].append(workspace)
 return mappings
def parse_workspaces():
 variables = {}
 workspaces = []
 for line in lines:
 if line.startswith('set'):
 name, value = parse_variable(line)
 variables[name] = value
 elif line.startswith('workspace'):
 workspaces.append(parse_workspace(line, variables))
 return workspaces
return outputs_to_workspaces(parse_workspaces())

This passes the original test cases, so I think the behavior is preserved, even though the parsing uses a bit simplified logic.

Note that for parse_workspaces to work like this it's important that all the variables needed to parse a workspace line have already been defined. I don't know the i3 format, and if this is a safe assumption or not. If not, then indeed a 2nd pass would be needed, similar to the way you did.

Notice that the scope of variables is much more limited than in the posted code. I think this helps reducing mental burden.

Shadowing names from outer scope

In the posted code, some names shadow names from outer scope, for example:

def get_workspaces_and_variables(lines):
 ^^^^^
def workspaces_without_variables(workspaces, variables):
 ^^^^^^^^^

I've experienced nasty bugs in the past due to this, I suggest to avoid such shadowing.

Usability

I would have liked to play around with the program, but with the input and output file paths hardcoded, this was not easy enough. It would be good to accept these as command line arguments, and perhaps use the current values as default.

Question 3

You have doctests - good; keep that up.

Where this falls over, mostly, is that you have untyped data soup. Internal data representations should not look like JSON. Python (and in particular dataclass) makes it easy to spin up lightweight models so that you gain some assurance of correctness, particularly if you use mypy.

Assigning a parameter default of lines=read_config() is a bad idea. This default is going to be applied on module load. If the default file is huge module load will be slow, and if the default file changes after some time from module load has passed, the default will be stale. Having default filenames is fine, but I wouldn't have an entire default config like this.

Your startswith parsing can be simplified by the use of regular expressions.

Having a function read_workspace_outputs with a collection of nested functions on the inside, apart from doctests, makes unit testing impossible.

I don't understand much about the yaml module, and the yaml format is trivial enough that it's not difficult to just produce the markup directly. I don't have a strong opinion on this point, but the suggested code below shows direct markup generation.

Your output format is a little strange; how much control do you have over it? It looks like you're implying the outermost tree level to be the index within the workspace monitor list, but it would be less confusing if you also showed the numerical index itself. That said, the example code below abides by your existing output structure.

Suggested

import re
from dataclasses import dataclass, field
from pathlib import Path
from string import Template
from typing import Iterator, Iterable, Tuple, Dict, List, Collection
CONFIG = Path.home() / ".config" / "i3" / "config"
@dataclass(frozen=True)
class Monitor:
 name: str
 workspaces: Dict[int, 'Workspace'] = field(default_factory=dict)
@dataclass(frozen=True)
class Workspace:
 id: int
 monitors: List[Monitor] = field(default_factory=list)
WorkspaceDefinition = Tuple[
 int, # Workspace ID
 Tuple[str], # monitors
]
def read_config(config: Path = CONFIG) -> Iterator[WorkspaceDefinition]:
 variables: Dict[str, str] = {}
 var_pat = re.compile(
 r'^set'
 r' \$(?P<key>\S+)'
 r' (?P<value>.*)'
 r'\n$'
 )
 work_pat = re.compile(
 r'^workspace'
 r' (?P<id>\S+)'
 r' output'
 r' (?P<monitors>.+)'
 r'\n$'
 )
 with config.open() as f:
 for line in f:
 var = var_pat.match(line)
 if var:
 variables[var['key']] = var['value']
 continue
 filled = Template(line).substitute(variables)
 workspace = work_pat.match(filled)
 if workspace:
 yield int(workspace['id']), workspace['monitors'].split(' ')
def load_workspaces(work_defs: Iterable[WorkspaceDefinition]) -> Tuple[
 Tuple[Workspace],
 Tuple[Monitor],
]:
 monitors = {}
 workspaces = {}
 for work_id, monitor_names in work_defs:
 workspace = Workspace(id=work_id)
 workspaces[work_id] = workspace
 for monitor_name in monitor_names:
 monitor = monitors.get(monitor_name)
 if monitor is None:
 monitor = Monitor(name=monitor_name)
 monitors[monitor_name] = monitor
 monitor.workspaces[work_id] = workspace
 workspace.monitors.append(monitor)
 return tuple(workspaces.values()), tuple(monitors.values())
@dataclass(frozen=True)
class MonitorPosition:
 index: int
 workspaces_by_monitor: Dict[str, List[Workspace]]
 @classmethod
 def all(
 cls,
 workspaces: Collection[Workspace],
 monitors: Collection[Monitor],
 ) -> Iterator['MonitorPosition']:
 monitor_positions = max(
 len(workspace.monitors)
 for workspace in workspaces
 )
 for index in range(monitor_positions):
 yield cls(
 index,
 dict(cls.for_index(workspaces, monitors, index)),
 )
 @classmethod
 def for_index(
 cls,
 workspaces: Collection[Workspace],
 monitors: Collection[Monitor],
 index: int,
 ) -> Iterator[Tuple[str, List[Workspace]]]:
 for monitor in monitors:
 workspaces_used = [
 workspace for workspace in workspaces
 if len(workspace.monitors) > index
 and workspace.monitors[index] is monitor
 ]
 if workspaces_used:
 yield monitor.name, workspaces_used
def to_yaml(
 positions: Iterable[MonitorPosition],
 filename: Path,
) -> None:
 with filename.open('w') as yaml:
 for position in positions:
 group_prefix = '-'
 for monitor_name, workspaces in position.workspaces_by_monitor.items():
 yaml.write(f'{group_prefix:<2}{monitor_name}:\n')
 yaml.write('\n'.join(
 f" - '{workspace.id}'"
 for workspace in workspaces
 ))
 yaml.write('\n')
 group_prefix = ' '
def test() -> None:
 workspaces, monitors = load_workspaces(read_config(Path('config')))
 positions = MonitorPosition.all(workspaces, monitors)
 to_yaml(positions, Path('i3-output-workspace-2.yaml'))
if __name__ == '__main__':
 test()

janos janos 113k15 gold badges154 silver badges396 bronze badges · Accepted Answer · 2021-10-29 18:14:48Z

As far as coding in Python, I think this is nicely done, with good style, comments, and testing included. I agree with you it's a bit clunky, and I have some other minor comments too.

What's clunky

I have some ideas why it might feel a bit clunky.

get_workspaces_and_variables could return more useful objects. The workspaces are raw lines, not yet parsed. The variables is already good, it's a dictionary ready to use to substitute symbols with actual values. As I try to build a mental model of the program in my head, I feel a heavy burden here, having to remember about two kinds of objects, where one of them needs more work. I would prefer a parse_workspaces function that returns workspaces in an intuitive format.

workspaces_without_variables takes raw workspace lines and variable mappings, and returns workspaces in an intuitive format, with all variables resolved. And that's fine, it just looks kind of a lot of code. It would be good to extract to a function the duplicated variables[name] if name.startswith("$") else name. Also, I think the parsing could be done a bit simpler. In a code comment, an example raw line would be helpful too.

Finally, workspaces_2_outputs creates the final desired mapping, I just find this line a little bit cryptic:

for _ in range(len(max((x[1] for x in workspaces), key=len)))

That is, it finds the max entry by length, and then takes the length as the limit of the range. I would find it more natural to take the lengths, and then find the max of them. And instead of numeric indexing, I would use descriptive names. Like this:

for _ in range(max(len(outputs) for _, outputs in workspaces))]

Putting the above comments together, consider this alternative implementation:

def parse_variable(line):
 # example line: "set $firstMonitor DP-2-1"
 _, name, value = line.split()
 return name, value
def parse_workspace(line, variables):
 def resolve(name):
 return variables[name] if name.startswith('$') else name
 # example line: "workspace 1 output $firstMonitor $laptop"
 _, workspace, _, *devices = line.split()
 return resolve(workspace), [resolve(device) for device in devices]
def outputs_to_workspaces(workspaces):
 mappings = [collections.defaultdict(list)
 for _ in range(max(len(outputs) for _, outputs in workspaces))]
 for workspace, outputs in workspaces:
 for index, output in enumerate(outputs):
 mappings[index][output].append(workspace)
 return mappings
def parse_workspaces():
 variables = {}
 workspaces = []
 for line in lines:
 if line.startswith('set'):
 name, value = parse_variable(line)
 variables[name] = value
 elif line.startswith('workspace'):
 workspaces.append(parse_workspace(line, variables))
 return workspaces
return outputs_to_workspaces(parse_workspaces())

This passes the original test cases, so I think the behavior is preserved, even though the parsing uses a bit simplified logic.

Note that for parse_workspaces to work like this it's important that all the variables needed to parse a workspace line have already been defined. I don't know the i3 format, and if this is a safe assumption or not. If not, then indeed a 2nd pass would be needed, similar to the way you did.

Notice that the scope of variables is much more limited than in the posted code. I think this helps reducing mental burden.

Shadowing names from outer scope

In the posted code, some names shadow names from outer scope, for example:

def get_workspaces_and_variables(lines):
 ^^^^^
def workspaces_without_variables(workspaces, variables):
 ^^^^^^^^^

I've experienced nasty bugs in the past due to this, I suggest to avoid such shadowing.

Usability

I would have liked to play around with the program, but with the input and output file paths hardcoded, this was not easy enough. It would be good to accept these as command line arguments, and perhaps use the current values as default.

Stack Exchange Network

i3 config file parser

2 Answers 2

What's clunky

Shadowing names from outer scope

Usability

Suggested

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

i3 config file parser

2 Answers 2

What's clunky

Shadowing names from outer scope

Usability

Suggested

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions