I have the following tree algorithm which prints the conditions for each leaf:

def _grow_tree(self, X, y, depth=0):
 # Identify best split
 idx, thr = self._best_split(X, y)
 # Indentation for tree description
 indent = " " * depth
 indices_left = X.iloc[:, idx] <= thr  # use <= so the split matches the printed condition
 X_left = X[indices_left]
 y_left = y_train[X_left.reset_index().loc[:,'id'].values]
 X_right = X[~indices_left]
 y_right = y_train[X_right.reset_index().loc[:,'id'].values]
 self.tree_describe.append(indent +"if x['"+ X.columns[idx] + "'] <= " +\
 str(thr) + ':')
 # Grow on left side of the tree 
 node.left = self._grow_tree(X_left, y_left, depth + 1)
 self.tree_describe.append(indent +"else: #if x['"+ X.columns[idx] + "'] > " +\
 str(thr) + ':')
 # Grow on right side of the tree
 node.right = self._grow_tree(X_right, y_right, depth + 1)
 return node

This produces the following print for a particular case:

["if x['VAR1'] <= 0.5:",
 " if x['VAR2'] <= 0.5:",
 " else: #if x['VAR2'] > 0.5:",
 "else: #if x['VAR1'] > 0.5:",
 " if x['VAR3'] <= 0.5:",
 " else: #if x['VAR3'] > 0.5:"]

How could I obtain the following output instead?

["if x['VAR1'] <= 0.5:",
 " if x['VAR1'] <= 0.5&x['VAR2'] <= 0.5:",
 " else: #if x['VAR1'] <= 0.5&x['VAR2'] > 0.5:",
 "else: #if x['VAR1'] > 0.5:",
 " if x['VAR1'] > 0.5&x['VAR3'] <= 0.5:",
 " else: #if x['VAR1'] > 0.5&x['VAR3'] > 0.5:"]
asked Feb 25, 2020 at 14:45
  • Didn't you make the indentation because you didn't want this output? Because now indent shows which item is a child of another, and now you want to repeat the x['VAR1'] <= 0.5/x['VAR1'] > 0.5 parts to show that. Commented Feb 25, 2020 at 14:59
  • Initially yes, but now I intend to use the pandas query function to create leaf columns based on conditions and the first way is not practical. I need for each leaf to have all the conditions. Commented Feb 25, 2020 at 15:07
  • So just forward it like you did with depth. Depth grows +1, description elements will grow with conditions. Commented Feb 25, 2020 at 15:25

1 Answer

You could add a new argument to your function that holds the higher-level condition(s) as a string, so that they can be appended to each deeper condition, the same way depth is already passed down.

I would also suggest using .format() for your string building:

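def _grow_tree(self, X, y, depth=0, descr=""):
 # descr carries the ancestors' conditions, each prefixed with " & "
 idx, thr = self._best_split(X, y)
 indent = " " * depth
 # (node creation and the left/right split are omitted here; keep them from your own code)
 cond = "x['{}'] <= {}{}".format(X.columns[idx], thr, descr)
 self.tree_describe.append("{}if {}:".format(indent, cond))
 # Pass the full left-branch condition down to the left subtree
 node.left = self._grow_tree(X_left, y_left, depth + 1, " & " + cond)
 cond = "x['{}'] > {}{}".format(X.columns[idx], thr, descr)
 self.tree_describe.append("{}else: #if {}:".format(indent, cond))
 # Pass the full right-branch condition down to the right subtree
 node.right = self._grow_tree(X_right, y_right, depth + 1, " & " + cond)
 return node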
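
To see the idea in isolation, here is a minimal, self-contained sketch. It is not your class: the fitted tree is replaced by a hypothetical nested tuple (TREE), and describe is an illustrative name, since _best_split and the node type are not shown in the question. Unlike the snippet above, it puts the ancestor conditions first, which matches the order in your desired output.

```python
# Hypothetical stand-in for a fitted tree: (column, threshold, left, right).
# None marks a leaf. In the real class these values would come from
# self._best_split, which is not shown in the question.
TREE = ("VAR1", 0.5,
        ("VAR2", 0.5, None, None),
        ("VAR3", 0.5, None, None))


def describe(node, depth=0, parents=()):
    """Return description lines, each carrying all ancestor conditions."""
    if node is None:  # leaf: nothing more to describe
        return []
    col, thr, left, right = node
    indent = " " * depth
    lines = []
    for template, prefix, child in (
        ("x['{}'] <= {}", "if {}:", left),
        ("x['{}'] > {}", "else: #if {}:", right),
    ):
        # Ancestor conditions first, then this node's own condition.
        conds = parents + (template.format(col, thr),)
        lines.append(indent + prefix.format(" & ".join(conds)))
        # Recurse with the accumulated conditions, like depth + 1.
        lines.extend(describe(child, depth + 1, conds))
    return lines


tree_describe = describe(TREE)
for line in tree_describe:
    print(line)
```

Passing the conditions as a tuple and joining them only when a line is built avoids having to manage a leading or trailing " & " by hand.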
answered Feb 25, 2020 at 15:09
