5

So my problem is this: I have a file that looks like this:

[SHIFT]this isrd[BACKSPACE][BACKSPACE] an example file[SHIFT]1

This would of course translate to

' This is an example file!'

I am looking for a way to parse the original content into the end content, so that a [BACKSPACE] deletes the last character (spaces included) and multiple backspaces delete multiple characters. The [SHIFT] doesn't matter as much to me. Thanks for all the help!

asked Feb 3, 2011 at 2:58
1
  • Are [BACKSPACE] and [SHIFT] the only markups that you need to worry about? Commented Feb 3, 2011 at 3:12

5 Answers

1

Here's one way, but it feels hackish. There's probably a better way.

def process_backspaces(input, token='[BACKSPACE]'):
    """Delete the character before each occurrence of "token" in a string."""
    output = ''
    # The padding space gives the final piece a character to lose, so every
    # piece can be handled the same way.
    for item in (input + ' ').split(token):
        output += item
        output = output[:-1]
    return output

def process_shifts(input, token='[SHIFT]'):
    """Replace the character after each occurrence of "token" with its uppercase
    equivalent. (Doesn't turn "1" into "!" or "2" into "@", however!)"""
    output = ''
    # The padding space keeps text before the first token from being uppercased.
    for item in (' ' + input).split(token):
        # item[:1] rather than item[0], so an empty piece doesn't raise IndexError.
        output += item[:1].upper() + item[1:]
    return output[1:]  # drop the padding space again

test_string = '[SHIFT]this isrd[BACKSPACE][BACKSPACE] an example file[SHIFT]1'
print(process_backspaces(process_shifts(test_string)))
answered Feb 3, 2011 at 3:26

1

If you don't care about the shifts, just strip them, load

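(defun apply-bspace ()
  (interactive)
  ;; The third argument (NOERROR) makes search-forward return nil when no
  ;; more tokens are found, instead of signaling an error.
  (while (search-forward "[BACKSPACE]" nil t)
    ;; Delete the 11-character token plus the character before it.
    (backward-delete-char 12)))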

and hit M-x apply-bspace while viewing your file, with point at the start of the buffer (the search runs forward from point). It's Elisp, not Python, but it fits your initial requirement of "something I can download for free to a PC".

Edit: Shift is trickier if you want to apply it to numbers too (so that [SHIFT]2 => @, [SHIFT]3 => #, etc). The naive way that works on letters is

(defun apply-shift ()
  (interactive)
  (while (search-forward "[SHIFT]" nil t)
    ;; Delete the 7-character token, then upcase the character after it
    ;; (if there is one).
    (backward-delete-char 7)
    (upcase-region (point) (min (1+ (point)) (point-max)))))
answered Feb 3, 2011 at 3:31

2 Comments

+1 for an Elisp answer! It's (not too surprisingly, I guess) quite good at this sort of thing... I'm a vim person, personally, but things like this sometimes pull me towards emacs.
@Joe Kington - Hehe. To be truthful, this is the sort of thing I'd handle with a keyboard macro and maybe an alist unless there were multiple, large files that needed parsing. It's just that a function is easier to share and explain.
1

This does exactly what you want:

def shift(s):
    LOWER = '`1234567890-=[];\'\\,./'
    UPPER = '~!@#$%^&*()_+{}:"|<>?'
    if s.isalpha():
        return s.upper()
    elif s in LOWER:
        return UPPER[LOWER.index(s)]
    else:
        return s  # no shifted form (e.g. a space)

def parse(input):
    # Every piece that precedes a [BACKSPACE] loses its last character.
    # Trimming the accumulated answer lets consecutive tokens delete further back.
    pieces = input.split("[BACKSPACE]")
    answer = ''
    for piece in pieces[:-1]:
        answer = (answer + piece)[:-1]
    answer += pieces[-1]
    # Text before the first [SHIFT] is left alone; each later piece gets its
    # first character shifted.
    pieces = answer.split("[SHIFT]")
    return pieces[0] + ''.join(shift(p[0]) + p[1:] for p in pieces[1:] if p)

>>> print(parse("[SHIFT]this isrd[BACKSPACE][BACKSPACE] an example file[SHIFT]1"))
This is an example file!
answered Feb 3, 2011 at 3:50

2 Comments

Oops, I just spotted a bug... sorry. Fixing it now
The debug is complete and the result is exactly what you want
0

It seems that you could use a regular expression to search for (something)[BACKSPACE] and replace it with nothing...

re.sub(r'.?\[BACKSPACE\]', '', YourString.replace('[SHIFT]', ''))

Not sure what you meant by "multiple spaces delete multiple characters".

answered Feb 3, 2011 at 3:11

4 Comments

-1 How will this work for "blah[BACKSPACE][BACKSPACE][BACKSPACE]arf"?
But it needs to delete one space BEFORE the backspace as well as the '[BACKSPACE]' itself
That's my point -- gahooa's solution won't work for my blah-barf example.
Yeah, I just saw what you were saying. So far the only way I can think of to do it would combine Python with AutoIt or another manual macro/automation service, but the results would be tedious at best, and possibly not 100% functioning
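
One possible way around the limitation raised in these comments is to apply the same substitution one match at a time, so that each [BACKSPACE] removes whatever character is left after the previous deletion. This is only a sketch: the helper name strip_backspaces is made up for illustration, and it assumes [SHIFT] markers are simply stripped first, as in the answer above.

```python
import re

# An optional single character (any character, newlines included)
# followed by the literal token.
_BACKSPACE = re.compile(r'.?\[BACKSPACE\]', re.DOTALL)

def strip_backspaces(text):
    """Hypothetical helper: resolve [BACKSPACE] markers one at a time."""
    text = text.replace('[SHIFT]', '')
    while '[BACKSPACE]' in text:
        # count=1 replaces only the leftmost match, so the next pass
        # works on the already-shortened string.
        text = _BACKSPACE.sub('', text, count=1)
    return text

print(strip_backspaces('blah[BACKSPACE][BACKSPACE][BACKSPACE]arf'))  # barf
```

Each pass rescans the string from the start, so this is probably fine for small files but slow for very large ones.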
0

You need to read the input, extract the tokens, recognize them, and give a meaning to them.

This is how I would do it:

# -*- coding: utf-8 -*-
import re

upper_value = {
    1: '!', 2: '"',
}
tokenizer = re.compile(r'(\[.*?\]|.)')
origin = "[SHIFT]this isrd[BACKSPACE][BACKSPACE] an example file[SHIFT]1"
result = ""
shift = False
for token in tokenizer.findall(origin):
    if not token.startswith("["):
        if shift:
            shift = False
            try:
                token = upper_value[int(token)]
            except (ValueError, KeyError):
                # Not a digit, or a digit with no entry in upper_value.
                token = token.upper()
        result = result + token
    else:
        if token == "[SHIFT]":
            shift = True
        elif token == "[BACKSPACE]":
            result = result[0:-1]
print(result)

It's neither the fastest nor the most elegant solution, but I think it's a good start.
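
As a rough sketch of where this could go next, the same token loop can be wrapped in a function that collects characters in a list (used as a stack) and looks shifted symbols up in a table. The names replay_keys, SHIFTED and TOKEN are invented for this example, and the table assumes a US keyboard layout, so [SHIFT]2 gives "@" here rather than the '"' used above.

```python
import re

# Assumed US keyboard layout; adjust the table for other layouts.
SHIFTED = dict(zip('`1234567890-=[]\\;\',./', '~!@#$%^&*()_+{}|:"<>?'))

# Only the two known markers are treated as tokens; anything else,
# including other bracketed text, is read one character at a time.
TOKEN = re.compile(r'\[SHIFT\]|\[BACKSPACE\]|.', re.DOTALL)

def replay_keys(text):
    """Hypothetical helper: replay a key log into the text it would have typed."""
    typed = []     # characters typed so far, used as a stack
    shift = False  # True while a [SHIFT] is waiting for its character
    for token in TOKEN.findall(text):
        if token == '[SHIFT]':
            shift = True
        elif token == '[BACKSPACE]':
            if typed:  # backspacing with nothing typed is a no-op
                typed.pop()
        else:
            if shift:
                token = SHIFTED.get(token, token.upper())
                shift = False
            typed.append(token)
    return ''.join(typed)

print(replay_keys('[SHIFT]this isrd[BACKSPACE][BACKSPACE] an example file[SHIFT]1'))
# This is an example file!
```

Appending to a list and joining once at the end avoids rebuilding the result string on every character, which may matter for large files.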

Hope it helps :-)

answered Feb 3, 2011 at 3:31
