I am having trouble formatting printed output in Python 3. I have a for loop in my code that looks like this:

import re

seq = input("enter DNA sequence to search: ")
pat = re.compile('(.{10})(ATC.{3,6}CAG)')
li = []
output_lines = [] 
for mat in pat.finditer(seq):
 x = mat.end()
 li.append(mat.groups()+(seq[x:x+10],))
for u in li:
 z = u[1] 
 A = z.count('A')
 C = z.count('C') 
 G = z.count('G') 
 T = z.count('T')
 sumbases = [A,C,G,T]
 print(sumbases)

When I print sumbases, I get output like this:

[1, 2, 3, 4]
[2, 0, 1, 4]

I am trying to format the output like this:

[1, 2, 3, 4],[2, 0, 1, 4]

Can anyone show me the problem? Thanks in advance.

eumiro
asked Apr 18, 2011 at 11:01
  • That's not the output from this code. It'll print precisely A,C,G,T and a newline per iteration. I'd prefer if you posted your real code, or at least were consistent with the examples. Commented Apr 18, 2011 at 11:05
  • Thanks, sorry I didn't include the entire code. I can see how the question is unclear. Commented Apr 18, 2011 at 11:22
  • @delnan I edited my post to include the entire code. Commented Apr 18, 2011 at 12:03
  • You should consider using the built-in optimized collections.Counter facility. Commented Apr 18, 2011 at 13:55
  • It seems the intent of li.append(mat.groups()+(seq[x:x+10],)) is to also capture the following ten characters. Why not add a third capture group to match them explicitly: '(.{10})(ATC.{3,6}CAG)(.{10})'? Then I think you can simply write li = re.findall(pat, seq) ... Commented Nov 28, 2018 at 10:29
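
The two suggestions in the comments above (collections.Counter and a third capture group) can be combined. The following is only a minimal sketch, not the asker's code: the helper name base_counts and the sample sequence are made up for illustration.

```python
import re
from collections import Counter

# The third capture group grabs the 10 characters after the motif.
pat = re.compile(r'(.{10})(ATC.{3,6}CAG)(.{10})')

def base_counts(seq):
    """Return one [A, C, G, T] count list per motif match in seq."""
    results = []
    for before, motif, after in pat.findall(seq):
        counts = Counter(motif)  # missing bases count as 0
        results.append([counts[base] for base in 'ACGT'])
    return results

# Hypothetical sample sequence, not taken from the question.
sample = 'GGGGGGGGGGATCAAACAGTTTTTTTTTT'
print(','.join(str(c) for c in base_counts(sample)))  # [5, 2, 1, 1]
```

One behavioural difference to keep in mind: the third group requires exactly ten trailing characters, so a motif closer than that to the end of the sequence is not matched, whereas the slice seq[x:x+10] simply returns a shorter string.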

1 Answer

You can try this:

output_lines = []
for u in li: 
 z = u[1] 
 A = z.count('A') 
 C = z.count('C') 
 G = z.count('G') 
 T = z.count('T')
 sumbases = "A,C,G,T" # I suppose you format it here differently
 y = sumbases.replace("\n"," ") # not sure why you need this
 # print(y) # don't print now, print later...
 output_lines.append(y)
print(','.join(output_lines))

EDIT for your edited question:

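import re

seq = input("enter DNA sequence to search: ")
pat = re.compile('(.{10})(ATC.{3,6}CAG)')
output_lines = []
for mat in pat.finditer(seq):
 x = mat.end()
 # groups() plus the 10 characters after the match; index 1 is the ATC...CAG motif
 z = (mat.groups() + (seq[x:x+10],))[1]
 # one [A, C, G, T] count list per match, stored as a string
 output_lines.append(str([z.count(a) for a in 'ACGT']))
print(','.join(output_lines))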
answered Apr 18, 2011 at 11:05

5 Comments

Never mind my print(y, end=',') suggestion: it adds an ugly trailing comma. This is better, although it would be even better if the whole loop could be folded into a generator expression.
Thanks for the help. I don't think the print statement is supported in Python 3 with this syntax? When I try this I get an error.
@sebrowns - sorry, just put brackets around it (see my edited answer). Also, if you write 'I get an error' on stackoverflow.com, try to include the error message in your comment.
@sebrowns that's right. This example is for Python 2. But it's still not clear what your actual code is.
Great. I think I am starting to understand now. I get a syntax error from output_lines on the penultimate line though.
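
For reference, the generator-expression idea from the first comment might look roughly like the sketch below. It assumes Python 3; the function name format_counts and the sample input are hypothetical and not from the thread.

```python
import re

pat = re.compile(r'(.{10})(ATC.{3,6}CAG)')

def format_counts(seq):
    """Join one '[A, C, G, T]' count list per match with commas."""
    return ','.join(
        # group(2) is the ATC...CAG motif of the current match
        str([mat.group(2).count(base) for base in 'ACGT'])
        for mat in pat.finditer(seq)
    )

# Hypothetical input containing two motif matches.
sample = 'GGGGGGGGGGATCAAACAG' + 'TTTTTTTTTTATCGGGGCAG'
print(format_counts(sample))  # [5, 2, 1, 1],[2, 2, 5, 1]
```

Because str.join only puts the separator between items, there is no trailing comma, and an input with no matches yields an empty string.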
