I am trying to programmatically use this tool http://www.idtdna.com/calc/analyzer so that I can batch analyze a few dozen DNA sequences.
Peeking inside Chrome's developer tools, I see that clicking the Analyze button (after entering a sequence, e.g. "AACCGGTT", in the Sequence field) sends a POST request. In the Response tab, I can also see the response (e.g. "MeltTemp") that I wish to collect. So, following the tutorial (http://docs.python-requests.org/en/latest/user/quickstart/#make-a-request), I put together the snippet below, but it's clearly not working.
>>> import requests
>>> url = 'http://www.idtdna.com/calc/analyzer/home/analyze' # from: ChromeDevTools -> Network -> Name: analyze -> tab: Headers -> General -> Request URL
>>> data = {"settings": {"Sequence": "AACCGGTT", # from: ChromeDevTools -> Network -> Name: analyze -> tab: Headers -> Request Payload
...                      "NaConc": 50,
...                      "MgConc": 0,
...                      "DNTPsConc": 0,
...                      "OligoConc": 0.25,
...                      "NucleotideType": "DNA",
...                      }}
>>> r = requests.post(url, data=data)
>>> r.url # 404 page
u'http://www.idtdna.com/404.aspx?aspxerrorpath=/calc/analyzer/home/analyze'
# I was hoping for something like this that gives me the JSON response (which can be found at
# ChromeDevTools -> Network -> Name: analyze -> tab: Response)
>>> r.some_magical_function()
{"Sequence":"AAC CGG TTG GTT AAT T","NaConc":50, ... "MeltTemp":45.1, ...}
What am I missing?
Does the post request need to be way more complicated (with cookies? session??)? If so, please provide pointers to what I need to learn (or the solution :p)
I understand if the website has safeguards against this kind of usage; if it's basically impossible, then what strategy do you suggest? Selenium?
1 Answer
- As niemmi commented, the actual URL is http://sg.idtdna.com/calc/analyzer/home/analyze
- The site requires cookies to be set (you need to access the page at least once before the actual request, using requests.Session, to set the cookie)
- You need to pass JSON-encoded data (you can use the json=... argument)
import requests
url = 'http://sg.idtdna.com/calc/analyzer/home/analyze'
data = {
    'settings': {
        'Sequence': 'AACCGGTT',
        'NaConc': 50,
        'MgConc': 0,
        'DNTPsConc': 0,
        'OligoConc': 0.25,
        'NucleotideType': 'DNA',
    }
}
s = requests.Session()
s.get('http://sg.idtdna.com/calc/analyzer') # to set cookies
r = s.post(url, json=data)
print(r.json())
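One way to see why the json=... argument matters is to prepare both requests without sending them and compare what requests would put on the wire. The sketch below uses requests.Request(...).prepare() for that; the names form_req and json_req are only illustrative, and nothing is sent over the network.

```python
import json

import requests

url = 'http://www.idtdna.com/calc/analyzer/home/analyze'
data = {
    'settings': {
        'Sequence': 'AACCGGTT',
        'NaConc': 50,
        'MgConc': 0,
        'DNTPsConc': 0,
        'OligoConc': 0.25,
        'NucleotideType': 'DNA',
    }
}

# Build (but do not send) the request the question made: data=...
form_req = requests.Request('POST', url, data=data).prepare()

# Build (but do not send) the request the answer makes: json=...
json_req = requests.Request('POST', url, json=data).prepare()

# data= form-encodes the dict; the nested 'settings' dict is not
# serialized as JSON, so the body does not match the Request Payload.
print(form_req.headers['Content-Type'])
print(form_req.body)

# json= serializes the whole nested dict as a JSON document.
print(json_req.headers['Content-Type'])
print(json.loads(json_req.body))
```

With data=, the body is form-encoded rather than JSON, which does not match the Request Payload shown in the developer tools; with json=, the nested settings dict is sent as a JSON document with a matching Content-Type header.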
UPDATE
Depending on your location, the redirect to sg.idtdna.com may not happen. In that case, use www.idtdna.com instead of sg.idtdna.com.
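For the batch use case from the question, the same session can be reused for every sequence, with the host passed as a parameter so it can be switched between sg.idtdna.com and www.idtdna.com. This is only a sketch: build_payload and analyze_batch are hypothetical helper names, the settings values are the ones from the example above, and the session argument is assumed to be a requests.Session (or anything with compatible get and post methods).

```python
# Sketch only: build_payload and analyze_batch are hypothetical helpers,
# not part of requests or of the IDT site.
ANALYZER_PAGE = 'http://{host}/calc/analyzer'
ANALYZE_URL = 'http://{host}/calc/analyzer/home/analyze'


def build_payload(sequence):
    """Wrap one sequence in the settings payload used in the example above."""
    return {
        'settings': {
            'Sequence': sequence,
            'NaConc': 50,
            'MgConc': 0,
            'DNTPsConc': 0,
            'OligoConc': 0.25,
            'NucleotideType': 'DNA',
        }
    }


def analyze_batch(session, sequences, host='sg.idtdna.com'):
    """Return a dict mapping each sequence to its parsed JSON response."""
    # Visit the analyzer page once so the session picks up the cookies.
    session.get(ANALYZER_PAGE.format(host=host))

    results = {}
    for sequence in sequences:
        r = session.post(ANALYZE_URL.format(host=host),
                         json=build_payload(sequence))
        r.raise_for_status()  # stop early on a 404 or other HTTP error
        results[sequence] = r.json()
    return results
```

It could be called as analyze_batch(requests.Session(), ['AACCGGTT', 'TTGGCCAA']), or with host='www.idtdna.com' where the redirect does not happen. Assuming the response has the shape shown in the question, the melting temperature for a sequence would then be results['AACCGGTT']['MeltTemp'].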
7 Comments
www (instead of the sg) based on the request: ChromeDevTools -> Network -> Name: analyze -> tab: Headers -> General -> Request URL. I can't seem to find anywhere the sg address...
print(r.json()) ValueError: No JSON object could be decoded and r.url is the 404 page I mentioned in the question....
sg with www in the answer code, and run it?
404 Not Found. Why do you think this is the URL to POST to?
www (instead of the sg, as pointed out by niemmi) based on the request: ChromeDevTools -> Network -> Name: analyze -> tab: Headers -> General -> Request URL. Perhaps this is not where I should get the POST address...???