Problem
I have a simple python script that’s repeating a HTTP response/request. I would like to know how I could have constructed this better, particularly if there’s a way I could have incorporated the slice operator to cut down on my variable usage.
For context – the response contained a JSON string in between the <ExecuteResult>
flags that I’m removing currently with split()
import requests
import json
from pprint import pprint
url = "<redacted>"
cookies = {<redacted>}
headers = {<redacted>}
data="<redacted>"
submissions = input("How many submissions to create:")
while submissions > 0:
response = requests.post(url, headers=headers, cookies=cookies, data=data)
xml_string = response.content
json_file = xml_string.split('<ExecuteResult>')[1]
json_file = (json_file.split('</ExecuteResult>')[0]).replace("\", "")
json_response = json.loads(json_file)
print "Created Id: " + json_response['_entity']['Id']
submissions -= 1
Solution
If you are making many requests, it is worth it to have a look at requests.Session
. This re-uses the connection to the server for subsequent requests (and also the headers and cookies).
I would also start separating your different responsibilities. One might be global constants (like your header and cookies), one a function that actually does the submission, one that creates the session and does each submission and then some code under a if __name__ == '__main__':
guard that executes the code.
import requests
import json
COOKIES = {<redacted>}
HEADERS = {<redacted>}
def post(session, url, data):
"""Use session to post data to url.
Returns json object with execute result"""
response = session.post(url, data=data)
json_file = response.content.split('<ExecuteResult>')[1]
json_file = (json_file.split('</ExecuteResult>')[0]).replace("\", "")
return json.loads(json_file)
def post_submisssions(url, data, n):
"""Create a session and post `n` submissions to `url` using `data`."""
session = requests.Session()
session.headers.update(HEADERS)
session.cookies = COOKIES
for _ in xrange(n):
json_response = post(session, url, data)
print "Created Id:", json_response['_entity']['Id']
if __name__ == "__main__":
url = "<redacted>"
data = "<redacted>"
post_submisssions(url, data, input("How many submissions to create:"))
I also added some rudimentary docstring
s.
You could also change your parsing by using BeautifulSoup
:
from bs4 import BeautifulSoup
def post(session, url, data):
"""Use session to post data to url.
Returns json object with execute result"""
response = session.post(url, data=data)
soup = BeautifulSoup(response.content, 'lxml')
return json.loads(soup.find('ExecuteResult'.lower()).replace("\", ""))
I’ll start with the obvious:
while submissions > 0:
...
submissions -= 1
can be transformed into:
for _ in xrange(submissions):
...
There are other changes that I think would make the code slightly better, albeit not necessarily more pythonic:
submissions
should probably be namedsubmission_count
(thanks, Ludisposed!).- You appear to be parsing HTML using
string.split()
. It might work OK now, but if you need to add more functionality in the future this approach will be very cumbersome. Try using a library instead (for example, Beautiful Soup). - You’re not using the
pprint
import, so you might as well remove that line.