Python: search for a string inside a given URL, save to a txt the link it points to

I'm new to Python and am trying to write a script that looks for coupons for online courses. The flow is the following:

1. Read a .txt list of courses and create a vector containing all of them.
2. Loop through the vector and, for each course, check whether there is a discount online (e.g. on udemycoupon.discountsglobal.com).
3. If there is, print the link it points to.

Creating the vector from the file is easy, but everything else seems to give me problems. Taking the website above as an example, I have to create, for each course, the string needed for the URL. I tried to search the web with urllib (and urllib2) but couldn't (error 403: Forbidden). I looked at other answers, but none of them seem to work.

Could you please tell me how to write this particular part of the script (taking as an example "Complete Python Programming Course 2016: Code using Python 3")?

1. I have a string called "course" containing "complete python programming course 2016: code using python 3".
2. Substitute the spaces inside "course" with "+" and the symbols with their UTF-8 percent codes (":" should become "%3a").
3. The string link = "http://udemycoupon.discountsglobal.com/?s=complete+python+programming+course+2016%3a+code+using+python+3" can then be created.
4. Check whether the text "100% off free complete python programming...", or "98% off complete python programming...", or "97% off complete python programming...", or any upper/lower-case combination of these (I think converting to lowercase might be convenient) is contained in the page at the "link" URL.
5. If it is, save the link it points to in the variable "coupon_link".
6. Print "coupon_link" to a new .txt file.
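The steps above could be sketched roughly as follows, using only the standard library. This is a minimal sketch, not a tested scraper for that site: `build_search_url`, `CouponLinkFinder` and `find_coupon_links` are hypothetical names, and it assumes that each coupon appears on the results page as an `<a>` element whose text contains something like "98% off" followed by at least the first three words of the course title. The real page layout may differ. Note that `urllib.parse.quote_plus` does the "+" and percent-code substitution in one call (it emits uppercase hex, e.g. `%3A`, which is equivalent to `%3a`).

```python
import re
from html.parser import HTMLParser
from urllib.parse import quote_plus

SEARCH_URL = "http://udemycoupon.discountsglobal.com/?s="  # from the question
DISCOUNT = re.compile(r"\d+% off", re.IGNORECASE)  # "100% off", "98% OFF", ...


def build_search_url(course):
    """Lowercase the title, turn spaces into '+' and symbols into %XX codes."""
    return SEARCH_URL + quote_plus(course.strip().lower())


class CouponLinkFinder(HTMLParser):
    """Collect the href of every <a> whose text looks like a coupon for the course.

    Assumption: the link text holds a discount and the first three title words.
    """

    def __init__(self, course):
        super().__init__()
        self.key = " ".join(course.lower().split()[:3])
        self.links = []
        self._href = None  # href of the <a> currently being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Normalize whitespace and case before comparing.
            text = " ".join("".join(self._text).split()).lower()
            if DISCOUNT.search(text) and self.key in text:
                self.links.append(self._href)
            self._href = None


def find_coupon_links(html, course):
    """Return the hrefs of all links in `html` that match the course."""
    finder = CouponLinkFinder(course)
    finder.feed(html)
    return finder.links
```

Given the downloaded page as a string `html`, `find_coupon_links(html, course)` returns a list of candidate links, which can then be written to a .txt file one per line.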

This is what I tried:

    # -*- coding: utf-8 -*-

    import urllib.request

    with open('courses_list.txt') as f:
        courseslist = f.readlines()  # going to be modified
        courseslistunquoted = list(courseslist)  # copy, to keep the original strings

    length = len(courseslist)

    # create the links for http://udemycoupon.discountsglobal.com/
    for i in range(0, length):
        courseslist[i] = courseslist[i].replace("\n", "")
        courseslist[i] = courseslist[i].replace(" ", "+").lower()
        courseslist[i] = courseslist[i].replace(":", "%3a")  # ":" is %3a, not %a3
        courseslist[i] = courseslist[i].replace("#", "%23")
        courseslist[i] = courseslist[i].replace("!", "%21")
        courseslist[i] = courseslist[i].replace("/", "%2f")

    # scrape http://udemycoupon.discountsglobal.com/
    addressbeginning = "http://udemycoupon.discountsglobal.com/?s="

    for i in range(0, length):
        link = addressbeginning + courseslist[i]
        with urllib.request.urlopen(link) as response:
            htmlcode = response.read()
        ...

But it doesn't seem to work. I get the error "No module named request". How can I access the text on the webpage?

Thank you for your help.

It might be that the website you are trying to access limits access to these resources to browsers or specific clients. Change the User-Agent header, setting it to that of a specific browser, and send the request again.
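A minimal sketch of that suggestion, assuming Python 3 (the "No module named request" error usually means the script was run with Python 2, where `urllib.request` does not exist). The helper names `build_request` and `fetch_html` and the `"Mozilla/5.0"` User-Agent value are illustrative choices, not something the site is known to require, and a different value may be needed in practice.

```python
import urllib.request


def build_request(url, user_agent="Mozilla/5.0"):
    """Create a request that identifies itself with a browser-like User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url, user_agent="Mozilla/5.0", timeout=10):
    """Download `url` and return the page as text."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

With this in place, `htmlcode = fetch_html(link)` would replace the bare `urllib.request.urlopen(link)` call in the loop. If the server still answers 403, it may be blocking automated clients by other means.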


Alcombright
