Python: search for a string inside a given URL and save the link it points to in a .txt file


I'm new to Python and I'm trying to write a script that looks for coupons for online courses. The flow is the following:

    # read a .txt list of courses and create a vector containing all of them
    # loop through the vector and, for each course, check if there's a discount online (e.g. udemycoupon.discountsglobal.com)
    # if there is, print the link it points to

Creating the vector from the file is easy, but everything else seems to give me problems. Taking the website above as an example, I have to create, for each course, the string needed for the URL. I tried to search the web with urllib (and urllib2) but couldn't (error 403: Forbidden). I looked at other answers, but none of them seem to work.

Could you please tell me how to write this particular part of the script (considering as an example "complete python programming course 2016: code using python 3")?

    # I have a string called "course" containing "complete python programming course 2016: code using python 3"
    # substitute the spaces inside "course" with "+", and each symbol with its UTF-8 percent code (":" should become %3a)
    # so that the string link = "http://udemycoupon.discountsglobal.com/?s=complete+python+programming+course+2016%3a+code+using+python+3" can be created
    # if the text "100% off free complete python programming..."
    # or "98% off complete python programming..."
    # or "97% off complete python programming..."
    # or every combination of upper/lower case (I think converting to lowercase might be convenient)
    # is contained in the page at the "link" URL
    # save the link it points to in a variable "coupon_link"
    # print "coupon_link" to a new .txt file
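The URL-building and text-matching steps described above could be sketched roughly as follows, assuming Python 3. The helper names `build_search_url` and `is_discount_title` are hypothetical, and the pattern assumes the page shows the full course title after the "NN% off" prefix, which may not hold if titles are truncated. Note that `quote_plus` emits uppercase hex digits (`%3A`); percent-encoded hex digits are case-insensitive, so this is equivalent to `%3a`.

```python
import re
from urllib.parse import quote_plus

# Search address taken from the question.
SEARCH_PREFIX = "http://udemycoupon.discountsglobal.com/?s="


def build_search_url(course):
    """Return the search URL for one course title (hypothetical helper)."""
    # quote_plus turns spaces into "+" and percent-encodes characters
    # such as ":" (%3A), "#" (%23), "!" (%21) and "/" (%2F).
    return SEARCH_PREFIX + quote_plus(course.strip().lower())


def is_discount_title(text, course):
    """Check for '<NN>% off [free] <course>', ignoring upper/lower case."""
    # Assumption: the full course title follows the "% off" prefix.
    pattern = r"\d{1,3}%\s*off\s+(free\s+)?" + re.escape(course.strip().lower())
    return re.search(pattern, text.lower()) is not None


course = "complete python programming course 2016: code using python 3"
print(build_search_url(course))
print(is_discount_title("98% OFF " + course.upper(), course))
```

Letting the standard library do the encoding avoids maintaining a hand-written chain of `replace` calls, where a mistyped code is easy to miss.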

This is what I tried:

    # -*- coding: utf-8 -*-
    import urllib.request

    with open('courses_list.txt') as f:
        courseslist = f.readlines()  # going to be modified
        courseslistunquoted = list(courseslist)  # copy, to keep the original strings

    length = len(courseslist)

    # create the links for http://udemycoupon.discountsglobal.com/
    for i in range(0, length):
        courseslist[i] = courseslist[i].replace("\n", "")
        courseslist[i] = courseslist[i].replace(" ", "+").lower()
        courseslist[i] = courseslist[i].replace(":", "%3a")  # ":" is %3a, not %a3
        courseslist[i] = courseslist[i].replace("#", "%23")
        courseslist[i] = courseslist[i].replace("!", "%21")
        courseslist[i] = courseslist[i].replace("/", "%2f")

    # scrape http://udemycoupon.discountsglobal.com/
    addressbeginning = "http://udemycoupon.discountsglobal.com/?s="

    for i in range(0, length):
        link = addressbeginning + courseslist[i]
        with urllib.request.urlopen(link) as response:
            htmlcode = response.read()
        ...

But it doesn't seem to work. The error is "No module named request". How can I access the text on the webpage?

Thank you for your help.

It might be that the website you are trying to access restricts these resources to browsers or to specific clients, which would explain the 403 error. Change the User-Agent header to that of a specific browser and send the request again.
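A minimal sketch of that suggestion, assuming Python 3 (`urllib.request` does not exist in Python 2, where `import urllib.request` fails with "No module named request"). The User-Agent string is only an example of a browser-like value, and the names `fetch_html`, `LinkCollector` and `find_coupon_links` are hypothetical. The sketch also assumes that each coupon appears on the page as an `<a>` element whose text contains "% off" and the course title; the real page layout may differ, and a browser-like User-Agent is not guaranteed to lift the 403.

```python
import urllib.request
from html.parser import HTMLParser

# Assumption: an example browser-like User-Agent value, not a required one.
BROWSER_USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
)


def fetch_html(url, user_agent=BROWSER_USER_AGENT, timeout=10):
    """Download a page while sending a browser-like User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs for every <a> element with an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def find_coupon_links(html, course):
    """Return hrefs whose link text mentions '% off' and the course title."""
    collector = LinkCollector()
    collector.feed(html)
    wanted = course.strip().lower()
    return [href for href, text in collector.links
            if "% off" in text.lower() and wanted in text.lower()]
```

With these pieces, something like `find_coupon_links(fetch_html(link), course)` would give the candidate coupon links for one course, which can then be written to a .txt file with an ordinary `open(..., "w")`.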

