Web Scraping: Part 3
In Web Scraping: Part 2, I showed you how a Twitter user's followers data can be extracted. To do that, we used requests and BeautifulSoup. In this post, I will show you how to extract data using Python's standard library instead of requests and BeautifulSoup.
To do this, I will be using the urllib module (its urllib.request submodule), and I will be extracting images from the famous webcomic xkcd.
First, import the modules.
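A minimal sketch of the imports follows. The exact set of modules is an assumption on my part: urllib.request sends the requests, re is one standard-library way to pick the image URL out of the page source, and os handles the output directory.

```python
# Standard-library modules only; no third-party packages are required.
import os              # create the output directory and build file paths
import re              # locate the image URL in the page source (assumed approach)
import urllib.request  # send HTTP requests and read the responses
```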
I will be downloading a number of images from the website. To do that, I will be using the URL https://c.xkcd.com/random/comic/, which opens a random comic every time it is requested.
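The variables could look roughly like this. The names RANDOM_COMIC_URL and HEADERS, as well as the exact User-Agent string, are illustrative assumptions; any browser-like value should serve the same purpose.

```python
# URL that leads to a random xkcd comic on every request (taken from the text above).
RANDOM_COMIC_URL = "https://c.xkcd.com/random/comic/"

# Request headers with a browser-like User-Agent.
# The exact string is an assumption chosen for illustration.
HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
}
```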
Above, we declared the variables. The headers will be used to make our requests look like they come from a browser.
Now we will use urllib to send the request.
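One way to send the request with urllib is sketched below. The helper name fetch_page and the timeout value are assumptions of mine, not something urllib requires.

```python
import urllib.request


def fetch_page(url, headers, timeout=10):
    """Return the body of `url` as text, sending the given request headers."""
    # A Request object lets us attach custom headers such as User-Agent.
    req = urllib.request.Request(url, headers=headers)
    # urlopen follows redirects, so a "random comic" URL ends up on a comic page.
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling fetch_page("https://c.xkcd.com/random/comic/", {"User-Agent": "Mozilla/5.0"}) should return the HTML source of a random comic page.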
Now we have the source of the page. We need to extract the image URL.
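A simple way to do this with the standard library is a regular expression. The sketch below assumes the comic image sits in an img tag inside a div with id "comic" and that its src may be protocol-relative (starting with "//"). If the page markup differs, the pattern needs adjusting. The name extract_image_url is hypothetical.

```python
import re

# Assumed markup: <div id="comic"> ... <img src="//imgs.xkcd.com/comics/name.png" ...>
IMG_PATTERN = re.compile(r'<div id="comic">.*?<img[^>]*?src="([^"]+)"', re.DOTALL)


def extract_image_url(html):
    """Return the comic image URL found in `html`, or None if there is no match."""
    match = IMG_PATTERN.search(html)
    if match is None:
        return None
    src = match.group(1)
    # Protocol-relative URLs ("//host/path") need a scheme before they can be fetched.
    if src.startswith("//"):
        src = "https:" + src
    return src
```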
At this point, we have the image URL. To fetch the image, we will do the following.
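Fetching and saving the image could be sketched like this. The function name save_image, the fallback file name, and the timeout are assumptions; the file name is simply taken from the last part of the image URL.

```python
import os
import urllib.parse
import urllib.request


def save_image(image_url, headers, out_dir="img", timeout=10):
    """Download `image_url` into `out_dir` and return the path of the saved file."""
    # Create the output directory if it does not exist yet.
    os.makedirs(out_dir, exist_ok=True)
    # Use the last path segment of the URL as the file name.
    filename = os.path.basename(urllib.parse.urlsplit(image_url).path) or "image"
    path = os.path.join(out_dir, filename)
    req = urllib.request.Request(image_url, headers=headers)
    # Images are binary data, so read bytes and write in binary mode.
    with urllib.request.urlopen(req, timeout=timeout) as resp, open(path, "wb") as f:
        f.write(resp.read())
    return path
```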
This will save the image in the directory “img”.
To make this code modular, we can wrap it in a function, and to get multiple images, we can use a loop.
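Putting the steps together, a modular version might look like the sketch below. It is not the code from the repository. The function names, the regular expression (which assumes the image lives in a div with id "comic"), and the User-Agent value are all illustrative assumptions.

```python
import os
import re
import urllib.parse
import urllib.request

RANDOM_COMIC_URL = "https://c.xkcd.com/random/comic/"
HEADERS = {"User-Agent": "Mozilla/5.0"}  # assumed browser-like value
# Assumed markup: the comic image is an <img> inside <div id="comic">.
IMG_PATTERN = re.compile(r'<div id="comic">.*?<img[^>]*?src="([^"]+)"', re.DOTALL)


def open_url(url, headers, timeout=10):
    """Return the raw bytes served at `url`."""
    req = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


def download_comic(page_url=RANDOM_COMIC_URL, headers=HEADERS, out_dir="img"):
    """Download one comic image; return the saved path, or None if no image is found."""
    html = open_url(page_url, headers).decode("utf-8", errors="replace")
    match = IMG_PATTERN.search(html)
    if match is None:
        return None
    image_url = match.group(1)
    if image_url.startswith("//"):  # protocol-relative URL
        image_url = "https:" + image_url
    os.makedirs(out_dir, exist_ok=True)
    filename = os.path.basename(urllib.parse.urlsplit(image_url).path) or "image"
    path = os.path.join(out_dir, filename)
    with open(path, "wb") as f:
        f.write(open_url(image_url, headers))
    return path


def download_comics(count, page_url=RANDOM_COMIC_URL, headers=HEADERS, out_dir="img"):
    """Download `count` comics in a loop and return the list of saved paths."""
    paths = []
    for _ in range(count):
        path = download_comic(page_url, headers, out_dir)
        if path is not None:
            paths.append(path)
    return paths
```

Calling download_comics(5) would then try to save five random comics into “img”. Since the comics are random, the same one may come up more than once, in which case the file is simply overwritten.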
That’s it. We can now download any number of images from xkcd with this Python code.
I wrote this code when I was just learning to code in Python. You can find it in this repository: https://github.com/Parassharmaa/i-scrap. The code is not very modular; if you are interested in improving it, you are welcome to contribute to the repo.