Install Python Wget and Automate your File Downloads

Published: 25 August 2021 - 10 min. read

Downloading multiple files from the Internet manually as part of your daily routine can truly be a nightmare. And if you’re looking for a way to automate your file downloads, then Python’s Wget is the right tool for you.

In this tutorial, you’ll learn many ways to download files, from running the basic Python wget command to creating a script to download multiple files simultaneously.

Let’s get down to it!

Prerequisites

This tutorial will be a hands-on demonstration. If you’d like to follow along, be sure you have the following:

Downloading and Installing Wget on Windows

Wget is a non-interactive utility for downloading remote files from the internet. Besides shipping with most Unix-based operating systems, Wget also has a version built for Windows. At the time of writing, the latest Wget Windows version is 1.21.6.

Before you download files with the wget command, let’s go over how to download and install Wget on your Windows PC first.

1. Download Wget either for 64bit or 32bit for Windows.

2. Open File Explorer and find the wget.exe file you downloaded, then copy and paste it to the C:\Windows\System32 directory. This directory is already listed in the PATH environment variable, which specifies the directories the system searches to find a command or executable program.

Placing wget.exe in a directory listed in the PATH environment variable lets you run the wget command from any working directory in the command prompt.

3. Now, launch the command prompt and confirm the version (--version) of Wget (wget) you downloaded with the command below.

wget --version

If you see output like the screenshot below, Wget is successfully installed on your machine.

Confirming if Wget was successfully installed.
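
The same PATH lookup can be checked from Python. The snippet below is a minimal sketch using the standard library's shutil.which, which searches the directories listed in PATH the way the command prompt does. The helper name find_executable is an illustrative assumption, not part of Wget.

```python
import shutil


def find_executable(name):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


wget_path = find_executable("wget")
if wget_path:
    print(f"wget found at {wget_path}")
else:
    print("wget is not on PATH")
```

If the script reports that wget is not on PATH, the copy step above most likely did not complete.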

Downloading a File Directly from a URL

Now that you’ve installed Wget, let’s dig into running basic wget commands. Perhaps you want to download a file from a specific URL. In that case, you only need the basic wget command syntax and specify the URL to download the file from.

Below, you can see the basic syntax for running the wget command. Notice that after the wget command, you’ll specify various options followed by the website URL.

wget [options] url

Downloading a File to the Working Directory

With the wget command syntax still fresh in your memory, let’s look at downloading a file to the working directory by running wget without any added options.

Run the command below to download the wget.exe file from the specified URL (https://eternallybored.org/misc/wget/1.21.1/64/wget.exe) to the working directory.

wget https://eternallybored.org/misc/wget/1.21.1/64/wget.exe

Once you see this output on your command prompt, the file has been downloaded successfully.

Downloading a single file to the working directory
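
If you later want to trigger the same command from a script, one possible approach is to build the argument list in Python and pass it to subprocess. This is only a sketch: build_wget_command and run_wget are hypothetical helper names, and run_wget assumes wget is installed and on PATH. The example only builds and prints the command, so nothing is downloaded until you call run_wget yourself.

```python
import subprocess


def build_wget_command(url, *options):
    """Build the argument list for: wget [options] url."""
    return ["wget", *options, url]


def run_wget(url, *options):
    """Run wget and return its exit code (0 means success)."""
    completed = subprocess.run(build_wget_command(url, *options))
    return completed.returncode


url = "https://eternallybored.org/misc/wget/1.21.1/64/wget.exe"
command = build_wget_command(url)
print(command)
```

Passing the arguments as a list avoids shell quoting issues when a URL contains characters such as &.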

Downloading a File to a Specific File Path

You’ve just downloaded a file to your working directory, but what if you prefer to download the file to a specific file path?

Run the wget command below and add the --directory-prefix option to specify the file path (C:\Temp\Downloads) to save the file you’re downloading.

wget --directory-prefix=C:\Temp\Downloads https://eternallybored.org/misc/wget/1.21.1/64/wget.exe

Open File Explorer and navigate to the download location you specified (C:\Temp\Downloads) to confirm that you’ve successfully downloaded the file.

Confirming File is Successfully Downloaded

Downloading and Renaming a File

Downloading a file to your preferred directory with a single command is cool enough. But perhaps you’d like to download a file with a different name. If so, the -O flag (uppercase) is the answer! Adding the -O flag lets you save the file you’re downloading under a different name. Be careful not to confuse it with the lowercase -o flag, which writes Wget's log messages to a file instead.

Below, run the basic wget command syntax to download the wget.exe file from a specific URL. But this time, add the -O flag to rename the file you’re downloading. So instead of wget.exe, you’re naming the file new_wget.exe.

wget -O new_wget.exe https://eternallybored.org/misc/wget/1.21.1/64/wget.exe

You can see below in File Explorer that the downloaded file is named new_wget.exe.

Viewing Downloaded File with Custom Name
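
Without a custom name, Wget names the saved file after the last part of the URL path, which is why the earlier downloads were saved as wget.exe. The sketch below mimics that naming rule in Python so you can predict the name before downloading. The function name default_filename is an illustrative assumption, and the rule is simplified: it ignores query strings and any file name suggested by the server.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def default_filename(url):
    """Return the last path component of a URL, or index.html if there is none."""
    path = urlsplit(url).path
    name = posixpath.basename(unquote(path))
    return name or "index.html"


print(default_filename("https://eternallybored.org/misc/wget/1.21.1/64/wget.exe"))
print(default_filename("http://domain.com/"))
```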

Downloading a File’s Newer Version

Perhaps you want to download a newer version of a file you previously downloaded. If so, adding the --timestamping option (which can be shortened to --timestamp) to your wget command will do the trick. Applications on a website tend to be updated over time, and the --timestamping option checks whether the file at the specified URL is newer than your local copy.

The wget command below checks (--timestamping) and downloads the newer version of the wget.exe file to the C:\Temp\Downloads directory.

wget --timestamping --directory-prefix=C:\Temp\Downloads https://eternallybored.org/misc/wget/1.21.1/64/wget.exe

If the remote file (wget.exe) has changed since you downloaded it, you’ll get output similar to the previous examples. But if not, you’ll see the screenshot below. Notice the part where it says Not Modified, indicating there’s no newer version of the file you’re downloading.

Downloading a file newer version
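
The idea behind the timestamp check can be sketched in a few lines of Python: compare the local file's modification time with the Last-Modified date the server reports. This is a simplified illustration, not Wget's actual implementation, and the function name remote_is_newer and the sample dates are assumptions.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def remote_is_newer(local_mtime, last_modified_header):
    """Compare a local POSIX timestamp with an HTTP Last-Modified header value."""
    remote_time = parsedate_to_datetime(last_modified_header)
    local_time = datetime.fromtimestamp(local_mtime, tz=timezone.utc)
    return remote_time > local_time


# Hypothetical local copy saved on 1 August 2021 (UTC)
local_mtime = datetime(2021, 8, 1, tzinfo=timezone.utc).timestamp()

print(remote_is_newer(local_mtime, "Wed, 25 Aug 2021 10:00:00 GMT"))  # server copy is newer
print(remote_is_newer(local_mtime, "Fri, 01 Jan 2021 00:00:00 GMT"))  # server copy is older
```

For a real local file, os.path.getmtime returns the timestamp to pass as local_mtime.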

Downloading Files from a Website Requiring Username and Password

Many websites require a user to be logged in to access or download some files and content. To make this possible, Wget offers the --user and --password options. With these options, Wget provides a username and password to authenticate your connection request when downloading from a website.

Below is the basic syntax of the wget command to download files from websites requiring your account’s username (myusername) and password (mypassword). If you’d rather not type the password on the command line, replace --password=mypassword with --ask-password, which takes no value and prompts you for the password instead.

wget --user=myusername --password=mypassword https://downloads.mongodb.com/compass/mongodb-compass-1.28.1-win32-x64.zip

You will see output similar to the image below if the command is successful.

Downloading files from a password-protected website
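
To make the authentication step less of a black box, here is a minimal sketch of how a username and password are packed into a request header, assuming the website uses HTTP Basic authentication (other schemes work differently). The function name basic_auth_header is hypothetical, and the credentials are the placeholder values from the command above.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authorization header."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Basic {token}"


header = basic_auth_header("myusername", "mypassword")
print(header)
```

Base64 is an encoding, not encryption, so credentials sent this way are only protected when the URL uses https.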

Downloading a Web Page

Instead of a file, perhaps you’re trying to download a web page to keep a local copy. In that case, you’ll run a similar command that downloads a file, but with additional options.

Run the wget command below to download the home page of the http://domain.com/ website, along with the pages it links to on the same site (-r), and create a folder named domain.com in the working directory. The domain.com folder is where the downloaded pages are saved.

The -o option also writes the download messages to a log file (named log) in the working directory instead of printing output on the console.

wget -r http://domain.com/ -o log

Below, you’ll see the local copy of the downloaded web page and log file where the download logs are saved.

Viewing Downloaded File and Log File

You may also combine several short options that do not require arguments. Below, you can see that instead of writing options separately (-d -r -c), you can combine them in this format (-drc).

wget -d -r -c http://domain.com/ -o log   # Standard option declaration
wget -drc http://domain.com/ -o log       # Combined options

Downloading an Entire Website

Rather than just a single web page, you may also want to download an entire website to see how the website is built. To do so, you’ll need to configure the wget command as follows:

  • Replicate (--mirror) the website (www.domain.com), and ensure all files (-p) needed to display each page, including scripts, images, etc., are included in the download.
  • Add the -P option to set a download location (./local-dir).
  • Add the --convert-links option so that links in the downloaded pages point to your local copies, which lets you browse the site offline. Most websites have pages with links pointing to resources on other websites, but Wget does not follow links to other hosts during a recursive download unless you tell it to, so you download the specified website only.

wget --mirror -p --convert-links -P ./local-dir http://www.domain.com/

Once you see the output below, the website has been downloaded successfully.

Downloading an entire website

Wget downloads all the files that make up the entire website to the local-dir folder, as shown below.

Viewing Downloaded Website Files

The command below produces the same result as the previous one you executed. The difference is that the --wait option sets a 15-second pause between retrievals, while the --limit-rate option caps the download speed at 50 KB per second.

wget --mirror -p --convert-links -P ./local-dir --wait=15 --limit-rate=50K http://www.domain.com/
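
Throttling a mirror this way trades speed for politeness, and a rough estimate shows how much. The sketch below is back-of-the-envelope arithmetic only: estimate_mirror_seconds is a hypothetical helper, the file count and total size are made-up inputs, and it assumes one pause between consecutive retrievals and that 50K means 50 * 1024 bytes per second.

```python
def estimate_mirror_seconds(file_count, total_bytes, wait_seconds=15,
                            rate_bytes_per_second=50 * 1024):
    """Estimate the minimum duration of a throttled mirror run, in seconds."""
    pauses = max(file_count - 1, 0) * wait_seconds      # one pause between files
    transfer = total_bytes / rate_bytes_per_second      # time spent transferring
    return pauses + transfer


# Hypothetical site: 40 files totaling 5 MiB
seconds = estimate_mirror_seconds(40, 5 * 1024 * 1024)
print(f"At least {seconds / 60:.1f} minutes")
```

Even a small site takes several minutes with these settings, so consider lowering --wait for sites you own or control.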

Downloading Files from Different URLs Simultaneously

Downloading files one at a time each day, as you did in the previous examples, is obviously a tedious task. Wget offers the flexibility to download files from multiple URLs with a single command. All it needs is a text file listing the URLs, which Wget works through in order.

Sounds like a good deal? Let’s get down to it!

Open your favorite text editor and put in the URLs of the files you wish to download, each on a new line, like the image below. Save the file as list.txt in your working directory.

Adding different download URLs to a text file

Now, run the command below to download the files from each URL you listed in the text file.

wget -i list.txt

Below, you can see the output of each file’s download progress.

Downloading different files from the URLs in a text file
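
If you want to reuse the same list file from a Python script later on, reading it takes only a few lines. This is a minimal sketch: read_url_list is a hypothetical helper, and the example.com URLs are placeholders written to a temporary list.txt purely for demonstration.

```python
import tempfile
from pathlib import Path


def read_url_list(path):
    """Return the non-empty lines of a text file, one URL per line."""
    urls = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line:  # skip blank lines
            urls.append(line)
    return urls


# Demonstration with a temporary list.txt containing placeholder URLs
with tempfile.TemporaryDirectory() as folder:
    list_file = Path(folder) / "list.txt"
    list_file.write_text(
        "https://example.com/a.zip\n\nhttps://example.com/b.zip\n",
        encoding="utf-8",
    )
    urls = read_url_list(list_file)

print(urls)
```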

Resuming an Interrupted Download

By now, you know your way around downloading files with the wget command. But what if your download gets interrupted? Another great feature of wget is the ability to resume an interrupted or failed download.

Below is an example of a download interrupted by a lost internet connection. Notice that the download progress (7%) gets stuck, and the ETA keeps counting up.

Showing a Failed / Interrupted File Download

The download progress will automatically resume when you get your internet connection back. But in other cases, like if the command prompt unexpectedly crashed or your PC rebooted, how would you continue the download? The --continue option will surely save the day.

Run the wget command below to continue (--continue) an interrupted download of the snagit.exe file.

wget --continue https://download.techsmith.com/snagit/releases/snagit.exe

You can see below that the download resumed at 7%, the point where it was interrupted. Resuming is not always possible, since it depends on the server. You’ll also see the total and remaining file size to download.

Resuming a Failed / Interrupted File Download
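
Resuming works by asking the server for only the bytes that are still missing, using an HTTP Range request that starts at the size of the partial file. The sketch below shows how such a header could be derived in Python. The function name resume_headers and the 7-byte partial file are illustrative assumptions, not Wget internals.

```python
import os
import tempfile


def resume_headers(partial_path):
    """Return the HTTP headers needed to request only the missing bytes."""
    if not os.path.exists(partial_path):
        return {}  # nothing downloaded yet, so request the whole file
    offset = os.path.getsize(partial_path)
    return {"Range": f"bytes={offset}-"} if offset else {}


# Demonstration with a hypothetical partial download of 7 bytes
with tempfile.TemporaryDirectory() as folder:
    partial_file = os.path.join(folder, "snagit.exe")
    with open(partial_file, "wb") as handle:
        handle.write(b"1234567")
    headers = resume_headers(partial_file)

print(headers)
```

A server that does not support range requests sends the whole file again, in which case the download starts over.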

Alternatively, you may want to set a certain number of times the wget command will retry a failed or interrupted download.

The wget command below adds the --tries option to allow up to 10 tries to download the googlelogo_color_272x92dp.png file if the download fails. To demonstrate how the --tries option works, interrupt the download by disconnecting your computer from the internet as soon as you run the command.

wget --tries=10 https://www.google.com/images/branding/googlelogo/1x/googlelogo_color_272x92dp.png

Below, you can see that the download stops, and the HTTP request is awaiting a response.

Interrupting the Download Progress

Now, reconnect your computer to the internet, and you’ll see the download will automatically continue, as shown below. You can see that it’s the second try to download the file.

Retrying File Download Automatically
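
The retry behavior you just saw follows a simple pattern: attempt the download, and on a network error try again until the attempts run out. Here is a minimal sketch of that pattern in Python. The names download_with_retries and flaky_fetch are hypothetical, and flaky_fetch only simulates a download that fails once before succeeding, so no network access is needed.

```python
import time


def download_with_retries(fetch, tries=10, wait_seconds=0):
    """Call fetch() up to `tries` times (at least once); return (result, attempt)."""
    for attempt in range(1, tries + 1):
        try:
            return fetch(), attempt
        except OSError:
            if attempt == tries:
                raise  # out of tries, so report the last error
            time.sleep(wait_seconds)  # pause before the next try


calls = []


def flaky_fetch():
    """Simulated download: fails on the first call, succeeds afterwards."""
    calls.append(1)
    if len(calls) < 2:
        raise ConnectionError("network unreachable")
    return "googlelogo_color_272x92dp.png"


result, attempt = download_with_retries(flaky_fetch, tries=10)
print(f"Saved {result} on try {attempt}")
```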

Creating a Python Script for Downloading Files

You’ve learned how to download files by running commands so far, but did you know you can also create a script to download files automatically? Let’s dive into writing some Python code.

1. Create a new folder named ~downloader.

2. Launch VS Code, then click on the File menu → Open Folder to open the ~downloader folder you created.

Opening Folder in VS Code

3. Click on the new file icon to create a new Python script file named app.py in your project directory, as shown below.

Creating a Python Script File

4. Now, click on the Terminal menu, and choose New Terminal to open a new command-line terminal, as shown below.

Running a New Terminal

Installing and Activating Virtual Environment

Now that you have your project folder and script file, let’s dig into creating a virtual environment. A virtual environment is an isolated environment for Python projects where the packages required for your project are installed. You’ll activate this virtual environment to enable the execution of your program in the future.

Run the commands below in your VS Code terminal to install the virtual environment package and create a virtual environment.

pip install virtualenv # Install Virtual Environment Package
virtualenv download    # Create a Virtual Environment named 'download'

Run either of the commands below depending on your operating system to activate your virtual environment.

source download/bin/activate # Activate Virtual Environment for Unix/Mac
download\Scripts\activate    # Activate Virtual Environment for Windows
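
To confirm the activation worked, you can ask Python itself whether it is running inside a virtual environment. This is a small sketch: the function name in_virtual_environment is an assumption, and the check relies on sys.prefix differing from sys.base_prefix inside a virtual environment (older virtualenv releases set sys.real_prefix instead, so both are checked).

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


print("Virtual environment active:", in_virtual_environment())
```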

Installing wget Module

You now have your virtual environment set up, so it’s time to install the wget module. The wget module is a Python package that provides a download API for Python developers. It is written in pure Python, so it does not depend on the wget.exe you installed earlier, but it lets you download files from a script in much the same spirit as the wget command.

When building a Python project, it’s good practice to record the packages you use in a requirements.txt file. This file helps you install the same versions of the packages in the future.

Run the commands below to install the Wget module and add it to the requirements.txt file.

pip install wget # Install the wget module
pip freeze > requirements.txt # Add wget to requirements.txt

Now copy and paste the code below to the app.py you previously created in VS Code.

The code below changes the output of the file download so that you can see each file download’s progress with a custom progress bar.

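# Import os to create the download folder, and the download function from the wget module
import os
from wget import download


# Create a downloader class
class Downloader:
    # Create a custom progress bar method.
    # The wget module calls it with the current size, the total size, and the console width.
    def progress_bar(self, current, total, width=80):
        print("Downloading: %d%% [%d / %d] bytes" % (current / total * 100, current, total))

    # Create a download_file method that accepts the URL and the file storage location.
    # The location is an empty string (the working directory) by default.
    def download_file(self, url, location=""):
        # Create the target folder if needed; otherwise the file itself would be saved under that name
        if location:
            os.makedirs(location, exist_ok=True)
        # Download the file with the custom progress bar
        download(url, out=location, bar=self.progress_bar)


download_obj = Downloader()
download_obj.download_file("https://blog.debugeverything.com/wp-content/uploads/2021/04/python-virtualenv-project-structure.jpg", "files")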

Finally, run the command below to execute the app.py script.

python app.py

Below, you can see the file’s download progress in percentage with the file’s total and current downloaded size in bytes.

Downloading files by running the app.py script
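
To go from one file to several, one possible extension is to hand a list of URLs to a thread pool so that the downloads run at the same time. The snippet below is a sketch under stated assumptions: download_all and fake_fetch are hypothetical names, the example.com URLs are placeholders, and fake_fetch only pretends to download so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor


def download_all(urls, fetch, max_workers=4):
    """Run fetch(url) for every URL in parallel; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))


def fake_fetch(url):
    """Stand-in for a real downloader: returns the file name it would save."""
    return url.rsplit("/", 1)[-1]


urls = [
    "https://example.com/files/report.pdf",
    "https://example.com/files/photo.jpg",
    "https://example.com/files/archive.zip",
]
saved = download_all(urls, fake_fetch)
print(saved)
```

In a real script, you could pass a function that wraps the wget module's download call in place of fake_fetch, for example lambda url: download(url, bar=None), since progress bars from several threads would otherwise overwrite each other on the console.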

Conclusion

Throughout this tutorial, you’ve learned how to download files with the wget command and with Python’s wget module. You’ve gone from running basic wget commands to using the wget module in a Python script that downloads a file and reports its progress.

Now, how would you use Python Wget in your next project to download files automatically? Perhaps creating a scheduled download task?
