Imagine an early morning where you're sipping on a hot cup of fresh coffee, and your computer reads out the latest headlines to you—all on its own. Doesn’t that sound amazing?

Well, with Python, you can build your own personalized newsreader that reads out the top headlines, along with an excerpt for each one. By using the right libraries, you can have Python read the morning news aloud, so you don’t have to read every word yourself.

Here’s how you can write this code and tweak it to your favorite news website.

Pre-Requisites for Running the Code

Before you start writing the code, you need to meet a few prerequisites. These are basic requirements that make working with Python easier and more effective.

  1. Python: Install a recent version of Python 3, along with any Python IDE you are comfortable with.
  2. News website/internet access: Since the Python code reads the top headlines from your favorite website, you need to ensure you can access the website while running this code.

The code in this guide is written in Jupyter Notebook, a popular Python IDE, and the sample code uses India Today’s news website as its source.

To get Jupyter Notebook, you can either install it as part of the Anaconda distribution or download a standalone version on your system.

Download: Anaconda | Jupyter Notebook

Without further ado, let’s delve deeper into the code.

Writing the Code in Python

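To start, you need to import a few Python libraries, each serving a different purpose. The win32com.client module comes with the pywin32 package and bs4 comes with the beautifulsoup4 package; both can be installed with pip, while urllib.request is part of Python’s standard library.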

        import win32com.client as wincl
        from urllib.request import urlopen as ureq
        from bs4 import BeautifulSoup as soup

        sp = wincl.Dispatch("SAPI.spVoice")

Where:

  • win32com.client: This module lets Python talk to Windows COM components, such as the Windows speech engine.
  • urllib.request: This standard-library module opens URLs; its urlopen function is imported here under the alias ureq.
  • bs4: The bs4 library contains the BeautifulSoup class, which parses HTML so that you can scrape data from websites using Python.
  • sp = wincl.Dispatch("SAPI.spVoice"): Creates a Windows Speech API (SAPI) voice object, which Python uses to read text aloud.

This code will work on Windows only, as it calls the win32com.client library.
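Because the speech call is tied to Windows, it can help to isolate it behind a small helper. The sketch below is one possible approach, not part of the original code: make_speaker is a hypothetical name, and falling back to print on other systems is an assumption made purely for illustration.

```python
import sys


def make_speaker():
    """Hypothetical helper: return a function that speaks text on Windows
    and simply prints it everywhere else (an assumed fallback)."""
    if sys.platform == "win32":
        try:
            import win32com.client as wincl

            voice = wincl.Dispatch("SAPI.SpVoice")
            return voice.Speak
        except Exception:
            # pywin32 is missing or the speech engine is unavailable.
            pass
    return print


speak = make_speaker()
speak("Good morning. Here are today's headlines.")
```

With a helper like this, the rest of the script can call speak(text) without caring which platform it runs on.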


Next, you need to define the URL (link) of the website within the url variable, which is stored in Python’s memory.

        url = "https://www.indiatoday.in/top-stories"
    

Create a new variable, client, to store the response returned when Python opens the URL.

        client = ureq(url)
        print(client)

Where:

  • client: New variable that holds the response object for the opened URL.
  • ureq: Alias for the urlopen function imported from urllib.request, which opens the stored url.

Now that you have opened the URL, it is time to check whether the website in question accepts requests from a Python script. You can print the client variable and check the output.

There are two possibilities with the print command:

  • HTTPError: The website rejected the request (for example, with a 403 Forbidden status), so you can’t scrape its content this way.
  • Response object: If the output shows a response object, such as <http.client.HTTPResponse object at ...>, the request succeeded and you can go ahead and pull the headlines.
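The two outcomes above can also be handled in code rather than checked by eye. The following is a minimal sketch under stated assumptions: fetch_html and its opener parameter are hypothetical names introduced here, and the fake openers exist only so that the example runs without touching a real website.

```python
import io
from urllib.error import HTTPError, URLError
from urllib.request import urlopen as ureq


def fetch_html(url, opener=ureq):
    """Hypothetical helper: return the page's raw HTML, or None on failure."""
    try:
        # The with block closes the connection once the HTML is read.
        with opener(url) as client:
            return client.read()
    except HTTPError as err:
        print("The site refused the request:", err.code, err.reason)
    except URLError as err:
        print("Could not reach the site:", err.reason)
    return None


# Offline stand-ins for urlopen, used only to demonstrate both outcomes.
def fake_forbidden(url):
    raise HTTPError(url, 403, "Forbidden", hdrs=None, fp=io.BytesIO())


def fake_success(url):
    return io.BytesIO(b"<html><body>ok</body></html>")


print(fetch_html("https://example.com/", opener=fake_forbidden))
print(fetch_html("https://example.com/", opener=fake_success))
```

In real use you would call fetch_html(url) with no opener argument, so that it falls back to ureq.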

Once the website has responded to the request, it’s time to read the page’s HTML code into a variable.

        page_html = client.read()
        print(page_html)

Printing the HTML is an optional sanity check that confirms the page was downloaded. The output is the page’s raw HTML as a bytes object, which you can compare with the code shown under your browser’s Inspect option.

Before parsing the HTML, close the connection to the website by calling the close method.

        client.close()
    

Now that the HTML code is stored in a Python variable, you need to parse it into a Beautiful Soup object so that you can use the find and findAll methods to look for specific tags.

You can pass the following command to convert the HTML code:

        page_soup = soup(page_html, "html.parser")

Where:

  • page_soup: New variable that holds the parsed page.
  • soup: Alias for the BeautifulSoup class.
  • page_html: Variable which contains the HTML code from the website.
  • html.parser: The name of Python’s built-in HTML parser, which Beautiful Soup uses to parse the code.
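To see what the parsed object gives you, here is a small sketch that runs on a made-up HTML string instead of a live page. The sample markup is an assumption for illustration and only mimics the class names used in this guide. Note that find_all is the current spelling of the older findAll alias.

```python
from bs4 import BeautifulSoup as soup

# Made-up HTML standing in for a downloaded page (illustrative only).
sample_html = b"""
<html><body>
  <div class="view-content">
    <div class="catagory-listing"><h2>First headline</h2><p>First excerpt.</p></div>
    <div class="catagory-listing"><h2>Second headline</h2><p>Second excerpt.</p></div>
  </div>
</body></html>
"""

page_soup = soup(sample_html, "html.parser")

# find returns the first matching tag, or None if nothing matches.
container = page_soup.find("div", {"class": "view-content"})
missing = page_soup.find("div", {"class": "no-such-class"})

# find_all returns a list of every matching tag.
blocks = container.find_all("div", {"class": "catagory-listing"})

print(len(blocks))                # 2
print(blocks[0].find("h2").text)  # First headline
print(missing)                    # None
```

Because find can return None, it is worth checking its result before calling further methods on it.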

Once the HTML is parsed, it is time to examine the website’s HTML code and look for the tags that hold the headlines.

To do so, right-click anywhere on the website, and click on Inspect. This will open the HTML code for the website in question.


In the website’s code window, scroll until you locate the container tags that store the headlines.

On the India Today website, the headlines sit inside a div tag with the class view-content. Each news website uses different containers, but you should be able to navigate through the code with relative ease.

        articles = page_soup.find("div" , { "class" : "view-content" })
    

Finally, you need to capture the sub-tags, which contain the main headlines Python will be reading out to you.

        articles = articles.findAll("div" , {"class" : "catagory-listing"})
    

The view-content container is the outer shell for your headlines; it holds multiple headline blocks, each inside its own catagory-listing tag.

To capture the H2 tags and the snippets listed with each headline, you need to run a loop.

        i = 1
        for x in articles:
            title = x.find("h2").text
            para = x.find("p").text
            print(i, title, "\n", "\n", para, "\n", "\n")
            sp.Speak(title)
            sp.Speak(para)
            i = i + 1

Where:

  • i: New counter variable, starting at 1, which numbers the headlines.
  • title: New variable to save the headline (h2).
  • para: New variable to hold the paragraph associated with each h2.
  • print: The serial number, the headline, and the paragraph will be printed in the notebook output.
  • sp.Speak(title): Python will read out each stored title.
  • sp.Speak(para): Python will read out each stored paragraph snippet.
  • i = i+1: This statement increments the serial number by one after each headline is displayed.
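The loop above assumes every block contains both an h2 and a p tag; if one is missing, find returns None and the .text lookup fails. The sketch below shows one way to guard against that. It is an illustration rather than the original method: extract_headlines is a hypothetical helper, and the sample HTML is made up to mimic the class names used in this guide.

```python
from bs4 import BeautifulSoup as soup


def extract_headlines(page_html):
    """Hypothetical helper: return a list of (title, excerpt) pairs."""
    page_soup = soup(page_html, "html.parser")
    container = page_soup.find("div", {"class": "view-content"})
    if container is None:
        return []  # The page layout is not what we expected.

    headlines = []
    for block in container.find_all("div", {"class": "catagory-listing"}):
        title_tag = block.find("h2")
        para_tag = block.find("p")
        if title_tag is None:
            continue  # Skip blocks that have no headline.
        title = title_tag.get_text(strip=True)
        para = para_tag.get_text(strip=True) if para_tag else ""
        headlines.append((title, para))
    return headlines


# Made-up HTML for illustration; a real page would come from client.read().
sample = """
<div class="view-content">
  <div class="catagory-listing"><h2> Headline one </h2><p>Excerpt one.</p></div>
  <div class="catagory-listing"><h2>Headline two</h2></div>
  <div class="catagory-listing"><p>No headline here.</p></div>
</div>
"""

for number, (title, para) in enumerate(extract_headlines(sample), start=1):
    print(number, title, "\n", para)
```

Each title and excerpt returned this way can then be passed to sp.Speak, exactly as in the loop above.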

Using Python’s Beautiful Soup Module to Read Your Daily News

Every time you run the code, Python downloads fresh headlines from the news website before reading them out aloud, which keeps you up to date with the changes on the website.

Until you rerun the code, the notebook keeps showing the headlines from the previous run, so rerun it whenever you want the latest news.

Using Python to Read Out Your Daily Headlines Is Easy

Python, as an open-source language, offers a range of tools, such as Beautiful Soup and Selenium, to beginners and advanced users alike.

If you want to get your daily news delivered by voice, Python makes it easy. Learning the language can also strengthen your general programming skills.