Instagram is one of the most popular social media platforms, with more than a billion users. Everyone from students to celebrities has an Instagram account. The public data on Instagram can be of immense value to businesses, marketers, and individuals. Anyone can use this data for data analysis, targeted marketing, and generating insights.

You can use Python to build an automated tool that extracts Instagram data.

Installing Required Libraries

Instaloader is a Python library you can use to extract publicly available data from Instagram. With it, you can access data like images, videos, username, number of posts, followers count, following count, and bio. Note that Instaloader is not affiliated with, authorized, maintained, or endorsed by Instagram in any way.

To install instaloader via pip, run the following command:

        pip install instaloader
    

You must have pip installed on your system to install external Python libraries.

Next, you need to install Pandas, a Python library that's mainly used for data manipulation and data analysis. Run the following command to install it:

        pip install pandas
    

Now, you're ready to begin setting up the code and fetching the data out of Instagram.
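
Before going further, it can help to confirm that both libraries are actually importable from the Python interpreter you plan to use. The sketch below relies only on the standard library; `missing_packages` is a hypothetical helper name, not part of Instaloader or Pandas.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Check the two libraries this article relies on
missing = missing_packages(["instaloader", "pandas"])
if missing:
    print("Install these with pip first:", ", ".join(missing))
else:
    print("instaloader and pandas are both available.")
```

If the script reports a missing package even though pip said it was installed, pip and your interpreter are probably pointing at different Python installations.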

Setting Up Your Code

To set up the Instagram data fetching tool, you need to import the Instaloader Python library and create an instance of the Instaloader class. After that, you need to provide the Instagram handle of the profile from which you want to extract the data.

The Instagram Extractor Python code is available in a GitHub repository and is free for you to use under the MIT License.

        import instaloader

        # Creating an instance of the Instaloader class
        bot = instaloader.Instaloader()

        # Loading the profile from an Instagram handle
        profile = instaloader.Profile.from_username(bot.context, 'cristiano')
        print(profile)

This is a good first step to check that the basics work. You should see some meaningful data with no errors:

[Screenshot: output of the Instaloader setup script]
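
Handles typed by hand often arrive with a leading @ or stray capital letters, and a handle that doesn't exist makes `Profile.from_username()` raise an exception. The following is a minimal sketch of a guard around the lookup. It assumes handles are 1 to 30 letters, digits, periods, or underscores, which is an assumption rather than something Instaloader enforces, and `normalize_handle` and `load_profile` are hypothetical helper names.

```python
import re

# Assumption: a handle is 1-30 characters of letters, digits, periods, or underscores
HANDLE_PATTERN = re.compile(r"[A-Za-z0-9._]{1,30}")


def normalize_handle(raw):
    """Strip whitespace and a leading '@', lowercase, and validate the result."""
    handle = raw.strip().lstrip("@").lower()
    if not HANDLE_PATTERN.fullmatch(handle):
        raise ValueError(f"Not a plausible Instagram handle: {raw!r}")
    return handle


def load_profile(raw_handle):
    """Load a profile, returning None when the handle does not exist."""
    import instaloader  # imported here so normalize_handle works without it

    bot = instaloader.Instaloader()
    try:
        return instaloader.Profile.from_username(bot.context, normalize_handle(raw_handle))
    except instaloader.exceptions.ProfileNotExistsException:
        return None


print(normalize_handle("  @Cristiano "))
```

Calling `load_profile("@cristiano")` would then return either a profile object or `None`, so the rest of a script can branch on the result instead of crashing.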

Extracting Data From Profile

You can extract valuable publicly available data like username, number of posts, followers count, following count, bio, user ID, and external URL using Instaloader with just a few lines of code. You only need to provide the Instagram handle of the profile.

        import instaloader

        # Creating an instance of the Instaloader class
        bot = instaloader.Instaloader()

        # Loading a profile from an Instagram handle
        profile = instaloader.Profile.from_username(bot.context, 'leomessi')

        # Printing the publicly available profile fields
        print("Username: ", profile.username)
        print("User ID: ", profile.userid)
        print("Number of Posts: ", profile.mediacount)
        print("Followers Count: ", profile.followers)
        print("Following Count: ", profile.followees)
        print("Bio: ", profile.biography)
        print("External URL: ", profile.external_url)

You should see lots of profile information from the handle you specify:

[Screenshot: output of extracting information from Messi's Instagram profile]
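
Printing is fine for a quick look, but if you plan to compare several profiles it is usually easier to collect the same fields into a record first. The sketch below reads the attributes used in the script above and writes them as CSV with the standard library. `profile_to_record` and `records_to_csv` are hypothetical helper names, and the `SimpleNamespace` object is only a stand-in for a real Instaloader profile so the example runs offline.

```python
import csv
import io
from types import SimpleNamespace

# The profile attributes used earlier in this section
FIELDS = ["username", "userid", "mediacount", "followers",
          "followees", "biography", "external_url"]


def profile_to_record(profile):
    """Copy the selected attributes of a profile-like object into a dict."""
    return {field: getattr(profile, field) for field in FIELDS}


def records_to_csv(records):
    """Render a list of record dicts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Stand-in for an Instaloader profile (made-up values)
sample = SimpleNamespace(username="example_user", userid=1, mediacount=10,
                         followers=200, followees=50, biography="Hello",
                         external_url=None)
record = profile_to_record(sample)
print(records_to_csv([record]))
```

The same list of dicts can also be passed straight to `pd.DataFrame()` if you prefer to keep working in Pandas.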

Extracting Emails From Bio

You can extract email addresses from the Instagram bio of any profile using regular expressions. Import Python's re module and pass a regular expression that matches email addresses, along with the bio text, to the re.findall() function:

        import instaloader
        import re

        # Creating an instance of the Instaloader class
        bot = instaloader.Instaloader()

        # Loading a profile from an Instagram handle
        profile = instaloader.Profile.from_username(bot.context, "wealth")
        print("Username: ", profile.username)
        print("Bio: ", profile.biography)

        # Finding every email-like string in the bio
        emails = re.findall(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b", profile.biography)
        print("Emails extracted from the bio:")
        print(emails)

The script will print anything it recognizes as an email address in the bio:

[Screenshot: output of extracting emails from an Instagram bio]
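
You can try the pattern on plain strings before pointing it at real profiles. The following sketch wraps the search in a hypothetical `extract_emails` helper that also removes duplicates that differ only by letter case. It writes the top-level-domain class as `[A-Za-z]`, because a `|` inside square brackets is matched as a literal pipe character rather than acting as "or". The sample bio is made up.

```python
import re

EMAIL_PATTERN = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")


def extract_emails(text):
    """Return unique email-like strings in order of first appearance."""
    seen = {}
    for match in EMAIL_PATTERN.findall(text or ""):
        # Compare case-insensitively, but keep the first spelling that appeared
        seen.setdefault(match.lower(), match)
    return list(seen.values())


sample_bio = "Bookings: Team@example.com | press: press@example.co.uk | team@example.com"
print(extract_emails(sample_bio))
```

A pattern like this only finds strings that look like email addresses. It can't tell you whether an address actually exists.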

Extracting Top Search Results Data

When you search for anything on Instagram, you get several results including usernames and hashtags. You can extract the top search results using the get_profiles() and get_hashtags() methods. You only need to pass the search query when creating an instaloader.TopSearchResults object. You can then iterate over the individual results to print or store them.

        import instaloader

        # Creating an instance of the Instaloader class
        bot = instaloader.Instaloader()

        # Provide the search query here
        search_results = instaloader.TopSearchResults(bot.context, 'music')

        # Iterating over the extracted usernames
        for username in search_results.get_profiles():
            print(username)

        # Iterating over the extracted hashtags
        for hashtag in search_results.get_hashtags():
            print(hashtag)

The output will include any matching usernames and hashtags:

[Screenshot: output of extracting top results from an Instagram search]
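
If you want to store the results rather than print them, it helps to cap how many you read and to keep only the names. The sketch below assumes the objects yielded by get_profiles() expose a `username` attribute and those yielded by get_hashtags() expose a `name` attribute. `summarize_search` is a hypothetical helper, and the generators are stand-ins so the example runs without contacting Instagram.

```python
from itertools import islice
from types import SimpleNamespace


def summarize_search(profiles, hashtags, limit=5):
    """Collect at most `limit` usernames and hashtag names from two iterables."""
    return {
        "profiles": [profile.username for profile in islice(profiles, limit)],
        "hashtags": [hashtag.name for hashtag in islice(hashtags, limit)],
    }


# Stand-ins for search_results.get_profiles() and search_results.get_hashtags()
fake_profiles = (SimpleNamespace(username=f"user{i}") for i in range(100))
fake_hashtags = (SimpleNamespace(name=f"tag{i}") for i in range(3))

summary = summarize_search(fake_profiles, fake_hashtags, limit=2)
print(summary)
```

With a real search, you would pass `search_results.get_profiles()` and `search_results.get_hashtags()` in place of the stand-in generators.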

Extracting Followers & Followings of an Account

You can extract the followers of an account, and those that it follows itself, using Instaloader. You'll need to provide an Instagram username and password to retrieve this data.

Never use your personal account to extract data from Instagram, as it may get the account temporarily or permanently banned.

After creating an instance of the Instaloader class, you need to provide your username and password. This is so that the bot can log in to Instagram using your account and fetch the followers and followings data.
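
Typing a password directly into a script makes it easy to leak by accident, for example when sharing the file. One common alternative is to read the credentials from environment variables. The sketch below uses the variable names `INSTAGRAM_USERNAME` and `INSTAGRAM_PASSWORD` and a helper called `read_credentials`. All three are hypothetical choices for this example, not something Instaloader defines.

```python
import os


def read_credentials(env=os.environ):
    """Return (username, password) from `env`, or raise if either is missing."""
    try:
        return env["INSTAGRAM_USERNAME"], env["INSTAGRAM_PASSWORD"]
    except KeyError as missing:
        raise RuntimeError(
            f"Set the {missing.args[0]} environment variable first"
        ) from None


# Demonstration with an explicit mapping instead of the real environment
username, password = read_credentials({
    "INSTAGRAM_USERNAME": "throwaway_account",
    "INSTAGRAM_PASSWORD": "not-a-real-password",
})
print("Would log in as:", username)

# In a real script: bot.login(user=username, passwd=password)
```

Calling `read_credentials()` with no argument reads from the actual environment of the running process.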

Next, you need to provide the Instagram handle of the target profile. The get_followers() and get_followees() methods extract the followers and followees. You can get the followers' and followees' usernames using the follower.username and followee.username properties respectively.

If you want to store the results in a CSV file, you first need to convert the data into a Pandas DataFrame object. Pass the list to the pd.DataFrame() constructor to convert it into a DataFrame.

Finally, you can export the DataFrame object to a CSV file using the to_csv() method. Pass the output filename, such as followers.csv, as a parameter to this method to get the exported data in CSV format.

Only an account's owner can see its complete lists of followers and followings. You will not be able to extract all of the followers and followings data for other accounts using this or any other method.

        # Importing Libraries
        import instaloader
        import pandas as pd

        # Creating an instance of the Instaloader class
        bot = instaloader.Instaloader()
        bot.login(user="Your_username", passwd="Your_password")

        # Loading a profile from an Instagram handle
        profile = instaloader.Profile.from_username(bot.context, 'Your_target_account_insta_handle')

        # Retrieving the usernames of all followers
        followers = [follower.username for follower in profile.get_followers()]

        # Converting the data to a DataFrame with a named column
        followers_df = pd.DataFrame(followers, columns=['username'])

        # Storing the results in a CSV file
        followers_df.to_csv('followers.csv', index=False)

        # Retrieving the usernames of all followings
        followings = [followee.username for followee in profile.get_followees()]

        # Converting the data to a DataFrame with a named column
        followings_df = pd.DataFrame(followings, columns=['username'])

        # Storing the results in a CSV file
        followings_df.to_csv('followings.csv', index=False)

Download Posts From an Instagram Account

Again, to download posts from an account, you'll need to provide a username and password so that the bot can log in to Instagram using your account. You can retrieve all of the posts' data using the get_posts() method, then iterate over the results and download each post using the download_post() method.

        # Importing Libraries
        import instaloader

        # Creating an instance of the Instaloader class
        bot = instaloader.Instaloader()
        bot.login(user="Your_username", passwd="Your_password")

        # Loading a profile from an Instagram handle
        profile = instaloader.Profile.from_username(bot.context, 'Your_target_account_insta_handle')

        # Retrieving all posts in an object
        posts = profile.get_posts()

        # Iterating and downloading all the individual posts
        for index, post in enumerate(posts, 1):
            bot.download_post(post, target=f"{profile.username}_{index}")
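
Downloading every post from a large account can take a long time, so you may want only the recent ones. A minimal sketch of that idea follows. It assumes the posts arrive newest first and that each post exposes a `date_utc` timestamp, as Instaloader's post objects do. Pinned posts can break the newest-first assumption. `posts_since` is a hypothetical helper, and the fake posts are stand-ins so the example runs offline.

```python
from datetime import datetime, timedelta
from itertools import takewhile
from types import SimpleNamespace


def posts_since(posts, cutoff):
    """Yield posts newer than `cutoff`, stopping at the first older post.

    Assumes `posts` is ordered newest first.
    """
    return takewhile(lambda post: post.date_utc > cutoff, posts)


# Stand-ins for profile.get_posts(): one post per day, newest first
now = datetime(2024, 1, 10)
fake_posts = [
    SimpleNamespace(shortcode=f"post{i}", date_utc=now - timedelta(days=i))
    for i in range(10)
]

recent = list(posts_since(fake_posts, cutoff=now - timedelta(days=3)))
print([post.shortcode for post in recent])
```

In the download loop, you would iterate over `posts_since(profile.get_posts(), cutoff)` instead of over every post.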

Scrape the Web Using Python

Data scraping, or web scraping, is one of the most common ways to extract useful information from the web. You can use the data you extract for marketing, content creation, or decision-making.

Python is one of the most popular languages for data scraping. Libraries like BeautifulSoup and Scrapy simplify data extraction, while Pandas helps with analyzing the results.