Every so often, online articles don’t display the date they were first published. Sometimes though, you need to cite that content or verify how recent it is. In these cases, this is how you can locate that elusive date.
Many bloggers remove the publication date from their content because when readers see an article they know was published a while ago, they subconsciously assume it’s outdated. Even if it’s not. By removing the date, the content is always able to pass as new.
Yet sometimes we need to know the publication date, even if only roughly. We may need to reference the content in our own work. Or we may simply want to check that the content is still relevant.
Below are a few simple methods that will help you to figure out when that content was born. Unfortunately, there’s no way to guarantee perfection here. Often, you may only be able to find the date the page was last modified, but at least it’s getting you somewhere.
Look at the URL
Even if the publication date has been removed from the article itself, many websites will still reference the date in the URL.
This will by no means work for all websites, but it should be your first port of call.
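If you have a long list of URLs to check, a short script can do the looking for you. The sketch below is only an illustration: it assumes the site uses a /YYYY/MM/DD/ or /YYYY/MM/ pattern in its paths, and the function name and example.com address are made up for the example.

```python
import re
from datetime import date
from typing import Optional

# Matches /YYYY/MM/DD or /YYYY/MM path segments; the day part is optional.
DATE_IN_PATH = re.compile(
    r"/((?:19|20)\d{2})/(0[1-9]|1[0-2])(?:/(0[1-9]|[12]\d|3[01]))?(?=/|$)"
)


def date_from_url(url: str) -> Optional[date]:
    """Return the date embedded in a URL path, or None if none is found."""
    match = DATE_IN_PATH.search(url)
    if match is None:
        return None
    year, month, day = match.groups()
    try:
        # If the URL only carries a year and month, fall back to the 1st.
        return date(int(year), int(month), int(day or 1))
    except ValueError:
        # Impossible calendar dates such as 2015/02/31 are rejected.
        return None


# Hypothetical usage:
# date_from_url("https://example.com/2015/10/27/some-post/")
```

Sites that use other patterns (for example, a date glued into the slug) would need a different expression.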
Check the Sitemap
A sitemap is an XML file that lists a website’s URLs along with metadata for each one. Although there’s no single guaranteed place to find this file, there are three methods worth trying. The first is to add “/sitemap.xml” to the end of the site’s root address (for example, example.com/sitemap.xml).
If this doesn’t work, scroll to the bottom of the site, and see if there is a link to the “Sitemap”.
If you still can’t find it, type “site:example.com filetype:xml” into Google. This will only show .xml files for that domain. See if you can locate the sitemap.
If you find the sitemap, search the page for the specific URL you’re questioning, and you should find a date next to it, usually inside a “lastmod” tag. This is nearly always the date when the page was last modified, not the date it was first published.
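Large sitemaps are tedious to search by hand, so here is a rough sketch of the same lookup in code. It assumes the sitemap follows the sitemaps.org format, where each entry has a "loc" and an optional "lastmod" element. The sample XML, the URLs, and the function name are invented for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# XML namespace used by sitemaps that follow the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# A made-up sitemap with a single entry, standing in for a downloaded file.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/some-post/</loc>
    <lastmod>2015-10-27</lastmod>
  </url>
</urlset>"""


def lastmod_for(sitemap_xml: str, page_url: str) -> Optional[str]:
    """Return the <lastmod> text for page_url, or None if it is not listed."""
    root = ET.fromstring(sitemap_xml)
    for entry in root.findall("sm:url", NS):
        loc = entry.findtext("sm:loc", default="", namespaces=NS).strip()
        if loc == page_url:
            lastmod = entry.findtext("sm:lastmod", default=None, namespaces=NS)
            return lastmod.strip() if lastmod else None
    return None
```

Calling lastmod_for(SAMPLE_SITEMAP, "https://example.com/some-post/") returns the text of that entry's "lastmod" element. Some sites split their sitemap into an index that points at several smaller files, which this sketch does not follow.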
Turn to Google
In Google, type “inurl:” followed by the URL of the article in question and hit search. Just below the title, and before the excerpt, the original date will sometimes be displayed.
This will only happen if Google can easily figure out the date of publication based on the HTML of the website in question. If the date isn’t displayed on the Google results page, add “&as_qdr=y15” to the end of that Google search URL in your browser’s address bar and reload the page. This limits the results to the past 15 years, which prompts Google to show a date.
You should now see a date displayed for that page. This date isn’t guaranteed to be the publication date. It’s usually the date that Google last noticed an update to that page. But for static articles and blog posts, this date is usually pretty reliable. For pages that are regularly updated, you may need to do a little more digging.
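If you’d rather not edit the address bar by hand, the search URL can be assembled in a couple of lines. This is a minimal sketch that only builds the link described above. The function name and the example.com address are placeholders, and it makes no attempt to fetch or read Google’s results.

```python
from urllib.parse import urlencode


def google_date_query(article_url: str) -> str:
    """Build a Google search URL using the inurl: operator and as_qdr=y15."""
    params = {
        "q": f"inurl:{article_url}",  # restrict results to this URL
        "as_qdr": "y15",              # the parameter mentioned in the text
    }
    return "https://www.google.com/search?" + urlencode(params)


# Hypothetical usage: paste the printed link into your browser.
# print(google_date_query("https://example.com/some-post/"))
```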
Check the Comments
If you’re dealing with content from a popular source, the comments will usually have started on the day of publication, or thereabouts.
Scroll to the bottom of the page, and find the oldest comment. This will give you a good gauge for when that article first went live.
If the date a comment was left is displayed like “438 days ago”, you can quickly find the exact date by typing “438 days ago” into Wolfram Alpha. This is just one of the everyday uses of Wolfram Alpha.
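The same arithmetic is easy to do yourself. Here is a small sketch that turns a “days ago” label into a calendar date. The function name is invented, and it assumes the label is counted from the day you are viewing the page.

```python
import re
from datetime import date, timedelta
from typing import Optional


def date_from_days_ago(label: str, today: Optional[date] = None) -> date:
    """Convert a label such as '438 days ago' into a calendar date."""
    match = re.search(r"(\d+)\s+days?\s+ago", label, re.IGNORECASE)
    if match is None:
        raise ValueError(f"no 'N days ago' pattern found in {label!r}")
    # Default to the current date; pass `today` to pin the reference day.
    reference = today or date.today()
    return reference - timedelta(days=int(match.group(1)))


# Hypothetical usage:
# date_from_days_ago("438 days ago")
```

Bear in mind that some sites cache their pages, so the label itself may already be a few days stale.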
Check the Images
The URLs of images uploaded to a website will often include the upload date, typically as a year and month in the file path. The date displayed is reliable if that specific image was uploaded for that article. Although the date of upload is not the same as the date published, it gives a clear sign of the rough time period when the article was written.
For example, the URL of the image below shows the image was uploaded in October 2015, though it doesn’t offer any more specificity.
If the image is hosted off the website, or is simply being linked to from a central image “library” that the website houses, the date displayed will be inaccurate. Using this technique alongside others mentioned is a good way to double-check your dates, though.
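To check every image on a page at once, you could scan the page’s HTML for image addresses that contain a year and month. The sketch below assumes a /YYYY/MM/ folder pattern in the image path (WordPress-style upload folders are one common case). The class name, function name, and sample addresses are all made up for illustration.

```python
import re
from html.parser import HTMLParser
from typing import List, Tuple
from urllib.parse import urlsplit

# Looks for a /YYYY/MM/ folder pair anywhere in the image path.
YEAR_MONTH = re.compile(r"/((?:19|20)\d{2})/(0[1-9]|1[0-2])/")


class ImageSources(HTMLParser):
    """Collect the src attribute of every <img> tag in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.sources: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def upload_months(html: str) -> List[Tuple[str, int, int]]:
    """Return (image URL, year, month) for images with a dated path."""
    parser = ImageSources()
    parser.feed(html)
    parser.close()
    dated = []
    for src in parser.sources:
        match = YEAR_MONTH.search(urlsplit(src).path)
        if match:
            dated.append((src, int(match.group(1)), int(match.group(2))))
    return dated
```

If several images on the page share the same month, that is a stronger hint than a single match, for the reasons given above.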
Use The Wayback Machine
The Internet Archive’s Wayback Machine tells you how many times the archive has saved a specific page, and between which dates. Often, you can even see what that page looked like at specific points in time. This means you can prove that the quote or data you’re referencing was actually there on that date. The example below shows that the Wayback Machine first archived this particular article on December 24, 2014. In fact, the article went live on December 23. Not everything can be perfect.
The earliest date displayed on Wayback will be an indicator as to when, roughly, that content was published. This means that if someone insists that an alternative news story is only two weeks old, you can show them that the same story was archived last year.
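The earliest capture can also be looked up with a script. The sketch below assumes the Internet Archive’s public CDX search endpoint at web.archive.org/cdx/search/cdx and its url, output, fl, and limit parameters. Treat those details as my understanding of that service and check them against the Internet Archive’s documentation before relying on them. The function names are invented for the example.

```python
import json
from datetime import datetime
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Wayback Machine's CDX search service.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(page_url: str) -> str:
    """Build a query asking for the timestamp of the first listed capture."""
    params = {"url": page_url, "output": "json", "fl": "timestamp", "limit": "1"}
    return CDX_ENDPOINT + "?" + urlencode(params)


def parse_first_capture(body: str) -> Optional[datetime]:
    """Parse a JSON response whose first row is a header row."""
    rows = json.loads(body) if body.strip() else []
    if len(rows) < 2:
        return None  # header only, or nothing archived
    # Capture timestamps are written as YYYYMMDDhhmmss.
    return datetime.strptime(rows[1][0], "%Y%m%d%H%M%S")


def first_capture(page_url: str) -> Optional[datetime]:
    """Fetch and parse the first capture (needs network access)."""
    with urlopen(cdx_query_url(page_url), timeout=30) as response:
        return parse_first_capture(response.read().decode("utf-8"))


# Hypothetical usage:
# first_capture("https://example.com/some-post/")
```

As with the manual approach, the first capture only tells you the page existed by that date. The article itself may have gone live a little earlier.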
Don’t Be Left in the Dark Any Longer
You should now be able to find, beyond most reasonable doubt, the rough publication date of an article, even if the publication wants to keep quiet about it.
Are there any other methods that you use for finding hidden dates of content? Have you ever struggled to find the publication date of a piece of content?