Welcome to Part 2 of Finding Web Page Publish Dates when they’re not displayed on a page. Why would you care? Well, you have your reasons. Please see Part 1 of this topic to get a sense of why we’re bothering to look at this stuff.
In any case, continuing then…
- See if images have a date stamp.
- Click on an image or right click and open in a new window. See if the URL has a date stamp on it.
- A more extreme option might be to look at image info to see if there’s EXIF data in the image with a date. This doesn’t necessarily tell you much as the image could have been taken any time. Maybe it’s years old stock photography.
- Try Google’s Structured Data Tool.
- The tool is to help web site creators validate data within their pages. But it can also be used for discovery.
First you enter the URL. We’ll use one with a known date for this example.
You’ll get a results page.
You’ll find several detail sections on the right. One of them (in this case “CreativeWork”) has the publish date.
Now, this won’t always work, but here’s an example of a guy who wrote an article on why you shouldn’t publish dates on blog posts, except maybe in some cases.
It turns out that even though the author is arguing against posting dates, he did put the date into the metadata of the post. He might not even have realized he was doing it; it’s possible he’s using a WordPress plugin that adds Open Graph style metadata automatically. So the date may not be visible on the page, and its absence from a search results page may fool some users, but it is discoverable.
The other way to see this is to look at the page source, that is, the code that makes up the web page. In Google Chrome, for example, you can go to the Developer Tools section; you can probably just right click on the page and select “Inspect Element.” A lot of tech stuff will come up. Choose the “Elements” tab if it’s not already selected, then use Ctrl-F or Command-F to get to the search box and type “date” to search for dates in the code. Of course, in this particular article (which is about dates), this could take a while! This time, though, I knew I was actually looking for “published_time,” so I found it pretty quickly.
The point is, if the date is this important to you for whatever reasons, it may be discoverable in the code.
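If you’d rather not search the source by hand, the same lookup can be scripted. Here’s a minimal sketch in Python, using only the standard library, that scans a page’s meta tags for date-looking entries such as “published_time.” The list of keys, the helper names, and the sample markup are my own assumptions for illustration; real pages vary a lot, and this won’t see dates that live in JSON-LD scripts or get injected later by JavaScript.

```python
from html.parser import HTMLParser

# Keys that commonly carry a date. This list is an assumption for the
# sketch, not an exhaustive or official set.
DATE_KEYS = {
    "article:published_time",
    "article:modified_time",
    "og:updated_time",
    "datepublished",
    "datemodified",
    "date",
}


class MetaDateParser(HTMLParser):
    """Collects (key, content) pairs from <meta> tags that look date-related."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        # Open Graph uses property=, plain meta tags use name=,
        # and microdata uses itemprop=.
        key = attrs.get("property") or attrs.get("name") or attrs.get("itemprop")
        content = attrs.get("content")
        if key and content and key.lower() in DATE_KEYS:
            self.found.append((key, content))


def find_meta_dates(html):
    """Return every date-like meta entry found in an HTML string."""
    parser = MetaDateParser()
    parser.feed(html)
    return parser.found


# Made-up sample markup, for illustration only.
sample_html = """
<html><head>
<meta property="og:title" content="Why you should not date blog posts">
<meta property="article:published_time" content="2014-06-03T09:15:00+00:00">
</head><body><p>No visible date here.</p></body></html>
"""

print(find_meta_dates(sample_html))
```

To try it on a real page, you would fetch the HTML first (for example with urllib.request) and pass the text to find_meta_dates.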
It’s possible none of these methods will work for certain types of pages. That is, not all “pages” online are actually pages at all. There are a variety of techniques for updating content that inject information into what a user might think of as a page. Typically, a user thinks of a page as that thing in their URL address bar. Using AJAX or Angular or any number of other techniques can of course make that not entirely true. The point here isn’t about technology. It’s about what you provide to customers at the canonical address of indexed content you intend to be found over time, however the page is defined.
One other method: The Sitemap
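For SEO purposes, a site may have a sitemap.xml file at its root. If you go to http://DOMAIN_NAME.com/sitemap.xml, you may see a list of the site’s pages, sometimes with a date for each one. Keep in mind that the date field in the sitemap format, “lastmod,” is optional and records when a page was last modified, which is not necessarily when it was first published.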
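A sitemap can run to thousands of entries, so it may be easier to read it with a few lines of code. This is a rough sketch in Python, assuming the file follows the standard sitemaps.org format; the function name and the sample entries are made up for illustration. It pulls out each page address along with its “lastmod” value, if one is present.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_dates(xml_text):
    """Return {page URL: lastmod string or None} for each <url> entry."""
    root = ET.fromstring(xml_text)
    result = {}
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=NS)
        # lastmod is optional, so a missing value is recorded as None.
        result[loc] = lastmod.strip() if lastmod else None
    return result


# Made-up sample sitemap, for illustration only.
sample_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://DOMAIN_NAME.com/first-post</loc>
    <lastmod>2015-02-11</lastmod>
  </url>
  <url>
    <loc>http://DOMAIN_NAME.com/about</loc>
  </url>
</urlset>"""

for page, modified in sitemap_dates(sample_sitemap).items():
    print(page, modified)
```

Large sites often split their sitemap into several files listed in a sitemap index; this sketch only handles a single plain sitemap.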
Sites that Can Help
- Time Travel
“Time Travel helps you find and view versions of web pages that existed at some time in the past.”
- Wayback Machine at Internet Archive
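The Wayback Machine can also be queried from code through the Internet Archive’s availability API, which returns the archived capture closest to a date you supply. The sketch below, in Python, asks for the capture nearest an early date as a rough stand-in for “first seen.” The function names, the default date, and the sample response are my own for illustration, and the closest capture is not guaranteed to be the very first one. A capture date only tells you the page existed by then, not exactly when it was published.

```python
import json
from datetime import datetime
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://archive.org/wayback/available"


def build_query(page_url, timestamp="19960101"):
    """Build a query asking for the capture closest to `timestamp` (YYYYMMDD)."""
    return API + "?" + urlencode({"url": page_url, "timestamp": timestamp})


def parse_snapshot(payload):
    """Return (capture datetime, snapshot URL), or None if nothing is archived."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    # Wayback timestamps are formatted as YYYYMMDDhhmmss.
    captured = datetime.strptime(closest["timestamp"], "%Y%m%d%H%M%S")
    return captured, closest["url"]


def earliest_capture(page_url):
    """Makes a network call; not exercised by the sample below."""
    with urlopen(build_query(page_url), timeout=10) as resp:
        return parse_snapshot(json.load(resp))


# Made-up response in the shape the API returns, for illustration only.
sample_payload = {
    "url": "example.com",
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20060101000000/http://example.com/",
            "timestamp": "20060101000000",
            "status": "200",
        }
    },
}

print(build_query("example.com"))
print(parse_snapshot(sample_payload))
```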
Any More Techniques?
There is one utterly low tech way to find dates for important information, and that’s just to seek an alternative source. One typical reason the date matters to you at all is that there’s some statistical data or other claim in an article that’s pertinent to a decision you’re trying to make.
So the simple questions become:
- Can you find the same information elsewhere, given that you perhaps now have some good keywords you can use to seek out this information?
- Is the information being provided original research or is it sourced from elsewhere? If from elsewhere, can you find that report or the company that made it? Perhaps the name of the analyst that generated the info in the first place?
These are all ways of asking essentially the same thing: did the original document that sparked your interest offer enough clues for you to seek out alternative sources? You will still need to ask yourself if the sources are reliable, but that’s another question. Often, you’ll find 5, 10, or 100 articles on the web, with all manner of opinions, that are really referencing a single piece of primary research offered by one company, and perhaps not backed up by other sources. This doesn’t make the info wrong, of course. But it should be cause for pause and reflection.