Determining age of information on web
I do a lot of internet research for my job. Is there a way to determine the date a web page was created or last updated?
You used to be able to right-click on the page and look at Properties, and it would give you that info. Now it always says today's date.
Re: Determining age of information on web
Have you guys heard about this Lorena Bobbitt thing?
Re: Determining age of information on web
Wait. The Lorena Bobbitt comment didn't answer your question?
Re: Determining age of information on web
Quote:
Originally Posted by
intrepid27
I do a lot of internet research for my job. Is there a way to determine the date a web page was created or last updated?
You used to be able to right-click on the page and look at Properties, and it would give you that info. Now it always says today's date.
I think it depends on the web page. For example, if the page is generated by a script or other server-side code, it is built fresh on every request, which is most likely why you're seeing today's date.
Not that everyone does it, but a page built with SEO in mind might have a date metadata tag in its HTML.
Other than that, it's hit or miss. If these don't exist and there's no date in the actual content of the page, then you might be out of luck.
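To make the metadata idea concrete, here is a minimal Python sketch that scans a page's HTML for date-looking meta tags. The tag names in `DATE_META_KEYS` are an assumption: they are names commonly seen in the wild, not something this thread specifies, and a given site may use different ones or none at all. `find_date_meta` and `sample` are hypothetical names used only for illustration.

```python
from html.parser import HTMLParser

# Assumed, commonly seen names; sites are free to use others or none.
DATE_META_KEYS = {
    "article:published_time",
    "article:modified_time",
    "og:updated_time",
    "date",
    "last-modified",
    "dc.date",
    "dcterms.modified",
}


class DateMetaFinder(HTMLParser):
    """Collects <meta> tags whose name/property/http-equiv looks date-related."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser lowercases tag and attribute names; values may be None.
        attr = {name: (value or "") for name, value in attrs}
        key = (
            attr.get("name") or attr.get("property") or attr.get("http-equiv") or ""
        ).lower()
        if key in DATE_META_KEYS and attr.get("content"):
            self.found[key] = attr["content"]


def find_date_meta(html_text):
    """Return a dict of date-related meta keys to their content strings."""
    finder = DateMetaFinder()
    finder.feed(html_text)
    return finder.found


# Made-up sample page, for illustration only.
sample = """<html><head>
<meta property="article:published_time" content="2011-03-04T10:00:00Z">
<meta name="description" content="not a date">
</head><body>Hello</body></html>"""

print(find_date_meta(sample))
```

An empty result does not mean the page is new or old; it only means the author did not publish a date in any of the assumed tags.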
Re: Determining age of information on web
Thanks, how do you find the metadata tag thingamabobber?
Re: Determining age of information on web
Quote:
Originally Posted by
intrepid27
Thanks, how do you find the metadata tag thingamabobber?
Well, the caveat is that not all pages will have them, especially if the website isn't professionally done (although a site doesn't have to be professionally done to have them).
You need to view the HTML "source" of the page (i.e. "View Source"). Then I'd do a search for "
Re: Determining age of information on web
Try http://www.archive.org/. It's dark today, though. You can see web sites from when they first were created.
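For anyone who wants to script the archive.org lookup, the Wayback Machine has a public availability endpoint that returns the capture closest to a given timestamp. The sketch below is only a rough illustration: asking for a very early timestamp (the default `19960101` is an assumption on my part) usually returns one of the earliest captures, but that is not guaranteed, and the first capture only tells you the page existed by that date, not when it was created. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp="19960101"):
    """Build the query URL; timestamp is YYYYMMDD (longer forms also work)."""
    return API + "?" + urlencode({"url": page_url, "timestamp": timestamp})


def closest_snapshot(response_text):
    """Parse the JSON reply; return (timestamp, snapshot_url) or None."""
    data = json.loads(response_text)
    snap = data.get("archived_snapshots", {}).get("closest")
    if not snap or not snap.get("available"):
        return None
    return snap["timestamp"], snap["url"]


def lookup(page_url):
    """Network call: ask the archive for the capture nearest the early timestamp."""
    with urlopen(availability_url(page_url), timeout=10) as resp:
        return closest_snapshot(resp.read().decode("utf-8"))


# Offline illustration with a made-up reply in the shape the endpoint returns.
fake_reply = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20060101064348/http://example.com/",
            "timestamp": "20060101064348",
            "status": "200",
        }
    }
})
print(closest_snapshot(fake_reply))
```

Calling `lookup("example.com")` would perform the real request; pages that were never archived come back as `None`.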
Re: Determining age of information on web
There is no reliable way to determine the publication date of a web page.
The best methods are the ones used by the big search engines, Google and Bing. Since they are constantly spidering the web, they keep track of when a page changes by comparing their previously cached copy with what is currently there. From that, you can get an approximate date of publication. There are two caveats: 1) The vast majority of content on the web is spidered very infrequently, with good reason: it never changes (for example, government pages or pages with essentially no traffic). 2) It is easy to game this system, and plenty of low-brow news web sites do this.
To see this in use, do a Google search, then choose "in last hour" or "in last 24 hours" under the "any time" header on the left side of the results.
If I remember correctly, the Bing 2.2 Web Service API has a field that tells you when they last detected a change on a web page. It is returned as part of the search results, so you can probably pass the URL of the document you want to date as the search phrase and get the spidered date.
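The cache-comparison idea described above can be sketched in a few lines of Python. This is not how Google or Bing actually implement it, which this thread does not describe; it is a toy model, with the hypothetical `ChangeTracker` class standing in for a crawler's cache. It remembers a hash of each page and records the crawl time at which the hash last differed.

```python
import hashlib
from datetime import datetime, timezone


class ChangeTracker:
    """Toy crawler cache: remembers when each URL's content last changed."""

    def __init__(self):
        self._seen = {}  # url -> (content digest, crawl time of last change)

    def observe(self, url, content, crawled_at):
        """Record one crawl and return the estimated last-changed time."""
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        previous = self._seen.get(url)
        if previous is None or previous[0] != digest:
            # First visit, or content differs from the cached copy.
            self._seen[url] = (digest, crawled_at)
        return self._seen[url][1]


tracker = ChangeTracker()
jan = datetime(2011, 1, 1, tzinfo=timezone.utc)
feb = datetime(2011, 2, 1, tzinfo=timezone.utc)
mar = datetime(2011, 3, 1, tzinfo=timezone.utc)

first = tracker.observe("http://example.com/", "hello", jan)
unchanged = tracker.observe("http://example.com/", "hello", feb)
changed = tracker.observe("http://example.com/", "hello, world", mar)
print(first, unchanged, changed)
```

The sketch also shows both caveats: the estimate can only be as precise as the gap between crawls, and any trivial edit to the page resets the date.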
Re: Determining age of information on web
Quote:
Originally Posted by
ce1
There is no reliable way to determine the publication date of a web page.
The best methods are the ones used by the big search engines, Google and Bing. Since they are constantly spidering the web, they keep track of when a page changes by comparing their previously cached copy with what is currently there. From that, you can get an approximate date of publication. There are two caveats: 1) The vast majority of content on the web is spidered very infrequently, with good reason: it never changes (for example, government pages or pages with essentially no traffic). 2) It is easy to game this system, and plenty of low-brow news web sites do this.
To see this in use, do a Google search, then choose "in last hour" or "in last 24 hours" under the "any time" header on the left side of the results.
If I remember correctly, the Bing 2.2 Web Service API has a field that tells you when they last detected a change on a web page. It is returned as part of the search results, so you can probably pass the URL of the document you want to date as the search phrase and get the spidered date.
Extremely unreliable, not to mention that a search engine might not see information loaded via AJAX requests either...
You can get lucky with the last-modified dates in the source, but that doesn't mean all the information on the page is from that date. It could be one sentence that got updated, while the rest of the content has been there for 5 years.
Very hit and miss.
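Besides dates in the page source, a server may also send a `Last-Modified` HTTP response header, which can be read without opening the page at all. The Python sketch below is a minimal illustration under that assumption; the function names are hypothetical. Many dynamically generated pages omit the header or report the time of the request, which matches the "today's date" problem from the original question.

```python
from email.utils import parsedate_to_datetime
from urllib.request import Request, urlopen


def parse_last_modified(header_value):
    """Turn an HTTP date string into a datetime, or None if missing/invalid."""
    if not header_value:
        return None
    try:
        return parsedate_to_datetime(header_value)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError, newer ones ValueError.
        return None


def fetch_last_modified(url):
    """Network call: send a HEAD request and parse the Last-Modified header."""
    req = Request(url, method="HEAD")
    with urlopen(req, timeout=10) as resp:
        return parse_last_modified(resp.headers.get("Last-Modified"))


# Offline illustration with a header value in the standard HTTP date format.
result = parse_last_modified("Wed, 21 Oct 2015 07:28:00 GMT")
print(result)
```

Even when the header is present, it describes the file as a whole, so the same warning applies: one changed sentence is enough to move the date.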