html_encoded1.html - 2 KB
html_encoded2.html - 45 KB
html_encoded3.html - 99 KB
If you view the source of the pages, you’ll see something like this at the end:
<!-- BEGIN_FILERECOVERY
chunks = 4
filename = xor.o
recover = 2
orig_size = 1105
block_size = 554
block_num = 3
fY/xaGQn0V5MOOpLnM1WIsIUMirrVBQ2XNhidvc5yjL9tEyKTmNjNPjcrJzcPWvs
INxxHl1Gt5lKQAYoNi1DXOhFI5ExBm15Nxx1T/hFCwVvsyaHsQQdd3lcqWJl+WTw
BTlkiI8yWcPPoy38dqgTVnc4aSNd+0YQWW0bDl67/6XTnych3rSXn5YEYhVMU2eS
LCR/0N4pAhKgeMb7SXtdJNQ6WykqDXYJAjtTOIrT2CLaPNRdKbU/ydsvUSDenSt+
Etc…
END_FILERECOVERY -->
I placed these files in my public_html folder on April 19, and linked to them from my index.html page. Today I checked Google, MSN, Yahoo, and Ask to see if any of them were cached. Here are the results:
Google – cached all three
MSN – cached 1 and 2
Yahoo – indexed 2 only (not available in their cache)
Ask – nada
It looks like 99 KB is too large for MSN. Yahoo’s cache is really inconsistent: maybe file 2 is in there, maybe it’s not. And why didn’t they grab file 1? To see if Google can handle anything larger, I have created four new files of 150, 200, 250, and 300 KB.
I’ll check back in a couple of weeks and see if anything else has been cached.
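One way to tell whether a cached copy is complete is to look for the recovery comment at the end of the page: if the cache cut the page short, the closing END_FILERECOVERY marker will be missing. Here is a minimal sketch of that check. The function name `parse_recovery_block` is my own, and it assumes the block looks like the sample shown earlier (`key = value` lines followed by base64 data).

```python
import re

def parse_recovery_block(html):
    """Return (fields, payload) from the trailing recovery comment.

    Returns None when the block is absent or cut off, which is what a
    truncated cached copy would look like.
    """
    match = re.search(
        r"<!--\s*BEGIN_FILERECOVERY(.*?)END_FILERECOVERY\s*-->",
        html,
        re.DOTALL,
    )
    if match is None:
        return None

    fields = {}
    payload_parts = []
    for line in match.group(1).splitlines():
        line = line.strip()
        if not line:
            continue
        key, sep, value = line.partition(" = ")
        if sep and re.fullmatch(r"\w+", key):
            # Metadata line such as "block_size = 554"
            fields[key] = value.strip()
        else:
            # Anything else is treated as base64 data; drop inner whitespace
            payload_parts.append("".join(line.split()))
    return fields, "".join(payload_parts)
```

Save the cached copy to a file, pass its contents to `parse_recovery_block`, and a `None` result suggests the cached page was truncated before the end of the comment.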
Update: 6/20/06
Google and MSN have cached all of the files, up to and including the 300 KB one. Yahoo has only indexed the first 3 (none are cached), and Ask has nothing.
Now I'm going to create 400 KB, 500 KB, and 1 MB files and see what happens.
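For anyone who wants to repeat the experiment, here is a rough sketch of how a test page of an exact size could be generated. This is not the tool I used for the files above; the function name `build_padded_page` and the PADDING / END_PADDING markers are made up for illustration, and it assumes 1 KB = 1024 bytes. It pads a tiny HTML page with random base64 text inside a trailing comment until the page hits the target size.

```python
import base64
import os

def build_padded_page(target_bytes, title="cache test"):
    """Build an HTML page that is exactly target_bytes long."""
    head = (
        "<html><head><title>{0}</title></head>"
        "<body><p>{0}</p></body></html>\n<!-- PADDING\n".format(title)
    ).encode("ascii")
    tail = b"\nEND_PADDING -->\n"

    room = target_bytes - len(head) - len(tail)
    if room < 0:
        raise ValueError("target size is smaller than the page skeleton")

    # Base64 turns every 3 bytes into 4 characters, so generate slightly
    # more random bytes than needed and trim to the exact length.
    # The base64 alphabet has no "-", so the comment cannot end early.
    raw = os.urandom((room * 3) // 4 + 3)
    padding = base64.b64encode(raw)[:room]
    return head + padding + tail

if __name__ == "__main__":
    for size_kb in (400, 500, 1024):
        name = "html_encoded_{}kb.html".format(size_kb)  # hypothetical file name
        with open(name, "wb") as f:
            f.write(build_padded_page(size_kb * 1024))
```

Ending the page with a known marker makes it easy to see later whether a cached copy contains the whole file.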
Update: 2/21/07
The cache limits for the search engines appear to be the following: Google - 977 KB, Yahoo - 214 KB, and MSN - 1 MB. I still cannot tell for sure what Ask's limit is, but I ran an experiment where I found 984 KB cached for a document that was 1.6 MB. Google's limit has been confirmed by others.
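To make those numbers easier to use, here is a small sketch that tells you which engines should hold a full copy of a page of a given size. The limits are the ones listed above; treating MSN's 1 MB as 1024 KB is my assumption, Ask is left out because its limit is still unknown, and the function name `engines_caching_fully` is just for illustration.

```python
# Observed cache limits in KB (from the figures above).
# Assumption: MSN's "1 MB" is taken to mean 1024 KB.
CACHE_LIMITS_KB = {
    "Google": 977,
    "Yahoo": 214,
    "MSN": 1024,
}

def engines_caching_fully(page_kb):
    """Return the engines whose observed limit is at least page_kb."""
    return sorted(
        engine
        for engine, limit_kb in CACHE_LIMITS_KB.items()
        if page_kb <= limit_kb
    )
```

For example, `engines_caching_fully(300)` lists Google and MSN but not Yahoo, since 300 KB is over Yahoo's 214 KB limit. This only says whether a page fits under a limit; it says nothing about whether an engine actually chooses to cache it.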