Extension, Module


CE Cache

ExpressionEngine 2, ExpressionEngine 3


Cache breaking only finding First Instance? (In Progress)

Support Request

MacObserver

We’ve long had a recurring problem that we finally traced down to CE Cache.

The issue is that EE is (thankfully?) smart enough to translate capital letters in URL stubs to lowercase and homogenize them. That means EE returns the same content if someone loads either of these two URLs, even though the second one is the only one that’s “right” in that it matches the lowercase URL stub:

http://www.macobserver.com/tmo/article/os-x-how-to-create-custom-radio-stations-in-iTunes-12-2
http://www.macobserver.com/tmo/article/os-x-how-to-create-custom-radio-stations-in-itunes-12-2 

Note the capital “T” in the top one. Each of these URLs will create a separate entry in CE Cache, of course, and that’s understandable.

For cache breaking, we have {entry_id} and {url_title} in there, and here’s the interesting (but very good) part: the tags for each of those two entries are exactly the same at:

Tags: 85535|os-x-how-to-create-custom-radio-stations-in-itunes-12-

Yes, they’re both lowercase regardless of which URL it is. This is a good thing, and one would expect that both (or neither) of them would get wiped out by an article update that triggers cache breaking. Problem is ONLY the lowercase one gets wiped out.

Seems to me to be a bug in Cache Breaking. If it’s successful at breaking one entry with those tags, should it not also break another entry with the exact same tags?
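To make the expectation concrete, here is a minimal sketch of how tag-based cache breaking is generally expected to behave: two cached items stored under different URIs but carrying the same tag should both be removed when that tag is broken. This is a toy model, not CE Cache's actual code; the `TaggedCache` class and its `save`, `breakTag`, and `has` methods are hypothetical names invented for this illustration.

```php
<?php
// Toy model of tag-based cache breaking. NOT CE Cache's implementation;
// the class and method names below are hypothetical.
final class TaggedCache
{
    /** @var array<string, string> cache key (here, the requested URI) => cached content */
    private array $items = [];

    /** @var array<string, array<string, true>> tag => set of cache keys carrying that tag */
    private array $keysByTag = [];

    /** Store content under a key and register the key under each of its tags. */
    public function save(string $key, string $content, array $tags): void
    {
        $this->items[$key] = $content;
        foreach ($tags as $tag) {
            $this->keysByTag[$tag][$key] = true;
        }
    }

    /** Remove every item carrying the tag and return the keys that were removed. */
    public function breakTag(string $tag): array
    {
        $removed = array_keys($this->keysByTag[$tag] ?? []);
        foreach ($removed as $key) {
            unset($this->items[$key]);
        }
        unset($this->keysByTag[$tag]);
        return $removed;
    }

    /** Report whether an item is still cached under the given key. */
    public function has(string $key): bool
    {
        return isset($this->items[$key]);
    }
}

// Two URIs that differ only in case, both tagged with the same entry ID.
$mixedCase = 'tmo/article/os-x-how-to-create-custom-radio-stations-in-iTunes-12-2';
$lowerCase = 'tmo/article/os-x-how-to-create-custom-radio-stations-in-itunes-12-2';

$cache = new TaggedCache();
$cache->save($mixedCase, 'cached page (mixed-case URI)', ['85535']);
$cache->save($lowerCase, 'cached page (lowercase URI)', ['85535']);

// Breaking the tag should clear both items, not just the first one found.
$removed = $cache->breakTag('85535');
```

In this model the cache key (the URI) and the tags are independent, so the case of the URI has no bearing on which items a tag break removes.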

Causing Effect - Aaron Waldon
# 1
Developer

Hi MacObserver!

This is an interesting post.

The first thought I have when reading this, is that having different URIs return the same page is actually a bad thing. It is my understanding that unique content should have a unique URL. If that is not the case, then that should probably be corrected as soon as possible. It would be perfectly fine to have a redirect for URLs that contain uppercase letters to their lowercase equivalent, but if pages are linking and responding in different cases, you may want to consider a script to correct that.

As for the cache breaking, if you set the cache breaking for “Any” channel to a tag of {entry_id}, then the cached items for both should be cleared. Is that not the case?

I had a client a couple of years ago that had a problem with other sites linking to posts that were incorrectly cased. I hereby release this solution to you under the MIT License in my name (Aaron Waldon):

.htaccess file:

#.... you may have other .htaccess rules here

<IfModule mod_rewrite.c>
    # Enable Rewrite Engine
    RewriteEngine On

    #.... you may have other rules here

    #---------- Force url to lowercase if upper case is found
    # Check if the URI contains an uppercase character
    RewriteCond %{REQUEST_URI} [A-Z]
    # Ignore pagination pages
    RewriteCond %{REQUEST_URI} !/P[0-9]
    # Ignore search pages
    RewriteCond %{REQUEST_URI} !search [NC]
    # Ensure it is not a file on the drive first
    RewriteCond %{REQUEST_FILENAME} !-s
    # Redirect to the lowercase rewrite script
    RewriteRule (.*) rewrite-strtolower.php?rewrite-strtolower-url=$1 [QSA,L]

    #.... other redirect rules here
</IfModule>

#.... you may have other .htaccess rules here

rewrite-strtolower.php file:

<?php
if (isset($_GET['rewrite-strtolower-url']))
{
    // The original (mixed-case) path passed in by the RewriteRule
    $url = $_GET['rewrite-strtolower-url'];
    unset($_GET['rewrite-strtolower-url']);

    // Rebuild the remaining query string, if any
    $params = http_build_query($_GET);
    if (strlen($params))
    {
        $params = '?' . $params;
    }

    // Permanently redirect to the lowercase version of the URL
    header('Location: http://' . $_SERVER['HTTP_HOST'] . '/' . strtolower($url) . $params, true, 301);
    exit;
}
header('HTTP/1.0 404 Not Found');
die('Unable to convert the URL to lowercase, because a URL was not provided.');
MacObserver
# 2
Causing Effect - Aaron Waldon - 11 July 2015 07:46 PM

The first thought I have when reading this, is that having different URIs return the same page is actually a bad thing. It is my understanding that unique content should have a unique URL. If that is not the case, then that should probably be corrected as soon as possible. It would be perfectly fine to have a redirect for URLs that contain uppercase letters to their lowercase equivalent, but if pages are linking and responding in different cases, you may want to consider a script to correct that.

This seems to be an EE thing (at least for us, and probably for others). Regardless of case, EE returns the same page. And that’s fine (a good thing, I believe).

Causing Effect - Aaron Waldon - 11 July 2015 07:46 PM

As for the cache breaking, if you set the cache breaking for “Any” channel to a tag of {entry_id}, then the cached items for both should be cleared. Is that not the case?

That is not the case. It seems to only be breaking the first instance it finds (for us, anyway). Can you confirm that it doesn’t do this on a default EE install? I know it shouldn’t, but I’m just trying to narrow down where, exactly, this bug might live.

(and thanks for the scripts! If all else fails we’ll give them a try).

MacObserver
# 3

Hey, Aaron — Not sure how, but this one got marked as Resolved and… it’s not.

Just to be clear, only ONE instance of an article matching the correct tags is being “broken” from the Cache… the other(s) are not. Any ideas?

Causing Effect - Aaron Waldon
# 4
Developer

This seems to be an EE thing (at least for us, and probably for others). Regardless of case, EE returns the same page. And that’s fine (a good thing, I believe).

I do not think it is a good thing unless a URI variation is doing a 301 redirect to the correct URI. Web URIs should be unique.

That is not the case. It seems to only be breaking the first instance it finds (for us, anyway). Can you confirm that it doesn’t do this on a default EE install? I know it shouldn’t, but I’m just trying to narrow down where, exactly, this bug might live.

CE Cache does break more than one tagged cache. That’s the point of tagging. Nonetheless, I can test it again this weekend. Are you running the latest versions of CE Cache and EE?

Not sure how, but this one got marked as Resolved and… it’s not.

I think I marked it resolved after I went above and beyond and gave you a great solution that would not only correct the problem, but improve SEO. ;)

MacObserver
# 5

Well, I think it’s more accurate to say your great solution (which was great, especially for SEO, don’t get me wrong) simply worked around the problem. :) It didn’t solve it, as we’ve got several instances (unrelated to capitalization) where multiple pages tagged with the same tag don’t get broken. And yeah, we’re running the latest CE Cache.

MacObserver
# 6

Hey, Aaron — Just checking to see if you did any further testing on the multiple tags breaking. For us it only breaks the first instance it finds and it’s causing a few issues, the one highlighted in this post included among them.