I have a legacy (Tomcat) site that used to be called host.domain, and I recently deployed a Drupal site to replace it. The new Drupal server is now host.domain and the old Tomcat server is legacy.domain. Most of the hierarchy has been replicated in Drupal, so bookmarks and search engine results should still work: request host.domain/dir/page.jsp and Drupal will trim the .jsp and look for a node named dir/page. Some of the content has not yet been migrated, so I'm using the 'Redirects 404' Drupal module to check the old server for content before returning a 404: request host.domain/legacy/oldpage.jsp, and Drupal looks for a legacy/oldpage node, doesn't find one, tries legacy.domain/legacy/oldpage.jsp, finds it, and transparently passes the content to the browser with drupal_http_request(). The client's URL doesn't even change. This part works great.

The problem is .js and .css files. For some reason, a request for host.domain/legacy/file.js or file.css does not trigger Drupal's 404 processing, so Drupal never asks legacy.domain whether it has the file. Instead the 404 falls through to Apache, which displays an Apache 404 (not a Drupal or Tomcat one). This means that content on legacy.domain served through Drupal on host.domain won't get its styles or behaviors if it includes local css or js.

The way I see it, I have three options:

  1. Copy the css and js files off the Tomcat server, and place them in the Drupal root while replicating the old directory hierarchy. This might work but would be messy, complicate Drupal core updates, and might interfere with the content-proxying 404 behavior, which is working.
  2. Get Drupal to trigger a 404 for .js and .css files like it does for .jsp files. Any ideas why it doesn't already?
  3. If Drupal won't throw a 404 for .js and .css files, then tell Apache to act as a second layer for Drupal's proxying behavior. If a 404 falls through to Apache, have it try serving it from legacy.domain instead.

I guess I could also go through all of the content on the old Tomcat server and replace all of the relative includes with absolutes using the legacy.domain name, but I'm already trying to move that content off that host and I really don't want to put effort into files that are going to be replaced soon - I just want them to work properly until I can migrate them. Does anyone have any advice or tutorials on implementing option 2 or 3?

The Apache config is stock Ubuntu 12.04.3 LTS. The .htaccess in the Drupal directory is:

# Protect files and directories from prying eyes.                                           
<FilesMatch "\.(engine|inc|info|install|make|module|profile|test|po|sh|.*sql|theme|tpl(\.php)?|xtmpl)(~|\.sw[op]|\.bak|\.orig|\.save)?$|^(\..*|Entries.*|Repository|Root|Tag|Template)$|^#.*#$|\.php(~|\.sw[op]|\.bak|\.orig\.save)$">
  Order allow,deny                                                                          
</FilesMatch>                                                                               

# Don't show directory listings for URLs which map to a directory.                          
Options -Indexes                                                                            

# Follow symbolic links in this directory.                                                  
Options +FollowSymLinks                                                                     

# Make Drupal handle any 404 errors.                                                        
ErrorDocument 404 /index.php                                                                

# Set the default handler.                                                                  
DirectoryIndex index.php index.html index.htm                                               

# Override PHP settings that cannot be changed at runtime. See                              
# sites/default/default.settings.php and drupal_environment_initialize() in
# includes/bootstrap.inc for settings that can be changed at runtime.

# PHP 5, Apache 1 and 2.
<IfModule mod_php5.c>
  php_flag magic_quotes_gpc                 off
  php_flag magic_quotes_sybase              off
  php_flag register_globals                 off
  php_flag session.auto_start               off
  php_value mbstring.http_input             pass
  php_value mbstring.http_output            pass
  php_flag mbstring.encoding_translation    off
</IfModule>

# Requires mod_expires to be enabled.
<IfModule mod_expires.c>
  # Enable expirations.
  ExpiresActive On

  # Cache all files for 2 weeks after access (A).
  ExpiresDefault A1209600

  <FilesMatch \.php$>
    # Do not allow PHP scripts to be cached unless they explicitly send cache
    # headers themselves. Otherwise all scripts would have to overwrite the
    # headers set by mod_expires if they want another caching behavior. This may
    # fail if an error occurs early in the bootstrap process, and it may cause
    # problems if a non-Drupal PHP file is installed in a subdirectory.
    ExpiresActive Off
  </FilesMatch>
</IfModule>

# Various rewrite rules.
<IfModule mod_rewrite.c>
  RewriteEngine on

# This forces all drupal links to end in a trailing slash.
# Companion rules to trailing slash module.
# https://drupal.org/project/trailing_slash
RewriteBase /
RewriteCond %{REQUEST_METHOD} !=post [NC]
RewriteRule ^(.*(?:^|/)[^/\.]+)$ $1/ [L,R=301]

  # Set "protossl" to "s" if we were accessed via https://.  This is used later
  # if you enable "www." stripping or enforcement, in order to ensure that
  # you don't bounce between http and https.
  RewriteRule ^ - [E=protossl]
  RewriteCond %{HTTPS} on
  RewriteRule ^ - [E=protossl:s]

  # Make sure Authorization HTTP header is available to PHP
  # even when running as CGI or FastCGI.
  RewriteRule ^ - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]

  # Block access to "hidden" directories whose names begin with a period.
  RewriteRule "(^|/)\." - [F]

  # Pass all requests not referring directly to files in the filesystem to
  # index.php. Clean URLs are handled in drupal_environment_initialize().
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteCond %{REQUEST_URI} !=/favicon.ico
  RewriteRule ^ index.php [L]

  # Rules to correctly serve gzip compressed CSS and JS files.
  # Requires both mod_rewrite and mod_headers to be enabled.
  <IfModule mod_headers.c>
    # Serve gzip compressed CSS files if they exist and the client accepts gzip.
    RewriteCond %{HTTP:Accept-encoding} gzip
    RewriteCond %{REQUEST_FILENAME}\.gz -s
    RewriteRule ^(.*)\.css $1\.css\.gz [QSA]

    # Serve gzip compressed JS files if they exist and the client accepts gzip.
    RewriteCond %{HTTP:Accept-encoding} gzip
    RewriteCond %{REQUEST_FILENAME}\.gz -s
    RewriteRule ^(.*)\.js $1\.js\.gz [QSA]

    # Serve correct content types, and prevent mod_deflate double gzip.
    RewriteRule \.css\.gz$ - [T=text/css,E=no-gzip:1]
    RewriteRule \.js\.gz$ - [T=text/javascript,E=no-gzip:1]

    <FilesMatch "(\.js\.gz|\.css\.gz)$">
      # Serve correct encoding type.
      Header set Content-Encoding gzip
      # Force proxies to cache gzipped & non-gzipped css/js files separately.
      Header append Vary Accept-Encoding
    </FilesMatch>
  </IfModule>
</IfModule>

UPDATE

Per Shane Madden's recommendation below, I've added this to the top of the mod_rewrite section of the root .htaccess:

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} \.(css|js)$
RewriteRule ^(.*)$ http://legacy.domain/$1 [L]

This works if I ask for host.domain/oldfile.css: even if oldfile.css doesn't exist on the legacy host, I get a Tomcat 404, so I know the rewrite works. The problem is with files inside directories that don't exist locally.

If I have a file on the legacy system:

http://legacy.domain/root.css

and ask for it at

http://host.domain/root.css

the file will show up, because it meets the three RewriteCond rules. However, if I ask for

http://host.domain/long/path/to/file.css

then I get an Apache (not Tomcat) 404, with an entry in error.log:

File does not exist: /var/www/long

It looks like the rewrite rule is only taking effect if the document requested is (or would be) in the same directory as the .htaccess that contains that rule. If the requested file is inside a directory, the directory triggers a 404, which doesn't match the conditions because it doesn't end with .css or .js, and Apache stops processing right there. Is there a way to have the rule apply for any 404, no matter how far it may be down a directory hierarchy that doesn't exist locally?

Riblet
  • Show your Apache configuration. – Michael Hampton Sep 27 '13 at 15:27
  • Anything in particular you want to see? Except for the name-based virtual host, the Apache config is stock Ubuntu 12.04.3 LTS. I'll add the .htaccess that's in the Drupal directory to the OP. – Riblet Sep 27 '13 at 15:44

1 Answer

How about directly proxying css and js files which don't exist on the filesystem, since the ones for the drupal site should all be hitting real files in the /sites directory?

Within the <Directory> block for your Drupal install, something like this:

RewriteCond %{REQUEST_FILENAME} \.(css|js)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ http://proxy-target/$1 [P,L]

Edit:

Since the filesystem mapping is choking before it can check for existence, let's instead do the checking without needing the filesystem mapping.

Put this in your <VirtualHost> block directly (or in the main server config if you aren't using virtual hosts):

RewriteCond %{REQUEST_URI} \.(css|js)$
RewriteCond /path/to/your/docroot%{REQUEST_URI} !-f
RewriteRule ^/(.*)$ http://proxy-target/$1 [P,L]
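
For context, here is a minimal sketch of how those rules might sit inside a virtual host. The `ServerName`, the `/var/www` docroot (taken from the error log in the question), and `legacy.domain` as the proxy target are assumptions for illustration, not confirmed settings. The `[P]` flag hands the request to mod_proxy, so mod_proxy and mod_proxy_http need to be enabled alongside mod_rewrite.

```apache
<VirtualHost *:80>
    # Assumed names; substitute your own host and docroot.
    ServerName host.domain
    DocumentRoot /var/www

    RewriteEngine On

    # Only consider requests whose path ends in .css or .js ...
    RewriteCond %{REQUEST_URI} \.(css|js)$
    # ... and that do not exist as a regular file under the docroot.
    RewriteCond /var/www%{REQUEST_URI} !-f
    # Proxy them to the legacy host (needs mod_proxy and mod_proxy_http).
    RewriteRule ^/(.*)$ http://legacy.domain/$1 [P,L]
</VirtualHost>
```

In server or virtual-host context the pattern is matched against the URL path including its leading slash, which is why the rule uses `^/(.*)$` rather than the `^(.*)$` form used in .htaccess.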
Shane Madden
  • This appears to be getting closer to the goal, but not fully there yet. The ${REQUEST_FILENAME} didn't work until I replaced the $ with % (I don't know enough about mod_rewrite to know the difference or even if the $ is valid). – Riblet Oct 01 '13 at 18:36
  • @Riblet Nope, it's not valid as `$` - typo on my part :) – Shane Madden Oct 01 '13 at 19:04
  • @Riblet On your edit - we've got problems with filesystem mapping, then.. and the way it's set up, we need to have it filesystem mapped for the rules to apply. I guess let's put rules in the virtual host context, instead. Are you able to edit the virtual host config (instead of just the htaccess)? And are there any strange filesystem mappings like `Alias` that might be being used for the css/js files? – Shane Madden Oct 01 '13 at 19:12
  • The system I'm working on at the moment is a VirtualBox VM - I have full access to it. I was going to try to get it working on the VM, then in a Drupal install on the VM before taking it to production. There are no special mappings that I can see, all of the Drupal-specific config is in its .htaccess, and I'm just working in an empty directory right now, not the Drupal install. The only directives in the .htaccess are the ones I listed in the update. Based on the error log, Apache chokes as soon as it sees the first-level directory that isn't there and doesn't process further. – Riblet Oct 01 '13 at 20:23
  • @Riblet Ok, then let's try this a different way - see my new edit. – Shane Madden Oct 01 '13 at 20:41
  • Getting VERY close here - I think global vs. per-directory rule is going to be the way to go. Tomcat is appending a jsessionid value after the css file though, which is irritating; \.(css|js) isn't gonna match blahblah.css;jsessionid=abc123, but that's Tomcat's problem, not mod_rewrite's. I'll try modifying the regex to match .(css|js) up to the semicolon (without $) or tell Tomcat to stop preserving sessions. Secondly, it won't work unless I omit the Proxy flag after the RewriteRule, but it works fine without it. Am I missing out on anything if I leave it off? Thanks for all your help btw. – Riblet Oct 02 '13 at 14:25
  • @Riblet Hmm, the proxy flag shouldn't make a difference on whether the rule is matched - it's redirecting instead of proxying when you remove that, right? For the match, change to `\.(css|js)(;|$)` should do it. – Shane Madden Oct 02 '13 at 15:40
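
Putting the last two comments together, the virtual-host rules with the session-suffix pattern might look like the sketch below. `/path/to/your/docroot` and `proxy-target` are the placeholders from the answer, and the sketch assumes mod_proxy and mod_proxy_http are loaded so that `[P]` can take effect.

```apache
RewriteEngine On

# Match .css or .js at the end of the path, or followed by a
# ";jsessionid=..." suffix of the kind Tomcat appends.
RewriteCond %{REQUEST_URI} \.(css|js)(;|$)
# Skip anything that exists as a real file under the docroot.
RewriteCond /path/to/your/docroot%{REQUEST_URI} !-f
# [P] proxies transparently; without it this becomes an external redirect.
RewriteRule ^/(.*)$ http://proxy-target/$1 [P,L]
```

Without `[P]`, a substitution that is an absolute URL on a different host is sent back as an external redirect, so the browser would fetch the file from the legacy host directly rather than through host.domain.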