URL rewriting made easy

Phil

« March 01, 2017, 10:37:07 AM »
When implementing a site on an Apache server, URL rewriting and redirecting is a major headache. The rewrite rules are really convoluted, especially when a request makes multiple passes through an .htaccess file. That can trip up even the most experienced administrator. Even Apache's own documentation for URL rewriting calls it "voodoo".

What can be done to make this a clean-cut, predictable process rather than black magic? On this site, I tried writing a general rewriting module in PHP (rewriter.php). It used ordinary PHP string processing to disassemble the incoming URL into its component parts, modified those parts with ordinary PHP code (tests and string operations), glued everything back together, and passed the result to the PHP header("Location: XXXX") call. It worked fantastically well, except for handling POST data from a form: a browser following an ordinary Location redirect normally reissues the request as a GET, so the form data is lost. That required a kludgey workaround to preserve the POST data whenever it was detected (a major objective was to not require any changes to downstream PHP files). I was never able to get form operations such as CAPTCHA to work properly.

Eventually I had to concede defeat and go back to fighting with .htaccess. It turned out that my host (Lunarpages) had not set up the server in the normal manner: if the URI started with a real path, the server jumped directly to that directory, bypassing the normal chain of .htaccess file processing (starting at /)! Even after I figured that out, it still took a lot of trial and error to get .htaccess URL rewriting to do exactly the things I wanted (and I'm still not 100% sure that it works right!).
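To make the disassemble, modify, and reassemble idea concrete, here is a minimal sketch. It is written in Python for brevity rather than PHP (the rough PHP counterparts would be parse_url() and http_build_query()), and it is not the actual rewriter.php. The two rules and the parameter names pid and product_id are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def rewrite_url(url):
    """Return the redirect target for url, or None if no rule applies."""
    # Disassemble the URL into its component parts.
    parts = urlsplit(url)
    path = parts.path
    query = dict(parse_qsl(parts.query, keep_blank_values=True))

    # Hypothetical rule 1: drop a trailing "index.php" from the path.
    if path.endswith("/index.php"):
        path = path[: -len("index.php")]

    # Hypothetical rule 2: rename a legacy query parameter.
    if "pid" in query:
        query["product_id"] = query.pop("pid")

    # Glue the parts back together.
    new_url = urlunsplit(
        (parts.scheme, parts.netloc, path, urlencode(query), parts.fragment)
    )
    return new_url if new_url != url else None


print(rewrite_url("http://example.com/shop/index.php?pid=15234&lang=en"))
# prints http://example.com/shop/?lang=en&product_id=15234
```

A non-None result is what would be handed to the Location header. Returning None when nothing changed avoids redirecting a URL to itself.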

Applications such as WordPress or Drupal simply sweep up any URI that does not match a real file or directory, and feed it to a PHP routine (from /index.php) that processes it in a manner similar to what I did. This is how they handle SEO "fake" paths, among other things. I haven't looked at their internals to see if and how they process POST data. Other applications, such as osCommerce, embed the query string data in the human-friendly URI, and use .htaccess rewriting to extract the useful data into a query string and discard the human-friendly part. For example, /product_display/p-15234-mr-fusion-reactor might become /product_display.php?product_id=15234. SMF (Simple Machines Forum) "Pretty URLs" stores the human-friendly name (title) in a database table, along with the parameters that need to be passed to the real routines. This posting might show up as /url-rewriting-made-easy, and internally become /show_thread.php?id=6534. This avoids having to embed ugly numbers in the title (as osCommerce does), but requires a database entry for each article, etc.
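The difference between the last two approaches can be sketched in a few lines. This is an illustration only (again in Python), not how osCommerce or SMF actually implement it. The regular expression, the resolve() function, and the in-memory SLUG_TABLE standing in for a database table are all assumptions.

```python
import re

# Hypothetical stand-in for the database table a "Pretty URLs" scheme keeps.
SLUG_TABLE = {
    "url-rewriting-made-easy": ("/show_thread.php", {"id": "6534"}),
}

# ID embedded in the path: /product_display/p-<number>-<title>
PRODUCT_PATH = re.compile(r"^/product_display/p-(\d+)-[^/]+$")


def resolve(path):
    """Map a human-friendly path to (script, params), or None if unknown."""
    match = PRODUCT_PATH.match(path)
    if match:
        # Embedded-ID style: keep the number, discard the title.
        return "/product_display.php", {"product_id": match.group(1)}
    # Title-only style: the parameters come from a lookup table instead.
    return SLUG_TABLE.get(path.strip("/"))


print(resolve("/product_display/p-15234-mr-fusion-reactor"))
# prints ('/product_display.php', {'product_id': '15234'})
print(resolve("/url-rewriting-made-easy"))
# prints ('/show_thread.php', {'id': '6534'})
```

The first branch needs no stored state, which is why it can be done with a rewrite rule alone. The second needs a table entry per title, so it has to run in application code.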

Another solution might be to have a standalone compiler that would take URL specifications, in some sort of language, and output a chunk of code ready to drop into your .htaccess file. If someone wanted to put enough effort into it, I'm sure it could be done. This would be something you run on your PC whenever you want to make a change to URL rewriting/redirection, rather than something running live on the server to process each incoming URL.
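As a rough idea of what such a compiler could look like, here is a toy sketch in Python. The spec syntax ("SOURCE => TARGET FLAGS" with {name:type} placeholders) is invented for this example. It assumes the output goes into the .htaccess file at the site root, where mod_rewrite matches the path without its leading slash, and it only handles simple one-line rules (no RewriteCond, no spaces in paths).

```python
import re

# Hypothetical placeholder types and the regex fragment each one compiles to.
TYPES = {"int": "[0-9]+", "slug": "[^/]+"}
PLACEHOLDER = re.compile(r"\{(\w+)(?::(\w+))?\}")


def escape_literal(text):
    """Backslash-escape regex metacharacters in a literal piece of path."""
    return re.sub(r"([.^$*+?()\[\]{}|\\])", r"\\\1", text)


def compile_rule(spec):
    """Turn 'SOURCE => TARGET FLAGS' into one RewriteRule line."""
    source, _, rest = spec.partition("=>")
    target, _, flags = rest.strip().partition(" ")
    source = source.strip().lstrip("/")  # .htaccess patterns omit the leading slash

    groups = {}  # placeholder name -> backreference number
    pattern, pos = "", 0
    for match in PLACEHOLDER.finditer(source):
        name, kind = match.group(1), match.group(2) or "slug"
        pattern += escape_literal(source[pos:match.start()])
        pattern += "(" + TYPES[kind] + ")"
        groups[name] = len(groups) + 1
        pos = match.end()
    pattern += escape_literal(source[pos:])

    # Replace {name} in the target with the matching $N backreference.
    target = PLACEHOLDER.sub(lambda m: "$" + str(groups[m.group(1)]), target)
    return "RewriteRule ^%s$ %s [%s]" % (pattern, target, flags.strip() or "L")


def compile_specs(specs):
    """Compile a list of spec lines into a block of .htaccess text."""
    return "\n".join(["RewriteEngine On"] + [compile_rule(s) for s in specs])


print(compile_specs([
    "/product_display/p-{id:int}-{title} => /product_display.php?product_id={id} L,QSA",
    "/old-page => /new-page R=301,L",
]))
# prints:
# RewriteEngine On
# RewriteRule ^product_display/p-([0-9]+)-([^/]+)$ /product_display.php?product_id=$1 [L,QSA]
# RewriteRule ^old-page$ /new-page [R=301,L]
```

The hard part a real tool would have to solve is not generating individual rules like these, but predicting how they interact across multiple passes and nested .htaccess files.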

Any other ideas?