#### News:

Give our new Discussions area a try!

PDF::Builder v3.024 Released, 12 September 2022
Please see the CPAN listing, GitHub entry.

PDF::Table v1.003 Released, 05 July 2022
Please see the CPAN listing, as well as the GitHub entry.

### A Thought…

A foreign businessman goes to Russia on behalf of a large company, looking to build a factory. He contacts several construction companies to get quotes.

The Germans tell him that they can build the plant with top of the line materials and engineering for 2 million euros.

The Turks tell him they can build the plant cheap, for only 1 million euros — but quality will obviously suffer.

Then along comes a Russian contractor. He tells the businessman that he has the best solution! Pay him 3 million. He will hire the Turks to build the plant for 1 mil, and he and the rep each get 1 mil for themselves!

— AndreiROM, worldbuilding.stackexchange.com

There are a lot of Frequently Asked Questions about third party applications that have to be answered over and over and over, mostly because people can't be bothered to search the community forums first!

If you are looking for the old SMF version 1.1 fixes (/freeSW/SMF/fixes1.1), some descriptions of them are in the SMF section. I am no longer including full code and instructions for these fixes.

If you are looking for the old SMF project (/freeSW/SMF/projects) ideas list, sorry, but it has been discontinued.

## General items NOT specific to any one application

### Internet Explorer 8 Problems

For a long time (up through IE7), Microsoft thumbed its nose at the world and followed its own course with regard to how Internet Explorer would interpret HTML and CSS. The result was that IE browsers tended to behave quite a bit differently than standards-compliant browsers such as Firefox, Chrome, Opera, Safari, etc. The ignorant masses who used IE because it came with their PC were happy, but web developers cried "foul!" because they frequently had to make elaborate workarounds to get their pages to work halfway decently on a Microsoft browser.

As of IE8, Microsoft started listening to the world, and actually made an IE browser that was reasonably standards-compliant. This isn't to say that IE8 has attained Nirvana — it still has bugs and incompatibilities — but it's much better than before. Now, here comes the trouble: many sites and applications merely check for "Internet Explorer" as the browser, and don't yet differentiate among versions. So, IE8 is fed the same bogus hacks that IE6 and IE7 (and earlier versions) need to somewhat work. Naturally, IE8 works much differently than IE6, so pages tend to have a lot of problems.

To get around this, Microsoft added "IE7 compatibility mode" to IE8. You can configure your browser to act like IE7 (i.e., "broken"), so that it behaves the same way as older versions when given code designed for them. Unfortunately, this breaks pages that feed it standards-compliant HTML and CSS! So, Microsoft also made it possible for a page that still relies on the old "IE" hacks to carry a switch telling IE8 to drop into the older IE browser mode ("IE7 mode") for that page only. Ironic, isn't it? Microsoft finally delivers a "good" browser and then web users and developers have to "break" it to make it behave like in the Bad Old Days.

To wrap up this tale, if your site doesn't render well on IE8, try configuring IE8 (Tools menu) to emulate IE7. If that fixes the problem (it usually does), you can add a new tag to your pages (such as SMF) to permanently use the IE7 compatibility mode for your pages. Somewhere between the <head> and </head> tags (not in the middle of an existing tag!), add the following line:

<meta http-equiv="X-UA-Compatible" content="IE=EmulateIE7" />

In SMF, it's usually in the file index.template.php (in your Themes directory), where you'll find the other <meta> tags.

Note: there are reports that Microsoft keeps fooling with IE8 and has broken the IE7 emulation mode in some releases. Keep that in mind before complaining that this fix doesn't work!


### Missing Support Files

Many site owners don't know how to set up their sites properly, and as a result, their site error logs are clogged with thousands of spurious "errors". Not only is this an inconvenience, but it's also a danger, because it makes it easier to miss a real and important error message amongst all the dross. It also leads to annoying reports of "404 error message", when there is no actual 404 error.


#### Missing Favicon

Almost every browser (except for nongraphical ones) requests a "favicon" from your site. This is that little graphic that appears on the left end of your browser's address bar, in browser tabs, and in browser bookmarks/favorites lists. If you don't have one, you will have a 404 error logged that the file is missing.

The simplest solution is to use almost any "paint" program supplied with your PC to create a 16 x 16 pixel picture in ICON format. I said, "ICON". Do not be an idiot and create a GIF, JPEG, or PNG image and then simply rename it with an ".ico" extension. You need to save it in ICON format for it to work in the simple mode. Name it "favicon.ico" and upload it to the root directory of your site (what HTML sees as /). That's all you need to do, and no more error messages logged about "missing favicon.ico". ICON format. Location /. Name "favicon.ico". Got it? Note that many modern browsers will support 32 x 32 pixel icons, but not all older ones will. If any of your visitors might be marooned on an IE 6 desert island, stick with 16 x 16.
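One way to confirm that a file really was saved in ICON format, rather than being a renamed GIF, JPEG, or PNG, is to look at its first few bytes. The following is a minimal sketch, not a full validator: the function names are made up for this illustration, and it only checks the well-known leading signature bytes of each format.

```python
import struct


def looks_like_ico(data: bytes) -> bool:
    """Return True if data starts with a plausible ICO header."""
    if len(data) < 6:
        return False
    # An ICO file begins with three little-endian 16-bit fields:
    # reserved (0), type (1 = icon), and the number of images it holds.
    reserved, image_type, count = struct.unpack("<HHH", data[:6])
    return reserved == 0 and image_type == 1 and count >= 1


def describe_image(data: bytes) -> str:
    """Guess the real format of an image from its leading bytes."""
    if looks_like_ico(data):
        return "ico"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "gif"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    return "unknown"


# Example use (the file name is only an illustration):
# with open("favicon.ico", "rb") as handle:
#     print(describe_image(handle.read(16)))
```

If this reports anything other than "ico" for your favicon.ico, the file was probably just renamed rather than saved in ICON format.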

Note: some browsers may take a while to pick up and display the favicon. Especially if they've already cached a favicon for your site, they may not bother asking for a refresh for a few days. You probably will need to clear your browser's cache. You may need to delete an existing bookmark for the site and re-bookmark the site. Or not — different browsers behave in different ways.

What do I use for a picture? Use whatever you want. Just keep in mind that you can't get much visible detail with only 16 pixels on a side, so keep the design simple. The color palette is fairly limited, too, so don't get carried away with gradients and such. One good thing to use is the first letter of your site name, or its initials. You may be able to cut and paste the letter from your logo and scale it down to 16 x 16 without too much loss of detail. Many browsers can now support 32 x 32, if you need more detail.

Now, it is possible to use other formats, names, and locations, but you need to tell the browser where to find the favicon. In each page's "<head>" section, place a <link> tag with rel="shortcut icon" whose href attribute gives the URL of your favicon file.

Of course, you need to set the site, path, and file name to match your favicon. More information on allowable formats and browser support may be found at Wikipedia's Favicon entry.

Final notes: unless you use a <link> shortcut icon tag to specify the favicon location, the favicon.ico file goes in the site root (/). It does not go in your forum's root or any other place, because browsers always request exactly /favicon.ico. Also, by using one or more <link> shortcut icon tags, you can specify different favicons for different parts of your site, or even for different pages.


#### Missing Robots File

Most search engines, at least those that are "well behaved" and polite, look for a file named "robots.txt" to tell them where they're not allowed to look (or at least, not allowed to index). Unfortunately, this results in an unnecessary error log entry if it can't be found (doesn't exist, or is in the wrong place). You can search the SMF community forum, or Google, for information on how to create a proper "robots.txt" file and have it do what you want. If nothing else, you can create an empty (blank) file and upload it to /robots.txt (in the site root). Comments start with "#", so you can add some notes to yourself on issues to address in the future, when you get around to creating a real robots.txt file.

Note that a "robots.txt" file provides no security for your site! It is merely a suggestion for search engine spiders about what you'd rather not have indexed, but they are under no requirement to actually follow your suggestion! Usually, it is primarily for the purpose of enhancing your Search Engine results, by telling search engines not to index near-copies of your primary pages, such as print format pages, mobile format pages, etc. Furthermore, some rogue bots have been known to go through robots.txt files explicitly looking for files that you don't want indexed, on the chance that they have something juicy like credit card numbers. So, don't use robots.txt for security, but only to optimize your search engine listings (e.g., "don't index my print format pages").

Also remember that, like favicon.ico, search engines will request a specific file name from a specific location (robots.txt in root /). Unlike favicon, there is no tag to specify a different place, name, or format.
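To see how a well-behaved spider would read a robots.txt file, Python's standard urllib.robotparser module can be fed the file's text directly. This is only a sketch: the sample rules, the bot name, and the example.com URLs are invented for illustration, and a rogue bot is of course free to ignore all of it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the "#" line is a note to yourself.
SAMPLE_ROBOTS_TXT = """\
# TODO: review these rules once the site layout settles
User-agent: *
Disallow: /print/
Disallow: /mobile/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

# A polite bot may index the main page but not the print-format copies.
main_ok = parser.can_fetch("AnyBot", "http://example.com/index.html")
print_ok = parser.can_fetch("AnyBot", "http://example.com/print/page1.html")

# A file holding only comments (or nothing) places no restrictions at all.
empty_ok = build_parser("# notes only\n").can_fetch(
    "AnyBot", "http://example.com/anything.html"
)
```

Here main_ok and empty_ok come out True and print_ok comes out False, which matches the idea of using robots.txt only to keep near-copies of pages out of the index.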


#### Missing Error Pages

For some reason, most Apache servers are configured such that you get a "404" error (page not found) if you make use of a default Error Page (also known as an Error Document or Error Handler). This is absurd — after all, the server was able to find and use the default error page — but still it logs a 404 error for the error page it couldn't find. Usually it's looking for /nnn.shtml, where "nnn" is the error code number, such as "404". I like to refer to these spurious error messages as Apache brain farts.

The way to avoid cluttering up your error log with all these spurious non-errors is to define your own set of Error Pages, at least for the more common errors. If your host provides you with the cPanel control panel, there is a button to create "Error Pages". Clicking it will generate a core of SHTML-format fields, which you can then wrap your own HTML code around to customize the page to look like it's an integral part of your site. You can always go back later and manually edit the files it creates, so don't panic about getting everything "just right" the first time. Typically, it will create and place in the root (/) directory 400.shtml, 401.shtml, 403.shtml, 404.shtml, and 500.shtml files. These are "server side" processed HTML files (SHTML), where certain special commented fields are replaced by information reported in the error. If your control panel does not have a way to create Error Pages/Error Handlers/Error Documents, ask your host what the proper naming and protocols are.

Even just the bare SHTML core of these files will work for you, but won't be very pretty. Feel free to add the various HTML tags to make it look like a regular web page. Be careful about going overboard — if you get too elaborate, depending on style sheets or images or Javascript code brought in from your site, you may end up with problems that your Error Page can't be displayed due to the very problem with your site that it's trying to report! So, use some restraint and try to keep it fairly simple, with minimal files brought in from your site. With your customized Error Pages in place, the server will use them instead of the default pages, you won't get your error log cluttered up with thousands of bogus reports of missing nnn.shtml files, and your site will look much more professional.

You're not constrained to SHTML or locating the Error Pages in the root (/). Some webmasters like to use PHP or HTML files instead, either omitting the error-specific messages or getting that information in some other way. If you don't want to use SHTML format, or name the files nnn, or place them in the root, feel free to brew up your own scheme. The only change is that you need to put an entry in your .htaccess file to tell the server where to find each Error Page you have defined:

ErrorDocument nnn /path/name.extension

Be careful to give an absolute path (starting with /) and not a relative path.
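If you maintain several custom Error Pages, it may help to generate the .htaccess lines rather than type them by hand. The sketch below is one possible helper (the function name and the example paths are hypothetical); it refuses relative paths and codes outside the 4xx/5xx range before emitting anything.

```python
def error_document_lines(pages):
    """Build ErrorDocument lines from a {status_code: absolute_path} mapping."""
    lines = []
    for code in sorted(pages):
        path = pages[code]
        if not 400 <= code <= 599:
            raise ValueError(f"{code} is not a 4xx or 5xx error code")
        if not path.startswith("/"):
            raise ValueError(f"{path!r} is relative; start it with /")
        lines.append(f"ErrorDocument {code} {path}")
    return lines


# Example: error pages kept in a hypothetical /errors/ directory.
example = error_document_lines({
    404: "/errors/404.php",
    403: "/errors/403.php",
    500: "/errors/500.html",
})
```

The resulting strings can be pasted into .htaccess one per line.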

Finally, there are scores of errors in the 4xx and 5xx range that you can write Error Pages for. You don't have to confine yourself to 400, 401, 403, 404, and 500, although those are the most common. Feel free to add additional Error Pages, particularly if you seem to keep encountering a particular error.


### Proper Naming of Files

People who come into the world of Web hosting after a lifetime immersed in the World of Windows tend to have some bad habits. Windows allows some unspeakable things to flourish in the way of naming files, which cause numerous problems when you try to do the same thing in Linux or some other real operating system.

Windows is case insensitive, while Linux is case sensitive. That is, Windows doesn't care that you name your file MyImage.JPEG and then refer to it as myimage.jpeg — it considers them the same file, and will allow you to create only one such file at a time in the filesystem. Linux will treat MyImage.JPEG and myimage.jpeg as two different files, as they have different letters in the names. You will be able to create two completely different and separate files under those two names. The problem then comes when you try to refer to a file saved as MyImage.JPEG as "myimage.jpeg" (or vice-versa). Linux won't find the file you're looking for (MyImage.JPEG), because it's looking for "myimage.jpeg"! OK, some other "mainframe" operating systems are also case-insensitive, so we can't really condemn Windows for this. You just need to be aware that if you're working on a Linux server, after a lifetime of using Windows, that case matters.

A less forgivable error on the part of Windows was to allow all sorts of random characters into file names, especially spaces (blanks). Microsoft thought it good that you could name your letter to Granny "Thank you grandmama for Thanksgiving dinner.doc", but it causes untold problems. While it's not a big deal in a GUI (drag and drop) environment, it's hell on the command line and in code. You have to wrap quotes (") around the name, even on Windows, to keep it all in one piece. And the less said about other punctuation permitted in names (including <, >, [, and ]), the better. Needless to say, Linux doesn't like spaces and most punctuation in file names, and will give you an Atomic Wedgie if you upload a file with such a name.

The bottom line is that you have to be careful when creating and uploading files from Windows to a Linux server. Watch out for embedded blanks/spaces and most punctuation characters. Rename your files on your PC before uploading them.
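Renaming can be automated before upload. The following sketch shows one possible convention, which is an assumption of this example rather than a rule: spaces become underscores, anything other than letters, digits, dot, underscore, and hyphen is dropped, and the result is lowercased so that case mismatches cannot occur.

```python
import re


def server_safe_name(name: str) -> str:
    """Return a file name free of spaces, odd punctuation, and upper case."""
    cleaned = re.sub(r"\s+", "_", name.strip())        # blanks -> underscore
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "", cleaned)  # drop other punctuation
    return cleaned.lower()                             # one consistent case


granny = server_safe_name("Thank you grandmama for Thanksgiving dinner.doc")
image = server_safe_name("MyImage.JPEG")
```

Whatever convention you pick, remember to use the renamed form in your HTML and code as well, since the server will look for exactly the name you give it.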


### Those Strange "Core" Files

Hopefully no more than once in a while, you may notice a strange new file somewhere in your site directories. Its name may take different forms, but always including (with varying capitalization) "core". There is often a moderately long number as part of the name, sometimes decimal, sometimes hexadecimal. This is the "process ID". What process? We'll discuss that. These files are often quite large, so you don't want a lot of them to accumulate.

Well, what is it? In UNIX and Linux type systems (but not Windows), when a binary executable program ups and dies, it leaves behind a corpse, called the core file. In a Web environment, these programs include PHP, Perl, MySQL (or other database), Apache (the server), cron (scheduled tasks), and perhaps other random programs that you might have been running, such as through a system() call. All running processes have a more or less unique "process ID" (a number to keep track of them), which is usually part of the name of the core file (so as to tell them apart). The core file contains the path and name of the executing program, and information about the immediate cause of death (e.g., divide by zero, bad pointer causing a memory access into an invalid address ["segmentation fault"], etc.). It also holds the execution stack (where in the program it was), the processor registers' values, stack and heap data, subroutine parameter lists, and whatever else.

So what's the point of this core file? A technician or developer with access to the original executable program can feed it and the core file to a debugger such as "dbx", and look inside the program to see what happened — what triggered the failure, and are there any hints in data and parameters as to bad data that might have caused it. In the right hands, this can lead to bug fixes in a program, or at least, a bug report to the developers. So, your hosting company's support technicians may be interested in any core files you happen across on your site. Or, they may be too busy to bother. You should let them know that a problem produced a core file (check its date and time to see if it was associated with the event). Don't erase any files offered to tech support until they confirm that they have a copy, or don't need it any more, or don't need it at all. They may move it out of your directories to theirs.

Now, programs do crash for no apparent reason, from time to time, and many will be automatically restarted. If you get a core file once in a Blue Moon, and nothing seems to be amiss, you don't have to report it to tech support. If you're not going to file a problem or bug report anyway, just go ahead and erase the core file. It's just taking up space. On the other hand, if they're popping up with sufficient frequency to be a nuisance, you should inform your tech support. They'll probably be interested if one of their major systems is constantly crashing.
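To keep an eye on how often core files are turning up, a small script can walk your site directories and list them with their sizes. This is a sketch under a stated assumption: it only recognizes names of the form "core" or "core." followed by a decimal or hexadecimal number, in any capitalization. Other naming schemes exist, so adjust the pattern to whatever your server actually produces.

```python
import os
import re

# Assumed naming pattern: "core" or "core.<decimal or hex process ID>".
CORE_NAME = re.compile(r"^core(\.[0-9a-f]+)?$", re.IGNORECASE)


def find_core_files(root):
    """Return sorted (path, size_in_bytes) pairs for likely core files."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if CORE_NAME.match(filename):
                full_path = os.path.join(dirpath, filename)
                hits.append((full_path, os.path.getsize(full_path)))
    return sorted(hits)


# Example use (the directory name is only an illustration):
# for path, size in find_core_files("public_html"):
#     print(f"{size:>12}  {path}")
```

The script only reports; it deliberately does not erase anything, since tech support may want a copy first.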


### Byte Order Mark

From time to time, particularly if you work with UTF-8-encoded Unicode text, you may encounter a strange symbol at the top of your file: ï » ¿. No, your server has not been taken over by space aliens, or even by hackers. What has happened is that some brain-dead text editor that you or someone used "helpfully" added the UTF-8 Byte Order Mark (BOM) to your file. Some editors, especially those from Microsoft, think it's necessary to mark a file as UTF-8 if you've been editing under that mode.

Needless to say, despite what Microsoft thinks, this is not at all helpful. These three characters (i+umlaut/diaeresis (ï), xEF; right double guillemet/angle quotation mark (»), xBB; and inverted question mark (¿), xBF; to use their correct names and Latin-1 encodings) appear at the top of the page, and depending on which file they were inserted into, may cause "header" problems. You need to find and remove these three characters, using a text editor that won't go and insert them again just after you've removed them! Despite being from Microsoft, Notepad doesn't appear to do this, although it really isn't suitable for handling larger files. The free editors ViM and Notepad++ come highly recommended for general editing on your PC, and won't insert BOMs. Some editors give you the option of saving UTF-8 files "with" or "without" a BOM — be aware of this and always save "without BOM". Finally, an SMF utility such as file_check may be used to remove the BOM from files.
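If you would rather strip the BOM with a script than hunt for it in an editor, the check is simple: see whether the file's first three bytes are EF BB BF and, if so, drop them. The sketch below uses only Python's standard library; the function names are made up for this example, and it rewrites the file in place, so work on a copy until you trust it.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 Byte Order Mark, if one is there."""
    if data.startswith(codecs.BOM_UTF8):  # the three bytes EF BB BF
        return data[len(codecs.BOM_UTF8):]
    return data


def remove_bom_from_file(path) -> bool:
    """Strip a leading BOM from the file at path; return True if one was found."""
    with open(path, "rb") as handle:
        original = handle.read()
    cleaned = strip_utf8_bom(original)
    if cleaned == original:
        return False
    with open(path, "wb") as handle:
        handle.write(cleaned)
    return True


# Viewed as Latin-1, the BOM bytes are the three odd characters described above.
bom_as_latin1 = codecs.BOM_UTF8.decode("latin-1")
```

Reading and writing in binary mode matters here: it keeps Python from quietly decoding or re-adding anything on its own.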

Note that the BOM can "hide" from you if your PC is running in UTF-8 character encoding mode, or the output HTML is UTF-8. That is, you won't see those three odd characters on any page, including the "View Page Source" page, because browsers attach significance to those three byte codes together. They're recognized as the BOM, and usually ignored. Since the BOM is regarded as text being sent to the browser, it will trigger the sending of the headers to the browser, possibly generating later errors as PHP tries to set new HTTP headers via the header() call. This can lead to very mysterious "Cannot modify headers" errors, with nary a BOM in sight. In such cases, check what the current page encoding is. If it's UTF-8, changing it (browser menu item, usually under View) to Latin-1/ISO-8859-1/Western should make the BOM magically appear on the screen, confirming the diagnosis.

Finally, other multibyte character encodings, including other UTF formats, may have their own forms of Byte Order Marks (different codes than given above) that you might have to deal with.


### Proper Permissions

In order to function properly, yet minimize the chances of someone maliciously changing things on your site, directory ("folder", for Windows Weenies) and file permissions need to be properly set. There's a lot of misinformation out there, so let's set the record straight!

#### What do these funny numbers mean?

On a Linux server, you will see references to clusters of three digits, such as "644". This is shorthand for specifying who has what permissions to do what to a specific directory or file. First of all, there are three parties of interest: the "owner" (a.k.a. "user"), "group" members, and "others" (a.k.a. "world"). In most cases on a website, you can give "group" and "others" the same permissions, and treat them the same way. The one exception will be discussed later. The owner has the first number, the group the second, and others the third. So what are these numbers? Each digit is the sum of 4 for "read" (r), 2 for "write" (w), and 1 for "execute" (x), so 7 is rwx, 6 is rw-, 5 is r-x, 4 is r--, and 0 is no access at all. Thus 644 can equivalently be written as the letter triplets rw-r--r--.
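As a quick way to translate between the numeric and letter forms of a permission, Python's standard stat module can do the conversion. This is just an illustrative sketch; the helper names are invented here, and the digit arithmetic (read = 4, write = 2, execute = 1) is the usual Unix convention.

```python
import stat


def digit_to_rwx(digit: int) -> str:
    """Turn one permission digit (0-7) into its three-letter form."""
    return (
        ("r" if digit & 4 else "-")    # 4 = read
        + ("w" if digit & 2 else "-")  # 2 = write
        + ("x" if digit & 1 else "-")  # 1 = execute
    )


def describe_mode(mode: int, is_dir: bool = False) -> str:
    """Render an octal mode such as 0o644 the way a directory listing shows it."""
    file_type = stat.S_IFDIR if is_dir else stat.S_IFREG
    return stat.filemode(file_type | mode)


file_listing = describe_mode(0o644)              # owner rw-, group r--, others r--
dir_listing = describe_mode(0o755, is_dir=True)  # leading "d" marks a directory
```

Note the 0o prefix: Python needs it to read 644 as octal, whereas the chmod command assumes octal on its own.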

#### Who gets what permissions?

Now that you understand what the triplets of numbers (or equivalently, the triplets of letter triplets) mean, who should get what permissions? The general rule of thumb is to grant only sufficient permissions that a party needs to get the job done. That is, you don't grant "write" permission to someone who has no business writing to your directories or files! The owner gives themself the ability to write to directories and files, as presumably they need to do this once in a while, and they can be trusted not to double-cross themselves. Your "group" and "others" don't normally get the ability to write to a directory or file, as this can enable them to do much damage, either accidentally or maliciously. "Others" includes the owners of other sites on a shared server (anyone who has an account on the server). Depending on the server configuration and "upstream" directory permissions, visitors from the Web may or may not be able to act as "others". It's best to assume, unless you have knowledge to the contrary about your specific server setup, that some "others" may be on your server at some point.

The base permissions for a directory are 755 (drwxr-xr-x). These permit anyone to use the directory (e.g., run scripts in it), but restrict write permission to the owner. In certain cases, you might want the owner to normally run without write permission. This is possible with 555 permissions. You might want to deny "others" the ability to use a directory (e.g., it contains some tools or sensitive information not needed for normal site operation). The permissions could be 500 or 700 for that. Mix and match to suit your needs, but be leery of giving write permission (7) to either your group or others. You can always change permissions later (such as granting yourself write permission), but be careful not to give yourself (owner) "0" permissions (no access!), or you may then have to have your host change them for you!

The base permissions for an ordinary file are 644 (-rw-r--r--). These permit anyone to read the file (such as to run a PHP script), but restrict write permission to the owner (only the owner can update the file). In certain cases, you may want even the owner to run without write permission. This is possible with 444 permissions. You can deny others the ability to read the contents of a file with "0" permissions, such as 400 or 600. Mix and match to suit your needs, but be leery of giving write permission (6 or 7) to either your group or others. You can always change permissions later (such as granting yourself write permission), but be careful not to give yourself (owner) "0" permissions (no access!), or you may then have to have your host change them for you! Also note that denying access (0) to a file means that the website can't run that script or otherwise work with the file. For example, if you give 400 permissions to .htaccess, it will prevent a curious visitor from listing and viewing the file, but the server may not be able to make use of this file! Use other methods, such as "deny" clauses within .htaccess, to keep out nosy visitors.

#### My application is complaining that it can't write to...

First of all, some background. Depending on your server configuration, the server (such as Apache) and/or PHP may be running as "owner", "group", or even "others". That is, for the purpose of permissions, the server/PHP takes on the identity of the owner, or of the owner's group, or just a plain "other" random user. When does this matter to you? When the application program (such as SMF) needs to write to a directory or file, it needs "write" permission. If, for example, SMF wants to write to its "attachments" directory, and that directory has 755 permissions, SMF will not have write permission if it's running as "group" or "others"! It will issue an error message that it "can't write to the attachments directory". What to do? If you don't know if the server and/or PHP are running as group or others, try "group" first. Change the directory permissions to 775 and try the operation again. If it still can't write, the server and/or PHP are running as "others" and you need to go to 777. Then, SMF will have permission to write to the directory and everyone is happy. Likewise, for writing to a file (e.g., Settings.php), the file's permissions might have to be changed to 664 or even 666.

Note that osCommerce requires that its "configure.php" configuration settings files be Read-Only. Permissions of 444 will almost always work, but 644 may work too, if PHP is running in your group or as "other". The key to understanding this is that if PHP cannot write to the file, that's good enough, even if the owner (you) can. The whole point is that if the application can somehow be manipulated into trying to overwrite the configuration settings file, but is denied by the permissions, the file is safe. Further note that if you need to edit such a file, or upload a new copy of it, you will have to restore write ability to it (usually 644 permissions) first. Uploads and edits may appear to work even if you forget to make the file writable, but you'll notice that the file didn't seem to change! After editing and saving, remember to change the permissions back to what they were before (such as 444).

End of story? No! It's not safe to leave directories at 777 or files at 666. "Others" will be able to get in and do all sorts of damage (and believe me, they are constantly going around to sites twisting all the doorknobs to see what doors are unlocked = writable). If you had to make a directory or file "world writable" (777 or 666), change it back to 755/644 as soon as your application is done with whatever operation required it to write to the directory or file. That will minimize the chances of a Bad Guy getting in and planting a hack, or worse. Some ignorant people will blithely assure you that 777 is perfectly safe, or that the first thing you should do is change your directories (and even files) to 777. Don't! They're at best, sadly misinformed; at worst, deliberately setting you up for a hack. You may need to temporarily expose your directory or file to writing by "others", but close and lock that door as soon as you can! Remember the Golden Rule: grant only sufficient permissions to a user to let them do what they need to do, and no more.
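Since it is easy to forget a directory or file that was opened up "just for a minute", a periodic sweep for anything still world writable can help. The following is a minimal sketch using Python's standard os and stat modules; the function names are invented for this example, and it only reports what it finds rather than changing any permissions.

```python
import os
import stat


def is_world_writable(mode: int) -> bool:
    """True if 'others' have write permission (the last digit is 2, 3, 6, or 7)."""
    return bool(mode & stat.S_IWOTH)


def find_world_writable(root):
    """Return sorted (path, listing-style mode) pairs under root that others can write."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full_path = os.path.join(dirpath, name)
            mode = os.lstat(full_path).st_mode
            if stat.S_ISLNK(mode):
                continue  # a symlink's own mode bits are not meaningful here
            if is_world_writable(mode):
                hits.append((full_path, stat.filemode(mode)))
    return sorted(hits)


# Example use (the directory name is only an illustration):
# for path, listing in find_world_writable("public_html"):
#     print(listing, path)
```

Anything it lists is a candidate for going back to 755 (directories) or 644 (files) once the operation that needed the extra access is finished.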

If you have no choice but to permanently leave a directory "world writable" (777) for some purpose, such as letting users initiate SMF uploads of avatars or attachments, you will have to take measures to lock down this directory as much as possible. You won't be able to prevent Bad Guys (possibly only those on other accounts on your server, if not those coming from the Web) from writing arbitrary files to this directory, or altering other users' files, but you can limit the damage. Your software may be able to move user-uploaded files into a more secure directory (to prevent their being damaged or erased by attackers). You should have .htaccess controls that prevent scripts from being executed in any directory that may host user-created files (avatars, attachments). At least this will minimize the chance that someone uploads, say, an avatar that contains malicious PHP code that they can then invoke to do damage.

Some servers run security software, such as "suPHP", which will shut you down ("500" Internal Server error) if it finds a "world writable" (write permission for "others") directory or file (xx6 or xx7 permissions). Some may even do this for "group writable". Naturally, the server should be set up so that the server software (e.g., Apache) and/or PHP run as "owner" or, at worst, "group". Otherwise, a website could never write to its own files!

#### Settings.php

Some other applications, such as osCommerce, require that configuration files (configure.php) be Read Only. This is Read Only from the perspective of PHP and the server, not necessarily the file owner (you), so that the program cannot (through accident or malicious input by a hacker) overwrite the configuration file. It doesn't care whether or not you, as the owner, can write to the files, just that it can't. Anyway, if PHP and/or the server are running as "owner", you will want to change the permissions to 444. In SMF, the same applies if you wish to keep Settings.php from being overwritten by SMF — use 444 or (sometimes) 644.

#### Random thoughts

On Linux servers, the chmod ("change mode") command is used to change permissions. You can either give the numeric permissions:

chmod 444 Settings.php

or one or more of "u", "g", and "o"; "+" or "-"; and "r", "w", or "x". For example:

chmod u-w Settings.php

to change from 644 (read-write by owner) to 444 (read-only by all), by removing (-) write permission (w) from the user (owner, u). Generally it's easier, especially when dealing with a single file or directory, to use the numeric form. It's also less likely that you'll get crossed-up by confusing "o" for "others" with "owner" ("u", "user").
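The relationship between the symbolic and numeric forms is just bit arithmetic, which the sketch below spells out with Python's standard stat constants. The function names are hypothetical; the last helper applies the equivalent of "chmod u-w" to a real file through os.chmod.

```python
import os
import stat


def remove_owner_write(mode: int) -> int:
    """Equivalent of 'chmod u-w': clear the owner's write bit."""
    return mode & ~stat.S_IWUSR


def add_group_write(mode: int) -> int:
    """Equivalent of 'chmod g+w': set the group's write bit."""
    return mode | stat.S_IWGRP


def make_owner_read_only(path) -> int:
    """Apply 'u-w' to the file at path and return the new permission bits."""
    current = stat.S_IMODE(os.stat(path).st_mode)  # keep only the permission bits
    updated = remove_owner_write(current)
    os.chmod(path, updated)
    return updated


read_only = remove_owner_write(0o644)  # 644 with owner write removed
group_open = add_group_write(0o755)    # 755 with group write added
```

Here read_only equals 0o444 and group_open equals 0o775, matching what the chmod command would produce.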

Many servers are set up to not allow an FTP client to change file permissions. In such cases, you will have to go into your hosting service's site control panel, to the File Manager, and change permissions there. If you try using an FTP client to change permissions, even if it claims success, be careful to confirm that the permissions were actually changed.

If you are using the cPanel hosting control panel, be aware that it has a quirk that has tripped up many people. When you select a directory or file, and "Change Permissions" for the action, you will be presented with a grid of checkboxes. The columns are for "owner/user", "group", and "other/world", and the rows are for "read", "write", and "execute". Note that below the columns are entry fields with the equivalent permission numbers. You MUST tick/untick the checkboxes in order to change permissions. Do NOT overtype the numbers in the boxes — your changes will be ignored!

Never give yourself ("owner") 0 permissions. You may find yourself locked out and unable to make further changes to the file or directory, without the assistance of your hosting service administrators.

Why chmod instead of, say, chperm? This dates back to the early days of Unix (Linux's ancestor), where read-write permissions were known as the file's mode. Also, Unix's designers had something of a fetish about saving keystrokes on those old 110 baud Teletypes, so the "e" was dropped (chmod instead of chmode).

#### Windows is different

All of the above discussion pertains to Linux (or other Unix family) servers. If you're on a Windows server, the permission system is quite different. You have "read-write" (default) and "read-only" (attrib +r) permissions, but, not having used a Windows server, I can't tell you if there is a distinction between owner and users.

Also note that PHP's "chmod" function, which is modeled after the Unix/Linux chmod command, can do some odd things on a Windows filesystem. If Windows uses an "owner/others" model, there may be odd results if you try granting different permissions to "group" and "others". The mapping back and forth between the two models may produce unexpected results. For Web use, the only reason to do this would be because PHP is running as "group", which is inapplicable anyway on a Windows server. So the bottom line is that you need to understand the permissions model for your server's OS before you (or SMF code) start applying permission changes.

### 500 (Internal Server) Errors

A frequent question is "Why am I getting a 500/Internal Server Error?" Here are some of the most common reasons:

1. A PHP file was edited (or hacked), leaving blank or empty line(s) at the beginning or end of the file, or blanks (or other characters) before the first opening <?php or after the last closing ?>. This can happen during the insertion or removal of a mod, if not done correctly, during manual editing by the owner or webmaster, or when a hack attempts to write some code to your page. The solution is to search your PHP files carefully, looking for blank or empty lines or blanks that are out of place. The SMF file_check utility may be helpful in this regard.

Note that recent versions of the FileZilla FTP client seem to have developed a reputation for picking the wrong mode when you allow them to select the transfer mode automatically. FileZilla does have one genuine and severe problem: in automatic mode, if the file does not have any extension (or has an unrecognized extension), FileZilla will assume that it is text and always choose ASCII mode. This causes major problems if you use FileZilla to transfer the SMF attachments directory, because attachment names are hashed for security and have no extension. All image and document files will be corrupted by FileZilla, so use great care with that FTP client.

3. A directory or file uses forbidden permissions. Some security software, such as suPHP, will stop access to a directory or file and throw a 500 error if the directory or file is "world writable" (typically 777 or 666 permissions). See Proper Permissions for a discussion on when (if ever) you should be using such permissions.
4. There is an error in your .htaccess or php.ini file. .htaccess gets read before every access to a directory or file, so a coding error in it can really stop you cold.
5. PHP settings in the wrong place. Some older systems permit PHP configuration settings to be placed in the .htaccess file, using php_flag or php_value directives. Many systems, however, now require PHP settings to be placed in php.ini, and will throw a 500 error if you use the old fashioned method. Watch out for mods (or mod instructions) that tell you to put such flags and settings into your .htaccess: either the author is ignorant or the instructions haven't been updated in years.
6. Use of certain Mod Security settings in your .htaccess file can cause a 500 error. Review them, and comment them all out to see if that's causing the error. If it is, reintroduce them one at a time until you find the offending entry. Most modern applications, including SMF and osCommerce, have sufficient security built in that it is safe to turn off Mod Security (usually with an .htaccess entry) if your host has turned it on.
7. Requesting SSL (https:) service, when there is no SSL certificate for that exact domain, can cause a 500 error on some servers.
8. Your hosting plan is very "low end", and severely limits the number of simultaneous PHP processes you can run, or has some other resource limitation, and kills off "excess" processes. Your only solution in this case is to move to a better hosting plan.
9. Your host server is misconfigured in some way, and ends up spewing out large numbers of zombie processes. Most hosts will then kill your SMF installation to avoid swamping the system with the Walking Dead. All you can do as a temporary workaround is to run a cron job that every minute looks for zombie processes and kills them. That's not a long-term solution, but can keep your host from causing you grief while someone tries to figure out what happened.
10. Someone or something screwed up editing PHP code, leaving echo //something-removed;. To comment out such a line, the // must come first: // echo something-removed;. PHP 4 seems to tolerate echo //, but PHP 5 doesn't!
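
For reason 1, a quick scan can be automated. The following is only a sketch: `findStrayOutput` is a made-up name, and it assumes the file is meant to be pure PHP (such as Settings.php), with no HTML sections and no literal `?>` inside strings or comments.

```php
<?php
// Hypothetical helper: report stray output around the PHP tags of a script.
// $source is the full text of a .php file that is supposed to be pure code,
// so anything outside the tags counts as a problem.
function findStrayOutput(string $source): array
{
    $problems = [];
    if (strncmp($source, "\xEF\xBB\xBF", 3) === 0) {
        $problems[] = 'UTF-8 Byte Order Mark at start of file';
    } elseif (strncmp($source, '<?php', 5) !== 0) {
        $problems[] = 'characters before the opening <?php';
    }
    // A single newline directly after the final ?> is swallowed by PHP,
    // but anything more than that is sent to the browser.
    $close = strrpos($source, '?>');
    if ($close !== false) {
        $tail = (string) substr($source, $close + 2);
        if ($tail !== '' && $tail !== "\n" && $tail !== "\r\n") {
            $problems[] = 'characters after the closing ?>';
        }
    }
    return $problems;
}

// A leading blank and an extra trailing blank line are both reported.
print_r(findStrayOutput(" <?php\n\$x = 1;\n?>\n\n"));
```

To check a real file, pass it the result of file_get_contents() for that file.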

### "Headers already sent" Errors

You may sometimes experience a strange error similar to: Warning: Cannot modify header information - headers already sent. The cause of this is usually very simple. A PHP program, such as SMF, normally works this way:

1. Page is started by a request for a new page.
2. The PHP processor initializes its state, including an HTTP "header" section for things like the character encoding used, the HTTP version, the location, etc. This header section receives a bunch of default values (settings).
3. The PHP script (.php file) is loaded and execution is begun.
4. SMF makes PHP calls to "header()" function to assign new values to some of the header settings. For example, a new setting may be made for the "location":
header("Location: /some path/some new script name.php");
or, session or cookie information may be changed.
5. SMF sends the first characters to the browser, of HTML code and text to be displayed. This triggers the immediate sending of whatever's in the "headers" data of the page. This will be some mixture of default and explicitly set ("custom") headers. Then the first HTML code and text gets sent.
6. SMF sends more HTML code and text to build up the page.
7. Page creation is finished (the .php script ends) and the page is complete.

So what can go wrong? Plenty! Somewhere between steps 3 and 4, extraneous text is produced and sent to the browser. This might be an error message of some sort, a Byte Order Mark, an out-of-place code, or just a stray blank ahead of an opening <?php marker (or sometimes after the last ?> marker). It's not uncommon for sloppy editing to leave a blank or two outside of the PHP code section. This blank will be sent to the browser as text to be output. So? It will trigger the sending of whatever is in the "header" data (step 5). Once the "headers" have been sent to the browser, your page source (the executing SMF PHP code) is not permitted to make any more "header()" calls, because headers are only sent once per page. You get an error message that "headers already sent". So, if text was sent too early, before SMF has finished declaring its custom header settings — error!

#### How to fix it?

First, you need to get an idea of where the unwanted text is coming from. The error message should tell you two things: where the "header()" call is being made, that has been disallowed, and the file and line number where text output to the browser was actually started. It's this second item that gives you an idea of where to start looking. It will usually be one of the files, such as Settings.php or some language definition file, which are called quite early in building a page, and are not supposed to output anything to the browser by themselves. You also want to look at the page source the browser is displaying, accessed in your browser by View > Page source (or some menu item of similar name). Look at the very top of the page source — do you see something before the expected first text (e.g., a <!DOCTYPE> or <html> tag)? A space or blank? An error message? Some extraneous text from a botched edit or modification (such as ?>)? The UTF-8 Byte Order Mark ï » ¿ (which may be invisible to you if your page is displaying in UTF-8 mode!)? There are all sorts of things you might find there.

Regardless of what kind of text produced the problem, you may have to do some spade work to track down exactly what is producing it. If it's an error message, you need to resolve whatever caused the error in the first place. If it's a stray blank or other character, you need to remove it from the file.
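
If the warning itself is hard to get at, PHP can be asked directly where output started. This is a small sketch built on PHP's headers_sent() function; the wrapper name `describeEarlyOutput` is made up, and where you call it (just ahead of the failing header() call) is up to you.

```php
<?php
// Sketch: call this just before a header() call that has been failing.
// headers_sent() fills in the file and line where output first started,
// which is the same information shown in the warning message.
function describeEarlyOutput(): ?string
{
    $file = '';
    $line = 0;
    if (headers_sent($file, $line)) {
        return "Output already started in $file on line $line";
    }
    return null; // nothing sent yet, so header() calls are still allowed
}

$report = describeEarlyOutput();
if ($report !== null) {
    error_log($report); // goes to the PHP error log rather than the page
}
```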

A final note: such a "headers already sent" error message can produce a whole cascade of additional error messages, such as other "headers already sent" or "session not started". Don't worry about those. Concentrate on the first error; fixing it will often clear up all the others in the bargain.

### The .htaccess file

The .htaccess file is used by most Apache server configurations to specify various server settings. Some will also permit PHP settings to be entered here (see Querying/Setting my PHP settings), but you should use a php.ini file for that, provided it is supported. IIS (Windows) servers use a different method of specifying settings, and do not normally read the .htaccess file. So, don't just plunk down a provided .htaccess file on a Windows server and expect it to work; it probably won't (assuming you're running IIS instead of Apache).

An* .htaccess file has some interesting properties, depending on where it is found. Apache will read and process all the .htaccess files from the root (HTML's /) down the chain of directories all the way to the directory where the script file is executing. An .htaccess file will be read and fully processed before moving on to the next one down the line. Settings changed by an .htaccess file are semi-global, applying to that and every directory below the one where it was set. So, you may find yourself needing to "undo" actions or settings by "higher up" .htaccess files. That is often a sign that your site directory structure is poorly laid out.
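
To make the reading order concrete, here is a small sketch that lists the .htaccess locations in the order just described, from the HTML root down to the script's directory. The function name `htaccessChain` is made up, and the sketch only models the description above; it does not ask Apache anything.

```php
<?php
// Hypothetical helper: list the .htaccess locations that would be consulted,
// in order, for a script in $dir (a URL path such as "/forum/Themes").
function htaccessChain(string $dir): array
{
    $chain = ['/.htaccess'];
    $path = '';
    foreach (explode('/', trim($dir, '/')) as $part) {
        if ($part === '') {
            continue; // $dir was the root itself
        }
        $path .= '/' . $part;
        $chain[] = $path . '/.htaccess';
    }
    return $chain;
}

print_r(htaccessChain('/forum/Themes'));
// lists /.htaccess, then /forum/.htaccess, then /forum/Themes/.htaccess
```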

For example, the root .htaccess file may "turn indexing off" (Options -Indexes), to prevent visitors from listing files in directories that have no "index" file of their own. If you want to permit this automatic indexing in one particular directory, you would put an .htaccess file in that directory, with Options +Indexes to turn indexing back on. Note that this "indexing on" setting will apply in turn to any directories below that one, unless you turn off indexing with yet another .htaccess in a lower level directory. Turning "switches" on and off is easy, compared to undoing the effects of some .htaccess entries, such as URL rewriting. If you find yourself in such a sticky situation, you may need to move URL rewriting down to a lower level (e.g., not in the root), or even restructure your site so as to not apply the original URL rewriting in all cases.

Finally, your .htaccess file may provide a hacker with juicy bits of information on your server setup. You should use the appropriate .htaccess commands to prevent someone from reading its contents, and if publishing its contents in a support forum, be careful about how much you show.

* Debate rages over whether to say "a .htaccess" or "an .htaccess". It all depends on whether you pronounce the period (dot) — "a dot h-t-access" or "an h-t-access". It's a matter of personal preference (ain't English fun?), so long as you never say "a h-t-access". The same problem arises whenever you have a choice of spelling out an acronym or initialism that begins with a vowel sound letter, but is pronounced with a non-vowel sound: "Did you receive a SIGPLAN notice?" vs. "Did you receive an ess-eye-gee-pee-ell-aye-enn notice?".

Even though a subdomain or "add-on" domain may have its own "root" directory (e.g., /home/ACCOUNT NAME/public_html/subname/), with its own .htaccess file, the overall root .htaccess is still the first one read and executed! Apache does not begin with the subdomain or add-on domain's root .htaccess file. This is easy to forget.

### domain/ vs. domain/forum, or, put an application in the site root or a subdirectory?

It is frequently asked how to move an SMF or other application installation from /forum (or some other directory below the HTML root) up to the HTML root (/), or vice-versa. This can be done, but stop and think it through first. Moving installed applications up or down in the directory levels is generally not a good idea, especially once you are indexed in search engines and your readers or customers have your pages bookmarked.

You are usually better off leaving any major application in its own directory (folder), rather than putting it into the root. The only reason for putting it in the root would be to avoid having users see (or type in) the "/forum" part of the URL. This can be handled with URL rewriting, at least if you're on an Apache* server. The following code in your .htaccess file:

    RewriteEngine On
    RewriteCond %{REQUEST_URI} !^/forum [NC]
    RewriteRule ^(.*)$ /forum/$1 [L]

will transparently redirect visitors to your forum, while pretending to be in the site root. That is, visitors (and search engines) should not see that they have been taken to the /forum directory. Note that you need to drop the domain name (http://yoursite.com/) from the rewrite target in order to make the rewrite "silent" and not show the new address in the browser's address bar (otherwise, the default is a "302" redirect). It's probably best, in a configuration settings file such as Settings.php, to leave the full actual path (e.g., "/forum") in any URL or filepath definitions. There's no need to carry on a charade about where the visitor is; the point is just to automatically jump them from the root to the application's directory. This way, internal links show an address with "/forum", which is the true address, and which may come in handy later. It is certainly possible to use URL rewriting to completely hide the true location of an application, by generating only "/forum-less" links and adding "/forum" back to every incoming URL, but that's rarely necessary unless you're really trying to hide something (such as which of several installed application versions you are actually using).
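
If the rewrite condition and rule are hard to read, this PHP sketch imitates their effect on a request path. It is an illustration only, not how Apache evaluates them internally, and the function name `rewriteToForum` is made up.

```php
<?php
// Sketch of what the condition and rule do to a request path.
function rewriteToForum(string $requestPath): string
{
    // RewriteCond %{REQUEST_URI} !^/forum [NC]
    if (preg_match('#^/forum#i', $requestPath) === 1) {
        return $requestPath; // already under /forum: leave it alone
    }
    // RewriteRule ^(.*)$ /forum/$1 [L]
    // In a root .htaccess the pattern sees the path without its leading slash.
    return '/forum/' . ltrim($requestPath, '/');
}

echo rewriteToForum('/index.php'), "\n";       // prints /forum/index.php
echo rewriteToForum('/forum/index.php'), "\n"; // prints /forum/index.php
```

Note that the condition, as written, also skips any other path that merely begins with /forum, such as a hypothetical /forum-archive directory.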

If you always allow visitors to bookmark and search engines to index the real address of your application, nothing will break when you install another application (in its own directory) and build some sort of "landing page" or "splash page" in root /. This page would include links to the various applications you have installed. You are free to build this page right away, even if you have just one application, but visitors might be slightly annoyed by having to make an extra (and unnecessary) click. OK, if someone bookmarked the site root / as your application, that link will now go to the landing page, but that isn't too catastrophic.

Why do it this way, as opposed to actually moving your application (e.g., SMF) up one level to the root? There are several reasons. One is that all your site "applications" are neatly compartmentalized and kept separate from each other. When each application has its own directory subtree, it can be added, deleted, updated, and changed without fear of stepping on some other application's toes, i.e., without accidentally breaking a file used by another application. In addition, the site-wide files, such as .htaccess, robots.txt, php.ini, and nnn.shtml error handlers can be left cleanly in their own place (the site root) without fear that you'll accidentally change or delete one of them while working on your SMF (or other application) files.

Even more critically, if your root's (/'s) .htaccess file is customized to one particular application in the root, it may have nasty effects on other applications which are in subdirectories. For example, say you put WordPress in the root / and SMF in /forum. /.htaccess is set up for WordPress's SEO and possibly other settings tuned to WordPress. When running /forum's SMF, you must go through /.htaccess first, and then through any /forum/.htaccess specific to SMF. The latter will have to be modified to "undo" certain changes to the URL or the environment made for WordPress, and then do its own URL rewrites, etc., for SMF. You may end up having to modify WordPress's /.htaccess so that it leaves /forum destinations alone. Either way, you need an intimate knowledge of how .htaccess works. Contrast this with putting WordPress in /blog (along with its /blog/.htaccess) and SMF in /forum (along with its /forum/.htaccess). Now, your /.htaccess can be empty, or do things that apply on a sitewide basis, such as redirecting www. domain to non-www. or vice-versa. The /blog and /forum .htaccess files can be dedicated to those products without mixing in anything for sitewide use, or needing to avoid certain operations for certain destinations. Simplicity!

It may seem like wasted effort to do a URL redirect from your root to your SMF installation. If you are absolutely certain that you will never have anything other than, say, SMF installed, it may indeed be so. However, by leaving the application in its own directory, you have the flexibility to cleanly install other applications in the future, with your own custom landing page or Home Page in the root, to introduce your site and provide links to the various applications on your site. At that time, you would drop the rewrite from / to /forum.

* Most Windows-based servers run IIS, but some run Apache. Most Linux-based servers run the Apache server, although Nginx is increasingly popular. Nginx is similar to Apache, but not exactly the same. When someone says "Linux server", they may be referring to the underlying Operating System (Linux), or also to the Apache server running on top of it. Strictly speaking, Linux doesn't have to mean Apache, and Apache doesn't have to run on Linux, but a lot of people use Linux and Apache interchangeably, when they really shouldn't.

To this day, I can't understand why Google, et al., don't treat mydomain.com and www.mydomain.com as the same thing, but instead they treat them as separate sites, dinging you for duplicate content and forcing you to redirect one form to the other.

### Search Engine Optimization

First of all, understand that there are some general principles to SEO that will (probably) never change. Also understand that there are plenty of "services" which claim they understand Google's algorithms and can make your site do well. Unfortunately, as Google constantly changes their algorithms to stay ahead of those who are "gaming" the system, whatever artificial gains you see will soon disappear. "Black hat" techniques will hurt you even quicker.

• If you use <meta> keywords, make sure they honestly reflect the content of a page. Search engines don't pay all that much attention to meta keywords, but you can be dinged if they see you listing all sorts of stuff that isn't in your page. For example, "search term hijacking" is frowned upon — say you sell an athletic shoe knockoff called "Adidos". You load up your keywords tag with Adidas, Puma, Nike, Converse, New Balance, Keds, PF Flyer, etc. You are hijacking searches by consumers for specific shoe brands — unethical at best, and a trademark violation at worst.
• It's good to use a <meta> description tag, to provide something appropriate and readable for the summary when a search engine lists your page. Just keep in mind that the description isn't used all that much for ranking purposes.
• Use an appropriate <title> tag, preferably with an important keyword or two.
• Make sure you have plenty of text content for a search engine to look at. All nontrivial images should have alt= and title= attributes*. Remember that search engines can't look inside of images to see any text or content — you have to provide text. So, don't go overboard on doing your page in Flash, as a search engine won't see anything in there.
• All text is equal, but some text is more equal than others. To some degree, text in higher headings (<h1>, <h2>, etc.) has counted as a bit more important than other text. This may not still be true, but it doesn't hurt to use higher headings where appropriate that contain important keywords. That is, instead of making a new paragraph with heading text in bold and larger font, use an <hx> tag to do the job. You can always use CSS to fine tune the appearance. Remember that the whole point of header tags (and many other HTML tags) is to enable semantic markup, where your text is self-describing as to what its function is.
• Your text, in general, should contain the keywords you want to capture (what a visitor using a search engine might be looking for), in a natural, unforced manner. That is, use keyword-rich text, but don't go into contortions to try to fit in every keyword you can think of. Remember that people will be reading your pages, too!
• Avoid "black hat" techniques such as invisible/tiny-tiny text or repeating key words over and over. Search engines are now able to discover a lot of this, and will ding you badly for using such tricks.
• Watch out for duplicate page content. If two pages are seen as substantially identical, a search engine may ignore one of them, or penalize both (reducing the ranking). You may want to use a robots.txt file to keep search engines from cataloging more than one version of a page (say, catalog the main version, and ignore print and mobile versions).
• Run your site through a markup validator, such as http://validator.w3.org. While it is unnecessary to have an absolutely "clean" page (HTML and CSS) to get cataloged, if the search engine can't understand some of your markup, it may miss parts of your pages. Page display will likely be cleaner and more consistent across different browsers if it's clean markup, too, so it's worth the effort to at least get rid of the "stupid errors".
• Search engine spiders seem to have a short attention span, and may not read the entire page through. Consider
1. using CSS-assisted markup rather than heavy tables for layout: the less tag cruft the spider has to wade through, the sooner it will get to your real content.
2. front loading your page where possible, mentioning as many keywords and phrases as near the beginning as you can, to increase the chances that they'll be picked up. Some page designers even make the effort to define the divs holding key content up at the beginning of the page body, and use CSS positioning to locate the divs (and their text) in the right place on the page.
• Most authorities consider Search Engine Friendly ("SEF") links (along with .htaccess code to change them back into normal URL Query Strings) to be wasted effort. It used to be that search engines had a hard time dealing with URL Query Strings (/index.php?board=1234&topic=5678), so the recommendation was to write links (URLs) as something like /index.php/b/1234/t/5678.html, which .htaccess would convert back into the previously listed form (invisibly to the user). They cost a few extra cycles to convert in .htaccess, but probably do no other harm. They also look neater and better organized, and many people prefer them for that reason.
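
The conversion back to a query string is normally done by .htaccess rewrite rules, but the mapping itself is easy to show. This PHP sketch handles only the example format given above (/b/board number/t/topic number.html); the function name `sefToQuery` is made up.

```php
<?php
// Hypothetical mapping for the example format: /index.php/b/<board>/t/<topic>.html
function sefToQuery(string $url): ?string
{
    if (preg_match('#^/index\.php/b/(\d+)/t/(\d+)\.html$#', $url, $m) !== 1) {
        return null; // not one of our SEF links
    }
    // Build ?board=...&topic=... with an explicit "&" separator.
    return '/index.php?' . http_build_query(['board' => $m[1], 'topic' => $m[2]], '', '&');
}

echo sefToQuery('/index.php/b/1234/t/5678.html'), "\n";
// prints /index.php?board=1234&topic=5678
```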

Some forums and blogs will take it further, adding a normalized human-readable form of the topic title (along with a message ID number), something like /index.php/1234/5678/why-you-should-use-sef-link-formats.html. Well, to be honest, no one is going to type that in anyway, much less even remember it long enough to type it in. It needs to include the message ID number anyway, so not much is gained. There also seems to be a chronic problem with non-ASCII characters not always being handled properly and the removal of punctuation changing the meaning of a title.

SEF link URLs can have an arbitrary file extension, or no extension at all. Some sites like to hide the underlying technology of the site by always using an .html extension. This security through obscurity is really of little value. However, rewriting the name in that manner can allow you to change the underlying language at any time (e.g., from .php to .aspx), without invalidating all your visitors' bookmarks. For that reason alone, it can be of value.

Remember that search engine ranking criteria are constantly changing, so there's no point in putting a lot of effort into (or paying money for) "tricks" that will soon lose their effectiveness. Just keep the above points in mind when designing and maintaining your site, and you'll do just as well as if you had spent lots of money on SEO.

* Remember that up through (at least) Internet Explorer 6, Microsoft flouted standards (what else is new?) and used any alt attribute for a "tooltip", unless overridden by a title attribute. This is why you will often see a suggestion to use both attributes for any nontrivial image (title="" to suppress a tooltip, when an alt attribute is given). alt attributes are intended for display or audio output only when graphics are not used (e.g., by a screen reader, or when graphics are turned off for speed, or by search engine spiders). title attributes are intended solely for "tooltips", displayed only when your mouse pointer is over the image (or other HTML element).

### Querying/Setting my PHP settings

A common problem is finding what PHP settings your site is running with. You should know how to query PHP's current settings. This may be available in an application's Admin section, or you can write and run the following script (name.php):

<?php phpinfo(); ?>

That's it. You run this from your browser, and it will give you all sorts of information about the current settings, as well as the parameter names you will need to make changes. Run this "phpinfo" script before and after making changes, so you can confirm that your changes to PHP "took".
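
If you only care about a handful of settings, a narrower script shows just those. This is a sketch using PHP's ini_get() and php_ini_loaded_file() functions; the three setting names are merely examples, and any directive listed by phpinfo() can be substituted.

```php
<?php
// Sketch: print a few settings without exposing the full phpinfo() page.
foreach (['memory_limit', 'max_execution_time', 'upload_max_filesize'] as $name) {
    $value = ini_get($name);
    echo $name, ' = ', ($value === false ? '(unknown setting)' : $value), "\n";
}
// Shows which php.ini file (if any) PHP actually loaded.
echo 'Loaded php.ini: ', (php_ini_loaded_file() ?: '(none)'), "\n";
```

The same caution applies as for the phpinfo script: don't leave it lying around once you're done.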

It's not good to leave this script lying around, for snoopy people to run. While a hacker can't make any changes to your site with this script, they can potentially learn useful tidbits of information about your server configuration that they might be able to use to break in. When you're done with it, erase it or move it out of your site (e.g., above public_html/), or change its permissions to 600 (and test that it can't be run from a browser). At the very least, give it a very obscure name that no hacker will ever guess.

So what can you do with this information? Any server configuration worth the name will allow you to alter many of these settings. You will be subject to various limits imposed by your hosting service (such as allowable run time and maximum memory size used). You will put your changed settings in a file where the server knows to look. Old server software may still use entries in the .htaccess file, although most servers today expect entries to be in a php.ini file. The exact syntax of entries may vary according to the server configuration, so consult with your hosting service.

#### .htaccess

In an .htaccess file, it's usually done with php_flag (for true/false or on/off settings, usually as 1/0 values) or php_value (for anything other than binary settings). It's something similar to

    php_flag register_globals 0
    php_value memory_limit 8M

Note that in many older products, you will often see sample PHP settings given in sample .htaccess files. It is up to you to determine if this is the correct method, or if you should transfer these settings to a php.ini file. Some servers that require you to use php.ini will give you a "500" error if you insist on putting PHP settings in the .htaccess file.

#### php.ini

Most PHP installations require settings to be placed in a file named php.ini. The syntax is different from settings in .htaccess:

    register_globals = off
    memory_limit = 8M

Again, consult your hosting service for the exact syntax required, whether a different name is used, and on some servers, whether you're allowed at all to change PHP settings.
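
Size settings such as memory_limit use a shorthand (8M means 8 megabytes), which can make before-and-after comparisons awkward. The sketch below converts the shorthand to a byte count; `shorthandToBytes` is a made-up helper, not a PHP built-in, and it covers only the K, M, and G suffixes.

```php
<?php
// Hypothetical helper: convert php.ini shorthand such as "8M" to bytes.
// The suffixes K, M and G are powers of 1024.
function shorthandToBytes(string $value): int
{
    $value = trim($value);
    $number = (int) $value; // leading digits, e.g. 8 from "8M"
    switch (strtoupper(substr($value, -1))) {
        case 'G':
            return $number * 1024 * 1024 * 1024;
        case 'M':
            return $number * 1024 * 1024;
        case 'K':
            return $number * 1024;
        default:
            return $number; // plain number of bytes
    }
}

echo shorthandToBytes('8M'), "\n"; // prints 8388608
```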

Some server setups require you to tell the system where to find the php.ini file. This is usually a line in .htaccess similar to

suPHP_ConfigPath /home/ACCOUNT NAME/public_html

or whatever directory you placed the php.ini file in. Your hosting service can tell you for sure what to do. A couple of notes:

1. Whether you use a php.ini file or a httpd.conf file (or something else) depends on your server setup. Many shared hosting systems let you create your own php.ini files, while VPS or dedicated systems may make you edit system configuration files. As usual, ask your hosting service if you don't know.
2. On a shared server, it's not uncommon for no default php.ini file to exist. That is, you will not find a php.ini file provided for you, no matter how hard you look. You have to create the file yourself, either using your host's control panel file editor, or creating it on your PC and uploading it.

Note: Some hosting services use a different file name than php.ini. At least one is known to use php5.ini for PHP 5 systems, and other names are possible. If php.ini doesn't work, you should consult your hosting service's tech support to see if a different name should be used, if .htaccess needs to be updated, or if there are any other differences.

Finally, your php.ini file may provide a hacker with juicy bits of information on your server setup. You should use the appropriate .htaccess commands to prevent someone from reading its contents, and if publishing its contents in a support forum, be careful about how much you show.

### Blank page/White Screen of Death

Sometimes when you expect a page to come up, you instead get a blank white screen. The usual cause is a serious syntax error in PHP, which caused processing of the page to stop dead. The first thing to do is to see if there is any HTML for the page. Go to View > Page source on your browser and look at the "source" of the page. Often it will be blank, but sometimes you'll get lucky and there will be some text* and maybe even an error message.

There are at least three other places to look for error messages. With luck, your PHP processor is configured to "log" somewhere any errors it encounters. It is common to drop a file named "error_log" (or something similar) in the directory of the failed script. It might be under a different name, or all errors might be logged in some central location on your site. You just have to learn what your system does. That log should list the specific error and the file and line it came from. Instead, or in addition, there may be an error log for your system (such as cPanel > Error log). You can try this log, but PHP errors are rarely logged here. Finally, your application (SMF, etc.) may have its own error log. In SMF, you go into the Admin section and ask to display the error log. It's kept in the database, separate from system error logging, and if errors are being produced fast enough, can run you out of database space!

In extreme cases, you may have error reporting settings that don't log the error anywhere that you can find it. Usually this means that someone put an ini_set() call on the page to suppress error reporting. If so, you need to temporarily disable it (comment it out or change the setting) in order to get the correct error message.
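
Going the other way, you can temporarily force errors to be reported. These few lines are a sketch of commonly used debugging settings, placed near the top of the script that starts the page. If the syntax error is in that same file, the lines never get a chance to run, so put them in a file that does load cleanly.

```php
<?php
// Temporary debugging settings: remove them again once the error is found.
error_reporting(E_ALL);         // report every kind of error
ini_set('display_errors', '1'); // show errors in the page output
ini_set('log_errors', '1');     // and also write them to the error log
```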

Once you find the reported error, you need to fix it. If it's beyond your programming prowess, ask about it on the application's discussion board. In SMF, don't forget to see if the error message mentions "eval". If it does, you need to Disable Eval and run some more to get new errors with the correct file and line number.

All error logs should be checked on a regular basis, not just when you have a WSOD. Reported errors should be dealt with (fixed) without delay, and error logs trimmed back (emptied) so that they don't grow excessively large. This is of particular importance for error logs kept in the database, such as SMF's. I run a nightly cron job that alerts me to the presence of new error_log files on my site, so I can investigate them before too much damage (lost visitors) is done.
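
A check of that kind might look something like the following. This is a sketch, not the actual cron job mentioned above: the function name `findErrorLogs` is made up, and it assumes your PHP drops files named exactly "error_log".

```php
<?php
// Sketch of a check a nightly cron job could run: list every file named
// "error_log" below $root, so the list can be mailed or logged.
function findErrorLogs(string $root): array
{
    $found = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if ($file->isFile() && $file->getFilename() === 'error_log') {
            $found[] = $file->getPathname();
        }
    }
    sort($found);
    return $found;
}
```

You would call it with the full path to your site's directory (for example, your public_html directory) and send yourself the result if it isn't empty.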

* If you're using some versions of Internet Explorer, don't get too excited upon seeing a non-blank page source. Some IE levels will output a dummy skeleton of a page, with perhaps one <meta> tag and no <body> content. In other words, it's useless for diagnosing what went wrong.

#### A special note for SMF

Some people experience intermittent WSODs on SMF — refreshing the page once or twice will usually bring it up OK, with no errors logged. In such a case, try going to the SMF Admin panel and Disable Hostname Lookup. Sometimes that will do the trick (for this and many other intermittent problems).

SMF 2 can also experience this error if its cache is corrupted. You can try erasing all data_*.php files in the "cache" directory (leave .htaccess and index.php alone).
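
If you can't easily get at the files by FTP or a file manager, the same cleanup can be scripted. This is a sketch with a made-up function name (`clearSmfCache`); it deletes files, so check the directory path carefully before running it.

```php
<?php
// Sketch: delete SMF 2 cache entries (data_*.php) from $cacheDir and
// return how many files were removed. .htaccess and index.php are left
// alone because they do not match the pattern.
function clearSmfCache(string $cacheDir): int
{
    $removed = 0;
    foreach (glob(rtrim($cacheDir, '/') . '/data_*.php') ?: [] as $file) {
        if (is_file($file) && unlink($file)) {
            $removed++;
        }
    }
    return $removed;
}
```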

### File transfer modes: binary vs. ASCII

There are generally two kinds of file transfer modes, used by FTP when moving a file up from PC to server, or down from server to PC. These are binary and ASCII (also known as text).

Why the difference? Text files (human-readable files such as PHP scripts, .txt files, etc.) have different "end of line" conventions to mark the end of a line of text (the physical end of line, where you hit the Enter key to break and go to a new line). Microsoft products (DOS, Windows) use "CRLF" (carriage return and linefeed), hex codes 0D 0A. Linux and other Unix-family systems use a "newline" (hex 0A). Note that "linefeed" is the official ASCII name for this character, but the C language and Unix operating system established a convention of calling it a "newline" in this context. Classic Mac OS (before OS X) used a carriage return (hex 0D) to mark the end of a line; modern macOS, being Unix-based, uses a newline like Linux. Three different families of operating systems; three different conventions for a file to mark the physical end of a line.
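To see which convention a given file actually uses, you can simply count the three kinds of line end in its raw bytes. The Python sketch below does that; the function name is hypothetical and it is only an illustration, not a tool mentioned elsewhere on this page.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF pairs, bare LFs and bare CRs in raw file bytes."""
    crlf = data.count(b"\r\n")
    # Each CRLF pair contains one CR and one LF, so subtract the pairs
    # to get the CRs and LFs that stand alone.
    cr = data.count(b"\r") - crlf
    lf = data.count(b"\n") - crlf
    return {"CRLF": crlf, "LF": lf, "CR": cr}
```

Read the file in binary mode (open(path, "rb")) before counting, so that Python itself doesn't translate the line ends. A file with a mixture of counts has probably been through a bad transfer or a careless editor.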

There is another class of files that are not human readable (as ASCII text). These are the binary files, and include almost all images, as well as executable modules or object files (things you normally won't deal with in a Web environment). Various "backup" and "archive" formats, especially if compressed (e.g., .tgz), are also binary. It is critical that these files not be modified in any way during upload or download.

Most servers you will deal with are Linux-based, although there are a fair number of Windows servers out there. For text file upload, ASCII mode transfer is used to convert your PC's convention of line ends to what the server wants to use. E.g., if you have a Windows PC and your new .php file has CRLFs to mark the end of lines, the CRLFs have to be changed to newlines during transfer to your Linux server, and vice-versa on download. If it's a Windows server, nothing would have to be done, but it's a good idea to get in the habit of specifying the correct mode, in case you ever switch over to a Linux server.

So why not just pick one mode and transfer everything that way? Well, you don't want to transfer image files in anything but binary mode! These files are likely to have a mixture of CRs, LFs, and even CRLFs in their data, none of them meaning "end of line". If any of these are converted during transfer, it will corrupt the file. And if you try the reverse transfer in the same mode to "undo" that, you're likely to cause further corruption. Say you have some CRLFs and some LFs in the binary data. You unwittingly upload from your Windows PC to your Linux server in ASCII mode. The CRLFs are changed into newlines (LFs) but the existing LFs are usually left alone. The image doesn't show correctly. "No problem, I'll undo my mistake by downloading in ASCII, and then upload in binary." Nope. All the LFs will now be converted to CRLFs, including those that were originally just LFs. The file will probably be even more corrupted now!
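The scenario above is easy to simulate. The Python sketch below models the two ASCII-mode conversions as plain byte replacements, which is a simplification (real FTP clients and servers vary in the details), and the "image" bytes are made up for the example.

```python
def ascii_upload_to_linux(data: bytes) -> bytes:
    """Windows to Linux in ASCII mode: CRLF becomes LF, bare LFs untouched."""
    return data.replace(b"\r\n", b"\n")


def ascii_download_to_windows(data: bytes) -> bytes:
    """Linux to Windows in ASCII mode: every LF becomes CRLF."""
    return data.replace(b"\n", b"\r\n")


# Made-up binary data containing two CRLFs and one bare LF.
original = b"\x89IMG\r\n\x00\x01\n\x02\r\n\xff"

on_server = ascii_upload_to_linux(original)        # two bytes shorter
round_trip = ascii_download_to_windows(on_server)  # one byte longer than original

print(len(original), len(on_server), len(round_trip))  # 13 11 14
print(round_trip == original)                           # False
```

The bare LF in the original comes back as a CRLF, so the "undo" produces a third, different file rather than restoring the first one.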

It is possible to round-trip text files (download and then upload) in binary mode. You shouldn't upload a text file with CRLF line ends in binary mode to a Linux server, as Linux often won't like the extra CRs in the text. However, you can binary download a text file from Linux to Windows or Mac, work with it if you wish, and then upload it in binary. The only drawback is that the file on your PC won't match your operating system's convention for line ends. You need to use an editor (e.g., Vim) that is comfortable handling any line end convention.

Note that many FTP clients, when asked to "bulk" transfer multiple files in one go, will choose binary mode. For single files, if you specify "automatic" mode detection, a client may go by the file extension, or look inside the file and try to figure out what the convention is (e.g., only human-readable text, with no control characters except for CRLFs and tabs, might be assumed to be ASCII text — otherwise, binary). If you allow your FTP client to pick the transfer mode, be sure to understand what it's doing, and be prepared to override its choice if you realize that it picked the wrong mode. Remember that file upload (or download) copies (not moves) files, so you should always be able to re-transfer the original file, provided you catch the mistake in time, before you erase or overwrite the original file.
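To give a feel for the "look inside the file" kind of detection, here is a deliberately crude Python sketch. It is a hypothetical heuristic of my own, not the logic of any particular FTP client: a sample counts as text only if every byte is printable ASCII, a tab, a CR, or an LF.

```python
TEXT_CONTROL_BYTES = {0x09, 0x0A, 0x0D}  # tab, LF, CR


def looks_like_text(sample: bytes) -> bool:
    """True if the sample holds only printable ASCII plus tab, CR and LF."""
    return all(0x20 <= b <= 0x7E or b in TEXT_CONTROL_BYTES for b in sample)


def guess_mode(sample: bytes) -> str:
    """Pick a transfer mode for a file, given its first few bytes."""
    return "ascii" if looks_like_text(sample) else "binary"
```

Note how easily this guesses wrong: a Latin-1 text file with a single accented letter would be sent as binary, which is exactly why you need to be prepared to override an automatic choice.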

See the Error 500 discussion for special warnings concerning the popular (but flawed) FileZilla FTP client!

In summary, do not mix FTP transfer modes on a given file — if you transferred in ASCII in one direction, transfer in ASCII the other way. Be careful to always transfer binary files (images, backups, etc.) in binary mode, as they will be terribly corrupted by ASCII mode. Binary down and back up is usually safe, but binary upload of a text file can cause trouble if its line end convention doesn't match the server's.


### Microsoft “Smart Quotes”

In its infinite wisdom, Microsoft extended the Latin-1 (ISO-8859-1) character set to use many of the "upper" control character slots as special characters. These are found in positions hex 8x and 9x, where there are normally a number of seldom-used control characters. This extended Latin-1 is designated CP-1252 (a.k.a. Windows-1252), and is commonly the default encoding (character set) on English and Western-European Windows PCs. There are some other CP-125x encodings based on other ISO-8859-x encodings that you might run into from time to time. Microsoft products, such as Word and Outlook, by default make use of these characters in a feature called "Smart Quotes". You type in "Smart Quotes"-eligible character(s), and Word automatically changes them to their “Smart Quotes” equivalent, which is more typographically correct. Did you see the difference ("" changed to “”)? Unfortunately, these characters are using an encoding not often used on Web sites.

Many web applications, such as SMF, by default use Latin-1 for their page encoding (as well as database and English language support). What happens when you "cut and paste" from a word processor document using "Smart Quotes" is that the character byte values come over unchanged, including the non-standard characters used for "Smart Quotes". The visual difference between a " and “, for example, can be quite subtle and easy to overlook. The effect of having what your browser understands to be a control character in the midst of your text can range from that character simply vanishing, to all the text following it also vanishing. If your page is set up to use UTF-8 (standard for non-English and non-Western European languages), your browser will regard these 8x and 9x characters as "invalid" and refuse to display them (or sometimes, worse). Or, you may get lucky and the browser will convert the "Smart Quotes" into proper UTF-8 characters. It depends on the behavior of the source (e.g., Word), the Operating System's clipboard, and the behavior of the receiving application (picking the UTF-8 version from the clipboard, if available). Some people will assure you that this behavior is guaranteed, but it's not!
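You can see the three possible interpretations of the same bytes with a few lines of Python. This is only an illustration of the encodings involved, not of what any particular browser does; the sample bytes are the CP-1252 codes for the opening and closing "Smart Quotes" (hex 93 and 94).

```python
raw = b"\x93Smart Quotes\x94"  # bytes as pasted from a CP-1252 source

# Read as CP-1252, the two bytes are the curly quotes U+201C and U+201D.
as_cp1252 = raw.decode("cp1252")

# Read as strict Latin-1, the same bytes are C1 control characters.
as_latin1 = raw.decode("latin-1")

# Read as UTF-8, the bytes are not valid at all.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(repr(as_cp1252))  # '“Smart Quotes”'
print(repr(as_latin1))  # '\x93Smart Quotes\x94'
print(utf8_ok)          # False
```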

So, what can you do to avoid these problems? The best thing is to avoid using Microsoft products, at least when composing entries for a web-based forum. It's incredible that some people will actually fire up Word to type in their posting, cut and paste the result into their forum, and discard the Word document! It's somewhat understandable if you're going to keep the Word document anyway, for other purposes, but otherwise it's just plain stupid, as all the formatting is lost anyway when you cut and paste, and you have to edit the post to re-insert BBCode markup. Plus, you now have the problem of the text being poisoned by "Smart Quotes" characters. At the least, you can shut off the "Smart Quotes" feature while composing text destined for the Web (although it's a bit of a pain, as it's a multistep process going deep into Word's configuration). Note that this will leave your resulting document rather ugly, if it is to be printed out or displayed in Word in the future (because it's using plain ASCII characters).

If you and your members want to create Word documents for other purposes (printing and display), and happen to want to cut and paste the text into a Web-based application (e.g., SMF), what can you do?

1. Turn off "Smart Quotes", and end up with ugly ASCII-only text in the document.
2. Manually edit the text after cutting and pasting, to replace "Smart Quotes" with regular characters (e.g., “ with "). This way, your document still looks typographically good, while your SMF posting is clean, but at the cost of a lot of extra work (and edits you're bound to miss).
3. Extend SMF to change "Smart Quotes" to proper text on the fly.
4. Change the page encoding that SMF uses. If your SMF installation is currently using Latin-1 (ISO-8859-1), you should be able to safely change the encoding (character set) from Latin-1 to CP-1252. CP-1252 is a superset of Latin-1. The database (MySQL) shouldn't care all that much, although any sorting involving these "Smart Quotes" characters may produce slightly unexpected results (“ won't sort in the same place as ").
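For option 3, the heart of any such extension is just a byte-for-byte substitution table. SMF itself is written in PHP, so the Python below is only a sketch of the idea, not SMF code; the function name and the choice of ASCII replacements are my own assumptions. It is meant for single-byte (Latin-1 or CP-1252) text only, since the same byte values occur inside multi-byte UTF-8 characters.

```python
# CP-1252 byte values (hex 8x/9x) and a plain ASCII stand-in for each.
SMART_TO_PLAIN = {
    0x85: b"...",  # ellipsis
    0x91: b"'",    # left single quote
    0x92: b"'",    # right single quote (also used as apostrophe)
    0x93: b'"',    # left double quote
    0x94: b'"',    # right double quote
    0x96: b"-",    # en dash
    0x97: b"--",   # em dash
}


def plain_quotes(data: bytes) -> bytes:
    """Replace CP-1252 "Smart Quotes" bytes with ASCII; leave the rest alone."""
    return b"".join(SMART_TO_PLAIN.get(b, bytes([b])) for b in data)
```

For example, plain_quotes(b"\x93Hi\x94") returns b'"Hi"', while ordinary Latin-1 bytes such as an accented letter pass through unchanged.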

If your current SMF is using Latin-1, it should be safe to change to CP-1252. This would involve changing (or adding) a specific entry in the SMF database. In phpMyAdmin, go to the smf_settings table (the name may be different if you chose to use a different "prefix" than smf_). Find the "global_character_set" entry, if it exists. If you have a default installation, there may be no such entry (in which case SMF defaults to Latin-1), and you have to insert a new record with "variable" = "global_character_set" and "value" = "CP-1252". If such an entry already exists, change "value" to "CP-1252". In either case, do not enter the quotes ". Warning: do not do this if global_character_set already exists and is "UTF-8". It should work only if you are currently in Latin-1 (ISO-8859-1) encoding. Check that the <head> section of your resulting SMF pages includes a <meta> tag defining the charset to be CP-1252.

Most browsers will support CP-1252 encoding. Some will even use it when you request ISO-8859-1! If not, try changing it from CP-1252 to its alternate name, Windows-1252, and see if that works. If you have problems with either, simply back out your change to the "settings" table (remove the inserted record, or change the value back), and you should be back to where you started.


## Simple Machines Forum-specific items

### Enable/Disable Eval

Most error messages in SMF are complicated by the fact that they refer to "eval" and don't show the exact location of the error. You need to "disable" the eval function in order to get the correct file and line number, and then re-enable it once you've done that. The easiest way is (in SMF 1.x) to install mod 2054 to give you an admin panel interface (eval mod) to turn eval off and on. Note that this function was built into production releases of SMF 2.0, and is available through Admin. Please do not report an error if its error message includes a reference to "eval" or "eval?". It really annoys support people to have to ask again and again to turn off "eval" and give the correct file and line number!

Turning off "eval" can also be done manually via phpMyAdmin or some other interface to run SQL queries. You may choose to do this if you are unable to install the mod, or don't wish to:

REPLACE INTO smf_settings VALUES ('disableTemplateEval', 1);

If your forum is set up to use a different table prefix than smf_, use that instead.

Some people will tell you to use the following command:

INSERT INTO smf_settings VALUES ('disableTemplateEval', 1);

but this will fail if you already have a "disableTemplateEval" entry, so it's best to ignore that suggestion.
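If you want to convince yourself of the difference without touching your live forum, the Python sketch below reproduces it using an in-memory SQLite database as a stand-in for MySQL. The two-column table with "variable" as its primary key is an assumption made for the demo, not a dump of SMF's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed stand-in schema: one row per setting, keyed by its name.
conn.execute("CREATE TABLE smf_settings (variable TEXT PRIMARY KEY, value TEXT)")
conn.execute("INSERT INTO smf_settings VALUES ('disableTemplateEval', '0')")

# A plain INSERT fails because the key already exists.
try:
    conn.execute("INSERT INTO smf_settings VALUES ('disableTemplateEval', 1)")
    insert_failed = False
except sqlite3.IntegrityError:
    insert_failed = True

# REPLACE works whether or not the row was already there.
conn.execute("REPLACE INTO smf_settings VALUES ('disableTemplateEval', 1)")
rows = conn.execute("SELECT variable, value FROM smf_settings").fetchall()

print(insert_failed)  # True
print(rows)           # [('disableTemplateEval', '1')]
```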

Disable eval by either method. Clear your error log. Run for a bit to generate some new error messages with the correct file name and line number(s). These are the file name and line numbers that you will report in the SMF support forum. Note that it is possible that after all this effort, the file name and line numbers might not change! At least, you should no longer see anything in the error message about "eval".

When you have collected the correct file name and line number(s), you should re-enable eval. If you installed the SMF admin mod, just go there and click to enable eval. If you manually changed the database via phpMyAdmin:

REPLACE INTO smf_settings VALUES ('disableTemplateEval', 0);


### SMF Falsely claiming a database upgrade is required

This might have been fixed at some point, but for a long time many users were baffled by ominous error messages that they needed to upgrade their databases! This false message was triggered by a simple check of the program version against the stored database version. The problem is that SMF's database layout usually changes very slowly, staying stable for many program releases, and its maintainers simply kept forgetting to update the database version setting! This is an example of a poor build process. Anyway, I played around with PHP code to map certain DB versions to ranges of program versions, but didn't carry it through to something releasable. This is another thing that the developers claimed was no problem, so there was no chance of it being incorporated into the product.


All content © copyright 2005 – 2022 by Catskill Technology Services, LLC.