So, you want to add another language to your website. I’ll be concentrating primarily on Japanese, since that’s what I’m most familiar with. I used to be the system admin and web programmer for www.shopjapan.co.jp, and I also built Customer Relationship Management (CRM) solutions for Fujitsu in Nagoya.
Let’s start with the basics. If you are only familiar with English, get ready for the exciting (and often frustrating) world of “double-byte”. In English, because of the limited number of alphanumeric characters (the alphabet in upper and lower case, numbers, and some special characters), every character fits into one byte. Other languages need more room for their “alphabet”, typically two bytes per character. Japanese, for example, has over 2,000 Chinese characters, called kanji. Plus, they are “lucky” enough to have two other alphabets: hiragana and katakana.
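To make the byte counts concrete, here is a small sketch in Python (my choice for illustration; the helper name `byte_length` is made up for this example). It simply counts how many bytes one character takes up in a few encodings.

```python
def byte_length(text, encoding):
    """Return how many bytes `text` occupies once encoded."""
    return len(text.encode(encoding))


if __name__ == "__main__":
    # One Latin letter, one hiragana, one kanji.
    for ch in ("A", "あ", "漢"):
        for enc in ("shift_jis", "euc_jp", "utf-8"):
            print(ch, enc, byte_length(ch, enc))
```

The letter A takes one byte everywhere, while the hiragana and the kanji take two bytes each in Shift-JIS and EUC-JP. In UTF-8 they take three, which is one reason “double-byte” is only a loose label for Unicode.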
The frustration comes when you have to use older US programs and applications that are not double-byte compatible. (The term “double-byte” isn’t all that common today; nowadays you’ll more often hear “Unicode”. The two aren’t strictly the same thing, but they’re close enough in meaning that I’ll use “Unicode” for the rest of this discussion.) Unicode is now becoming the standard for most developers because it allows for these double-byte characters. That means you can use almost ANY language in there. What happens with older, non-Unicode programs is that they try to treat what you write in Japanese as a single-byte string, so it comes out as a bunch of garbage. Yes, it is VERY helpful if you can READ the language you are working with; otherwise the real language and the garbage look pretty similar. 😉
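Here is a rough Python sketch of how that garbage comes about. It assumes the text was saved as Shift-JIS and then read by a program that thinks every byte is one Latin-1 character; the function names `garble` and `ungarble` are invented for this illustration.

```python
def garble(text, real_encoding="shift_jis", assumed_encoding="latin-1"):
    """Encode text one way, then decode the bytes as if they were single-byte."""
    return text.encode(real_encoding).decode(assumed_encoding)


def ungarble(garbled, real_encoding="shift_jis", assumed_encoding="latin-1"):
    """Undo garble() by turning the characters back into the original bytes."""
    return garbled.encode(assumed_encoding).decode(real_encoding)


if __name__ == "__main__":
    broken = garble("日本語")
    print(repr(broken))      # six unrelated single-byte characters
    print(ungarble(broken))  # the original three characters again
```

The repair only works here because Latin-1 maps every possible byte to a character. Once a program has dropped or replaced bytes along the way, the original text is gone for good.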
If you use a Windows computer and you want to work with Japanese, one of the first things you’ll need is Windows 2000 or newer, with the Asian Language Pack installed. If you didn’t install it initially, you can add it later in XP by going to Control Panel, Regional and Language Options, Languages, Install files for East Asian languages. You’ll need the original XP install disk to do this. You’ll eventually get a taskbar icon that lets you switch between English and Japanese, so you can then type nihongo into your application.
Now that you can type Japanese, you’ll have to see what happens when you enter it into the application you want to use. For newer applications, Unicode is standard, so you shouldn’t have huge problems. In fact, the default install of WordPress uses UTF-8, a Unicode encoding. (Yes, you’ll have to start learning about character sets.) Unicode generally renders OK for Japanese input. However, here’s an important tip: always test your work with someone representing your target audience. That means making friends with some nice Japanese person willing to check your page ON THEIR MACHINE. Most often their machine will run a completely Japanese OS, rather than just Japanese fonts added on top of an English OS.
Next, you’ll probably also have to add Japanese support to your web browser, Firefox or IE or whatever you suspect most of your audience will use. Test the page yourself: can you see the Japanese? No, it’s garbage? Then make sure you set the correct ENCODING in the browser. That’s usually under View, Encoding in most browsers. There are three main Japanese encodings: Shift-JIS, EUC-JP, and ISO-2022-JP, along with the generic (and fail-safe) UTF-8 (our friend, Unicode). If you build a site SPECIFICALLY for Japanese, then it makes sense to use either Shift-JIS or EUC-JP, the two most common. This is particularly important when it comes to email, a REALLY frustrating problem. (Email in general is frustrating because there are so many email clients out there, plus things like CDONTS for ASP and PHP Mail don’t natively support those Japanese character sets well, but there are workarounds. I’d recommend doing Google searches once you’re that far along.)
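To see why the browser’s encoding setting matters, here is a small Python sketch. It encodes the same three characters in each of the four encodings mentioned above and then tries to read the bytes back with a different one. The helper names `encode_all` and `decode_or_flag` are made up for this example.

```python
JAPANESE_ENCODINGS = ("shift_jis", "euc_jp", "iso2022_jp", "utf-8")


def encode_all(text):
    """Return a dict mapping each encoding name to the encoded bytes."""
    return {enc: text.encode(enc) for enc in JAPANESE_ENCODINGS}


def decode_or_flag(data, encoding):
    """Decode bytes with the given encoding, or return None if they don't fit."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


if __name__ == "__main__":
    encoded = encode_all("日本語")
    for enc, data in encoded.items():
        print(enc, data.hex(" "))
    # Same bytes, wrong guess: this is what a mis-set browser is doing.
    print(decode_or_flag(encoded["utf-8"], "shift_jis"))
```

Each encoding produces a different byte sequence for the same text, so a browser that guesses the wrong one either fails outright or shows garbage. Switching the Encoding menu does not change the page, only how its bytes are interpreted.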
If you are building a site that will have both English and Japanese, for example a multi-language site, then Unicode might be the quick, easy answer. It’s not perfect, but it will save you a lot of time and frustration. (Beware the email problem I noted: if you send Unicode out in email, the Japanese recipient may have to change the encoding in their email client to view it correctly. That’s not something most users are up to speed on.)
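For the email side, here is one possible workaround sketched in Python rather than ASP or PHP. It assumes you want to send the message as ISO-2022-JP, one of the three Japanese encodings listed earlier, which is 7-bit and has long been a common choice for Japanese mail. The function name and the example.com addresses are placeholders, and the sketch only builds the message; it does not send anything.

```python
from email.header import Header
from email.mime.text import MIMEText


def build_japanese_mail(subject, body, sender, recipient, charset="iso-2022-jp"):
    """Build a plain-text message whose body and subject declare `charset`."""
    msg = MIMEText(body, "plain", charset)
    # Non-ASCII subjects must be wrapped so mail clients know the charset.
    msg["Subject"] = Header(subject, charset)
    msg["From"] = sender
    msg["To"] = recipient
    return msg


if __name__ == "__main__":
    mail = build_japanese_mail(
        "テスト", "こんにちは", "me@example.com", "you@example.com"
    )
    print(mail.as_string())
```

Because the charset is declared in the headers, the recipient’s client should not have to guess. As always, have your Japanese friend confirm that it really displays correctly in their mail program.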
Those oddities of character sets and encodings might cause you hours of frustration, so your best bet is to have your friend view your test sites long before you really start heavy-duty development. It may be that you just can’t hack that particular application enough to make it work correctly. Remember also that if you plan on using graphics with Japanese in them, you need to test that software too. Newer versions of Photoshop are Unicode compatible, but some of those small “make a button” freeware apps aren’t. You’ll get garbage. 😦
Also, note that your web server needs to be able to serve those double-byte characters. (IIS5 on Win2000, for example, isn’t set up to do it for ASP natively, so you’ll have to check that as a possible point of failure.) Newer web servers should be able to handle Unicode fine.
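As a rough idea of what “handling it” means on the server side, here is a minimal Python WSGI sketch (not ASP, and not how IIS does it). The point is only that the bytes sent, the `Content-Type` header, and the page’s own `meta` tag should all agree on the encoding. The `make_app` name and the page content are invented for this example.

```python
PAGE = (
    '<html><head><meta charset="{charset}"><title>テスト</title></head>'
    "<body><p>日本語のページ</p></body></html>"
)


def make_app(charset="utf-8"):
    """Return a WSGI app that serves PAGE encoded and labelled as `charset`."""
    body = PAGE.format(charset=charset).encode(charset)

    def app(environ, start_response):
        headers = [
            # The header tells the browser how to decode the bytes below.
            ("Content-Type", f"text/html; charset={charset}"),
            ("Content-Length", str(len(body))),
        ]
        start_response("200 OK", headers)
        return [body]

    return app
```

To try it locally, you could pass `make_app("shift_jis")` to `wsgiref.simple_server.make_server("", 8000, ...)` from the standard library and open the page in a browser. If the header and the actual bytes disagree, you are back to garbage.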
Sorry, that’s a lot of information, and yet, it’s just scratching the surface since there are so many applications out there. The best lessons are: 1) have patience, 2) use newer applications, and 3) make some foreign-language friends! 😉