Czech character encoding in documents
Every web document, being plain text at its core, is written in some character encoding (character set, charset). Today you will most often encounter the universal UTF-8 encoding on the web; occasionally you may come across ISO-8859-1 for English documents, or ISO-8859-2 (or the similar WINDOWS-1250) for Czech ones. The encoding in which the document is actually written must match the encoding that the visitor's web browser uses to decode it. If the two differ, the person viewing your site will see more or less incomprehensible text.
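The effect of such a mismatch can be reproduced in a few lines. The following is only an illustrative sketch: the sample sentence and the variable names are chosen for the demonstration and do not come from any real page.

```python
# Czech sample text; any text with non-ASCII characters would do.
text = "Příliš žluťoučký kůň"

# The document is actually written (stored) in UTF-8 ...
raw = text.encode("utf-8")

# ... but is decoded as ISO-8859-2: every accented letter turns into
# two unrelated characters, which is what the visitor would see.
garbled = raw.decode("iso-8859-2")

# Decoding with the matching encoding restores the original text.
restored = raw.decode("utf-8")

print(garbled)
print(restored)
```

The ASCII letters survive in both cases, which is why such a page tends to look "almost readable" rather than completely broken.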
If you write your pages in UTF-8 encoding, you don't have to worry about anything else. Visitors will see accents (and other non-ASCII characters) as you intend.
When interpreting a downloaded web document, the browser determines the encoding in which the document is written, based on the following indicators:
- the HTTP header Content-Type
- the meta tag in the source text of the document
- the default setting of the expected encoding in the browser
This list is ordered by priority: if the server sends a Content-Type header with an encoding specified when transmitting the document, the browser treats it as authoritative. If no such header is present, the browser looks for a suitable meta tag in the source text of the document. If that is not present in the required form either, the browser's default setting is used.
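The priority order described above can be sketched as a small function. This is a simplified model, not how any particular browser is implemented: the names `charset_from_content_type` and `detect_encoding`, the regular expression, the 1024-byte window and the `windows-1250` default are all assumptions made for the example, and other signals a real browser may use are left out.

```python
import re
from email.message import Message

# Matches both <meta charset="..."> and the older http-equiv form.
META_CHARSET = re.compile(r"<meta[^>]*?charset\s*=\s*[\"']?([\w-]+)", re.IGNORECASE)


def charset_from_content_type(value):
    """Return the charset parameter of a Content-Type value, or None."""
    if not value:
        return None
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset()


def detect_encoding(content_type, body, browser_default="windows-1250"):
    """Pick an encoding: HTTP header first, then meta tag, then the default."""
    header_charset = charset_from_content_type(content_type)
    if header_charset:
        return header_charset
    # Only the beginning of the document is scanned for a meta tag.
    head = body[:1024].decode("ascii", errors="replace")
    match = META_CHARSET.search(head)
    if match:
        return match.group(1).lower()
    return browser_default


# The header wins even when the meta tag says something else.
print(detect_encoding("text/html; charset=UTF-8", b'<meta charset="iso-8859-2">'))
```

Note that in this model a wrong header cannot be corrected from inside the document, which is why the server configuration matters.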
The web server on Aisa (or www.fi.muni.cz) sends the Content-Type header by default and tells browsers that UTF-8 encoding is used. This behavior can be changed with the AddDefaultCharset directive in the appropriate .htaccess file. Its syntax is AddDefaultCharset encoding; placed on its own line in the .htaccess file in the directory with the web documents, it causes browsers to be notified of the changed encoding for those documents. Instead of a valid encoding, the parameter can be the word Off, which turns off sending the encoding in the header completely. In that case it is advisable to include a meta tag in the documents, which the browser will then follow when displaying them.
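To make the effect of the directive concrete, the sketch below models the Content-Type value a browser would receive for a given AddDefaultCharset setting. It is a simulation for illustration only, not server code: the function name `content_type_header` is hypothetical, and the restriction to text/html and text/plain follows the way Apache documents the directive, which is assumed to apply here as well.

```python
def content_type_header(media_type, add_default_charset="utf-8"):
    """Model the Content-Type value sent for a given AddDefaultCharset setting.

    `add_default_charset` stands for the parameter written after the
    directive in .htaccess, for example "iso-8859-2" or "Off".
    """
    setting = add_default_charset.strip()
    # With Off, or for other media types, no encoding is announced.
    if setting.lower() == "off" or media_type not in ("text/html", "text/plain"):
        return media_type
    return f"{media_type}; charset={setting}"


# .htaccess line "AddDefaultCharset iso-8859-2"
print(content_type_header("text/html", "iso-8859-2"))
# .htaccess line "AddDefaultCharset Off"
print(content_type_header("text/html", "Off"))
```

With Off, the browser receives no encoding from the server and has to rely on the meta tag or its own default.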
Server-supported multilingual versions of documents
The request for a specific page that the web browser sends to the server also includes a prioritized list of the languages in which the browser prefers to receive the page. (The user sets this preference list in the browser.) On the server, the different language versions of one page are stored as separate files in one directory, each marked with an additional extension that identifies the language. For example, the document https://www.fi.muni.cz/~xnovak99/stranka.html can be represented on disk by the files stranka.html.cs, stranka.html.en and stranka.html.pl (the Czech, English and Polish versions).
If the server has multiple language variants of the requested document, it returns the available variant that best matches the browser's priorities. If none of the preferred language variants is available, the server returns the Czech version.
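The selection can be illustrated with a rough sketch that reads the browser's language list (the Accept-Language request header) and picks a variant. The real negotiation in the web server is more elaborate; the function names, the handling of q-values and the matching on the primary language tag only are simplifying assumptions.

```python
def parse_accept_language(header):
    """Return the language tags of an Accept-Language value, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        quality = 1.0  # a tag without an explicit q-value has full weight
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by position in the header.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def choose_variant(accept_language, available, fallback="cs"):
    """Pick the best available language, or the Czech fallback."""
    for tag in parse_accept_language(accept_language):
        primary = tag.split("-")[0]  # "en-US" is treated as "en"
        if primary in available:
            return primary
    return fallback


# Polish is preferred but missing, so English is chosen.
print(choose_variant("pl,en;q=0.8,cs;q=0.5", {"cs", "en"}))
```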
This behavior is suppressed (on the server side) if at least one of the following conditions is true:
- a file matching the document name in the URL exactly (i.e. without a language extension) is available
- the directory in which multilingual versions should not be considered contains a .htaccess file with the line Options -MultiViews
- the directory in which multilingual versions should not be considered does not have the Unix permission r for others
For example, if the directory contains the files stranka.html.cs, stranka.html.en and stranka.html, a request for the document stranka.html always returns the contents of the file stranka.html.
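The first of these conditions, a file with the exact requested name taking precedence over the language variants, can be sketched as follows. The function `resolve_file` is hypothetical and works on a plain set of file names; the .htaccess and permission conditions are not modelled.

```python
def resolve_file(requested, files, preferred, fallback="cs"):
    """Pick the file to serve for `requested` from the directory listing `files`."""
    if requested in files:
        # An exact match exists, so language variants are not considered.
        return requested
    for lang in list(preferred) + [fallback]:
        candidate = f"{requested}.{lang}"
        if candidate in files:
            return candidate
    return None


with_plain_file = {"stranka.html.cs", "stranka.html.en", "stranka.html"}
variants_only = {"stranka.html.cs", "stranka.html.en", "stranka.html.pl"}

print(resolve_file("stranka.html", with_plain_file, ["en"]))
print(resolve_file("stranka.html", variants_only, ["pl", "en"]))
```

To keep language selection working for a page, the directory should therefore contain only the files with language extensions and no file with the bare name.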