This page has been machine-translated to make it more accessible to English-speaking readers.

Czech character encoding in documents

Every web document is, in essence, plain text written in some character encoding (also called a character set or charset). Today you will most often encounter the universal UTF-8 encoding on the web; occasionally you may come across ISO-8859-1 for English documents, or ISO-8859-2 (or the similar Windows-1250) for Czech ones. The encoding in which the document is actually written must match the encoding that the visitor's web browser uses to decode it. If the two differ, the person viewing your site will see more or less unintelligible text.
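The mismatch described above can be reproduced in a few lines. The following is a minimal sketch in Python; the sample sentence and the variable names are illustrative assumptions, not part of the original page.

```python
# Czech sample text containing accented (non-ASCII) characters.
text = "Příliš žluťoučký kůň"

# The document is actually written (encoded) in UTF-8 ...
raw = text.encode("utf-8")

# ... but the browser decodes it as ISO-8859-2. Every accented letter
# occupies two bytes in UTF-8, so each one turns into two wrong characters.
garbled = raw.decode("iso-8859-2")

# When both sides agree on the encoding, the text survives unchanged.
correct = raw.decode("utf-8")

print(garbled)
print(correct)
```

Running the sketch prints the garbled form first and the intact sentence second, which is the difference a visitor would see in the browser.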

If you write your pages in UTF-8, you do not need to do anything else: visitors will see accented (and other non-ASCII) characters as you intended.

When interpreting a downloaded web document, the browser determines the encoding in which the document is written, based on the following indicators:

  • the Content-Type HTTP header
  • a meta tag in the document's source text
  • the browser's default setting for the expected encoding

This list is ordered by priority: if the server sends a Content-Type header that specifies an encoding when it transmits the document, the browser treats it as authoritative. If no such header is present, the browser looks for a suitable meta tag in the document's source text. If that is also missing or not in the required form, the browser's default setting is used.
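The priority order above can be sketched as a small Python function. This is a simplified illustration, not how any particular browser is implemented: real browsers apply further checks that this page does not discuss. The function names and the fallback value BROWSER_DEFAULT are assumptions made for the example.

```python
import re
from typing import Optional

# Hypothetical stand-in for the browser's configured default encoding.
BROWSER_DEFAULT = "windows-1252"


def charset_from_content_type(header: Optional[str]) -> Optional[str]:
    """Extract the charset parameter from a Content-Type header value."""
    if not header:
        return None
    match = re.search(r'charset\s*=\s*"?([\w.:-]+)', header, re.IGNORECASE)
    return match.group(1).lower() if match else None


def charset_from_meta(html_source: str) -> Optional[str]:
    """Find a charset declared in a meta tag.

    Covers both <meta charset="..."> and the older
    <meta http-equiv="Content-Type" content="...; charset=..."> form.
    """
    match = re.search(
        r'<meta[^>]*charset\s*=\s*["\']?([\w.:-]+)', html_source, re.IGNORECASE
    )
    return match.group(1).lower() if match else None


def detect_encoding(content_type: Optional[str], html_source: str,
                    default: str = BROWSER_DEFAULT) -> str:
    """Apply the three indicators in priority order."""
    return (
        charset_from_content_type(content_type)  # 1. HTTP header
        or charset_from_meta(html_source)        # 2. meta tag
        or default                               # 3. browser default
    )


page = '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-2">'
print(detect_encoding("text/html; charset=UTF-8", page))  # header wins
print(detect_encoding("text/html", page))                 # falls back to meta
print(detect_encoding(None, "<p>no declaration</p>"))     # falls back to default
```

Note that in the first call the header overrides the meta tag even though the two disagree, which is why a wrong server setting cannot be repaired from inside the document.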

By default, the web server on Aisa (or www.fi.muni.cz) sends a Content-Type header telling browsers that UTF-8 encoding is used. You can change this behavior with the AddDefaultCharset directive in the appropriate .htaccess file. Its syntax is AddDefaultCharset encoding; placing it (on a line of its own) in the .htaccess file in the directory containing the web documents causes browsers to be told the changed encoding for those documents. Instead of a valid encoding name, the parameter may be the word Off, which turns off sending the encoding in the header altogether. In that case you should include a meta tag in the documents, which the browser will then follow when displaying them.
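To make the effect of the directive concrete, here is a small Python model of the header value a browser would receive for a given .htaccess file. It is only a sketch of the behavior described in this section, not the server's actual code: the function name content_type_for and the constant SERVER_DEFAULT_CHARSET are assumptions, and only the two forms mentioned above (an encoding name and Off) are handled.

```python
from typing import Optional

# The server-wide default described in the text above.
SERVER_DEFAULT_CHARSET = "utf-8"


def content_type_for(htaccess_text: str, media_type: str = "text/html") -> str:
    """Return the Content-Type value implied by an .htaccess file.

    Looks for lines of the form "AddDefaultCharset <encoding>" or
    "AddDefaultCharset Off"; the last such line wins.
    """
    charset: Optional[str] = SERVER_DEFAULT_CHARSET
    for line in htaccess_text.splitlines():
        words = line.split()
        if len(words) == 2 and words[0].lower() == "adddefaultcharset":
            # "Off" means: announce no encoding at all.
            charset = None if words[1].lower() == "off" else words[1]
    if charset is None:
        return media_type
    return f"{media_type}; charset={charset}"


print(content_type_for(""))                               # no directive
print(content_type_for("AddDefaultCharset iso-8859-2"))   # changed encoding
print(content_type_for("AddDefaultCharset Off"))          # nothing announced
```

With the Off variant the browser receives no encoding from the server, so the meta tag in the document becomes the deciding indicator.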

Server-supported language variants of documents

The request for a particular page that the web browser sends to the server also includes a prioritized list of the languages in which the browser prefers to receive the page. (The user sets this preference list in the browser.) On the server, the different language versions of a page are stored as separate files in one directory, each marked with an additional extension for its language. For example, the document https://www.fi.muni.cz/~xnovak99/stranka.html can be represented on disk by the files stranka.html.cs, stranka.html.en and stranka.html.pl (the Czech, English and Polish versions).

If the server has several language variants of the requested document, it returns the available variant that best matches the browser's priorities. If none of the preferred language variants is available, the server returns the Czech version.
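The selection described above can be sketched in Python as follows. This is a deliberately simplified model, not the server's real negotiation algorithm: it assumes the browser's priorities arrive as an Accept-Language header with optional q weights, it reduces tags such as "en-US" to "en", and the function names are hypothetical.

```python
from typing import Dict, List

# The text says the Czech version is returned when nothing preferred matches.
FALLBACK_LANGUAGE = "cs"


def parse_accept_language(header: str) -> List[str]:
    """Return the language tags of an Accept-Language header, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        quality = 1.0  # a tag without a q parameter has full weight
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by original position.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def choose_variant(base_name: str, available: List[str],
                   accept_language: str) -> str:
    """Pick the file name of the best language variant of base_name."""
    prefix = base_name + "."
    by_language: Dict[str, str] = {
        name[len(prefix):]: name for name in available if name.startswith(prefix)
    }
    for tag in parse_accept_language(accept_language):
        primary = tag.split("-")[0]  # simplification: "en-US" counts as "en"
        if primary in by_language:
            return by_language[primary]
    # Raises KeyError if no Czech variant exists; fine for a sketch.
    return by_language[FALLBACK_LANGUAGE]


files = ["stranka.html.cs", "stranka.html.en", "stranka.html.pl"]
print(choose_variant("stranka.html", files, "pl,en;q=0.8,cs;q=0.5"))
print(choose_variant("stranka.html", files, "de,fr;q=0.9"))
```

The first call selects the Polish file because it has the highest weight; the second finds no preferred language among the available files and falls back to the Czech one.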

This behavior is suppressed (on the server side) if at least one of the following conditions is true:

  • a file whose name matches the document name in the URL exactly (i.e. without a language extension) is available
  • the directory in which language variants should be ignored contains an .htaccess file with the line Options -MultiViews
  • the directory in which language variants should be ignored does not have the Unix r permission for others

This means, among other things, that if you place the files stranka.html.cs, stranka.html.en and stranka.html in one directory, a request for the document stranka.html will always return the contents of the file stranka.html.
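The three suppressing conditions can be expressed as a single check. The Python sketch below takes the directory's contents, the text of its .htaccess file and its permission bits as plain arguments so that it stays self-contained; the function name negotiation_applies and its parameters are assumptions made for illustration.

```python
import stat
from typing import Iterable


def negotiation_applies(document_name: str, directory_files: Iterable[str],
                        htaccess_text: str, directory_mode: int) -> bool:
    """Return True if the server would choose among language variants."""
    # 1. A file with exactly the requested name exists: it is served as is.
    if document_name in set(directory_files):
        return False

    # 2. The .htaccess file contains the line "Options -MultiViews".
    for line in htaccess_text.splitlines():
        words = [word.lower() for word in line.split()]
        if words[:1] == ["options"] and "-multiviews" in words[1:]:
            return False

    # 3. The directory lacks the read (r) permission for others.
    if not directory_mode & stat.S_IROTH:
        return False

    return True


variants = ["stranka.html.cs", "stranka.html.en"]
print(negotiation_applies("stranka.html", variants, "", 0o755))
print(negotiation_applies("stranka.html", variants + ["stranka.html"], "", 0o755))
```

The first call reports that negotiation applies; in the second, the plain stranka.html file is present, so the language variants are ignored.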