Firebird Documentation Index → Firebird 2.5 Language Ref. Update → Data types and subtypes → New collations |
Added in: 1.0, 1.5, 1.5.1, 2.0, 2.1, 2.5
The following table lists the collations added in Firebird. The “Details” column is based on what has been reported in the Release Notes and other documents. The information in this column is probably incomplete; some collations with an empty Details field may still be case insensitive (ci), accent insensitive (ai) or dictionary-sorted (dic).
Please note that the default – binary – collations for new character sets are not listed here, as doing so would add no meaningful information.
Table 5.2. Collations new in Firebird
Character set | Collation | Language | Details | Added in |
---|---|---|---|---|
CP943C | CP943C_UNICODE | Japanese | | 2.1 |
GB18030 | GB18030_UNICODE | Chinese | | 2.5 |
GBK | GBK_UNICODE | Chinese | | 2.1 |
ISO8859_1 | ES_ES_CI_AI | Spanish | ci, ai | 2.0 |
ISO8859_1 | FR_FR_CI_AI | French | ci, ai | 2.1 |
ISO8859_1 | PT_BR | Brazilian Portuguese | ci, ai | 2.0 |
ISO8859_2 | CS_CZ | Czech | | 1.0 |
ISO8859_2 | ISO_HUN | Hungarian | | 1.5 |
ISO8859_2 | ISO_PLK | Polish | | 2.0 |
ISO8859_13 | LT_LT | Lithuanian | | 1.5.1 |
UTF8 | UCS_BASIC | All | | 2.0 |
UTF8 | UNICODE | All | dic | 2.0 |
UTF8 | UNICODE_CI | All | ci | 2.1 |
UTF8 | UNICODE_CI_AI | All | ci, ai | 2.5 |
WIN1250 | BS_BA | Bosnian | | 2.0 |
WIN1250 | PXW_HUN | Hungarian | ci | 1.0 |
WIN1250 | WIN_CZ | Czech | ci | 2.0 |
WIN1250 | WIN_CZ_CI_AI | Czech | ci, ai | 2.0 |
WIN1251 | WIN1251_UA | Ukrainian and Russian | | 1.5 |
WIN1252 | WIN_PTBR | Brazilian Portuguese | ci, ai | 2.0 |
WIN1257 | WIN1257_EE | Estonian | dic | 2.0 |
WIN1257 | WIN1257_LT | Lithuanian | dic | 2.0 |
WIN1257 | WIN1257_LV | Latvian | dic | 2.0 |
KOI8R | KOI8R_RU | Russian | dic | 2.0 |
KOI8U | KOI8U_UA | Ukrainian | dic | 2.0 |
TIS620 | TIS620_UNICODE | Thai | | 2.1 |
The UCS_BASIC collation sorts in Unicode code-point order: A, B, a, b, á... This is exactly the same as UTF8 with no collation specified. UCS_BASIC was added to comply with the SQL standard.
The UNICODE collation sorts using UCA (Unicode Collation Algorithm): a, A, á, b, B...
UNICODE_CI is truly case-insensitive. In a search for e.g. 'Apple', it will also find 'apple', 'APPLE' and 'aPPLe'.
UNICODE_CI_AI is accent-insensitive as well. According to this collation, 'APPEL' equals 'Appèl'.
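The differences between the four UTF8 collations are easiest to see side by side. The following is a minimal sketch, not taken from the Release Notes: the table `fruit` and its contents are hypothetical, and it assumes that the connection character set is UTF8 and that the script is run in a tool such as isql.

```sql
-- Hypothetical test table; the column uses UTF8 with its default collation.
create table fruit (
  name varchar(20) character set utf8
);
commit;

insert into fruit (name) values ('apple');
insert into fruit (name) values ('Apple');
insert into fruit (name) values ('APPLE');
insert into fruit (name) values ('Appèl');
insert into fruit (name) values ('APPEL');
commit;

-- Code-point order, the same as UTF8 with no collation specified.
select name from fruit order by name collate ucs_basic;

-- Unicode Collation Algorithm (dictionary-style) order.
select name from fruit order by name collate unicode;

-- Case-insensitive search: matches 'apple', 'Apple' and 'APPLE'.
select name from fruit where name collate unicode_ci = 'Apple';

-- Case- and accent-insensitive search: 'APPEL' and 'Appèl' compare equal.
select name from fruit where name collate unicode_ci_ai = 'APPEL';
```

If a column should always be compared this way, the collation can instead be given once in the column definition, e.g. `name varchar(20) character set utf8 collate unicode_ci`, so that individual queries need no COLLATE clause.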
Added in: 2.1
Firebird now comes with UNICODE collations for all the standard character sets. However, except for the ones listed in the new collations table in the previous section, these collations are not automatically available in your databases. Instead, they must be added with the CREATE COLLATION statement, like this:
create collation ISO8859_1_UNICODE for ISO8859_1
The new Unicode collations all have the name of their character set with _UNICODE added. (The built-in Unicode collations for UTF8 are the exception to the rule.) They are defined, along with the other collations, in the manifest file fbintl.conf in Firebird's intl subdirectory.
Collations may also be registered under a user-chosen name, e.g.:
create collation LAT_UNI for ISO8859_1 from external ('ISO8859_1_UNICODE')
See CREATE COLLATION for the full syntax.
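Registering and then using such a collation might look like the sketch below. It is illustrative only: the table `customer` is hypothetical, and the CREATE COLLATION statement assumes that ISO8859_1_UNICODE has not already been registered in the database. The first query, against the system tables RDB$COLLATIONS and RDB$CHARACTER_SETS, is one way to check that beforehand.

```sql
-- List the collations currently registered in this database,
-- grouped by character set.
select cs.rdb$character_set_name, co.rdb$collation_name
from rdb$collations co
join rdb$character_sets cs
  on cs.rdb$character_set_id = co.rdb$character_set_id
order by cs.rdb$character_set_name, co.rdb$collation_name;

-- Register the Unicode collation for ISO8859_1 under its standard name
-- (assumes it is not yet present in the list above).
create collation ISO8859_1_UNICODE for ISO8859_1;
commit;

-- Hypothetical table that uses the new collation as the column default.
create table customer (
  last_name varchar(40) character set ISO8859_1 collate ISO8859_1_UNICODE
);
commit;

-- Sorting on the column now follows the Unicode collation.
select last_name from customer order by last_name;
```

Because a collation is registered per database, the CREATE COLLATION step has to be repeated in every database where the collation is needed.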