Questions tagged [unicode]

Unicode is a universal character set intended to cover every character needed for written text, across all writing systems, technical symbols and punctuation.

Unicode

Unicode assigns each character a code point to act as a unique reference:

  • U+0041 A
  • U+0042 B
  • U+0043 C
  • ...
  • U+039B Λ
  • U+039C Μ
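The mapping between characters and code points can be checked from most languages. Below is a minimal Python sketch using the built-in `ord()` and `chr()`; the helper name `code_point` is made up for this illustration and is not part of any library.

```python
def code_point(ch: str) -> str:
    """Return the conventional U+XXXX label for a single character."""
    # ord() gives the code point as an integer; format it as at least
    # four uppercase hex digits, which is how code points are usually written.
    return f"U+{ord(ch):04X}"


# The last two characters are GREEK CAPITAL LETTER LAMDA and MU,
# written as escapes so they cannot be confused with Latin look-alikes.
for ch in "ABC\u039B\u039C":
    print(code_point(ch), ch)

# chr() goes the other way: from a code point to the character.
lamda = chr(0x039B)
```

Note that a Latin `M` (U+004D) and a Greek `Μ` (U+039C) look the same in many fonts but are distinct code points, which is why printing the code point is often the quickest way to diagnose a "weird character" problem.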

Unicode Transformation Formats

UTFs describe how to encode code points as byte representations. The most common forms are UTF-8 (which encodes code points as a sequence of one, two, three or four bytes) and UTF-16 (which encodes code points as two or four bytes).

Code Point          UTF-8           UTF-16 (big-endian)
U+0041              41              00 41
U+0042              42              00 42
U+0043              43              00 43
...
U+039B              CE 9B           03 9B
U+039C              CE 9C           03 9C
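The rows of the table above can be reproduced with Python's built-in codecs. This is a small sketch, assuming Python 3.8 or later (for `bytes.hex()` with a separator); the helper name `hex_bytes` is hypothetical. U+1F600 is added here only to show the four-byte case and does not appear in the table.

```python
def hex_bytes(ch: str, encoding: str) -> str:
    """Encode one character and return its bytes as spaced uppercase hex."""
    return ch.encode(encoding).hex(" ").upper()


print("Code Point  UTF-8         UTF-16 (big-endian)")
for ch in "ABC\u039B\u039C\U0001F600":
    label = f"U+{ord(ch):04X}"
    utf8 = hex_bytes(ch, "utf-8")
    utf16 = hex_bytes(ch, "utf-16-be")
    print(f"{label:<11} {utf8:<13} {utf16}")
```

Code points above U+FFFF, such as U+1F600, take four bytes in both encodings; in UTF-16 they are stored as a surrogate pair (here `D8 3D DE 00`).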

Specification

The Unicode Consortium also defines standards for sorting algorithms, rules for capitalization, character normalization and other locale-sensitive character operations.
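As a rough illustration of normalization and case rules, here is a sketch using only Python's standard library (`unicodedata.normalize` and `str.casefold`). The helper names `same_text` and `same_text_ignoring_case` are invented for this example. It covers only the locale-independent defaults; locale-sensitive sorting and casing need a fuller implementation of the Unicode algorithms (for example an ICU binding), which is not shown here.

```python
import unicodedata

decomposed = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT
composed = "\u00e9"     # precomposed LATIN SMALL LETTER E WITH ACUTE


def same_text(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def same_text_ignoring_case(a: str, b: str) -> bool:
    """Compare two strings using NFC normalization plus case folding."""
    def fold(s: str) -> str:
        return unicodedata.normalize("NFC", s).casefold()
    return fold(a) == fold(b)


print(decomposed == composed)           # raw comparison of code points
print(same_text(decomposed, composed))  # comparison after normalization
print(same_text_ignoring_case("Stra\u00dfe", "STRASSE"))
```

The two forms of "é" render identically but compare unequal until they are normalized, which is a common cause of file name and database lookup mismatches.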

45 questions
0
votes
1 answer

Different results for json_encode in Wordpress with php compiled with same configuration and libmbfl version

I have an older Debian server and a local install of Wordpress; I'm trying to track down why calling: echo json_encode('😀'); on the Debian server results in "\ud83d\ude00" but on my local install, calling the same json_encode line results in…
jaygooby
  • 295
  • 1
  • 2
  • 12
0
votes
1 answer

Debian 9 - libicu60 dependency while trying to install php7.0-intl

The title says it all... I'm trying to install the php intl extension on Debian 9 and get the following error : php7.0-intl : Depends: libicu60 (>= 60.1-1~) but it is not installable Debian repo says libicu60 is experimental and buggy... has…
Pierre
  • 101
  • 3
0
votes
1 answer

How can I set mysql connection charset to utf8 in apache mod_dbd?

How can I set connection charset to utf8 in apache mod_dbd with mysql driver? I could not find any corresponding parameter in DBDParams, something like this: DBDParams…
Ehsan Khodarahmi
  • 285
  • 1
  • 7
  • 17
0
votes
1 answer

UNICODE version supported by SQL Server 2016

What is the Unicode standard version supported by SQL Server 2016? (I'm specifically interested in this version. However, information for 2014 or 2017 is welcome as well.) I can't find this information in Technet / MSDN. The only information I was…
Ondrej Tucny
  • 404
  • 1
  • 7
  • 25
0
votes
1 answer

Unicode characters in my PHP configuration

Earlier today I was having an issue where I couldn't set up my PHP7 remote interpreter on PhpStorm. I ended up finding out that I have Unicode characters in my PHP configuration: I have no idea how that happened and I probably would never have…
Raph Petrini
  • 101
  • 3
0
votes
1 answer

Postfix header_checks and unicode regexp

Does postfix support unicode regular expressions like \p{Han} to detect unicode scripts? I would like to use them with header_checks.
0
votes
3 answers

how does one quote unicode characters in mysql prompt or in SQL in general?

I have a weird Unicode char in my mysql database. The value looks like this: card issuer bank didnt approve your payment. So what should be an apostrophe is a weird Unicode char, presumably from Windows. I want to replace it, but don't know how…
Aleksandar Ivanisevic
  • 3,327
  • 19
  • 24
0
votes
1 answer

When I ncpmount a Novell server under Linux (Ubuntu 14.04), I do not see filenames with accented characters

I use Linux on my computer at work and the server uses Novell. I am friendly with the IT staff in a "don't bother us, and we won't bother you" kind of way, and I can usually fix my problems myself with a bit of googling. However, there is one…
0
votes
2 answers

Why doesn't Firefox render this AWstats generated html?

XML Parsing Error: not well-formed Location: https://awstats.example.org/reports/www.example.org/2011/06/awstats.www.example.org.xml Line Number 603, Column 34: - Toile du Qu\uffffbec363363 The…
jldugger
  • 14,122
  • 19
  • 73
  • 129
0
votes
1 answer

Fail to get unicode URL in IIS 6

I set up an IIS server running at localhost and put 2 files with Unicode file names on it, e.g. 변태연.txt and 변태연.flv. Both files exist, but only one works: localhost/변태연.txt. I don't know why. Do you guys have any idea about this problem? Any help…
ByulTaeng
0
votes
1 answer

Character set issues in Postgres upgrade

I am moving some databases from Postgres 7.4.8 to 8.4.5, on CentOS 5. In the old database the encoding is UNICODE. So I did a text pg_dump, created my new databases like so: createdb --template template0 --encoding unicode testdb and imported the…
Janine Ohmer
  • 257
  • 1
  • 4
  • 8
0
votes
1 answer

mount and sync non english folder & file names

I'm trying to rsync a folder whose name contains non-English characters, and it breaks the whole rsync. How can I copy a folder even if its name contains non-English characters?
amirash
  • 129
  • 3
0
votes
0 answers

touch, why I get file name too long error

I have the file with name /home/lenka/Translations/Ф-119-Д/заключение об аннулировании, исправлении и-или дополнении акта о гражданском состоянии_рум-русс.docx echo -n "/home/lenka/Translations/Ф-119-Д/заключение об аннулировании, исправлении и-или…
niXman
  • 9
  • 2
0
votes
0 answers

Displaying Ratio unicode character in putty

I'm wondering if this could be a putty bug. I can't get the Unicode character 'RATIO' (U+2236) to display properly: https://www.fileformat.info/info/unicode/char/2236/index.htm Basically it should look like a colon, but in putty, for me, it's just displayed…
-1
votes
1 answer

filenames, ASCII unicode escaped sequences to UTF8

I'm not sure if I've grasped the issue here so if I haven't just say so and I'll edit the title. My problem is the following: I have an Ubuntu 12.04 server (UTF-8 locale) to which users upload files via a web app or through shell. So I have no…
D.Mill
  • 379
  • 5
  • 15