For Debian 7 (Wheezy):
You can download the source files from Debian, make the changes yourself, then recompile and install the resulting .deb packages (apt-get source and apt-get build-dep need a deb-src line in /etc/apt/sources.list):
Open a root terminal:
apt-get install dpkg-dev;
apt-get build-dep libpango1.0-0;
exit;
Open a regular terminal:
cd; mkdir patch-libpango; cd patch-libpango;
apt-get source libpango1.0-0;
Now go to your home folder, open the file patch-libpango/pango1.0-1.30.0/pango/break.c and find this block of code:
/* ---- Word breaks ---- */

/* default to not a word start/end */
attrs[i].is_word_start = FALSE;
attrs[i].is_word_end = FALSE;

if (current_word_type != WordNone)
  {
    /* Check for a word end */
    switch ((int) type)
      {
      case G_UNICODE_SPACING_MARK:
      case G_UNICODE_ENCLOSING_MARK:
      case G_UNICODE_NON_SPACING_MARK:
      case G_UNICODE_FORMAT:
        /* nothing, we just eat these up as part of the word */
        break;

      case G_UNICODE_LOWERCASE_LETTER:
      case G_UNICODE_MODIFIER_LETTER:
      case G_UNICODE_OTHER_LETTER:
      case G_UNICODE_TITLECASE_LETTER:
      case G_UNICODE_UPPERCASE_LETTER:
        if (current_word_type == WordLetters)
          {
            /* Japanese special cases for ending the word */
            if (JAPANESE (last_word_letter) ||
                JAPANESE (wc))
              {
                if ((HIRAGANA (last_word_letter) &&
                     !HIRAGANA (wc)) ||
                    (KATAKANA (last_word_letter) &&
                     !(KATAKANA (wc) || HIRAGANA (wc))) ||
                    (KANJI (last_word_letter) &&
                     !(HIRAGANA (wc) || KANJI (wc))) ||
                    (JAPANESE (last_word_letter) &&
                     !JAPANESE (wc)) ||
                    (!JAPANESE (last_word_letter) &&
                     JAPANESE (wc)))
                  attrs[i].is_word_end = TRUE;
              }
          }
        else
          {
            /* end the number word, start the letter word */
            attrs[i].is_word_end = TRUE;
            attrs[i].is_word_start = TRUE;
            current_word_type = WordLetters;
          }
        last_word_letter = wc;
        break;

      case G_UNICODE_DECIMAL_NUMBER:
      case G_UNICODE_LETTER_NUMBER:
      case G_UNICODE_OTHER_NUMBER:
        if (current_word_type != WordNumbers)
          {
            attrs[i].is_word_end = TRUE;
            attrs[i].is_word_start = TRUE;
            current_word_type = WordNumbers;
          }
        last_word_letter = wc;
        break;

      default:
        /* Punctuation, control/format chars, etc. all end a word. */
        attrs[i].is_word_end = TRUE;
        current_word_type = WordNone;
        break;
      }
  }
else
  {
    /* Check for a word start */
    switch ((int) type)
      {
      case G_UNICODE_LOWERCASE_LETTER:
      case G_UNICODE_MODIFIER_LETTER:
      case G_UNICODE_OTHER_LETTER:
      case G_UNICODE_TITLECASE_LETTER:
      case G_UNICODE_UPPERCASE_LETTER:
        current_word_type = WordLetters;
        last_word_letter = wc;
        attrs[i].is_word_start = TRUE;
        break;

      case G_UNICODE_DECIMAL_NUMBER:
      case G_UNICODE_LETTER_NUMBER:
      case G_UNICODE_OTHER_NUMBER:
        current_word_type = WordNumbers;
        last_word_letter = wc;
        attrs[i].is_word_start = TRUE;
        break;

      default:
        /* No word here */
        break;
      }
  }
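and replace it with this (letter/number transitions no longer end a word, and the underscore is skipped in the default case):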
/* ---- Word breaks ---- */

/* default to not a word start/end */
attrs[i].is_word_start = FALSE;
attrs[i].is_word_end = FALSE;

if (current_word_type != WordNone)
  {
    /* Check for a word end */
    switch ((int) type)
      {
      case G_UNICODE_SPACING_MARK:
      case G_UNICODE_ENCLOSING_MARK:
      case G_UNICODE_NON_SPACING_MARK:
      case G_UNICODE_FORMAT:
        /* nothing, we just eat these up as part of the word */
        break;

      case G_UNICODE_LOWERCASE_LETTER:
      case G_UNICODE_MODIFIER_LETTER:
      case G_UNICODE_OTHER_LETTER:
      case G_UNICODE_TITLECASE_LETTER:
      case G_UNICODE_UPPERCASE_LETTER:
        if (current_word_type == WordLetters)
          {
            /* Japanese special cases for ending the word */
            if (JAPANESE (last_word_letter) ||
                JAPANESE (wc))
              {
                if ((HIRAGANA (last_word_letter) &&
                     !HIRAGANA (wc)) ||
                    (KATAKANA (last_word_letter) &&
                     !(KATAKANA (wc) || HIRAGANA (wc))) ||
                    (KANJI (last_word_letter) &&
                     !(HIRAGANA (wc) || KANJI (wc))) ||
                    (JAPANESE (last_word_letter) &&
                     !JAPANESE (wc)) ||
                    (!JAPANESE (last_word_letter) &&
                     JAPANESE (wc)))
                  attrs[i].is_word_end = TRUE;
              }
          }
        last_word_letter = wc;
        break;

      case G_UNICODE_DECIMAL_NUMBER:
      case G_UNICODE_LETTER_NUMBER:
      case G_UNICODE_OTHER_NUMBER:
        last_word_letter = wc;
        break;

      default:
        if (wc == 0x005F) break; /* underscore */
        /* Punctuation, control/format chars, etc. all end a word. */
        attrs[i].is_word_end = TRUE;
        current_word_type = WordNone;
        break;
      }
  }
else
  {
    /* Check for a word start */
    switch ((int) type)
      {
      case G_UNICODE_LOWERCASE_LETTER:
      case G_UNICODE_MODIFIER_LETTER:
      case G_UNICODE_OTHER_LETTER:
      case G_UNICODE_TITLECASE_LETTER:
      case G_UNICODE_UPPERCASE_LETTER:
        current_word_type = WordLetters;
        last_word_letter = wc;
        attrs[i].is_word_start = TRUE;
        break;

      case G_UNICODE_DECIMAL_NUMBER:
      case G_UNICODE_LETTER_NUMBER:
      case G_UNICODE_OTHER_NUMBER:
        current_word_type = WordNumbers;
        last_word_letter = wc;
        attrs[i].is_word_start = TRUE;
        break;

      default:
        /* No word here */
        break;
      }
  }
Go back to your regular terminal:
cd ~/patch-libpango/pango*;
dpkg-buildpackage -rfakeroot -uc -b;
Now go to your home folder and open the folder patch-libpango; you should find some .deb files there.
Install them all (for example with dpkg -i in a root terminal) except for the debug and doc packages (the ones that have -dbg and -doc in their filename).
You can now delete the patch-libpango directory. Go back to your regular terminal:
cd; rm -rf patch-libpango;
Done. You don't need to restart your system, but applications that are already running keep using the old library until you restart them.
Note: this patch also treats the underscore as part of a word (search for 0x005F in the edited code if you want to remove that).
Please also let me know if you can't reproduce this behavior. If that's the case, I suspect it may have something to do with locales. I'm on Debian wheezy, using GNOME 3, en_US.UTF-8 locale (sometimes fr_FR.UTF-8). – Noyo – 2013-09-26T09:35:57.270
More research hints that it's maybe not locale-related, but rather a well-established mystery related to the way (all?) Gtk+ applications seem to behave by default: https://mail.gnome.org/archives/gtk-list/2011-June/msg00060.html and https://mail.gnome.org/archives/gtk-i18n-list/2011-June/msg00003.html – Noyo – 2013-09-26T13:31:09.480
Also: http://forums.opensuse.org/english/other-forums/looking-something-other-than-support/461854-anyone-know-how-change-pangos-word-separators.html – Noyo – 2013-09-26T13:37:27.533