OS X Automator Shell Script for custom search with special character not encoded with utf-8

1

I found a script here:

Mac OS X (Lion) Chrome: shortcut for "Search With Google"

which shows how to do a Google search using a shell script in OS X Automator.

The original script is:

open "http://www.google.com/search?q=$(ruby -rcgi -e 'print CGI.escape $<.read.chomp')"

I am trying to adapt this script to a custom search for Chinese characters encoded in "gb2312".

Currently my script looks like this:

open "http://www.yueyv.cn/index.asp?keyword=$(ruby -rcgi -e 'print CGI.escape $<.read.chomp.encode("gb2312")')"

It works fine in Terminal. For example, when tested with the character "一", the script opens http://www.yueyv.cn/index.asp?keyword=%D2%BB/

However, when I add this script as a service in OS X Automator, it opens http://www.yueyv.cn/index.asp?keyword=/

The encoded value for "一" is missing.

I've googled for quite a while without a result. Can anybody help me? Thank you.

rouraito

Posted 2014-03-04T16:32:55.780

Reputation: 13

Answers

1

Terminal sets LANG to a value like en_US.UTF-8 by default unless you have unchecked "Set locale environment variables on startup". Automator doesn't set it, so Ruby treats the input as US-ASCII and the encode call fails with an invalid byte sequence error.
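To confirm the difference between the two environments, one option is to print the locale variables from inside the Automator action and compare them with what Terminal shows. This is only a minimal sketch; the function name show_locale_env is hypothetical and not part of the original answer.

```shell
#!/bin/sh
# Hypothetical helper: print the locale variables the current process sees,
# one per line, showing "<unset>" for any variable that is not set.
show_locale_env() {
    printf 'LANG=%s\n' "${LANG-<unset>}"
    printf 'LC_CTYPE=%s\n' "${LC_CTYPE-<unset>}"
    printf 'LC_ALL=%s\n' "${LC_ALL-<unset>}"
}
```

Inside a "Run Shell Script" action the output could be redirected to a scratch file of your choosing and inspected afterwards; in Terminal it can simply be called directly.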

$ unset LANG
$ echo 一|ruby -rcgi -e 'puts CGI.escape $<.read.chomp.encode("gb2312")'
-e:1:in `encode': "\xE4" on US-ASCII (Encoding::InvalidByteSequenceError)
    from -e:1:in `<main>'
$ echo 一|LC_CTYPE=UTF-8 ruby -rcgi -e 'puts CGI.escape $<.read.chomp.encode("gb2312")'
%D2%BB

Try prefixing the command with the locale setting, that is, LC_CTYPE=UTF-8 ruby. Alternatively, replace the ruby command with iconv -f utf-8 -t gb2312|xxd -p|tr -d \\n|sed 's/../%&/g'.
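The iconv variant could be wrapped up roughly as follows. This is a sketch under a few assumptions: the selected text arrives on stdin as UTF-8, the function names percent_encode and gb2312_query are hypothetical, and the od fallback is an addition for systems without xxd (it is not in the original suggestion). The hex digits come out lowercase (%d2%bb rather than %D2%BB), which is equivalent in a URL.

```shell
#!/bin/sh
# Hypothetical helper: percent-encode every byte read from stdin.
percent_encode() {
    # Dump the bytes as hex. xxd -p is what the answer uses; od is an
    # assumed fallback for systems where xxd is not installed.
    if command -v xxd >/dev/null 2>&1; then
        xxd -p
    else
        od -An -tx1
    fi |
        # Join the wrapped hex lines (and drop od's spaces), then put a
        # '%' in front of every two-digit byte.
        tr -d ' \n' | sed 's/../%&/g'
}

# Hypothetical helper: convert UTF-8 stdin to GB2312, then percent-encode.
gb2312_query() {
    iconv -f UTF-8 -t GB2312 | percent_encode
}

# In the Automator action the final line would then be something like:
#   open "http://www.yueyv.cn/index.asp?keyword=$(printf %s "$(cat)" | gb2312_query)"
```

Because iconv works on raw bytes with explicitly named encodings, this version does not depend on LANG or LC_CTYPE being set.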

Lri

Posted 2014-03-04T16:32:55.780

Reputation: 34 501

It works perfectly! This is the second time one of your answers has helped me. Thank you so much! – rouraito – 2014-03-05T15:30:52.950

I realize you included the "tr -d \n" command to strip the newline character from the string and drop the "0a" from the hex output. I think it should be placed right after the string, like this: "string"|tr -d \n|iconv -f utf-8 -t gb2312|xxd -p|sed 's/../%&/g' – rouraito – 2014-03-17T14:52:47.387

No, tr -d \\n is meant to remove linefeeds from the output of xxd -p (which prints 60 characters per line). You can use printf %s "$(cat)" to remove a linefeed from the end of the input. Or replace echo with printf %s. – Lri – 2014-03-18T13:15:37.143
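The printf %s "$(cat)" idiom from the comment above can be illustrated with a small sketch. The function name strip_trailing_newline is hypothetical; the behavior relies on command substitution discarding trailing newlines and printf %s not adding one back.

```shell
#!/bin/sh
# Hypothetical helper: copy stdin to stdout without its trailing newline(s).
strip_trailing_newline() {
    # "$(cat)" captures all of stdin; the shell drops any trailing newlines
    # from the captured text, and printf %s prints it back unchanged.
    printf %s "$(cat)"
}

# Compare the bytes with and without the helper, for example:
#   echo a | od -An -tx1                            # bytes 61 0a
#   echo a | strip_trailing_newline | od -An -tx1   # byte 61 only
```

Placed at the front of the pipeline, this keeps the linefeed out of the input, so no stray "0a" reaches the encoded keyword.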