How to check if a binary requires SSE4 or AVX on Linux

21

8

On Linux,/proc/cpuinfoallows one to check all the CPU flags the machine has in a simple way.
Usually, if a program requires a superset of a machine's instruction set, the easiest way to determine this is to run it and see whether it raises aSIGILLsignal.

But in my case, all my processors support at least SSE4.1 and AVX.
So, is there a simple way to check if a binary has special instructions inside?

user2284570

Posted 2014-03-08T16:53:46.417

Reputation: 1 160

Maybe there are emulators which let you select which instruction sets are enabled. QEMU currently does not support AVX, so that might "not work" as expected there: http://superuser.com/questions/453786/how-do-i-get-avx-support-in-qemu || http://superuser.com/questions/548740/disabling-instruction-set-in-virtualbox

– Ciro Santilli 新疆改造中心法轮功六四事件 – 2015-11-12T15:39:24.763

2objdump --disassemble performs a disassembly. You can use objdump to generate a list of mnemonics. It is part of Binutils, so its available on GNU Linux systems. Also, the additional instructions may be present but may not be executed. The program could have runtime guards. – jww – 2017-05-23T20:00:57.313

@jww : heemm, yes but I bother about having an executable running everywhere, not about learning over 600 opcodes in order to program in assembly. – user2284570 – 2017-05-23T20:03:31.913

Well, you kind of have to learn what you can (and can't) use. That's your responsibility. I suppose you could compile with -mavx to ensure the compiler only selects from the AVX ISA, but there are ways to sidestep it. For example, inline assembler can usually sidestep the compiler's ISA checks. – jww – 2017-05-23T20:11:58.000

@jww : and if the binary is a closed source (in the sense source code is deleted after building) shared object build by a proprietary script/compiler? – user2284570 – 2017-05-23T21:23:36.627

all the answers below do some sort of grepping through the disassembly. Remember that the code itself may have guards itself (as mentioned by @jww), i.e. it detects the command set the cpu supports and uses the fastest routine that will work on the cpu, but the objdump will still include the SSE4/AVX instructions. In short: a presence of these opcodes doesn’t necessarily mean they are used. OTOH, if none are present, you can be sure SSE4/AVX compatibility won’t be an issue. – Ro-ee – 2017-09-29T23:24:37.133

@Ro-ee : in my case, only command line compiler arguments. So if they are present, they’ll be fetched to the processor. – user2284570 – 2017-09-29T23:26:37.757

@phuclv which was asked after that question anyway. – user2284570 – 2019-11-16T19:55:23.093

Answers

12

I banged out a program in Rust that tries to do this. I think it works, although it is undocumented and awfully fragile:

https://github.com/pkgw/elfx86exts

Example usage:

$ cd elfx86exts
$ cargo build
[things happen]
$ cargo run -- /bin/ls
   Compiling elfx86exts v0.1.0 (file:///home/peter/sw/elfx86exts)
    Finished dev [unoptimized + debuginfo] target(s) in 1.9 secs
     Running `target/debug/elfx86exts /bin/ls`
MODE64
CMOV
SSE2
SSE1

Peter

Posted 2014-03-08T16:53:46.417

Reputation: 136

I tried running it on libtensorflow.so (sha224: 8f665acf0f455d5056014dfa2d48c22ab6cf83eb073842e8304878d0) from this package (version: 1.8.0-5) and it froze my entire computer.

– Philippe – 2018-07-10T09:24:15.227

@Philippe Please file an issue on GitHub! That's the better to place to discuss such topics, I think. – Peter – 2018-07-10T19:19:34.693

Ok, I created an issue on your github. – Philippe – 2018-07-11T08:20:46.557

19

I bumped into the same problem when I was trying to understand GCC optimization processes and to find out which instructions have or have not been used during this process. Since I am not friendly with the enormous number of operation codes, I was looking for a way to visualize specific (let's say SSE3) instructions within the disassembled code, or to at least print some minimal statistics like whether and how many these instructions are there in the binary.

I haven't found any existing solution, but Jonathan Ben-Avraham's answer proved very useful, as it points out a great (and even partially structured) source of operation codes. Based on this data, I have written a Bash script which can visualize specific instruction sets or print statistics about them using grep when fed with output from objdump.

The list of operation codes has been converted into a standalone Bash script which is then included (for the purpose of better readability) in the main file which I have named simply opcode. Since opcodes in gas.vim (Shirk's vim syntax definitions, from Jonathan's answer) were systematically grouped (seemingly) according to different CPU architectures, I tried to preserve this division and make an architecture->instruction set mapping; I am not sure now whether that was a good idea. The mapping is not accurate and I even had to make some changes in the original gas.vim grouping. Since architecture-related instruction sets were not my original intention, I tried only to construct instruction sets of major architectures described on the Internet, but without consulting manufacturers' documentations. AMD architectures do not seem reliable at all to me (except instruction sets like 3DNow! and SSE5). However, I decided to leave the code for instruction sets of various architectures here for someone else to examine and correct/improve and to give others some tentative results.

Beginning of the main file named opcode:

#!/bin/bash
#
# Searches disassembled code for specific instructions.
#
# Opcodes obtained from: https://github.com/Shirk/vim-gas/blob/master/syntax/gas.vim
#
# List of opcodes has been obtained using the following commands and making a few modifications:
#   echo '#!/bin/bash' > Opcode_list
#   wget -q -O- https://raw.githubusercontent.com/Shirk/vim-gas/master/syntax/gas.vim \
#    | grep -B1 -E 'syn keyword gasOpcode_|syn match   gasOpcode' | \
#    sed -e '/^--$/d' -e 's/"-- Section:/\n#/g' \
#    -e 's/syn keyword gasOpcode_\([^\t]*\)*\(\t\)*\(.*\)/Opcode_\1="\${Opcode_\1} \3"/g' \
#    -e 's/Opcode_PENT_3DNOW/Opcode_ATHLON_3DNOW/g' -e 's/\\//g' \
#    -e 's/syn match   gasOpcode_\([^\t]*\)*.*\/<\(.*\)>\//Opcode_\1="\${Opcode_\1} \2"/g' \
#    >> Opcode_list
#
# Modify file Opcode_list replacing all occurrences of:
#   * Opcode_Base within the section "Tejas New Instructions (SSSE3)" with Opcode_SSSE3
#   * Opcode_Base within the section "Willamette MMX instructions (SSE2 SIMD Integer Instructions)"
#                                        with Opcode_WILLAMETTE_Base

# return values
EXIT_FOUND=0
EXIT_NOT_FOUND=1
EXIT_USAGE=2

# settings
InstSet_Base=""
Recursive=false
Count_Matching=false
Leading_Separator='\s'
Trailing_Separator='(\s|$)' # $ matches end of line for non-parametric instructions like nop
Case_Insensitive=false
Invert=false
Verbose=false
Stop_After=0
Line_Numbers=false
Leading_Context=0
Trailing_Context=0

source Opcode_list   # include opcodes from a separate file

# GAS-specific opcodes (unofficial names) belonging to the x64 instruction set.
# They are generated by GNU tools (e.g. GDB, objdump) and specify a variant of ordinal opcodes like NOP and MOV.
# If you do not want these opcodes to be recognized by this script, comment out the following line.
Opcode_X64_GAS="nopw nopl movabs"


# instruction sets
InstSet_X86="8086_Base 186_Base 286_Base 386_Base 486_Base PENT_Base P6_Base KATMAI_Base WILLAMETTE_Base PENTM_Base"
InstSet_IA64="IA64_Base"
InstSet_X64="PRESCOTT_Base X64_Base X86_64_Base NEHALEM_Base X64_GAS"
InstSet_MMX="PENT_MMX KATMAI_MMX X64_MMX"
InstSet_MMX2="KATMAI_MMX2"
InstSet_3DNOW="ATHLON_3DNOW"
InstSet_SSE="KATMAI_SSE P6_SSE X64_SSE"
InstSet_SSE2="SSE2 X64_SSE2"
InstSet_SSE3="PRESCOTT_SSE3"
InstSet_SSSE3="SSSE3"
InstSet_VMX="VMX X64_VMX"
InstSet_SSE4_1="SSE41 X64_SSE41"
InstSet_SSE4_2="SSE42 X64_SSE42"
InstSet_SSE4A="AMD_SSE4A"
InstSet_SSE5="AMD_SSE5"
InstSet_FMA="FUTURE_FMA"
InstSet_AVX="SANDYBRIDGE_AVX"

InstSetDep_X64="X86"
InstSetDep_MMX2="MMX"
InstSetDep_SSE2="SSE"
InstSetDep_SSE3="SSE2"
InstSetDep_SSSE3="SSE3"
InstSetDep_SSE4_1="SSSE3"
InstSetDep_SSE4_2="SSE4_1"
InstSetDep_SSE4A="SSE3"
InstSetDep_SSE5="FMA AVX" # FIXME not reliable

InstSetList="X86 IA64 X64 MMX MMX2 3DNOW SSE SSE2 SSE3 SSSE3 VMX SSE4_1 SSE4_2 SSE4A SSE5 FMA AVX"


# architectures
Arch_8086="8086_Base"
Arch_186="186_Base"
Arch_286="286_Base"
Arch_386="386_Base"
Arch_486="486_Base"
Arch_Pentium="PENT_Base PENT_MMX" # Pentium = P5 architecture
Arch_Athlon="ATHLON_3DNOW"
Arch_Deschutes="P6_Base P6_SSE" # Pentium II
Arch_Katmai="KATMAI_Base KATMAI_MMX KATMAI_MMX2 KATMAI_SSE" # Pentium III
Arch_Willamette="WILLAMETTE_Base SSE2" # original Pentium IV (x86)
Arch_PentiumM="PENTM_Base"
Arch_Prescott="PRESCOTT_Base X64_Base X86_64_Base X64_SSE2 PRESCOTT_SSE3 VMX X64_VMX X64_GAS" # later Pentium IV (x64) with SSE3 (Willamette only implemented SSE2 instructions) and VT (VT-x, aka VMX)
Arch_P6=""
Arch_Barcelona="ATHLON_3DNOW AMD_SSE4A"
Arch_IA64="IA64_Base" # 64-bit Itanium RISC processor; incompatible with x64 architecture
Arch_Penryn="SSSE3 SSE41 X64_SSE41" # later (45nm) Core 2 with SSE4.1
Arch_Nehalem="NEHALEM_Base SSE42 X64_SSE42" # Core i#
Arch_SandyBridge="SANDYBRIDGE_AVX"
Arch_Haswell="FUTURE_FMA"
Arch_Bulldozer="AMD_SSE5"

ArchDep_8086=""
ArchDep_186="8086"
ArchDep_286="186"
ArchDep_386="286"
ArchDep_486="386"
ArchDep_Pentium="486"
ArchDep_Athlon="Pentium" # FIXME not reliable
ArchDep_Deschutes="Pentium"
ArchDep_Katmai="Deschutes"
ArchDep_Willamette="Katmai"
ArchDep_PentiumM="Willamette" # FIXME Pentium M is a Pentium III modification (with SSE2). Does it support also WILLAMETTE_Base instructions?
ArchDep_Prescott="Willamette"
ArchDep_P6="Prescott" # P6 started with Pentium Pro; FIXME Pentium Pro did not support MMX instructions (introduced again in Pentium II aka Deschutes)
ArchDep_Barcelona="Prescott" # FIXME not reliable
ArchDep_IA64=""
ArchDep_Penryn="P6"
ArchDep_Nehalem="Penryn"
ArchDep_SandyBridge="Nehalem"
ArchDep_Haswell="SandyBridge"
ArchDep_Bulldozer="Haswell" # FIXME not reliable

ArchList="8086 186 286 386 486 Pentium Athlon Deschutes Katmai Willamette PentiumM Prescott P6 Barcelona IA64 Penryn Nehalem SandyBridge Haswell Bulldozer"

An example of an Opcode_list file generated and modified using the instructions in opcode as of Oct 27, 2014, can be found at http://pastebin.com/yx4rCxqs. You can insert this file right into opcode in place of the source Opcode_list line. I have put this code out because Stack Exchange would not let me send such a large answer.

Finally, the rest of opcode file with the actual logic:

usage() {
    echo "Usage: $0 OPTIONS"
    echo ""
    echo "  -r      set instruction sets recursively according to dependency tree (must precede -a or -s)"
    echo "  -a      set architecture"
    echo "  -s      set instruction set"
    echo "  -L      show list of available architectures"
    echo "  -l      show list of available instruction sets"
    echo "  -i      show base instruction sets of current instruction set (requires -a and/or -s)"
    echo "  -I      show instructions in current instruction set (requires -a and/or -s)"
    echo "  -c      print number of matching instructions instead of normal output"
    echo "  -f      find instruction set of the following instruction (regex allowed)"
    echo "  -d      set leading opcode separator (default '$Leading_Separator')"
    echo "  -D      set trailing opcode separator (default '$Trailing_Separator')"
    echo "  -C      case-insensitive"
    echo "  -v      invert the sense of matching"
    echo "  -V      print all lines, not just the highlighted"
    echo "  -m      stop searching after n matched instructions"
    echo "  -n      print line numbers within the original input"
    echo "  -B      print n instructions of leading context"
    echo "  -A      print n instructions of trailing context"
    echo "  -h      print this help"
    echo
    echo "Multiple architectures and instruction sets can be used."
    echo
    echo "Typical usage is:"
    echo "  objdump -M intel -d FILE | $0 OPTIONS"
    echo "  objdump -M intel -d FILE | $0 -s SSE2 -s SSE3 -V                    Highlight SSE2 and SSE3 within FILE."
    echo "  objdump -M intel -d FILE | tail -n +8 | $0 -r -a Haswell -v -m 1    Find first unknown instruction."
    echo "  $0 -C -f ADDSD                                                      Find which instruction set an opcode belongs to."
    echo "  $0 -f .*fma.*                                                       Find all matching instructions and their instruction sets."
    echo
    echo "The script uses Intel opcode syntax. When used in conjunction with objdump, \`-M intel' must be set in order to prevent opcode translation using AT&T syntax."
    echo
    echo "BE AWARE THAT THE LIST OF KNOWN INSTRUCTIONS OR INSTRUCTIONS SUPPORTED BY PARTICULAR ARCHITECTURES (ESPECIALLY AMD'S) IS ONLY TENTATIVE AND MAY CONTAIN MISTAKES!"
    kill -TRAP $TOP_PID
}

list_contains() {   # Returns 0 if $2 is in array $1, 1 otherwise.
    local e
    for e in $1; do
        [ "$e" = "$2" ] && return 0
    done
    return 1
}

build_instruction_set() {   # $1 = enum { Arch, InstSet }, $2 = architecture or instruction set as obtained using -L or -l, $3 = "architecture"/"instruction set" to be used in error message
    local e
    list_contains "`eval echo \\\$${1}List`" "$2" || (echo "$2 is not a valid $3."; usage)      # Test if the architecture/instruction set is valid.
    if [ -n "`eval echo \\\$${1}_${2}`" ]; then                                                 # Add the instruction set(s) if any.
        for e in `eval echo \\\$${1}_${2}`; do                                                  # Skip duplicates.
            list_contains "$InstSet_Base" $e || InstSet_Base="$e $InstSet_Base"
        done
    fi
    if [ $Recursive = true ]; then
        for a in `eval echo \\\$${1}Dep_$2`; do
            build_instruction_set $1 $a "$3"
        done
    fi
    InstSet_Base="`echo $InstSet_Base | sed 's/$ *//'`"                                         # Remove trailing space.
}

trap "exit $EXIT_USAGE" TRAP    # Allow usage() function to abort script execution.
export TOP_PID=$$               # PID of executing process.

# Parse command line arguments.
while getopts ":ra:s:LliIcf:Fd:D:CvVm:nB:A:h" o; do
    case $o in
        r) Recursive=true ;;
        a) build_instruction_set Arch "$OPTARG" "architecture" ;;
        s) build_instruction_set InstSet "$OPTARG" "instruction set" ;;
        L) echo $ArchList; exit $EXIT_USAGE ;;
        l) echo $InstSetList; exit $EXIT_USAGE ;;
        i)
            if [ -n "$InstSet_Base" ]; then
                echo $InstSet_Base
                exit $EXIT_USAGE
            else
                echo -e "No instruction set or architecture set.\n"
                usage
            fi
            ;;
        I)
            if [ -n "$InstSet_Base" ]; then
                for s in $InstSet_Base; do
                    echo -ne "\e[31;1m$s:\e[0m "
                    eval echo "\$Opcode_$s"
                done
                exit $EXIT_USAGE
            else
                echo -e "No instruction set or architecture set.\n"
                usage
            fi
            ;;
        c) Count_Matching=true ;;
        f)
            # Unlike architectures, instruction sets are disjoint.
            Found=false
            for s in $InstSetList; do
                for b in `eval echo \\\$InstSet_$s`; do
                    Found_In_Base=false
                    for i in `eval echo \\\$Opcode_$b`; do
                        if [[ "$i" =~ ^$OPTARG$ ]]; then
                            $Found_In_Base || echo -ne "Instruction set \e[33;1m$s\e[0m (base instruction set \e[32;1m$b\e[0m):"
                            echo -ne " \e[31;1m$i\e[0m"
                            Found_In_Base=true
                            Found=true
                        fi
                    done
                    $Found_In_Base && echo ""
                done
            done
            if [ $Found = false ]; then
                echo -e "Operation code \e[31;1m$OPTARG\e[0m has not been found in the database of known instructions." \
                "Perhaps it is translated using other than Intel syntax. If obtained from objdump, check if the \`-M intel' flag is set." \
                "Be aware that the search is case sensitive by default (you may use the -C flag, otherwise only lower case opcodes are accepted)."
                exit $EXIT_NOT_FOUND
            else
                exit $EXIT_FOUND
            fi
            ;;
        d) Leading_Separator="$OPTARG" ;;
        D) Trailing_Separator="$OPTARG" ;;
        C) Case_Insensitive=true ;;
        v) Invert=true ;;
        V) Verbose=true ;;
        m) Stop_After=$OPTARG ;;
        n) Line_Numbers=true ;;
        B) Leading_Context=$OPTARG ;;
        A) Trailing_Context=$OPTARG ;;
        h) usage ;;
        \?)
            echo -e "Unknown option: -$OPTARG\n"
            usage
            ;;
    esac
done
shift $((OPTIND-1))
[ -n "$1" ] && echo -e "Unknown command line parameter: $1\n" && usage
[ -z "$InstSet_Base" ] && usage

# Create list of grep parameters.
Grep_Params="--color=auto -B $Leading_Context -A $Trailing_Context"
[ $Count_Matching = true ] && Grep_Params="$Grep_Params -c"
[ $Case_Insensitive = true ] && Grep_Params="$Grep_Params -i"
[ $Invert = true ] && Grep_Params="$Grep_Params -v"
[ $Stop_After -gt 0 ] && Grep_Params="$Grep_Params -m $Stop_After"
[ $Line_Numbers = true ] && Grep_Params="$Grep_Params -n"

# Build regular expression for use in grep.
RegEx=""
for s in $InstSet_Base; do
    eval RegEx=\"$RegEx \$Opcode_$s\"
done
# Add leading and trailing opcode separators to prevent false positives.
RegEx="$Leading_Separator`echo $RegEx | sed "s/ /$(echo "$Trailing_Separator"|sed 's/[\/&]/\\\&/g')|$(echo "$Leading_Separator"|sed 's/[\/&]/\\\&/g')/g"`$Trailing_Separator"

[ $Verbose = true -a $Count_Matching = false ] && RegEx="$RegEx|\$"

# The actual search.
grep $Grep_Params -E "$RegEx" && exit $EXIT_FOUND || exit $EXIT_NOT_FOUND

Please be aware that if your search query is too large (e.g., with Haswell instruction set and the -r switch - this includes hundreds of instructions), the computation may proceed slowly and take a long time on large inputs which this simple script was not intended for.

For detailed information on usage consult

./opcode -h

The whole opcode script (with Opcode_list included) can be found at http://pastebin.com/A8bAuHAP.

Feel free to improve the tool and to correct any mistakes I might have made. Lastly, I would like to thank Jonathan Ben-Avraham for his great idea of using Shirk's gas.vim file.

EDIT: The script is now able to find which instruction set an operation code belongs to (regular expression can be used).

Kyselejsyreček

Posted 2014-03-08T16:53:46.417

Reputation: 224

9

First, decompile your binary:

objdump -d binary > binary.asm

Then find all SSE4 instructions in the assembly file:

awk '/[ \t](mpsadbw|phminposuw|pmulld|pmuldq|dpps|dppd|blendps|blendpd|blendvps|blendvpd|pblendvb|pblenddw|pminsb|pmaxsb|pminuw|pmaxuw|pminud|pmaxud|pminsd|pmaxsd|roundps|roundss|roundpd|roundsd|insertps|pinsrb|pinsrd|pinsrq|extractps|pextrb|pextrd|pextrw|pextrq|pmovsxbw|pmovzxbw|pmovsxbd|pmovzxbd|pmovsxbq|pmovzxbq|pmovsxwd|pmovzxwd|pmovsxwq|pmovzxwq|pmovsxdq|pmovzxdq|ptest|pcmpeqq|pcmpgtq|packusdw|pcmpestri|pcmpestrm|pcmpistri|pcmpistrm|crc32|popcnt|movntdqa|extrq|insertq|movntsd|movntss|lzcnt)[ \t]/' binary.asm

(Note: CRC32 might match against comments.)

Find most common AVX instructions (including scalar, including AVX2, AVX-512 family and some FMA like vfmadd132pd):

awk '/[ \t](vmovapd|vmulpd|vaddpd|vsubpd|vfmadd213pd|vfmadd231pd|vfmadd132pd|vmulsd|vaddsd|vmosd|vsubsd|vbroadcastss|vbroadcastsd|vblendpd|vshufpd|vroundpd|vroundsd|vxorpd|vfnmadd231pd|vfnmadd213pd|vfnmadd132pd|vandpd|vmaxpd|vmovmskpd|vcmppd|vpaddd|vbroadcastf128|vinsertf128|vextractf128|vfmsub231pd|vfmsub132pd|vfmsub213pd|vmaskmovps|vmaskmovpd|vpermilps|vpermilpd|vperm2f128|vzeroall|vzeroupper|vpbroadcastb|vpbroadcastw|vpbroadcastd|vpbroadcastq|vbroadcasti128|vinserti128|vextracti128|vpminud|vpmuludq|vgatherdpd|vgatherqpd|vgatherdps|vgatherqps|vpgatherdd|vpgatherdq|vpgatherqd|vpgatherqq|vpmaskmovd|vpmaskmovq|vpermps|vpermd|vpermpd|vpermq|vperm2i128|vpblendd|vpsllvd|vpsllvq|vpsrlvd|vpsrlvq|vpsravd|vblendmpd|vblendmps|vpblendmd|vpblendmq|vpblendmb|vpblendmw|vpcmpd|vpcmpud|vpcmpq|vpcmpuq|vpcmpb|vpcmpub|vpcmpw|vpcmpuw|vptestmd|vptestmq|vptestnmd|vptestnmq|vptestmb|vptestmw|vptestnmb|vptestnmw|vcompresspd|vcompressps|vpcompressd|vpcompressq|vexpandpd|vexpandps|vpexpandd|vpexpandq|vpermb|vpermw|vpermt2b|vpermt2w|vpermi2pd|vpermi2ps|vpermi2d|vpermi2q|vpermi2b|vpermi2w|vpermt2ps|vpermt2pd|vpermt2d|vpermt2q|vshuff32x4|vshuff64x2|vshuffi32x4|vshuffi64x2|vpmultishiftqb|vpternlogd|vpternlogq|vpmovqd|vpmovsqd|vpmovusqd|vpmovqw|vpmovsqw|vpmovusqw|vpmovqb|vpmovsqb|vpmovusqb|vpmovdw|vpmovsdw|vpmovusdw|vpmovdb|vpmovsdb|vpmovusdb|vpmovwb|vpmovswb|vpmovuswb|vcvtps2udq|vcvtpd2udq|vcvttps2udq|vcvttpd2udq|vcvtss2usi|vcvtsd2usi|vcvttss2usi|vcvttsd2usi|vcvtps2qq|vcvtpd2qq|vcvtps2uqq|vcvtpd2uqq|vcvttps2qq|vcvttpd2qq|vcvttps2uqq|vcvttpd2uqq|vcvtudq2ps|vcvtudq2pd|vcvtusi2ps|vcvtusi2pd|vcvtusi2sd|vcvtusi2ss|vcvtuqq2ps|vcvtuqq2pd|vcvtqq2pd|vcvtqq2ps|vgetexppd|vgetexpps|vgetexpsd|vgetexpss|vgetmantpd|vgetmantps|vgetmantsd|vgetmantss|vfixupimmpd|vfixupimmps|vfixupimmsd|vfixupimmss|vrcp14pd|vrcp14ps|vrcp14sd|vrcp14ss|vrndscaleps|vrndscalepd|vrndscaless|vrndscalesd|vrsqrt14pd|vrsqrt14ps|vrsqrt14sd|vrsqrt14ss|vscalefps|vscalefpd|vscalefss|vscalefsd|valignd|valignq|vdbpsadbw|vpabsq|vpmaxsq|vpmaxuq|vpminsq|vpminuq|vprold|vprolvd|vprolq|vprolvq|vprord|vprorvd|vprorq|vprorvq|vpscatterdd|vpscatterdq|vpscatterqd|vpscatterqq|vscatterdps|vscatterdpd|vscatterqps|vscatterqpd|vpconflictd|vpconflictq|vplzcntd|vplzcntq|vpbroadcastmb2q|vpbroadcastmw2d|vexp2pd|vexp2ps|vrcp28pd|vrcp28ps|vrcp28sd|vrcp28ss|vrsqrt28pd|vrsqrt28ps|vrsqrt28sd|vrsqrt28ss|vgatherpf0dps|vgatherpf0qps|vgatherpf0dpd|vgatherpf0qpd|vgatherpf1dps|vgatherpf1qps|vgatherpf1dpd|vgatherpf1qpd|vscatterpf0dps|vscatterpf0qps|vscatterpf0dpd|vscatterpf0qpd|vscatterpf1dps|vscatterpf1qps|vscatterpf1dpd|vscatterpf1qpd|vfpclassps|vfpclasspd|vfpclassss|vfpclasssd|vrangeps|vrangepd|vrangess|vrangesd|vreduceps|vreducepd|vreducess|vreducesd|vpmovm2d|vpmovm2q|vpmovm2b|vpmovm2w|vpmovd2m|vpmovq2m|vpmovb2m|vpmovw2m|vpmullq|vpmadd52luq|vpmadd52huq|v4fmaddps|v4fmaddss|v4fnmaddps|v4fnmaddss|vp4dpwssd|vp4dpwssds|vpdpbusd|vpdpbusds|vpdpwssd|vpdpwssds|vpcompressb|vpcompressw|vpexpandb|vpexpandw|vpshld|vpshldv|vpshrd|vpshrdv|vpopcntd|vpopcntq|vpopcntb|vpopcntw|vpshufbitqmb|gf2p8affineinvqb|gf2p8affineqb|gf2p8mulb|vpclmulqdq|vaesdec|vaesdeclast|vaesenc|vaesenclast)[ \t]/' binary.asm

NOTE: tested with gawk and nawk.

Andriy Makukha

Posted 2014-03-08T16:53:46.417

Reputation: 233

6

Unfortunately there doesn't seem to be any well-known utility as of this date that detects the required instruction set from a given executable.

The best that I can suggest for x86 is is to use objdump -d on the ELF binary to disassemble the executable sections into Gnu Assemply language (gas). Then use Shirk's vim syntax definitions to either grep through the assembly code file or visually scan the assembler code for any of the gasOpcode_SSE41 or gasOpcode_SANDYBRIDGE_AVX instructions that you see in Shirk's gas.vim file.

The assembly language file contains the machine-level instructions ("opcodes") that the compiler generated when the program was compiled. If the program was compiled with compile-time flags for SSE or AVX instructions, and the compiler emitted any SSE or AVX instructions, then you should see one or more SSE or AVX opcodes in the dissassembly listing produced by objdump -d.

For example, if you do grep vroundsdb on the assembly code file and find a match, then you know that the binary file requires AVX capabilities to execute.

There are quite a few sub-architecture-specific instructions for x86, as you can see from Shirk's gas.vim file, So grepping for all of the opcodes for each sub-architecture would admittedly be tedious. Writing a C, Perl or Python program to do this could be an excellent idea for an Open Source project, especially if you could find someone to extend it for ARM, PPC and other architectures.

Jonathan Ben-Avraham

Posted 2014-03-08T16:53:46.417

Reputation: 936

What is the purpose of gas: I couldn't find what is that program? – user2284570 – 2014-03-08T19:40:00.090

@user2284570: I edited the answer to relate to your comment. HTH. – Jonathan Ben-Avraham – 2014-03-08T20:04:48.770

Sse4.2+AVX+3DNOW represent hundreds of instructions. It would take a long time to launch a search for each of them... – user2284570 – 2014-03-08T20:06:23.180

@user2284570: Yes, I mentioned that. If you need to do this regularly it would be better to write a Perl script based on Shirk's gas.vim. OTOH if this is a one-shot problem, then you can easily learn the patterns of the opcodes that differentiate between the sub-architectures. – Jonathan Ben-Avraham – 2014-03-08T20:10:23.103

I guess a library for dealing with opcodes would be something great for starting... – user2284570 – 2014-03-08T20:20:54.680

2

I gave writing some python utility script based on Jonathan Ben-Avrahams and Kyselejsyrečeks answers a go. Its a crudeish script but gets the job done.

https://gist.github.com/SleepProgger/d4f5e0a0ea2b9456e6c7ecf256629396 It automatically downloads and converts the gas.vim file and supports dumping all used (optional non basic) operations including the feature set they come from. Addtionally it supports op to feature set lookup.

Tries to detect which CPU features where used in a given binary.

positional arguments:
  executable            The executable to analyze or the command to lookup if
                        -l is set.

optional arguments:
  -h, --help            show this help message and exit
  -j JSON_SPECS, --json-specs JSON_SPECS
                        json file containing a command to feature mapping.
  -o JSON_OUTPUT, --json-output JSON_OUTPUT
                        json file to save the command to feature mapping
                        parsed from an gas.vim file. Defaults to same folder
                        as this scipt/specs.json
  -g GAS, --gas GAS     gas.vim file to convert to feature mapping.
  -nw, --no-json-save   Do not save converted mapping from gas.vim file.
  -b, --include-base    Include base instructions in the search.
  -l, --lookup-op       Lookup arch and feature for given command. Can be
                        regex.

SleepProgger

Posted 2014-03-08T16:53:46.417

Reputation: 121