2012-11-04

Patch strings in binary files with sed

So, you have a binary file that you need to patch. Perhaps it is a precompiled proprietary program or dynamic library that contains hard-coded paths (text strings) that you need to change.

If the file had been a text file, then sed would probably come to your rescue. For binary files there are hex editors available, but they require manual handling and can't be scripted. Other binary patch programs are out there as well, but they might not be packaged in your favorite distribution, and compiling things from source is boring. You might also need to do the patching in a packaging stage, for example when building an RPM.

So, how can you use sed then?

Well, it's quite simple. Just convert the binary file to ASCII HEX with hexdump, patch it with sed and then convert it back to binary with xxd:
hexdump -ve '1/1 "%.2X"' file.bin | \
sed "s/<pattern>/<replacement>/g" | \
xxd -r -p > file.bin.patched
Of course there are caveats to this approach. The most significant one is that you can't replace a string with a string that is longer than the original one. Shorter is OK though. Another one is that the strings must be null terminated, but this is almost always the case. You also have to create <pattern> and <replacement> yourself as the ASCII HEX representations of the null terminated strings, with their null terminators present. Further, <replacement> must be padded with null bytes (00) to the same length as <pattern>, so that the size of the file and the offsets of everything after the string stay unchanged.
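
To make the padding rule concrete, here is a minimal sketch that runs the three-command pipeline on a throwaway file. The file names demo.bin and demo.bin.patched and the strings hello, world and hi are made up for illustration; the hex values are simply their ASCII codes followed by 00.

```shell
#!/bin/bash
# Create a tiny "binary" holding two null terminated strings: hello and world.
# (demo.bin is a made-up file name used only for this illustration.)
printf 'hello\0world\0' > demo.bin

# "world" + null terminator as ASCII HEX: 77 6F 72 6C 64 00
PATTERN="776F726C6400"
# "hi" + null terminator is 68 69 00; pad with 00 up to the pattern's length
REPLACEMENT="686900000000"

hexdump -ve '1/1 "%.2X"' demo.bin | \
sed "s/${PATTERN}/${REPLACEMENT}/g" | \
xxd -r -p > demo.bin.patched

# Both files should have the same size; only the bytes of "world" changed
ls -l demo.bin demo.bin.patched
xxd demo.bin.patched
```

The padding is what keeps the two files the same size. Without it, every byte after the replaced string would shift and the binary would almost certainly stop working.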

Let's take a concrete example:

You have a binary named foo that uses some plugins. The plugins are assumed to be in /usr/local/lib/foo and this path is hard-coded in foo. You want to package foo with its plugins using your distribution's packaging system and use the file system layout that your distribution uses.

So you will put foo in /usr/bin and put all the plugins in /usr/lib/foo. But foo will still look in /usr/local/lib/foo when trying to load plugins and will therefore fail.

But if we could replace all occurrences of /usr/local/lib/foo in foo with /usr/lib/foo, everything should work. Since /usr/lib/foo is shorter than /usr/local/lib/foo, this is doable.
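
For these two paths, the hex strings could be built roughly like this. It is only a sketch: to_hex is a hypothetical helper name, and the variable names are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print a string as uppercase ASCII HEX followed by
# a null terminator (00). tr removes the line breaks xxd puts in long output.
to_hex() {
    printf '%s' "$1" | xxd -p -u | tr -d '\n'
    printf '00'
}

PATTERN_HEX="$(to_hex "/usr/local/lib/foo")"
REPLACEMENT_HEX="$(to_hex "/usr/lib/foo")"

# Pad the shorter replacement with null bytes until the lengths match
while [ "${#REPLACEMENT_HEX}" -lt "${#PATTERN_HEX}" ] ; do
    REPLACEMENT_HEX="${REPLACEMENT_HEX}00"
done

echo "pattern:     ${PATTERN_HEX}"
echo "replacement: ${REPLACEMENT_HEX}"
```

The two values can then be dropped into the sed expression in place of <pattern> and <replacement>.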

Here is an example script that does the replacement:
#!/bin/bash

function patch_strings_in_file() {
    local FILE="$1"
    local PATTERN="$2"
    local REPLACEMENT="$3"
    local STRINGS OLD_STRING NEW_STRING OLD_STRING_HEX NEW_STRING_HEX

    # Find all unique strings in FILE that contain the pattern.
    # grep -F treats PATTERN as a fixed string, not as a regular expression.
    STRINGS=$(strings "${FILE}" | grep -F -- "${PATTERN}" | sort -u -r)

    if [ -n "${STRINGS}" ] ; then
        echo "File '${FILE}' contains strings with '${PATTERN}' in them:"

        # Read line by line so that strings containing spaces stay intact
        while IFS= read -r OLD_STRING ; do
            # Create the new string with a simple bash-replacement
            NEW_STRING=${OLD_STRING//"${PATTERN}"/${REPLACEMENT}}

            # Create null terminated ASCII HEX representations of the strings.
            # tr removes the line breaks xxd inserts into long output.
            OLD_STRING_HEX="$(printf '%s' "${OLD_STRING}" | xxd -g 0 -u -ps -c 256 | tr -d '\n')00"
            NEW_STRING_HEX="$(printf '%s' "${NEW_STRING}" | xxd -g 0 -u -ps -c 256 | tr -d '\n')00"

            if [ ${#NEW_STRING_HEX} -le ${#OLD_STRING_HEX} ] ; then
                # Pad the replacement string with null terminations so the
                # length matches the original string
                while [ ${#NEW_STRING_HEX} -lt ${#OLD_STRING_HEX} ] ; do
                    NEW_STRING_HEX="${NEW_STRING_HEX}00"
                done

                # Now, replace every occurrence of OLD_STRING with NEW_STRING
                echo -n "Replacing ${OLD_STRING} with ${NEW_STRING}... "
                hexdump -ve '1/1 "%.2X"' "${FILE}" | \
                sed "s/${OLD_STRING_HEX}/${NEW_STRING_HEX}/g" | \
                xxd -r -p > "${FILE}.tmp"
                # Copy the permissions of the original file
                # (--reference is a GNU chmod option)
                chmod --reference "${FILE}" "${FILE}.tmp"
                mv "${FILE}.tmp" "${FILE}"
                echo "Done!"
            else
                echo "New string '${NEW_STRING}' is longer than old" \
                     "string '${OLD_STRING}'. Skipping."
            fi
        done <<< "${STRINGS}"
    fi
}

patch_strings_in_file foo "/usr/local/lib/foo" "/usr/lib/foo"
Please note that this way of replacing strings does not allow for full-blown regexp usage. You could probably modify the script to be more versatile if you have more complex needs.

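As always, watch out when you patch binary files. There could be circumstances when a replacement as described above will break the binary. For one thing, sed sees the file as one long string of hex digits, so a pattern can in principle match starting in the middle of a byte. Test, test, test and test some more...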
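
One cheap sanity check, sketched below, is to compare the patched file against a backup of the original: the size must be identical and only a handful of bytes should differ. The function name verify_patch and the verify_*.bin file names are hypothetical and exist only for this illustration.

```shell
#!/bin/bash
# Hypothetical helper: compare a patched binary against a copy of the original.
verify_patch() {
    local original="$1" patched="$2"
    local size_a size_b
    size_a=$(wc -c < "$original" | tr -d ' ')
    size_b=$(wc -c < "$patched" | tr -d ' ')

    if [ "$size_a" -ne "$size_b" ] ; then
        echo "Size changed: $size_a -> $size_b bytes" >&2
        return 1
    fi

    # cmp -l prints one line for every byte that differs
    echo "Bytes changed: $(cmp -l "$original" "$patched" | wc -l | tr -d ' ')"
}

# Made-up stand-ins for an original and a patched binary
printf 'hello\0world\0' > verify_orig.bin
printf 'hello\0hi\0\0\0\0' > verify_patched.bin

verify_patch verify_orig.bin verify_patched.bin
```

This only shows that the patch stayed within bounds. It says nothing about whether the program still behaves correctly, so running the binary and exercising the patched paths is still necessary.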

12 comments:

  1. Thanks for this text. I learned much while reading it and googled up keywords in it. Great stuff man.

  2. Thank you for the information.
    You might want to add this after the xxd line

    chmod --reference ${FILE} ${FILE}.tmp

    Replies
    1. Thanks for the suggestion isong, I have updated the post with a chmod!

  3. This comment has been removed by the author.

  4. I modified the script to handle spaces:
    ..
    STRINGS=$(strings ${FILE} | grep ${PATTERN} | sort -u -r | sed -e 's/ /@@@@@/g')
    ..
    for EL in ${STRINGS} ; do
    OLD_STRING=${EL//@@@@@/ }
    # Create the new string with a simple bash-replacement
    NEW_STRING=${OLD_STRING//${PATTERN}/${REPLACEMENT}}
    # Create null terminated ASCII HEX representations of the strings
    OLD_STRING_HEX="$(echo -n "${OLD_STRING}" | xxd -g 0 -u -ps -c 256)00"
    NEW_STRING_HEX="$(echo -n "${NEW_STRING}" | xxd -g 0 -u -ps -c 256)00"

  5. Dude, this script works flawlessly. I can't even believe it.

  6. Can anyone help with bash on OS X?

    chmod: illegal option -- -
    usage: chmod [-fhv] [-R [-H | -L | -P]] [-a | +a | =a [i][# [ n]]] mode|entry file ...
    chmod [-fhv] [-R [-H | -L | -P]] [-E | -C | -N | -i | -I] file ...

  7. Clever script. Thank you. I fixed an issue I ran into, caused by the xxd -c 256 parameter, which introduces a newline into the piped sed expression and makes its syntax invalid. This happens when strings finds matches longer than 256 characters. The error for me was:
    sed: -e expression #1, char 514: unterminated `s' command
    Fix was:
    ...
    OLD_STRING_HEX="$(echo -n ${OLD_STRING} | xxd -g 0 -u -ps -c 256 | tr -d '\n')00"
    NEW_STRING_HEX="$(echo -n ${NEW_STRING} | xxd -g 0 -u -ps -c 256 | tr -d '\n')00"
    ...

  8. Awesome script. FYI: you may not want to null terminate OLD_STRING_HEX and NEW_STRING_HEX if you're doing a partial string replacement.

