
I have many password-protected Microsoft Excel worksheets and I need to be able to edit them. I have forgotten the password. I know that I can edit each file and take out the password check, but I would have to do this on quite a few worksheets. They all have the same password, so I would like to crack it. I already have the sheetProtection element:

<sheetProtection algorithmName="SHA-512" hashValue="exBYsHJhLjU2iumFwDs7uFdq6WBWwAyJj3DqKQg85bmTocK4dNrqCpPePQC23Rikd5QSY5WyREknFGhRxKcB2w==" saltValue="DdJzMvZ9KqpGuTrabHJ1eg==" spinCount="100000" sheet="1" objects="1" scenarios="1"/>

I tried using John the Ripper to crack this password, but it just gives a "no password hash loaded" error. I also tried office2john (a tool that converts Microsoft Office files' hashes into a format that John can use: https://github.com/magnumripper/JohnTheRipper/blob/bleeding-jumbo/run/office2john.py), but I still get the same error.
How would I go about breaking this password? (For the sake of this example, the password posted here is "john".)

Ethan

1 Answer


office2john is designed for encrypted files, not files that just have an easily-removable snake-oil protection field. Anyway, here's a quick implementation of the algorithm that they use:

#!/usr/bin/env python3

from base64 import b64encode, b64decode
from hashlib import sha512
import struct

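def hash_password(password, salt, spincount, hash_fn):
    # Initial round: hash the decoded salt followed by the UTF-16LE password.
    result = hash_fn(b64decode(salt) + password.encode('utf_16_le')).digest()
    # Each spin appends the iteration counter as a 32-bit little-endian integer.
    for i in range(spincount):
        result = hash_fn(result + struct.pack('<I', i)).digest()
    # Decode so that print() shows plain base64 text rather than a bytes repr.
    return b64encode(result).decode('ascii')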

print(hash_password('pwd', '876MLoKTq42+/DLp415iZQ==', 100000, sha512))
# 5l3mgNHXpWiFaBPv5Yso1Xd/UifWvQWmlDnl/hsCYbFT2sJCzorjRmBCQ/3qeDu6Q/4+GIE8a1DsdaTwYh1q2g==

I took that test case from LibreOffice.

This doesn't produce the right answer with what you posted, though:

print(hash_password('john', 'DdJzMvZ9KqpGuTrabHJ1eg==', 100000, sha512))
# L69ms0LAD5mz5M8RdRtcn1UTSXWSfX9YI9hK9mPW1n4eW6I8ilTLi6el6LafMj2RVsxjg2aumqSeIfFk25drVw==
print(hash_password('**john**', 'DdJzMvZ9KqpGuTrabHJ1eg==', 100000, sha512))
# 7YpVVKy6la5CEYWmfbUbF6BFgeopeT/Uh52as+I+NVYqFRqVFvSUWfdCP1J5dUGeWBy3cJHan+i/IZieabrGGw==

The OOXML specification for hashValue says "This value shall be compared with the resulting hash value after hashing the user-supplied password using the algorithm specified by the preceding attributes and parent XML element" and "The hashValue attribute value of 9oN7nWkCAyEZib1RomSJTjmPpCY= specifies that the user-supplied password must be hashed using the pre-processing defined by the parent element (if any) followed by the SHA-1 algorithm (specified via the algorithmName attribute value of SHA-1) and that the resulting hash value must be 9oN7nWkCAyEZib1RomSJTjmPpCY= for the protection to be disabled."
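To make that comparison concrete, here is a minimal sketch that reads the attributes straight out of a sheetProtection element and checks one candidate password against hashValue. The names check_password and HASHES are made up for this illustration, and the sketch assumes there is no extra preprocessing from a parent element, so it may well reject the real password if such preprocessing exists.

```python
import hmac
import struct
import xml.etree.ElementTree as ET
from base64 import b64decode
from hashlib import sha1, sha256, sha384, sha512

# Hypothetical lookup table for this sketch; only the SHA family is covered.
HASHES = {"SHA-1": sha1, "SHA-256": sha256, "SHA-384": sha384, "SHA-512": sha512}


def check_password(element_xml, password):
    """Return True if `password` hashes to the element's hashValue.

    Assumes the salt-then-spin scheme shown earlier and no parent-element
    preprocessing of the password.
    """
    attrs = ET.fromstring(element_xml).attrib
    hash_fn = HASHES[attrs["algorithmName"]]
    # Initial round: decoded salt followed by the UTF-16LE password.
    digest = hash_fn(b64decode(attrs["saltValue"]) + password.encode("utf_16_le")).digest()
    # Each spin appends the counter as a 32-bit little-endian integer.
    for i in range(int(attrs["spinCount"])):
        digest = hash_fn(digest + struct.pack("<I", i)).digest()
    # Compare raw bytes against the decoded hashValue attribute.
    return hmac.compare_digest(digest, b64decode(attrs["hashValue"]))
```

Pass it the element exactly as it appears in the worksheet XML, for example check_password('<sheetProtection algorithmName="SHA-512" ... />', 'john').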

I suspect this difference comes from a parent XML element specifying an algorithm, which you didn't bother to show us. If you want to crack your password and not just remove it, then you need to do two things:

  1. Account for the extra preprocessing algorithm, if you have one
  2. Either add functionality to John the Ripper to support sheet protection passwords (which you'll need to keep separate from the existing Office encryption functionality), or write your own cracker based on the algorithm.
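If you go the "write your own cracker" route, a rough sketch of a dictionary attack could look like the following. It repeats the same salt-then-spin iteration as the implementation above; the crack function and its parameters are names invented for this example, and it does not account for any extra preprocessing step, so treat it as a starting point rather than a working tool for your file.

```python
import struct
from base64 import b64decode
from hashlib import sha512


def hash_password(password, salt, spincount, hash_fn=sha512):
    """Same iteration as the implementation above, but on raw bytes."""
    digest = hash_fn(salt + password.encode("utf_16_le")).digest()
    for i in range(spincount):
        digest = hash_fn(digest + struct.pack("<I", i)).digest()
    return digest


def crack(hash_value, salt_value, spincount, candidates):
    """Return the first candidate matching the base64 hashValue, or None.

    `candidates` is any iterable of strings, such as an open wordlist file.
    """
    target = b64decode(hash_value)
    salt = b64decode(salt_value)
    for candidate in candidates:
        # Strip line endings so a file object can be passed in directly.
        candidate = candidate.rstrip("\r\n")
        if hash_password(candidate, salt, spincount) == target:
            return candidate
    return None
```

You could feed it a wordlist with something like crack(hash_value, salt_value, 100000, open("wordlist.txt", encoding="utf-8", errors="ignore")), where wordlist.txt is a placeholder for whatever list you have. With a spin count of 100000 every guess costs 100001 hash calls, so pure Python is only practical for short wordlists.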