From 962813c11fc4489259f8de1ccda5f7d87f92c0d7 Mon Sep 17 00:00:00 2001
From: cyfraeviolae
+ Parameters for JPEG and BMP:
+
+ Key 1: 8007941455b5af579bb12fff92ef31a3
+
+ Key 2: 14ef746e8b1792e52b1d22ef124fae97
+
+ Nonce: 4a4f5247454c424f52474553
+
+ Parameters for PDF and PDF:
+
+ Key 1: c94a4dbd95faf02bdc0c39e0c0984299
+
+ Key 2: e4d26cdfbc732473103a5a887a755e19
+
+ Nonce: 4a4f5247454c424f52474553
+
- You can test your ciphertext with Go. Run the following in a shell,
- then open /tmp/polyglot-first.jpg and /tmp/polyglot-second.bmp
- in an image viewer. You may need to alter the path of polyglot.enc to reflect
- your download directory.
+ You can test your ciphertext with Go. You may need to alter the
+ path of polyglot.enc to reflect your download
+ directory.
curl -L -o /tmp/decrypt-aes-gcm.go https://cyfraeviolae.org/forbidden-salamanders/static/decrypt-aes-gcm.go
go build -o /tmp/decrypt-aes-gcm /tmp/decrypt-aes-gcm.go
-< polyglot.enc /tmp/decrypt-aes-gcm 8007941455b5af579bb12fff92ef31a3 4a4f5247454c424f52474553 > /tmp/polyglot-first.jpg
-< polyglot.enc /tmp/decrypt-aes-gcm 14ef746e8b1792e52b1d22ef124fae97 4a4f5247454c424f52474553 > /tmp/polyglot-second.bmp
+
+# For JPEG and BMP
+< polyglot.enc > /tmp/polyglot-first.jpg /tmp/decrypt-aes-gcm 8007941455b5af579bb12fff92ef31a3 4a4f5247454c424f52474553
+< polyglot.enc > /tmp/polyglot-second.bmp /tmp/decrypt-aes-gcm 14ef746e8b1792e52b1d22ef124fae97 4a4f5247454c424f52474553
+
+# For PDF and PDF
+< polyglot.enc > /tmp/polyglot-first.pdf /tmp/decrypt-aes-gcm c94a4dbd95faf02bdc0c39e0c0984299 4a4f5247454c424f52474553
+< polyglot.enc > /tmp/polyglot-second.pdf /tmp/decrypt-aes-gcm e4d26cdfbc732473103a5a887a755e19 4a4f5247454c424f52474553
-
Attack outline.
@@ -154,21 +187,120 @@ go build -o /tmp/decrypt-aes-gcm /tmp/decrypt-aes-gcm.go
\]
- Note that the choice to place the extra block in the final position was arbitrary. For the attack below we will instead need
+ Note that the choice to place the extra block in the final position was arbitrary. For the JPEG/BMP attack we will instead need
 to change the penultimate block rather than adding a block; the computation is similar.
-
- For the next phase, we construct a ciphertext that decrypts to a valid JPEG under one key and a valid BMP under another.
+ For the next phase, we construct a ciphertext that decrypts to one file under one key and to a different file under another key.
+ Recall that the ciphertext of AES-GCM, as in AES-CTR, is computed by taking the XOR of the keystream and the message. The keystream is computed from the cipher key and the nonce.
- The basic strategy is to place the JPEG bytes and BMP bytes at different locations, carefully arranging it so
- each parser will ignore the other data for the file. JPEG files can include comments, in which we will include the
- BMP data. The BMP parser will stop reading as soon as the indicated length of the BMP has been read, after which
- we will include the JPEG data. In each decrypted file, the data for the other image will be scrambled as we are using
- a different key, but it will not matter as the junk data will be in a location that is ignored by the image parser.
+ First we will present the PDF and PDF collision (this is a new
+ contribution); then we will present the JPEG and BMP collision
+ (this was shown in the Invisible Salamanders paper).
+
+
+ The construction we show will not result in specification-valid PDFs, but nevertheless, most PDF viewers will render them as intended.
+
+ The PDF file format starts with a header:
+
+ %PDF-1.7
+ %µ¶
+
+ The header is followed by a sequence of objects. A simple object containing a stream is shown below, with the data
+ for the object inserted at [DATA].
+
+ 1 0 obj
+ <<>>
+ stream
+ [DATA]
+ endstream
+ endobj
+
+ At the end of the file, an xref table records where each object is located in the file,
+ and finally, the file ends with the line %%EOF, after which no more bytes are read by the PDF parser.
+
+
+ Our strategy will be to place a new stream object at the beginning of the first PDF file. This object
+ will include the entirety of the second PDF file. Because the xref table does not reference this new object,
+ a viewer opening the first PDF will not attempt to render the additional data.
+
+ Fix an arbitrary nonce \(n\) and key \(k_1\). We need our ciphertext header \(c_H\) to decrypt to the following
+ plaintext \(m_H\) under \(k_1\). We choose the object ID 0 0 obj as it is meant to be reserved and unused.
+
+ %PDF-1.7
+ %µ¶
+
+ 0 0 obj
+ <<>>
+ stream
+
+ Therefore we set \(c_H = \operatorname{AES-GCTR}(k_1, n, m_H)\),
+ where \(\operatorname{AES-GCTR}\) returns the ciphertext portion of \(\operatorname{AES-GCM}\) but not the MAC.
+
+ Since we will place the second PDF file afterwards, we need this header to be ignored when decrypted under the second key \(k_2\).
+ PDF files can include comments, which start with % and extend until a newline \n, and these
+ comments are ignored by the PDF parser.
+
+ Sample random keys \(k_2\) until the decryption of \(c_H\) under
+ \(k_2\) yields a plaintext that starts with %,
+ ends with \n, and does not contain any newlines in
+ between. If we model the keystream
+ output as uniformly random, the expected number of attempts is
+ \[ \frac{1}{\frac{1}{256}\frac{1}{256}\left(1-\frac{1}{256}\right)^{\vert m_H \vert-2}} \approx 76{,}045, \]
+ which is possible in less than a minute on a desktop computer.
+ Note that these keys and header are independent of the specific PDF files we wish
+ to collide; thus, they can be precomputed.
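The key search described above can be sketched in Go. This is a minimal illustration under stated assumptions, not the author's tooling: the helper names `gctr`, `isCommentLine`, and `findSecondKey` are hypothetical, and the exact bytes of \(m_H\) (Latin-1 µ¶ and \n line endings) are assumed. It relies on the fact that, for a 96-bit nonce, AES-GCM encrypts the message in counter mode with counter blocks starting at nonce || 00000002.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// gctr XORs data with the AES-GCM keystream for (key, nonce); it encrypts and decrypts.
// Assumes a 12-byte nonce, so the first data block uses counter nonce || 00000002.
// (Go's CTR increments the full 128-bit counter; this matches GCM for short inputs.)
func gctr(key, nonce, data []byte) []byte {
	if len(nonce) != 12 {
		panic("sketch only supports 96-bit nonces")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		panic(err)
	}
	iv := make([]byte, 16)
	copy(iv, nonce)
	iv[15] = 2
	out := make([]byte, len(data))
	cipher.NewCTR(block, iv).XORKeyStream(out, data)
	return out
}

// isCommentLine reports whether p starts with '%', ends with '\n',
// and has no newline in between, i.e. it is one PDF comment line.
func isCommentLine(p []byte) bool {
	n := len(p)
	if n < 2 || p[0] != '%' || p[n-1] != '\n' {
		return false
	}
	return bytes.IndexByte(p[1:n-1], '\n') < 0
}

// findSecondKey samples random 128-bit keys until cH decrypts to a comment line.
func findSecondKey(cH, nonce []byte) (key []byte, attempts int) {
	key = make([]byte, 16)
	for {
		attempts++
		if _, err := rand.Read(key); err != nil {
			panic(err)
		}
		if isCommentLine(gctr(key, nonce, cH)) {
			return key, attempts
		}
	}
}

func main() {
	nonce, _ := hex.DecodeString("4a4f5247454c424f52474553")
	k1, _ := hex.DecodeString("c94a4dbd95faf02bdc0c39e0c0984299")
	// Assumed byte layout of m_H; the real header bytes may differ.
	mH := []byte("%PDF-1.7\n%\xb5\xb6\n\n0 0 obj\n<<>>\nstream\n")
	cH := gctr(k1, nonce, mH)
	k2, attempts := findSecondKey(cH, nonce)
	fmt.Printf("k2 = %x after %d attempts\n", k2, attempts)
	fmt.Printf("header under k2: %q\n", gctr(k2, nonce, cH))
}
```

Because the search depends only on \(c_H\) and the nonce, the key it finds can be reused for any pair of PDF files.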
+
+ To the ciphertext header, we append the encryption of \(\textrm{PDF2}\) under \(k_2\).
+ We need to end the stream object tag in the first PDF,
+ so set \(m_E\) as
+
+ endstream
+ endobj
+
+ and then append the encryption of \(m_E\) under \(k_1\), then the encryption of \(\textrm{PDF1}\)
+ under \(k_1\).
+
+ Finally, we pad the ciphertext with \(p\) bytes until it is a multiple of the block
+ length (16 for AES-GCM), then append an extra block \(X\) so the MACs collide,
+ as described earlier.
+ Because PDF parsers stop reading immediately when
+ they see %%EOF, the second PDF file will not be
+ corrupted by the first PDF file appearing afterwards, nor will
+ the first PDF file be corrupted by the MAC collision block appearing afterwards.
+
+ We summarize the construction below.
+ For the blank cells in the ciphertext row, use either the
+ encryption of the first PDF cell under \(k_1\) or the second PDF cell under
+ \(k_2\) as indicated.
+ \[
+ \begin{array}{|c|c|}\hline
+ \mathsf{PDF 1} &m_H & & m_{E} & \textrm{PDF1} & \\
+ C & & & & & \mathtt{00}^{p} & X \\
+ \mathsf{PDF 2} && \textrm{PDF2} & & & \\\hline
+ \end{array}
+ \]
+
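The layout in the table can be sketched in Go as follows. This is an illustration only: `gctr`, `encryptAt`, and `assemble` are hypothetical names, the keys and file contents in `main` are toy values, and the MAC collision block \(X\) is left out because its computation is described separately. In the real attack \(k_2\) must also be chosen so that the header decrypts to a comment.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"fmt"
)

// gctr XORs data with the AES-GCM keystream for (key, nonce).
// Assumes a 12-byte nonce, so the first data block uses counter nonce || 00000002.
func gctr(key, nonce, data []byte) []byte {
	if len(nonce) != 12 {
		panic("sketch only supports 96-bit nonces")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		panic(err)
	}
	iv := make([]byte, 16)
	copy(iv, nonce)
	iv[15] = 2
	out := make([]byte, len(data))
	cipher.NewCTR(block, iv).XORKeyStream(out, data)
	return out
}

// encryptAt encrypts seg as if it sat at byte offset off of the message,
// by running the keystream over off dummy bytes first and discarding them.
func encryptAt(key, nonce []byte, off int, seg []byte) []byte {
	buf := make([]byte, off+len(seg))
	copy(buf[off:], seg)
	return gctr(key, nonce, buf)[off:]
}

// assemble builds c_H || Enc_k2(PDF2) || Enc_k1(m_E) || Enc_k1(PDF1) || 00^p.
// The collision block X is NOT appended here.
func assemble(k1, k2, nonce, mH, pdf2, mE, pdf1 []byte) []byte {
	c := encryptAt(k1, nonce, 0, mH)
	c = append(c, encryptAt(k2, nonce, len(c), pdf2)...)
	c = append(c, encryptAt(k1, nonce, len(c), mE)...)
	c = append(c, encryptAt(k1, nonce, len(c), pdf1)...)
	for len(c)%16 != 0 { // pad to a multiple of the AES block length
		c = append(c, 0)
	}
	return c
}

func main() {
	// Toy inputs for illustration; not the parameters from the text.
	k1 := []byte("0123456789abcdef")
	k2 := []byte("fedcba9876543210")
	nonce := []byte("JORGELBORGES")
	mH := []byte("%PDF-1.7\n\n0 0 obj\n<<>>\nstream\n")
	mE := []byte("\nendstream\nendobj\n")
	pdf1 := []byte("%PDF-1.7 (first file) %%EOF\n")
	pdf2 := []byte("%PDF-1.7 (second file) %%EOF\n")

	c := assemble(k1, k2, nonce, mH, pdf2, mE, pdf1)
	d1 := gctr(k1, nonce, c)
	d2 := gctr(k2, nonce, c)
	fmt.Printf("under k1, header: %q\n", d1[:len(mH)])
	fmt.Printf("under k2, embedded file: %q\n", d2[len(mH):len(mH)+len(pdf2)])
}
```

Each segment is encrypted at its final offset because the keystream position, not just the key, determines the ciphertext bytes.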
+
+ The basic strategy shown by the Invisible Salamanders paper is to
+ place the JPEG bytes and BMP bytes at different locations,
+ carefully arranging it so each parser will ignore the other data
+ for the file. JPEG files can include comments, in which we will
+ include the BMP data. The BMP parser will stop reading as soon as
+ the indicated length of the BMP has been read, after which we will
+ include the JPEG data. In each decrypted file, the data for the
+ other image will be scrambled as we are using a different key, but
+ it will not matter as the junk data will be in a location that is
+ ignored by the image parser.
 All JPEG files start with the magic bytes \(\mathtt{ffd8}\) and end
@@ -192,7 +324,7 @@ go build -o /tmp/decrypt-aes-gcm /tmp/decrypt-aes-gcm.go
 include the size of the color array (the pixels of the image) in
 the initial metadata. BMP parsers ignore any data after the color array
 is supposed to be over, even if the file length has not been
- exhausted yet. That means we can set \(J=\mathtt{ffff}=65536\), and the
+ exhausted yet. That means we can set \(J=\mathtt{ffff}=65{,}535\), and the
 resulting header will be valid for any BMP file less than \(J\) bytes.
@@ -210,13 +342,12 @@ go build -o /tmp/decrypt-aes-gcm /tmp/decrypt-aes-gcm.go
 takes less than a minute on a desktop computer.

- We have now computed the ciphertext header \(C_{H}\) and two keys
+ We have now computed the ciphertext header \(c_{H}\) and two keys
 which will decrypt it to the correct header bytes for both files.
- Note that \(C_{H}\) only depends on the maximum size of
+ Note that \(c_{H}\) only depends on the maximum size of
 the BMP file, and thus can be precomputed. The remainder of the attack
 that depends on the specific images is very fast.

-
 As explained before, we place the BMP bytes in the JPEG comment, add
 padding to finish the comment, and add the JPEG bytes after the comment
 is over.
@@ -225,7 +356,7 @@ go build -o /tmp/decrypt-aes-gcm /tmp/decrypt-aes-gcm.go
 \[
 \begin{array}{|c|c|}\hline
 \mathsf{JPEG} && & & \textrm{JPEG} & \mathtt{ffd9} \\
- C & C_H & & \mathtt{00}^{J-\vert \textrm{BMP}\vert}& & \\
+ C & c_H & & \mathtt{00}^{J-\vert \textrm{BMP}\vert}& & \\
 \mathsf{BMP} && \textrm{BMP} & & & \\\hline
 \end{array}
 \]
@@ -260,7 +391,7 @@ go build -o /tmp/decrypt-aes-gcm /tmp/decrypt-aes-gcm.go
 \[
 \begin{array}{|c|c|}\hline
 \mathsf{JPEG} && & & \mathrm{JPEG} & \mathtt{fffe} & J' & & & & \mathtt{ffd9} \\
- C & C_{H} & & \mathtt{00}^{J-\vert \textrm{BMP}\vert}& & & &\mathtt{00}^{J'-30} & X & \mathtt{00}^{14}& \\
+ C & c_{H} & & \mathtt{00}^{J-\vert \textrm{BMP}\vert}& & & &\mathtt{00}^{J'-30} & X & \mathtt{00}^{14}& \\
 \mathsf{BMP} && \textrm{BMP} & & \\\hline
 \end{array}
 \]
-- 
cgit v1.2.3