Opening a PDF link in the browser (e.g. google chrome with the ootb PDF viewer plugin) apparently indicates that when the PDF is hosted on a cloudflare-facing domain there is additional data present in the embed code.
Inspecting the page source of a displayed PDF file with chrome dev tools shows some 'reporting' URL when the PDF is behind cloudflare e.g. https://a.nel.cloudflare.com/report/v3?s=%2BW057P981N7Esg...
(see the second code block).
PDF embed of a file NOT served via cloudflare:
<embed id="plugin" type="application/x-google-chrome-pdf" src="https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf" stream-url="chrome-extension://mhjfbmdgcfjbbpaeojofohoefgiehjai/f02f891e-7fd9-4857-8a34-f4e05abb87f8" headers="accept-ranges: bytes
cache-control: max-age=21600
content-length: 13264
content-type: application/pdf; qs=0.001
date: Sun, 05 Sep 2021 08:17:57 GMT
etag: "33d0-438b181451e00"
expires: Sun, 05 Sep 2021 14:17:57 GMT
last-modified: Mon, 27 Aug 2007 17:15:36 GMT
strict-transport-security: max-age=15552000; includeSubdomains; preload
x-backend: ssl-mirrors
" background-color="4283586137" javascript="allow" full-frame="" pdf-viewer-update-enabled="">
PDF embed for a file that IS served via cloudflare:
<embed id="plugin" type="application/x-google-chrome-pdf" src="https://www.cloudflare.com/static/839a7f8c9ba01f8cfe9d0a41c53df20c/cloudflare-cdn-whitepaper-19Q4.pdf" stream-url="chrome-extension://mhjfbmdgcfjbbpaeojofohoefgiehjai/fab5433b-5189-4469-91bb-fe144b761c7f" headers="accept-ranges: bytes
age: 105287
alt-svc: h3-27=":443"; ma=86400, h3-28=":443"; ma=86400, h3-29=":443"; ma=86400, h3=":443"; ma=86400
cache-control: max-age=8640000
cf-cache-status: HIT
cf-ray: 689e1d381a951501-MAD
content-length: 921473
content-type: application/pdf
date: Sun, 05 Sep 2021 08:33:41 GMT
etag: static/839a7f8c9ba01f8cfe9d0a41c53df20c/cloudflare-cdn-whitepaper-19Q4.797a721498.pdf
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
nel: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=Bi6bZw6jf1FJoimuy2arirenUDiwyZX%2B%2B1Ty506xD9qMJ5UggIvZAy2h8gKogsJORkPlWdnZ12udf6CN%2BadaEF0FRKFAyZQabI6xkui0%2FrV%2BaCFsp7BmbEHnoLk0HPmJ6pMeMQ%3D%3D"}],"group":"cf-nel","max_age":604800}
server: cloudflare
strict-transport-security: max-age=31536000
vary: Accept-Encoding
x-content-type-options: nosniff
x-frame-options: SAMEORIGIN
x-xss-protection: 1; mode=block
" background-color="4283586137" javascript="allow" full-frame="" pdf-viewer-update-enabled="">
Question
Does this imply that cloudflare is rewriting the HTML source for PDF embeds and tracking PDF files opened through the browser PDF plugins? What are the security/privacy implications of this? Would disabling the browser PDF embed plugin reduce the amount of data collected by cloudflare?
What is particularly confusing is that the <embed/>
code is supposedly generated by the PDF browser plugin and NOT from the incoming response so how can this rewriting be happening specifically for cloudflare?