A month ago, I wrote a very detailed and critical review of the Coalition for Content Provenance and Authenticity (C2PA) specification for validating media metadata. My basic conclusion was that C2PA introduces more problems than it tries to solve, and it didn't solve any problems.
Although a few people have left comments on the blog, I have had a near-daily stream of new contacts writing in. Some are curious developers, while others are people tasked with evaluating or implementing C2PA. The comments have come from small organizations, Fortune-500 companies, and even other CAI and C2PA members. (Since some of them mentioned not having permission to publicly comment, I'm treating them all as anonymous sources.)
Despite having different areas of interest, they all seem to ask variations of the same basic questions. (The questions are so frequent that I've often been cutting and pasting text into replies.) In this blog entry, I'm going to cover the most common questions:
-
Are there any updates?
-
How bad is C2PA's trust model?
-
Isn't C2PA's tamper detection good enough?
-
Is C2PA really worse than other types of metadata?
-
What's the worst-case scenario?
For people interested in problem solving with forensic tools, I have a challenge in the 5th item.
Question 1: Are there any updates?
There have been a couple of updates to C2PA since my blog came out. Some appear to be unrelated to me, and some are directly related.
-
C2PA released version 1.4 of their specifications. Unfortunately, it doesn't address any of the problems with the trust model. (Since this happened a few days after my blog, it's an unrelated update.)
-
Some of the bugs submitted to various C2PA GitHub projects suddenly became actively addressed by the C2PA/CAI developers (including some submitted by me). While this is progress, none have been fixed yet. In most cases, the bugs seem to have been copied from the public GitHub locations to Adobe's private bug management system. (This means that C2PA development is really only happening inside Adobe.) Unfortunately, none of these bugs address C2PA's trust model.
-
Last week I had a productive discussion with people representing C2PA and CAI. I'm not going to disclose the discussion topics, except to say that there were some things we agreed on, and many things we did not agree on.
Excluding myself and my colleague, everyone on the call was an Adobe employee. In my previous blog, I mentioned that CAI acts like a façade to support C2PA. Between all of the key people being at Adobe and all development going through Adobe, I'm now getting the distinct impression that CAI and C2PA are both really just "all Adobe all the time." It does not appear to be the wider community that C2PA and CAI describe.
-
A few days after the conference call, Adobe made a change to CAI's Content Credentials web site. Now self-signed certificates (and some of the older non-test certificates) are flagged as untrustworthy: "This Content Credential was issued by an unknown source."
The only way around this requires paying money to a certificate authority for a "signing certificate". (At DigiCert, this costs $289 for one year.) Switching to a fee-based requirement dramatically reduces the desirability for smaller companies and individual developers to adopt C2PA. And yet, adding in a fee requirement does nothing to deter fraud. (Fraud is a multi-billion dollar per year industry. The criminals can afford to pay a little for "authentication".)
As an aside, "HTTPS" adoption had a similar pay-to-play limitation until Let's Encrypt began offering free certificates. Today, HTTPS is the norm because it's free. Unfortunately, those free certificates are for web domains only and not for cryptographic signatures.
Question 2: How bad is C2PA's trust model?
My previous evaluation identified different ways to create forgeries, but it only superficially covered the trust model.
Back in the old days of network security, we relied on "trust", but "trust" was rapidly shown to be vulnerable and insecure. "Trust" doesn't work very well as a security mechanism; it's not even "better than nothing" security. "Trust" is used for legal justification after the fact. We trust that someone won't ignore the "No Soliciting" sign by your door. The sign doesn't stop dishonest solicitors. Rather, it establishes intent for any legal consequences that happen later.
In networking, "trust" was replaced with "trust but verify", which is better than nothing. Today, network security is moving from "trust but verify" to "zero trust".
C2PA is based heavily on "trust" and not "trust but verify". (And definitely not "zero trust".) With C2PA:
-
We trust that the metadata accurately reflects the content. (Provenance, origin, handling, etc.) This explicitly means trusting in the honesty of the person inserting the metadata.
-
We trust that each new signer verified the previous claims.
-
We trust that a signer didn't alter the previous claims.
-
We trust that the cryptographic certificate (cert) was issued by an authoritative source.
-
We trust that the metadata and cert represent an authoritative source. While the cert allows us to validate how it was issued ("trust but verify"), it doesn't validate who it was issued to or how it is used.
-
We trust the validation tools to perform a proper validation.
-
We trust that any bad actors who violate any of these trusts will be noticed before causing any significant damage. (Who will notice it? We trust that there is someone who notices, somewhere to report it, and someone who can do something about it. And we trust that this will happen quickly, even though it always takes years to revoke a pernicious certificate authority.)
All of this trust is great for keeping honest people honest. However, it does nothing to deter malicious actors.
C2PA is literally based on the honor system. However, the honor system is the definition of having no security infrastructure. (We don't need security because we trust everyone to act honorably.)
Question 3: Isn't C2PA's tamper detection good enough?
Putting strong cryptography around unverified data does not make the data suddenly trustworthy. (It's lipstick on a pig!)
Other than the cryptographic signature, all of C2PA's metadata can be altered without detection. (Yes, C2PA also uses a hash, like sha256, as a secondary check, but it's trivial to recompute the hashes after making any alterations.) The cryptographic signature ensures that the data hasn't changed after signing. C2PA calls this tamper evident, while other C2PA members describe it as being tamper resistant or tamper proof. Regardless of the terminology, there are four possible scenarios for their tamper detection: unsigned, invalid, valid, and missing.
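The point about recomputing hashes can be illustrated with a minimal Python sketch. This is not C2PA's actual manifest format; the `make_record` and `verify_record` helpers and the record layout are hypothetical stand-ins for any scheme that stores an unkeyed digest next to the data it describes.

```python
import hashlib


def make_record(content: bytes, metadata: dict) -> dict:
    """Bundle metadata with an unkeyed SHA-256 digest of the content.

    Hypothetical layout for illustration only; not C2PA's real structure.
    """
    return {
        "metadata": dict(metadata),
        "sha256": hashlib.sha256(content).hexdigest(),
    }


def verify_record(content: bytes, record: dict) -> bool:
    """Return True when the stored digest matches the content."""
    return hashlib.sha256(content).hexdigest() == record["sha256"]


# An honest record verifies.
original = b"original pixels"
honest = make_record(original, {"author": "Alice"})
honest_ok = verify_record(original, honest)

# A careless edit leaves a stale digest behind, so the check fails.
altered = b"altered pixels"
careless_ok = verify_record(altered, honest)

# Anyone can rebuild the record after editing: no secret is needed,
# so the recomputed digest verifies just as well as the honest one.
forged = make_record(altered, {"author": "Mallory"})
forged_ok = verify_record(altered, forged)
```

Only the careless edit is caught. Once the digest is recomputed, the forged record is indistinguishable from an honest one by hash alone, which is why the hash adds nothing beyond what the signature provides.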
-
Case 1: Unsigned data
This case occurs when the C2PA metadata exists but there is no cryptographic signature. As I've mentioned, everything can be easily forged. Even though the C2PA metadata exists, you cannot explicitly trust it since you don't know who might have changed it.
The C2PA specification permits including files as a component of another file. This unsigned case can easily happen when incorporating media that lacks C2PA metadata. In my previous blog entry, I pointed out that CAI's example with the flowery hair contains a component without a signature.
What this means: If there is no cryptographic signature, then we cannot trust the data. We must use other, non-C2PA methods to perform any validation.
-
Case 2: Invalid signature or checksum
When the C2PA metadata has an invalid checksum or signature, it explicitly means something was changed. However, you don't know what was changed, when it happened, or who did it.
Keep in mind, this does not mean that there is intentional fraud or a malicious activity. Alterations could be innocent or unintentional. For example, importing a JPEG into the Windows Photo Gallery (discontinued but still widely used) automatically updates the metadata. This update causes a change, making the C2PA signature invalid. The same kind of unintentional alteration can also happen when attaching a picture to an outgoing email. (As opposed to importing a picture into the iPhone Photo Library, which simply strips out C2PA metadata.)
If the signature is invalid, then C2PA says we cannot trust the data. However, the important aspects of the data may not have changed and may still be trustworthy. We must use other, non-C2PA methods to determine the cause and perform any validation.
-
Case 3: Valid signature and checksums
There are two different ways that a file can contain valid signatures and checksums:
(A) While we don't know if we can trust the data, we know we can trust that it wasn't changed after being signed.
or
(B) The data was changed after being signed and any invalid signatures were replaced, making it valid again. In this case, the data and signatures are untrusted.
In both cases (A and B), the signatures and checksums are valid and we cannot trust the data. Moreover, we can't distinguish unaltered (A) from intentionally altered (B). Since we cannot use C2PA for trust or tamper detection, we must use other, non-C2PA methods to perform any validation.
-
Case 4: No C2PA metadata
This is currently the most common case. Does it mean that the picture never included C2PA metadata, or that the metadata was removed (stripped)?
CAI's Content Credentials service addresses this problem by performing a similar picture search (perceptual search). Most of the time, they find nothing. However, even if they do find a visually similar file that contains C2PA metadata, it doesn't mean the data in the search result is trustworthy or authentic. (See Cases 1-3.)
In every case:
-
The data is untrusted.
-
Intentional tampering cannot be detected.
-
The data must be validated through other, non-C2PA means.
What does C2PA provide? Nothing except computational busywork. C2PA's complexity and use of buzzwords like "cryptography" and "tamper evident" make it appear impressive, but it currently provides nothing of value.
Question 4: Is C2PA really worse than other types of metadata?
Given that C2PA does not validate, why do we need yet another standard for storing the exact same information?
The metadata information provided by C2PA is typically present in other metadata fields: EXIF, IPTC, XMP, etc. However, C2PA provides the same information in an overly complicated manner, requiring 5 megs of libraries and four different complex formats to process. In contrast, EXIF and IPTC are simple formats, easy to implement, require few resources, and come in very small libraries. Even XMP (Adobe's XML-based format that sucks due to a lack of consistency) is a better choice than C2PA's JUMBF/CBOR/JSON/XML.
A few people have written to me with comments like, "C2PA has the same long-term integrity issue as other metadata" or "Isn't C2PA as trustworthy as anything else?"
My reply is a simple "No." C2PA is much worse:
-
Regular metadata doesn't claim to be tamper-evident.
-
Regular metadata doesn't use cryptographic signatures.
-
Regular metadata doesn't have the backing of tech companies like Adobe, Microsoft, and Intel that give the false impression of legitimacy.
Other types of metadata (EXIF, IPTC, etc.) can be altered and should be validated through other means. C2PA gives the false impression that you don't need to validate the information because the cryptographic signatures appear valid.
Even the converse is problematic. Because the metadata can be altered without malicious intent (see the previous Case 2), an invalid C2PA signature does not mean the visual content, GPS, timestamps, or other metadata is invalid or altered. Everything else could be legitimate even with an invalid C2PA signature.
Unlike other types of metadata, C2PA's cryptographic signatures act as a red herring. Regardless of whether the signature is valid, invalid, or missing, forensic investigators should ignore C2PA's signatures since they tell nothing about the data's validity, authenticity, and provenance.
Question 5: What's the worst-case scenario?
This is the most common question I've received. I usually just explain the problem. But for a few people, I've manufactured examples of the worst-case scenario for the questioner to evaluate. In each of these instances, I used the questioner's own personal information to really drive home the point.
Since lots of people are reading this blog entry, I'm going to use Shantanu Narayen as the fictional example of the worst-case scenario. Narayen is the current chair and CEO of Adobe; his personal information in this example comes from his Wikipedia page. (I'm not doxing him.)
In my opinion, the worst-case scenario is when the data is used to frame someone for a serious crime, like child pornography. Here's the completely fictitious scenario:
Last month, Shantanu Narayen got into an online argument with a very vindictive person. The vindictive person acquired a bunch of child pornography and used C2PA to attribute the pictures to Narayen. The pictures were then distributed around the internet.
It doesn't take long for the pictures to be reported to the National Center for Missing and Exploited Children (NCMEC). NCMEC sees the attribution to Narayen and immediately passes the information to the FBI and California's Internet Crimes Against Children (ICAC) task force. A short while later, the police knock on Narayen's door.
I'm not going to include a real picture of child porn for this demonstration of a fictional situation. (That would be gross, illegal, and irrelevant to this example.) Instead, I used a teddy bear picture (because it's "bearly legal").
For this evaluation, ignore the visual portion and assume the picture shows illegal activity. Here's one of those forged pictures!
Now, you get to play the role of the investigator:
Prove Narayen didn't do it.
You'll probably want to:
-
Download the image. (Save it as 'bearlylegal.jpeg'.) To make sure it wasn't modified between my server and your evaluation, the file is 306,924 bytes and the SHA1 checksum is 04857686607b05b9fe3efef11fa6a11cd68e51df.
-
Evaluate it using whatever metadata viewers you have available.
-
If you don't have a starting place, then you can use my own online forensic services, like FotoForensics and Hintfo. But don't feel like you need to use my tools.
-
From the command line, try exiftool or exiv2.
-
Most graphical applications have built-in metadata viewers, including Mac's Preview, Windows file properties ("Details" tab), Gimp, and Photoshop. (Just be careful to not hit 'save' or make any changes to the file's metadata.)
-
Use Adobe's command-line c2patool to evaluate the C2PA metadata. With c2patool, you can view what sections are valid and the x.509 certificate contents:
c2patool -d bearlylegal.jpeg
c2patool --certs bearlylegal.jpeg | openssl x509 -text | less
-
View it at Adobe/CAI's Content Credentials online service. While this web service currently isn't as informative as the command-line c2patool, it does provide information about the file's C2PA metadata.
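The download check from the first step (file size and SHA1) can be sketched in a few lines of Python. The function name `matches_published` is my own choice for illustration; the expected size and checksum are the values listed above.

```python
import hashlib
from pathlib import Path

# Published values for the challenge image (from the download step above).
EXPECTED_SIZE = 306_924
EXPECTED_SHA1 = "04857686607b05b9fe3efef11fa6a11cd68e51df"


def matches_published(path, expected_size=EXPECTED_SIZE,
                      expected_sha1=EXPECTED_SHA1) -> bool:
    """Return True if the file at `path` has the expected size and SHA1."""
    data = Path(path).read_bytes()
    actual_sha1 = hashlib.sha1(data).hexdigest()
    return len(data) == expected_size and actual_sha1 == expected_sha1.lower()


# Example usage, assuming the image was saved in the current directory:
#   matches_published("bearlylegal.jpeg")
```

A match only confirms that your copy is the same file I posted. It says nothing about whether the file's contents or metadata are authentic.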
If you try solving this problem, either with software or as a conceptual exercise, I'd love to hear your results!
I hope people include information like:
-
What's your background? Inquisitive amateur, student, software engineer, law enforcement, legal, professional investigator, or something else?
-
What tools did you use?
-
What were some of your observations?
-
What other issues should we consider?
-
Do you think you could prove innocence or reasonable doubt?
-
Let's turn the problem around: If you were working for the prosecution, do you think you could get a conviction?
I look forward to hearing how people did!
[Spoiler Alert]
*
If you want to evaluate this problem on your own, come back to this section later.
If you go through these various tools, you'll see:
-
The metadata says it's from a Leica camera and is explicitly associated with Narayen. (For this forgery, the metadata looks authentic because I copied it directly from a real Leica photo before changing the dates and attribution.) The GPS coordinates identify Adobe's headquarters where Narayen works.
-
The cert signer's information looks like a Leica certificate (complete with Leica's correct German address) because I copied the cert's subject information from a real Leica cert.
-
The C2PA checksums and signatures are valid. Programs like c2patool do not report any problems. The only issue (introduced after an update a few days ago) comes from the Content Credentials web site. That site says the cert is from an unknown source. ("Unknown" is not the same as "untrusted".) If the vindictive person wanted to pay $289 for a trusted signing cert, then even that warning could be bypassed. Every C2PA validation tool says the signatures look correct; nothing suspicious. (They were forged using c2patool, so they really are valid.)
-
The picture's timestamps predate the disagreement between Narayen and the vindictive person. (Even if you can show it is forged, you don't know who forged it or when. The vindictive person is effectively anonymous.)
According to the C2PA metadata, this picture came from Narayen. The picture has valid authentication and provenance information.
For reasonable doubt, you'll need to show that C2PA's metadata can be easily forged and the metadata is unreliable. (If Narayen's attorneys are smart, they'll reference the Nov 28 CAI webinar on YouTube (from 16:37 to 17:32) where DataTrails explicitly demonstrates how C2PA is easy to forge with a few simple clicks. DataTrails' solution? They authenticate using non-C2PA technology. This shows that other experts also realize that C2PA doesn't work for validation, tamper detection, establishing provenance, or providing authentication.)
To prove his innocence, you'll need to prove it is a forgery. (In this case, I intentionally backdated the metadata to a time before Leica supported C2PA. However, if the vindictive person didn't make that kind of error, then there's no easy way to detect the forgery.)
Worst-Case Results
Given all of these findings, there are a few other things to consider:
-
Law enforcement, attorneys, and the courts are not very technical. If the file's metadata says it's his name and it has a valid signature, then it's valid until proven otherwise. Remember: digital signatures are legal signatures in a court of law; they are accepted as real until you can prove tampering. (And saying "I didn't do that" doesn't prove tampering.)
-
Narayen can say "that's not my picture!" While Leica may be able to verify that the cert isn't legitimate, Leica is in Germany and Narayen is in the United States. That makes serving a subpoena really difficult. In general, law enforcement (and everyone else who isn't Leica) cannot verify this. Also, Leica is a big company and can have multiple certs. Maybe they just don't remember creating this one or maybe it was part of a limited beta test.
-
C2PA's trust model assumes that a forged certificate will be eventually noticed. However, Narayen doesn't have months or years to wait for the forged certificate to be reused with some other victim. And that's assuming it is reused; it doesn't have to ever be reused.
-
You can try to explain that the cert only validates that the data existed and hasn't been changed since signing. But the courts see the words "authentication" and "provenance" and "certified" and "verified" and "validated". The evidence clearly says it is attributed to Narayen.
-
You might see flaws in the authenticated forgery. The prosecution will claim that the flaws are evidence that Narayen tried and failed to hide his identity. (As many people in the legal system have said, "We don't catch the smart ones.")
-
While the C2PA tools don't identify any problems, you might notice problems if you use other tools. In that case, why even bother with C2PA? But that's a rhetorical question and the answer carries no weight in court. The fact is, the metadata identifies Narayen and the cryptographic signature for authentication and provenance says the metadata can be trusted.
-
Worse: even if the signature appears fake, it doesn't mean Narayen didn't do it. Remember: there is other metadata besides C2PA's metadata that names Narayen. There are also multiple pictures naming him, and not just one photo.
-
This is a very technical topic. Really technical explanations and jargon will quickly confuse the courtroom. Remember: The judge still thinks a FAX machine is a secure data transmission system and most jury members don't know the difference between "Google" and "the Internet". Assuming you can identify how it was forged, communicating it to the judge and jury is an uphill battle. (Creating a forgery in a live demo would be great here. However, while live demos are common in TV crime shows, they almost never happen in real life. Also, live demos can go horribly wrong, as demonstrated by the OJ Simpson trial's "if the glove fits" fiasco. No sane attorney would permit you to perform a live demo.)
-
Even if you, as the expert, can explain how it was forged, your testimony will just appear to be you trying to discredit the certified authenticated provenance information. Remember: C2PA claims to be tamper-evident. But in this case, everything checks out so there is no evidence of tampering.
-
Most people can't afford, don't have access to, and/or don't know an expert witness who can determine if the C2PA metadata is legitimate. Because Narayen is the CEO of Adobe, using his own employees as experts carries little or no weight. (Any expert from Adobe is clearly biased since Narayen can fire them at any time.) Law enforcement and the courts will assume: If it says it's from Narayen and the tamper-evident C2PA says there is no evidence of tampering, then it's from Narayen.
-
Let's pretend that you can identify the forgery, demonstrate that C2PA does not work, and communicate it clearly to the court. It's now your word against every technical expert at Microsoft, Intel, Sony, Arm, DigiCert, and a dozen other big tech companies. Who has more credibility? Hundreds of highly paid security and cryptography professionals or you? And remember: these big companies have their reputations on the line. They have a financial incentive to not be proven wrong.
As an online service provider, I've interacted closely with NCMEC and ICACs for over a decade, and with attorneys and law enforcement for even longer. I can tell you that the prosecution won't spend much effort here. The cops will knock on his door. Narayen won't be able to convince the courts how it happened, and he'll either be found guilty or his attorney will convince him to take a plea deal.
Narayen's only option in this fictional scenario is to demonstrate that C2PA does not verify the metadata, the metadata can be altered, anyone can sign anything, and C2PA doesn't provide validated provenance and authenticity -- even though it's in the name: C2PA. I'm not a lawyer, but I think this option also could show that every company currently selling products that feature C2PA is actively engaged in fraud since they know what they are selling doesn't work.
Either C2PA doesn't work and Narayen walks free, or C2PA works and this forgery sends him to jail. That's the worst case scenario, and it's very realistic.
Truth or Consequences
If this fictional child porn example seems too extreme, then the same application of fake C2PA metadata works with propaganda from wars (Ukraine, Gaza, etc.), insurance fraud, medical fraud, fake passport photos, defective merchandise claims at Amazon and Etsy, altered photojournalism, photo contests, political influences, etc. At FotoForensics, I'm already seeing known fraud groups developing test pictures with C2PA metadata. (If C2PA were more widely adopted, I'm certain that some of these groups would deploy their forgeries right now.)
To reiterate:
-
Without C2PA: Analysis tools can often identify forgeries, including altered metadata.
-
With C2PA: Identifying forgeries becomes much harder. You have to convince the audience that valid, verifiable, tamper-evident 'authentication and provenance' that uses a cryptographic signature, and was created with the backing of big tech companies like Adobe, Microsoft, Intel, etc., is wrong.
Rather than eliminating or identifying fraud, C2PA enables a new type of fraud: forgeries that are authenticated by trust and associated with some of the biggest names on the tech landscape.
A solution for assigning authentication and provenance would go a long way toward mitigating fraud and misrepresentation. Unfortunately, the current C2PA specification is not a viable solution: it fails to authenticate, is trivial to forge, and cannot detect simple intentional tampering methods. It is my sincerest hope that the C2PA and CAI leadership tries to re-engage with the wider tech community and opens discussions for possible options, rather than making sudden unilateral decisions and deploying ad hoc patches (like requiring pay-to-play) that neither address the basic problems nor encourage widespread adoption.
Tags: #Forensics, #Network, #Security, #FotoForensics, #Programming