author jvoisin <julien.voisin@dustri.org> 2018-04-01 15:36:45 +0200
committer jvoisin <julien.voisin@dustri.org> 2018-04-01 15:36:45 +0200
commit 7992cd0d51c3b858f36e74abd76ceef986b51df8 (patch)
tree aecc6a3501199cf2418f8d820432fe4c50d17b9f
parent 9e7a4bd217c314a0a86bf9e794f0fda4392a19d9 (diff)
Add some documentation
-rw-r--r-- doc/implementation_notes.md 33
-rw-r--r-- doc/threat_model.md 85
2 files changed, 118 insertions, 0 deletions
diff --git a/doc/implementation_notes.md b/doc/implementation_notes.md
new file mode 100644
index 0000000..bc83671
--- /dev/null
+++ b/doc/implementation_notes.md
@@ -0,0 +1,33 @@
+Implementation notes
+====================
+
+Symlink attacks
+---------------
+
+MAT2 outputs predictable filenames (like yourfile.jpg.cleaned).
+This may lead to symlink attacks. Please check whether your OS protects
+against them.
+
+Archives handling
+-----------------
+
+MAT2 doesn't support archives yet, because we haven't found a usable way to ask the user
+what to do when unsupported files are encountered.
+
+PDF handling
+------------
+
+MAT rendered PDF files on a cairo surface, then printed the result to a
+file. This kept the text selectable, but unfortunately, it didn't remove
+any *deep metadata*, such as the metadata of embedded pictures. This was
+one of the reasons MAT was abandoned: the absence of a satisfying solution for
+handling PDF. But apparently, people are OK with [pdf redact
+tools](https://github.com/firstlookmedia/pdf-redact-tools), which simply
+transform the PDF into images. So this is what MAT2 does too.
+
+Images handling
+---------------
+
+When possible, images are handled like PDFs: rendered on a surface, then saved
+to the filesystem. This ensures that all metadata is removed.
+
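The symlink concern raised in doc/implementation_notes.md can be illustrated with a minimal Python sketch. It is not MAT2's code: the helper names `open_output_exclusively` and `write_cleaned` are hypothetical, and the sketch assumes a POSIX-like system. The idea is to create the predictable `.cleaned` file with `O_CREAT | O_EXCL`, so that the call fails if anything (including a planted symlink) already exists at that path.

```python
import os


def open_output_exclusively(path: str, mode: int = 0o600):
    """Create `path` for writing, failing if anything already exists there.

    Hypothetical helper, not part of MAT2.
    """
    # O_EXCL together with O_CREAT makes the open fail with FileExistsError
    # when the path exists, even if it is a symlink pointing elsewhere.
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    # O_NOFOLLOW is not defined on every platform, so add it only if present.
    flags |= getattr(os, "O_NOFOLLOW", 0)
    fd = os.open(path, flags, mode)
    return os.fdopen(fd, "wb")


def write_cleaned(source_path: str, data: bytes) -> str:
    """Write `data` next to `source_path`, using the `.cleaned` suffix."""
    cleaned_path = source_path + ".cleaned"
    with open_output_exclusively(cleaned_path) as out:
        out.write(data)
    return cleaned_path
```

With this approach a pre-existing `yourfile.jpg.cleaned` is reported as an error instead of being silently overwritten or followed; whether to abort or pick another name is left to the caller.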
diff --git a/doc/threat_model.md b/doc/threat_model.md
new file mode 100644
index 0000000..6d14ca6
--- /dev/null
+++ b/doc/threat_model.md
@@ -0,0 +1,85 @@
+Threat Model
+============
+The Metadata Anonymisation Toolkit 2 adversary has a number
+of goals, capabilities, and counter-attack types that can be
+used to guide us towards a set of requirements for MAT2.
+
+This is an overhaul of the threat model of MAT (the first iteration of the software).
+
+Warnings
+--------
+
+MAT2 only removes standard metadata from your files; it does _not_:
+
+ - anonymise their content
+ - handle watermarking
+ - handle steganography
+ - handle any non-standard metadata field/system
+
+If you really want to be anonymous, use a format that does not contain any
+metadata, or better: use plain text. And as usual, think before clicking.
+
+
+Adversary
+------------
+
+* Goals:
+
+ - Identify the source of the document, since a document
+ always has one. Who/where/when/how was a picture
+ taken, where was the document leaked from and by
+ whom, ...
+
+ - Identify the author; in some cases documents may be
+ anonymously authored or created. In these cases,
+ identifying the author is the goal.
+
+ - Identify the equipment/software used. If the attacker fails
+ to directly identify the author and/or source, their next
+ goal is to determine the source of the equipment used
+ to produce, copy, and transmit the document. This can
+ include the model of camera used to take a photo, or
+ which software was used to produce an office document.
+
+
+* Adversary Capabilities - Positioning
+ - The adversary created the document specifically for this
+ user. This is the strongest position for the adversary to
+ have. In this case, the adversary is capable of inserting
+ arbitrary, custom watermarks specifically for tracking
+ the user. In general, MAT cannot defend against this
+ adversary, but we list it for completeness.
+
+ - The adversary created the document for a group of users.
+ In this case, the adversary knows that they attempted to
+ limit distribution to a specific group of users. They may
+ or may not have watermarked the document for these
+ users, but they certainly know the format used.
+
+ - The adversary did not create the document. This is the weakest
+ position for the adversary to have. The file format is (most of the time)
+ standard, and nothing custom is added: MAT
+ should be able to remove all meta-information from the
+ file.
+
+Requirements
+---------------
+
+* Processing
+ - The MAT2 *should* avoid interacting with the content of
+ files. Its goal is to remove metadata, and the user is solely
+ responsible for the information contained in the file.
+
+ - The MAT2 *must* warn when encountering an unknown
+ format. For example, in a zipfile, if MAT encounters an
+ unknown format, it should warn the user, and ask if the
+ file should be added to the anonymised archive that is
+ produced.
+
+ - The MAT2 *must* not add metadata, since its purpose is to
+ anonymise files: every added item of metadata decreases
+ anonymity.
+
+ - The MAT2 *should* handle unknown/hidden metadata fields,
+ like proprietary extensions of open formats.
+
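The "warn on unknown format" requirement in doc/threat_model.md could look roughly like the following Python sketch. This is an illustration under stated assumptions, not MAT2's implementation: `SUPPORTED_EXTENSIONS`, `select_members`, and the `ask_user` callback are hypothetical names, and detecting formats by file extension is a simplification chosen for brevity.

```python
import os
import warnings
import zipfile
from typing import Callable, List

# Hypothetical allow-list; a real tool would detect formats more robustly.
SUPPORTED_EXTENSIONS = {".jpg", ".pdf", ".png"}


def select_members(archive_path: str, ask_user: Callable[[str], bool]) -> List[str]:
    """Return the archive members to keep in the anonymised archive.

    Members with a supported extension are always kept. For every other
    member a warning is emitted and `ask_user(name)` decides its fate.
    """
    kept = []
    with zipfile.ZipFile(archive_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directories carry no file content to clean
            extension = os.path.splitext(info.filename)[1].lower()
            if extension in SUPPORTED_EXTENSIONS:
                kept.append(info.filename)
                continue
            warnings.warn(f"unknown format: {info.filename}")
            if ask_user(info.filename):
                kept.append(info.filename)
    return kept
```

Passing the decision in as a callback keeps the sketch independent of any particular interface: a command-line tool could prompt on the terminal, while a batch run could pass `lambda name: False` to drop every unknown file.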