For more details on the theory behind the program, and on how the confidence intervals for the estimates are calculated, see the technical report "Calculation of Probabilistic Anonymity from Sampled Data". For more details about information theory in general, we recommend the book "Elements of Information Theory" by Cover and Thomas.
| Downloads |
| What the Tool Does |
Capacity is calculated using the iterative Blahut-Arimoto algorithm. The acceptable error and maximum number of iterations can be set using flags (run the program with no arguments to get a description of all the flags).
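As an illustration of the iteration, here is a minimal Python sketch of the textbook Blahut-Arimoto algorithm. It is not the tool's own code: the function name and the `tol` and `max_iter` parameters are hypothetical stand-ins for the acceptable error and maximum number of iterations, and do not correspond to the program's actual flag names. The matrix `W` is assumed to hold one row per input, with `W[x][y]` the probability of output `y` given input `x`.

```python
import math

def blahut_arimoto(W, tol=1e-9, max_iter=10000):
    """Estimate channel capacity in bits.

    W[x][y] is P(output y | input x); every row must sum to 1.
    Returns (capacity_lower_bound_in_bits, input_distribution).
    """
    n, m = len(W), len(W[0])
    p = [1.0 / n] * n          # start from the uniform input distribution
    lower = 0.0
    for _ in range(max_iter):
        # Output distribution induced by the current input distribution.
        q = [sum(p[x] * W[x][y] for x in range(n)) for y in range(m)]
        # c[x] = exp(KL divergence between row x and q), computed in nats.
        c = []
        for x in range(n):
            d = sum(W[x][y] * math.log(W[x][y] / q[y])
                    for y in range(m) if W[x][y] > 0)
            c.append(math.exp(d))
        z = sum(p[x] * c[x] for x in range(n))
        lower = math.log(z)        # lower bound on capacity (nats)
        upper = math.log(max(c))   # upper bound on capacity (nats)
        if upper - lower < tol:    # acceptable error reached
            break
        # Re-weight inputs towards those whose rows differ most from q.
        p = [p[x] * c[x] / z for x in range(n)]
    return lower / math.log(2), p

# Binary symmetric channel with a 10% error rate: about 0.531 bits.
capacity, best_input = blahut_arimoto([[0.9, 0.1], [0.1, 0.9]])
```

The gap between the two bounds plays the role of the acceptable error: the loop stops as soon as the true capacity is pinned down to within `tol`, or when the iteration limit is hit.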
The program can also estimate capacity from trial runs of a system. The sample file should contain lines of the form "(i,o)", where "i" is the initial setting (input) of the system and "o" is an observed output. The program also calculates a confidence interval for the result. To get an accurate result from sampled data you will need an order of magnitude more samples than the number of inputs multiplied by the number of outputs; see this pdf for more details.
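To make the sampled-data case concrete, the following Python sketch turns "(i,o)" lines into an estimated transition matrix by counting. It is an illustration only and assumes nothing about the tool's parser beyond the "(i,o)" line form given above; the function names are hypothetical, and reading "an order of magnitude more" as a factor of ten is our interpretation rather than a threshold taken from the program.

```python
import re
from collections import Counter

# Matches one "(i,o)" line; inputs and outputs are kept as plain strings.
SAMPLE = re.compile(r"^\s*\(\s*([^,()]+?)\s*,\s*([^,()]+?)\s*\)\s*$")

def matrix_from_samples(lines):
    """Return (inputs, outputs, W) where W[x][y] estimates P(output y | input x)."""
    counts = Counter()
    for line in lines:
        match = SAMPLE.match(line)
        if match:                      # silently skip lines in another form
            counts[(match.group(1), match.group(2))] += 1
    inputs = sorted({i for i, _ in counts})
    outputs = sorted({o for _, o in counts})
    per_input = Counter()
    for (i, _), c in counts.items():
        per_input[i] += c
    # Relative frequency of each output among the runs with a given input.
    W = [[counts[(i, o)] / per_input[i] for o in outputs] for i in inputs]
    return inputs, outputs, W

def enough_samples(n_samples, n_inputs, n_outputs, factor=10):
    """Rule of thumb from the text: samples >= factor * inputs * outputs."""
    return n_samples >= factor * n_inputs * n_outputs

inputs, outputs, W = matrix_from_samples(["(0,a)", "(0,a)", "(0,b)", "(1,b)"])
```

With only four samples for a 2 by 2 system, `enough_samples(4, 2, 2)` is false, so an estimate built from this toy data would not be trustworthy.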
As well as calculating the capacity, the program can calculate mutual information for a uniform input. This is done by setting the "-mi" flag at the command line.
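For comparison with capacity, mutual information under a uniform input needs no iteration: every input is given probability 1/n and the standard formula is evaluated once. The Python sketch below shows that calculation; the function name is hypothetical and the code is not taken from the tool.

```python
import math

def mutual_information_uniform(W):
    """Mutual information in bits for a uniform input distribution.

    W[x][y] is P(output y | input x); every row must sum to 1.
    """
    n, m = len(W), len(W[0])
    # Output distribution when every input has probability 1/n.
    q = [sum(W[x][y] for x in range(n)) / n for y in range(m)]
    mi = 0.0
    for x in range(n):
        for y in range(m):
            if W[x][y] > 0:
                mi += (1.0 / n) * W[x][y] * math.log2(W[x][y] / q[y])
    return mi

# A channel whose rows are identical leaks nothing about the input.
no_leak = mutual_information_uniform([[0.3, 0.7], [0.3, 0.7]])
```

Capacity is the maximum of this quantity over all input distributions, so the uniform-input value can never exceed the capacity of the same matrix.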
| Running the Program |
The program is packaged as a jar file, so it can be run from the command line of any computer with a Java runtime environment, e.g. type:
java -jar ae.jar
to get the help message. To calculate the capacity of a channel from a probability transition matrix type:
java -jar ae.jar matrixFile.txt
and to calculate the mutual information for a uniform input distribution, instead of capacity, add the flag "-mi".
To estimate the capacity or mutual information of a channel from sampled runs of a system, run the program with a file containing sampled data instead of the matrix. Example sample files can be found here and the file format for both matrices and observation files is described here. For sampled data, the program will tell you whether the data is consistent with zero information leakage; if it is not, the program will estimate a confidence interval for the true value.
| Program Status |
The current version of the program is a proof of concept and comes with no guarantees. Some unsupported and undocumented features include calculating conditional mutual information (side information), mutual information for any input distribution, the capacity of multi-sender channels and the distribution of a number of tests. If any of these would be useful to you, email me at the address below and I will get them into a user-friendly form.
| People |
The people involved in creating the theory and implementation of this tool are:
Please send comments, questions etc. to T.Chothia (a) cs bham ac uk