OVERVIEW
========

Motivation
~~~~~~~~~~

Most current security devices, virtual or hardware, are eventually defeated by
attacks they were never designed to protect against, leaving the user helpless
in the event of an actual account hack.

This new security device doesn't necessarily aim at protecting the account
against unwanted access at all costs, but rather provides the user with
physical feedback about what's going on, and with ways to interrupt
unauthorized access to their account.

We have to assume that everything is compromised to the worst extent, save for
the secret key. Generally speaking, the code of the device might be exposed,
the user's computer might be fully compromised with keyloggers and trojans,
and the connection of the user might be hijacked.


General idea
~~~~~~~~~~~~

The device is based on HMAC:SHA1, signing payloads with a symmetric secret
key shared between the device and the servers. The key should be 20 bytes
long, and the payload the device has to sign should also be 20 bytes long.
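
As a minimal sketch of that primitive, the following Python uses the standard
hmac and hashlib modules. The function name sign and the two length constants
are illustrative assumptions, not names defined by this document.

```python
import hashlib
import hmac
import secrets

KEY_LEN = 20      # key length required by the text
PAYLOAD_LEN = 20  # payload length required by the text


def sign(key: bytes, payload: bytes) -> bytes:
    """Return the 20-byte HMAC-SHA1 of payload under key."""
    if len(key) != KEY_LEN or len(payload) != PAYLOAD_LEN:
        raise ValueError("key and payload must both be 20 bytes long")
    return hmac.new(key, payload, hashlib.sha1).digest()


key = secrets.token_bytes(KEY_LEN)          # shared secret (random demo value)
payload = secrets.token_bytes(PAYLOAD_LEN)  # challenge sent by the server
signature = sign(key, payload)              # what the device sends back

# The server recomputes the signature and compares in constant time.
server_accepts = hmac.compare_digest(signature, sign(key, payload))
```

Since the HMAC:SHA1 output is itself 20 bytes long, keys, payloads and
signatures all share the same size.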

The device has to be able to answer the server's messages, directly or
indirectly. If the device is a USB key or similar, proxy software needs to
forward requests back and forth between the device and the server. This proxy
might be compromised or malicious.


Two variants of devices
~~~~~~~~~~~~~~~~~~~~~~~

There might be several kinds of devices: physical, in the form of a USB token;
virtual, in the form of software running on a computer; or semi-virtual, in
the form of a mobile phone application.

We will distinguish two forms of devices: the passive ones, which need to be
connected to the servers through potentially compromised proxy software, and
the active ones, which have a direct "secure" connection to the server.

The "secure" part of the connection actually means it shouldn't be susceptible
to man-in-the-middle attacks. Passive sniffing is acceptable.


The devices and the servers
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The devices need to register with the server, either actively or via their
proxy software. The devices don't have to unregister; the server may keep a
list of currently active devices and automatically unregister them after a
certain timeout, or whenever a request fails. What really matters is the
routing table, which tells the server where to forward a request for a given
key.

The registration of a device needs to be agnostic of the keys it may hold,
so that multiple devices may hold the same key, or one device may hold several
keys, or both.


MODUS OPERANDI
==============

Device operations
~~~~~~~~~~~~~~~~~

Each device has to hold at least one secret key and one serial number which is
associated with that key. But a device may hold more than one serial/key pair.

Each key has to be exactly 20 bytes of secure random data. Each serial number
should be 20 bytes of data, whose usage and format are at the discretion of
the system operator. It is recommended, though, that a 4-byte prefix be used
to identify the vendor ID of the system operator that wants to use that key.
This way, the same device may be used to protect several different accounts
provided by different independent vendors, though a single device can only
be used for one purpose at a time.
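
The recommended serial layout can be sketched as follows. Only the 20-byte
total and the 4-byte vendor prefix come from the text; filling the remaining
16 bytes with random data, and the names make_serial and vendor_of, are
assumptions made for illustration.

```python
import secrets

SERIAL_LEN = 20        # serial length stated in the text
VENDOR_PREFIX_LEN = 4  # recommended vendor ID prefix length


def make_serial(vendor_id: bytes) -> bytes:
    """Build a serial: vendor prefix followed by operator-defined bytes."""
    if len(vendor_id) != VENDOR_PREFIX_LEN:
        raise ValueError("vendor ID must be exactly 4 bytes")
    # Assumption: the operator fills the rest with random data.
    return vendor_id + secrets.token_bytes(SERIAL_LEN - VENDOR_PREFIX_LEN)


def vendor_of(serial: bytes) -> bytes:
    """Extract the vendor ID prefix from a serial number."""
    if len(serial) != SERIAL_LEN:
        raise ValueError("serial must be exactly 20 bytes")
    return serial[:VENDOR_PREFIX_LEN]


serial = make_serial(b"ACME")  # hypothetical vendor ID
```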

Each device has to support these queries from the server:
-) List all the serial numbers held in the device, alongside a signature of a
   given payload that makes it possible to prove the keys are genuine.
-) Initialize the device to begin a session with a key, given its serial
   number and a special init payload.
-) Sign a payload of 20 bytes sent by the server.
-) Ask the user a yes/no question, by sending two random payloads; the
   answer depends on which payload is signed and sent back.
   Devices might just always answer "yes" if they don't want to support user
   interaction, or if the question is too complex for them. The "no" answer
   should always be the most logical and secure answer.
-) Enumerate the list of optional capabilities, not including the fact that
   the device is "active". That last part is solely determined by the server.

Each device may support these optional queries, based on its capabilities or
on the security options the user selected:
-) Display an arbitrary message on the device
-) Write a new key/serial pair, using an insecure method
-) Dump a key using an insecure method, given its serial number
-) Erase a serial/key pair
-) Enable or disable a key
-) Transfer a key using a secure external communication apparatus
-) Receive a key using a secure external communication apparatus
-) Initiate a secure (Diffie-Hellman) connection to the device
-) Import a key/serial pair using a secure method (DH)
-) Export a key using a secure method (DH), given its serial number

Devices should still accept messages they are not capable of processing, and
return a proper error code.
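
One possible shape for this behaviour is sketched below. The message codes,
the error codes and the name dispatch are all hypothetical: this document
does not define a wire format, so the numbers only serve the example.

```python
from enum import IntEnum
from typing import Callable, Dict, Tuple


class Query(IntEnum):
    # Hypothetical numbering; the text does not assign message codes.
    LIST_KEYS = 1
    INIT = 2
    SIGN = 3
    ASK = 4
    CAPABILITIES = 5
    DISPLAY = 6


class Status(IntEnum):
    # Hypothetical error codes.
    OK = 0
    UNSUPPORTED = 1
    MALFORMED = 2


Handler = Callable[[bytes], bytes]


def dispatch(handlers: Dict[Query, Handler], query_id: int,
             body: bytes) -> Tuple[Status, bytes]:
    """Route one message; unknown or unsupported queries get an error code."""
    try:
        query = Query(query_id)
    except ValueError:
        return Status.MALFORMED, b""    # not a message type we know of
    handler = handlers.get(query)
    if handler is None:
        return Status.UNSUPPORTED, b""  # known query, missing capability
    return Status.OK, handler(body)


# A device without a screen: it answers CAPABILITIES but not DISPLAY.
handlers = {Query.CAPABILITIES: lambda body: b""}
status, reply = dispatch(handlers, Query.DISPLAY, b"hello")
```

Returning a status instead of dropping the message lets the server notice
that a cached capability list has gone stale.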


Server operations
~~~~~~~~~~~~~~~~~

The server might be the official server from the system operator, a
third-party software that operates the device to manipulate its keys, or a
malicious server. This means the device's code should be robust against
fully malformed messages.

In all cases, only the official server shall retain the private portion of the
key. A good chunk of the security of this system is based on the privacy of
that key.

Upon registration of a device, the server will request the list of the keys the
device holds and, after verifying with the special payload that the device
holds the proper keys, will maintain a routing table from key serial numbers
to the appropriate device. The server may also cache the list of the
capabilities of the devices locally, but that list may change dynamically,
so the server should always process the error codes returned by the devices.
If two devices want to register the same key, only the last genuine one wins.
The server should ignore any key serial it doesn't know about.

Also, distinguishing passive from active devices is the responsibility of the
server. Although this is somewhat important, as active devices have more power
than passive devices, the protocol ensures one device cannot be proxied as
another. So the distinction between active and passive might just be based on
the incoming connection type.

When the server wants to start securing a connection, it will first locate the
key/serial pair associated with this account. Then, it will search within the
list of all currently registered devices which one is currently offering that
key. Finally, it will send an init message to the device for the serial number
associated with the key for that account. That init message will also contain
an init payload, which is 20 secure random bytes XORed with the secret key. If
the device has the appropriate capabilities, the server will also send a
message describing the login attempt being made on the account, and a question
asking whether the user wants to proceed with that connection.

Then, for passive devices, the server should periodically validate the account
connection by sending payloads to be signed. Any failure to properly sign a
payload, or to return that signature in time, should result in the termination
of the user's current session on the account.

Active devices shouldn't need to be polled once initialized, given the nature
of their design, as the polling is meant to protect against man-in-the-middle
attacks on the devices themselves. But active devices don't protect against
man-in-the-middle attacks on the application they are supposed to protect, so
they should still respond to proof packets for a while, until the user is
fully connected. To do this, the server should send yes/no questions every few
seconds, waiting for a "yes" answer.


Math operations and protocol
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Every value should be seen as a 20-byte array. All the math below implies
that. Only the first 4 messages are described. Each of these messages should
also carry a random unique identifier that the device must echo in its answer,
and the server should ignore any message that doesn't carry the proper unique
identifier.

K is the secret key shared between the server and the device.
keys is the array of keys indexed by their serial number N.
H is the HMAC:SHA1 function that takes a key and a value as arguments.
^ is the bytewise XOR operator, and rand() returns 20 secure random bytes.
Note: S and K persist across messages on both the device and the server.
In the considerations section, S is called the scratchpad.


Key listing message:
-------------------

Server computes P = rand()
Server sends P to device
foreach N,K in keys,
  if isEnabled(N) then
    Device computes S = H(K, P)
    Device sends {N,S}
foreach N,S in packet
  Server computes S2 = H(keys[N], P)
  if S == S2 then
    Server registers N to device connection
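
The listing exchange above can be sketched in Python as follows. This is an
illustration under assumptions: the function names, the dictionary-based key
stores, the "conn-1" connection label and the fixed demo values are not part
of the protocol description.

```python
import hashlib
import hmac
from typing import Dict, List, Set, Tuple


def H(key: bytes, value: bytes) -> bytes:
    """HMAC:SHA1 of value under key."""
    return hmac.new(key, value, hashlib.sha1).digest()


def device_list_keys(keys: Dict[bytes, bytes], enabled: Set[bytes],
                     p: bytes) -> List[Tuple[bytes, bytes]]:
    """Device side: one {N, S} entry per enabled key."""
    return [(n, H(k, p)) for n, k in keys.items() if n in enabled]


def server_register(keys: Dict[bytes, bytes], routing: Dict[bytes, str],
                    connection: str, p: bytes,
                    packet: List[Tuple[bytes, bytes]]) -> None:
    """Server side: route every serial whose proof verifies."""
    for n, s in packet:
        k = keys.get(n)
        if k is None:
            continue  # serial unknown to the server: ignored
        if hmac.compare_digest(s, H(k, p)):
            routing[n] = connection  # the last genuine device wins


# Fixed demo values; real keys and P must be secure random bytes.
serial_a, serial_b = b"A" * 20, b"B" * 20
server_keys = {serial_a: bytes(range(20)), serial_b: bytes(range(20, 40))}
device_keys = {serial_a: bytes(range(20)),  # genuine key
               serial_b: bytes(20)}         # wrong key for serial_b
routing: Dict[bytes, str] = {}
p = b"\x5a" * 20
packet = device_list_keys(device_keys, {serial_a, serial_b}, p)
server_register(server_keys, routing, "conn-1", p, packet)
```

After this run, only serial_a is routed to "conn-1": the proof for serial_b
was computed with the wrong key and does not verify.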


Init message:
------------

Server selects the {N,K} pair to authorize the account's session
Server computes S = rand()
Server computes P = S ^ K
Server sends {N,P}
if isEnabled(N) then
  Device selects K = keys[N]
  Device computes S = P ^ K
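
A sketch of the init exchange above, with hypothetical helper names. Returning
None when the serial is missing or disabled is an assumption; the text only
says that an error code should be returned in such cases.

```python
import secrets
from typing import Dict, Optional, Set, Tuple


def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def server_init(k: bytes) -> Tuple[bytes, bytes]:
    """Return (S, P): the fresh scratchpad and the init payload to send."""
    s = secrets.token_bytes(20)
    return s, xor(s, k)


def device_init(keys: Dict[bytes, bytes], enabled: Set[bytes],
                n: bytes, p: bytes) -> Optional[bytes]:
    """Return the device's scratchpad, or None if N is missing or disabled."""
    if n not in keys or n not in enabled:
        return None
    return xor(p, keys[n])


n = b"N" * 20                # demo serial number
k = secrets.token_bytes(20)  # shared secret key
s_server, p = server_init(k)
s_device = device_init({n: k}, {n}, n, p)

# A device holding a different key derives a different scratchpad.
wrong_k = xor(k, b"\x01" * 20)
s_wrong = device_init({n: wrong_k}, {n}, n, p)
```

Both sides end up with the same S without S ever being sent in the clear,
and a device holding the wrong key silently diverges from the server.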


Polling message:
---------------

Server computes P = rand()
Server computes S ^= P
Server sends P
Device computes S ^= P
Device computes R = H(K, S)
Device sends R
Server computes R2 = H(K, S)
if R != R2 then
  Server interrupts user's session
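
One polling round can be sketched as below. The Endpoint class and the poll
function are illustrative names, the key and scratchpad are fixed demo values,
and the timeout that the server should also enforce is not modelled.

```python
import hashlib
import hmac
import secrets


def H(key: bytes, value: bytes) -> bytes:
    """HMAC:SHA1 of value under key."""
    return hmac.new(key, value, hashlib.sha1).digest()


def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


class Endpoint:
    """One side of a session, holding the key K and the scratchpad S."""

    def __init__(self, key: bytes, scratchpad: bytes) -> None:
        self.key = key
        self.scratchpad = scratchpad

    def absorb_and_sign(self, p: bytes) -> bytes:
        self.scratchpad = xor(self.scratchpad, p)  # S ^= P
        return H(self.key, self.scratchpad)        # R = H(K, S)


def poll(server: Endpoint, device: Endpoint) -> bool:
    """Run one round; False means the session must be interrupted."""
    p = secrets.token_bytes(20)
    r = device.absorb_and_sign(p)   # in practice this goes through the proxy
    r2 = server.absorb_and_sign(p)
    return hmac.compare_digest(r, r2)


k = bytes(range(20))      # demo key
s = bytes(range(20, 40))  # demo scratchpad, as left by the init message
server, device = Endpoint(k, s), Endpoint(k, s)
rounds = [poll(server, device) for _ in range(3)]

# An impostor that knows S but not K fails the very first round.
impostor_ok = poll(Endpoint(k, s), Endpoint(bytes(20), s))
```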


Yes/no question:
---------------

Server computes P1 = rand()
Server computes P2 = rand()
Server sends P1,P2
Device selects P = answer is yes ? P1 : P2
Device computes S ^= P
Device computes R = H(K, S)
Device sends R
Server computes S1 = S ^ P1
Server computes S2 = S ^ P2
Server computes R1 = H(K, S1)
Server computes R2 = H(K, S2)
if R == R1 then
  Server assigns S = S1
  Server takes "Yes" action
elseif R == R2 then
  Server assigns S = S2
  Server takes "No" action
else
  Server interrupts user's session
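
The question exchange can be sketched as follows. The function names and the
use of None to signal "interrupt the session" are assumptions made for the
example, and the key and scratchpad are fixed demo values.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple


def H(key: bytes, value: bytes) -> bytes:
    """HMAC:SHA1 of value under key."""
    return hmac.new(key, value, hashlib.sha1).digest()


def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def device_answer(k: bytes, s: bytes, p1: bytes, p2: bytes,
                  answer_is_yes: bool) -> Tuple[bytes, bytes]:
    """Device side: return (new S, R) for the chosen answer."""
    p = p1 if answer_is_yes else p2
    s = xor(s, p)
    return s, H(k, s)


def server_decode(k: bytes, s: bytes, p1: bytes, p2: bytes,
                  r: bytes) -> Tuple[bytes, Optional[bool]]:
    """Server side: return (new S, answer); answer None means interrupt."""
    s1, s2 = xor(s, p1), xor(s, p2)
    if hmac.compare_digest(r, H(k, s1)):
        return s1, True
    if hmac.compare_digest(r, H(k, s2)):
        return s2, False
    return s, None


k = bytes(range(20))      # demo key
s = bytes(range(20, 40))  # demo scratchpad shared by both sides
p1, p2 = secrets.token_bytes(20), secrets.token_bytes(20)

s_dev, r = device_answer(k, s, p1, p2, answer_is_yes=False)
s_srv, answer = server_decode(k, s, p1, p2, r)
```

An observer who sees P1, P2 and R cannot tell which answer was given without
knowing K, since both candidate signatures require the key.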


CONSIDERATIONS
==============

Security
~~~~~~~~

Although the privacy of the key is important, the security of the system mainly
relies on the user being aware of the device's features and usage. It is
critical that the user is able to interact with their security device, and has
been made aware of the importance of its security.

Also, physically speaking, the device should give visual feedback when being
polled for proofs, so the user is informed when someone is hijacking their key
one way or another, and can prevent it by disconnecting, shutting down, or
disabling the device. A man-in-the-middle attack using a proxy trojan should
still be the primary way of attacking this device. Hence the importance of
describing to users the mechanism behind that attack, and how to prevent it
by using the device properly.

HMAC:SHA1 with 20-byte keys and values should provide an appropriate amount
of security against key extraction.

Furthermore, the internal scratchpad usage should prevent replay and oracle
attacks.
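
The replay argument can be illustrated with a short sketch. The key,
scratchpad and challenges below are fixed demo values chosen for the example;
real ones are secure random bytes.

```python
import hashlib
import hmac


def H(key: bytes, value: bytes) -> bytes:
    """HMAC:SHA1 of value under key."""
    return hmac.new(key, value, hashlib.sha1).digest()


def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


k = bytes(range(20))      # demo key
s = bytes(range(20, 40))  # demo scratchpad after the init message

# Round 1: an attacker records the challenge and the device's response.
p_old = b"\x01" * 20
s = xor(s, p_old)
r_old = H(k, s)

# Round 2: the server sends a new challenge; the attacker replays r_old.
p_new = b"\x02" * 20
s = xor(s, p_new)
replay_accepted = hmac.compare_digest(r_old, H(k, s))

# Even if the server repeated the old challenge, the scratchpad has moved
# on, so the genuine response would differ from the recorded one.
same_response = hmac.compare_digest(r_old, H(k, xor(s, p_old)))
```

Both replay_accepted and same_response come out False here: a response is
bound to the whole history of challenges, not just to the latest one.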

The way the scratchpad is initialized should prevent a fake active device from
acting as a proxy for a real passive device.


Implementation
~~~~~~~~~~~~~~

The active version is really meant for mobile phones. The server shouldn't
keep polling it, because you want to be able to use your phone while
connected to your account.

Also, the active version should take advantage of the fact it has a very
versatile interaction with the user, by displaying messages about who is
going to connect, and requesting confirmation.

Once the connection starts to be established, the server should send yes/no
questions along the lines of "are you connected?", and the mobile phone should
always answer "no" and display the message "press Ok when you're connected".
Once the user completes this action, the phone will answer "yes" to the
server's next question, and stop the application.

Also, it is critical that the same screen offers the opportunity to stop the
current user session, which would be done by no longer responding to the
server's questions, without ever answering "yes". That means the protected
application should have a way to make sure that, once connected,
man-in-the-middle attacks are not possible and cannot hijack the
application's protocol. SRP is one way to solve this problem.

The passive version is meant for USB devices, or for anything connected to a
host that may be heavily compromised in any way.

The device itself may have various DIP switches or buttons to control the
capabilities it allows. The "dump the key in an insecure way" capability is
mainly there for debugging purposes, but could still be enabled with DIP
switches.

The initial seeding of the devices isn't really specified here, even though
the Diffie-Hellman session is part of the protocol. It could also be done
in another, unspecified secure step using RSA with the server; in an insecure
way, by writing the key directly; or offline, during the manufacturing of the
device, keeping a database of the pre-loaded serials and keys.