On the Internet, anyone can claim any identity, and those claims are difficult to verify. Think of an organization that needs to track the activities of thousands of employees who work from all around the world. This can be a headache for management. That’s how identity management became a fundamental requirement for any computer system. You can find the key terms related to identity management in the Additional Info section at the end of this blog.
Currently, most systems use centralized identity.
In centralized identity management, identity data is handled by a trusted central party on behalf of the identity owners. This produces the following problems:
The last few years were flooded with privacy violation incidents such as the Cambridge Analytica scandal, Spain and France fining Facebook, and DeepMind's unlawful sharing of patient data. In most of these incidents, the identity owners had no idea that their privacy was being violated, because they could not see what was happening inside the identity management system.
Another downside is the lack of control over your own data. Have you ever tried to delete your Facebook profile? You can’t entirely delete it. The information belongs to you, but you have no control over it, not even to delete it.
Figure 2 (Source: ecija.com)
Identity honeypots are computer systems that hold massive amounts of personally identifiable data. In centralized identity management, the identities are usually stored in one place, which creates an identity honeypot.
You might think that if you’ve stored your identities in a secure place, this won’t be a problem. But there isn’t a 100% secure system in this world. Every system is vulnerable. It’s a matter of time until an attacker finds the vulnerability. For example, Google is considered a tech giant in the computing world. But in 2014, 5 million email addresses and passwords were stolen including my email address and password. It was terrifying to see my password on a public database.
Figure 3 (Source: The LastPass Blog)
If we know the problems, why didn’t we use the decentralized identity management approach earlier?
Because of Zooko’s triangle. According to Zooko’s triangle, any naming system, including an identity system, can satisfy only two of the triangle's three properties: human-meaningful (mnemonic) names, security, and decentralization.
More details here.
Therefore, most identity management systems gave up on decentralization. For example, our email addresses are secure (hypothetically) and mnemonic, but not decentralized.
But the attempts to decentralize identity management did not stop.
In the first phase of identity management history, the data was controlled by a centralized authority. Then federated identity management became popular, since it allowed identity owners to use the same identity in several places with their consent. However, federated identities were still centralized, because the data was kept in centralized storage that is not transparent to the identity owner.
Then user-centric identity became popular, with the ambition of giving identity owners total control of their digital identities. But again, the identities were bound to the identity register. For example, the identity register can deny service to specific identities.
Self-sovereign identity was born to give full control to the identity owners. For more details read The Path to Self-Sovereign Identity.
Figure 4: The evolution of identity
The concept of self-sovereign identity is well defined, drawing on decades of experience. Let’s discuss what it really is.
According to the Sovrin Foundation, the following are the principal goals of self-sovereign identity and its management:
The above principal goals are achieved by maintaining the following properties:
Figure 5 (Source: Christopher Allen)
For an identity to become truly self-sovereign, the infrastructure of the identities should reside in a
Indy shares three important virtues with the Internet: No one owns it. Everyone can use it. Anyone can improve it. (Phillip Windley, Sovrin Foundation)
In self-sovereign identity management, these are roles of the key players (see the Appendix) of identity management:
Decentralized identity management has become real with the Hyperledger Indy project, a distributed ledger purpose-built for decentralized identity. You can store your identities in a location where you can edit or delete them. After you provide the location of your identities, the Indy platform lists that location under a globally identifiable name. When someone wants to read your identity data, the Indy platform points to the location of your identities. Let’s look into the details of how it works.
First, users can choose a human-memorable name for their identity. Then the platform converts the name to a unique identifier called a DID (decentralized identifier). For example:
did:example:123456789abcdefghi
The DID is stored as the key, and the metadata related to the DID is stored as the value in a DDO (DID descriptor object). This key-value pair is called the DID record. For example:
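The example above follows the generic `did:<method>:<method-specific-id>` layout. As a minimal sketch of how such a string can be taken apart, the following Python snippet splits a DID into its three parts. The names `ParsedDID` and `parse_did` are hypothetical helpers made up for this illustration; they are not part of Indy, and the check is far looser than a full DID syntax validator.

```python
from typing import NamedTuple


class ParsedDID(NamedTuple):
    """Hypothetical container for the three parts of a DID string."""
    scheme: str
    method: str
    method_specific_id: str


def parse_did(did: str) -> ParsedDID:
    """Split a DID of the form did:<method>:<method-specific-id>."""
    # Split on the first two colons only, because the method-specific
    # identifier may itself contain further colons.
    parts = did.split(":", 2)
    if len(parts) != 3 or parts[0] != "did" or not parts[1] or not parts[2]:
        raise ValueError(f"not a valid DID: {did!r}")
    return ParsedDID(*parts)


parsed = parse_did("did:example:123456789abcdefghi")
print(parsed.method)              # example
print(parsed.method_specific_id)  # 123456789abcdefghi
```

Note that the scheme is the lowercase string `did`, so a value starting with `Did:` is rejected by this sketch.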
{
  "@context": "https://w3id.org/did/v1",
  "id": "did:example:123456789abcdefghi",
  "publicKey": [{
    "id": "did:example:123456789abcdefghi#keys-1",
    "type": "RsaVerificationKey2018",
    "owner": "did:example:123456789abcdefghi",
    "publicKeyPem": "-----BEGIN PUBLIC KEY...END PUBLIC KEY-----\r\n"
  }],
  "authentication": [{
    // this key can be used to authenticate as DID ...9938
    "type": "RsaSignatureAuthentication2018",
    "publicKey": "did:example:123456789abcdefghi#keys-1"
  }],
  "service": [{
    "type": "ExampleService",
    "serviceEndpoint": "https://example.com/endpoint/8377464"
  }]
}
For more details go to https://w3c-ccg.github.io/did-spec/#the-generic-did-scheme.
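To make the key-value idea concrete, here is a minimal Python sketch in which a plain dictionary stands in for the ledger. The functions `register`, `resolve`, and `service_endpoints` are hypothetical names chosen for this illustration and do not correspond to any Indy API; the DDO is a trimmed copy of the example above.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory stand-in for the ledger: DID (key) -> DDO (value).
did_records: Dict[str, dict] = {}


def register(ddo: dict) -> None:
    """Store a DDO under its own DID, forming one DID record."""
    did_records[ddo["id"]] = ddo


def resolve(did: str) -> Optional[dict]:
    """Look up the DDO for a DID, or return None if it is unknown."""
    return did_records.get(did)


def service_endpoints(did: str) -> List[str]:
    """Return the locations a DDO points to, if the DID resolves."""
    ddo = resolve(did)
    if ddo is None:
        return []
    return [service["serviceEndpoint"] for service in ddo.get("service", [])]


# Trimmed version of the example DDO (only the fields used here).
register({
    "id": "did:example:123456789abcdefghi",
    "service": [{
        "type": "ExampleService",
        "serviceEndpoint": "https://example.com/endpoint/8377464",
    }],
})

print(service_endpoints("did:example:123456789abcdefghi"))
```

The point of the sketch is only the lookup pattern: the DID is the key, and everything a reader needs in order to reach the identity owner comes from the DDO stored as the value.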
Each DID record is associated with a public-private key pair: signatures on documents are created using the private key and verified using the public key. The corresponding public key is published in the DDO under ‘publicKey’. Each DID is also associated with a DID method specification, which is specific to a ledger or network. A DID method specifies the set of rules for registration, resolution, modification, and revocation.
Since the ledger is public, no Personally Identifiable Information (PII) should be exposed directly on it. To preserve privacy on the public ledger, only the following items are stored in the ledger:
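The sign-with-private-key, verify-with-public-key relationship can be sketched with textbook RSA on tiny numbers. This is a toy under loud assumptions: the primes, the helper names (`digest`, `sign`, `verify`), and the lack of padding are all choices made for illustration. It is not secure and it is not how Indy or any real library implements signatures.

```python
import hashlib

# Toy RSA parameters built from tiny textbook primes.
# For illustration ONLY: real keys are thousands of bits long.
P, Q = 61, 53
N = P * Q                           # public modulus
E = 17                              # public exponent (published in the DDO)
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (Python 3.8+)


def digest(message: bytes) -> int:
    """Reduce a SHA-256 digest into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes, private_exponent: int = D) -> int:
    """Create a signature with the private key."""
    return pow(digest(message), private_exponent, N)


def verify(message: bytes, signature: int, public_exponent: int = E) -> bool:
    """Check a signature using only the public key."""
    return pow(signature, public_exponent, N) == digest(message)


document = b"did:example:123456789abcdefghi"
signature = sign(document)
print(verify(document, signature))            # True
print(verify(document, (signature + 1) % N))  # False: altered signature
```

Anyone who can read the public key from the DDO can run the verification step, while only the holder of the private key can produce a signature that passes it.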
The above four basic pieces of data represent the identity owner through the off-ledger agents.
Figure 6: Peer-to-peer off-ledger agent interaction (Source: Indy project proposal)
A claim becomes a verified claim if there is proof of the claim. Indy supports cryptographically secure verifiable claims as follows. Suppose Alice wants to prove to the hospital where she works that she holds a medical license.
Figure 7: Example of a verifiable claim interaction (Source: Indy project proposal)
Sovrin identity architecture supports interchangeability between any Distributed Ledger Technology (DLT) including Blockchain. The following key items were designed to accomplish the interchangeability:
In self-sovereign identities, the hierarchical Public Key Infrastructure (PKI) model is replaced by the DID/DDO model. The peers in the network use their own DDOs as the root of trust. But DID and DDO key material and metadata can be used to generate an X.509 certificate which is the most widely established format. X.509 certificates can be used as verifiable claims with the DID/DDO model. In this way, hierarchical/federated identity systems can be coupled with decentralized identity systems.
As discussed above, there is no single exemplary ledger in Hyperledger Indy, but the system as a whole is the exemplary ledger. In the system, there might be malicious or faulty ledgers as well. But the system as a whole must come to a consensus to achieve reliability. For that, we need a consensus algorithm.
Figure 8: (Source: Hyperledger Architecture-Volume 1)
Consensus is the process by which a network of nodes provides a guaranteed ordering of transactions and validates the block of transactions (Hyperledger Architecture, Volume 1). Consensus must provide the following functionality:
In Hyperledger, there are different types of consensus mechanisms. Because of that, validating transactions and the order of the transactions are logically separate processes and reusable in any consensus mechanism.
In Indy, both ordering and validation are done by RBFT (see the Appendix). Because of that, the ordered and validated transactions are kept in a single Blockchain ledger. The state is a projection of the Indy Blockchain ledger. It is kept in a data structure called a Merkle Patricia tree and stored in a database as a collection of variables and their values. Hyperledger Indy stores Decentralized Identifiers (DIDs) as state variables, together with their current verification keys.
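The idea that the state is a projection of the ledger can be sketched in a few lines of Python. Everything here is a simplifying assumption: `NymTxn` and `project_state` are hypothetical names, the transactions carry only a DID and a verification key, and a plain dictionary stands in for the Merkle Patricia tree.

```python
from typing import Dict, List, NamedTuple


class NymTxn(NamedTuple):
    """Hypothetical, simplified ledger entry: a DID and its verification key."""
    did: str
    verkey: str


def project_state(ledger: List[NymTxn]) -> Dict[str, str]:
    """Replay the ledger in order to derive DID -> current verification key."""
    state: Dict[str, str] = {}
    for txn in ledger:
        # A later transaction for the same DID replaces the earlier key.
        state[txn.did] = txn.verkey
    return state


ledger = [
    NymTxn("did:example:alice", "verkey-1"),
    NymTxn("did:example:bob", "verkey-A"),
    NymTxn("did:example:alice", "verkey-2"),  # Alice rotates her key
]

state = project_state(ledger)
print(state["did:example:alice"])  # verkey-2
print(len(ledger), len(state))     # 3 2
```

The ledger keeps the full history (three entries), while the state keeps only the current value per DID (two entries), which is what a reader needs when checking a signature.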
Even though the concept of decentralized identity was introduced long before Blockchain technology, we were not successful in achieving full decentralization. Hyperledger Indy is a platform for decentralized identity management that leverages Blockchain. It promises to make decentralized identity a reality with the following features:
Here, you can search for the terms that have been used in the article.
There are four key players in identity management:
In Figure 9, the ID department (identity provider) issues an ID card to a person “P” who is the identity owner. “P” uses his ID card to create an account in a bank (identity user).
Figure 9: Key players in identity management
The following are the general components of an identity management system:
A Byzantine fault is any fault that presents different symptoms to different observers. In a network of replicated nodes, if one node shows a Byzantine fault, the other nodes cannot agree on which node is faulty, because the fault looks different from each node's point of view. Byzantine fault tolerance is a fundamental problem in distributed computing which must be solved.
RBFT (Redundant Byzantine Fault Tolerance) is an improved variant of Practical Byzantine Fault Tolerance (PBFT); Indy's implementation of it is called Plenum. In RBFT, a single node called the primary node proposes new blocks, and the other nodes update their ledgers using the proposed blocks. But the primary node can be smartly malicious and degrade the performance of the system without being detected by the correct replicas. To counter that, the primary node’s performance is periodically compared with the average performance of the other nodes.
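The performance comparison can be sketched as a simple threshold check. This is only an illustration: the function name `primary_is_suspect`, the use of throughput as the performance measure, and the 20% tolerance are assumptions made here, not values taken from RBFT or Indy.

```python
from statistics import mean
from typing import Sequence


def primary_is_suspect(
    primary_throughput: float,
    other_throughputs: Sequence[float],
    tolerance: float = 0.2,  # assumed threshold, chosen for illustration
) -> bool:
    """Flag the primary if it is noticeably slower than the others' average."""
    if not other_throughputs:
        return False  # nothing to compare against
    baseline = mean(other_throughputs)
    return primary_throughput < baseline * (1 - tolerance)


# Requests per second observed during one monitoring period (made-up numbers).
print(primary_is_suspect(95, [100, 110, 90]))  # False: within tolerance
print(primary_is_suspect(50, [100, 110, 90]))  # True: well below average
```

A primary that is flagged this way would be a candidate for replacement, so a node that deliberately slows the system down cannot remain primary unnoticed.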
Figure 10 (Source: Aublin, Mokhtar & Quéma, 2013)
To tolerate “f” faulty nodes, the network needs at least “3f + 1” nodes. In Figure 10, a network of 4 nodes, which can tolerate 1 faulty node (3*1 + 1 = 4), runs 2 protocol instances, one master and one backup. Each node can listen to one primary node.
Figure 11 shows the flow of updating the Blockchain ledger.
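The arithmetic behind the “3f + 1” rule is easy to check in code. The two helper names below are hypothetical and exist only for this worked example.

```python
def min_nodes(faulty: int) -> int:
    """Smallest network size that tolerates `faulty` Byzantine nodes."""
    return 3 * faulty + 1


def max_faulty(nodes: int) -> int:
    """Largest number of Byzantine nodes a network of `nodes` can tolerate."""
    return (nodes - 1) // 3


for n in (4, 7, 10):
    print(n, "nodes tolerate", max_faulty(n), "faulty node(s)")
# 4 nodes tolerate 1 faulty node(s)
# 7 nodes tolerate 2 faulty node(s)
# 10 nodes tolerate 3 faulty node(s)
```

Adding one or two nodes to a 4-node network does not help: 5 or 6 nodes still tolerate only 1 fault, and the next improvement comes at 7 nodes.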
Figure 11: RBFT protocol steps (Source: Aublin et al., 2013)
The primary node updates its ledger when it sends the proposal, and the other nodes update their ledgers after receiving the COMMIT messages. These updates are called optimistic updates because they are made on the assumption that the proposal will be accepted. If the proposal is later rejected, the ledger is reverted. For a transaction that follows another transaction to be accepted, the earlier transaction must already be visible in the state.
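The optimistic-update behaviour can be sketched as a ledger that keeps accepted and not-yet-accepted transactions apart. The class `OptimisticLedger` and its method names are hypothetical and heavily simplified; this is not how Indy's ledger is implemented.

```python
from typing import List


class OptimisticLedger:
    """Toy ledger that applies transactions before they are finally accepted."""

    def __init__(self) -> None:
        self.committed: List[str] = []  # accepted by consensus
        self.pending: List[str] = []    # applied optimistically

    def apply(self, txn: str) -> None:
        """Apply a proposed transaction, assuming it will be accepted."""
        self.pending.append(txn)

    def view(self) -> List[str]:
        """Current ledger as seen locally, including optimistic updates."""
        return self.committed + self.pending

    def commit(self) -> None:
        """The proposal was accepted: make the optimistic updates permanent."""
        self.committed.extend(self.pending)
        self.pending.clear()

    def revert(self) -> None:
        """The proposal was rejected: drop the optimistic updates."""
        self.pending.clear()


node = OptimisticLedger()
node.apply("create did:example:alice")
node.commit()
node.apply("rotate key for did:example:alice")  # depends on the first txn
print(node.view())   # both transactions are visible
node.revert()        # the second proposal is rejected
print(node.view())   # only the committed transaction remains
```

Because the second transaction is applied on top of the local view, it can only make sense if the first one is already visible there, which mirrors the ordering requirement described above.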