June 13, 2018

The Rise of Self-sovereign Identity - Hyperledger Indy

Figure 1: On the Internet, nobody knows you are a dog! (Source: Twitter)

Centralized Identity - What’s Wrong With It?

On the Internet, anyone can claim any identity, and it’s difficult to verify those identities. Think of an organization that needs to track the activities of thousands of employees who work from all around the world. This can be a headache for management. That’s how identity management became a fundamental requirement for any computer system. You can find the key terms related to identity management in the Appendix at the end of this blog.

Currently, most systems use centralized identity.

In centralized identity management, identity data is handled by a trusted central party on behalf of the identity owners. This produces the following problems:

  • Lack of transparency and control for identity owners
  • Identity honeypots

Lack of Transparency and Control for Identity Owners

The last few years have seen a flood of privacy violation incidents, such as the Cambridge Analytica scandal, Spain and France fining Facebook, and DeepMind’s unlawful sharing of patient data. In most of these incidents, the identity owners had no idea that their privacy was being violated, because they can’t see what’s happening inside the identity management system.

Another downside is the lack of control over your own data. Have you ever tried to delete your Facebook profile? You can’t entirely delete it. The information belongs to you, but you have no control over it, not even to delete it.

Figure 2 (Source: ecija.com)

Identity Honeypots

Identity honeypots are computer systems that hold massive amounts of personally identifiable data. In centralized identity management, the identities are usually stored in one place, which creates an identity honeypot.

You might think that if you’ve stored your identities in a secure place, this won’t be a problem. But there isn’t a 100% secure system in this world. Every system is vulnerable. It’s a matter of time until an attacker finds the vulnerability. For example, Google is considered a tech giant in the computing world. But in 2014, 5 million email addresses and passwords were stolen including my email address and password. It was terrifying to see my password on a public database.

Figure 3 (Source: The LastPass Blog)

The History of Identity

If we know the problems, why didn’t we use the decentralized identity management approach earlier?

Because of Zooko’s triangle. According to Zooko’s triangle, any naming system, including an identity system, can satisfy at most two of the triangle’s three properties:

  • Secure - When you search for an object by its name, you should find the actual object, not any other object
  • Decentralized - No central authority controls all the names
  • Mnemonic (human-readable) - The name is something you can actually remember instead of some random string

More details here.

Therefore, most identity management systems gave up on decentralization. For example, our email addresses are secure (hypothetically) and mnemonic, but not decentralized.

But attempts to decentralize identity management did not stop.

Evolution of Identity

In the first phase of identity management history, the data was controlled by a centralized authority. Then federated identity management became popular, since it allowed identity owners to use the same identity in several places with the owner's consent. However, federated identities were still centralized, because the data was stored centrally in a way that is not transparent to the identity owner.

Then user-centric identity became popular with the ambition of giving identity owners total control of their digital identities. But again, the identities were bound to the identity register. For example, the identity register can deny service to specific identities.

Self-sovereign identity was born to give full control to the identity owners. For more details read The Path to Self-Sovereign Identity.

Figure 4: The evolution of identity

Decentralized Identity Management

The concept of self-sovereign identity was well defined with decades of experience. Let’s discuss what it really is.

Self-sovereign Identity (Decentralized Identity or Independent Identity)

According to the Sovrin Foundation, the following are the principal goals of self-sovereign identity and its management:

  1. Security - The identity information must be protected from unintentional disclosure
  2. Controllability - The identity owner must be in control of who can see and access their data and for what purposes
  3. Portability - Users must be able to expose their identity wherever they want and not be tied to a single provider

The above principal goals are achieved by maintaining the following properties:

  1. Existence - Users must have an independent existence
  2. Control - Users must control their identities
  3. Access - Users must have access to their own data
  4. Transparency - Systems and algorithms must be transparent
  5. Persistence - Identities must be long-lived
  6. Portability - Information and services about identity must be transportable
  7. Interoperability - Identities should be as widely usable as possible
  8. Consent - Users must agree to the use of their identity
  9. Minimization - Disclosure of claims must be minimized
  10. Protection - The rights of users must be protected

Figure 5 (Source: Christopher Allen)

For an identity to become truly self-sovereign, the infrastructure of the identities should reside in a decentralized platform, not only in a distributed platform. That’s where Blockchain technology comes into play. The Blockchain ledger is a distributed and decentralized ledger. Every player in the system has their own ledger which can be maintained according to the desires of the player. There is no single exemplary ledger in the system, but the system as a whole is the exemplary ledger. But by analyzing the different ledgers in the system, it’s possible to validate the actual ledger.

Indy shares three important virtues with the Internet: No one owns it. Everyone can use it. Anyone can improve it. (Phillip Windley, Sovrin Foundation)

In self-sovereign identity management, these are the roles of the key players (see the Appendix) of identity management:

  • There is no centralized identity provider, but a well-defined standard to create, share and revoke identities as identity owners wish.
  • The identity owners can have their own set of independent identities that are verified by their own set of public/private key pairs. The “root” of this identity system is no longer a central party, but a distributed ledger which is managed by key players in identity management.
  • The identity validator is also not a centralized party, but any individual party that has gained trust. Validators attest to identities by signing them with their own private key. Any outside party can verify the identity using the public key of the identity validator.
  • The identity users can use the distributed ledger of identities to retrieve the identities and verify them by themselves.

Self-sovereign Identity with Hyperledger Indy

Decentralized identity management has become real with the Hyperledger Indy project — a distributed ledger, purpose-built for decentralized identity. You can store your identities in a location where you can edit or delete them. After giving the location of your identities, the Indy platform lists your location with a globally identifiable name. When someone wants to read your identity data, the Indy platform points out the location of your identities. Let’s look into the details of how it works.

First, users can choose a human-memorable name for their identity. Then the platform converts the name to a unique identifier called a DID (decentralized identifier). For example:

did:example:123456789abcdefghi
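A DID string has three colon-separated parts: the scheme `did`, a method name, and a method-specific identifier. Here is a minimal Python sketch of splitting one apart. The name `parse_did` is hypothetical, and the check is deliberately loose rather than the full grammar from the DID specification.

```python
from typing import NamedTuple


class ParsedDid(NamedTuple):
    method: str      # e.g. "example"; names the ledger or network
    identifier: str  # method-specific identifier


def parse_did(did: str) -> ParsedDid:
    """Split 'did:<method>:<identifier>' into its parts (loose check only)."""
    scheme, sep1, rest = did.partition(":")
    method, sep2, identifier = rest.partition(":")
    if scheme != "did" or not sep1 or not sep2 or not method or not identifier:
        raise ValueError(f"not a DID: {did!r}")
    return ParsedDid(method, identifier)


parsed = parse_did("did:example:123456789abcdefghi")
print(parsed.method, parsed.identifier)
```

The scheme is matched case-sensitively here, so a string starting with `Did:` is rejected.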

The DID is stored as the key, and the metadata related to the DID is stored as the value in a DDO (DID descriptor object). This key-value pair is called the DID record. For example:

{
  "@context": "https://w3id.org/did/v1",
  "id": "did:example:123456789abcdefghi",
  "publicKey": [{
    "id": "did:example:123456789abcdefghi#keys-1",
    "type": "RsaVerificationKey2018",
    "owner": "did:example:123456789abcdefghi",
    "publicKeyPem": "-----BEGIN PUBLIC KEY...END PUBLIC KEY-----\r\n"
  }],
  "authentication": [{
    // this key can be used to authenticate as DID ...9938
    "type": "RsaSignatureAuthentication2018",
    "publicKey": "did:example:123456789abcdefghi#keys-1"
  }],
  "service": [{
    "type": "ExampleService",
    "serviceEndpoint": "https://example.com/endpoint/8377464"
  }]
}

For more details go to https://w3c-ccg.github.io/did-spec/#the-generic-did-scheme.
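To make the structure of the record concrete, here is a minimal Python sketch that loads a trimmed copy of the DDO above and looks up which published key may be used for authentication. The helper name `authentication_keys` is hypothetical, and the `//` comment line is dropped because strict JSON parsers reject comments.

```python
import json

# Trimmed copy of the DID record shown above, without the comment line.
DDO_JSON = r"""
{
  "id": "did:example:123456789abcdefghi",
  "publicKey": [{
    "id": "did:example:123456789abcdefghi#keys-1",
    "type": "RsaVerificationKey2018",
    "owner": "did:example:123456789abcdefghi",
    "publicKeyPem": "-----BEGIN PUBLIC KEY...END PUBLIC KEY-----\r\n"
  }],
  "authentication": [{
    "type": "RsaSignatureAuthentication2018",
    "publicKey": "did:example:123456789abcdefghi#keys-1"
  }]
}
"""


def authentication_keys(ddo: dict) -> list:
    """Return the publicKey entries referenced by the authentication section."""
    keys_by_id = {key["id"]: key for key in ddo.get("publicKey", [])}
    return [
        keys_by_id[auth["publicKey"]]
        for auth in ddo.get("authentication", [])
        if auth["publicKey"] in keys_by_id
    ]


ddo = json.loads(DDO_JSON)
for key in authentication_keys(ddo):
    print(key["id"], key["type"])
```

The `authentication` entries only point at keys by their `id`, so the key material itself is listed once under `publicKey`.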

Each DID record is associated with a public/private key pair: signatures on documents are created with the private key and verified with the public key. The corresponding public key is published in the DDO under ‘publicKey’. Each DID is associated with a DID method specification, which is specific to a ledger or network. A DID method specifies the set of rules for registration, resolution, modification, and revocation.

Off-ledger Interactions

Since the ledger is public, no Personally Identifiable Information (PII) should be stored directly on it. To preserve privacy on the public ledger, only the following items are stored in the ledger:

  1. DIDs - Which in the vast majority of cases will be pairwise-unique
  2. Public keys - Enables encrypted communications with the identity owner
  3. Service endpoints - Enables interaction with the identity owner via an associated agent (see figure 6)
  4. Proofs - Hashed or zero-knowledge proof artifacts that enable identity validators, identity owners, and relying parties to prove the validity period of that particular information

The above four basic pieces of data represent the identity owner through the off-ledger agents.
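“Pairwise-unique” means an identity owner uses a different DID for each relationship, so two relying parties cannot correlate the owner by comparing identifiers. The Python sketch below illustrates the idea only. The class `PairwiseDids`, the random identifiers, and the placeholder `example` method are assumptions; a real DID method defines how identifiers are derived.

```python
import secrets


class PairwiseDids:
    """Hands out a separate DID for each relationship (hypothetical helper)."""

    def __init__(self) -> None:
        self._by_relationship = {}

    def did_for(self, relationship: str) -> str:
        # Reuse the DID already issued for this relationship, else mint one.
        if relationship not in self._by_relationship:
            # Random identifier under the placeholder "example" method.
            new_did = "did:example:" + secrets.token_hex(16)
            self._by_relationship[relationship] = new_did
        return self._by_relationship[relationship]


wallet = PairwiseDids()
print(wallet.did_for("bank"))
print(wallet.did_for("hospital"))  # a different DID than the bank sees
```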

Figure 6: Peer-to-peer off-ledger agent interaction (Source: Indy project proposal)

Verifiable Claims and Zero Knowledge Proofs

A claim becomes a verified claim if there is proof of the claim. Indy supports cryptographically secure verifiable claims as follows. Suppose Alice wants to prove to the hospital where she works that she holds a valid medical license.

  1. If Alice is a doctor registered under the Doctor’s Association, Alice asks the association to send a verifiable claim to the hospital to prove the validity of her medical license.
  2. Then the Doctor’s Association creates a DID on the ledger with a claim schema with the public key P.
  3. Then the Doctor’s Association creates a document that shows the validity of Alice’s medical license and signs the document with the private key that corresponds to P.
  4. Then the signed document (or its digest) is added to the ledger as a transaction under the DID of the Doctor’s Association.
  5. Then the hospital verifies the signed document using public key P of the Doctor’s Association and makes sure that the document is signed by the Doctor’s Association.
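The sign-then-verify core of steps 2 to 5 can be sketched in Python. This is an illustration, not Indy’s implementation. To stay within the standard library, it swaps in a Lamport one-time signature built on SHA-256, where a real deployment would use a standard signature scheme from a vetted cryptography library. All function names are hypothetical, and a Lamport key pair must sign only one message.

```python
import hashlib
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _bits(message: bytes) -> list:
    """The 256 bits of SHA-256(message), most significant first."""
    digest = int.from_bytes(_h(message), "big")
    return [(digest >> (255 - i)) & 1 for i in range(256)]


def generate_keypair():
    """Private key: 256 pairs of random secrets. Public key: their hashes."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(a), _h(b)) for a, b in private]
    return private, public


def sign(private, message: bytes) -> list:
    """Reveal one secret per digest bit (one-time use only)."""
    return [private[i][bit] for i, bit in enumerate(_bits(message))]


def verify(public, message: bytes, signature: list) -> bool:
    """Hash each revealed secret and compare it with the published hash."""
    bits = _bits(message)
    return len(signature) == 256 and all(
        _h(part) == public[i][bits[i]] for i, part in enumerate(signature)
    )


# Steps 2-3: the association keeps the private key and publishes P.
association_private, P = generate_keypair()
claim = b"Alice holds a valid medical license"
signature = sign(association_private, claim)

# Step 5: the hospital checks the signed claim against P.
print(verify(P, claim, signature))                                     # True
print(verify(P, b"Mallory holds a valid medical license", signature))  # False
```

The hospital needs only P and the signed document, so it can verify the claim without contacting the association again.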

Figure 7: Example of a verifiable claim interaction (Source: Indy project proposal)

Interoperability of Decentralized Identity Across DLTs

The Sovrin identity architecture supports interoperability with any Distributed Ledger Technology (DLT), including Blockchain. The following key items were designed to accomplish this interoperability:

  • DIDs - Globally unique identifiers that don’t require any centralized resolution authority (can be resolved by a distributed ledger). More details on DID Data Model and Generic Syntax.
  • DDOs - A JSON object which exposes the public keys and service endpoints for interacting with the entity identified by the DID. More details on DID Data Model and Generic Syntax.
  • Verifiable claims - The standard format for exchange of digital identity attributes and their relationships. More details on Verifiable Claims Task Force.
  • Agents - A standard for communication between the agents that support off-ledger activities.

Interoperability of Digital Certificates

In self-sovereign identities, the hierarchical Public Key Infrastructure (PKI) model is replaced by the DID/DDO model. The peers in the network use their own DDOs as the root of trust. But DID and DDO key material and metadata can be used to generate an X.509 certificate which is the most widely established format. X.509 certificates can be used as verifiable claims with the DID/DDO model. In this way, hierarchical/federated identity systems can be coupled with decentralized identity systems.

Consensus in Hyperledger Indy

As discussed above, there is no single exemplary ledger in Hyperledger Indy, but the system as a whole is the exemplary ledger. In the system, there might be malicious or faulty ledgers as well. But the system as a whole must come to a consensus to achieve reliability. For that, we need a consensus algorithm.

Figure 8: (Source: Hyperledger Architecture-Volume 1)

Consensus

A consensus is a process by which a network of nodes provides a guaranteed ordering of transactions and validates the block of transactions. (Hyperledger Architecture-Volume 1). Consensus must provide the following functionality:

  1. Validate the proposed block and its transactions according to endorsement and consensus policies
  2. Agree on order and correctness and hence on results of execution (implies agreement on global state)
  3. Interface with and depend on the smart-contract layer to verify the correctness of an ordered set of transactions in a block

Hyperledger frameworks use different consensus mechanisms. For that reason, ordering transactions and validating them are treated as logically separate processes, so that each can be reused with any consensus mechanism.

Hyperledger Indy — state

In Indy, both ordering and validation are done by RBFT (see the Appendix). Because of that, both ordered and validated transactions are in a single Blockchain ledger. The state is the projection of the Indy Blockchain ledger. The state is kept in a data structure called a Merkle Patricia tree and stored in a database as a collection of variables and their values. Hyperledger Indy stores Decentralized Identifiers (DIDs) as state variables along with their current verification keys.
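“The state is the projection of the ledger” means the current value of every variable can be rebuilt by replaying the ledger in order. Here is a minimal Python sketch of that idea. It uses a plain dictionary in place of a Merkle Patricia tree, and the transaction layout and key strings are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def project_state(ledger: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Replay (did, verification_key) entries in order; later ones win."""
    state: Dict[str, str] = {}
    for did, verkey in ledger:
        state[did] = verkey  # a key rotation simply replaces the old value
    return state


ledger = [
    ("did:example:alice", "verkey-A1"),
    ("did:example:bob", "verkey-B1"),
    ("did:example:alice", "verkey-A2"),  # Alice rotates her key
]
state = project_state(ledger)
print(state["did:example:alice"])  # verkey-A2
```

The ledger keeps the full history, while the state answers only “what is the current verification key for this DID?”.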

Conclusion

Even though the concept of decentralized identity was introduced long before Blockchain technology, we were not successful in achieving full decentralization. Hyperledger Indy is a platform for decentralized identity management that leverages Blockchain. It promises to make decentralized identity a reality with the following features:

  • Accessible provenance for trust transactions - Indy provides a universal platform for exchanging trustworthy claims, which produces a trusted provenance.
  • User-controlled exchange of verifiable claims - The Indy public permissioned network is open to all. Also, it is the identity owner who gives permission for his or her identity to be validated.
  • Rock-solid revocation model - Indy has a rock-solid revocation model for cases where those verifiable claims are no longer true.
  • Privacy on a public ledger - “Privacy By Design” is built into Indy’s architecture.

Appendix

Here you can look up the terms that have been used in the article.

The Key Players in Identity Management

There are four key players in identity management:

  • Identity owners (the person who is represented by the identity)
  • Identity providers (the person who assigns identities)
  • Identity validators (the person who validates a person’s identity, or validates a person by their identity)
  • Identity users (the person who needs identities for their business/applications)

In figure 9, the ID department (identity provider) issues an ID card to a person “P” who is the identity owner. “P” uses his ID card to create an account in a bank (identity user).

Figure 9: Key players in identity management

The Key Components of Identity Management

The following are the general components of an identity management system:

  • Administration and provisioning - Provides administration and provisioning services for the identities managed by the identity management infrastructure such as account management.
  • Identity governance - The decision services for the identity management infrastructure itself. It includes governing roles, permissions, and identity analytics.
  • Identity federation - It is an arrangement that enables a user to use the same identity in multiple places without disclosing sensitive information about the identity. It might include tasks such as single sign-on, single log-out, and session management.
  • Authentication - When a user accesses an application, it validates the user credentials using the services provided by the identity management infrastructure. The authentication and the associated communication to the application are accomplished with the identity policy assertion services.
  • Authorization - Once authenticated, the application must also verify whether the user has sufficient privileges over resources protected by the application.

RBFT — Redundant Byzantine Fault Tolerance in Blockchain

A Byzantine fault is any fault that presents different symptoms to different observers. In a network of replicated nodes, if one node shows a Byzantine fault, the other nodes cannot come to an agreement about the faulty node, because the fault looks different from the point of view of each node. Byzantine fault tolerance is a fundamental problem in distributed computing which must be solved.

RBFT is an advanced form of Practical Byzantine Fault Tolerance (PBFT). In RBFT, a single node called the primary node proposes new blocks and the others update their ledgers using the proposed blocks. But the primary node can be smartly malicious and degrade the performance of the system without being detected by correct replicas. To counter that, the primary node’s performance is periodically compared with the other nodes’ average performance.

Figure 10 (Source: Aublin, Mokhtar & Quéma, 2013)

If there are “f” faulty nodes in the network, at least “3f + 1” nodes are needed to tolerate them. In figure 10, a network of 4 nodes, which can handle 1 faulty node (3*1 + 1 = 4), has 2 protocol instances running, one master and one backup. Each node can listen to one primary node.
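The “3f + 1” bound can be turned around to ask how many faulty nodes a network of a given size tolerates. A small Python sketch of the arithmetic, with hypothetical function names:

```python
def min_nodes(f: int) -> int:
    """Smallest network that tolerates f Byzantine nodes."""
    return 3 * f + 1


def max_faulty(n: int) -> int:
    """Largest f such that n >= 3f + 1."""
    if n < 1:
        raise ValueError("need at least one node")
    return (n - 1) // 3


for n in (4, 7, 10):
    print(n, "nodes tolerate", max_faulty(n), "faulty")
```

Adding nodes raises the tolerance only at every third node: 5 or 6 nodes still tolerate just 1 fault, the same as 4.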

Figure 11 shows the flow of updating the Blockchain ledger.

  • The client sends a transaction to “f + 1” number of nodes (there’s no need to send to all nodes).
  • After receiving the transaction, the nodes send the transactions to all other nodes through PROPAGATE.
  • Then, each primary node creates a proposal from the received transactions called PRE-PREPARE and sends it to all other nodes.
  • If the proposal is accepted by the nodes, they send an acknowledgment to the proposal by a message called PREPARE.
  • If a node gets a PRE-PREPARE proposal and “2f” PREPARE messages, it sends a COMMIT message to all other nodes.
  • Once a node gets “2f+1” COMMIT messages, the proposal is added to the ledger of the node.
  • The primary can make another proposal without adding the previous proposal to the ledger. Proposed but not committed changes are called uncommitted changes.
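The message thresholds in the steps above can be sketched as a small per-proposal tracker in Python. It models only the counting on a single node, with no networking, signatures, or view changes. The class and method names are hypothetical.

```python
class ProposalTracker:
    """Tracks one proposal on one node through PREPARE and COMMIT."""

    def __init__(self, f: int) -> None:
        self.f = f                 # number of faulty nodes to tolerate
        self.pre_prepared = False
        self.prepares = set()      # senders of PREPARE messages
        self.commits = set()       # senders of COMMIT messages

    def on_pre_prepare(self) -> None:
        self.pre_prepared = True

    def on_prepare(self, sender: str) -> None:
        self.prepares.add(sender)  # a set ignores duplicate messages

    def on_commit(self, sender: str) -> None:
        self.commits.add(sender)

    def ready_to_send_commit(self) -> bool:
        # Needs the PRE-PREPARE proposal plus 2f PREPARE messages.
        return self.pre_prepared and len(self.prepares) >= 2 * self.f

    def ready_to_write_ledger(self) -> bool:
        # Needs 2f + 1 COMMIT messages.
        return len(self.commits) >= 2 * self.f + 1


tracker = ProposalTracker(f=1)
tracker.on_pre_prepare()
tracker.on_prepare("node2")
print(tracker.ready_to_send_commit())   # False: only 1 of 2 PREPAREs so far
tracker.on_prepare("node3")
print(tracker.ready_to_send_commit())   # True
for sender in ("node1", "node2", "node3"):
    tracker.on_commit(sender)
print(tracker.ready_to_write_ledger())  # True: 3 = 2f + 1 COMMITs
```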

Figure 11: RBFT protocol steps (Source: Aublin et al., 2013)

Optimistic Updates

The primary node updates the ledger while sending the proposal, and the other nodes update their ledgers after receiving the COMMIT messages. These updates are called optimistic updates because they are made on the assumption that the proposal will be accepted. If the proposal is later rejected, the ledger is reverted. To accept a transaction that follows another transaction, the previous transaction must already be visible in the state.
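One way to picture optimistic updates is a committed state with a list of uncommitted changes layered on top. Reads see the uncommitted changes, so a later transaction can build on an earlier one, and a rejection simply discards the layer. This is a minimal Python sketch under those assumptions, not Indy’s actual data structures. All names are hypothetical.

```python
class OptimisticState:
    def __init__(self) -> None:
        self.committed = {}
        self.uncommitted = []  # proposed (key, value) changes, oldest first

    def apply(self, key: str, value: str) -> None:
        """Apply a proposed change before it is committed."""
        self.uncommitted.append((key, value))

    def get(self, key: str):
        """Read through uncommitted changes first, newest first."""
        for k, v in reversed(self.uncommitted):
            if k == key:
                return v
        return self.committed.get(key)

    def commit(self) -> None:
        """The proposal was accepted: make the changes permanent."""
        for k, v in self.uncommitted:
            self.committed[k] = v
        self.uncommitted.clear()

    def revert(self) -> None:
        """The proposal was rejected: drop the uncommitted changes."""
        self.uncommitted.clear()


state = OptimisticState()
state.apply("did:example:alice", "verkey-A1")
print(state.get("did:example:alice"))  # verkey-A1, visible before commit
state.revert()
print(state.get("did:example:alice"))  # None
```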

Indy Code Base

Hyperledger Project