Centralized Identity - What’s Wrong With It?
On the Internet, anyone can represent any identity, and it's difficult to verify those identities. Think of an organization that needs to track the activities of thousands of employees who work from all around the world. This can be a headache for the management. That’s how identity management became a fundamental requirement for any computer system. You can find the key terms related to identity management in the Additional Info section at the end of this blog.
Currently, most systems use centralized identity.
In centralized identity management, identity data is handled by a trusted central party on behalf of the identity owners. This produces the following problems:
- Lack of transparency and control to identity owners
- Identity honeypots
Lack of Transparency and Control to Identity Owners
In 2017, Facebook was fined in Spain and France for violating privacy laws. The Spanish data protection authority (AEPD) said Facebook had broken privacy rules on multiple counts in the way it used people’s personal data for advertising purposes. The Commission Nationale de l’Informatique et des Libertés (CNIL) said Facebook had failed to properly inform users of how their personal data was tracked and shared with advertisers. In another incident, British healthcare providers illegally shared the records of 1.6M patients with Google’s artificial intelligence lab DeepMind.
In most privacy violation incidents like those above, the identity owners have no idea that a violation has occurred because they can’t see what’s happening inside the identity management system.
Another downside is the lack of controllability of your own data. Have you ever tried to delete your Facebook profile? You can’t entirely delete it. The information belongs to you, but you have no control over it, even to delete it.
Figure 2 (Source: ecija.com)
Identity Honeypots
Honeypots are valuable computer systems such as massive data stores. Because of their value, honeypots lure attackers. Attackers are especially attracted to identity honeypots because of the sensitivity of the data. In centralized identity management, the identities are usually stored in one place, which creates an identity honeypot.
You might think that if you’ve stored your identities in a secure place, this won’t be a problem. But there isn’t a 100% secure system in this world. Every system is vulnerable. It’s a matter of time until an attacker finds the vulnerability. For example, Google is considered a tech giant in the computing world. But in 2014, 5 million email addresses and passwords were leaked including my email address and password.
Figure 3 (Source: The LastPass Blog)
The History of Identity
If we know the problems, why didn’t we use the decentralized identity management approach earlier?
Because of Zooko’s triangle. According to Zooko’s triangle, any naming system including identity systems could satisfy only two properties from the triangle:
- Secure - When you search for an object by its name, you should find the actual object, not any other object
- Decentralized - No central authority controls all the names
- Mnemonic (human-readable) - The name is something you can actually remember instead of some random string
Therefore, most identity management systems gave up on decentralization. For example, our email addresses are secure (hypothetically) and mnemonic, but not decentralized.
But the attempts to decentralize identity management did not stop.
Evolution of Identity
As mentioned before, identity management used to be centralized and controlled by a single authority. Later came the era of federated identity management where federated authorities started controlling identity management systems. Federated identities allowed users to utilize the same identity in multiple places. But federated identity was again centralized at the identity provider’s side and sharing identities was not transparent to the identity owner.
Then user-centric identity came in with the ambition of giving the identity owners complete control of their digital identities. But again, the identities were bound to the identity register. For example, the identity register can stop giving access to the user’s identity without consulting the identity owner.
Self-sovereign identity was born to give full control to the identity owners. For more details read The Path to Self-Sovereign Identity.
Figure 4: The evolution of identity
Decentralized Identity Management
The concept of self-sovereign identity was well defined with decades of experience. Let’s discuss what it really is.
Self-sovereign Identity (Decentralized Identity or Independent Identity)
According to the Sovrin Foundation, the following are the principal goals of self-sovereign identity and its management:
- Security - The identity information must be protected from unintentional disclosure
- Controllability - The identity owner must be in control of who can see and access their data and for what purposes
- Portability - Users must be able to expose their identity wherever they want and not be tied to a single provider
The above principal goals are achieved by maintaining the following properties:
- Existence - Users must have an independent existence
- Control - Users must control their identities
- Access - Users must have access to their own data
- Transparency - Systems and algorithms must be transparent
- Persistence - Identities must be long-lived
- Portability - Information and services about identity must be transportable
- Interoperability - Identities should be as widely usable as possible
- Consent - Users must agree to the use of their identity
- Minimization - Disclosure of claims must be minimized
- Protection - The rights of users must be protected
Figure 5 (Source: Christopher Allen)
For an identity to become truly self-sovereign, the infrastructure of the identities should reside in a decentralized platform, not only in a distributed platform. That’s where Blockchain technology comes into play. The Blockchain ledger is a distributed and decentralized ledger. Every player in the system has their own ledger which can be maintained according to the desires of the player. There is no single exemplary ledger in the system, but the system as a whole is the exemplary ledger. But by analyzing the different ledgers in the system, it’s possible to validate the actual ledger.
Indy shares three important virtues with the Internet: No one owns it. Everyone can use it. Anyone can improve it. (Phillip Windley, Sovrin Foundation)
In self-sovereign identity management, these are the roles of the key players in identity management (see Additional Info):
- There is no centralized identity provider, but a well-defined standard to create, share and revoke identities as identity owners wish.
- The identity owners can have their own set of independent identities that are verified by their own set of public/private key pairs. The “root” of this identity system is no longer a central party, but a distributed ledger which is managed by key players in identity management.
- The identity validator is also not a centralized party, but any individual party that has gained trust. Validators attest to identities by signing them with their own private key. Any outside party can verify the attestation using the public key of the identity validator.
- The identity users can use the distributed ledger of identities to retrieve identities and verify them by themselves.
Self-sovereign Identity with Hyperledger Indy
Decentralized identity management has become real with the Hyperledger Indy project — a distributed ledger, purpose-built for decentralized identity. You can store your identities in a location where you can edit or delete them. After giving the location of your identities, the Indy platform lists your location with a globally identifiable name. When someone wants to read your identity data, the Indy platform points out the location of your identities. Let’s look into the details of how it works.
In Hyperledger Indy, users can assign a human-memorable name to their identity. On the ledger, the identity name is converted into a unique identity key called a DID (decentralized identifier). There is also a value associated with the key, called a DDO (DID descriptor object). This key-value pair is called the DID record.
The identity owners can be identified on a ledger with the DID record. Each DID record is cryptographically secured by private keys of the identity owner or the guardian. A corresponding public key of the key-pair is published in the DDO using a key description. A DDO may also contain a set of service endpoints for interacting with the identity owner.
Each DID is associated with a separate DID method specification which is specific to a ledger or network. A DID method specifies the set of rules for how a DID is registered, resolved, updated, and revoked on that specific ledger or network.
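To make the DID record more concrete, here is a minimal Python sketch. It assumes the generic `did:<method>:<identifier>` syntax from the DID specification. The `DidRecord` class, the DDO field names (`publicKeys`, `serviceEndpoints`) and all example values are hypothetical and chosen for illustration; they are not Indy's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class DidRecord:
    """A DID (the key) paired with its DDO (the value). Field names are hypothetical."""
    did: str
    ddo: dict = field(default_factory=dict)


def parse_did(did: str) -> tuple[str, str]:
    """Split a DID of the form did:<method>:<method-specific-id>."""
    scheme, _, rest = did.partition(":")
    method, _, identifier = rest.partition(":")
    if scheme != "did" or not method or not identifier:
        raise ValueError(f"not a valid DID: {did!r}")
    return method, identifier


# Illustrative record: the DDO carries a public key and a service endpoint.
record = DidRecord(
    did="did:example:123456789abcdefghi",
    ddo={
        "publicKeys": [{"id": "key-1", "value": "<public key goes here>"}],
        "serviceEndpoints": [{"type": "agent", "url": "https://agent.example.com"}],
    },
)

method, identifier = parse_did(record.did)
print(method, identifier)
```

The method name is what would tell a resolver which DID method specification, and therefore which ledger or network, applies to the identifier.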
Since the ledger is public, it should not expose any Personally Identifiable Information (PII) directly on the ledger. To achieve the privacy of the public ledger, only the following items are stored in the ledger:
- DIDs - Which in the vast majority of cases will be pairwise-unique
- Public keys - Enables encrypted communications with the identity owner
- Service endpoints - Enables interaction with the identity owner via an associated agent (see figure 6)
- Proofs - Hashed or zero-knowledge proof artifacts that enable identity validators, identity owners, and relying parties to prove the validity period of that particular information
The above four basic pieces of data represent the identity owner through the off-ledger agents.
Figure 6: Peer-to-peer off-ledger agent interaction (Source: Indy project proposal)
Verifiable Claims and Zero Knowledge Proofs
A verifiable claim is a requirement when the relying parties need to verify the identity owners’ information. Claims can be issued against pairwise-unique DIDs registered on the ledger, signed with the issuer’s private key, and verified by the issuer’s public key from their DDO. Claims exchange and verification can all be handled by off-ledger agents using the service endpoints discovered in the DDO.
Indy also supports zero-knowledge proofs, along with a revocation model for cases where those verifiable claims are no longer true.
Let’s take an example of exchanging the details (claims) of Doctor Anna to a hospital.
- Before issuing claims, a Doctor’s Licensing Authority (identity validator) creates a claim schema (or uses an existing one), creates public keys, and creates a revocation registry entry on the ledger.
- The Doctor’s Licensing Authority supplies Doctor Anna with a verifiable claim confirming she is a medical doctor via pairwise DID A.
- Doctor Anna presents proof of a subset of her claim — whatever she chooses — to a hospital (relying party or the identity user) she’s applying to work for via pairwise DID B. Anna also includes proof that her claim has not been revoked by the identity validator.
- The hospital verifies the authenticity of the claim without contacting the identity validator (from Indy Project proposal)
Figure 7: Example of a verifiable claim interaction (Source: Indy project proposal)
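The idea of presenting only a subset of a claim can be illustrated with a short Python sketch. Note that this is not how Indy does it: Indy relies on zero-knowledge proofs, whereas the sketch swaps in a much simpler technique, salted hash commitments. The issuer's signature is omitted because the Python standard library has no public-key signatures, the revocation registry is modelled as a plain set, and all function and field names are hypothetical.

```python
import hashlib
import secrets


def commit(name: str, value: str, salt: str) -> str:
    """Salted SHA-256 commitment to a single attribute."""
    return hashlib.sha256(f"{salt}|{name}|{value}".encode()).hexdigest()


def issue_claim(claim_id: str, attributes: dict[str, str]):
    """Issuer side: returns the claim (commitments only) and the holder's salts."""
    salts = {name: secrets.token_hex(16) for name in attributes}
    commitments = {n: commit(n, v, salts[n]) for n, v in attributes.items()}
    # A real issuer would sign claim_id and commitments; omitted in this sketch.
    return {"claim_id": claim_id, "commitments": commitments}, salts


def present(attributes: dict[str, str], salts: dict[str, str], reveal: list[str]):
    """Holder side: disclose only the chosen attributes, with their salts."""
    return {name: (attributes[name], salts[name]) for name in reveal}


def verify(claim: dict, presentation: dict, revoked_ids: set[str]) -> bool:
    """Verifier side: check revocation, then recompute each disclosed commitment."""
    if claim["claim_id"] in revoked_ids:
        return False
    return all(
        commit(name, value, salt) == claim["commitments"].get(name)
        for name, (value, salt) in presentation.items()
    )


anna = {"name": "Anna", "profession": "medical doctor", "license_no": "12345"}
claim, anna_salts = issue_claim("claim-1", anna)

# Anna reveals only her profession to the hospital.
proof = present(anna, anna_salts, ["profession"])
print(verify(claim, proof, revoked_ids=set()))
```

The hospital learns the profession and nothing else, and it never contacts the licensing authority. Unlike a zero-knowledge proof, though, this scheme reveals the attribute value itself and lets presentations of the same claim be linked to each other.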
Interoperability of Decentralized Identity Across DLTs
The Sovrin identity architecture supports interoperability across any Distributed Ledger Technology (DLT), including Blockchain. The following key items were designed to accomplish this:
- DIDs - Globally unique identifiers that don’t require any centralized resolution authority (can be resolved by a distributed ledger). More details on DID Data Model and Generic Syntax.
- DDOs - A JSON object which exposes the public keys and service endpoints for interacting with the entity identified by the DID. More details on DID Data Model and Generic Syntax.
- Verifiable claims - The standard format for exchange of digital identity attributes and their relationships. More details on Verifiable Claims Task Force.
- Agents - A standard for communication between the agents that support off-ledger activities.
Interoperability of Digital Certificates
In self-sovereign identities, the hierarchical PKI model is replaced by the DID/DDO model. The peers in the network use their own DDOs as the root of trust. But DID and DDO key material and metadata can be used to generate an X.509 certificate which is the most widely established format. X.509 certificates can be used as verifiable claims with the DID/DDO model. In this way, hierarchical/federated identity systems can be coupled with decentralized identity systems.
Consensus in Hyperledger Indy
As discussed above, there is no single exemplary ledger in Hyperledger Indy, but the system as a whole is the exemplary ledger. In the system, there might be malicious or faulty ledgers as well. But the system as a whole must come to a consensus to achieve reliability. For that, we need a consensus algorithm.
Figure 8: (Source: Hyperledger Architecture-Volume 1)
A consensus is a process by which a network of nodes provides a guaranteed ordering of transactions and validates the block of transactions (Hyperledger Architecture-Volume 1). Consensus must provide the following functionality:
- Validate the proposed block and its transactions according to endorsement and consensus policies
- Agree on order and correctness and hence on results of execution (implies agreement on global state)
- Interface with and depend on the smart-contract layer to verify the correctness of an ordered set of transactions in a block
In Hyperledger, there are different types of consensus mechanisms. Because of that, validating transactions and the order of the transactions are logically separate processes and reusable in any consensus mechanism.
RBFT — Redundant Byzantine Fault Tolerance in Blockchain
A Byzantine fault is any fault that presents different symptoms to different observers. In a network of replicated nodes, if one node shows a Byzantine fault, the other nodes cannot agree on which node is faulty because the fault looks different from each node's point of view. Byzantine fault tolerance is a fundamental problem in distributed computing which must be solved.
RBFT is an advanced Practical Byzantine Fault Tolerance (PBFT) mechanism. In RBFT, a single node called the primary node proposes new blocks, and the others update their ledgers using the proposed blocks. But that primary node can be smartly malicious and degrade the performance of the system without being detected by correct replicas. As a solution, the primary node’s performance is periodically compared with the other nodes’ average performance.
Figure 9 (Source: Aublin, Mokhtar & Quéma, 2013)
If there are “f” faulty nodes in the network, at least “3f + 1” nodes are needed to tolerate them. In figure 9, a network of 4 nodes, which can tolerate 1 faulty node (3*1 + 1 = 4), runs 2 protocol instances: one master and one backup. Each instance has its own primary node.
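The arithmetic behind the “3f + 1” bound is easy to check in a few lines of Python. This is a minimal sketch: the function names are hypothetical, and the `f + 1` count of protocol instances follows the description in the RBFT paper rather than anything Indy-specific.

```python
def min_nodes(f: int) -> int:
    """Smallest network that tolerates f Byzantine nodes."""
    return 3 * f + 1


def max_faulty(n: int) -> int:
    """Largest f such that n >= 3f + 1."""
    if n < 1:
        raise ValueError("a network needs at least one node")
    return (n - 1) // 3


def protocol_instances(f: int) -> int:
    """RBFT runs f + 1 instances: one master plus f backups (per the RBFT paper)."""
    return f + 1


for n in (4, 7, 10):
    f = max_faulty(n)
    print(f"{n} nodes tolerate {f} faulty node(s), "
          f"running {protocol_instances(f)} protocol instances")
```

Adding nodes only raises the tolerance at every third step: 5 or 6 nodes still tolerate a single faulty node, just like 4.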
Figure 10 shows the flow of updating the Blockchain ledger.
- The client sends a transaction to “f + 1” number of nodes (there’s no need to send to all nodes).
- After receiving the transaction, the nodes send the transactions to all other nodes through PROPAGATE.
- Then, each primary node creates a proposal from the received transactions called PRE-PREPARE and sends it to all other nodes.
- If the proposal is accepted by the nodes, they send an acknowledgment to the proposal by a message called PREPARE.
- If a node gets a PRE-PREPARE proposal and “2f” PREPARE messages, it sends a COMMIT message to all other nodes.
- Once a node gets “2f+1” COMMIT messages, the proposal is added to the ledger of the node.
- The primary can make another proposal without adding the previous proposal to the ledger. Proposed but not committed changes are called uncommitted changes.
Figure 10: RBFT protocol steps (Source: Aublin et al., 2013)
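The message thresholds in the steps above can be sketched as a small state machine for a single node and a single proposal. This is only an illustration of the counting rules (“2f” PREPARE messages, “2f+1” COMMIT messages). The `Replica` class and its method names are hypothetical, and networking, signatures, view changes and RBFT's parallel instances are all left out.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Replica:
    """Tracks one proposal through PRE-PREPARE, PREPARE and COMMIT on one node."""
    name: str
    f: int                                             # faulty nodes tolerated
    proposal: Optional[str] = None
    prepares: set[str] = field(default_factory=set)    # senders of PREPARE
    commits: set[str] = field(default_factory=set)     # senders of COMMIT
    commit_sent: bool = False
    ledger: list[str] = field(default_factory=list)

    def on_pre_prepare(self, proposal: str) -> None:
        self.proposal = proposal
        self._maybe_send_commit()

    def on_prepare(self, sender: str) -> None:
        self.prepares.add(sender)
        self._maybe_send_commit()

    def _maybe_send_commit(self) -> None:
        # COMMIT goes out once the PRE-PREPARE and 2f PREPARE messages are in.
        if self.proposal is not None and len(self.prepares) >= 2 * self.f:
            self.commit_sent = True

    def on_commit(self, sender: str) -> None:
        self.commits.add(sender)
        # The proposal is written to the ledger after 2f + 1 COMMIT messages.
        enough = len(self.commits) >= 2 * self.f + 1
        if self.proposal is not None and enough and self.proposal not in self.ledger:
            self.ledger.append(self.proposal)


node = Replica(name="node2", f=1)
node.on_pre_prepare("block-1")
for sender in ("node3", "node4"):
    node.on_prepare(sender)
for sender in ("node1", "node3", "node4"):
    node.on_commit(sender)
print(node.commit_sent, node.ledger)
```

Using sets of senders rather than plain counters means a duplicate message from the same node is not counted twice.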
Hyperledger Indy — state
In Indy, both ordering and validation are done by RBFT. Because of that, both ordered and validated transactions are in a single Blockchain ledger. The state is the projection of the Indy Blockchain ledger. The state is kept in a data structure called a Merkle Patricia tree and stored in a database as a collection of variables and their values. Hyperledger Indy stores Decentralized Identifiers (DIDs) as state variables with values including the current verification key.
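The phrase "the state is the projection of the ledger" can be illustrated by replaying an append-only list of transactions into a key-value map. This is a sketch under simplifying assumptions: the transaction fields are hypothetical, and a single SHA-256 digest over the sorted entries stands in for the root hash that a real Merkle Patricia tree would provide.

```python
import hashlib

# Append-only ledger: every change is a new transaction (hypothetical format).
ledger = [
    {"did": "did:example:alice", "verkey": "key-A1"},
    {"did": "did:example:bob", "verkey": "key-B1"},
    {"did": "did:example:alice", "verkey": "key-A2"},  # Alice rotates her key
]


def project_state(transactions: list[dict]) -> dict[str, dict]:
    """Replay the ledger in order; later transactions overwrite earlier values."""
    state: dict[str, dict] = {}
    for txn in transactions:
        state[txn["did"]] = {"verkey": txn["verkey"]}
    return state


def state_digest(state: dict[str, dict]) -> str:
    """Deterministic digest of the state (a stand-in for a Merkle Patricia root)."""
    hasher = hashlib.sha256()
    for did in sorted(state):
        hasher.update(f"{did}={state[did]['verkey']};".encode())
    return hasher.hexdigest()


state = project_state(ledger)
print(state["did:example:alice"]["verkey"])
print(state_digest(state))
```

The ledger keeps the full history, including Alice's old key, while the state holds only the current value for each DID. Two nodes that replayed the same ledger can compare digests to confirm they reached the same state.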
The primary node updates the ledger while sending the proposal and other nodes update their ledger after receiving the COMMIT messages. These updates are called optimistic updates because they are done by assuming the proposal will be accepted. If the proposal gets rejected after some time, then the ledger will be reverted. To accept a transaction which is followed by another transaction, the previous transaction must be visible to the state.
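Optimistic updates and their rollback can be sketched with a committed map plus a stack of uncommitted batches. The `OptimisticState` class below is hypothetical and says nothing about how Indy implements this internally. It only shows why a later transaction can see an earlier uncommitted one, and why reverting a batch must also discard everything built on top of it.

```python
class OptimisticState:
    """Committed state plus a stack of applied-but-uncommitted batches."""

    def __init__(self) -> None:
        self.committed: dict[str, str] = {}
        self.uncommitted: list[tuple[str, dict[str, str]]] = []

    def apply(self, batch_id: str, changes: dict[str, str]) -> None:
        """Apply a proposal optimistically, before consensus is reached."""
        self.uncommitted.append((batch_id, dict(changes)))

    def get(self, key: str):
        """Read through uncommitted batches (newest first), then committed state."""
        for _, changes in reversed(self.uncommitted):
            if key in changes:
                return changes[key]
        return self.committed.get(key)

    def commit(self, batch_id: str) -> None:
        """Make the oldest uncommitted batch permanent."""
        if not self.uncommitted or self.uncommitted[0][0] != batch_id:
            raise ValueError("batches must be committed in proposal order")
        _, changes = self.uncommitted.pop(0)
        self.committed.update(changes)

    def revert(self, batch_id: str) -> None:
        """Drop a rejected batch and every batch applied after it."""
        ids = [bid for bid, _ in self.uncommitted]
        if batch_id not in ids:
            raise ValueError(f"unknown batch: {batch_id}")
        del self.uncommitted[ids.index(batch_id):]


state = OptimisticState()
state.apply("batch-1", {"did:example:alice": "key-A1"})
state.apply("batch-2", {"did:example:alice": "key-A2"})  # builds on batch-1
state.revert("batch-2")                                   # proposal rejected
state.commit("batch-1")
print(state.get("did:example:alice"))
```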
Even though the concept of decentralized identity was introduced long before Blockchain technology, we were not successful in achieving full decentralization. Hyperledger Indy is a platform for decentralized identity management that leverages Blockchain. It promises to make decentralized identity a reality with the following features:
- Accessible provenance for trust transactions - Indy provides a universal platform for exchanging trustworthy claims, which produces a trusted provenance.
- User-controlled exchange of verifiable claims - The Indy public permissioned network is open to all. Also, it is the identity owner who gives permission for his or her identity to be validated.
- Rock-solid revocation model - Indy has a rock-solid revocation model for cases where those verifiable claims are no longer true.
- Privacy on a public ledger - “Privacy By Design” is built into Indy’s architecture.
Additional Info
Here, you can find the key terms that have been used in this article.
The key players in identity management
There are four key players in identity management:
- Identity owners (the person who is represented by the identity)
- Identity providers (the person who assigns identities)
- Identity validators (the person who validates a person's identity, or validates the person by means of an identity)
- Identity users (the person who needs identities for their business/applications)
In figure 11, the ID department (identity provider) issues an ID card to a person “P” who is the identity owner. “P” uses his ID card to create an account in a bank (identity user).
Figure 11: Key players in identity management
The key components of identity management
The following are the general components of an identity management system:
- Administration and provisioning - Provides administration and provisioning services for the identities managed by the identity management infrastructure such as account management.
- Identity governance - The decision services for the identity management infrastructure itself. It includes governing roles, permissions, and identity analytics.
- Identity federation - It is an arrangement that enables a user to use the same identity in multiple places without disclosing sensitive information about the identity. It might include tasks such as single sign-on, single log-out, and session management.
- Authentication - When a user accesses an application, the application validates the user's credentials using the services provided by the identity management infrastructure. The authentication and the associated communication to the application are accomplished with the identity policy assertion services.
- Authorization - Once authenticated, the application must also verify whether the user has sufficient privileges over resources protected by the application.