Tuesday, April 5, 2022

Learn best practices for debugging and error handling in an enterprise-grade blockchain application – IBM Developer


Blockchain is a shared, replicated, immutable ledger for recording transactions, tracking assets, and building trust. An asset can be tangible (for example, a house or a car) or intangible (for example, intellectual property or patents). Blockchain is built on properties like consensus, provenance, immutability, and finality.

In a traditional business scenario, a transaction that involves multiple organizations is recorded differently by each business. If two organizations disagree on the state of a transaction, then a dispute occurs, which can often be costly and time consuming to resolve. Blockchain introduces the following concepts:

  • Multiparty transactions: Signed by everyone involved in the transaction.
  • Shared ledgers: The same ledger is replicated in every organization in the network and kept synchronized using a process called consensus. Ledgers are immutable and final; after a multiparty transaction is written to the ledger, it cannot be reversed.

To get an overview of blockchain in more detail, check out the Get started with blockchain learning path. This blog post focuses on different points of failure in a typical blockchain-based application, possible causes for failures, and recommended debugging and error handling techniques.

The importance of error handling

In cloud-based application deployments that involve multiple integration points, there's always a possibility of encountering transient failures. Planning for and handling these transient failures is important to maintain a resilient architecture.

Unhandled error conditions can lead to failures and system crashes, and they often leave the application exposed in a vulnerable state. Good exception handling helps you anticipate errors or system crashes in advance and put in appropriate code to recover from them. It might not be possible to handle all exceptional cases or unexpected scenarios, but a well-designed system ensures a graceful exit without causing major issues, inconsistencies, or security vulnerabilities.

Helpful blockchain terminology

  • Peer: Maintains ledger and state, commits transactions, and can endorse transactions by receiving a transaction proposal and responding by granting or denying endorsement (must hold the smart contract to endorse).
  • Ordering node: Orders and packages transactions into blocks and then communicates these blocks to committing peers.
  • CA: Issues digital certificates to member organizations and their users.
  • Channels: Channels provide privacy between different ledgers. Ledgers exist in the scope of a channel.
  • Smart contract: Contains the business logic that governs how data is written to and read from the ledger.
  • Transaction: Any operation that modifies the ledger state is recorded as a transaction, such as an asset exchange or a transfer.
  • Ledger: A ledger is maintained by each peer and includes the blockchain and world state.
  • Identity: The resources that you can access in a network are determined by the identity, which is typically represented by an X.509 certificate issued by the CA.
  • Endorsement policy: Describes the conditions by which a transaction can be endorsed. A transaction can only be considered valid if it has been endorsed according to its policy.
  • Connection profile: Contains network information such as node-level connection information, TLS certs, and more.

The following image represents the steps involved in a regular blockchain transaction flow:

Image showing a regular blockchain transaction flow

  1. Client application submits a transaction proposal.
  2. Endorsers E0, E1, and E2 each execute the proposed transaction. None of these executions update the ledger. Each execution captures the set of read and written data (called an RW-set), which now flows in the fabric.
  3. Application receives responses. The RW-sets are signed by each endorser and also include each record version number.
  4. Application submits proposal responses as a transaction for ordering.
  5. Orderer sends blocks to committing peers.
  6. Committing peers validate transactions. Validated transactions are applied to the world state and retained on the ledger.
  7. Application is notified when a block is committed to the ledger of a peer.

Potential errors in a blockchain network

Errors can arise at any point of the transaction and are caused by the underlying network, business logic, outdated data, or network timeouts. You can resolve errors that are not caused by application input or incorrect usage by adding the correct error handling and retry mechanism in the application layer to build in application resiliency.

Errors typically fall into one of two categories:

  • Retryable: Errors from chaincode or the network should be propagated back to the application layer for error handling and a retry mechanism.
  • Nonretryable: Errors that are caused by incorrect usage should be handled so that the code path exits gracefully.

Network errors

A Hyperledger Fabric Node.js or Java client communicates with the Hyperledger Fabric network using gRPC. The gRPC technology handles moving data reliably between the Fabric network and the Fabric client application. The application sets the gRPC settings based on the application usage.

Problem summary

gRPC request timeout while submitting the proposal

This timeout can happen at any stage of the blockchain transaction flow. In the previous image, the timeout occurs during Steps 1-4, where the client communicates with the network. Typical causes are network latency, unavailability of the chaincode container, or poor peer health.

Error message:

sendPeersProposal - Promise is rejected: Error: REQUEST_TIMEOUT
Peer{ id: 1 , name: peer1.org1.example.com, channelName: mychannel,
url:grpcs://192.168.1.1:7051, mspid: Org1MSP} failed because of
timeout(35000 milliseconds) expiration
java.util.concurrent.TimeoutException: null

Request timeout during the execution of a proposal

This error occurs when the time taken to execute the proposal exceeds the configured execution timeout, which defaults to 30 seconds. As a best practice, you should limit the computations or operations that are performed in a smart contract. If the application logic does not allow for that, then you can increase the execution timeout to mitigate the issue.

This error can also be thrown by the lifecycle system chaincode (LSCC) in Hyperledger Fabric during the startup of the chaincode container. However, the resolution remains the same in this case as well.

Error message:

"peer1.org.com:7051" failed: message=failed to execute transaction 799eb959954a7f2f8f75dee735969a4ba374b4bc98b4bbacd2fc85fc57a860b9: error sending: timeout expired while executing transaction, stack=Error: failed to execute transaction 799eb959954a7f2f8f75dee735969a4ba374b4bc98b4bbacd2fc85fc57a860b9: error sending: timeout expired while executing transaction

  1. Set CORE_CHAINCODE_EXECUTETIMEOUT=<60s or higher> in the Fabric configuration to handle the request timeout during the execution of a proposal.
  2. Set the following gRPC settings on the Fabric client end to help reduce gRPC timeouts, which typically happen because of network latency:

    "grpc.keepalive_time_ms": 120000,
    "grpc.http2.min_time_between_pings_ms": 120000,
    "grpc.keepalive_timeout_ms": 20000,
    

  3. Check the Fabric client timeout configuration and tune it based on the application processing logic and network feedback.
  4. Check the peer health and the health of IO operations on the peer. If the peer is not healthy, a restart might resolve the issue.
  5. Client-side retry handling: A timeout exception should be classified as a retryable error and handled by writing retry logic on the client side. You can use simple retry handling of this exception with exponential backoff to recover if it is caused by intermittent network issues.
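The client-side retry in the last step could look roughly like the following. This is a minimal sketch, not SDK code: the class name TimeoutRetry, the method callWithBackoff, and the attempt and delay parameters are all hypothetical, and the call being retried stands in for whatever proposal submission your client performs.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper for illustration; not part of any Fabric SDK. */
final class TimeoutRetry {

    private TimeoutRetry() {}

    /**
     * Runs the call and retries only when it throws TimeoutException.
     * The wait doubles after each failed attempt: base, 2*base, 4*base, ...
     * Any other exception propagates immediately and is not retried.
     */
    static <T> T callWithBackoff(Callable<T> call, int maxAttempts, long baseDelayMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delayMs = baseDelayMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return call.call();
            } catch (TimeoutException e) {
                if (attempt >= maxAttempts) {
                    throw e; // retries exhausted: surface the timeout to the caller
                }
                Thread.sleep(delayMs);
                delayMs *= 2;
            }
        }
    }
}
```

A caller would wrap the submission in a lambda, for example TimeoutRetry.callWithBackoff(() -> submit(request), 4, 500), where submit is your own method. Capping the number of attempts keeps a persistent outage from blocking the caller indefinitely.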

MVCC_READ_CONFLICT errors

Read-write sets are generated by the peer when a transaction is submitted to a peer. This read/write set is then used when the transaction is committed to the ledger. It contains the names of the variables to be read or written and their versions when they were read.

Problem summary

Every peer in the network (VSCC) validates the number of signed proposals and the version of every read key in the read-write sets against the world state upon receiving the blocks from the ordering service.

If, during the time between set creation and committing, a different transaction was committed and changed the version of the key in the peer's current world state, then the original transaction is rejected during committal because the version when read is not the current version. This error is typically seen during commit.
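The version check described above can be illustrated with a small in-memory model. This is a simplified sketch and not Fabric code: the class MvccSketch and its methods are hypothetical, and it uses a plain counter as the key version, whereas a real peer tracks versions differently. The value 0 for a valid transaction matches the Fabric validation codes; 11 is the MVCC_READ_CONFLICT code shown in this post.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified in-memory model of a versioned world state; not Fabric code. */
final class MvccSketch {

    static final int VALID = 0;
    static final int MVCC_READ_CONFLICT = 11;

    private final Map<String, Long> versions = new HashMap<>();
    private final Map<String, String> values = new HashMap<>();

    /** Execution phase: capture the key's version at read time (0 if the key is absent). */
    long readVersion(String key) {
        return versions.getOrDefault(key, 0L);
    }

    /** Returns the current value of the key, or null if it was never written. */
    String readValue(String key) {
        return values.get(key);
    }

    /**
     * Validation phase: commit only if the version captured at read time
     * still matches the key's current version.
     */
    int validateAndCommit(String key, long versionAtRead, String newValue) {
        long current = versions.getOrDefault(key, 0L);
        if (current != versionAtRead) {
            return MVCC_READ_CONFLICT; // another transaction changed the key first
        }
        values.put(key, newValue);
        versions.put(key, current + 1);
        return VALID;
    }
}
```

If two transactions both read a key at the same version, the first one to be validated commits and bumps the version, and the second one is rejected with code 11 even though its endorsement was correct at the time it ran.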

Error message:

Peer peer1.org.com:7051 has rejected transaction "c91172484bad08eaae2595522a0a8c0a30891b4a90110e0a4fc490c0aacdb399" with code "MVCC_READ_CONFLICT" , validationCode=11

  1. To address this issue at the design stage, you can create data and transaction structures that avoid editing the same key concurrently. Take a look at the Hyperledger Fabric samples for an example of how to do this.

  2. The application needs to avoid key collisions as much as possible and might need to write retry logic on the client side. The retry logic should query the latest state of the key from the ledger and apply the required changes on the latest state.

Client-side retry handling

if (validationCode == 11) {
    // Query the ledger again for the latest state; a is an object on the ledger.
    Object a = /* latest state queried from the ledger */;
    // Perform the writes on a.
    a.setAction();
}

In a typical blockchain transaction flow, applications register to be notified when a block is committed to the ledger of a peer. When a transaction fails because of a read conflict, the notification typically has VALIDATION_CODE: 11, which indicates MVCC_READ_CONFLICT.

You can also consider VALIDATION_CODE: 12, which indicates PHANTOM_READ_CONFLICT, under the same category.

A sample blockchain event:

 [eventTransactionId: f948056aa59d42810d3318dc1df7643152e220d3bb75924d5228a5e9efb95017, ,status: failure,eventName: xxxx,actionType: xxxx,errorCode: VALIDATION_CODE: 11,errorMessage:null,validationCode:11,channel: xxxx,blockNum: 24712079]

The blockchain event includes the transaction ID that was submitted and the details indicating whether it was committed on the blockchain or not. The status: failure indicates the transaction was not committed, and validationCode indicates the reason for failure. In this example, VALIDATION_CODE: 11 indicates there was an MVCC_READ_CONFLICT. VALIDATION_CODE: 12, which indicates PHANTOM_READ_CONFLICT, also falls in the same category.

Sample retry code:

if (blockChainNotificationObject.getValidationCode() == 11
        || blockChainNotificationObject.getValidationCode() == 12) {
    // 1. Query the ledger to get the latest state of the object.
    // 2. If the change is already applied (probably a parallel duplicate invocation),
    //    consider the event a success and continue the application logic.
    // 3. Otherwise, apply the required changes on the latest state and retry the transaction.
}
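One way to turn these three steps into compilable code is shown below. It is only a sketch under stated assumptions: ConflictRecovery, Outcome, and recover are hypothetical names, and the two callbacks stand in for your own ledger query and resubmission logic. The sketch also assumes that "already applied" can be detected by comparing the latest state with the state the transaction was trying to reach.

```java
import java.util.Objects;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical recovery helper for read-conflict notifications; not SDK code. */
final class ConflictRecovery {

    enum Outcome { NOT_A_READ_CONFLICT, ALREADY_APPLIED, RESUBMITTED }

    static final int MVCC_READ_CONFLICT = 11;
    static final int PHANTOM_READ_CONFLICT = 12;

    private ConflictRecovery() {}

    /**
     * @param validationCode code taken from the commit notification
     * @param queryLatest    callback that reads the latest state from the ledger
     * @param desiredState   the state this transaction was trying to reach
     * @param resubmit       callback that reapplies the change on the latest state and resubmits
     */
    static <S> Outcome recover(int validationCode, Supplier<S> queryLatest,
                               S desiredState, Consumer<S> resubmit) {
        if (validationCode != MVCC_READ_CONFLICT && validationCode != PHANTOM_READ_CONFLICT) {
            return Outcome.NOT_A_READ_CONFLICT; // leave other codes to other handlers
        }
        S latest = queryLatest.get();                 // step 1: read the latest state
        if (Objects.equals(latest, desiredState)) {
            return Outcome.ALREADY_APPLIED;           // step 2: duplicate invocation, treat as success
        }
        resubmit.accept(latest);                      // step 3: reapply on the latest state and retry
        return Outcome.RESUBMITTED;
    }
}
```

Returning an outcome instead of throwing lets the calling code decide whether to continue the business flow, wait for the resubmitted transaction's own commit event, or hand the failure to another handler.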

Peer lag errors

Network delays can sometimes lead to the peers being out of sync, where one peer might still be catching up with the latest block. Any queries made on the lagging peer would return an outdated state that might no longer be valid for the application.

Problem summary

P1 and P2 are two peers, and K1 is a new key added to the ledger. Because of network latency, K1 is not yet added to P2's ledger. If the application queries for K1 on P2, it receives a null value.

It's important that you program the application to identify and handle inconsistent data caused by peer lag. Applications should have a mechanism to listen to block addition events and check that the block has been added successfully. In the case of inconsistent behavior or null values returned while querying, you can implement retry logic to confirm the state update. A retry with exponential backoff gives the peers time to overcome the network issues or delay and sync up on the state.
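As a rough illustration of retrying a query against a lagging peer, the sketch below polls until a non-null value comes back or the attempts run out. The names LaggingPeerQuery and queryUntilPresent are hypothetical, and the query callback is assumed to wrap your own ledger query; none of this is SDK code.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical polling helper for queries against a peer that may be lagging. */
final class LaggingPeerQuery {

    private LaggingPeerQuery() {}

    /**
     * Repeats the query until it returns a non-null value or maxAttempts is reached.
     * The wait between attempts doubles each time, starting at baseDelayMs.
     */
    static <T> Optional<T> queryUntilPresent(Supplier<T> query, int maxAttempts, long baseDelayMs)
            throws InterruptedException {
        long delayMs = baseDelayMs;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            T value = query.get();
            if (value != null) {
                return Optional.of(value);
            }
            if (attempt < maxAttempts) {
                Thread.sleep(delayMs); // give the peer time to catch up
                delayMs *= 2;
            }
        }
        return Optional.empty(); // still missing: let the caller decide what to do next
    }
}
```

An empty result after all attempts does not prove the key is absent from the ledger. It only means this peer never returned it in the time allowed, so the caller might query another peer or report the failure.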

Endorsement policy failures

All transactions must be endorsed by the endorsing peers in the execution phase (second step) of a blockchain transaction flow. The endorsement of transactions can fail for several reasons, such as invalid endorser signatures or other technical reasons. Most of the possible causes for endorsement policy failures are misconfigurations or transient world state inconsistencies between the peers. The key-value store, which maintains the world state, is updated by each peer independently in the validation phase. Therefore, transient world state inconsistencies between the peers are possible. At the same time, the endorsing peers use the world state to generate read/write sets in the execution phase. Thus, the world state inconsistencies lead to a read/write set mismatch in the endorsement responses, causing an endorsement policy failure of the transaction.

  • In the case of endorsement policy failures caused by misconfigurations, you need to validate the configurations and correct them.
  • Applications can have resiliency and retry logic in place and try fetching endorsements from every peer in an organization before giving up.
  • The application can change the endorsement policy, based on its logic. To change the endorsement policy, you can set Channel/Application/Endorsement in configtx.yaml to ANY Endorsement. Fabric uses this configuration as the default endorsement policy in all chaincode.
  • In the case of world state inconsistencies, it is recommended to have client-side retry logic as discussed in the MVCC_READ_CONFLICT errors section.
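The suggestion to try every peer in an organization before giving up can be sketched as a simple loop. The names EndorsementFallback and firstEndorsement are hypothetical, and the endorse callback is assumed to wrap whatever call your client uses to request an endorsement from one peer and to signal failure with an unchecked exception or a null response.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical helper that asks each peer in turn for an endorsement; not SDK code. */
final class EndorsementFallback {

    private EndorsementFallback() {}

    /**
     * Returns the first successful endorsement response, trying peers in list order.
     * A peer that throws or returns null is skipped and the next peer is tried.
     */
    static <P, R> Optional<R> firstEndorsement(List<P> peers, Function<P, R> endorse) {
        for (P peer : peers) {
            try {
                R response = endorse.apply(peer);
                if (response != null) {
                    return Optional.of(response);
                }
            } catch (RuntimeException e) {
                // This peer failed to endorse; fall through to the next one.
            }
        }
        return Optional.empty(); // every peer in the organization was tried
    }
}
```

This covers a policy that needs one endorsement per organization. A policy that needs endorsements from several organizations would repeat the same loop for each organization and collect the results.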

Chaincode errors

Chaincode to chaincode communication error

There are applications where the chaincode business logic requires communication with other chaincode to enforce a rule or logic.

Problem summary

Chaincode to chaincode communication can fail for several reasons, such as unavailability of a particular chaincode on a given channel or the dependent chaincode not being ready to accept requests. Some of the errors are nonretryable, such as INVALID CHAINCODE NAME, INVALID CHAINCODE VERSION, INVALID ARGUMENTS, and CHAINCODE_UNAVAILABLE. You should handle other chaincode to chaincode communication errors by having retry logic on the client side.

Error message:

Error: INVOKE_CHAINCODE failed: transaction ID: f6ab6dbd747ddf25ebfe158eea5a9b0b7478d6a031b66a08a4f3c2f02fe2f7fe: cannot retrieve package for chaincode test/1.0, error open /var/hyperledger/production/chaincodes/test.1.0: no such file or directory

This message indicates that the chaincode container is unavailable. However, if you have ascertained that the chaincode is installed and instantiated on the peers and is only unavailable because of a startup delay in the dependent container, then you can implement customized handling at the application resiliency layer to recover from the transient failure.

To handle chaincode to chaincode communication errors, the caller chaincode can propagate unique error codes, and the client application can be programmed to identify and classify these as retryable errors based on those codes. You can retry with exponential backoff for recovery.
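On the client side, classifying errors by a propagated code might look like the sketch below. Everything about the convention is an assumption made for illustration: the bracketed prefix format, the class ChaincodeErrorCodes, and the code names CC_DEPENDENCY_NOT_READY and LEDGER_OPERATION_SLOW are hypothetical, and your chaincode would need to emit messages in whatever format you agree on.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical client-side classifier for error codes propagated by chaincode. */
final class ChaincodeErrorCodes {

    // Assumed convention: the chaincode prefixes its error message with "[CODE]".
    private static final Pattern CODE_PREFIX = Pattern.compile("^\\[([A-Z0-9_]+)\\]");

    // Hypothetical codes that the application has agreed to treat as retryable.
    private static final Set<String> RETRYABLE = new HashSet<>(
            Arrays.asList("CC_DEPENDENCY_NOT_READY", "LEDGER_OPERATION_SLOW"));

    private ChaincodeErrorCodes() {}

    /** Extracts the code from a message such as "[CODE] details", if one is present. */
    static Optional<String> extractCode(String message) {
        if (message == null) {
            return Optional.empty();
        }
        Matcher m = CODE_PREFIX.matcher(message);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /** Messages without a recognized retryable code are treated as nonretryable. */
    static boolean isRetryable(String message) {
        return extractCode(message).map(RETRYABLE::contains).orElse(false);
    }
}
```

Defaulting to nonretryable when no code is found is a deliberate choice in this sketch: an unknown error is more likely to be a usage problem than a transient one, and retrying it would only delay the failure.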

No LedgerContext found error

Some errors are not thrown as part of the validation phase (VSCC) but occur during the execution phase. There could be multiple reasons for these errors, and they often happen if any of the operations on the ledger take longer than expected.

Error message:

ERRO 09f [ddc81d1b] Failed to handle PUT_STATE. error: no ledger context runtime.goexit /opt/go/src/runtime/asm_amd64.s:1333 PUT_STATE failed: transaction ID: ddc81d1bcb69eecd6c6bbcf85ba16b2168486d4b232ef3c03fe5bbc7bb2adea1 github.com/hyperledger/fabric/core/chaincode. runtime.goexit

These error conditions can be handled in chaincode: the chaincode propagates a unique error code, and the client application is programmed to identify and classify these as retryable errors based on that code. You can retry with exponential backoff for recovery.

ValidationCode=17 and ValidationCode=18 errors

Some chaincode invocations can fail with error validationCode=17 or validationCode=18. validationCode=17 indicates EXPIRED_CHAINCODE and validationCode=18 indicates CHAINCODE_VERSION_CONFLICT.

Problem summary

The transaction was submitted on an older chaincode container that later expired because a new container became available. During the validation phase (VSCC), the transaction gets rejected.

Error message:

Nov 22 16:32:52  peerxxxxxxx-r2yyyy  peer  2021-11-22 11:02:52.180 UTC [valimpl] preprocessProtoBlock -> WARN fdd59 Channel [XXXXX]: Block [22596691] Transaction index [3] TxId [2224af5907927c60fad8b39f82537620b89d438ee5c44229533a3779985b833f] marked as invalid by committer. Reason code [EXPIRED_CHAINCODE]

You can program the client to identify and classify these as retryable errors. You can retry with exponential backoff for recovery.
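A small lookup can keep the retry decision for validation codes in one place. The sketch below covers only the four codes discussed in this post. The class name ValidationCodes is hypothetical, and treating every other code as nonretryable is an assumption of this sketch, not a rule from Fabric.

```java
/** Hypothetical lookup for the validation codes discussed in this post. */
final class ValidationCodes {

    private ValidationCodes() {}

    /** Returns true for the codes this post classifies as retryable. */
    static boolean isRetryable(int validationCode) {
        switch (validationCode) {
            case 11: // MVCC_READ_CONFLICT
            case 12: // PHANTOM_READ_CONFLICT
            case 17: // EXPIRED_CHAINCODE
            case 18: // CHAINCODE_VERSION_CONFLICT
                return true;
            default:
                return false; // assumption: anything else needs its own handling
        }
    }

    /** Human-readable name for logging; unknown codes are reported by number. */
    static String describe(int validationCode) {
        switch (validationCode) {
            case 11: return "MVCC_READ_CONFLICT";
            case 12: return "PHANTOM_READ_CONFLICT";
            case 17: return "EXPIRED_CHAINCODE";
            case 18: return "CHAINCODE_VERSION_CONFLICT";
            default: return "VALIDATION_CODE_" + validationCode;
        }
    }
}
```

The commit event handler can call isRetryable with the validationCode from the notification and hand retryable failures to the backoff logic, while logging describe for anything it gives up on.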

Summary

These are some of the major types of errors that you might run into in any blockchain-based application. By understanding these points of failure ahead of time, you can have the right configurations, error handling, and recovery strategies in place from the design phase to deployment. This is key to building a resilient architecture.
