3 posts tagged with "Infrastructure"

Setting up infrastructure for Antithesis run

Antithesis tools for Cardano Community

· 8 min read
Arnaud Bailly
Project Lead

The next Node diversity workshop, where Cardano developers who focus on building various kinds of nodes will meet in Toulouse, is just around the corner. It therefore seemed a good idea to share an update on the Antithesis project initiated by the Cardano Foundation, especially as representatives from Antithesis and the teams working on this project will attend the workshop!

Although it's been a while since we shared news in our latest blog post, the project hasn't stood still. We have made progress on a number of fronts and are ready for the next phase, namely its transformation from an experiment restricted to a small team into a community-led resource.

Testing cardano-node with Antithesis

Since May we have been running Antithesis on a cluster of 5 cardano-nodes as part of the Continuous Integration process of the project, leveraging the existing GitHub Action to trigger test runs. Those test runs use the latest available versions of the cardano-node Docker image, built from the master branch, and check a limited number of basic properties on top of the ones provided by Antithesis out-of-the-box:

  • The main property checked is implemented in the eventually-converged.sh script and, as its name implies, checks that all nodes eventually reach the same tip. The fact that this is an eventually command is critically important: it guarantees that the check is run, and its status recorded, at the end of the test run, once no more faults are being injected. It took us a while to refine the property and work with the Antithesis team to ensure it does not produce false positives.
  • We also check two Never properties related to the logs emitted by cardano-node: that it never emits Error logs, and that it never emits Critical logs.
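
As a rough illustration of these two kinds of properties, here is a minimal Python sketch of the logic only. It is not the project's eventually-converged.sh script: the function names, the representation of a tip as a (slot, block hash) pair, and the JSON log format with a "severity" field are all assumptions made for the example.

```python
import json

# Severities that the "Never" properties forbid (names taken from the text).
FORBIDDEN = {"Error", "Critical"}


def all_converged(tips):
    """Return True when every node reports the same tip.

    `tips` maps a node name to a (slot, block_hash) pair. This shape is an
    assumption for the example, not the format used by the real script.
    """
    return len(tips) > 0 and len(set(tips.values())) == 1


def forbidden_log_lines(lines):
    """Yield log lines whose (assumed) "severity" field is Error or Critical."""
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore output that is not a JSON log entry
        if isinstance(entry, dict) and entry.get("severity") in FORBIDDEN:
            yield line
```

In this sketch the convergence check would be evaluated once, after fault injection has stopped, whereas the log check would have to hold for every line emitted during the run.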

Note that by default Antithesis checks a number of other properties, for example that no container exits unexpectedly, or that running processes never use more than 95% of the available memory.

The team's effort has not been particularly focused on finding bugs, but we still managed to report a couple of potentially "interesting" issues to the core developers: One issue related to DNS resolution failure timeout, and a potential race condition in the shutdown/startup sequence.

Tracking test runs on-chain

The main goal of this project has always been to open access to the Antithesis resource to the Cardano community, in a way that is transparent, fair, and as easy as possible. And what's better than a blockchain like Cardano to manage access to a shared resource transparently? Furthermore, as developers of Cardano systems, it seems quite logical and in line with the principle of dogfooding to insist on using Cardano itself in our development workflow.

Once convinced about the capabilities of the tool, we started designing and building a Cardano-based system that would allow developers to "easily" request test runs for their particular software stack, and receive results and reports once produced by the platform. Version 0.2.0.0 of the command-line tool called anti-cli has just been released, and it provides the core features needed to manage the lifecycle of Antithesis tests for any system hosted on GitHub [1].

Anti-cli leverages the Merkle-Patricia Forestry Service (MPFS) as a backend service to store data off-chain with proofs verifiable on-chain. This service is in turn based on the excellent Aiken Merkle-Patricia Forestry library.

The following picture illustrates the overall design of this application and how it interacts with the chain and the Antithesis service:

[Figure: anti-cli interactions]

The developer, here Alice, uses anti-cli to post test run requests and retrieve test run results from the MPFS service. The MPFS service is used to compute the unsigned transactions the parties need in order to operate, and to index the blockchain in order to track and serve the state of the interaction. It can be run locally, giving it access to a node, or parties can use a remote instance (see mpfs repository).

On the other side, the Agent is responsible for mediating and curating run requests, forwarding those requests to the Antithesis service, retrieving the test results, and finally publishing the results on-chain for Alice's consumption. Note that because of the sensitive nature of test results, the report's URL posted on-chain is encrypted using Alice's public key, which in turn requires her to record this key as part of the registration process.

More details can be found in the anti-cli documentation and particularly in the requester role document.

Testing Amaru in Antithesis

Amaru is an alternative node implemented in Rust which is being actively developed by a small but dedicated team under the umbrella of PRAGMA, a Member-Based Organisation dedicated exclusively to fostering open-source projects. As such, it's a prime candidate to prove Antithesis' relevance for the development of core systems in Cardano besides the cardano-node itself.

It took the team some time to get there, as there were quite a few moving parts to get right in order to run a cluster of cardano-nodes and Amaru nodes [2], but we finally managed to run our first Antithesis test in the past couple of weeks.

The setup is pretty simple: We spin up a network of 5 block producing cardano-nodes and 2 Amaru nodes in client mode, with one of the Amarus connected to a cardano-node and the other one to the previous Amaru node. While the test run itself wasn't successful, which is expected given the current state of development of Amaru, this is nevertheless a huge step forward for us.
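
The shape of this network can be written down as a small Python structure. This is only an illustrative sketch: the node names are hypothetical, the choice of which cardano-node the first Amaru node follows is an assumption, and the actual configuration lives in the Amaru codebase.

```python
# Hypothetical node names; only the shape of the network comes from the text.
PRODUCERS = [f"cardano-node-{i}" for i in range(1, 6)]

# Each Amaru node runs in client mode and follows exactly one upstream peer.
UPSTREAM = {
    "amaru-1": "cardano-node-1",  # which cardano-node is used is an assumption
    "amaru-2": "amaru-1",         # the second Amaru follows the first one
}


def path_to_producer(node, upstream=UPSTREAM, producers=PRODUCERS):
    """Follow upstream links from `node` until a block producer is reached."""
    path = [node]
    while path[-1] not in producers:
        nxt = upstream.get(path[-1])
        if nxt is None or nxt in path:
            raise ValueError(f"{node} has no route to a block producer")
        path.append(nxt)
    return path
```

Calling path_to_producer("amaru-2") walks through amaru-1 before reaching a block producer, which makes explicit that the second Amaru node only ever sees blocks relayed by another Amaru node.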

Details of the setup can be found in the Amaru codebase. This initial success paves the way for quickly pinpointing the main shortcomings in Amaru and targeting development effort accordingly, first to make sure Amaru can be a reliable relay and then to turn it into a full-fledged block producer.

Adversarial node & network

A key feature of Antithesis is its ability to inject faults in the System-under-test. Currently, Antithesis can trigger various kinds of network-related faults, like dropped or delayed packets, disconnections, and split-brains; CPU and memory throttling to simulate a sudden limitation of resources; and random crashes. But as clearly stated in the documentation page, those faults are very generic: they simulate faults from the environment, not really adversarial behaviour. As we are the ones who know our system best, it rests on us to produce more interesting faults.

Following initial discussions with the Consensus team, we have started work on an Adversarial Node. This is intended to be an extensible and configurable application that can be triggered by the Antithesis platform during tests to emulate adversarial behaviour at the protocol and logic level. Our very first "adversary" is quite mild in its "attacks" on nodes, as it merely tries to open several connections to a node and start synchronising the chain from random points in the past. Its goal is to expose potential issues in the way a node tracks chain followers, e.g. memory leaks, concurrency issues, or network resource allocation problems.
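
To make the behaviour of this first adversary more concrete, here is a minimal Python sketch of how the starting points could be chosen. It is not the Adversarial Node's code: the function name, its parameters, and the representation of chain points as (slot, block hash) pairs are hypothetical, and the actual network connections and chain synchronisation are left out entirely.

```python
import random


def plan_sessions(known_points, connections, seed=None):
    """Pick one random past chain point per connection to start syncing from.

    `known_points` is a list of (slot, block_hash) pairs. How the adversary
    learns about them is out of scope for this sketch.
    """
    if connections < 1:
        raise ValueError("need at least one connection")
    if not known_points:
        raise ValueError("need at least one known chain point")
    rng = random.Random(seed)  # a fixed seed makes a plan reproducible
    return [rng.choice(known_points) for _ in range(connections)]
```

Passing an explicit seed is a design choice made for the example: in a deterministic testing setting, being able to replay the same sequence of starting points helps when reproducing an issue.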

A key design constraint of the Adversary is that it should only act through Cardano protocols, without any prior knowledge of the implementation of the nodes it connects to. This makes it possible to use the Adversary in any valid Cardano network and to run it against any piece of software implementing Cardano protocols, whether at the network, consensus, or even ledger level [3].

Conclusion & Future work

While the High-Assurance Lab team at the Cardano Foundation has made great progress in turning what started as an experiment and proof-of-concept a few months ago into a full-fledged usable project, there obviously remains a lot to be done. Our end goal is to build tools and systems that benefit the Cardano ecosystem as a whole by providing easy-to-use, transparent, and safe access to sophisticated testing tools, empowering teams with state-of-the-art system-level testing infrastructure.

But we also want these tools and systems to be owned by the community in order to ensure adequate funding and proper governance of something we believe can increase the level of quality, resilience, and therefore confidence in Cardano. That's why we are eager to engage with "node builders" around the globe, and to collaborate with all teams and projects building core infrastructure components and applications to make the most out of this effort.

Footnotes

  1. We have tested the tool not only internally within the team, but also using Amaru as our guinea pig project.

  2. For more details on this journey, one can consult the team's journal.

  3. We have good hopes this adversary will be useful in other contexts, in particular within the context of Tartarus.

Antithesis Proof-of-Concept Results

· 9 min read
Arnaud Bailly
Project Lead

Since the start of our project, Cardano Foundation and Antithesis have been busy working on a Proof-of-Concept aimed at evaluating Antithesis's capability to enhance Cardano's testing infrastructure. The results have been promising, with the platform successfully identifying both previously known bugs and new ones in the cardano-node.

This blog post outlines our journey and findings so far, and sketches out plans for the short and medium term in order to ensure this project delivers the most value for the Cardano community.