Communication demands are usually the leading factor that defines the efficiency of operations on a read/write shared memory emulation in the message-passing environment. In the quest to minimize communication demands, the algorithms proposed so far either impose restrictions on the system or incur high computation demands. As a result, such solutions may not be suitable for use in practice.

In this paper we focus on the practicality of implementations of atomic read/write shared memory emulation in the message-passing environment. In particular, we investigate implementations that reduce both communication and computation demands. We first examine the shortcomings of the two best known algorithms (in terms of communication demands) that implement single-writer multiple-reader (SWMR) atomic memory. Algorithm ccFast, proposed by A. Fernández et al., achieves optimal communication by allowing each operation to complete in one round trip, with light computation requirements. Unfortunately, it relies on strict limitations on the number of readers. Algorithm OhSam, on the other hand, imposes no restrictions on the system, but provides operations that require one and a half communication rounds. In light of these shortcomings, we present two algorithms that implement multi-speed operations with light computation and without imposing any restriction on the system. In particular, algorithm ccHybrid adopts fast (one-round) writes and makes clients switch to a slow (two-round) mode whenever the system is congested. Algorithm OhFast, on the other hand, pushes the responsibility for deciding when to switch speed to the servers. This allows the algorithm to use fast operations and to fall back, whenever necessary, on the slow one-and-a-half-round operations of the algorithm presented by T. Hadjistasi et al. We prove that both new algorithms preserve atomicity. To evaluate the new algorithms, we implement five different atomic memory algorithms in the NS3 simulator and compare their performance in terms of operation latency and the ratio of slow to fast operations performed. We test the algorithms over different (i) topologies and (ii) operation loads. Our results indicate that the newly presented algorithms increase the practicality of atomic read/write shared memory implementations in the asynchronous message-passing environment.