Clusters of commodity computers are currently being used to provide the scalability required by several popular Internet services. In this paper we evaluate the performance of an efficient cluster-based WWW server as a function of the characteristics of the intra-cluster communication architecture. More specifically, we evaluate the impact of processor overhead, network bandwidth, remote memory writes, and zero-copy data transfers on the performance of our server. Our experimental results with an 8-node cluster and four real WWW traces show that network bandwidth affects the performance of our server by only 6%. In contrast, user-level communication can improve performance by as much as 29%. Low processor overhead, remote memory writes, and zero-copy data transfers each make a small contribution towards this overall gain. To extrapolate from our experimental results, we use an analytical model to assess the performance of our server under different workload characteristics, different numbers of cluster nodes, and higher-performance systems. Our modeling results show that higher gains (of up to 55%) can be accrued for workloads with large working sets and for next-generation servers running on large clusters.