Perhaps the greatest impediment to reaching this goal is the vagaries of the underlying infrastructure that carries the communication between game clients and servers: the Internet. The path to the goal of perfection is washed away by two primary effects in game communication over the Internet:
- Throughput limitations - the connection between server and client has some maximum throughput.
- Network delays - even with unlimited throughput, server messages do not arrive instantly.
The second, network delays, means that regardless of the available bandwidth, there is always a delay between the server sending a message and the client receiving it, and vice versa. In effect, the information in the message is time-shifted to a later wall-clock time: an event may happen at actual time X, but by the time the client receives the message describing the event, the actual time is perhaps X+2. Where time-shifting is present, causality violations rear their head: events may seem to occur in the wrong order for game clients. For example, a player may open a door that is no longer there in the game world because it was destroyed by a grenade (and seen as such by another online player), an event the first player's client was not yet aware of due to the delays in message arrival. Other players might see the first player's avatar "opening" a door that isn't even there.
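To make the time-shift concrete, here is a minimal sketch of the door example. The event names, times, and delay values are invented for illustration: each message becomes visible to a client only at the event's actual time plus that message's network delay, so unequal delays can reorder events.

```python
# A minimal sketch of the door example above. The times and delays are
# invented for illustration; no real game code is implied.

events = [
    # (description, actual time X, network delay to this client)
    ("grenade destroys door", 100.0, 2.0),   # this message is heavily delayed
    ("player opens door",     100.5, 0.1),   # this message arrives promptly
]

# The client perceives each event at X + delay; sort by perceived time.
for name, x, delay in sorted(events, key=lambda e: e[1] + e[2]):
    print(f"t={x + delay:6.1f}  {name}")

# Output:
# t= 100.6  player opens door
# t= 102.0  grenade destroys door
# This client watches an avatar "open" a door that was destroyed first.
```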
We will review the details of these problems, the effects they have on players of online games, and the techniques commonly used to minimize the unfairness between players that can result if these problems are left untreated. This blog entry covers the definitions and details of the problems; part II will cover the effects they have on players and the techniques games use to minimize their impact.
Throughput and Latency:
The throughput requirements for online PC games vary widely, but in general are far below the available bandwidth of the typical client (we will only discuss the client side and its impact; running a server for a game obviously dictates a much higher aggregate bandwidth requirement). Recent studies (Feng 2002 & 2005) using games such as Counter-Strike, Day of Defeat, Medal of Honor: Allied Assault, and Unreal Tournament 2003 showed a client load ranging from 6,400 bps to 8,000 bps for client-to-server packets and 20,800 bps to 36,000 bps for server-to-client communication. These rates are far below even the lower-tier ISP services typically used by online gamers.
Congestion of the network may cause throughput to drop to a level that is insufficient for smooth game play. Congestion typically occurs in one of three primary areas: the last mile near the user, the middle or Internet cloud, and the last mile on the server side.
In the case of user-side congestion, the player may simply have a service tier that does not provide sufficient bandwidth. This can of course be remedied with a service upgrade. At a minimum, a service with 600 kbps down and 50 kbps up should suffice. The faster downlink speed, while not strictly required to play online, will ensure faster downloads of game items such as server-hosted maps.
The gamer should also ensure that other activities on their local network are not causing congestion. Other users gaming, streaming audio or video, running services such as torrents, and so on can all reduce the broadband bandwidth available to the player.
Problems in the last mile on the server side can be caused by too many players joining a specific game server, causing a bottleneck on the network link to that server. Game servers typically employ a player count limit to avoid this occurrence. Any other congestion in this link of the game communication network (router congestion or failure modes, etc.) is likely to be out of the control of both the player and their server provider.
Congestion in the Internet cloud is usually transient: perhaps a major sporting event is being streamed by large numbers of viewers. As with most last-mile issues on the server side, these are out of the control of the server provider and game player. In cases where Internet cloud congestion is the cause of game play issues, the only remedy is to wait until the problem "goes away".
Any kind of congestion, whatever the cause, can degrade throughput in ways that adversely affect the consistency of game play. If the game client is starved of message packets, whether due to raw throughput limits or congestion-related throughput issues, the synchronization between client and server will be lost, resulting in "laggy" game play, "rubber-banding", and other temporal effects. Severe throughput problems can result in the game client "giving up" and disconnecting from the game server.
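As a rough sketch of that last behavior (the timeout value and function names are assumptions for illustration, not any specific engine's logic), a client typically tracks the time since the last server packet arrived and abandons the session once a threshold is crossed:

```python
import time

# Sketch of a client-side starvation watchdog. TIMEOUT_S and the function
# names are illustrative assumptions, not taken from any particular engine.

TIMEOUT_S = 10.0                 # assumed "give up" threshold
last_packet_time = time.monotonic()

def on_packet_received():
    """Record the arrival time of each server packet."""
    global last_packet_time
    last_packet_time = time.monotonic()

def connection_alive():
    """Check once per client frame; False means disconnect from the server."""
    return (time.monotonic() - last_packet_time) <= TIMEOUT_S
```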
There is no commonly agreed-upon definition of latency (Delaney 2006). The latency of a network is typically measured using the ping command. This, however, measures not the one-way trip from client to server or vice versa but the round-trip time. Since the routes from client to server and from server to client are usually asymmetric, simply halving a ping measurement may be grossly inaccurate and provide incorrect information for making client and server timing decisions. In addition, such a measurement does not account for processing and other delays at the client and server endpoints.
A more useful measurement is the endpoint-to-endpoint measurement of latency that accounts for time needed for client-side processing, bi-directional network delay, and server-side processing (Stead 2008).
This is important: studies have found that much of the delay in the overall game processing loop is caused by the game client's handling and processing of messages.
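One way to approximate such a measurement without synchronized clocks is an echo probe in which the server reports its own processing time; the sketch below assumes exactly that (the `send`/`receive` callables and the reply format are illustrative assumptions, not a real game API):

```python
import time

# Sketch of an echo-style endpoint-to-endpoint probe. Assumes the server
# echoes each probe back along with the time it spent processing it.
# `send` and `receive` stand in for whatever transport the game uses.

def measure_endpoint_latency(send, receive):
    t0 = time.monotonic()        # clock starts before client-side work
    send(b"probe")
    reply = receive()            # blocks until the echo returns
    t1 = time.monotonic()
    total = t1 - t0              # client processing + both network legs
                                 # + server processing, unlike a bare ping
    server_processing = float(reply.decode())   # assumed reply format
    return total, server_processing
```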
The sources of network delay fall into four basic categories (Kurose 2009):
- Transmission delay: the time to push all of the packet's bits onto the physical link.
- Queuing delay: the time the packet spends waiting in a buffer before it can be sent on a link.
- Processing delay: the time routers along the route spend examining and handling the packet.
- Propagation delay: the time the packet spends in the physical link (bounded by the speed of light).
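Following Kurose, these four components sum to the total delay a packet experiences at each node along its route:

$$d_{\text{nodal}} = d_{\text{proc}} + d_{\text{queue}} + d_{\text{trans}} + d_{\text{prop}}$$

where $d_{\text{trans}} = L/R$ for a packet of $L$ bits on a link of rate $R$ bits per second, and $d_{\text{prop}} = d/s$ for a link of length $d$ with signal propagation speed $s$. The end-to-end delay is the sum of the nodal delays over every hop on the route.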
Queuing delay can occur at routers along the path of the packets. If a router is under heavy utilization or the required outbound link is busy, the packet is queued in a buffer until it can be sent.
Processing delay is also incurred at routers, since these must handle routing table lookups, possible firewall rule application, and packet checksum and error checking.
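A toy single-link simulation makes the relationship between utilization and queuing delay visible (the link rate, packet size, and arrival rates are arbitrary illustrative numbers, not measurements): as arrivals approach the link's service rate, the average wait in the buffer grows sharply.

```python
import random

LINK_RATE = 1_000_000                    # bits/s: a 1 Mbit/s outbound link
PACKET_BITS = 12_000                     # 1500-byte packets
SERVICE_TIME = PACKET_BITS / LINK_RATE   # 12 ms to transmit one packet

def mean_queue_delay(arrivals_per_s, n_packets=100_000, seed=1):
    """Average queuing delay for Poisson arrivals at the given rate."""
    rng = random.Random(seed)
    t = 0.0                 # current arrival time
    free_at = 0.0           # time the link next becomes idle
    total_wait = 0.0
    for _ in range(n_packets):
        t += rng.expovariate(arrivals_per_s)   # next packet arrives
        wait = max(0.0, free_at - t)           # time spent in the buffer
        total_wait += wait
        free_at = max(free_at, t) + SERVICE_TIME
    return total_wait / n_packets

for rate in (20, 60, 80):   # packets/s; this link saturates near 83 pkt/s
    print(f"{rate:3d} pkt/s -> {mean_queue_delay(rate) * 1000:6.1f} ms average wait")
```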
Lastly, even if the delays due to processing overhead, transmission, and queuing could be eliminated, we are still bound by the laws of physics. No signal can travel faster than light (2.998x10^8 m/s in vacuo), and speeds in actual transmission media are lower (e.g., 1.949x10^8 m/s in typical optical fiber, and significantly lower in twisted-pair copper). This means we are bound by an absolute minimum round-trip latency of roughly 2 ms, client endpoint to server endpoint and back, for a client-to-server distance of 200 km.
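As a quick check of that bound, here is a sketch using only the propagation speeds quoted above:

```python
# Minimum round-trip propagation delay, ignoring every other source of delay.
# Speeds are the figures quoted in the paragraph above.

C_VACUUM = 2.998e8      # m/s, speed of light in vacuum
C_FIBER  = 1.949e8      # m/s, typical optical fiber

def min_round_trip_ms(distance_m, signal_speed_m_s):
    return 2 * distance_m / signal_speed_m_s * 1000

print(f"fiber:  {min_round_trip_ms(200_000, C_FIBER):.2f} ms")   # ~2.05 ms
print(f"vacuum: {min_round_trip_ms(200_000, C_VACUUM):.2f} ms")  # ~1.33 ms
```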
Jitter:
Jitter is the variation in network latency caused by changes in the state of the network. Packets that comprise the communication between the game client and server seldom follow the exact same route endpoint to endpoint. This can cause packets to have different latencies. In addition, network congestion can result in changes in the routing and router buffering behavior, changing the queuing delays for the affected routers.
We can visualize this effect with the aid of a diagram.
In this diagram, packets are sent at regular intervals (time ticks) from the server, represented by the lower solid line, to the client, represented by the upper solid line. If we could construct a network with none of the four causes of latency outlined above, and in addition discovered a way to violate the laws of physics and send our packets at infinite speed, we would get the green line: there is no latency between the server sending a packet and the client receiving it.
The more realistic example is represented by the blue line, which shows the slight delay the packet experiences traversing the network from server to client. The orange line depicts the next packet in the sequence, delayed by the same amount as the blue line's packet. In an ideal world, the latency from server to client and vice versa would exhibit this constancy. This would simplify any latency "compensation" the game developers might wish to employ, and even without compensation, humans tend to adapt more easily to latency in a game when it is relatively constant, even when the latency is rather large (Claypool 2006).
More typically, the game packets experience changes in latency from routing and congestion problems. This is illustrated with the final train of three packets colored red, magenta, and dark brick red. For these packets, any semblance of packet arrival at relatively regular time ticks is completely lost.

There is currently no standard measure for jitter in game traffic. Jitter in networks tends to exhibit randomness, but can be characterized by a Gaussian distribution for inter-packet arrival times (Perkins 2003). Since we are bounded by some minimal amount of processing, queuing, and transmission delay, in addition to the absolute bound due to propagation delay, the actual distribution is biased: there is some absolute minimum that can be realized, and network congestion and related issues can skew delays upward. This is illustrated in the following graph.
The fit is sufficient that we can use this model for predicting the likelihood of specific inter-packet times for use in the design of compensatory mechanisms for games.
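A sketch of that biased model follows (every parameter value here is invented for illustration, not fitted to a real trace): draw inter-packet times from a Gaussian centered on the nominal send interval, rejecting any draw below a hard floor, which shifts and skews the distribution just as described.

```python
import random

TICK_MS   = 50.0    # assumed nominal server send interval
JITTER_SD = 8.0     # assumed standard deviation of the latency variation
FLOOR_MS  = 40.0    # assumed minimum realizable inter-packet time

def sample_interarrival(rng):
    """One inter-packet arrival time from the truncated-Gaussian model."""
    while True:
        t = rng.gauss(TICK_MS, JITTER_SD)
        if t >= FLOOR_MS:          # reject draws below the physical floor
            return t

rng = random.Random(42)
samples = [sample_interarrival(rng) for _ in range(10_000)]
print(f"mean = {sum(samples) / len(samples):.1f} ms, min = {min(samples):.1f} ms")
# The rejection floor biases the mean above TICK_MS and skews the tail,
# so predictions should come from the truncated fit, not a plain Gaussian.
```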
In part II of Doing the Jitterbug, we will investigate what effects these issues have on game play, and what techniques can be used to minimize these effects.
Interested readers can find references for further study after the jump break.