Soft Timers: Efficient Microsecond Software Timer Support for Network Processing

Paper by: Mohit Aron and Peter Druschel

Presented by: Billy Cheung



Presentation Slides: Here


Discussion Recap:

Question:
A question was posed as to whether periodic timers can be removed from the equation entirely: we would rely only on one-shot timers that are reset as needed, and use soft timers for everything else.

It was then pointed out that this has, in fact, already been done. One rationale behind it is that energy efficiency is a major concern for mobile devices. While the device is idle, it makes little sense to have the timer keep ticking away. Instead, it is better to wake up only when there are events to be handled.
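The idea of dropping the periodic tick can be sketched as a small timer queue that only asks for a wakeup when an event is actually pending. This is a toy model rather than any real kernel's implementation; the class `OneShotTimerQueue` and its methods are hypothetical names introduced for illustration.

```python
import heapq


class OneShotTimerQueue:
    """Toy model of a tickless timer queue (all names are hypothetical)."""

    def __init__(self):
        self._events = []  # min-heap of (deadline_us, sequence, callback)
        self._seq = 0      # tie-breaker so callbacks are never compared

    def schedule(self, deadline_us, callback):
        heapq.heappush(self._events, (deadline_us, self._seq, callback))
        self._seq += 1

    def next_wakeup(self):
        """Deadline to program into the one-shot timer, or None when idle."""
        return self._events[0][0] if self._events else None

    def fire(self, now_us):
        """Run every event whose deadline has passed; return how many ran."""
        ran = 0
        while self._events and self._events[0][0] <= now_us:
            _, _, callback = heapq.heappop(self._events)
            callback()
            ran += 1
        return ran
```

When `next_wakeup()` returns `None`, nothing is scheduled and the timer would simply not be armed, which is the energy argument made above.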

Question:
The question of whether any operating systems actually employ soft timers was raised (as well as e-mailed).

As far as we know, none do. There was some talk of Linux implementing them, but we were unable to confirm this at the time of the discussion.

Question:
A question was raised about the compatibility of rate-based clocking. For example, how would communication work between a system that uses rate-based clocking and one that uses the standard TCP/IP protocol?

The answer is that it wouldn't really matter. The rate-based side would send packets to the TCP/IP side at a roughly constant rate, and the TCP/IP side would behave as usual: depending on whether any packets were lost, it would either double its rate or have its rate cut and go through slow start again.
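The behaviour of the standard TCP side described above can be sketched with a simplified, Tahoe-style window update. This is an assumption-laden model, not the code of any real stack: the function name `next_cwnd`, the per-round-trip granularity, and the congestion-avoidance branch (which the discussion did not cover) are all introduced here for illustration.

```python
def next_cwnd(cwnd, ssthresh, loss):
    """Return (cwnd, ssthresh) in segments after one round trip.

    Simplified Tahoe-style model: a loss sends the sender back to slow start.
    """
    if loss:
        # Rate is cut: remember half the old window, restart from one segment.
        return 1, max(cwnd // 2, 2)
    if cwnd < ssthresh:
        # Slow start: the window doubles every round trip.
        return min(cwnd * 2, ssthresh), ssthresh
    # Congestion avoidance: grow by one segment per round trip.
    return cwnd + 1, ssthresh
```

Starting from `cwnd = 1` and `ssthresh = 8`, three loss-free round trips take the window to 2, 4, and 8 segments; a later loss drops it back to 1.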

Question:
As mentioned in the presentation/paper, one of the trigger states that the authors included is right after a system call. What is the benefit of checking for events there instead of, say, before the call?

When a system call is made, chances are that parameters are passed to it. If we handle events before we handle the system call, we have to save those parameters, unload them, and reload them later, which causes overhead. While one could argue that some state would have to be saved and restored after the call anyway, it is more likely that the registers have already been modified in the course of the call.
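The placement of the check can be illustrated with a small user-level simulation. This is only a sketch of the idea under discussion, not the paper's kernel code; `SoftTimerFacility`, `at_trigger_state`, and `system_call` are hypothetical names, and the clock is passed in as a plain function so the example stays self-contained.

```python
class SoftTimerFacility:
    """Toy soft-timer facility (hypothetical names, user-level simulation)."""

    def __init__(self):
        self.pending = []  # list of (due_us, handler)

    def schedule(self, due_us, handler):
        self.pending.append((due_us, handler))

    def at_trigger_state(self, now_us):
        """Run handlers that are due; called whenever a trigger state is reached."""
        due = [e for e in self.pending if e[0] <= now_us]
        self.pending = [e for e in self.pending if e[0] > now_us]
        for _, handler in sorted(due, key=lambda e: e[0]):
            handler(now_us)
        return len(due)


def system_call(facility, clock, work):
    """Model of a system call with the soft-timer check on the return path."""
    result = work()                      # the call consumes its arguments first
    facility.at_trigger_state(clock())   # only then are due events handled
    return result
```

Because the check runs after `work()`, the call's arguments never have to be set aside for an event handler, which is the overhead argument given above. Handlers receive the current time so they can see how late they ran.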

Question:
Another question that was raised was the use of jumbo packets to replace the current 1500-byte Ethernet packets as the standard. Obviously, there would be advantages to it: since networks nowadays can support a heavier load, why not take advantage of this to send data faster?

However, the main issue here is compatibility. Routers break up overly large packets based on their MTUs, which likely do not support a jumbo frame's 9 kB, so the bottleneck this creates outweighs any potential gain.
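A rough calculation shows what happens to one jumbo-sized packet on a path with a 1500-byte MTU. The sketch assumes IPv4 with a 20-byte header and no options, and that fragmentation is permitted; the function name `ipv4_fragments` is made up for this example.

```python
import math

IPV4_HEADER = 20  # bytes, assuming no IP options


def ipv4_fragments(packet_len, mtu, header=IPV4_HEADER):
    """Return (fragment_count, total_bytes_sent) for one IPv4 packet."""
    payload = packet_len - header
    # Fragment offsets are expressed in 8-byte units, so every fragment
    # except the last must carry a multiple of 8 payload bytes.
    per_fragment = (mtu - header) // 8 * 8
    count = math.ceil(payload / per_fragment)
    total_bytes = payload + count * header
    return count, total_bytes
```

Under these assumptions, `ipv4_fragments(9000, 1500)` gives 7 fragments and 9120 bytes in total, so the single large packet turns into seven smaller ones plus extra header bytes and the fragmentation work at the router.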

Note: There was also a brief discussion on rate-based clocking and the fact that it intends to ignore Slow Start, which is one of the most important tools for congestion control. This might explain its lack of popularity.



Compilation of submitted questions:

1. Under what conditions should we prefer soft timers, and under what conditions should we use one-shot timers, or use the two together?

2. I know soft timers could be used to solve the starvation problem. Could they also be used to increase the responsiveness of an idle system?

3. Why does the on-chip timer event have a lower per-event cost compared to an off-chip timer? (One would incur the same amount of context switching overhead in both cases, I think.)

4. What other application classes are well suited to the model of a probabilistic timing guarantee with a ceiling, which is what soft timers offer?

5. Which operating systems have included support for soft timers?

6. What trade-off are you making when using soft timers? Is this concept curiously like user/kernel threads (user/kernel timers)?

7. I've seen soft timers done before 2000. What's particularly novel about this paper that warrants a publication? (answer: extensive benchmarking, emphasis on TCP implementation issues, but what else?)

8. Is this technique used by "sampling" type profilers, like OProfile, gprof or VTUNE?

9. Is this really just triggering timer interrupts off of all hardware interrupts & schedule changes?

10. This is a good paper. However, I don't totally agree with the authors' assumption that hardware timers interrupt on every clock tick. In case nothing is scheduled, the hardware interrupt could go off every n cycles. Do you think my thought is right?

11. I am wondering how soft timers will be affected as hardware changes. Do interrupts always have high overhead?

12. From this paper, I know hardware timers have problems. What would be a better hardware interface design?

13. The paper presented soft timers for scheduling software events but only pointed out two applications, rate-based clocking and network polling. What other applications can take advantage of soft timers?

14. The disadvantage of soft timers is that event handlers may be delayed past their scheduled invocation time. How do you measure this disadvantage?

15. The authors mention hybrid polling/interrupt systems which serve the same purpose as soft-timers, but they don't include much analysis of why their system is better. How do these hybrid systems compare?

16. What other OS services require frequent polling or interrupts?

17. Can we pre-empt the processing of a non-interrupt based soft-timer?

18. This is a totally cool idea. However, when I see performance 'hacks' that couple subsystems together (in this case, the system call and the alarm scheduling systems), I start to get nervous about how programmers will react to them.

19. In this case, I would suspect that a programmer concerned about having her events triggered with high fidelity would start looking for ways to pull tricks with system call timing. Do you think this is a concern, or am I just being paranoid? It is just my experience that if you leave enough room for a systems programmer to swing her hacking pick, she'll use that room.

20. Apart from periodically sending packets, what other uses are there for soft timers? I can't think of anything other than polling some I/O device.

21. TCP has "built-in" network congestion control; doesn't the rate-based clocking scheme for transmission defeat the purpose of using TCP (since we are "ignoring" the ACKs)? Why are the tests run with FreeBSD? Would using another OS such as Linux in the tests deliver significantly different performance characteristics?

22. Is a technique like this one used to stabilize the bandwidth for streaming media over the Internet? (Especially given its lower CPU overhead)

23. When soft-trigger (or any Rate-Based Clocking technique) is used, why isn't throughput increased? I thought lower RTT would increase the bandwidth of the system (provided that the channel is not saturated).

24. Do you think that this technique might be useful for a distributed system that involves intensive inter-machine communication? For example, consider a grid of computers running a scientific simulation (in this type of application the number of system calls is minimal!). Rate-based clocking does not seem applicable, but what about polling?

25. Is it possible that event handling overhead results in a considerable performance reduction in some cases? For example, consider a case in which there are many overdue events in the list after a long trigger interval (due to a large number of connections to the server in that interval).

26. Soft timers require you to know information about the network in order to use them effectively. I don't think it's fair to rely on this because even TCP could avoid a lot of the slow-start problems (or even compressed/big ACKs) if it had some knowledge of the network. Do you really think it's fair to compare performance to TCP, which has no initial knowledge of the network?

27. This whole argument also relies on the usefulness of rate-based clocking, but I'm not convinced it is that useful. If we're given knowledge of the network, can't we also avoid the big ACK and compressed ACK scenarios and skip the slow start? Doesn't QoS on our network avoid these problems? Also, it seems that for the applications they chose, rate-based clocking doesn't perform as well as the standard implementation, so why use it? It just doesn't seem to be motivated well in this paper.

28. Table 5 shows that about 50% of the trigger states are related to network interrupts. The goal of the network polling technique is to avoid network interrupts, and hence the frequent context switching, by using soft timers for polling. I'm a bit unsatisfied with the use of soft timers for network polling, especially when the system is running compute-bound jobs in the background, since, as stated in the paper, in the case of compute-bound jobs there are still enough trigger states (network interrupts) to save the day. However, doesn't this contradict the goal of network polling?

29. Soft timers require modifications to the underlying kernel (adding trigger states), and from the experiments it seems that APIC (on-chip) timers come close to soft timers in performance. My question is whether any OS has gone through the struggle of implementing soft timers.

30. This might be off topic, but in case you know: what causes the huge performance gap between off-chip timers and the APIC? (Is it only the location of the timers (proximity to the chip), or are there other reasons?)

31. Why do you think X (the resolution of the interrupt clock relative to the measurement clock) is a good measure to use for the upper bound on the actual event time?

32. Can rate-based clocking introduce noticeable delays in some cases?

33. Under high loads (such as those WWW servers experience), would you say that soft timers still perform better than standard interrupts (in terms of cache pollution)?

34. I don't understand what the authors mean by the sentence (page 207, second paragraph): "The measured total overhead per event (as calculated from the slowdown of the Web server in the presence of the additional events) was found to be independent of the event frequency in this experiment."

35. Isn't this always true, not just in the experiment the authors mention? I mean, every interrupt is handled by a handler whose execution time is deterministic and O(1), so I don't understand how it can depend on event frequency. However, the total of all overheads would depend on how many events occurred between two trigger states, so I am confused. I hope you can help me on this one.

36. What can be done to allow a compute-intensive program that makes system calls only infrequently to benefit from soft timers?

37. Most of the paper is devoted to motivating the idea by showing how it can be used to improve the performance of network-based applications (via rate-based clocking or network polling). Is this the only software domain where soft timers would be useful?

38. The paper deals with the problem of the high number of network device interrupts at the increased speeds available nowadays by handling them with polling using soft timers. I was wondering whether we could do this in hardware instead. Why is it necessary that the NIC interrupt on the arrival of each packet? Isn't it possible to have a hardware buffer on the NIC? Hmm, it should have one, I suppose.

39. A timer facility is provided in software rather than hardware, with a non-guaranteed time interval between the software timer ticks. Can this be used for, say, non-work-intensive systems such as desktops? But then I guess desktops don't require that fine a granularity of timing. So this technique is limited to high-processing servers, yes? Or in other words, what applications other than network polling require fine-grained timers?

40. Why do they advise using conventional timers for events with granularity below that of the interval timer? Given that the granularity of soft timers will be no worse than that of the interval timer, wouldn't it make sense to use soft timers everywhere for consistency?

41. Does support for soft timers have a negative effect on the performance of applications that do not use them?

42. Soft timers enable a transport protocol like TCP to efficiently perform rate-based clocking, i.e., to transmit packets at a given rate, independent of the arrival of acknowledgment (ACK) packets. For efficiency, and leaving security aside, would it be better for soft timers to support UDP rather than TCP?

43. The paper compared soft timers with other related work, such as Smith and Traw, Mogul and Ramakrishnan, etc., and concluded that soft timers have great performance, low delay, and low overhead. Just out of curiosity, what disadvantages do soft timers have? What about the long latency of soft timer polling?

44. The performance of software timers depends on how precisely events occur (they may be delayed past their scheduled time). A sequence of events triggered with unpredictable delays is said to be handled using measure_time() and the probability distribution d. In a multitasking environment, do you think this distribution will be uniform over time, or will every task be associated with its own software timer?

45. Frequent checks for pending events will impact system performance and may consume more time than is saved by avoiding a signal. Can't checks be inserted at 'regularised' intervals by the compiler instead?