Paper #6
Why Events Are a Bad Idea for High-Concurrency Servers
Presented by
Jean-Sebastien Legare
<jslegare gmail.com>




1. Presentation Slides [PPT] [PDF]


2. Discussion recap

    This paper has two main goals:

1)    To refute the main arguments against the usage of threads in high concurrency models. The paper advanced that the claimed problems so far with threads were a result of the inefficiencies of the actual thread implementations rather than on the underlying thread model itself.
2)    To promote the use of threads over events for their simplicity and ease of use.

    This paper is one in a long series of papers that attempt to end the ongoing debate about which model between threads or events is best. In these papers, the authors take an optimized version of a program written in the model claimed to be better (very often their favorite model), and show that it achieves better results than a similar application written in the other model. This paper is not an exception. They compare an highly tuned threaded web server, Knot,  to Haboob, an event-driven web server and present results showing that their version achieves higher throughput.

    Contrarily to what this paper tries to teach us, the impression that is left after reading it, is that none of the two models in the end can be claimed as the absolute "best" model. They each have their advantages and inconvenients. Rather, the "best" model often turns out to be more a question of which of the two models the programmer finds more natural, or with which model the programmer is most at ease.

   

3. Submitted Questions


We have now seen a paper where a group of people who are very experienced with threads have maligned the use of threads for concurrency, and a paper where a group of people who are very experienced with events have done the same to the event-based model. Could it be that concurrent programming is just inherently difficult, and that these expositions are merely a manifestation of familiarity breeding contempt?

The 'compiler support' that the authors speak of consists entirely of static analysis: That is, the compiler should signal to the programmer when they've done something silly. What about adding extra constructs to the language itself to support threaded programming? Could that not be equally useful?

In section 2, the authors invoke the Lauer-Needham duality to 'prove' that certain components of the event-driven model could be implemented in the threading model. (See especially 2.2 under the heading "Scheduling"). However, no discussion is given about the actual practicality of such an implementation. Is this simply a theoretical equivalence, or is it actually the case that the implementation could be done easily and robustly?
Michael DiBernardo


If the problems with threads are not "intrinsic" to the model (as the
authors suggest) then why do the same problems seem to be so pervasive?

Knot was tested only in terms of bandwidth.  Although the score was
impressive, what does this imply in the measure of fairness?

Knot has a very small codebase.  What does this mean in relation to the
more sophisticated SEDA model in terms of the fairness of comparisons and
the relative merits of the two approaches?
Dutch Meyer


The authors said the only pattens are less graceful with threads are
dynamic fan-in and fan-out. I want to know the details of the reason for
it. Could you explain it ?

Do you think the scheme provided in this paper has enable threads to
be with  the 4 points mentioned in the first part ? I think this paper
want to solve part of the problem by use of programming language, such
as, compiler and dynamic stack. This perhaps could solve some problems.
However, it perhaps could not solve most of the problems.
Jun  Zhang (Erica Zhang)


Which operations are O(n) in the number of threads?

What is Habood’s strength, compared to Knot? (How about resource management?)
Seonah Lee


What is the problem with cleaning up task states in event driven
model? How can the branches in control flow cause dealocation steps to
miss?

Why does Knot-A (the one that favors new connections over active
ones) behaves the way depicted in Fig 3?Actually I don't understand
why does it go up again after the first fall (between 2048 and 4096).
Ali Bakhoda


Ousterhout vs the authors of "Events Are A Bad Idea" - in a battle to the
death, who will come out victorious?

Do you think that the events vs threads all comes down to which ever paradigm
the programmer is most familiar with?

Suppose that the threads really work as well as the paper states. How would
you cook up a set of benchmarks that would favour event based systems
(without leaving the arena of high-concurrency servers)?

Notice that the subtitle of this paper indicates that this matters mostly
for "high-concurrency servers"). Does the argument of the paper apply to:
modern computer games? embedded systems for space shuttle? embedded systems
for set top boxes? firmware for your home office router? word processor and
traditional applications?
Alfred Yu-Han Pang


What are the O(n) operations in the scheduler that are removed to
lessen overhead of threads? Are they vital?

What is the maximum latency when Knot is used in comparison to that of
Haboob? How is fairness and graceful degradation be ensured here?
wilson fung

 
von Behren mentions that threads are better because "... the
concurrency in modern servers results frm concurrent requests that are
largely independent." (Section 3) I would suspect quite the opposite, do
you think requests are "largely independent?"

It seems the author does a lot of handwaving, particularly about
compiler optimizations for threads (for isntance: "... these problems can
be addressed with furher analyses" Section 4.1), but is there any evidence
to support these claims?
Lloyd Markle


They say that they translate blocking I/O requests to asynchronous
requests, but that for disk I/O these are implemented with blocking
calls performed by a thread pool. Isn't the effect that disk I/O is
still a blocking call that the thread must wait on? Why even bother with
the thread pool?

Doesn't cooperative multitasking generally imply sub-optimal
performance, at least in the case where there are more threads than
processors, because the programmer almost never knows exactly where a
thread should yield? It seems likely that you'll either have an
over-abundance of context switches or else threads will have to wait for
long periods while other threads use the processors.
Sam Davis

What is the benifit of "stack ripping"?

What does cooperative multitasking mean for events?
Aiman Erbad


How do you think the authors could make an estimate on the amount of
stack space needed by a recursive function call in order to have a dynamic
allocation of stack space? Would you happen to know of any solutions?

The authors mention that "the compiler could also warn the programmer
of cases where large amounts of state might be held accross a blocking
call, allowing the programmer to modify the algorithms if space is an
issue". Do you find this a good suggestion to programmers working on big
and convoluted projects?

Do you agree with their opinion that today's thread based systems are
badly implemented so they perform poorly when both high concurrency and
blocking operations are present?
Mirna Limic


Why do threads pressent a better sense of abstraction for high-concurrency servers than events?

What are the improvements proposed in the compiler support for threads?
Nguyet Nguyen


The performance of threads are closely dependent on the the thread library
  and the compiler as well as the underlying operating system.  Is the
  performance of an event-based system more predictable across different
  operating systems and compilers?

The paper mentioned that they created a threading package and a webserver
  that out performs SEDA.  It seems that they have made a specific library
  for a narrow range of applications, do you think that their
  arguement/threading library can be applied to a wider range of
  applications?

PS: Did you noticed that they mentioned a user-level thread package that
scales to 100,000 threads, but in their only data figure (Figure 3) only
performences upto 65,536 are shown
Ivan Sham


In section 3, Exception Handling and State Lifetime, paper says
"Cleaning up task state after exceptions...since the thread stack
naturally tracks the live state for that task". I am not quite familiar
with the threaded system, could you provide a more brief description of
this sentence?

In section 4, the author presented the idea of "Compiler Support for
Threads", but these descriptions about functionalities provided by the
complier are very abstract and really hard for me understand how they
are related to each other. Could you show me an example and draw a
diagram to visualize the process? and I wonder how the "standard
libraries thread unsafe" problems could be solved? Can the complier
remedy this problems?
Haoran Song


What exactly do the authors mean by 'naturally'?  They
employ the concept in more than one place that Threads are more natural
than Events(in terms of use, comprehension, etc.), but never actually
provided any basis (i.e. psychological reports, etc.) that prove their case.

Along that line, one of their supporting claims that
Threads is 'more natural' is that even in their own Ninja system, they
employed threads for complex parts in lieu of using Events.  Isn't that
kinda like saying that threads is good because we used threads because
threads is good so we used threads(ad nauseaum)?
Billy Cheung


The authors suggested several modifications on the current thread implemenations which are mainly inspired from their counterparts in events. (e.g.cooperative threads, synamic stack allocation against stack overflow etc.) In general is it fair to say that authors are suggesting event-like threads for performance improvements, specifically for uniprocessors.

In general it occurs like designing a compiler that`s suited for estimating dynamic stack bounds for dynamic allocation in stacks would be a tedious job, especiallin nested function calls, mutual recursions etc. Are there any implementations of such compilers? If there are how precise are their estimations?
Mehmet Alparsian


With reference to live state management, how does reordering
variables with overlapping lifetimes help ? As in how does it "prevent
live variables from unnecessarily pinning dead ones in memory" ?

Which has higher locality - Batched event processing or Thread
processing ?
Mayukh Saubhasik


Figure 2: It scales to past 100,000 threads/concurrent tasks.
But why does throughput drop off past 200k threads, and not for events?

Is it an inherent property of threads that "too many" will drop throughput?

Is a user-level thread package required for such performance levels?
Does Linux 2.6 kernel threads achieve something similar?
Henry Wong


The paper says Event is a bad idea and SEDA says event is a good idea.
Do both of idea conflict with each other?

Can we say improving compiler will help improve thread issues?
Gary Huang


The paper states that the major source of overhead in threads is the
presence of operations that are O(n) in the number of threads and that by
removing this factor, performance by threads is equivalent to event
modeled performance.  What are these operations? And is it truly possible
to remove them?

What is dynamic fan in/fan out?
Anoop Karollil


The argument that "events are bad" hinges on a point that if threads
can be made as efficient (or more efficient) than events if O(n) aspects
of scheduling are removed. The authors seem to prove this with
optimizations of the Linux 2.4 kernel scheduler. Is the entire premise
then that only certain implementations of threads have led to
misinformation on thread vs. events?

The authors are still acknowledging that programming with threads is
difficult, though some algorithms are naturally expressed in threads. Can
any design patterns be drawn from event programming and applied to thread
programming (e.g. cooperative multitasking) that would lessen the pain of
threads?
Kevin Loken