Does the mechanism of setting errno to EINPROGRESS not lead to code
bloat, where you frequently find statements like
if (errno == EINPROGRESS) { do something } else { do something else }?
Is it even good style to call something being in progress an "error"?
(That's admittedly a nitpick, but this seems like a hack to me.)
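The branch being asked about can be pictured with a minimal sketch. This is not LAIO itself: it uses a non-blocking pipe read in Python as a stand-in, and the helper name `try_read` and its status strings are hypothetical. Here "would block" is reported through `BlockingIOError` (errno `EAGAIN`), which plays the role that `EINPROGRESS` plays in the question.

```python
import errno
import os

def try_read(fd, nbytes):
    """Attempt a read; report whether it completed or would have blocked.

    Hypothetical helper, standing in for checking errno after a call.
    """
    try:
        return ("done", os.read(fd, nbytes))
    except BlockingIOError as exc:
        # EAGAIN/EWOULDBLOCK means "not ready yet", not a real failure.
        if exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
            return ("in_progress", None)
        raise

r, w = os.pipe()
os.set_blocking(r, False)
first = try_read(r, 16)    # nothing has been written yet
os.write(w, b"hello")
second = try_read(r, 16)   # data is now available
os.close(r)
os.close(w)
```

Folding the check into one helper keeps the if/else in a single place, which is one possible answer to the code-bloat concern.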
Another nitpick: Wouldn't it have been simpler to write "wrappers"
around existing syscalls that take a parameter to identify whether
they run in async mode or not, instead of *duplicating* all
existing syscalls in a parallel API? What would have been the
disadvantages of doing it this way? (Breaking old code is an obvious
one, but something as simple as a default parameter in C could solve
that I think.)
In 9.2.1, the authors use a LOC metric to
demonstrate differences in code complexity across various
implementations. Is the distribution of these lines not also
important? For instance, I'd rather have 100 LOC inserted into one
function call than 20 new lines of code scattered across multiple
files.
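To make the distribution point concrete, here is a small sketch that reports how changed lines are spread across files as well as their total. The function names and the sample file contents are hypothetical, and `difflib.ndiff` is only one of several ways to count changed lines; nothing here reproduces the paper's measurements.

```python
import difflib

def changed_lines(old, new):
    """Count lines added or removed between two versions of one file."""
    diff = difflib.ndiff(old.splitlines(), new.splitlines())
    return sum(1 for line in diff if line.startswith(("+ ", "- ")))

def change_profile(versions):
    """versions maps a filename to an (old_text, new_text) pair."""
    per_file = {
        name: changed_lines(old, new) for name, (old, new) in versions.items()
    }
    touched = {name: n for name, n in per_file.items() if n}
    return {
        "total": sum(touched.values()),
        "files_touched": len(touched),
        "per_file": touched,
    }

# Hypothetical data: the same number of changed lines, spread differently.
concentrated = {
    "server.c": ("a\nb\n", "a\nb\nc\nd\ne\nf\n"),
    "cache.c": ("x\n", "x\n"),
}
scattered = {
    "server.c": ("a\nb\n", "a\nb\nc\n"),
    "cache.c": ("x\n", "x\ny\n"),
    "log.c": ("p\n", "q\n"),
}
print(change_profile(concentrated))
print(change_profile(scattered))
```

Both samples change the same number of lines, but one touches a single file and the other touches three, which a bare LOC total would not reveal.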
The lazy part of their I/O operations refers to the strategy where an
asynchronous operation is not used if the operation will not block.
But why is being lazy necessary? I do not agree with this purely
single-threaded approach. I think that, to balance programming
complexity against exploiting available system resources, we had
better use both threads and events. What do you think?
The LAIO library was originally developed for web servers. I wonder
what other services could also benefit from LAIO. Could you give some
specific examples?
What is a continuation? Normally, how do AIO and LAIO create one?
I am not sure whether LAIO can avoid blocking entirely, as the paper says. What is your view?
How does the performance of an event-driven HTTP server with LAIO
compare with the Knot server with Capriccio? Or better, how does LAIO
compare to epoll in Linux?
In a high-concurrency system, the huge number of LAIO calls would
generate an overwhelming number of scheduler activations. Wouldn't it
be more efficient to provide non-blocking I/O and system calls from
the OS directly?
Is the primary benefit of LAIO performance, in that it only creates a
continuation if the underlying I/O blocks? It seems we have to write
extra code to check for block / no-block and *still* provide
the continuation anyway (in case we block).
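The trade-off raised here can be sketched as follows, under stated assumptions: this is a readiness-based imitation in Python using a non-blocking pipe and `selectors`, not LAIO's mechanism, and `lazy_read` and `run_pending` are hypothetical names. The continuation function is always written, but state is only saved for it on the path where the read would block.

```python
import os
import selectors

sel = selectors.DefaultSelector()
log = []

def lazy_read(fd, nbytes, continuation):
    """Run continuation(data) now if the read does not block; otherwise
    register it to run later. Returns True if it completed immediately."""
    try:
        data = os.read(fd, nbytes)
    except BlockingIOError:
        # Only on this path is any state saved for later.
        sel.register(fd, selectors.EVENT_READ, (nbytes, continuation))
        return False
    continuation(data)
    return True

def run_pending(timeout=1.0):
    """Resume continuations whose descriptors have become readable."""
    for key, _ in sel.select(timeout):
        nbytes, continuation = key.data
        sel.unregister(key.fd)
        continuation(os.read(key.fd, nbytes))

r, w = os.pipe()
os.set_blocking(r, False)

os.write(w, b"ready")
immediate = lazy_read(r, 16, log.append)   # data present: nothing registered
pending_after_fast = len(sel.get_map())

deferred = lazy_read(r, 16, log.append)    # pipe empty: continuation saved
pending_after_slow = len(sel.get_map())
os.write(w, b"later")
run_pending()

sel.close()
os.close(r)
os.close(w)
```

In this sketch the extra cost on the fast path is a single try/except; whether that matches the cost structure of the real system is not something the sketch can show.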
There is an
underlying assumption that this is a single-threaded application and
that no other thread can pre-empt and set errno (which is a
system-wide global!). Does this really hold? Is it not possible that the
background I/O that is already queued may also set errno? This seems
like a huge potential area for failure of the system by improperly
accounting for continuation of a function.
It seems that the performance of Flash-LAIO-LAIO is better than
Flash-NB-AMPED due to better utilization of the disk. The paper
doesn't explain how this happens.
Why is it necessary to be "Lazy"?
Why do the evaluation results of Figures 7 and 8 differ from each
other? For example, while Flash-LAIO-LAIO-warm is higher than
Flash-NB-B-warm in Figure 7 (a), Flash-LAIO-LAIO-warm is lower than
Flash-NB-B-warm in Figure 8 (a).
Which parts of Figures 1 to 6 show that LAIO reduces coding
complexity? Section 9.2.1 tells us the cases in which using LAIO
reduces programming effort. However, I am still looking for the part
of the six figures that shows these cases.
Do we consider completion notification better than partial
completion notification, especially since some algorithms benefit
from getting some data to process while waiting? (Ease of programming
vs. efficiency)
Can we consider the number of lines to be a good
indication of programming complexity? (What if the programmer is more
experienced, or the shorter code is harder to read?)
The
paper states that LAIO is general, as in it can be used to call all
system calls and that it does not create a continuation for
operations that return immediately. But don't you think the latter
feature is actually quite essential for generality? (I mean an
operation might generally not block, and hence having it as an LAIO
operation is not exactly required and it might even decrease
performance because of the added overhead - and hence the laziness
requirement)
Why does AIO support only read/write operations?
Why couldn't they initially support other operations like open(),
stat()? Or what has been done in LAIO to actually support these
calls?
Is
there a way to enforce the requirement placed on a background
laio_syscall(), that any argument buffers are not modified until it
returns?
Is the requirement for kernel support of scheduler
activations reasonable?
Given the degree to which scheduler
activations do the work here, how much of a contribution should we
credit the authors with?
Regarding
the comparison of complexity based on how many lines of code were
changed, is that really a valid basis of comparison? I mean,
sometimes 10 lines of code could be more painful to write than 100
lines you just copied and pasted from another section.
Regarding
the lazy generation of continuations, would it not be an improvement
on performance if some sort of heuristics could be used to
predetermine which calls are most likely to require continuation, and
generate them beforehand instead of having to wait to see if
something requires a continuation or not?
The authors mention that LAIO requires kernel support for scheduler activations. Is this necessary? Can LAIO be implemented without scheduler activations?
Using LAIO does not allow progress of the blocking calls to be monitored or partial results to be shown (since only one event is sent back up when the operation is completed). Is this acceptable?
Facts:
In Section 2 the authors say that whenever a background laio_syscall()
occurs, laio_gethandle() returns a handle identifying it, and the
handle always refers to the last background laio_syscall().
laio_poll() returns a set of LAIO completion objects, one per
completed background laio_syscall().
Question: Does this mean
that laio_gethandle() constantly has to be called to return the
handle of the last laio_syscall() that has not yet received a LAIO
completion object? In the event loop of Figure 1, however, there
are no calls to laio_gethandle(). Why aren't there any?
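One plausible reading, offered as a toy model rather than as the paper's code: the handle only needs to be fetched once, right after a call goes to the background, so that step can live in the event handlers while the loop itself only polls. The `ToyLaio` class below is a hypothetical in-memory stand-in written in Python. Its `blocks` flag, its use of `None` to signal a background call (where the question above describes errno being set to EINPROGRESS), and its precomputed results are all assumptions of the model.

```python
import itertools

class ToyLaio:
    """Toy in-memory model of the three calls; not the real library."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._last = None        # handle of the most recent background call
        self._completed = []     # (handle, result) pairs awaiting laio_poll

    def laio_syscall(self, func, *args, blocks=False):
        """Return the result directly, or None if the call went to background.

        `blocks` is an assumption of this model: the caller states whether
        the operation would block, where a real system would discover it.
        """
        if not blocks:
            return func(*args)
        self._last = next(self._ids)
        # Model the eventual completion by computing the result up front.
        self._completed.append((self._last, func(*args)))
        return None

    def laio_gethandle(self):
        return self._last

    def laio_poll(self):
        done, self._completed = self._completed, []
        return done

laio = ToyLaio()
continuations = {}               # handle -> function to resume with the result
results = []

def handler(text, blocks):
    value = laio.laio_syscall(str.upper, text, blocks=blocks)
    if value is None:
        # The handle is fetched here, right after the background call.
        continuations[laio.laio_gethandle()] = results.append
    else:
        results.append(value)

handler("fast", blocks=False)
handler("slow", blocks=True)
handles_before_poll = sorted(continuations)

# Event loop body: only laio_poll() appears here.
for handle, value in laio.laio_poll():
    continuations.pop(handle)(value)
```

In this model the event loop never calls `laio_gethandle()`, because each handler has already associated its handle with a continuation before returning to the loop.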
This question is about eventp which first appears in Figure 1. Is that a structure that holds all event objects? But why is it then disabled at the end of the event loop?
I am not sure what LAIO really does, but according to the authors it is better than other asynchronous interfaces. Would you happen to know if it is becoming common? Would you say that the processor "grabbing" done by scheduler activations could be a problem in their design?
It's
interesting the way Khaled compares code complexity by measuring the
number of lines of code. LOC counts are generally not a
reliable method of comparing code complexity so why does he do this?
Does this add anything to the argument?
I like the simplicity
of this approach, but why has no one else noticed or been able to
address the problem of certain system calls always blocking (like
opening files)? It seems odd that we're only attempting to fix
this in 2004.
What are the improved features of LAIO compared to the other forms of I/O processing (non-blocking, AIO, AMPED, ...)?
Do you think that counting the affected lines of code within one sample server, such as the Flash web server, is enough to conclude that LAIO has programming advantages over the others (non-blocking I/O, AMPED, ...)?
How does the LAIO implementation of Flash have lower disk usage than the AMPED implementation for the same traces? How does LAIO help in reducing the 'amount' of disk I/O?
How would the LAIO interface be implemented on an OS without
scheduler activation support?
How
many hours of your life have you lost debugging non-blocking I/O
code? Are there applications where you would still prefer to use NB
I/O even when LAIO can provide cleaner code? Multimedia applications?
I can understand LAIO matching (or almost matching) the
performance of non-blocking I/O. But I still can't explain how it can
beat out non-blocking I/O by such a large factor in some of the tests
(Figure 10, Berkeley; Figure 11, Berkeley). Is it because of slightly
better memory locality of the application because of the LAIO memory
access patterns?
The kernel uses a Windows NT asynchronous notification mechanism called an asynchronous procedure call (APC) to notify the application's thread of the I/O operation's completion. Is LAIO a special case of APC?
LAIO overcomes the limitations of previous I/O mechanisms, both in terms of ease of programming and performance. Are there any side effects from LAIO?
While LAIO reduces the number of possible events one needs to handle (events only for fully-completed syscalls, not partial completion?), I don't see how it would simplify event-driven programming any further than that.
i.e., the fundamental problem of having to split up "tasks" into multiple functions remains?
Section 9.2.2 Berkeley workload: They claim that LAIO transferred less disk data than NB-AMPED. Why?
(I would expect a similar amount of data transferred to/from disk, perhaps at a higher transfer rate)
Lazy I/O on disk reads:
I will base my question on an assumption: I'd
assume that disk reads would always block (the data will not
be ready in the buffer when read is called, since a disk
access is required). Why shouldn't we use AIO on the disk reads
only and use LAIO on the rest (e.g., network and writes to disk)? The
AIO performance is slightly better (about 1.08) when data is not
available, according to the microbenchmark presented in Section 7.
Is the advantage of lazy thread creation exploited in LAIO's
schedulers? In other words, is there a scheduler that can somehow
guess which calls will not block and give those threads a lower
priority?
Does
laio_syscall() save the context of the calling thread on each
invocation, or only when it knows it is going to block?
Obviously, there would be cases where the time taken to complete a
blocking call would be smaller than going through the LAIO process:
creating a new thread in the kernel and notifying the blocking
application (a scheduler activation). Is this why non-blocking I/O
performed better than LAIO in the microbenchmarks (reading a byte
from a pipe)?
the real contribution of the paper, given that most of the work is done by Scheduler Activations, is the recognition of the importance of laziness
the possibility of implementing it without scheduler activations, which provide more than is necessary
a more general implementation that would allow calling blocking library routines
two distinct test cases:
often, buying more RAM or using a more effective caching strategy is the easiest solution
could have lazy asynchronous I/O with partial completion notification if you wanted
lack of explanation for the claims of better disk utilization (although they are validated)
why every system call must not block in event-driven programming (whole program blocks)