The Performance of Micro-Kernel Based Systems
by Hermann Hartig, Michael Hohmuth, Jochen Liedtke, Sebastian Schonberg, Jean Wolter
Slides prepared and presented by Ivan Sham can be downloaded
here
Class Discussion Summary
Microkernel structure promotes better coding style: Arguments were made
on both sides about whether an environment based on microkernels promotes
better coding style in terms of encapsulation and low coupling between modules.
Multiple Operating Systems: It is possible to run multiple operating
systems simultaneously with microkernels. While this idea is neat, it is not
widely used in practice. Potential uses for such a system include
database servers and real-time applications.
Links
The L4 Micro-Kernel Family
The L4Ka Project
L4Linux
Discussion Questions
The paper is attempting to dispel the perception that micro-kernels
perform poorly, a perception based on first-generation (flawed) designs. They
show that a second-generation micro-kernel (L4) has only a very slight
decline in performance relative to "native" code (e.g. L4Linux vs Linux). So
what is the compelling argument for micro-kernels? Best efforts (and extreme
optimization) on various hardware still result in *less* performance ... why
do it at all?
There seems to be an underlying assumption in many of these OS papers
that portability is important (i.e. concept is transferrable across multiple
hardware architectures). Isn't there a place for specific OS concepts tightly
coupled to certain types of HW to get extreme performance for particular
applications?
Could you give a simple introduction to the development of the µ-kernel
and the purpose of the research on it?
Could you explain lmbench and hbench?
Why does L4 adopt synchronous IPC? What are advantages and disadvantages
of synchronous IPC?
Why did the authors choose MkLinux to compare against L4Linux? Are there
any other systems? Considering that MkLinux is based on Mach, which shows bad
performance, the choice does not seem reasonable.
I kind of get the feeling that the microkernel will never achieve the
performance it desires, but this implementation comes close, which is
impressive. Will we ever see the rise of the microkernel as the dominant OS?
And what is up with Mach? It never seems to perform that well.
One thing I am curious about is why the L4Linux architecture part of the
kernel is almost double the original. I thought L4 would provide a lot of
features to reduce the size of the code? How does it get bigger when you've
already got a platform to do a lot of work for you? This may go against the
design but, do you think they could've reduced this code size by using a few
L4 tasks to help manage the encountered memory problems?
The paper presents a performance comparison between L4Linux and
monolithic Linux to demonstrate that the performance improvements of
2nd-generation microkernels significantly affect OS personalities and
applications. But the experiments are for Linux on top of L4; how about
other OSs? Do you think there would be any difference in performance with
these?
Do you think kernel extensions are better than extensibility through
libraries and servers, or not?
The Linux kernel is run as a server task in L4 with memory addresses and
a thread of execution provided by L4. Won't this have required a substantial
number of changes to the kernel in the architecture dependent part? Is it
really true that only 2% of the Linux kernel is architecture dependent?
There seems to be a lot of waste of the Linux kernel functionalities like
paging (requiring virtual -> pseudo physical -> physical paging), scheduling
etc, which is kind of unavoidable as we are trying to run a fully functional
kernel on a micro-kernel. But is the gained flexibility and easy
extensibility really possible only with micro-kernels?
Micro-kernels provide two important features: specialization and
extensibility. How did the author support this argument in the Pipes and RPC
example?
The author refutes the claim that "the lower the level of the primitive,
the more efficiently it can be implemented and the more latitude it grants to
implementors of higher level abstractions". How did the author support his
argument in the IPC vs. PCT example, and how valid is his opinion?
Are there really a lot of people who want to run real time systems on the
same machine as a general purpose OS like Linux? While it's interesting in
theory, I'm not sure how much practical use there is for it.
The authors say that Linux has a "relatively well-defined interface
between architecture-dependent and independent parts of the kernel." Given
that, is it fair of them to refer to Linux as "monolithic?"
Do you think monolithic operating systems such as Linux will start
borrowing micro-kernel ideas to improve Pipes and VM performance? Has it
already been done?
Even though the authors meant to investigate the performance of "pure"
micro-kernel approaches (see intro), instead of directly comparing the L4
micro-kernel to Linux, they compare a version of Linux on L4 to native
Linux. Could this be because it is very difficult to use a micro-kernel for
other purposes than hosting virtual machines ?
First-generation µ-kernels have a reputation for being too slow and
lacking sufficient flexibility. To determine whether L4, a lean
second-generation µ-kernel, has overcome these limitations, we have repeated
several earlier experiments and conducted some novel ones. From the context,
what improvements does the second-generation µ-kernel make compared with the
first-generation µ-kernel?
In Figure 2, there are three types of thread: main thread, bottom-half
thread, and interrupt thread. What is the bottom-half thread?
I am a little confused by a statement in this paper that says "Only a
single L4 thread is used in the L4Linux server for handling all activities
induced by system calls and page faults." The explanation following this
statement does not give me a clear idea of why they only use a single
thread. Could you give more hints/descriptions?
The current version of L4Linux uses 10 ms time slices and only 4 of 256
priorities, in decreasing order: interrupt top-half, interrupt bottom-half,
Linux kernel, Linux user process. Why are there only 4 priorities being used?
Which other priorities are not being used, and why?
I wonder if a successful micro-kernel really exists by now. Mach
apparently isn't a "pure" micro-kernel. Does L4 count? Has anyone actually
built a successful, industrial micro-kernel that can be used with a real
operating system like Windows or MacOS? If not, how come?
In section 5.3, the paper cites AIM benchmark results, but then it goes
on to say that the benchmarking is not certified by AIM Technology. Why did
they mention that point specifically? Isn't it assumed that, when you use a
calculator to do some arithmetic, for example, the results are not
certified by the makers of the chips in said calculator?
Was there any followup research on the hanging question of the effects of
downloading grafts into kernels in terms of performance increases?
Doesn't running a monolithic kernel as a user process on top of a
microkernel defeat one of the reasons for using a microkernel? (that the OS is
nicely structured into user-mode servers.) It's just isolating the monolithic
"mess" into its own sandbox, but not quite "fixing" the problem. Just a
"feeling" and probably not true.
L4 privileged user process can disable interrupts: Can this crash the
system? So a user-mode process can crash the system?
Can a microkernel get significantly faster than L4? (Or, has L4 already
pushed quite close to the fundamental limits of microkernel performance?)
Linux vs. L4 pipe: Given all the talk about Linux 2.6 being so much
faster, does this advantage still hold compared to Linux 2.6?
Microkernel: "enables specialization" and "extensibility". I'm not
familiar with Linux kernel modules, but I get the impression those can
accomplish (most?) of these tasks, but without the memory protection.
How does the L4 IPC structure differ from L3 IPC? In particular, how does
Clans & Chiefs affect the performance of L4 IPC compared to L3?
In Macrobenchmarks section: the compilation time of L4Linux and native
Linux is used as a supporting fact to the performance results produced by AIM
benchmark. How can these two measurements be related?
How does PCT (protected control transfer) work in the case of a server
domain? I.e., when multiple threads make simultaneous RPC calls (implemented
via PCT) onto a server domain, how does the server serve these RPCs, since
there is no associated thread context? (As I understand it, there is no thread
switch involved in a PCT; the same thread just switches address spaces.)
How much of a difference does the small address space optimization make?
If micro-kernels, according to the paper, can have better real-time
performance than monolithic kernels, why is SymbianOS slowly being replaced
by Linux in the mobile market, where real-time response is needed?
In the paper, the 'trampoline' is the technique that allows binary
compatibility between L4Linux and native Linux. How is the 'trampoline'
implemented in detail? Where is the emulated native system-call trap
implemented? In the L4 kernel space or the Linux user-level space?