Fine-Grained Mobility in the Emerald System


Presented by Mayukh Saubhasik & Jun Zhang

Slides

Emerald.pdf
Emerald_Discussion.pdf

Questions


1. There has been very little acceptance in industry of *languages* that target programming in a networked or distributed environment. There are some exceptions, such as Erlang (although this was developed in-house), but largely this is a space that is dominated by libraries. Which model do you think is more appropriate? Why do you think language-based solutions have not become popular?

2.  The authors emphasize that Emerald does not support classes, but  instead has objects that store code in "concrete type objects". This  distinction does not seem to be very strong. What are some of the  major differences between a class model and the model used in Emerald?

3. Do you think the authors were justified in inventing a whole new language, instead of modifying an existing one (a la distributed Smalltalk or uC++)? Do you think that adoption of their solution would have been greater if it had been implemented on top of a contemporary, popular language?

4. Does the fact that logical separation in class structure corresponds directly to separation in execution location force programmers to structure their code more carefully?  Could this result in less readable/maintainable programs?

5. Could we apply similar ideas in explicit data mobility to give a programmer direct control over a cache hierarchy?

6. The authors state that it is possible for a remote object to be unavailable. How does this possibility of failure affect the kind of programs we can write with Emerald? How could we recover from such a failure?

7. What is the best entity to move from one system to another? The paper moves objects across a distributed system. However, I still think an object is too small a unit, because objects do not operate independently (they have to cooperate, which means the coupling between them is strong). To overcome the small size, the paper seems to suggest three different styles of object implementation. However, how does the compiler classify objects into these three styles? I also doubt whether the system always assigns the right style to each object without any problems.

8. Operating systems and compilers may both need to support a distributed environment. However, which parts should each of them support? The parts they support seem to overlap. How would you divide the responsibilities between the operating system and the compiler?

9. How does Emerald compare with CORBA? What are the similarities if there are any?

10. About the architecture of Emerald: does the Emerald kernel run on top of the BSD kernel? Is an Emerald process just a series of object operation invocations rather than, say, a traditional process with a PCB, PID, etc.?

11. Who sets up the objects initially, and what are the criteria for object movement/migration decisions?

12. Why are objects not class-based, and why do they not form a hierarchy? Why is it costly to separate an object from its code in a distributed environment? What are the pros and cons of this instance-based object model?

13. Take a look at section 3.1. I do not quite understand the description of what happens if a global object is on a different node. And why, within a single node, can all objects be addressed directly without kernel intervention?

14. As we know, most object-based systems use garbage collection to recover memory occupied by objects that are no longer in use. For example, in Java, if an object is no longer referenced, the garbage collector can automatically remove it from memory. How does Emerald deal with the problems of garbage collection in a distributed environment?

15. Why do we need fine-grained mobility down to being able to store an integer on a remote machine? What do we gain from such a programming paradigm?

16.  How can we efficiently schedule processes from other machines without allowing them to hijack the local machine?

17. Can some bugs be hidden at development time by the fact that the same Emerald source code generates different implementations in different contexts?

18. While Emerald's compiler allows the developer to assign process objects to machines manually, it seems to me the assignment is done at compile time. Would it be better for load balancing if the user could let the kernel dynamically assign a free computer at runtime? Or would it be better for the developer to allocate a group of machines as candidates to hold the object and let the compiler decide?

19. What applications would Emerald be particularly well suited for? (answer: servers for MMORPGs!)

20. (more general version of the same question) There seem to be two contrasting design goals that can come up when designing a distributed system: 1. you either move processes around to achieve your goals OR 2. you prefer to allow remote access to the resources that you need. What characteristics of your distributed application will make one option more appealing than the other?

21. It seems that they don't have any special policy for fault handling. What happens in case of a node failure? Is there any way to recover object state? What if there is a node with intermittent connectivity? Did they address these issues later? If the answer is no, is it practical to use such a system?

22.  Emerald has completely integrated the distribution into the language. Although this approach has several strengths, I think that combining language issues and system issues is not a good idea in general, and this is a weakness for their method. Maybe this is why their method is not in widespread use. What do you think?
 
23. How do they ensure uniqueness of the object IDs?

24. One would normally associate a distributed system with attributes such as robustness and reliability. However, does the introduction of mobility actually harm this? (i.e., while the authors assume that nodes can be trusted, what happens if a certain program were to direct all 'important' objects to a single node, and then, maliciously or otherwise, remove the node from the network?)

25. Are the active objects in Emerald essentially mobile agents? They are capable of executing (arbitrary?) code and moving from node to node as necessary, which is what agents are probably best known for. Is there any difference between the two?

26. The decision to create a new language was motivated by the fact that the semantics of existing languages prevent efficient optimization in either the local or the remote case. Wouldn't it have been simpler to extend an existing popular language?

27. Where would fault tolerance (network failures, machines exploding...) fit into an Emerald application? Would it need to be handled by the programmer, or would that be left to the kernel running the application?

28. What does fine-grained mobility offer that is superior to or different from RPC?

29. Are the calls in Emerald synchronous or asynchronous? Why?

30. Is it better to design a new language with new constructs (abstract types, movable stack frames) or to add mobility as an extension to an existing programming language?

31. How can we use the mobility primitives in Emerald (move, locate, fix, refix)?

32. Why did Emerald leave out other features of an object-based system, like inheritance and run-time dispatching?

33. Did Emerald use any naming/directory lookup service to locate objects by name?

34. Does Emerald have any mechanism to deal with node failure?

35. About the homogeneity of the system: even though the system used is homogeneous, how feasible do you think it is to apply this idea to heterogeneous systems (i.e., different architectures on the nodes)? One problem that appears is the movement of register values, since different architectures have registers with different purposes. Can you think of anything else (e.g., kernel dependencies)?

36. Is it possible to dynamically attach (new) objects to another object? From what I understood, the template for an object holds the information for attached objects as well. Since the templates are generated at compile time, it seems dynamic attachment is not possible. Is that actually the case?

37. How does this system protect object integrity? If a copy of the object is stored on all systems that are using it, how does that object stay consistent? Don't we have to pass update messages around to each system and synchronize them all?

38. Why do we break the process stack when moving an object?  We know that we'll likely need that object again anyway so why do we move it?

39. I'm not really sure what the authors mean by "linked" and "not linked" when referring to process activation records. Can you explain what these terms mean?

40. Call-by-move and call-by-visit are the primary methods of transferring data when invoking remote system calls. It seems that the authors missed an optimization in call-by-value for immutable values (i.e. why bother with the return or maintaining forwarding addresses of an int).

41. This work was done in 1988. Is there any progeny of this work that is in use today?

42. The system depends on homogeneous machines. Would it be possible to implement it across heterogeneous machine architectures?

43. For handling processor registers, why does Emerald use callee-saved registers?

44. About Emerald processes: objects may move, so what happens to their activations?

45. Emerald is a system that provides object mobility. How do you compare this system with the mobile-agent paradigm developed recently?

 
46. One limitation of Emerald's approach is that it works only on homogeneous local area networks. Since the authors did not include future work in the paper, what do you think should be developed to overcome that limitation, and is there anything else in the system that needs to be improved?

47. Why do you think they decided to develop a separate language rather than extending one of the existing programming languages?

48. Do you think that keeping the direct memory addressing is the best choice?

49. In what situation would you want to move an object?

50. Emerald uses constructs that support mobility, but its test application did not require much distributed processing! Has the use of such constructs really been justified?

51. Emerald ignores network congestion, and there is no mention of scalability or reliability issues for a large distributed network. This raises questions about its suitability for such networks!

52. They say that Emerald has no class/instance hierarchy, but their concrete type objects sound a lot like classes, and their abstract types sound a lot like Java interfaces. How are their objects different from instances of a class?

53. If threads of control span multiple objects, and multiple threads of control can be active in the same object at once, what does it mean for an object to be associated with a process? Why is a process one of the four components of an object?

54. They seem to make all decisions about when objects should move statically (movement is specified by the programmer or the compiler). Would it be better to (also) make the decisions dynamically, based on observed usage patterns and performance?

55. It sounds like Emerald is itself an Operating System... Does this restrict all applications running on this system to be written in the Emerald language only?

56. Object locations don't appear to be transparent (there are explicit move, locate, etc. primitives). Is there anything here that can't be done with a library on top of, say, MPI?

Discussion Recap

1. The reason for creating a new language instead of building on an existing one is that existing languages could not avoid certain disadvantages of their own, and Emerald was only a research project. At that time, Java did not yet exist.

2. According to another paper on Emerald,  the Emerald compiler implements a syntactic extension called a class and supports a form of inheritance for classes.

3. Due to machine crashes or communication network failures, objects may be temporarily or permanently unavailable. Emerald provides unavailable handlers to allow programmers to detect such situations and attempt recovery.

4. There is a research paper from Microsoft Research that explores this topic on heterogeneous systems. It extends Emerald by changing its compiler and run-time kernel. For details, refer to the paper directly.

5. As for later work, Microsoft Research has done some research on extending Emerald to heterogeneous computers. Also, the University of Washington investigated specific topics in Emerald, such as garbage collection. I am not aware of any current use of this system or language.

6. Not sure whether Emerald used any naming/directory lookup service to locate objects by name.

7. As far as I know, invocations are synchronous; the process performing the invocation is suspended until the operation is completed or until the run-time system determines that the operation cannot be completed.

8. The key to achieving high performance in local execution is to permit local memory references (pointers) to be used. This, however, greatly complicates moving objects between machines, because these local references must be detected and modified to point at the correct memory (on the new machine) or be marked as pointing at a remote object.
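
To make recap item 8 concrete, here is a minimal Python sketch of the reference rewriting that a move requires. It is a toy model under stated assumptions, not Emerald's actual kernel mechanism: the names `Node`, `Obj`, `RemoteRef`, and `move` are all hypothetical, direct Python references stand in for local pointers, and a `RemoteRef` stands in for a reference marked as pointing at a remote object.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RemoteRef:
    """Hypothetical marker for a reference to an object on another node."""
    oid: int
    node: str  # last known location of the object


@dataclass(eq=False)
class Obj:
    """Hypothetical object: an ID plus named fields holding values or references."""
    oid: int
    fields: dict = field(default_factory=dict)


class Node:
    """Hypothetical machine: objects resident here are reached by direct reference."""

    def __init__(self, name):
        self.name = name
        self.resident = {}    # oid -> Obj stored on this node
        self.forwarding = {}  # oid -> name of the node the object moved to

    def add(self, obj):
        self.resident[obj.oid] = obj
        self.forwarding.pop(obj.oid, None)
        return obj


def move(obj, src, dst):
    """Move obj from src to dst, rewriting every reference the move invalidates."""
    del src.resident[obj.oid]
    src.forwarding[obj.oid] = dst.name  # leave a forwarding address behind

    # 1. References held by the moving object itself.
    for name, value in list(obj.fields.items()):
        if isinstance(value, Obj):
            # A direct pointer to an object staying on src becomes remote.
            obj.fields[name] = RemoteRef(value.oid, src.name)
        elif isinstance(value, RemoteRef) and value.oid in dst.resident:
            # A remote reference to an object on dst becomes a direct pointer.
            obj.fields[name] = dst.resident[value.oid]

    # 2. Direct pointers on src that referred to the moving object.
    for other in src.resident.values():
        for name, value in list(other.fields.items()):
            if value is obj:
                other.fields[name] = RemoteRef(obj.oid, dst.name)

    dst.add(obj)
```

The sketch shows why cheap local pointers make moves expensive: every field of the moving object, and every object that pointed at it, has to be inspected. A real system would need type information (such as compiler-generated templates) to tell which fields are references in the first place; here Python's `isinstance` plays that role.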