Paper 24: The LOCUS distributed operating system
B. Walker, G. Popek, R. English, C. Kline, G. Thiel 1983
Paper (PDF)
Presentation Slides (PowerPoint)
Discussion Slides (PDF)
Discussion Slides (OpenDocument)
Discussion Summary
- The discussion largely followed the discussion slides
- Transparency
- Analogous problem of how much should be hidden from the user ("transparency"): assembly-language programming vs. high-level languages. High-level language performance is usually good enough, although it is often necessary to expose the system (via assembly language) to improve performance.
- Locus provides location transparency, but not performance transparency. Remote requests have a performance penalty.
- Location transparency might not be a good thing: the default placement is not always the best policy, and the user cannot override it by specifying a location. For example, using idle computing resources owned by another group in a company is good, but probably violates the default policy.
- Is emailing users for file conflict resolution really "transparent"? But it appears to be the best that can be done. Even version control systems rely on manual merging of conflicted files.
- Discussion on transparency as applied to parallelizing code on clusters vs. "supercomputers": Compiler automatically parallelizes (Fortran) code vs. message passing (MPI) where all communication and parallelizing is done manually.
- Scalability
- DNS seems to scale fairly well. But there are people who worry it may not continue to scale.
- The current Locus system doesn't look like it would scale well to a large number of machines.
- Security
- There are various levels of trust: e.g., a root password would never be entrusted to anyone other than the admins.
- Where does one draw the line between trusted and untrusted? Perhaps trusting the whole institution, as Locus does, is too trusting.
- Error Recovery
- Discussion on whether it is easier to make a reliable network and live with downtime, or keep services available during network downtime and deal with conflicted modifications to files.
- Windows does not appear to use a RAM-based /tmp equivalent.
List of Questions
-
Point of clarification: Requests for data from remote sites are accompanied by a 'guess' as to where the inode info is stored at the storage site. This terminology isn't elaborated on anywhere. How would this guess be made? How reliably could it be made? I think I'm missing the point here...
-
How much space is consumed by shadow paging relative to the space consumed by a logging scheme? Can the difference be well characterized?
-
I find it somewhat strange that the final handling step of a failed name conflict is to email the owners... what happens if these files are being programmatically created? Should there not be some mechanism of notifying the program that something has gone wrong, without forcing the program to explicitly check if the file it created has actually been renamed?
-
The architecture of LOCUS seems to build a large distributed system that operates quite independently. What about communication between a system running LOCUS and other systems such as UNIX?
-
What are the disadvantages of LOCUS? It seems that LOCUS shows little concern for security. Is that right?
-
Distributed systems are growing in size, so we have to consider scalability when designing them. The algorithms used in Locus depend on nodes reaching consensus on the current partition state of the network. This strategy requires an election among a large number of nodes. How can we improve this with respect to scalability?
-
This paper mentions caching and locking. Could you explain more about how caching and locking are used in a distributed file system?
-
LOCUS implements consistent sharing using sophisticated locking and transaction mechanisms for shared files, but this results in a complex interface and implementation. Did you investigate any better designs for a distributed operating system, or any relevant performance comparisons with LOCUS?
-
The issue of scalability is not even mentioned in the paper. Is it
because scalability was not a concern at the time the paper was
written? Do you think LOCUS could scale to more than a few dozen
computers?
-
Threads that we use nowadays in our programs can share everything
with each other, except maybe their stack memory. This contrasts
with the notion of processes in LOCUS, where it seems that only file
descriptors for files or pipes can be shared. Could distributed
systems such as LOCUS, or more evolved ones, allow two threads of
the same program to run on different computers?
-
This paper reminds me a lot of database system design in general and maybe doesn't make a clear enough distinction between the two. How is LOCUS different from a database (other than storing files instead of records)?
-
Throughout the paper we see examples of the complexity of designing a distributed file system, but at the end the authors sum it up by saying that this can all be handled by a good design. Is LOCUS a good enough design? They mention doing a lot of reconfiguration/redesigning to get it working.
-
The ideas here don't seem that novel. Aren't we just taking ideas from database design and applying them to a file system? Is that the contribution of this paper?
-
In LOCUS all system resources are implemented in an integrated
way, remote and local resources are treated identically, and the system
makes the decisions on resource allocation. I think that these properties
lead to big scaling problems. But the authors claimed that they were
very optimistic about the scalability of LOCUS. What do you think? Are
there any reports on the scalability of LOCUS?
-
In LOCUS resource allocation decisions are made locally by the
application without a high level view across jobs. Is it possible that
some problems arise when the applications start simultaneously?
-
It seems that in LOCUS remote 'fork' takes a long time because of
the need to transfer the program image across the network. How would
it impact the system if lots of remote executions are occurring? e.g.
Is there the possibility of network congestion causing an appearance
of partitioning?
-
One of the facts the authors relied on in the implementation of LOCUS is that most file/directory operations are reads rather than updates, so one need not 'overly' worry about different versions of the same file being used and causing discrepancies when files are replicated for convenience; only very rarely is it necessary to contact the user about conflicts while merging. Do you feel that is still a valid argument 20+ years after the paper was written?
-
Whatever became of LOCUS? I think the authors formed a company around it, but are they still around? Is LOCUS still an ongoing project in its current form?
-
During a resolution of an update and delete, how can the system determine which of the update and delete happened first?
-
Does the limited experience with file replication imply that the experience with file merges, and user-driven recovery, is also limited?
-
Is the decision to handle errors only at the high level sensible? How does this relate with their beliefs that network partitioning is unavoidable, their assumptions about network transitivity, and their deployment in a broadcast only configuration?
-
Using a distributed filesystem as in LOCUS seems to incur a lot of overhead. Since each client can be a host for a "popular" file, there is no guarantee of any performance experienced by the users of the file, and the user at the machine hosting the file. Can't we just use a centralized file server?
-
Being able to execute programs at any site seems dangerous. If the exec call can be made asynchronously, will a malicious user be able to pollute the sites in the network by firing off a whole bunch of remote processes?
-
How would you say updating of an object would be done if a move of an object happens before the object has been updated?
-
Where would the version vector be stored? Not who stores it, but where would it be stored (in inode, ...)?
-
On page 54, what do the authors mean by
"a. The incore inode is found using the guess provided;"
Which guess are they talking about?
-
What would <logical filegroup, inode number> look like for a symbolic link? Would you expect any problems with symbolic links in Locus?
-
It's not clear how the system deals with reading a file while a synchronization / update is occurring. For example, a file "f" is being read
from one server, an update is performed on another server, and the background replication process kicks in to write sections of the file.
Is the synchronization delayed, or does it end up being part of the recovery process?
-
What is the "commercial environment" under which this work was carried out?
-
What happens with temporary files (e.g. /tmp)?
-
Recovery from a partitioned file system seems to rely heavily on
renaming the two conflicting files into separate entities and having the
user manually merge the two files. Is there any better way than
this? What if the user does not know which version is correct?
-
Have recent advances in P2P networks brought any improvements
to distributed file systems, such as getting rid of the CSS node
(which seems to be a vulnerable point in the system)?
-
Are there good ways to allow for joining and leaving the collective in such
a way that the overall system is not compromised? Clearly this system needs
some minimal centralized management in order to function.
-
In the discussion about the use of ACKs to ensure that changes are in
lock-step: what kind of operations are being performed that would make the
implementers worry about the delay caused by the acknowledgements?
-
What information exactly does a version vector hold? For example, suppose a file is partially modified but the modification has not been propagated to some server S1 holding a copy. Is it possible for S1 to act as SS for the unmodified portion of the file?
-
Is there a notion of fairness when tokens are distributed among processes accessing shared resources (timeouts, priorities, etc.)?
-
The paper states that the file system is distributed across LOCUS and that file access is synchronized by a current synchronization site. Changes to two copies of the same file are not allowed at the same time, and neither is reading a file while a copy is being modified. I do remember reading that some of the files are not duplicated. I was wondering how large this set is, because if the set is not large enough, won't performance suffer a lot due to synchronization needs?
-
Similarly with process creation. Forking a process and then migrating it to a different node involves a huge latency compared to standard systems. Every I/O that needs to be performed non-locally goes over the LAN. Message passing too. Don't you think that distributed systems are suitable only for, say, tightly coupled systems?
-
Locus seems to be primarily a distributed file system rather than a full distributed operating system. Does Locus provide any sort of process migration / automatic load balancing facilities?
-
It is mentioned at the end that most of the LOCUS work was being done in a commercial environment, so what happened to the work?
-
The paper concludes that the LOCUS work demonstrates a high-performance, network-transparent distributed file system, among other things. However, the paper didn't discuss any security issues. Do you think LOCUS should have considered security issues?
-
In Section 3, Remote Processes, the paper mentions remote process creation, but not much detail is given. Is the mechanism that the paper uses for remote processes better than RPC?
-
What are the Locus semantics when two programs concurrently update a single file? How does this compare to NFS and AFS?
-
How do Locus file sharing semantics interact with process migration?
-
How much transparency is needed? Do we need a completely
transparent network, which might lead to inefficiencies? Or do we want
it completely exposed, which might impose overhead on the programmer?
-
What is a virtual circuit? What is it used for in practice?
-
Section 4 explains the recovery policy for resolving conflicts among replicated files. What about keeping several versions of each file and defining policies for resolving conflicts among those versions, as a way of resolving conflicts among replicated files across the distributed system? I think it could make the policy simpler.
-
The dynamic reconfiguration in Section 5 seems to focus on reconfiguring the distributed system when machines are added. What happens when machines are removed from the distributed system? Does the partition protocol also handle the removal of machines?
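
Several of the questions above ask what a version vector holds and how conflicting copies are detected after a partition heals. The following is a minimal sketch, assuming the usual formulation of a version vector as one update counter per site that stores a copy of the file. The names `Order`, `compare`, and `record_update` are hypothetical and are not taken from the paper; the sketch says nothing about where LOCUS actually stores the vector.

```python
from enum import Enum


class Order(Enum):
    EQUAL = "equal"          # both copies have seen the same updates
    DOMINATES = "dominates"  # first copy has seen every update the second has, plus more
    DOMINATED = "dominated"  # second copy is strictly newer
    CONFLICT = "conflict"    # each copy has updates the other has not seen


def compare(a, b):
    """Compare two version vectors (dict: site name -> update count).

    A site missing from a vector is treated as having count 0.
    """
    sites = set(a) | set(b)
    a_ahead = any(a.get(s, 0) > b.get(s, 0) for s in sites)
    b_ahead = any(b.get(s, 0) > a.get(s, 0) for s in sites)
    if a_ahead and b_ahead:
        return Order.CONFLICT
    if a_ahead:
        return Order.DOMINATES
    if b_ahead:
        return Order.DOMINATED
    return Order.EQUAL


def record_update(vv, site):
    """Return a new vector with the counter for `site` incremented."""
    new = dict(vv)
    new[site] = new.get(site, 0) + 1
    return new


# Two copies start from the same state, then the network partitions.
base = {"A": 1, "B": 0}
copy_a = record_update(base, "A")  # updated at site A during the partition
copy_b = record_update(base, "B")  # updated at site B during the partition

print(compare(copy_a, base))    # copy_a is simply newer than base
print(compare(copy_a, copy_b))  # neither copy dominates: a conflict
```

Under these assumptions, a dominating copy can simply replace the older one, while a conflict has to go to some resolution step (in the discussion above, renaming the copies and emailing the owners). Note that a version vector only shows that two modifications were concurrent; it does not by itself say which of them, for example an update and a delete, happened first.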