Linus Torvalds (torvalds@cs.helsinki.fi)
Tue, 6 Aug 1996 12:47:31 +0300 (EET DST)
Messages sorted by: [ date ][ thread ][ subject ][ author ]
Next message: Bernd P. Ziller: "Re: Oops in get_hash_table"
Previous message: Linus Torvalds: "Re: I/O request ordering"
On Mon, 5 Aug 1996, Peter P. Eiserloh wrote:
We need to keep a clear the concept of threads. Too many people
    seem to confuse a thread with a process. The following discussion
    does not reflect the current state of linux, but rather is an
    attempt to stay at a high level discussion.
NO!
There is NO reason to think that "threads" and "processes" are
  separate entities. That's how it's traditionally done, but I
  personally think it's a major mistake to think that way. The only
  reason to think that way is historical baggage. 
Both threads and processes are really just one thing: a "context of
  execution". Trying to artificially distinguish different cases is just
  self-limiting. 
A "context of execution", hereby called COE, is just the conglomerate
  of  all the state of that COE. That state includes things like CPU
  state  (registers etc), MMU state (page mappings), permission state
  (uid, gid)  and various "communication states" (open files, signal
  handlers etc). Traditionally, the difference between a "thread" and a
  "process" has been mainly that a threads has CPU state (+ possibly
  some other minimal state), while all the other context comes from the
  process. However, that's just
  one way of dividing up the total state of the COE, and there is nothing that says that it's the right way to do it. Limiting yourself
  to that kind of image is just plain stupid. 
The way Linux thinks about this (and the way I want things to work) is
  that there is no such thing as a "process" or a "thread". There is
  only the totality of the COE (called "task" by Linux). Different COE's
  can share parts of their context with each other, and one subset of
  that sharing is the traditional "thread"/"process" setup, but that
  should really be seen as ONLY a subset (it's an important subset, but
  that importance comes not from design, but from standards: we obviusly
  want to run standards-conforming threads programs on top of Linux
  too). 
In short: do NOT design around the thread/process way of thinking. The
  kernel should be designed around the COE way of thinking, and then the
  pthreads library can export the limited pthreads interface to users
  who  want to use that way of looking at COE's.
Just as an example of what becomes possible when you think COE as
  opposed  to thread/process:
- You can do a external "cd" program, something that is traditionally impossible in UNIX and/or process/thread (silly example, but the idea 
  is that you can have these kinds of "modules" that aren't limited to 
  the traditional UNIX/threads setup). Do a:
clone(CLONE_VM|CLONE_FS);
child: execve("external-cd");
/* the "execve()" will disassociate the VM, so the only reason we 
  used CLONE_VM was to make the act of cloning faster */
- You can do "vfork()" naturally (it meeds minimal kernel support, but  that support fits the CUA way of thinking perfectly):
clone(CLONE_VM);
child: continue to run, eventually execve()
mother: wait for execve
- you can do external "IO deamons":
clone(CLONE_FILES);
child: open file descriptors etc
mother: use the fd's the child opened and vv.
All of the above work because you aren't tied to the thread/process
  way of thinking. Think of a web server for example, where the CGI
  scripts are done as "threads of execution". You can't do that with
  traditional threads, because traditional threads always have to share
  the whole address space, so you'd have to link in everything you ever
  wanted to do in the web server itself (a "thread" can't run another
  executable). 
Thinking of this as a "context of execution" problem instead, your
  tasks can now chose to execute external programs (= separate the
  address space from the parent) etc if they want to, or they can for
  example share everything with the parent except for the file
  descriptors (so that the sub-"threads" can open lots of files without
  the parent needing to worry about them: they close automatically when
  the sub-"thread" exits, and it doesn't use up fd's in the parent). 
Think of a threaded "inetd", for example. You want low overhead
  fork+exec, so with the Linux way you can instead of using a "fork()"
  you write a multi-threaded inetd where each thread is created with
  just CLONE_VM (share address space, but don't share file descriptors
  etc). Then the child can execve if it was a external service (rlogind,
  for example), or maybe it was one of the internal inetd services
  (echo, timeofday) in which case it just does it's thing and exits. 
You can't do that with "thread"/"process".
Linus