diff options
author | The Android Open Source Project <initial-contribution@android.com> | 2009-03-03 18:29:04 -0800 |
---|---|---|
committer | The Android Open Source Project <initial-contribution@android.com> | 2009-03-03 18:29:04 -0800 |
commit | e54eebbf1a908d65ee8cf80bab62821c05666d70 (patch) | |
tree | 4b825dc642cb6eb9a060e54bf8d69288fbee4904 /sh/TOUR | |
parent | a1e1c1b106423de09bc918502e7a51d4ffe5a4ae (diff) | |
download | system_core-e54eebbf1a908d65ee8cf80bab62821c05666d70.zip system_core-e54eebbf1a908d65ee8cf80bab62821c05666d70.tar.gz system_core-e54eebbf1a908d65ee8cf80bab62821c05666d70.tar.bz2 |
auto import from //depot/cupcake/@135843
Diffstat (limited to 'sh/TOUR')
-rw-r--r-- | sh/TOUR | 357 |
1 files changed, 0 insertions, 357 deletions
diff --git a/sh/TOUR b/sh/TOUR deleted file mode 100644 index f5c00c4..0000000 --- a/sh/TOUR +++ /dev/null @@ -1,357 +0,0 @@ -# $NetBSD: TOUR,v 1.8 1996/10/16 14:24:56 christos Exp $ -# @(#)TOUR 8.1 (Berkeley) 5/31/93 - -NOTE -- This is the original TOUR paper distributed with ash and -does not represent the current state of the shell. It is provided anyway -since it provides helpful information for how the shell is structured, -but be warned that things have changed -- the current shell is -still under development. - -================================================================ - - A Tour through Ash - - Copyright 1989 by Kenneth Almquist. - - -DIRECTORIES: The subdirectory bltin contains commands which can -be compiled stand-alone. The rest of the source is in the main -ash directory. - -SOURCE CODE GENERATORS: Files whose names begin with "mk" are -programs that generate source code. A complete list of these -programs is: - - program intput files generates - ------- ------------ --------- - mkbuiltins builtins builtins.h builtins.c - mkinit *.c init.c - mknodes nodetypes nodes.h nodes.c - mksignames - signames.h signames.c - mksyntax - syntax.h syntax.c - mktokens - token.h - bltin/mkexpr unary_op binary_op operators.h operators.c - -There are undoubtedly too many of these. Mkinit searches all the -C source files for entries looking like: - - INIT { - x = 1; /* executed during initialization */ - } - - RESET { - x = 2; /* executed when the shell does a longjmp - back to the main command loop */ - } - - SHELLPROC { - x = 3; /* executed when the shell runs a shell procedure */ - } - -It pulls this code out into routines which are when particular -events occur. The intent is to improve modularity by isolating -the information about which modules need to be explicitly -initialized/reset within the modules themselves. - -Mkinit recognizes several constructs for placing declarations in -the init.c file. - INCLUDE "file.h" -includes a file. The storage class MKINIT makes a declaration -available in the init.c file, for example: - MKINIT int funcnest; /* depth of function calls */ -MKINIT alone on a line introduces a structure or union declara- -tion: - MKINIT - struct redirtab { - short renamed[10]; - }; -Preprocessor #define statements are copied to init.c without any -special action to request this. - -INDENTATION: The ash source is indented in multiples of six -spaces. The only study that I have heard of on the subject con- -cluded that the optimal amount to indent is in the range of four -to six spaces. I use six spaces since it is not too big a jump -from the widely used eight spaces. If you really hate six space -indentation, use the adjind (source included) program to change -it to something else. - -EXCEPTIONS: Code for dealing with exceptions appears in -exceptions.c. The C language doesn't include exception handling, -so I implement it using setjmp and longjmp. The global variable -exception contains the type of exception. EXERROR is raised by -calling error. EXINT is an interrupt. EXSHELLPROC is an excep- -tion which is raised when a shell procedure is invoked. The pur- -pose of EXSHELLPROC is to perform the cleanup actions associated -with other exceptions. After these cleanup actions, the shell -can interpret a shell procedure itself without exec'ing a new -copy of the shell. - -INTERRUPTS: In an interactive shell, an interrupt will cause an -EXINT exception to return to the main command loop. (Exception: -EXINT is not raised if the user traps interrupts using the trap -command.) The INTOFF and INTON macros (defined in exception.h) -provide uninterruptable critical sections. Between the execution -of INTOFF and the execution of INTON, interrupt signals will be -held for later delivery. INTOFF and INTON can be nested. - -MEMALLOC.C: Memalloc.c defines versions of malloc and realloc -which call error when there is no memory left. It also defines a -stack oriented memory allocation scheme. Allocating off a stack -is probably more efficient than allocation using malloc, but the -big advantage is that when an exception occurs all we have to do -to free up the memory in use at the time of the exception is to -restore the stack pointer. The stack is implemented using a -linked list of blocks. - -STPUTC: If the stack were contiguous, it would be easy to store -strings on the stack without knowing in advance how long the -string was going to be: - p = stackptr; - *p++ = c; /* repeated as many times as needed */ - stackptr = p; -The folloing three macros (defined in memalloc.h) perform these -operations, but grow the stack if you run off the end: - STARTSTACKSTR(p); - STPUTC(c, p); /* repeated as many times as needed */ - grabstackstr(p); - -We now start a top-down look at the code: - -MAIN.C: The main routine performs some initialization, executes -the user's profile if necessary, and calls cmdloop. Cmdloop is -repeatedly parses and executes commands. - -OPTIONS.C: This file contains the option processing code. It is -called from main to parse the shell arguments when the shell is -invoked, and it also contains the set builtin. The -i and -j op- -tions (the latter turns on job control) require changes in signal -handling. The routines setjobctl (in jobs.c) and setinteractive -(in trap.c) are called to handle changes to these options. - -PARSING: The parser code is all in parser.c. A recursive des- -cent parser is used. Syntax tables (generated by mksyntax) are -used to classify characters during lexical analysis. There are -three tables: one for normal use, one for use when inside single -quotes, and one for use when inside double quotes. The tables -are machine dependent because they are indexed by character vari- -ables and the range of a char varies from machine to machine. - -PARSE OUTPUT: The output of the parser consists of a tree of -nodes. The various types of nodes are defined in the file node- -types. - -Nodes of type NARG are used to represent both words and the con- -tents of here documents. An early version of ash kept the con- -tents of here documents in temporary files, but keeping here do- -cuments in memory typically results in significantly better per- -formance. It would have been nice to make it an option to use -temporary files for here documents, for the benefit of small -machines, but the code to keep track of when to delete the tem- -porary files was complex and I never fixed all the bugs in it. -(AT&T has been maintaining the Bourne shell for more than ten -years, and to the best of my knowledge they still haven't gotten -it to handle temporary files correctly in obscure cases.) - -The text field of a NARG structure points to the text of the -word. The text consists of ordinary characters and a number of -special codes defined in parser.h. The special codes are: - - CTLVAR Variable substitution - CTLENDVAR End of variable substitution - CTLBACKQ Command substitution - CTLBACKQ|CTLQUOTE Command substitution inside double quotes - CTLESC Escape next character - -A variable substitution contains the following elements: - - CTLVAR type name '=' [ alternative-text CTLENDVAR ] - -The type field is a single character specifying the type of sub- -stitution. The possible types are: - - VSNORMAL $var - VSMINUS ${var-text} - VSMINUS|VSNUL ${var:-text} - VSPLUS ${var+text} - VSPLUS|VSNUL ${var:+text} - VSQUESTION ${var?text} - VSQUESTION|VSNUL ${var:?text} - VSASSIGN ${var=text} - VSASSIGN|VSNUL ${var=text} - -In addition, the type field will have the VSQUOTE flag set if the -variable is enclosed in double quotes. The name of the variable -comes next, terminated by an equals sign. If the type is not -VSNORMAL, then the text field in the substitution follows, ter- -minated by a CTLENDVAR byte. - -Commands in back quotes are parsed and stored in a linked list. -The locations of these commands in the string are indicated by -CTLBACKQ and CTLBACKQ+CTLQUOTE characters, depending upon whether -the back quotes were enclosed in double quotes. - -The character CTLESC escapes the next character, so that in case -any of the CTL characters mentioned above appear in the input, -they can be passed through transparently. CTLESC is also used to -escape '*', '?', '[', and '!' characters which were quoted by the -user and thus should not be used for file name generation. - -CTLESC characters have proved to be particularly tricky to get -right. In the case of here documents which are not subject to -variable and command substitution, the parser doesn't insert any -CTLESC characters to begin with (so the contents of the text -field can be written without any processing). Other here docu- -ments, and words which are not subject to splitting and file name -generation, have the CTLESC characters removed during the vari- -able and command substitution phase. Words which are subject -splitting and file name generation have the CTLESC characters re- -moved as part of the file name phase. - -EXECUTION: Command execution is handled by the following files: - eval.c The top level routines. - redir.c Code to handle redirection of input and output. - jobs.c Code to handle forking, waiting, and job control. - exec.c Code to to path searches and the actual exec sys call. - expand.c Code to evaluate arguments. - var.c Maintains the variable symbol table. Called from expand.c. - -EVAL.C: Evaltree recursively executes a parse tree. The exit -status is returned in the global variable exitstatus. The alter- -native entry evalbackcmd is called to evaluate commands in back -quotes. It saves the result in memory if the command is a buil- -tin; otherwise it forks off a child to execute the command and -connects the standard output of the child to a pipe. - -JOBS.C: To create a process, you call makejob to return a job -structure, and then call forkshell (passing the job structure as -an argument) to create the process. Waitforjob waits for a job -to complete. These routines take care of process groups if job -control is defined. - -REDIR.C: Ash allows file descriptors to be redirected and then -restored without forking off a child process. This is accom- -plished by duplicating the original file descriptors. The redir- -tab structure records where the file descriptors have be dupli- -cated to. - -EXEC.C: The routine find_command locates a command, and enters -the command in the hash table if it is not already there. The -third argument specifies whether it is to print an error message -if the command is not found. (When a pipeline is set up, -find_command is called for all the commands in the pipeline be- -fore any forking is done, so to get the commands into the hash -table of the parent process. But to make command hashing as -transparent as possible, we silently ignore errors at that point -and only print error messages if the command cannot be found -later.) - -The routine shellexec is the interface to the exec system call. - -EXPAND.C: Arguments are processed in three passes. The first -(performed by the routine argstr) performs variable and command -substitution. The second (ifsbreakup) performs word splitting -and the third (expandmeta) performs file name generation. If the -"/u" directory is simulated, then when "/u/username" is replaced -by the user's home directory, the flag "didudir" is set. This -tells the cd command that it should print out the directory name, -just as it would if the "/u" directory were implemented using -symbolic links. - -VAR.C: Variables are stored in a hash table. Probably we should -switch to extensible hashing. The variable name is stored in the -same string as the value (using the format "name=value") so that -no string copying is needed to create the environment of a com- -mand. Variables which the shell references internally are preal- -located so that the shell can reference the values of these vari- -ables without doing a lookup. - -When a program is run, the code in eval.c sticks any environment -variables which precede the command (as in "PATH=xxx command") in -the variable table as the simplest way to strip duplicates, and -then calls "environment" to get the value of the environment. -There are two consequences of this. First, if an assignment to -PATH precedes the command, the value of PATH before the assign- -ment must be remembered and passed to shellexec. Second, if the -program turns out to be a shell procedure, the strings from the -environment variables which preceded the command must be pulled -out of the table and replaced with strings obtained from malloc, -since the former will automatically be freed when the stack (see -the entry on memalloc.c) is emptied. - -BUILTIN COMMANDS: The procedures for handling these are scat- -tered throughout the code, depending on which location appears -most appropriate. They can be recognized because their names al- -ways end in "cmd". The mapping from names to procedures is -specified in the file builtins, which is processed by the mkbuil- -tins command. - -A builtin command is invoked with argc and argv set up like a -normal program. A builtin command is allowed to overwrite its -arguments. Builtin routines can call nextopt to do option pars- -ing. This is kind of like getopt, but you don't pass argc and -argv to it. Builtin routines can also call error. This routine -normally terminates the shell (or returns to the main command -loop if the shell is interactive), but when called from a builtin -command it causes the builtin command to terminate with an exit -status of 2. - -The directory bltins contains commands which can be compiled in- -dependently but can also be built into the shell for efficiency -reasons. The makefile in this directory compiles these programs -in the normal fashion (so that they can be run regardless of -whether the invoker is ash), but also creates a library named -bltinlib.a which can be linked with ash. The header file bltin.h -takes care of most of the differences between the ash and the -stand-alone environment. The user should call the main routine -"main", and #define main to be the name of the routine to use -when the program is linked into ash. This #define should appear -before bltin.h is included; bltin.h will #undef main if the pro- -gram is to be compiled stand-alone. - -CD.C: This file defines the cd and pwd builtins. The pwd com- -mand runs /bin/pwd the first time it is invoked (unless the user -has already done a cd to an absolute pathname), but then -remembers the current directory and updates it when the cd com- -mand is run, so subsequent pwd commands run very fast. The main -complication in the cd command is in the docd command, which -resolves symbolic links into actual names and informs the user -where the user ended up if he crossed a symbolic link. - -SIGNALS: Trap.c implements the trap command. The routine set- -signal figures out what action should be taken when a signal is -received and invokes the signal system call to set the signal ac- -tion appropriately. When a signal that a user has set a trap for -is caught, the routine "onsig" sets a flag. The routine dotrap -is called at appropriate points to actually handle the signal. -When an interrupt is caught and no trap has been set for that -signal, the routine "onint" in error.c is called. - -OUTPUT: Ash uses it's own output routines. There are three out- -put structures allocated. "Output" represents the standard out- -put, "errout" the standard error, and "memout" contains output -which is to be stored in memory. This last is used when a buil- -tin command appears in backquotes, to allow its output to be col- -lected without doing any I/O through the UNIX operating system. -The variables out1 and out2 normally point to output and errout, -respectively, but they are set to point to memout when appropri- -ate inside backquotes. - -INPUT: The basic input routine is pgetc, which reads from the -current input file. There is a stack of input files; the current -input file is the top file on this stack. The code allows the -input to come from a string rather than a file. (This is for the --c option and the "." and eval builtin commands.) The global -variable plinno is saved and restored when files are pushed and -popped from the stack. The parser routines store the number of -the current line in this variable. - -DEBUGGING: If DEBUG is defined in shell.h, then the shell will -write debugging information to the file $HOME/trace. Most of -this is done using the TRACE macro, which takes a set of printf -arguments inside two sets of parenthesis. Example: -"TRACE(("n=%d0, n))". The double parenthesis are necessary be- -cause the preprocessor can't handle functions with a variable -number of arguments. Defining DEBUG also causes the shell to -generate a core dump if it is sent a quit signal. The tracing -code is in show.c. |