summaryrefslogtreecommitdiff
path: root/usr.bin/make/generate.c
AgeCommit message (Collapse)Author
2002-06-11This is the first step in sanitizing the conditional parser.Marc Espie
Change the conditional recognition algorithm: scan for a sequence of alphabetic characters, hash it, and compare it against a small table (using ohash functions). This makes Cond_Eval entry more logical, and allows for some shortcuts in recognizing .include, .for, .undef. This also means that conditionals must have an intervening blank between the keyword and the actual test, e.g., .ifA will no longer work. (but no-one actually uses this, and it's highly obfuscated) Okay miod@.
2001-05-23Mostly clean-up:Marc Espie
- cut up those huge include files into separate interfaces for all modules. Put the interface documentation there, and not with the implementation. - light-weight includes for needed concrete types (lst_t.h, timestamp_t.h). - cut out some more logically separate parts: cmd_exec, varname, parsevar, timestamp. - put all error handling functions together, so that we will be able to clean them up. - more systematic naming: functioni to handle interval, function to handle string. - put the init/end code apart to minimize coupling. - kill weird types like ReturnStatus and Boolean. Use standard bool (with a fallback for non-iso systems) - better interface documentation for lots of subsystems. As a result, make compilation goes somewhat faster (5%, even considering the largish BSD copyrights to read). The corresponding preprocessed source goes down from 1,5M to 1M. A few minor code changes as well: Parse_DoVar is no longer destructive. Parse_IsVar functionality is folded into Parse_DoVar (as it knows what an assignment is), a few more interval handling functions. Avoid calling XXX_End when they do nothing, just #define XXX_End to nothing. Parse_DoVar is slightly more general: it will handle compound assignments as long as they make sense, e.g., VAR +!= cmd will work. As a side effect, VAR++=value now triggers an error (two + in assignment). - this stuff doesn't occur in portable Makefiles. - writing VAR++ = value or VAR+ +=value disambiguates it. - this is a good thing, it uncovered a bug in bsd.port.mk. Tested by naddy@. Okayed millert@. I'll handle the fallback if there is any. This went through a full make build anyways, including isakmpd (without mickey's custom binutils, as he didn't see fit to share it with me).
2001-05-03Synch with my current work.Marc Espie
Numerous changes: - generate can build several tables - style cleanup - statistics code - use variable names throughout (struct Name) - recursive variables everywhere - faster parser (pass buffer along instead of allocating multiple copies) - correct parser. Handles comments everywhere, and ; correctly - more string intervals - simplified dir.c, less recursion. - extended for loops - sinclude() - finished removing extra junk from Lst_* - handles ${@D} and friends in a simpler way - cleaned up and modular VarModifiers handling. - recognizes some gnu Makefile usages and errors out about them. Additionally, some extra functionality is defined by FEATURES. The set of functionalities is currently hardcoded to OpenBSD defaults, but this may include support for some NetBSD extensions, like ODE modifiers. Backed by miod@ and millert@, who finally got sick of my endless patches...
2001-03-02Use the ohash_* that's now in libc.Marc Espie
2000-06-23This is the speed-up patch, which doubles make speed (almost).Marc Espie
Use the open hashing functions for global contexts instead of List in var.c. All the preliminary work to trim down local contexts means that we don't suffer from the heavy initialization work that a hash table entails. There is some make kludgery to: - build the hashing functions as a library, - recreate hashconsts.h, even if make depend was not invoked. One point of the hashing scheme written was to separate the computation of the hash function, and the hash lookup itself. This is very convenient for make, because of those pesky special variables. hashconsts.h is there to pre-hash the correct values, which replaces a few expensive string comparisons with quick hash value comparisons, followed by one expensive string comparison. The modulus MAGICSLOTS chosen in the Makefile is ad-hoc: it is small enough to write a small switch without collision, and will need changing if the hash function changes... The function quick_lookup is the most important: it either returns an index, for a local variable, or it does compute a hashing value, and returns -1. Another somewhat controversial decision is the use of string intervals. This avoids either copying a string, or twiddling with a byte for cases such as ${VAR}. Finally, the variable name is stored within the variable itself. Since a given variable name never changes, this makes sense. All that was needed was a hash library with support for this. Note that the hashing table holds only a variable pointer AND the corresponding hashing value, WITHOUT a modulo hashtablesize. Two reasons: - hash resizes can be done faster, without having to recompute hashing values. - locality of access. The hash table fits into memory without problem. Once a candidate slot is found, we check the complete hashing value. Probability of a collision is very small (32 bits...). So bringing up the whole variable in memory at once is good: the name will almost always match, in which case we want the variable value as well, so it makes sense to put them together. The ohash functions implement open hashing, as described in Knuth, but with a variable table size. Choosing powers of 2 sizes does not yield more collisions, but it makes the hashing scheme much simpler. The thresholds at which to expand/shrink the tables seem to work well in practice. The default sizes were chosen such that the tables hardly ever shrink or expand anyways (though I've tried with smaller/larger sizes to verify that the shrinking/expanding worked correctly): larger Makefiles hold roughly 500/600 variables, which fits without trouble into a 1024-sized variable. Disregard #ifdef STATS_HASH, this is some internal scaffolding I'm using to measure make performance. The only known issue with open-hashing is that deletions cannot create empty slots, but do leave slots marked as `occupied once' so that lookup works. We use a well-known optimization which records those pseudo-empty slots while looking up values. If the value is not found, the pseudo-empty slot is returned to be filled. If the value is found, it is swapped with the pseudo-empty slot. This is an improvement in both cases, since this shortens the length of lookup chains, eventually pushing the pseudo-empty slots to the end. Reviewed by millert@ and miod@