diff options
author | Niklas Hallqvist <niklas@cvs.openbsd.org> | 1995-12-20 01:06:22 +0000 |
---|---|---|
committer | Niklas Hallqvist <niklas@cvs.openbsd.org> | 1995-12-20 01:06:22 +0000 |
commit | c482518380683ee38d14024c1e362a0d681cf967 (patch) | |
tree | e69b4f6d3fee3aced20a41f3fdf543fc1c77fb5d /gnu/usr.bin/gcc/PROJECTS | |
parent | 76a62188d0db49c65b696d474c855a799fd96dce (diff) |
FSF GCC version 2.7.2
Diffstat (limited to 'gnu/usr.bin/gcc/PROJECTS')
-rw-r--r-- | gnu/usr.bin/gcc/PROJECTS | 448 |
1 files changed, 448 insertions, 0 deletions
diff --git a/gnu/usr.bin/gcc/PROJECTS b/gnu/usr.bin/gcc/PROJECTS new file mode 100644 index 00000000000..ee9be0241ed --- /dev/null +++ b/gnu/usr.bin/gcc/PROJECTS @@ -0,0 +1,448 @@ +0. Improved efficiency. + +* Parse and output array initializers an element at a time, freeing +storage after each, instead of parsing the whole initializer first and +then outputting. This would reduce memory usage for large +initializers. + +* See if the techniques describe in Oct 1991 SIGPLAN Notices +(Frazer and Hanson) are applicable to GCC. + +1. Better optimization. + +* Constants in unused inline functions + +It would be nice to delay output of string constants so that string +constants mentioned in unused inline functions are never generated. +Perhaps this would also take care of string constants in dead code. + +The difficulty is in finding a clean way for the RTL which refers +to the constant (currently, only by an assembler symbol name) +to point to the constant and cause it to be output. + +* More cse + +The techniques for doing full global cse are described in the red +dragon book, or (a different version) in Frederick Chow's thesis from +Stanford. It is likely to be slow and use a lot of memory, but it +might be worth offering as an additional option. + +It is probably possible to extend cse to a few very frequent cases +without so much expense. + +For example, it is not very hard to handle cse through if-then +statements with no else clauses. Here's how to do it. On reaching a +label, notice that the label's use-count is 1 and that the last +preceding jump jumps conditionally to this label. Now you know it +is a simple if-then statement. Remove from the hash table +all the expressions that were entered since that jump insn +and you can continue with cse. + +It is probably not hard to handle cse from the end of a loop +around to the beginning, and a few loops would be greatly sped +up by this. + +* Optimize a sequence of if statements whose conditions are exclusive. + +It is possible to optimize + + if (x == 1) ...; + if (x == 2) ...; + if (x == 3) ...; + +into + + if (x == 1) ...; + else if (x == 2) ...; + else if (x == 3) ...; + +provided that x is not altered by the contents of the if statements. + +It's not certain whether this is worth doing. Perhaps programmers +nearly always write the else's themselves, leaving few opportunities +to improve anything. + +* Un-cse. + +Perhaps we should have an un-cse step right after cse, which tries to +replace a reg with its value if the value can be substituted for the +reg everywhere, if that looks like an improvement. Which is if the +reg is used only a few times. Use rtx_cost to determine if the +change is really an improvement. + +* Clean up how cse works. + +The scheme is that each value has just one hash entry. The +first_same_value and next_same_value chains are no longer needed. + +For arithmetic, each hash table elt has the following slots: + +* Operation. This is an rtx code. +* Mode. +* Operands 0, 1 and 2. These point to other hash table elements. + +So, if we want to enter (PLUS:SI (REG:SI 30) (CONST_INT 104)), we +first enter (CONST_INT 104) and find the entry that (REG:SI 30) now +points to. Then we put these elts into operands 0 and 1 of a new elt. +We put PLUS and SI into the new elt. + +Registers and mem refs would never be entered into the table as such. +However, the values they contain would be entered. There would be a +table indexed by regno which points at the hash entry for the value in +that reg. + +The hash entry index now plays the role of a qty number. +We still need qty_first_reg, reg_next_eqv, etc. to record which regs +share a particular qty. + +When a reg is used whose contents are unknown, we need to create a +hash table entry whose contents say "unknown", as a place holder for +whatever the reg contains. If that reg is added to something, then +the hash entry for the sum will refer to the "unknown" entry. Use +UNKNOWN for the rtx code in this entry. This replaces make_new_qty. + +For a constant, a unique hash entry would be made based on the +value of the constant. + +What about MEM? Each time a memory address is referenced, we need a +qty (a hash table elt) to represent what is in it. (Just as for a +register.) If this isn't known, create one, just as for a reg whose +contents are unknown. + +We need a way to find all mem refs that still contain a certain value. +Do this with a chain of hash elts (for memory addresses) that point to +locations that hold the value. The hash elt for the value itself should +point to the start of the chain. It would be good for the hash elt +for an address to point to the hash elt for the contents of that address +(but this ptr can be null if the contents have never been entered). + +With this data structure, nothing need ever be invalidated except +the lists of which regs or mems hold a particular value. It is easy +to see if there is a reg or mem that is equiv to a particular value. +If the value is constant, it is always explicitly constant. + +* Support more general tail-recursion among different functions. + +This might be possible under certain circumstances, such as when +the argument lists of the functions have the same lengths. +Perhaps it could be done with a special declaration. + +You would need to verify in the calling function that it does not +use the addresses of any local variables and does not use setjmp. + +* Put short statics vars at low addresses and use short addressing mode? + +Useful on the 68000/68020 and perhaps on the 32000 series, +provided one has a linker that works with the feature. +This is said to make a 15% speedup on the 68000. + +* Keep global variables in registers. + +Here is a scheme for doing this. A global variable, or a local variable +whose address is taken, can be kept in a register for an entire function +if it does not use non-constant memory addresses and (for globals only) +does not call other functions. If the entire function does not meet +this criterion, a loop may. + +The VAR_DECL for such a variable would have to have two RTL expressions: +the true home in memory, and the pseudo-register used temporarily. +It is necessary to emit insns to copy the memory location into the +pseudo-register at the beginning of the function or loop, and perhaps +back out at the end. These insns should have REG_EQUIV notes so that, +if the pseudo-register does not get a hard register, it is spilled into +the memory location which exists in any case. + +The easiest way to set up these insns is to modify the routine +put_var_into_stack so that it does not apply to the entire function +(sparing any loops which contain nothing dangerous) and to call it at +the end of the function regardless of where in the function the +address of a local variable is taken. It would be called +unconditionally at the end of the function for all relevant global +variables. + +For debugger output, the thing to do is to invent a new binding level +around the appropriate loop and define the variable name as a register +variable with that scope. + +* Live-range splitting. + +Currently a variable is allocated a hard register either for the full +extent of its use or not at all. Sometimes it would be good to +allocate a variable a hard register for just part of a function; for +example, through a particular loop where the variable is mostly used, +or outside of a particular loop where the variable is not used. (The +latter is nice because it might let the variable be in a register most +of the time even though the loop needs all the registers.) + +It might not be very hard to do this in global.c when a variable +fails to get a hard register for its entire life span. + +The first step is to find a loop in which the variable is live, but +which is not the whole life span or nearly so. It's probably best to +use a loop in which the variable is heavily used. + +Then create a new pseudo-register to represent the variable in that loop. +Substitute this for the old pseudo-register there, and insert move insns +to copy between the two at the loop entry and all exits. (When several +such moves are inserted at the same place, some new feature should be +added to say that none of those registers conflict merely because of +overlap between the new moves. And the reload pass should reorder them +so that a store precedes a load, for any given hard register.) + +After doing this for all the reasonable candidates, run global-alloc +over again. With luck, one of the two pseudo-registers will be fit +somewhere. It may even have a much higher priority due to its reduced +life span. + +There will be no room in general for the new pseudo-registers in +basic_block_live_at_start, so there will need to be a second such +matrix exclusively for the new ones. Various other vectors indexed by +register number will have to be made bigger, or there will have to be +secondary extender vectors just for global-alloc. + +A simple new feature could arrange that both pseudo-registers get the +same stack slot if they both fail to get hard registers. + +Other compilers split live ranges when they are not connected, or +try to split off pieces `at the edge'. I think splitting around loops +will provide more speedup. + +Creating a fake binding block and a new like-named variable with +shorter life span and different address might succeed in describing +this technique for the debugger. + +* Detect dead stores into memory? + +A store into memory is dead if it is followed by another store into +the same location; and, in between, there is no reference to anything +that might be that location (including no reference to a variable +address). + +* Loop optimization. + +Strength reduction and iteration variable elimination could be +smarter. They should know how to decide which iteration variables are +not worth making explicit because they can be computed as part of an +address calculation. Based on this information, they should decide +when it is desirable to eliminate one iteration variable and create +another in its place. + +It should be possible to compute what the value of an iteration +variable will be at the end of the loop, and eliminate the variable +within the loop by computing that value at the loop end. + +When a loop has a simple increment that adds 1, +instead of jumping in after the increment, +decrement the loop count and jump to the increment. +This allows aob insns to be used. + +* Using constraints on values. + +Many operations could be simplified based on knowledge of the +minimum and maximum possible values of a register at any particular time. +These limits could come from the data types in the tree, via rtl generation, +or they can be deduced from operations that are performed. For example, +the result of an `and' operation one of whose operands is 7 must be in +the range 0 to 7. Compare instructions also tell something about the +possible values of the operand, in the code beyond the test. + +Value constraints can be used to determine the results of a further +comparison. They can also indicate that certain `and' operations are +redundant. Constraints might permit a decrement and branch +instruction that checks zeroness to be used when the user has +specified to exit if negative. + +* Smarter reload pass. + +The reload pass as currently written can reload values only into registers +that are reserved for reloading. This means that in order to use a +register for reloading it must spill everything out of that register. + +It would be straightforward, though complicated, for reload1.c to keep +track, during its scan, of which hard registers were available at each +point in the function, and use for reloading even registers that were +free only at the point they were needed. This would avoid much spilling +and make better code. + +* Change the type of a variable. + +Sometimes a variable is declared as `int', it is assigned only once +from a value of type `char', and then it is used only by comparison +against constants. On many machines, better code would result if +the variable had type `char'. If the compiler could detect this +case, it could change the declaration of the variable and change +all the places that use it. + +* Better handling for very sparse switches. + +There may be cases where it would be better to compile a switch +statement to use a fixed hash table rather than the current +combination of jump tables and binary search. + +* Order of subexpressions. + +It might be possible to make better code by paying attention +to the order in which to generate code for subexpressions of an expression. + +* More code motion. + +Consider hoisting common code up past conditional branches or +tablejumps. + +* Trace scheduling. + +This technique is said to be able to figure out which way a jump +will usually go, and rearrange the code to make that path the +faster one. + +* Distributive law. + +The C expression *(X + 4 * (Y + C)) compiles better on certain +machines if rewritten as *(X + 4*C + 4*Y) because of known addressing +modes. It may be tricky to determine when, and for which machines, to +use each alternative. + +Some work has been done on this, in combine.c. + +* Can optimize by changing if (x) y; else z; into z; if (x) y; +if z and x do not interfere and z has no effects not undone by y. +This is desirable if z is faster than jumping. + +* For a two-insn loop on the 68020, such as + foo: movb a2@+,a3@+ + jne foo +it is better to insert dbeq d0,foo before the jne. +d0 can be a junk register. The challenge is to fit this into +a portable framework: when can you detect this situation and +still be able to allocate a junk register? + +2. Simpler porting. + +Right now, describing the target machine's instructions is done +cleanly, but describing its addressing mode is done with several +ad-hoc macro definitions. Porting would be much easier if there were +an RTL description for addressing modes like that for instructions. +Tools analogous to genflags and genrecog would generate macros from +this description. + +There would be one pattern in the address-description file for each +kind of addressing, and this pattern would have: + + * the RTL expression for the address + * C code to verify its validity (since that may depend on + the exact data). + * C code to print the address in assembler language. + * C code to convert the address into a valid one, if it is not valid. + (This would replace LEGITIMIZE_ADDRESS). + * Register constraints for all indeterminates that appear + in the RTL expression. + +3. Other languages. + +Front ends for Pascal, Fortran, Algol, Cobol, Modula-2 and Ada are +desirable. + +Pascal, Modula-2 and Ada require the implementation of functions +within functions. Some of the mechanisms for this already exist. + +4. More extensions. + +* Generated unique labels. Have some way of generating distinct labels +for use in extended asm statements. I don't know what a good syntax would +be. + +* A way of defining a structure containing a union, in which the choice of +union alternative is controlled by a previous structure component. + +Here is a possible syntax for this. + +struct foo { + enum { INT, DOUBLE } code; + auto union { case INT: int i; case DOUBLE: double d;} value : code; +}; + +* Allow constructor expressions as lvalues, like this: + + (struct foo) {a, b, c} = foo(); + +This would call foo, which returns a structure, and then store the +several components of the structure into the variables a, b, and c. + +5. Generalize the machine model. + +* Some new compiler features may be needed to do a good job on machines +where static data needs to be addressed using base registers. + +* Some machines have two stacks in different areas of memory, one used +for scalars and another for large objects. The compiler does not +now have a way to understand this. + +6. Useful warnings. + +* Warn about statements that are undefined because the order of +evaluation of increment operators makes a big difference. Here is an +example: + + *foo++ = hack (*foo); + +7. Better documentation of how GCC works and how to port it. + +Here is an outline proposed by Allan Adler. + +I. Overview of this document +II. The machines on which GCC is implemented + A. Prose description of those characteristics of target machines and + their operating systems which are pertinent to the implementation + of GCC. + i. target machine characteristics + ii. comparison of this system of machine characteristics with + other systems of machine specification currently in use + B. Tables of the characteristics of the target machines on which + GCC is implemented. + C. A priori restrictions on the values of characteristics of target + machines, with special reference to those parts of the source code + which entail those restrictions + i. restrictions on individual characteristics + ii. restrictions involving relations between various characteristics + D. The use of GCC as a cross-compiler + i. cross-compilation to existing machines + ii. cross-compilation to non-existent machines + E. Assumptions which are made regarding the target machine + i. assumptions regarding the architecture of the target machine + ii. assumptions regarding the operating system of the target machine + iii. assumptions regarding software resident on the target machine + iv. where in the source code these assumptions are in effect made +III. A systematic approach to writing the files tm.h and xm.h + A. Macros which require special care or skill + B. Examples, with special reference to the underlying reasoning +IV. A systematic approach to writing the machine description file md + A. Minimal viable sets of insn descriptions + B. Examples, with special reference to the underlying reasoning +V. Uses of the file aux-output.c +VI. Specification of what constitutes correct performance of an + implementation of GCC + A. The components of GCC + B. The itinerary of a C program through GCC + C. A system of benchmark programs + D. What your RTL and assembler should look like with these benchmarks + E. Fine tuning for speed and size of compiled code +VII. A systematic procedure for debugging an implementation of GCC + A. Use of GDB + i. the macros in the file .gdbinit for GCC + ii. obstacles to the use of GDB + a. functions implemented as macros can't be called in GDB + B. Debugging without GDB + i. How to turn off the normal operation of GCC and access specific + parts of GCC + C. Debugging tools + D. Debugging the parser + i. how machine macros and insn definitions affect the parser + E. Debugging the recognizer + i. how machine macros and insn definitions affect the recognizer + +ditto for other components + +VIII. Data types used by GCC, with special reference to restrictions not + specified in the formal definition of the data type +IX. References to the literature for the algorithms used in GCC + |