src - OpenBSD base system

diff options


context:
space:
mode:

author	Niklas Hallqvist <niklas@cvs.openbsd.org>	1995-12-20 01:06:22 +0000
committer	Niklas Hallqvist <niklas@cvs.openbsd.org>	1995-12-20 01:06:22 +0000
commit	c482518380683ee38d14024c1e362a0d681cf967 (patch)
tree	e69b4f6d3fee3aced20a41f3fdf543fc1c77fb5d /gnu/usr.bin/gcc/PROJECTS
parent	76a62188d0db49c65b696d474c855a799fd96dce (diff)

FSF GCC version 2.7.2

Diffstat (limited to 'gnu/usr.bin/gcc/PROJECTS')

-rw-r--r--

gnu/usr.bin/gcc/PROJECTS

448

1 files changed, 448 insertions, 0 deletions

diff --git a/gnu/usr.bin/gcc/PROJECTS b/gnu/usr.bin/gcc/PROJECTS
new file mode 100644
index 00000000000..ee9be0241ed
--- /dev/null
+++ b/gnu/usr.bin/gcc/PROJECTS

@@ -0,0 +1,448 @@

+0. Improved efficiency.

+* Parse and output array initializers an element at a time, freeing

+storage after each, instead of parsing the whole initializer first and

+then outputting. This would reduce memory usage for large

+initializers.

+* See if the techniques describe in Oct 1991 SIGPLAN Notices

+(Frazer and Hanson) are applicable to GCC.

+1. Better optimization.

+* Constants in unused inline functions

+It would be nice to delay output of string constants so that string

+constants mentioned in unused inline functions are never generated.

+Perhaps this would also take care of string constants in dead code.

+The difficulty is in finding a clean way for the RTL which refers

+to the constant (currently, only by an assembler symbol name)

+to point to the constant and cause it to be output.

+* More cse

+The techniques for doing full global cse are described in the red

+dragon book, or (a different version) in Frederick Chow's thesis from

+Stanford. It is likely to be slow and use a lot of memory, but it

+might be worth offering as an additional option.

+It is probably possible to extend cse to a few very frequent cases

+without so much expense.

+For example, it is not very hard to handle cse through if-then

+statements with no else clauses. Here's how to do it. On reaching a

+label, notice that the label's use-count is 1 and that the last

+preceding jump jumps conditionally to this label. Now you know it

+is a simple if-then statement. Remove from the hash table

+all the expressions that were entered since that jump insn

+and you can continue with cse.

+It is probably not hard to handle cse from the end of a loop

+around to the beginning, and a few loops would be greatly sped

+up by this.

+* Optimize a sequence of if statements whose conditions are exclusive.

+It is possible to optimize

+ if (x == 1) ...;

+ if (x == 2) ...;

+ if (x == 3) ...;

+into

+ if (x == 1) ...;

+ else if (x == 2) ...;

+ else if (x == 3) ...;

+provided that x is not altered by the contents of the if statements.

+It's not certain whether this is worth doing. Perhaps programmers

+nearly always write the else's themselves, leaving few opportunities

+to improve anything.

+* Un-cse.

+Perhaps we should have an un-cse step right after cse, which tries to

+replace a reg with its value if the value can be substituted for the

+reg everywhere, if that looks like an improvement. Which is if the

+reg is used only a few times. Use rtx_cost to determine if the

+change is really an improvement.

+* Clean up how cse works.

+The scheme is that each value has just one hash entry. The

+first_same_value and next_same_value chains are no longer needed.

+For arithmetic, each hash table elt has the following slots:

+* Operation. This is an rtx code.

+* Mode.

+* Operands 0, 1 and 2. These point to other hash table elements.

+So, if we want to enter (PLUS:SI (REG:SI 30) (CONST_INT 104)), we

+first enter (CONST_INT 104) and find the entry that (REG:SI 30) now

+points to. Then we put these elts into operands 0 and 1 of a new elt.

+We put PLUS and SI into the new elt.

+Registers and mem refs would never be entered into the table as such.

+However, the values they contain would be entered. There would be a

+table indexed by regno which points at the hash entry for the value in

+that reg.

+The hash entry index now plays the role of a qty number.

+We still need qty_first_reg, reg_next_eqv, etc. to record which regs

+share a particular qty.

+When a reg is used whose contents are unknown, we need to create a

+hash table entry whose contents say "unknown", as a place holder for

+whatever the reg contains. If that reg is added to something, then

+the hash entry for the sum will refer to the "unknown" entry. Use

+UNKNOWN for the rtx code in this entry. This replaces make_new_qty.

+For a constant, a unique hash entry would be made based on the

+value of the constant.

+What about MEM? Each time a memory address is referenced, we need a

+qty (a hash table elt) to represent what is in it. (Just as for a

+register.) If this isn't known, create one, just as for a reg whose

+contents are unknown.

+We need a way to find all mem refs that still contain a certain value.

+Do this with a chain of hash elts (for memory addresses) that point to

+locations that hold the value. The hash elt for the value itself should

+point to the start of the chain. It would be good for the hash elt

+for an address to point to the hash elt for the contents of that address

+(but this ptr can be null if the contents have never been entered).

+With this data structure, nothing need ever be invalidated except

+the lists of which regs or mems hold a particular value. It is easy

+to see if there is a reg or mem that is equiv to a particular value.

+If the value is constant, it is always explicitly constant.

+* Support more general tail-recursion among different functions.

+This might be possible under certain circumstances, such as when

+the argument lists of the functions have the same lengths.

+Perhaps it could be done with a special declaration.

+You would need to verify in the calling function that it does not

+use the addresses of any local variables and does not use setjmp.

+* Put short statics vars at low addresses and use short addressing mode?

+Useful on the 68000/68020 and perhaps on the 32000 series,

+provided one has a linker that works with the feature.

+This is said to make a 15% speedup on the 68000.

+* Keep global variables in registers.

+Here is a scheme for doing this. A global variable, or a local variable

+whose address is taken, can be kept in a register for an entire function

+if it does not use non-constant memory addresses and (for globals only)

+does not call other functions. If the entire function does not meet

+this criterion, a loop may.

+The VAR_DECL for such a variable would have to have two RTL expressions:

+the true home in memory, and the pseudo-register used temporarily.

+It is necessary to emit insns to copy the memory location into the

+pseudo-register at the beginning of the function or loop, and perhaps

+back out at the end. These insns should have REG_EQUIV notes so that,

+if the pseudo-register does not get a hard register, it is spilled into

+the memory location which exists in any case.

+The easiest way to set up these insns is to modify the routine

+put_var_into_stack so that it does not apply to the entire function

+(sparing any loops which contain nothing dangerous) and to call it at

+the end of the function regardless of where in the function the

+address of a local variable is taken. It would be called

+unconditionally at the end of the function for all relevant global

+variables.

+For debugger output, the thing to do is to invent a new binding level

+around the appropriate loop and define the variable name as a register

+variable with that scope.

+* Live-range splitting.

+Currently a variable is allocated a hard register either for the full

+extent of its use or not at all. Sometimes it would be good to

+allocate a variable a hard register for just part of a function; for

+example, through a particular loop where the variable is mostly used,

+or outside of a particular loop where the variable is not used. (The

+latter is nice because it might let the variable be in a register most

+of the time even though the loop needs all the registers.)

+It might not be very hard to do this in global.c when a variable

+fails to get a hard register for its entire life span.

+The first step is to find a loop in which the variable is live, but

+which is not the whole life span or nearly so. It's probably best to

+use a loop in which the variable is heavily used.

+Then create a new pseudo-register to represent the variable in that loop.

+Substitute this for the old pseudo-register there, and insert move insns

+to copy between the two at the loop entry and all exits. (When several

+such moves are inserted at the same place, some new feature should be

+added to say that none of those registers conflict merely because of

+overlap between the new moves. And the reload pass should reorder them

+so that a store precedes a load, for any given hard register.)

+After doing this for all the reasonable candidates, run global-alloc

+over again. With luck, one of the two pseudo-registers will be fit

+somewhere. It may even have a much higher priority due to its reduced

+life span.

+There will be no room in general for the new pseudo-registers in

+basic_block_live_at_start, so there will need to be a second such

+matrix exclusively for the new ones. Various other vectors indexed by

+register number will have to be made bigger, or there will have to be

+secondary extender vectors just for global-alloc.

+A simple new feature could arrange that both pseudo-registers get the

+same stack slot if they both fail to get hard registers.

+Other compilers split live ranges when they are not connected, or

+try to split off pieces `at the edge'. I think splitting around loops

+will provide more speedup.

+Creating a fake binding block and a new like-named variable with

+shorter life span and different address might succeed in describing

+this technique for the debugger.

+* Detect dead stores into memory?

+A store into memory is dead if it is followed by another store into

+the same location; and, in between, there is no reference to anything

+that might be that location (including no reference to a variable

+address).

+* Loop optimization.

+Strength reduction and iteration variable elimination could be

+smarter. They should know how to decide which iteration variables are

+not worth making explicit because they can be computed as part of an

+address calculation. Based on this information, they should decide

+when it is desirable to eliminate one iteration variable and create

+another in its place.

+It should be possible to compute what the value of an iteration

+variable will be at the end of the loop, and eliminate the variable

+within the loop by computing that value at the loop end.

+When a loop has a simple increment that adds 1,

+instead of jumping in after the increment,

+decrement the loop count and jump to the increment.

+This allows aob insns to be used.

+* Using constraints on values.

+Many operations could be simplified based on knowledge of the

+minimum and maximum possible values of a register at any particular time.

+These limits could come from the data types in the tree, via rtl generation,

+or they can be deduced from operations that are performed. For example,

+the result of an `and' operation one of whose operands is 7 must be in

+the range 0 to 7. Compare instructions also tell something about the

+possible values of the operand, in the code beyond the test.

+Value constraints can be used to determine the results of a further

+comparison. They can also indicate that certain `and' operations are

+redundant. Constraints might permit a decrement and branch

+instruction that checks zeroness to be used when the user has

+specified to exit if negative.

+* Smarter reload pass.

+The reload pass as currently written can reload values only into registers

+that are reserved for reloading. This means that in order to use a

+register for reloading it must spill everything out of that register.

+It would be straightforward, though complicated, for reload1.c to keep

+track, during its scan, of which hard registers were available at each

+point in the function, and use for reloading even registers that were

+free only at the point they were needed. This would avoid much spilling

+and make better code.

+* Change the type of a variable.

+Sometimes a variable is declared as `int', it is assigned only once

+from a value of type `char', and then it is used only by comparison

+against constants. On many machines, better code would result if

+the variable had type `char'. If the compiler could detect this

+case, it could change the declaration of the variable and change

+all the places that use it.

+* Better handling for very sparse switches.

+There may be cases where it would be better to compile a switch

+statement to use a fixed hash table rather than the current

+combination of jump tables and binary search.

+* Order of subexpressions.

+It might be possible to make better code by paying attention

+to the order in which to generate code for subexpressions of an expression.

+* More code motion.

+Consider hoisting common code up past conditional branches or

+tablejumps.

+* Trace scheduling.

+This technique is said to be able to figure out which way a jump

+will usually go, and rearrange the code to make that path the

+faster one.

+* Distributive law.

+The C expression *(X + 4 * (Y + C)) compiles better on certain

+machines if rewritten as *(X + 4*C + 4*Y) because of known addressing

+modes. It may be tricky to determine when, and for which machines, to

+use each alternative.

+Some work has been done on this, in combine.c.

+* Can optimize by changing if (x) y; else z; into z; if (x) y;

+if z and x do not interfere and z has no effects not undone by y.

+This is desirable if z is faster than jumping.

+* For a two-insn loop on the 68020, such as

+ foo: movb a2@+,a3@+

+ jne foo

+it is better to insert dbeq d0,foo before the jne.

+d0 can be a junk register. The challenge is to fit this into

+a portable framework: when can you detect this situation and

+still be able to allocate a junk register?

+2. Simpler porting.

+Right now, describing the target machine's instructions is done

+cleanly, but describing its addressing mode is done with several

+ad-hoc macro definitions. Porting would be much easier if there were

+an RTL description for addressing modes like that for instructions.

+Tools analogous to genflags and genrecog would generate macros from

+this description.

+There would be one pattern in the address-description file for each

+kind of addressing, and this pattern would have:

+ * the RTL expression for the address

+ * C code to verify its validity (since that may depend on

+ the exact data).

+ * C code to print the address in assembler language.

+ * C code to convert the address into a valid one, if it is not valid.

+ (This would replace LEGITIMIZE_ADDRESS).

+ * Register constraints for all indeterminates that appear

+ in the RTL expression.

+3. Other languages.

+Front ends for Pascal, Fortran, Algol, Cobol, Modula-2 and Ada are

+desirable.

+Pascal, Modula-2 and Ada require the implementation of functions

+within functions. Some of the mechanisms for this already exist.

+4. More extensions.

+* Generated unique labels. Have some way of generating distinct labels

+for use in extended asm statements. I don't know what a good syntax would

+be.

+* A way of defining a structure containing a union, in which the choice of

+union alternative is controlled by a previous structure component.

+Here is a possible syntax for this.

+struct foo {

+ enum { INT, DOUBLE } code;

+ auto union { case INT: int i; case DOUBLE: double d;} value : code;

+};

+* Allow constructor expressions as lvalues, like this:

+ (struct foo) {a, b, c} = foo();

+This would call foo, which returns a structure, and then store the

+several components of the structure into the variables a, b, and c.

+5. Generalize the machine model.

+* Some new compiler features may be needed to do a good job on machines

+where static data needs to be addressed using base registers.

+* Some machines have two stacks in different areas of memory, one used

+for scalars and another for large objects. The compiler does not

+now have a way to understand this.

+6. Useful warnings.

+* Warn about statements that are undefined because the order of

+evaluation of increment operators makes a big difference. Here is an

+example:

+ *foo++ = hack (*foo);

+7. Better documentation of how GCC works and how to port it.

+Here is an outline proposed by Allan Adler.

+I. Overview of this document

+II. The machines on which GCC is implemented

+ A. Prose description of those characteristics of target machines and

+ their operating systems which are pertinent to the implementation

+ of GCC.

+ i. target machine characteristics

+ ii. comparison of this system of machine characteristics with

+ other systems of machine specification currently in use

+ B. Tables of the characteristics of the target machines on which

+ GCC is implemented.

+ C. A priori restrictions on the values of characteristics of target

+ machines, with special reference to those parts of the source code

+ which entail those restrictions

+ i. restrictions on individual characteristics

+ ii. restrictions involving relations between various characteristics

+ D. The use of GCC as a cross-compiler

+ i. cross-compilation to existing machines

+ ii. cross-compilation to non-existent machines

+ E. Assumptions which are made regarding the target machine

+ i. assumptions regarding the architecture of the target machine

+ ii. assumptions regarding the operating system of the target machine

+ iii. assumptions regarding software resident on the target machine

+ iv. where in the source code these assumptions are in effect made

+III. A systematic approach to writing the files tm.h and xm.h

+ A. Macros which require special care or skill

+ B. Examples, with special reference to the underlying reasoning

+IV. A systematic approach to writing the machine description file md

+ A. Minimal viable sets of insn descriptions

+ B. Examples, with special reference to the underlying reasoning

+V. Uses of the file aux-output.c

+VI. Specification of what constitutes correct performance of an

+ implementation of GCC

+ A. The components of GCC

+ B. The itinerary of a C program through GCC

+ C. A system of benchmark programs

+ D. What your RTL and assembler should look like with these benchmarks

+ E. Fine tuning for speed and size of compiled code

+VII. A systematic procedure for debugging an implementation of GCC

+ A. Use of GDB

+ i. the macros in the file .gdbinit for GCC

+ ii. obstacles to the use of GDB

+ a. functions implemented as macros can't be called in GDB

+ B. Debugging without GDB

+ i. How to turn off the normal operation of GCC and access specific

+ parts of GCC

+ C. Debugging tools

+ D. Debugging the parser

+ i. how machine macros and insn definitions affect the parser

+ E. Debugging the recognizer

+ i. how machine macros and insn definitions affect the recognizer

+ditto for other components

+VIII. Data types used by GCC, with special reference to restrictions not

+ specified in the formal definition of the data type

+IX. References to the literature for the algorithms used in GCC