.\" $OpenBSD: lint.ms,v 1.1 2003/06/26 16:30:10 mickey Exp $
.\"
.\" Copyright (C) Caldera International Inc. 2001-2002.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code and documentation must retain the above
.\" copyright notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\" must display the following acknowledgement:
.\" This product includes software developed or owned by Caldera
.\" International, Inc.
.\" 4. Neither the name of Caldera International, Inc. nor the names of other
.\" contributors may be used to endorse or promote products derived from
.\" this software without specific prior written permission.
.\"
.\" USE OF THE SOFTWARE PROVIDED FOR UNDER THIS LICENSE BY CALDERA
.\" INTERNATIONAL, INC. AND CONTRIBUTORS ``AS IS'' AND ANY EXPRESS OR
.\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
.\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
.\" IN NO EVENT SHALL CALDERA INTERNATIONAL, INC. BE LIABLE FOR ANY DIRECT,
.\" INDIRECT INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
.\" (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
.\" SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
.\" STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
.\" IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
.\" POSSIBILITY OF SUCH DAMAGE.
.\"
.\" @(#)lint.ms 6.2 (Berkeley) 4/17/91
.\"
.EH 'PS1:9-%''Lint, a C Program Checker'
.OH 'Lint, a C Program Checker''PS1:9-%'
.\".RP
.ND "July 26, 1978"
.OK
.\"Program Portability
.\"Strong Type Checking
.TL
Lint, a C Program Checker
.AU "MH 2C-559" 3968
S. C. Johnson
.AI
.MH
.AB
.PP
.I Lint
is a command which examines C source programs,
detecting
a number of bugs and obscurities.
It enforces the type rules of C more strictly than
the C compilers.
It may also be used to enforce a number of portability
restrictions involved in moving
programs between different machines and/or operating systems.
Another option detects a number of wasteful, or error prone, constructions
which nevertheless are, strictly speaking, legal.
.PP
.I Lint
accepts multiple input files and library specifications, and checks them for consistency.
.PP
The separation of function between
.I lint
and the C compilers has both historical and practical
rationale.
The compilers turn C programs into executable files rapidly
and efficiently.
This is possible in part because the
compilers do not do sophisticated
type checking, especially between
separately compiled programs.
.I Lint
takes a more global, leisurely view of the program,
looking much more carefully at the compatibilities.
.PP
This document discusses the use of
.I lint ,
gives an overview of the implementation, and gives some hints on the
writing of machine independent C code.
.AE
.CS 10 2 12 0 0 5
.SH
Introduction and Usage
.PP
Suppose there are two C
.[
Kernighan Ritchie Programming Prentice 1978
.]
source files,
.I file1.c
and
.I file2.c ,
which are ordinarily compiled and loaded together.
Then the command
.DS
lint file1.c file2.c
.DE
produces messages describing inconsistencies and inefficiencies
in the programs.
The program enforces the typing rules of C
more strictly than the C compilers
(for both historical and practical reasons)
enforce them.
The command
.DS
lint \-p file1.c file2.c
.DE
will produce, in addition to the above messages, further messages
which relate to the portability of the programs to other operating
systems and machines.
Replacing the
.B \-p
by
.B \-h
will produce messages about various error-prone or wasteful constructions
which, strictly speaking, are not bugs.
Saying
.B \-hp
gets the whole works.
.PP
The next several sections describe the major messages;
the document closes with sections
discussing the implementation and giving suggestions
for writing portable C.
An appendix gives a summary of the
.I lint
options.
.SH
A Word About Philosophy
.PP
Many of the facts which
.I lint
needs may be impossible to
discover.
For example, whether a given function in a program ever gets called
may depend on the input data.
Deciding whether
.I exit
is ever called is equivalent to solving the famous ``halting problem,'' known to be
recursively undecidable.
.PP
Thus, most of the
.I lint
algorithms are a compromise.
If a function is never mentioned, it can never be called.
If a function is mentioned,
.I lint
assumes it can be called; this is not necessarily so, but in practice is quite reasonable.
.PP
.I Lint
tries to give information with a high degree of relevance.
Messages of the form ``\fIxxx\fR might be a bug''
are easy to generate, but are acceptable only in proportion
to the fraction of real bugs they uncover.
If this fraction of real bugs is too small, the messages lose their credibility
and serve merely to clutter up the output,
obscuring the more important messages.
.PP
Keeping these issues in mind, we now consider in more detail
the classes of messages which
.I lint
produces.
.SH
Unused Variables and Functions
.PP
As sets of programs evolve and develop,
previously used variables and arguments to
functions may become unused;
it is not uncommon for external variables, or even entire
functions, to become unnecessary, and yet
not be removed from the source.
These ``errors of commission'' rarely cause working programs to fail, but they are a source
of inefficiency, and make programs harder to understand
and change.
Moreover, information about such unused variables and functions can occasionally
serve to discover bugs; if a function does a necessary job, and
is never called, something is wrong!
.PP
.I Lint
complains about variables and functions which are defined but not otherwise
mentioned.
An exception is variables which are declared through explicit
.B extern
statements but are never referenced; thus the statement
.DS
extern float sin(\|);
.DE
will evoke no comment if
.I sin
is never used.
Note that this agrees with the semantics of the C compiler.
In some cases, these unused external declarations might be of some interest; they
can be discovered by adding the
.B \-x
flag to the
.I lint
invocation.
.PP
Certain styles of programming
require many functions to be written with similar interfaces;
frequently, some of the arguments may be unused
in many of the calls.
The
.B \-v
option is available to suppress the printing of
complaints about unused arguments.
When
.B \-v
is in effect, no messages are produced about unused
arguments except for those
arguments which are unused and also declared as
register arguments; this can be considered
an active (and preventable) waste of the register
resources of the machine.
.PP
There is one case where information about unused, or
undefined, variables is more distracting
than helpful.
This is when
.I lint
is applied to some, but not all, files out of a collection
which are to be loaded together.
In this case, many of the functions and variables defined
may not be used, and, conversely,
many functions and variables defined elsewhere may be used.
The
.B \-u
flag may be used to suppress the spurious messages which might otherwise appear.
.SH
Set/Used Information
.PP
.I Lint
attempts to detect cases where a variable is used before it is set.
This is very difficult to do well;
many algorithms take a good deal of time and space,
and still produce messages about perfectly valid programs.
.I Lint
detects local variables (automatic and register storage classes)
whose first use appears physically earlier in the input file than the first assignment to the variable.
It assumes that taking the address of a variable constitutes a ``use,'' since the actual use
may occur at any later time, in a data dependent fashion.
.PP
The restriction to the physical appearance of variables in the file makes the
algorithm very simple and quick to implement,
since the true flow of control need not be discovered.
It does mean that
.I lint
can complain about some programs which are legal,
but these programs would probably be considered bad on stylistic grounds (e.g., they might
contain at least two \fBgoto\fR's).
Because static and external variables are initialized to 0,
no meaningful information can be discovered about their uses.
The algorithm deals correctly, however, with initialized automatic variables, and variables
which are used in the expression which first sets them.
.PP
The set/used information also permits recognition of those local variables which are set
and never used; these form a frequent source of inefficiencies, and may also be symptomatic of bugs.
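.PP
As a minimal sketch of the kind of legal program that the physical-order rule can flag,
consider the following function, in which the first use of \fIn\fR appears earlier in the file
than the first assignment to it, although the assignment always executes first.
The function name and the values are invented for illustration.
.DS
```c
/* Legal C: n is always assigned before it is read at run time,
   but its first use appears physically before its first assignment. */
int twice_seven(void)
{
    int n;

    goto set;
use:
    return n * 2;
set:
    n = 7;
    goto use;
}
```
.DE
A call to \fItwice_seven\fR returns 14.
Writing the assignment first, physically, removes both \fBgoto\fR statements
and leaves nothing for the physical-order rule to question.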
.SH
Flow of Control
.PP
.I Lint
attempts to detect unreachable portions of the programs which it processes.
It will complain about unlabeled statements immediately following
\fBgoto\fR, \fBbreak\fR, \fBcontinue\fR, or \fBreturn\fR statements.
An attempt is made to detect loops which can never be left at the bottom, detecting the
special cases
\fBwhile\fR( 1 ) and \fBfor\fR(;;) as infinite loops.
.I Lint
also complains about loops which cannot be entered at the top;
some valid programs may have such loops, but at best they are bad style,
at worst bugs.
.PP
.I Lint
has an important area of blindness in the flow of control algorithm:
it has no way of detecting functions which are called and never return.
Thus, a call to
.I exit
may cause unreachable code which
.I lint
does not detect; the most serious effects of this are in the
determination of returned function values (see the next section).
.PP
One form of unreachable statement is not usually complained about by
.I lint ;
a
.B break
statement that cannot be reached causes no message.
Programs generated by
.I yacc ,
.[
Johnson Yacc 1975
.]
and especially
.I lex ,
.[
Lesk Lex
.]
may have literally hundreds of unreachable
.B break
statements.
The
.B \-O
flag in the C compiler will often eliminate the resulting object code inefficiency.
Thus, these unreached statements are of little importance,
there is typically nothing the user can do about them, and the
resulting messages would clutter up the
.I lint
output.
If these messages are desired,
.I lint
can be invoked with the
.B \-b
option.
.SH
Function Values
.PP
Sometimes functions return values which are never used;
sometimes programs incorrectly use function ``values''
which have never been returned.
.I Lint
addresses this problem in a number of ways.
.PP
Locally, within a function definition,
the appearance of both
.DS
return( \fIexpr\fR );
.DE
and
.DS
return ;
.DE
statements is cause for alarm;
.I lint
will give the message
.DS
function \fIname\fR contains return(e) and return
.DE
The most serious difficulty with this is detecting when a function return is implied
by flow of control reaching the end of the function.
This can be seen with a simple example:
.DS
.ta .5i 1i 1.5i
\fRf ( a ) {
	if ( a ) return ( 3 );
	g (\|);
}
.DE
Notice that, if \fIa\fR tests false, \fIf\fR will call \fIg\fR and then return
with no defined return value; this will trigger a complaint from
.I lint .
If \fIg\fR, like \fIexit\fR, never returns,
the message will still be produced when in fact nothing is wrong.
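.PP
One way to avoid the complaint in the example above, sketched here in present-day C,
is to make every path through the function return a value explicitly.
The counter \fIg_calls\fR and the return value 0 are arbitrary choices made for this
illustration; they are not part of the original example.
.DS
```c
int g_calls;                 /* hypothetical: records that g ran */

void g(void)
{
    g_calls++;
}

/* Same shape as f in the text, but no path falls off the end. */
int f(int a)
{
    if (a)
        return 3;
    g();
    return 0;                /* explicit value on the path where a tests false */
}
```
.DE
With this form, \fIf\fR(1) yields 3 and \fIf\fR(0) calls \fIg\fR and yields 0,
so a caller that uses the value never reads an undefined result.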
.PP
In practice, some potentially serious bugs have been discovered by this feature;
it also accounts for a substantial fraction of the ``noise'' messages produced
by
.I lint .
.PP
On a global scale,
.I lint
detects cases where a function returns a value, but this value is sometimes,
or always, unused.
When the value is always unused, it may constitute an inefficiency in the function definition.
When the value is sometimes unused, it may represent bad style (e.g., not testing for
error conditions).
.PP
The dual problem, using a function value when the function does not return one,
is also detected.
This is a serious problem.
Amazingly, this bug has been observed on a couple of occasions
in ``working'' programs; the desired function value just happened to have been computed
in the function return register!
.SH
Type Checking
.PP
.I Lint
enforces the type checking rules of C more strictly than the compilers do.
The additional checking is in four major areas:
across certain binary operators and implied assignments,
at the structure selection operators,
between the definition and uses of functions,
and in the use of enumerations.
.PP
There are a number of operators which have an implied balancing between types of the operands.
The assignment, conditional ( ?\|: ), and relational operators
have this property; the argument
of a \fBreturn\fR statement,
and expressions used in initialization also suffer similar conversions.
In these operations,
\fBchar\fR, \fBshort\fR, \fBint\fR, \fBlong\fR, \fBunsigned\fR, \fBfloat\fR, and \fBdouble\fR types may be freely intermixed.
The types of pointers must agree exactly,
except that arrays of \fIx\fR's can, of course, be intermixed with pointers to \fIx\fR's.
.PP
The type checking rules also require that, in structure references, the
left operand of the \(em> be a pointer to structure, the left operand of the \fB.\fR
be a structure, and the right operand of these operators be a member
of the structure implied by the left operand.
Similar checking is done for references to unions.
.PP
Strict rules apply to function argument and return value
matching.
The types \fBfloat\fR and \fBdouble\fR may be freely matched,
as may the types \fBchar\fR, \fBshort\fR, \fBint\fR, and \fBunsigned\fR.
Also, pointers can be matched with the associated arrays.
Aside from this, all actual arguments must agree in type with their declared counterparts.
.PP
With enumerations, checks are made that enumeration variables or members are not mixed
with other types, or other enumerations,
and that the only operations applied are =, initialization, ==, !=, and function arguments and return values.
.SH
Type Casts
.PP
The type cast feature in C was introduced largely as an aid
to producing more portable programs.
Consider the assignment
.DS
p = 1 ;
.DE
where
.I p
is a character pointer.
.I Lint
will quite rightly complain.
Now, consider the assignment
.DS
p = (char \(**)1 ;
.DE
in which a cast has been used to
convert the integer to a character pointer.
The programmer obviously had a strong motivation
for doing this, and has clearly signaled his intentions.
It seems harsh for
.I lint
to continue to complain about this.
On the other hand, if this code is moved to another
machine, such code should be looked at carefully.
The
.B \-c
flag controls the printing of comments about casts.
When
.B \-c
is in effect, casts are treated as though they were assignments
subject to complaint; otherwise, all legal casts are passed without comment,
no matter how strange the type mixing seems to be.
.SH
Nonportable Character Use
.PP
On the PDP-11, characters are signed quantities, with a range
from \-128 to 127.
On most of the other C implementations, characters take on only positive
values.
Thus,
.I lint
will flag certain comparisons and assignments as being
illegal or nonportable.
For example, the fragment
.DS
char c;
\&...
if( (c = getchar(\|)) < 0 ) ....
.DE
works on the PDP-11, but
will fail on machines where characters always take
on positive values.
The real solution is to declare
.I c
an integer, since
.I getchar
is actually returning
integer values.
In any case,
.I lint
will say
``nonportable character comparison''.
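.PP
The remedy of declaring the variable as an integer can be sketched as follows.
So that the example needs no input, \fInext_byte\fR is a hypothetical stand-in for
\fIgetchar\fR that reads from a buffer; both function names are assumptions made
for this illustration.
.DS
```c
#include <stdio.h>   /* EOF, size_t */

/* Hypothetical stand-in for getchar(): returns the next byte of buf
   as a value in 0..255, or EOF when the buffer is exhausted. */
int next_byte(const unsigned char *buf, size_t len, size_t *pos)
{
    if (*pos >= len)
        return EOF;
    return buf[(*pos)++];
}

/* Count the bytes in buf by reading until EOF. */
size_t count_bytes(const unsigned char *buf, size_t len)
{
    size_t pos = 0, n = 0;
    int c;   /* int, not char: must hold every byte value and EOF */

    while ((c = next_byte(buf, len, &pos)) != EOF)
        n++;
    return n;
}
```
.DE
Because \fIc\fR is an \fBint\fR, a byte with all bits set is still distinct from
the end-of-input value, whether or not the machine treats characters as signed.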
.PP
A similar issue arises with bitfields; when assignments
of constant values are made to bitfields, the field may
be too small to hold the value.
This is especially true because
on some machines bitfields are considered as signed
quantities.
While it may seem unintuitive to consider
that a two bit field declared of type
.B int
cannot hold the value 3, the problem disappears
if the bitfield is declared to have type
.B unsigned .
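.PP
A minimal sketch of the unsigned declaration follows; the structure and function
names are invented for illustration.
.DS
```c
struct flags {
    unsigned int mode : 2;   /* two bits, unsigned: holds 0 through 3 */
};

/* Store value in the two-bit field and read it back. */
unsigned int store_mode(unsigned int value)
{
    struct flags f;

    f.mode = value;          /* reduced modulo 4 if value is too large */
    return f.mode;
}
```
.DE
Here \fIstore_mode\fR(3) gives back 3, while \fIstore_mode\fR(5) gives back 1,
since only the low two bits fit in the field.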
.SH
Assignments of longs to ints
.PP
Bugs may arise from the assignment of a
.B long
to an
.B int ,
which can discard the high-order bits of the value.
This may happen in programs
which have been incompletely converted to use
.B typedefs .
When a
.B typedef
variable
is changed from \fBint\fR to \fBlong\fR,
the program can stop working because
some intermediate results may be assigned
to \fBints\fR, losing accuracy.
Since there are a number of legitimate reasons for
assigning \fBlongs\fR to \fBints\fR, the detection
of these assignments is enabled
by the
.B \-a
flag.
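.PP
Where such an assignment is intended, one hedged way to make it safe is to test
the range first.
The helper below is a sketch written for this illustration, not something
.I lint
provides.
.DS
```c
#include <limits.h>

/* Store value in *out and return 1 if it fits in an int;
   otherwise leave *out alone and return 0. */
int long_to_int(long value, int *out)
{
    if (value < INT_MIN || value > INT_MAX)
        return 0;
    *out = (int)value;
    return 1;
}
```
.DE
On a machine where \fBlong\fR is wider than \fBint\fR, the call fails for values
that would otherwise be silently truncated.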
.SH
Strange Constructions
.PP
Several perfectly legal, but somewhat strange, constructions
are flagged by
.I lint ;
the messages hopefully encourage better code quality, clearer style, and
may even point out bugs.
The
.B \-h
flag is used to enable these checks.
For example, in the statement
.DS
\(**p++ ;
.DE
the \(** does nothing; this provokes the message ``null effect'' from
.I lint .
The program fragment
.DS
unsigned x ;
if( x < 0 ) ...
.DE
is clearly somewhat strange; the
test will never succeed.
Similarly, the test
.DS
if( x > 0 ) ...
.DE
is equivalent to
.DS
if( x != 0 )
.DE
which may not be the intended action.
.I Lint
will say ``degenerate unsigned comparison'' in these cases.
If one says
.DS
if( 1 != 0 ) ....
.DE
.I lint
will report
``constant in conditional context'', since the comparison
of 1 with 0 gives a constant result.
.PP
Another construction
detected by
.I lint
involves
operator precedence.
Bugs which arise from misunderstandings about the precedence
of operators can be accentuated by spacing and formatting,
making such bugs extremely hard to find.
For example, the statements
.DS
if( x&077 == 0 ) ...
.DE
or
.DS
x<\h'-.3m'<2 + 40
.DE
probably do not do what was intended.
The best solution is to parenthesize such expressions,
and
.I lint
encourages this by an appropriate message.
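.PP
The effect of the precedence rules on the two examples above can be made explicit
with parentheses.
The function names below are invented for illustration.
.DS
```c
/* How the compiler groups  x&077 == 0 :  == binds tighter than &. */
int mask_as_parsed(int x)
{
    return x & (077 == 0);
}

/* What was almost certainly meant. */
int mask_as_intended(int x)
{
    return (x & 077) == 0;
}

/* x<<2 + 40 groups as x << (2 + 40); this is the likely intent. */
int shift_as_intended(int x)
{
    return (x << 2) + 40;
}
```
.DE
For \fIx\fR equal to 0100, \fImask_as_parsed\fR yields 0 (false) while
\fImask_as_intended\fR yields 1 (true); \fIshift_as_intended\fR(1) yields 44.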
.PP
Finally, when the
.B \-h
flag is in force,
.I lint
complains about variables which are redeclared in inner blocks
in a way that conflicts with their use in outer blocks.
This is legal, but is considered by many (including the author) to
be bad style, usually unnecessary, and frequently a bug.
.SH
Ancient History
.PP
There are several forms of older syntax which are being officially
discouraged.
These fall into two classes, assignment operators and initialization.
.PP
The older forms of assignment operators (e.g., =+, =\-, . . . )
could cause ambiguous expressions, such as
.DS
a =\-1 ;
.DE
which could be taken as either
.DS
a =\- 1 ;
.DE
or
.DS
a = \-1 ;
.DE
The situation is especially perplexing if this
kind of ambiguity arises as the result of a macro substitution.
The newer, and preferred, operators (+=, \-=, etc.)
have no such ambiguities.
To spur the abandonment of the older forms,
.I lint
complains about these old fashioned operators.
.PP
A similar issue arises with initialization.
The older language allowed
.DS
int x \fR1 ;
.DE
to initialize
.I x
to 1.
This also caused syntactic difficulties: for example,
.DS
int x ( \-1 ) ;
.DE
looks somewhat like the beginning of a function declaration:
.DS
int x ( y ) { . . .
.DE
and the compiler must read a fair ways past
.I x
in order to be sure what the declaration really is.
Again, the problem is even more perplexing when the
initializer involves a macro.
The current syntax places an equals sign between the
variable and the initializer:
.DS
int x = \-1 ;
.DE
This is free of any possible syntactic ambiguity.
.SH
Pointer Alignment
.PP
Certain pointer assignments may be reasonable on some machines,
and illegal on others, due entirely to
alignment restrictions.
For example, on the PDP-11, it is reasonable
to assign integer pointers to double pointers, since
double precision values may begin on any integer boundary.
On the Honeywell 6000, double precision values must begin
on even word boundaries;
thus, not all such assignments make sense.
.I Lint
tries to detect cases where pointers are assigned to other
pointers, and such alignment problems might arise.
The message ``possible pointer alignment problem''
results from this situation whenever either the
.B \-p
or
.B \-h
flag is in effect.
.SH
Multiple Uses and Side Effects
.PP
In complicated expressions, the best order in which to evaluate
subexpressions may be highly machine dependent.
For example, on machines (like the PDP-11) in which the stack
runs backwards, function arguments will probably be best evaluated
from right-to-left; on machines with a stack running forward,
left-to-right seems most attractive.
Function calls embedded as arguments of other functions
may or may not be treated similarly to ordinary arguments.
Similar issues arise with other operators which have side effects,
such as the assignment operators and the increment and decrement operators.
.PP
In order that the efficiency of C on a particular machine not be
unduly compromised, the C language leaves the order
of evaluation of complicated expressions up to the
local compiler, and, in fact, the various C compilers have considerable
differences in the order in which they will evaluate complicated
expressions.
In particular, if any variable is changed by a side effect, and
also used elsewhere in the same expression, the result is explicitly undefined.
.PP
.I Lint
checks for the important special case where
a simple scalar variable is affected.
For example, the statement
.DS
\fIa\fR[\fIi\|\fR] = \fIb\fR[\fIi\fR++] ;
.DE
will draw the complaint:
.DS
warning: \fIi\fR evaluation order undefined
.DE
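.PP
A well-defined replacement separates the side effect from the other use of the variable.
The sketch below assumes that the intent was to copy the element at the old index
and then advance; the function name is invented for illustration.
.DS
```c
/* Copy b[*i] to a[*i], then advance *i.
   Each statement changes or reads i in a fixed order. */
void copy_and_advance(int *a, const int *b, int *i)
{
    a[*i] = b[*i];
    (*i)++;
}
```
.DE
If the other reading was intended, with the store going to the element after the
one copied, the two statements can be rewritten accordingly; either way the order
no longer depends on the compiler.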
.SH
Implementation
.PP
.I Lint
consists of two programs and a driver.
The first program is a version of the
Portable C Compiler
.[
Johnson Ritchie BSTJ Portability Programs System
.]
.[
Johnson portable compiler 1978
.]
which is the basis of the
IBM 370, Honeywell 6000, and Interdata 8/32 C compilers.
This compiler does lexical and syntax analysis on the input text,
constructs and maintains symbol tables, and builds trees for expressions.
Instead of writing an intermediate file which is passed to
a code generator, as the other compilers
do,
.I lint
produces an intermediate file which consists of lines of ASCII text.
Each line contains an external variable name,
an encoding of the context in which it was seen (use, definition, declaration, etc.),
a type specifier, and a source file name and line number.
The information about variables local to a function or file
is collected
by accessing the symbol table, and examining the expression trees.
.PP
Comments about local problems are produced as detected.
The information about external names is collected
onto an intermediate file.
After all the source files and library descriptions have
been collected, the intermediate file is sorted
to bring all information collected about a given external
name together.
The second, rather small, program then reads the lines
from the intermediate file and compares all of the
definitions, declarations, and uses for consistency.
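.PP
The sort-then-compare idea can be sketched in a few lines of C.
The record layout and the function names here are assumptions made for this
illustration; the actual format of the intermediate file and the checks made by
the second program are not given in this document.
.DS
```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: one line of information about an external name. */
struct extrec {
    const char *name;   /* external name */
    const char *type;   /* type specifier, as text */
};

static int by_name(const void *p, const void *q)
{
    const struct extrec *a = p, *b = q;

    return strcmp(a->name, b->name);
}

/* Sort the records so that entries for one name are adjacent, then
   count neighbouring entries that share a name but differ in type. */
int count_type_conflicts(struct extrec *recs, size_t n)
{
    size_t i;
    int conflicts = 0;

    qsort(recs, n, sizeof recs[0], by_name);
    for (i = 1; i < n; i++) {
        if (strcmp(recs[i].name, recs[i - 1].name) == 0 &&
            strcmp(recs[i].type, recs[i - 1].type) != 0)
            conflicts++;
    }
    return conflicts;
}
```
.DE
Sorting first means each name needs to be compared only with its neighbours,
so the second pass can work through the file sequentially.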
.PP
The driver controls this
process, and is also responsible for making the options available
to both passes of
.I lint .
.SH
Portability
.PP
C on the Honeywell and IBM systems is used, in part, to write system code for the host operating system.
This means that the implementation of C tends to follow local conventions rather than
adhere strictly to
.UX
system conventions.
Despite these differences, many C programs have been successfully moved to GCOS and the various IBM
installations with little effort.
This section describes some of the differences between the implementations, and
discusses the
.I lint
features which encourage portability.
.PP
Uninitialized external variables are treated differently in different
implementations of C.
Suppose two files both contain a declaration without initialization, such as
.DS
int a ;
.DE
outside of any function.
The
.UX
loader will resolve these declarations, and cause only a single word of storage
to be set aside for \fIa\fR.
Under the GCOS and IBM implementations, this is not feasible (for various stupid reasons!)
so each such declaration causes a word of storage to be set aside and called \fIa\fR.
When loading or library editing takes place, this causes fatal conflicts which prevent
the proper operation of the program.
If
.I lint
is invoked with the \fB\-p\fR flag,
it will detect such multiple definitions.
.PP
A related difficulty comes from the amount of information retained about external names during the
loading process.
On the
.UX
system, externally known names have seven significant characters, with the upper/lower
case distinction kept.
On the IBM systems, there are eight significant characters, but the case distinction
is lost.
On GCOS, there are only six characters, of a single case.
This leads to situations where programs run on the
.UX
system, but encounter loader
problems on the IBM or GCOS systems.
.I Lint
.B \-p
causes all external symbols to be mapped to one case and truncated to six characters,
providing a worst-case analysis.
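.PP
The worst-case mapping can be illustrated with a small sketch that folds case and
compares only the first six characters, the GCOS limit quoted above.
The function name and the constant are invented for illustration, and this is not
a description of how
.I lint
itself is coded.
.DS
```c
#include <ctype.h>

#define SIGNIFICANT 6   /* characters kept, per the GCOS limit above */

/* Return 1 if two external names become identical after folding
   case and truncating to SIGNIFICANT characters, otherwise 0. */
int names_collide(const char *a, const char *b)
{
    int i;

    for (i = 0; i < SIGNIFICANT; i++) {
        int ca = toupper((unsigned char)a[i]);
        int cb = toupper((unsigned char)b[i]);

        if (ca != cb)
            return 0;
        if (ca == 0)        /* both names ended before the limit */
            return 1;
    }
    return 1;
}
```
.DE
Under this mapping the names counter1 and counter2 collide, as do Total and total,
although all four are distinct in a loader that keeps more characters and
distinguishes case.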
.PP
A number of differences arise in the area of character handling: characters in the
.UX
system are eight-bit ASCII, while they are eight-bit EBCDIC on the IBM, and
nine-bit ASCII on GCOS.
Moreover, character strings go from high to low bit positions (``left to right'')
on GCOS and IBM, and low to high (``right to left'') on the PDP-11.
This means that code attempting to construct strings
out of character constants, or attempting to use characters as indices
into arrays, must be looked at with great suspicion.
.I Lint
is of little help here, except to flag multi-character character constants.
.PP
Of course, the word sizes are different!
This causes less trouble than might be expected, at least when
moving from the
.UX
system (16 bit words) to the IBM (32 bits) or GCOS (36 bits).
The main problems are likely to arise in shifting or masking.
C now supports a bit-field facility, which can be used to write much of
this code in a reasonably portable way.
Frequently, portability of such code can be enhanced by
slight rearrangements in coding style.
Many of the incompatibilities seem to have the flavor of writing
.DS
x &= 0177700 ;
.DE
to clear the low order six bits of \fIx\fR.
This suffices on the PDP-11, but fails badly on GCOS and IBM.
If the bit field feature cannot be used, the same effect can be obtained by
writing
.DS
x &= \(ap 077 ;
.DE
which will work on all these machines.
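.PP
The difference between the two masks shows up as soon as the operand is wider
than 16 bits.
In the following sketch the function names are invented, and
.B "unsigned long"
is used only to guarantee at least 32 bits.
.DS
```c
/* Mask written for a 16-bit word: also clears every bit above bit 15. */
unsigned long clear_low6_fixed_mask(unsigned long x)
{
    return x & 0177700UL;
}

/* Mask built by complementing: clears only the low six bits. */
unsigned long clear_low6(unsigned long x)
{
    return x & ~077UL;
}
```
.DE
Applied to a value with its low 32 bits set, the fixed mask leaves only
bits 6 through 15, while the complemented mask leaves bits 6 through 31.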
.PP
The right shift operator is arithmetic shift on the PDP-11, and logical shift on most
other machines.
To obtain a logical shift on all machines, the left operand can be
typed \fBunsigned\fR.
Characters are considered signed integers on the PDP-11, and unsigned on the other machines.
This persistence of the sign bit may be reasonably considered a bug in the PDP-11 hardware
which has infiltrated itself into the C language.
If there were a good way to discover the programs which would be affected, C could be changed;
in any case,
.I lint
is no help here.
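.PP
The unsigned-operand technique for obtaining a logical shift can be sketched as follows;
the function name is invented for illustration.
.DS
```c
/* Shift x right by n bits, filling with zeros on every machine.
   n must be less than the number of bits in an unsigned int. */
unsigned int logical_shift_right(int x, unsigned int n)
{
    return (unsigned int)x >> n;
}
```
.DE
Because the left operand is converted to \fBunsigned\fR before the shift,
the sign bit is not propagated, even on machines where a right shift of a
signed quantity is arithmetic.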
.PP
The above discussion may have made the problem of portability seem
bigger than it in fact is.
The issues involved here are rarely subtle or mysterious, at least to the
implementor of the program, although they can involve some work to straighten out.
The most serious bar to the portability of
.UX
system utilities has been the inability to mimic
essential
.UX
system functions on the other systems.
The inability to seek to a random character position in a text file, or to establish a pipe
between processes, has involved far more rewriting
and debugging than any of the differences in C compilers.
On the other hand,
.I lint
has been very helpful
in moving the
.UX
operating system and associated
utility programs to other machines.
.SH
Shutting Lint Up
.PP
There are occasions when
the programmer is smarter than
.I lint .
There may be valid reasons for ``illegal'' type casts,
functions with a variable number of arguments, etc.
Moreover, as specified above, the flow of control information
produced by
.I lint
often has blind spots, causing occasional spurious
messages about perfectly reasonable programs.
Thus, some way of communicating with
.I lint ,
typically to shut it up, is desirable.
.PP
The form which this mechanism should take is not at all clear.
New keywords would require current and old compilers to
recognize these keywords, if only to ignore them.
This has both philosophical and practical problems.
New preprocessor syntax suffers from similar problems.
.PP
What was finally done was to cause a number of words
to be recognized by
.I lint
when they were embedded in comments.
This required minimal preprocessor changes;
the preprocessor just had to agree to pass comments
through to its output, instead of deleting them
as had been previously done.
Thus,
.I lint
directives are invisible to the compilers, and
the effect on systems with the older preprocessors
is merely that the
.I lint
directives don't work.
.PP
The first directive is concerned with flow of control information;
if a particular place in the program cannot be reached,
but this is not apparent to
.I lint ,
this can be asserted by the directive
.DS
/* NOTREACHED */
.DE
at the appropriate spot in the program.
Similarly, if it is desired to turn off
strict type checking for
the next expression, the directive
.DS
/* NOSTRICT */
.DE
can be used; the situation reverts to the
previous default after the next expression.
The effect of the
.B \-v
flag, which suppresses complaints about unused arguments,
can be obtained for a single function by the directive
.DS
/* ARGSUSED */
.DE
Complaints about a variable number of arguments in calls to a function
can be turned off by the directive
.DS
/* VARARGS */
.DE
preceding the function definition.
In some cases, it is desirable to check the
first several arguments, and leave the later arguments unchecked.
This can be done by following the VARARGS keyword immediately
with a digit giving the number of arguments which should be checked; thus,
.DS
/* VARARGS2 */
.DE
will cause the first two arguments to be checked and the others left unchecked.
Finally, the directive
.DS
/* LINTLIBRARY */
.DE
at the head of a file identifies this file as
a library declaration file; this topic is worth a
section by itself.
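.PP
The directives above might appear in a source file roughly as follows.
This is a sketch only: all of the function names are hypothetical,
and the variable argument list is written with the later
.I stdarg
facility, which is an assumption of this example rather than something
described in this paper.
A compiler sees nothing but ordinary comments.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* ARGSUSED */
int handler(int sig, int context)   /* context is deliberately unused */
{
    return sig + 1;
}

/* VARARGS2 */
int sum_ints(int count, int first, ...)  /* only count and first are checked */
{
    va_list ap;
    int i, sum = first;

    va_start(ap, first);
    for (i = 1; i < count; i++)
        sum += va_arg(ap, int);
    va_end(ap);
    return sum;
}

void fatal(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
    exit(1);
}

int checked_div(int a, int b)
{
    if (b == 0) {
        fatal("division by zero");
        /* NOTREACHED */
    }
    return a / b;
}
```

The NOTREACHED comment records that
.I fatal
never returns, a fact that cannot be deduced from the call alone.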
.SH
Library Declaration Files
.PP
.I Lint
accepts certain library directives, such as
.DS
\-ly
.DE
and tests the source files for compatibility with these libraries.
This is done by accessing library description files whose
names are constructed from the library directives.
These files all begin with the directive
.DS
/* LINTLIBRARY */
.DE
which is followed by a series of dummy function
definitions.
The critical parts of these definitions
are the declaration of the function return type,
whether the dummy function returns a value, and
the number and types of arguments to the function.
The VARARGS and ARGSUSED directives can
be used to specify features of the library functions.
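.PP
A library description file might look something like the following sketch.
The routines named here are hypothetical and do not describe any real library;
the bodies are dummies, since only the return type, the presence of a returned
value, and the argument list matter.

```c
/* LINTLIBRARY */

/* Returns an int; takes exactly one character pointer. */
int demo_count(const char *s)
{
    (void)s;
    return 0;
}

/* VARARGS1 */
/* Only the format argument is checked; the rest may vary. */
int demo_log(const char *fmt, ...)
{
    (void)fmt;
    return 0;
}

/* ARGSUSED */
/* Returns no value; takes one int, which the dummy body ignores. */
void demo_reset(int unit)
{
}
```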
.PP
.I Lint
library files are processed almost exactly like ordinary
source files.
The only difference is that functions which are defined on a library file,
but are not used on a source file, draw no complaints.
.I Lint
does not simulate a full library search algorithm,
and complains if the source files contain a redefinition of
a library routine (this is a feature!).
.PP
By default,
.I lint
checks the programs it is given against a standard library
file, which contains descriptions of the programs which
are normally loaded when
a C program
is run.
When the
.B \-p
flag is in effect, another file is checked containing
descriptions of the standard I/O library routines
which are expected to be portable across various machines.
The
.B \-n
flag can be used to suppress all library checking.
.SH
Bugs, etc.
.PP
.I Lint
was a difficult program to write, partially
because it is closely connected with matters of programming style,
and partially because users usually don't notice bugs which cause
.I lint
to miss errors which it should have caught.
(By contrast, if
.I lint
incorrectly complains about something that is correct, the
programmer reports that immediately!)
.PP
A number of areas remain to be further developed.
The checking of structures and arrays is rather inadequate;
size
incompatibilities go unchecked,
and no attempt is made to match up structure and union
declarations across files.
Some stricter checking of the use of the
.B typedef
is clearly desirable, but what checking is appropriate, and how
to carry it out, is still to be determined.
.PP
.I Lint
shares the preprocessor with the C compiler.
At some point it may be appropriate for a
special version of the preprocessor to be constructed
which checks for things such as unused macro definitions,
macro arguments with side effects that are
not expanded at all, or are expanded more than once, etc.
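.PP
The kind of macro hazard such a preprocessor might report can be sketched as follows.
The macros and functions here are hypothetical, written only to make the
number of expansions visible through a counter.

```c
int calls;                      /* how many times next_value has run */

int next_value(void)
{
    return ++calls;
}

/* The larger argument is expanded twice: once in the test, once as the result. */
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* The argument is never expanded, so its side effects are lost. */
#define DISCARD(x) ((void)0)

int count_calls_in_max(void)
{
    int m;

    calls = 0;
    m = MAX(next_value(), 0);   /* next_value() is called twice */
    (void)m;
    return calls;
}

int count_calls_in_discard(void)
{
    calls = 0;
    DISCARD(next_value());      /* next_value() is never called */
    return calls;
}
```

In neither case does the source text of the call suggest that
the function runs anything other than exactly once.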
.PP
The central problem with
.I lint
is the packaging of the information which it collects.
There are many options which
serve only to turn off, or slightly modify,
certain features.
There are pressures to add even more of these options.
.PP
In conclusion, it appears that the general notion of having two
programs is a good one.
The compiler concentrates on quickly and accurately turning the
program text into bits which can be run;
.I lint
concentrates on issues
of portability, style, and efficiency.
.I Lint
can afford to be wrong, since incorrectness and over-conservatism
are merely annoying, not fatal.
The compiler can be fast since it knows that
.I lint
will cover its flanks.
Finally, the programmer can
concentrate at one stage
of the programming process solely on the algorithms,
data structures, and correctness of the
program, and then later retrofit,
with the aid of
.I lint ,
the desirable properties of universality and portability.
.SG MH-1273-SCJ-unix
.\".bp
.[
$LIST$
.]
.bp
.SH
Appendix: Current Lint Options
.PP
The command currently has the form
.DS
\fBlint\fR [\fB\-\fRoptions ] files... library-descriptors...
.DE
The options are
.IP \fBh\fR
Perform heuristic checks
.IP \fBp\fR
Perform portability checks
.IP \fBv\fR
Don't report unused arguments
.IP \fBu\fR
Don't report unused or undefined externals
.IP \fBb\fR
Report unreachable
.B break
statements.
.IP \fBx\fR
Report unused external declarations
.IP \fBa\fR
Report assignments of
.B long
to
.B int
or shorter.
.IP \fBc\fR
Complain about questionable casts
.IP \fBn\fR
No library checking is done
.IP \fBs\fR
Same as
.B h
(for historical reasons)