# This file contains examples and an explanation of the RULESFILE / RULE
# feature.
#
# Rules for Lynx are experimental.  They provide a rudimentary capability
# for URL rejection and substitution based on string matching.
# Most users and most installations will not need this feature; it is here
# in case you find it useful.  Note that this may change or go away in
# future releases of Lynx; if you find it useful, consider describing your
# use of it in a message to <lynx-dev@sig.net>.
#
# Syntax:
# =======
# Summary of common forms:
#
#   Fail           URL1
#   Map            URL1  URL2      [CONDITION]
#   Pass           URL1  [URL2]    [CONDITION]
#   Redirect       URL1  URL2      [CONDITION]
#   RedirectPerm   URL1  URL2      [CONDITION]
#   UseProxy       URL1  PROXYURL  [CONDITION]
#   UseProxy       URL1  "none"    [CONDITION]
#
#   Alert          URL1  MESSAGE   [CONDITION]
#   AlwaysAlert    URL1  MESSAGE   [CONDITION]
#   UserMsg        URL1  MESSAGE   [CONDITION]
#   InfoMsg        URL1  MESSAGE   [CONDITION]
#   Progress       URL1  MESSAGE   [CONDITION]
#
# As you may have guessed, comments are introduced by a '#' character.
# Rules have the general form
#   Operator  Operand1  [Operand2]  [CONDITION]
# with words separated by whitespace.  Words containing spaces can be quoted
# with "double quotes".  Although this should not normally be necessary
# for URLs, it has to be used for MESSAGE operands in Alert etc.
# See below for an explanation of the optional CONDITION.
#
# Recognized operators are
#
#   Fail  URL1
# Reject access to this URL, stop processing further rules.
#
#   Map   URL1  URL2
# Change the current URL to URL2, then continue processing.
#
#   Pass  URL1  [URL2]
# Accept this URL and stop processing further rules; if URL2
# is given, apply this as the last mapping.
# See the next item for reasons why you generally don't want to "pass"
# a changed URL.
#
#   RedirectTemp       URL1  URL2
#   RedirectPerm       URL1  URL2
#   Redirect [STATUS]  URL1  URL2
# Stop processing further rules and redirect to URL2, just as if lynx had
# received an HTTP redirection with URL2 as the new location.  This means that
# URL2 is subject to any applicable permission checking; if it passes, a new
# request will be issued (which may result in a new round of rules checking,
# with a new "current URL") or the new URL might be taken from the cache, and,
# after successful loading, lynx's idea of what the loaded document's URL is
# will be fully updated.  None of this happens if you just "pass" a changed
# URL (or let it fall through), so this is generally the preferred way of
# substituting URLs.
# If the RedirectPerm variant is used, or if the optional word is supplied and
# is either "permanent" or "301", act as if lynx had received a permanent
# redirection (with HTTP status 301).  In most cases this will not make a
# noticeable difference.  Lynx may cache the location in a special way for 301
# redirections, so that the redirection is followed immediately the next time
# the same original URL is accessed, without re-checking of rules.  Therefore
# the permanent variant should never be used if the desired outcome of rules
# processing depends on variable conditions (see CONDITIONS below) or on
# setting a special flag (see next item).
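# As a sketch of the optional STATUS word (the hostnames are hypothetical,
# used for illustration only), going by the description above these two
# rules should be treated alike:
#   Redirect permanent  http://old.example.org/*  http://new.example.org/*
#   RedirectPerm        http://old.example.org/*  http://new.example.org/*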
#
#   PermitRedirection  URL1
# Mark following redirection as permitted, and continue processing.  Some
# redirection locations are normally not allowed, because permitting them in a
# response from an arbitrary remote server would open a security hole, and
# others are not allowed if certain restriction options are in effect.  Among
# redirection locations normally always forbidden are lynxprog: and lynxexec:
# schemes.  With "default" anonymous restrictions in effect, many URL schemes
# are disallowed if the user would not be allowed to use them with 'g'oto.
# This rule allows overriding the permission checking if rules processing ends
# with a Redirect (including the RedirectPerm or RedirectTemp forms).  It is
# ignored otherwise; in particular, it does not influence acceptance if rules
# processing ends with a "Pass" and a real redirection is received in the
# subsequent HTTP request.  If redirections are chained, it only applies to the
# redirection that ends the same rules cycle.  Note that the new URL is still
# subject to other permission checks that are not specific to redirections; but
# using this rule may still weaken the expected effect of -anonymous,
# -validate, -realm, and other restriction options, including TRUSTED_EXEC and
# similar in lynx.cfg, so be careful where you redirect to if restrictions are
# important!
#
#   UseProxy  URL1  PROXYURL
# Stop processing further rules, and force access through the proxy given by
# PROXYURL.  PROXYURL should have the same form as required for foo_proxy
# environment variables and lynx.cfg options, i.e., (unless you are trying to
# do something unusual) "http://some.proxy-server.dom:port/".  This rule
# overrides any use of a proxy (or external gateway) that might otherwise apply
# because of environment variables or lynx.cfg options; it also overrides any
# "no_proxy" settings.
#
#   UseProxy  URL1  none
# Mark request as NOT using any proxy (or external gateway), and continue
# processing(!).  For a request marked this way, any subsequent UseProxy
# rule with a PROXYURL will be ignored, and any use of a proxy (or external
# gateway) that might otherwise apply because of environment variables or
# lynx.cfg options will be overridden.  Note that the marking will not
# survive a Redirect rule (since that will result, if successful, in a
# new request).
#
#   Alert         URL1  MESSAGE
#   AlwaysAlert   URL1  MESSAGE
#   UserMsg       URL1  MESSAGE
#   InfoMsg       URL1  MESSAGE
#   Progress      URL1  MESSAGE
# These produce various kinds of statusline messages, differing in whether
# a pause is enforced and in its duration, immediately when the rule is
# applied.  AlwaysAlert shows the message text even in non-interactive mode
# (-dump, -source, etc.).  Rule processing continues after the message is
# shown.  As usual, these rules only apply if URL1 matches.  MESSAGE is
# the text to be displayed; it can contain one occurrence of "%s", which
# will be replaced by the current URL.  Literal '%' characters should be
# doubled as "%%".
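# A hypothetical example (hostname and message wording are made up):
#   Alert    http://old.example.org/*  "Outdated location requested: %s"
#   InfoMsg  http://old.example.org/*  "About 100%% of this site has moved"
# Going by the description above, for a request for
# "http://old.example.org/a.html" the first rule would show
# "Outdated location requested: http://old.example.org/a.html", and the
# second would show a single literal '%' after "100".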
#
# Rules are processed sequentially, first to last, for each request; a rule
# applies if the current URL matches URL1.  The current URL is initially the
# URL for the resource the user is trying to access, but may change as the
# result of applied Map rules.  Case-sensitive (!) string comparison is used;
# in addition, URL1 can contain one '*', which is interpreted as a wildcard
# matching 0 or more characters.  So if for example
# "http://example.com/dir/doc.html" is requested, it would match any of
# the following:
#   Pass  http:*
#   Pass  http://example.com/*.html
#   Pass  http://example.com/*
#   Pass  http://example*
#   Pass  http://*/doc.html
# but not:
#   Pass  http://example/*
#   Pass  http://Example.COM/dir/doc.html
#   Pass  http://Example.COM/*
#
# If a URL2 is given and also contains a '*', that character will be
# replaced by whatever matched in URL1.  Processing stops with the first
# matching rule that ends it ("Fail" or "Pass", and likewise the Redirect
# forms and "UseProxy" with a PROXYURL) or when the end of the rules is
# reached.  If the end is reached without any such rule having applied, the
# URL is allowed (equivalent to a final "Pass *").
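# As a worked example (the mirror hostname is hypothetical): with the rule
#   Map  http://example.com/*.html  http://mirror.example.org/*.htm
# and a request for "http://example.com/dir/doc.html", the '*' in URL1
# matches "dir/doc", so the current URL should become
# "http://mirror.example.org/dir/doc.htm", and processing continues with
# the rules that follow.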
#
# The requested URL will have been transformed to Lynx's normal
# representation.  This means that local file resources should be
# expected in the form "file://localhost/<path using slash separators>",
# not in the machine's native representation for filenames.
#
# Anyone with experience configuring the venerable CERN httpd server will
# recognize some of the syntax - in fact, the code implementing rules goes
# back to a common ancestor.  But note the differences: all URLs and URL-
# patterns here have to be given as absolute URLs, even for local files.
# (Absolute URLs don't imply proxying.)
#
# CONDITIONS
# ----------
# All rules mentioned can be followed by an optional CONDITION, which can
# be used to further restrict when the rule should be applied (in addition
# to the match on URL1).  A CONDITION takes one of the forms
#   "if"     CONDITIONFLAG
#   "unless" CONDITIONFLAG
# and currently two condition flags are recognized:
#   "userspecified"   (or abbreviated "userspec")
#   "redirected"
# To explain these, first some terms need to be defined.
#
# A user action (like following a link, or entering a 'g'oto URL) can either be
# rejected immediately (for example, because of restrictions in effect, or
# because of invalid input), or can generate a "request".  For the purpose of
# this discussion, a "request" is the sequence of processing done by lynx,
# which might ultimately lead to an actual network request and loading and
# display of data; a request can also result in rejection (for example, some
# restrictions are checked at this stage), or in a redirection.  A redirection
# in turn can be rejected (which makes the request fail), or can automatically
# generate a new request.  A "request chain" is the sequence of one or more
# requests triggered by the same user event that are chained together by
# redirections.
# For each request, some URL schemes are handled (or rejected) specially (see
# Limitation 1 below); the others are passed to the generic access code.  Rules
# processing occurs at the beginning of the generic access code, before a
# request is dispatched to the scheme-specific protocol module (but after
# checking whether the request can be satisfied by re-displaying an already
# cached document).
# With these definitions, the meaning of the possible CONDITIONFLAGs is:
# 
#   if redirected
# The rule applies if the current request results from a redirection;
# whether that was a real HTTP redirection or one generated by a rule
# in the previous request makes no difference.  In other words, the
# condition is true if the current request is not the first one in the
# request chain.
#
#   if userspecified
# The rule applies if the initial URL of the request chain was specified
# by the user.  Lynx marks a request as "user specified" for URLs that
# come from 'g'oto prompts, as well as for following links in a bookmark
# or Jump file and some other special (lynx-generated) pages that may
# contain URLs that were typed in by the user.
# Note that this is not a property of the request, but of the whole request
# chain (based on where the first request's URL came from).  The current
# URL may differ from what the user typed
# - because of initial fixups, including conversion of Guess-URLs and file
#   paths to full URLs,
# - because of Map rules applied, and/or
# - because of a previous redirection.
# So to make reasonably sure a suspicious or potentially dangerous URL has
# been entered by the user, i.e. is not a link or external redirection
# location that cannot be trusted, a combination of "userspecified" and
# "redirected" flags should be used, for example
#   Fail URL1 unless userspecified
#   Fail URL1 if redirected
#   ...
#
# CAVEAT
# ======
# First, to squash any false expectations, an example of what NOT TO DO.
# It might be expected that a rule like
#   Fail  file://localhost/etc/passwd		# <- DON'T RELY ON THIS
# could be used to prevent access to the file "/etc/passwd".  This might
# fool a naive user, but the more sophisticated user could still gain
# access, by experimenting with other forms like (@@@ untested)
# "file://<machine's domain name>/etc/passwd" or "/etc//passwd"
# or "/etc/p%61asswd" or "/etc/passwd?" or "/etc/passwd#X" and so on.
# There are many URL forms for accessing the same resource, and Lynx
# just doesn't guarantee that URLs for the same resource will look the
# same way.
#
# The same reservation applies to any attempts to block access to unwanted
# sites and so on.  This isn't the right place for implementing it.
# (Lynx has a number of mechanisms documented elsewhere to restrict access,
# see the INSTALLATION file, lynx.cfg, lynx -help, lynx -restrictions.)
#
# Some more useful applications:
#
# 1. Disabling URLs by access scheme
# ----------------------------------
#   Fail  gopher:*
#   Fail  finger:*
#   Fail  lynxcgi:*
#   Fail  LYNXIMGMAP:*
# This should work (but no guarantees) because Lynx canonicalizes
# the case of recognized access schemes and does not interpret
# %-escaping in the scheme part (@@@ always?)
#
# Note that for many access schemes Lynx already has mechanisms to
# restrict access (see lynx.cfg, -help, -restrictions, etc.), others
# have to be specifically enabled.  Those mechanisms should be used
# in preference.
# Note especially Limitation 1 below.
# This can be used for the remaining cases, or in addition by the
# more paranoid.  Note that disabling "file:*" will also make many
# of the special pages generated by lynx as temporary files (INFO,
# history, ...) inaccessible; on the other hand, it doesn't prevent
# _writing_ of various temp files - probably not what you want.
#
# You could also direct access for a scheme to a brief text explaining
# why it's not available:
#   Redirect news:*   http://localhost/texts/newsserver-is-broken.html
#
# 2. Preventing accidental access
# -------------------------------
# If there is a page or site you don't want to access for whatever
# reason (say there's a link to it that crashes Lynx [don't forget to
# report a bug], or if that starts sending you a 5 Mb file you don't
# want, or you just don't like the people...), you can prevent yourself
# from accidentally accessing it:
#    Fail  http://bad.site.com/*
#
# 3. Compressed files
# -------------------
# You have downloaded a bunch of HTML documents, and compressed them
# to save space.  Then you discover that links between the files don't
# work, because they all use the names of the uncompressed files.  The
# following kind of rule will allow you to navigate, invisibly accessing
# the compressed files:
#   Map file://localhost/somedir/*.html file://localhost/somedir/*.html.gz
# or, perhaps better:
#   Redirect file://localhost/somedir/*.html file://localhost/somedir/*.html.gz
#
# 4. Use local copies
# -------------------
# You have downloaded a tree of HTML documents, but there are many links
# between them that still point to the remote location.  You want to access
# the local copies instead; after all, that's why you downloaded them.  You
# could start editing the HTML, but the following might be simpler:
#  Map http://remote.com/docs/*.html file://localhost/home/me/docs/*.html
# Or even combine this with compressing the files:
#  Map http://remote.com/docs/*.html file://localhost/home/me/docs/*.html.gz
#
# Again, replacing the "Map" with "Redirect" is probably better - it will
# allow you to see the _real_ location on the lynx INFO screen or in the
# HISTORY list, will avoid duplicates in the cache if the same document is
# loaded with two different URLs, and may allow you to 'e'dit the local
# copy from within lynx if you feel like it.
#
# 5. Broken links etc.
# --------------------
# A user has moved from http://www.siteA.com/~jdoe to http://siteB.org/john,
# or http://www.provider.com/company/ has moved to their own server
# http://www.company.com, but there are still links to the old location
# all over the place; they now are broken or lead to a stupid "this page
# has moved, please update your bookmarks. Refresh in 5 seconds" page
# which you're tired of seeing.  This will not fix your bookmarks, and
# it will let you see the outdated URLs for longer (Limitation 3 below),
# but for a quick fix:
#   Redirect   http://www.siteA.com/~jdoe/*      http://siteB.org/john/*
#   Redirect   http://www.provider.com/company/* http://www.company.com/*
#
# You could use "Map" instead of "Redirect", but this would let you see the
# outdated URLs for longer and even bookmark them, and you are likely to
# create invalid links if not all documents from a site are mapped
# (Limitation 3).
#
# 6. DNS troubles
# ---------------
# A special case of broken links.  If a site is inaccessible because the
# name cannot be resolved (your or their name server is broken, or the
# name registry once again made a mistake, or they really didn't pay in
# time...) but you still somehow know the address; or if name lookups are
# just too slow:
#   Map   http://www.somesite.com/*  http://10.1.2.3/*
# (You could do the equivalent more cleanly by adding an entry to the hosts
# file, if you have access to it.)
#
# Or, if a name resolves to several addresses of which one is down, and the
# DNS hasn't caught up:
#   Map   http://www.w3.org/*    http://www12.w3.org/*
#
# Note that this can break access to some name-based virtually hosted sites.
#
# In this case use of "Map" is probably preferred over "Redirect", as long
# as the URL on the left side contains the real and preferred hostname or
# the problem is only temporary.
#
# 7. Avoid redirections
# ---------------------
# Some sites have a habit of providing links that don't go to the destination
# directly but always force redirection via some intermediate URL.  The
# delay imposed by this, especially for users with slower connections and
# for overloaded servers, can be avoided if the intermediate URLs always
# follow some simple pattern: we can then anticipate the redirect that will
# inevitably follow and generate it internally.  For example,
#   Redirect http://lwn.net/cgi-bin/vr/*    http://*
#
# Warning: The page authors may not like this circumvention.  Often the
# redirection is wanted by them to track access, sometimes in connection
# with cookies.  Some sites may employ mechanisms that defeat the shortcut.
# It is your responsibility to decide whether use of this feature is
# acceptable.  (But note that the same effect can be achieved anyway for
# any link by editing the URL, e.g. with the ELGOTO ('E') key in Lynx, so
# a shortcut like this does not create some new kind of intrusion.)
#
# 8. Detailed proxy selection
# ---------------------------
# Basic use for this one should be obvious, if you have a need for it.
# It simply allows selecting use (or non-use) of proxies on a more detailed
# level than the traditional <scheme>_proxy and no_proxy variables, as well
# as using different proxies for different sites.
# For example, to request access through an anonymizing proxy for all pages
# on a "suspicious" site:
#   UseProxy  http://suspicious.site/*  http://anonymyzing.proxy.dom/
# (as long as all URLs really have a matching form, not some alternative
# like <http://suspicious.site:80/> or <http://SuSpIcIoUs.site/>!)
#
# To access some site through a local squid proxy, running on the same host
# as lynx, except for some image types (say because you rarely access images
# with lynx anyway, and if you do, you don't want them cached by the proxy):
#   UseProxy  http://some.site/*.gif  none
#   UseProxy  http://some.site/*.jpg  none
#   UseProxy  http://some.site/*      http://localhost:3128/
# Note that order is important here.
#
# To exempt a local address from all proxying:
#   UseProxy  http://local.site/*  none
#
# Note however that for some purposes the "no_proxy" setting may be better
# suited than "UseProxy ... none", because of its different matching logic
# (see comments in lynx.cfg).
#
# 9. Invent your own scheme
# -------------------------
# Suppose you want to teach lynx to handle a completely new URL scheme.
# If what's required for the new scheme is already available in lynx in
# _some_ way, this may be possible with some inventive use of rules.
# As an example, let's assume you want to introduce a simple "man:" scheme
# for showing manual pages, so (for a Unix-like system, at least) "man:lynx"
# would display the same help information as the "man lynx" command and so
# on (we ignore section numbers etc. for simplicity here).
# First, since lynx doesn't know anything about a "man:" scheme, it will
# normally reject any such URLs at an early stage.  However, a trick exists
# to bypass that hurdle: define a man_proxy environment variable *outside of
# lynx, before starting lynx* (it won't work in lynx.cfg); the actual value
# is unimportant and won't actually be used.  For example, in your shell:
#   export man_proxy=X
#
# If you already have some kind of HTTP-accessible man gateway available,
# the task then probably just amounts to transforming the URL into the right
# form.  For one such gateway (in this case, a CGI script running on the
# local machine), the rule
#   Redirect man:* http://localhost/cgi-bin/dwww?type=runman&location=*/
# or, alternatively,
#   UseProxy man:* none
#   Map      man:* http://localhost/cgi-bin/dwww?type=runman&location=*/
# does it; for other setups the right-hand side just has to be modified
# appropriately.  The "UseProxy" is to make sure the bogus man_proxy gets
# ignored.
#
# If no CGI-like access is available, you might want to invoke your system's
# man command directly for a man: URL.  Here is some discussion of how this
# could be done, and why ultimately you may not want to do it; this is also
# an opportunity to show examples for how some of the rules and conditions
# can be used that haven't been discussed in detail elsewhere.
# Lynx provides the lynxexec: (and the similar lynxprog:) scheme for running
# (nearly) arbitrary commands locally.  At the heart of employing it for
# man: would be a rule like this:
#   Redirect          man:*  "lynxexec:/usr/bin/man *"
# (It is a peculiarity of this scheme that the literal space and quoting
# are necessary here.  Also note that Map cannot be used here instead of
# Redirect, since lynxexec, as a special kind of URL, needs to be handled
# "early" in a request.)
# Of course, execution of arbitrary commands is a potentially dangerous
# thing.  lynxexec has to be specifically enabled at compile time and in
# lynx.cfg (or with command line options), and there are various levels
# of control, too much to go into here.  It is assumed in the following that
# lynxexec has been enabled to the degree necessary (allow /usr/bin/man
# execution) but hopefully not too much.
# What needs to be prevented is that allowing local execution of the man
# command might unintentionally open up unwanted execution of other commands,
# possibly by some trick that could be exploited.  For example, redirecting
# man:* as above, the URL "man:lynx;rm -r *" could result in the command
# "man lynx;rm -r *" executed by the system, with obvious disastrous results.
# (This particular example won't actually work, for several reasons; but
# for the purpose of discussion let's assume it did; there may be similar
# ones that do.)
# Because of such dangers, redirection to a lynxexec: is normally never
# accepted by lynx.  We need at least a PermitRedirection rule to override
# this protective limitation:
#   PermitRedirection man:*
#   Redirect          man:*  "lynxexec:/usr/bin/man *"
# But now we have potentially opened up local execution more than is
# acceptable via the man: scheme, so this needs to be examined.
# There are two aspects to security here: (1) restricting the user, and (2)
# protecting the user.  The first could also be phrased as protecting the
# system from the user; the second as preventing lynx (and the system) from
# doing things the user doesn't really want.  Aspect (1) is very important
# for setups providing anonymous guest accounts and similarly restricted
# environments.  (Otherwise shell access is normally allowed, and trying to
# protect the system in lynx would be rather pointless.)  As far as access
# to some URLs is concerned, the difference can be characterized in terms of
# which sources of URLs are trusted enough to allow access: for (1), only
# links occurring in a limited number of documents are trusted enough for
# some (or all) URLs, user input at 'g'oto prompts and the like is not (if
# not completely disabled).  For (2) and assuming a user with normal shell
# privileges, the user may be trusted enough to accept any URL explicitly
# entered, but URLs from arbitrary external sources are not - someone might
# try to use them to trick the user (by following an innocent-looking link)
# or lynx (by following a redirection) into doing something undesirable.
#
# In the following we are concerned with (2); it is assumed that providers
# of anonymous accounts would not want to follow this path, and would have
# no need for additional schemes that imply local execution anyway.  (For
# one thing, with the man example they would have to carefully check that
# users cannot break out of the man command to a local shell prompt.)
#
# Getting back to the example, it was already mentioned that lynx does not
# allow redirections to lynxexec.  In fact this continues to be disallowed
# for real redirections received from HTTP servers.  But we have introduced
# a new man: scheme, and the lynx code that does the redirection checking
# doesn't know anything about special considerations for man: URLs, so
# an external HTTP server might send a redirection message with "Location:
# man:<something>", which lynx would allow, and which would in turn be
# redirected by our rule to "lynxexec:/usr/bin/man <something>".  Unless
# we are 100% sure that either this can never happen or that the lynxexec
# URL resulting from this can have no harmful effect, this needs to be
# prevented.  It can be done by checking for the "redirected" condition,
# either by putting something like (the first line is of course optional)
#   Alert  man:*  "Redirection to man: not allowed" if redirected
#   Fail   man:*                                    if redirected
# somewhere before the Redirect rule, or, reversing the logic, by adding
# a condition to the redirection rules, i.e. they become
#   PermitRedirection man:*                             unless redirected
#   Redirect          man:*  "lynxexec:/usr/bin/man *"  unless redirected
# (actually, putting the condition on either one of the rules would be
# sufficient).  The second variant assumes that the attempted access to
# man: via redirection will ultimately fail because there is no other way
# to handle such URLs.
#
# The above should take care of rejecting man: URLs from redirections, but
# what about regular links in HTML (like <A HREF="man:...">)?  As long as
# it can be assumed that the user will always inspect each and every link
# before following it, and never follow a link that can have harmful effect,
# no further restrictions are necessary.  But this is a very big assumption,
# unrealistic except perhaps in some single-user setups where the user is
# identical with the rule writer.  So normally most links have to be
# regarded as suspect, and only URLs entered by the user can be accepted:
#   Alert  man:*  "Redirection to man: not allowed" if redirected
#   Fail   man:*                                    if redirected
#   Alert  man:*  "Link to man: not allowed"        unless userspecified
#   Fail   man:*                                    unless userspecified
#
# With these restrictions we have limited the ways our new man: scheme can
# be used rather severely, to the point where its usefulness is questionable.
# In addition to 'g'oto prompts, it may work in Jump files; also, should
# links to man:<something> appear in HTML text, the user could retype them
# manually or use the ELGOTO ('E') command with some trivial editing (like
# adding a space) to "confirm" the URL.  Even if the precautions outlined
# above are followed: THIS TEXT DOES NOT IMPLY ANY PROMISE THAT, BY FOLLOWING
# THE EXAMPLES, LYNX WILL BE SAFE.  On the other hand, some of the precautions
# *may* not be necessary: it is possible that careful use of TRUSTED_EXEC
# options in lynx.cfg could offer enough protection while making the new
# scheme more useful.
#
# If all this seems a bit too scary, that's intentional; it should be noted
# that these considerations are not in general necessary for "harmless" URL
# schemes, but appropriate for this "extreme" example.  One last remark
# regarding the hypothetical man scheme: instead of implementing it through
# "lynxexec:" or "lynxprog:", it would be somewhat safer to use "lynxcgi:"
# instead if it is supported.  A simple lynxcgi script would have to write
# the man page to stdout (either converted to text/html or as plain text,
# preceded by an appropriate Content-Type header line), and all necessary
# checking for special shell characters would be done within the script -
# lynx does not use the system() function to run the script.
#
# Other Limitations
# =================
# First, see CAVEAT above.  There are other limitations:
#
# 1. Applicable URL schemes
# -------------------------
# Rules processing does not apply to all URL schemes.  Some are
# handled differently from the generic access code, therefore rules
# for such URLs will never be "seen".  This limitation applies at
# least to lynxexec:, lynxprog:, mailto:, LYNXHIST:, LYNXMESSAGES:,
# LYNXCFG:, and LYNXCOMPILEOPTS: URLs.  You shouldn't be tempted
# to try to redirect most of these schemes anyway, but this also
# makes it impossible to disable them with "Fail" rules.
#
# Also, a scheme has to be known to Lynx in order to get as far as
# applying rules - you cannot just define your own new foobar: scheme
# and then map it to something here, but see Application 9, above,
# for a workaround.
#
# 2. No re-checking
# -----------------
# When a URL is mapped to a different one, the new URL is not checked
# again for compliance with most restrictions established by -anonymous,
# -restrictions, lynx.cfg and so on.  This can be regarded as a feature:
# it allows specific exceptions.  Of course it means that users for
# whom any restrictions must be enforced cannot have write access to a
# personal rules file, but that should be obvious anyway!
# This limitation does not apply if "Redirect" is used; in that case
# the new URL will always be re-examined.
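#
# A sketch of the difference, with hypothetical host names.  With the
# "Map" form, the target is accessed without being re-checked against
# most restrictions; with the "Redirect" form, lynx examines the target
# again like any newly requested URL.  Only one of the two rules would
# be used for a given URL:
#
#     Map       http://docs.example.com/*  http://internal.example.com/docs/*
#     Redirect  http://docs.example.com/*  http://internal.example.com/docs/*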
#
# 3. Mappings are invisible
# -------------------------
# Changing the URL with "Map" or "Pass" rules will in general not be
# visible to the user, because it happens at a late stage of processing
# a request (similar to directing a request through a proxy).  One
# can think of two kinds of URL for every resource: a "Document URL" as
# the user sees it (on INFO page, history list, status line, etc.), and
# a "physical URL" used for the actual access.  Rules change only the
# physical URL.  This is different from the effect of HTTP redirection.
# Often this is bad, sometimes it may be desirable.
#
# Changing the URL can create broken links if a document has relative URLs,
# since they are taken to be relative to the "Document URL" (if no BASE tag
# is present) when the HTML is parsed.
#
# This limitation does not apply if "Redirect" is used - the new location
# will be visible to the user, and will be used by lynx for resolving
# relative URLs within the document.
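#
# A worked example with hypothetical URLs.  Assume the rule
#
#     Map  http://www.example.com/manual/*  http://mirror.example.org/pub/doc/manual/*
#
# and a request for http://www.example.com/manual/index.html.  The
# document is fetched from
# http://mirror.example.org/pub/doc/manual/index.html, but lynx keeps
# showing the www.example.com URL.  A link "../logo.gif" in that document
# (no BASE tag) resolves to http://www.example.com/logo.gif, not to
# http://mirror.example.org/pub/doc/logo.gif, and may therefore be broken.
# With "Redirect" in place of "Map", the link would be resolved against
# the mirror.example.org URL.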
#
# 4. Interaction with proxying
# ----------------------------
# Rules processing is done after most other access checks, but before
# proxy (and gateway) settings are examined.  A "Fail" rule works
# as expected, but when the URL has been mapped to a different one,
# the subsequent proxy checking can get confused.  If it decides that
# access is through a proxy or gateway, it will generally use the
# original URL to construct the "physical" URL, effectively overriding
# the mapping rules.  If the mapping is to a different access scheme
# or hostname, proxy checking could also be fooled into using a proxy when
# it shouldn't, not using one when it should, or (if different proxies
# are used for different schemes) using the wrong proxy.  So "just
# don't do that"; in some cases setting the no_proxy variable will help.
# Example 3 happens to work nicely if there is a http_proxy but no
# ftp_proxy.
#
# This limitation does not come into play if a "UseProxy" rule is applied,
# in either of its two forms: with a PROXYURL, proxying is fully under
# the control of the rules author, and with "none", subsequent proxy
# and gateway checking is completely disabled.  It is therefore a good
# idea to combine any "Map" and "Pass" rules that might result in passing
# the changed URL with explicit "UseProxy" rules, if the rules file is
# expected to be used together with proxying; or else always use "Redirect"
# instead of simple passing.
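#
# A sketch of such a combination, with hypothetical host names.  It
# assumes that a "UseProxy" rule is matched against the URL as already
# changed by preceding "Map" rules; check the description of "UseProxy"
# earlier in this file before relying on that.  The second rule asks for
# direct access to the mapped URL, whatever the proxy settings say:
#
#     Map       http://old.example.com/*  http://new.example.com/*
#     UseProxy  http://new.example.com/*  none
#
# To send the mapped URL through a specific proxy instead, a PROXYURL
# (here also hypothetical) takes the place of "none":
#
#     UseProxy  http://new.example.com/*  http://proxy.example.com:8080/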
#
# 5. Case-sensitive matching
# --------------------------
# The matching logic is generic string-based.  It doesn't know anything
# about URL syntax, and so it cannot know in which parts of a URL case
# matters and where it doesn't.  As a result, all comparisons are case-
# sensitive.  If (a limited number of) case variations of a URL need
# to be dealt with, several rules can be used instead of one.
# In particular, this makes "UseProxy ... none" in some ways more limited
# than a no_proxy setting.
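#
# A sketch with a hypothetical URL: to catch a few spellings of the same
# location, each variation needs its own rule.  Which variations can
# actually reach rules processing depends on how the URL was written in
# the link or typed by the user, so this list is an assumption, not a
# complete set:
#
#     Fail  http://www.example.com/private/*
#     Fail  http://www.example.com/Private/*
#     Fail  http://WWW.EXAMPLE.COM/private/*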
#
# 6. Redirection differences
# --------------------------
# For some URLs lynx never checks after a request whether a redirection
# occurred; that makes the "Redirect" rule useless for such URLs (in addition
# to those mentioned under limitation 1.).  These include some gopher
# types, telnet: and similar in most situations, newspost: and similar,
# lynxcgi:, and some other private types.  Trying to redirect these will
# make access fail.  You probably don't want to change such URLs anyway,
# but if you feel you must, try using "Map" and "Pass" instead.
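#
# A sketch with hypothetical host names.  The first rule would make
# access fail, since lynx does not look for a redirection after a telnet:
# request; the other two change the URL silently instead (subject to
# limitations 2. to 4. above):
#
#     Redirect  telnet://oldhost.example.com*  telnet://newhost.example.com*
#
#     Map   telnet://oldhost.example.com*  telnet://newhost.example.com*
#     Pass  telnet://newhost.example.com*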
#
# The -noredir command line option only applies to real HTTP redirection
# responses; Redirect rules are still applied.  Also for certain other
# command line options (-mime_header, -head) and command keys (HEAD) lynx
# shows the redirection message (or part of it) in case of a real HTTP
# redirection, instead of following the redirection.  Here, too, a Redirect
# rule remains effective (there is no redirection message to show, after all).
#
# 7. URLs required
# ----------------
# Full absolute URLs (modulo possible "*" matching wildcards) are required
# in rules.  Strings like "www.somewhere.com" or "/some/dir/some.file" or
# "www.somewhere.com/some/dir/some.file" are not URLs.  Lynx may accept
# them as user input, as abbreviated forms for URLs; but by the time the
# rules get checked, those have been converted to full URLs, if they can
# be recognized.  This also means that rules cannot influence which strings
# typed at a 'g'oto prompt are recognized as URLs - rules processing kicks
# in later.
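#
# A sketch with a hypothetical host name.  The first two rules can never
# match, because their patterns are not URLs; the third uses a full
# absolute URL with a "*" wildcard and works as intended:
#
#     Fail  www.example.com/some/dir/*
#     Fail  /some/dir/*
#     Fail  http://www.example.com/some/dir/*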