[ Deprecated, unsupported, nonfunctional, but not yet completely excised. ]
Description of Dynamic Update and T_UNSPEC Code
Added by Mike Schwartz
University of Washington Computer Science Department
11/86
schwartz@cs.washington.edu
I have incorporated 2 new features into BIND:
1. Code to allow (unauthenticated) dynamic updates: surrounded by
#ifdef ALLOW_UPDATES
2. Code to allow data of unspecified type: surrounded by
#ifdef ALLOW_T_UNSPEC
Note that you can have one or the other or both (or neither) of these
modifications running, by appropriately modifying the makefiles. Also,
the external interface isn't changed (other than being extended), i.e.,
a BIND server that allows dynamic updates and/or T_UNSPEC data can
still talk to a 'vanilla' server using the 'vanilla' operations.
The description that follows is broken into 3 parts: a functional
description of the dynamic update facility, a functional description of
the T_UNSPEC facility, and a discussion of the implementation of
dynamic updates. The implementation description is mostly intended for
those who want to make future enhancements (especially the addition of
a good authentication mechanism). If you make enhancements, I would be
interested in hearing about them.

1. Dynamic Update Facility

I added this code in conjunction with my research into naming in large
heterogeneous systems. For the purposes of this research, I ignored
security issues. In other words, no authentication/authorization
mechanism exists to control updates. Authentication will hopefully be
addressed at some future point (although probably not by me). In the
meantime, BIND Internet name servers (as opposed to "private" name
server networks operating with their own port numbers, as I use in my
research) should be compiled *without* -DALLOW_UPDATES, so that the
integrity of the Internet name database won't be compromised by this
code.
There are 5 different dynamic update interfaces:
UPDATEA - add a resource record
UPDATED - delete a specific resource record
UPDATEDA - delete all named resource records
UPDATEM - modify a specific resource record
UPDATEMA - modify all named resource records
These all work through the normal resolver interface, i.e., these
interfaces are opcodes, and the data in the buffers passed to
res_mkquery must conform to what is expected for the particular
operation (see the #ifdef ALLOW_UPDATES extensions to nstest.c for
example usage).
UPDATEM is logically equivalent to an UPDATED followed by an UPDATEA,
except that the updates occur atomically at the primary server (as
usual with Domain servers, secondaries may become temporarily
inconsistent). The difference between UPDATED and UPDATEDA is that the
latter allows you to delete all RRs associated with a name; similarly
for UPDATEM and UPDATEMA. The reason for the UPDATE{D,M}A interfaces
is two-fold:
1. Sometimes you want to delete/modify some data, but you know you'll
only have a single RR for that data; in such a case, it's more
convenient to delete/modify the RR by just giving the name;
otherwise, you would have to first look it up, and then
delete/modify it.
2. It is sometimes useful to be able to delete/modify multiple RRs
this way, since one can then perform the operation atomically.
Otherwise, one would have to delete/modify the RRs one-by-one.
One additional point to note about UPDATEMA is that it will return a
success status if there were *zero* or more RRs associated with the given
name (and the RR add succeeds), whereas UPDATEM, UPDATED, and UPDATEDA
will return a success status if there were *one* or more RRs associated
with the given name. The reason for the difference is to handle the
(probably common) case where what you want to do is set a particular
name to contain a single RR, irrespective of whether or not it was
already set.
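
The success rules above can be sketched with a small in-memory table. This is
only an illustration: the enum, the table, and every function name below are
hypothetical and are not BIND's identifiers, and the real code works on
resolver buffers rather than C strings.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the five update opcodes. */
enum upd_op { OP_ADD, OP_DEL, OP_DEL_ALL, OP_MOD, OP_MOD_ALL };

#define MAX_RR 16

struct rr { char name[32]; char data[32]; int used; };

static struct rr table[MAX_RR];

/* Remove the RRs stored under `name`. If `data` is non-NULL, only the RR
 * with that data is removed. Returns the number of RRs removed. */
static int remove_rrs(const char *name, const char *data)
{
    int removed = 0;
    for (int i = 0; i < MAX_RR; i++) {
        if (table[i].used && strcmp(table[i].name, name) == 0 &&
            (data == NULL || strcmp(table[i].data, data) == 0)) {
            table[i].used = 0;
            removed++;
        }
    }
    return removed;
}

/* Store one RR. Returns 1 on success, 0 if the table is full. */
static int add_rr(const char *name, const char *data)
{
    assert(strlen(name) < sizeof table[0].name);
    assert(strlen(data) < sizeof table[0].data);
    for (int i = 0; i < MAX_RR; i++) {
        if (!table[i].used) {
            strcpy(table[i].name, name);
            strcpy(table[i].data, data);
            table[i].used = 1;
            return 1;
        }
    }
    return 0;
}

/* Number of RRs currently stored under `name`. */
int count_rrs(const char *name)
{
    int n = 0;
    for (int i = 0; i < MAX_RR; i++)
        if (table[i].used && strcmp(table[i].name, name) == 0)
            n++;
    return n;
}

/* Returns 1 on success, 0 on failure. `old_data` is used by OP_DEL and
 * OP_MOD; `new_data` is used by OP_ADD, OP_MOD and OP_MOD_ALL. */
int apply_update(enum upd_op op, const char *name,
                 const char *old_data, const char *new_data)
{
    switch (op) {
    case OP_ADD:
        assert(new_data != NULL);
        return add_rr(name, new_data);
    case OP_DEL:
        assert(old_data != NULL);
        return remove_rrs(name, old_data) >= 1;   /* needs one or more */
    case OP_DEL_ALL:
        return remove_rrs(name, NULL) >= 1;       /* needs one or more */
    case OP_MOD:
        assert(old_data != NULL && new_data != NULL);
        if (remove_rrs(name, old_data) < 1)       /* needs one or more */
            return 0;
        return add_rr(name, new_data);            /* a slot was just freed */
    case OP_MOD_ALL:
        assert(new_data != NULL);
        remove_rrs(name, NULL);                   /* zero matches is fine */
        return add_rr(name, new_data);
    }
    return 0;
}
```

With this sketch, apply_update(OP_MOD_ALL, "host", NULL, "10.0.0.1") succeeds
whether or not "host" already had RRs, whereas OP_MOD on an unknown name fails.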

2. T_UNSPEC Facility

Type T_UNSPEC allows you to store data whose layout BIND doesn't
understand. Data of this type is not marshalled (i.e., converted
between host and network representation, as is done, for example, with
Internet addresses) by BIND, so it is up to the client to make sure
things work out ok w.r.t. heterogeneous data representations. The way
I use this type is to have the client marshal data, store it, retrieve
it, and demarshal it. This way I can store arbitrary data in BIND
without having to add new code for each specific type.
T_UNSPEC data is dumped in an ASCII-encoded, checksummed format so
that, although it's not human-readable, it at least doesn't fill the
dump file with unprintable characters.
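
To illustrate what "ASCII-encoded, checksummed" can mean, here is a minimal
sketch that writes each byte as two hex digits and appends a one-byte additive
checksum. The actual dump encoding is not described in this document, so the
format, the ':' separator, and the function name encode_unspec are all
assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Encode `len` raw bytes as lowercase hex, then ':' and a two-digit hex
 * checksum (sum of all bytes modulo 256). This is an illustrative format,
 * not the one BIND writes. `out` must hold at least 2*len + 4 bytes.
 * Returns the length of the resulting string. */
size_t encode_unspec(const unsigned char *data, size_t len,
                     char *out, size_t outsz)
{
    static const char hex[] = "0123456789abcdef";
    unsigned sum = 0;
    size_t n = 0;

    assert(outsz >= 2 * len + 4);
    for (size_t i = 0; i < len; i++) {
        out[n++] = hex[data[i] >> 4];
        out[n++] = hex[data[i] & 0x0f];
        sum = (sum + data[i]) & 0xff;
    }
    out[n++] = ':';
    out[n++] = hex[sum >> 4];
    out[n++] = hex[sum & 0x0f];
    out[n] = '\0';
    return n;
}
```

Whatever the real encoding is, the property that matters is the same: every
output character is printable, and a reader can detect a corrupted record.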
Type T_UNSPEC is important for my research environment, where
potentially lots of people want to store data in the name service, and
each person's data looks different. Instead of having BIND understand
the format of each of their data types, the clients define marshalling
routines and pass buffers of marshalled data to BIND; BIND never tries
to demarshal the data...it just holds on to it, and gives it back to
the client when the client requests it, and the client must then
demarshal it.
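
A client-side marshalling routine might look like the sketch below. The
record type printer_info and both function names are hypothetical; the point
is only that the client, not BIND, fixes the byte order of the buffer it
stores as T_UNSPEC data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical client record; BIND would know nothing about this layout. */
struct printer_info {
    uint32_t queue_len;
    uint16_t port;
};

#define PRINTER_INFO_WIRE_SIZE 6

/* Write the record into `buf` in big-endian (network) byte order.
 * Returns the number of bytes written. */
size_t marshal_printer_info(const struct printer_info *p,
                            unsigned char *buf, size_t bufsz)
{
    assert(bufsz >= PRINTER_INFO_WIRE_SIZE);
    buf[0] = (unsigned char)(p->queue_len >> 24);
    buf[1] = (unsigned char)(p->queue_len >> 16);
    buf[2] = (unsigned char)(p->queue_len >> 8);
    buf[3] = (unsigned char)(p->queue_len);
    buf[4] = (unsigned char)(p->port >> 8);
    buf[5] = (unsigned char)(p->port);
    return PRINTER_INFO_WIRE_SIZE;
}

/* Rebuild the record from a buffer handed back by the name server.
 * Returns 1 on success, 0 if the buffer has the wrong length. */
int demarshal_printer_info(const unsigned char *buf, size_t len,
                           struct printer_info *p)
{
    if (len != PRINTER_INFO_WIRE_SIZE)
        return 0;
    p->queue_len = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
                   ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
    p->port = (uint16_t)(((unsigned)buf[4] << 8) | buf[5]);
    return 1;
}
```

Because the bytes are laid out explicitly, a record stored by a client on one
architecture can be read back correctly by a client on another.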
The Xerox Network System's name service (the Clearinghouse) works this
way. The reason 'vanilla' BIND understands the format of all the data
it holds is probably that BIND is tailored for a very specific
application, and wants to make sure the data it holds makes sense (and,
for some types, BIND needs to take additional action depending on the
data's semantics). For more general purpose name services (like the
Clearinghouse and my usage of BIND), this approach is less tractable.
See the #ifdef ALLOW_T_UNSPEC extensions to nstest.c for example usage of
this type.

3. Dynamic Update Implementation Description

This section is divided into 3 subsections: General Discussion,
Miscellaneous Points, and Known Defects.

3.1 General Discussion

The basic scheme is this: When an update message arrives, a call is
made to InitDynUpdate, which first looks up the SOA record for the zone
the update affects. If this is the primary server for that zone, we do
the update and then update the zone serial number (so that secondaries
will refresh later). If this is a secondary server, we forward the
update to the primary, and if that's successful, we update our copy
afterwards. If it's neither, we refuse the update. (One might think
to try to propagate the update to an authoritative server; I figured
that updates will probably be most likely within an administrative
domain anyway; this could be changed if someone has strong feelings
about it).
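
The routing decision described above can be sketched as follows. This is not
the code in InitDynUpdate: the types, the handle_update name, and the
forward_ok parameter (which stands in for the outcome of forwarding to the
primary) are assumptions made for illustration. Whether a secondary also
touches its own serial number is not stated here, so the sketch leaves it
alone.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types; the real server keeps this state in its zone table. */
enum zone_role { ROLE_PRIMARY, ROLE_SECONDARY, ROLE_NONE };
enum upd_result { UPD_OK, UPD_REFUSED, UPD_FORWARD_FAILED };

struct zone_state {
    enum zone_role role;
    unsigned long serial;   /* SOA serial number */
    int applied;            /* updates applied to the local copy */
};

enum upd_result handle_update(struct zone_state *z, int forward_ok)
{
    assert(z != NULL);
    switch (z->role) {
    case ROLE_PRIMARY:
        z->applied++;       /* do the update ... */
        z->serial++;        /* ... then bump the serial so secondaries refresh */
        return UPD_OK;
    case ROLE_SECONDARY:
        if (!forward_ok)    /* primary down or refused: no local change */
            return UPD_FORWARD_FAILED;
        z->applied++;       /* update our copy only after the primary accepts */
        return UPD_OK;
    case ROLE_NONE:
        break;
    }
    return UPD_REFUSED;     /* neither primary nor secondary for the zone */
}
```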
Note that this mechanism disallows updates when the primary is
down, preserving the Domain scheme's consistency requirements,
but making the primary a critical point for updates. This seemed
reasonable to me because
1. Alternative schemes must deal with potentially complex
situations involving merging of inconsistent secondary
updates
2. Updates are presumed to be rare relative to read accesses,
so this increased restrictiveness for updates over reads is
probably not critical
I have placed comments throughout the code, so it shouldn't be
too hard to see what I did. The majority of the processing is in
doupdate() and InitDynUpdate(). Also, I added a field to the zone
struct, to keep track of when zones get updated, so that only changed
zones get checkpointed.
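
The idea of the per-zone "changed" field can be sketched like this. The
struct, the field name, and checkpoint_changed are hypothetical, and the
callback merely stands in for whatever routine actually writes a zone out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical zone entry with a flag set whenever an update is applied. */
struct zone_entry {
    const char *origin;
    int changed;
};

/* Stand-in for the routine that writes one zone to its file. */
typedef void (*dump_fn)(const struct zone_entry *);

/* Checkpoint only the zones that changed since the last checkpoint and
 * clear their flags. Returns how many zones were written. */
size_t checkpoint_changed(struct zone_entry *zones, size_t nzones,
                          dump_fn dump)
{
    size_t dumped = 0;

    assert(zones != NULL || nzones == 0);
    for (size_t i = 0; i < nzones; i++) {
        if (!zones[i].changed)
            continue;               /* untouched zone: nothing to write */
        if (dump != NULL)
            dump(&zones[i]);
        zones[i].changed = 0;
        dumped++;
    }
    return dumped;
}
```

Calling it twice in a row writes nothing the second time, which is the
saving the extra field is meant to provide.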

3.2 Miscellaneous Points

I use ns_maint to call zonedump() if the database changes, to
provide a checkpointing mechanism. I use the zone refresh times to
set up ns_maint interrupts if there are either secondaries or
primaries. Hence, if there is a secondary, this interrupt can cause
zoneref (as before), and if there is a primary, this interrupt can
cause doadump. I also checkpoint if needed before shutting down.
You can force a server to checkpoint any changed zones by sending the
maint signal (SIGALRM) to the process. Otherwise it just checkpoints
during maint. interrupts, or when being shut down (with SIGTERM).
Sending it the dump signal causes the database to be dumped into the
(single) dump file, but doesn't checkpoint (i.e., update the boot
files). Note that the boot files will be overwritten with checkpoint
files, so if you want to preserve the comments, you should keep copies
of the original boot files separate from the versions that are actually
used.
I disallow T_SOA updates, for several reasons:
- T_SOA deletes at the primary won't be discovered by the secondaries
until they try to request them at maint time, which will cause
a failure
- the corresponding NS record would have to be deleted at the same
time (atomically) to avoid various problems
- T_SOA updates would have to be done in the right order, or else
the primary and secondaries will be out-of-sync for that zone.
My feeling is that changing the zone topology is a weighty enough thing
to do that it should involve changing the load file and reloading all
affected servers.
There are a lot of places where BIND exits due to catastrophic failures
(mainly malloc failures). I don't try to dump the database in these
places because it's probably inconsistent anyway. It's probably better
to depend on the most recent dump.

3.3 Known Defects

1. I put the following comment in nlookup (db_lookup.c):
Note: at this point, if np->n_data is NULL, we could be in one
of two situations: Either we have come across a name for which
all the RRs have been (dynamically) deleted, or else we have
come across a name which has no RRs associated with it because
it is just a place holder (e.g., EDU). In the former case, we
would like to delete the namebuf, since it is no longer of use,
but in the latter case we need to hold on to it, so future
lookups that depend on it don't fail. The only way I can see
of doing this is to always leave the namebufs around (although
then the memory usage continues to grow whenever names are
added, and can never shrink back down completely when all their
associated RRs are deleted).
Thus, there is a problem that the memory usage will keep growing for
the situation described. You might just choose to ignore this
problem (since I don't see any good way out), since things probably
won't grow fast anyway (how many names are created and then deleted
during a single server incarnation, after all?)
The problem is that one can't delete old namebufs because one would
want to do it from db_update, but db_update calls nlookup to do the
actual work, and can't do it there, since we need to maintain place
holders. One could make db_update not call nlookup, so we know it's
ok to delete the namebuf (since we know the call is part of a delete
call); but then there is code with a lot of overlapping functionality
in the 2 routines.
This also causes another problem: If you create a name and then do
UPDATEDA, all its RRs get deleted, but the name remains; then, if you
do a lookup on that name later, the name is found in the hash table,
but no RRs are found for it. The server then forwards the query to itself (for
some reason), and then somehow decides there is no such domain, and then
returns (with the correct answer, but after going through extra work).
But the name remains, and each time it is looked up, we go through
these same steps. This should be fixed, but I don't have time right
now (and the right answer seems to come back anyway, so it's good
enough for now).
2. There are 2 problems that crop up when you store data (other than
T_SOA and T_NS records) in the root:
a. The primary does not transfer (via doaxfr) RRs other than SOA and
NS to the secondary.
b. Upon checkpoint (zonedump), this data sometimes comes out after other
data in the root, so that (since the SOA and NS records have null
names), they will get interpreted as being records under the
other names upon the next boot up. For example, if you have a
T_A record called ABC, the checkpoint may look like:
    $ORIGIN .
    ABC             IN      A       128.95.1.3
                    99999999 IN     NS      UW-BORNEO.
                    IN      SOA     UW-BORNEO. SCHWARTZ.CS.WASHINGTON.EDU.
                            ( 50 3600 300 3600000 3600 )
Then when booting up the next time, the SOA and NS records get
interpreted as being called "ABC" rather than the null root
name.
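
The misreading follows from the master-file rule that a line beginning with
white space inherits the owner name of the previous record. The sketch below
shows only that rule; owner_for_line is a hypothetical helper, not a routine
in the BIND loader, and it ignores everything on the line except the owner
field.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Update `owner` for one master-file line. A line that starts with white
 * space has a blank owner field, so the previous owner carries over.
 * Directive lines such as $ORIGIN are skipped in this sketch. */
void owner_for_line(const char *line, char *owner, size_t ownersz)
{
    size_t n;

    assert(ownersz > 0);
    if (line[0] == '\0' || line[0] == '$' ||
        isspace((unsigned char)line[0]))
        return;                     /* keep the previous owner */
    n = strcspn(line, " \t");       /* owner is the first field */
    if (n >= ownersz)
        n = ownersz - 1;
    memcpy(owner, line, n);
    owner[n] = '\0';
}
```

Fed the checkpoint above line by line, the owner becomes "ABC" on the A
record and is still "ABC" when the NS and SOA lines are read, which is the
defect described.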
3. The secondary server caches the T_A RR for the primary, and hence when
it tries to ns_forw an update, it won't find the address of the primary
using nslookup unless that T_A RR is *also* stored in the main hashtable
(by putting it in a named.db file as well as the named.ca file).