Date: Tue, 27 Apr 93 04:30:08 PDT
From: Advanced Amateur Radio Networking Group <tcp-group@ucsd.edu>
Errors-To: TCP-Group-Errors@UCSD.Edu
Reply-To: TCP-Group@UCSD.Edu
Precedence: Bulk
Subject: TCP-Group Digest V93 #110
To: tcp-group-digest

TCP-Group Digest            Tue, 27 Apr 93       Volume 93 : Issue  110

Today's Topics:
                    386 version of GRINOS or JNOS?
                    Ethernet board going to sleep....
                    Still there ?
                    Thoughts about alloc, bugs and RSPF.

Send Replies or notes for publication to: <TCP-Group@UCSD.Edu>.
Subscription requests to <TCP-Group-REQUEST@UCSD.Edu>.
Problems you can't solve otherwise to brian@ucsd.edu.

Archives of past issues of the TCP-Group Digest are available
(by FTP only) from UCSD.Edu in directory "mailarchives".

We trust that readers are intelligent enough to realize that all text
herein consists of personal comments and does not represent the official
policies or positions of any party.  Your mileage may vary.  So there.
----------------------------------------------------------------------

Date: Mon, 26 Apr 1993 11:19:08 -0400 (EDT)
From: MIKEBW@ids.net (Mike Bilow, <MIKEBW@ids.net>)
Subject: 386 version of GRINOS or JNOS?
To: pascoe@rocky.tntn.gtegsc.com, tcp-group@ucsd.edu

Phil figured out the changes necessary to the ASM code to allow NOS to be
built safely with the "-3" (generate 386-specific code) switch under BC++
3.1, and I have been working on porting it into GRINOS.  Since the
low-level code is essentially the same in GRINOS and in JNOS, the port
should convert directly.

I just have been very busy lately, and I am currently trying to get a new
hard drive installed here.  When that is done, I will finally have the
space to work on this easily, and it will be a priority.

-- Mike

--
Mike Bilow, <mikebw@ids.net> (Internet)
N1BEE @ WA1PHY.#EMA.MA.USA.NA (AX.25)

------------------------------

Date: Mon, 26 Apr 1993 11:41:27 -0400 (EDT)
From: MIKEBW@ids.net (Mike Bilow, <MIKEBW@ids.net>)
Subject: Ethernet board going to sleep....
To: kf5mg@vnet.IBM.COM, tcp-group@ucsd.edu

I'm not sure I can see why "tcp irtt" and friends would have an effect on
the Ethernet LAN unless things are awfully messed up with noisy cable and
the like.  What does NOS show for the computed rtt on the dead circuits?
Slow machines running in XT mode ("isat 0") will tend to miscompute the rtt
for a very fast circuit, showing zero or negative/huge positive numbers,
but I thought this was fixed a couple of years ago.  If the machine is not
an XT, try manually issuing the "isat 1" command.  In XT mode, NOS can only
time things to 55ms resolution, which may lead to anomalous results with
Ethernet.

I have seen the problem you describe, but it has always been memory
related here.

--
Mike Bilow, <mikebw@ids.net> (Internet)
N1BEE @ WA1PHY.#EMA.MA.USA.NA (AX.25)

------------------------------

Date: 26 Apr 1993 13:54:50 -0500 (EST)
From: wilson%inf.UFRGS.BR@UICVM.UIC.EDU (Wilson Roberto Afonso)
Subject: Still there ?
To: tcpgroup@ucsd.edu

Does this list still exist ? If so, how can I subscribe to it ? And how
much traffic is there ?

Thank you very much.

-Wilson (wilson@inf.ufrgs.br)

------------------------------

Date: Mon, 26 Apr 1993 11:04:45 +0200 (MET DST)
From: Lars Petterson <d8lars@dtek.chalmers.se>
Subject: Thoughts about alloc, bugs and RSPF.
To: tcp-group@ucsd.edu

This all started after reading Doug Crompton's letter about KA9Q's alloc
code.  I took a look at it and put it into my NOS and it seemed to work
fine.  I had to add the ibufs to mbuf.c again as my old PC couldn't keep
up with the speed, but that was all (4.77 MHz, 8088 and a DRSI card, I'm a
poor student :-)

In append(), mbuf.c, interrupts are turned off (and psignal is called) in
the PA0GRI NOS but not in KA9Q's.  Is this not needed in append()?

I also noted that mem efficient in JNOS and PA0GRI NOS had no meaning at
all.
The efficient code was set up to reset the Allocp pointer to point at the
Base element, but Allocp was never changed, so it was always pointing at
the Base element.  KA9Q NOS, when circular is not defined, also works in
the efficient mode (some sort of "best fit" strategy).  I took a look at
g1emm's last NOS and there I saw the reason for the mem efficient command.
His code was actually using a circular heap and therefore had to reset the
Allocp pointer.  Later the circular part was removed, but nobody seems to
have noticed that Allocp then never got changed.  In other words, the mem
efficient command can be removed.

There's a way to make the morecore() calls from malloc() a bit faster.
The problem is that when we call morecore() in malloc() we have to go
through the heap once again when the morecore() call returns.  This could
be changed by setting Allocp to point at q in free() (this is done by
removing the #ifdef circular in free()).  As morecore() returns Allocp, we
will now have our new block as the next block on the heap and we don't
have to go through the whole heap again.  This is not true if the new
block gets concatenated with the last block on the heap; we then have to
go through the whole heap just as before.  I don't know how common that
is, but whenever it doesn't happen morecore() will be a bit faster.  In
malloc() we then have to reset the Allocp pointer to point at the Base
element.  This is done by inserting

    q = Allocp = &Base;

just before the comment /* Search heap list */ (the same place as the mem
efficient code had, it's the same code).

I have also changed realloc() so that area will not be freed when we can't
get memory for the new area.  In K&R "The C Programming Language" they
write "realloc returns a pointer to the new space, or NULL if the request
cannot be satisfied, in which case *p is unchanged".
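
A minimal sketch of that contract seen from the caller's side, in standard
C rather than NOS code.  grow_buffer() is a hypothetical helper introduced
only for illustration; it assumes a conforming realloc() and keeps the old
pointer until the request is known to have succeeded, so a failed request
loses nothing.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, not part of NOS: resize *bufp to newsize bytes.
 * Returns 0 on success and -1 on failure.  On failure *bufp is left
 * untouched and still owns the old block, which relies on realloc()
 * not freeing its argument when it returns NULL. */
int grow_buffer(char **bufp, size_t newsize)
{
    char *tmp;

    assert(bufp != NULL);
    tmp = realloc(*bufp, newsize);
    if (tmp == NULL)
        return -1;      /* old block is still valid and still ours */
    *bufp = tmp;        /* commit only after success */
    return 0;
}
```

Writing buf = realloc(buf, n) directly would overwrite the only pointer to
the old block with NULL on failure, which is exactly the loss the K&R
wording rules out.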

    /* Move existing block to new area */
    void *
    realloc(area,size)
    void *area;
    unsigned size;
    {
        HEADER HUGE *hp = ((HEADER *)area) - 1;
        unsigned osize = (hp->s.size - 1) * ABLKSIZE;
        void *new = malloc(size);

        Reallocs++;
        /* We must copy the block since freeing it may cause the heap
         * debugging code to scribble over it.
         */
        if(new) {
            /* Copy the smaller of the old and new sizes */
            memcpy(new, area, (size > osize) ? osize : size);
            free(area);    /* area is released only on success */
        }
        return new;
    }

On the other hand, nothing seems to call realloc().  I have put a counter
on it and after 3 days it's still 0.

As coreleft() doesn't work once we have allocated extra blocks, we need
something else instead (I like to know how much memory I have left).  I
found a piece of code in PE1CHL's NET code that seems to work OK.  I have
put it into dostat():

    /* Print heap stats */
    static int
    dostat(argc,argv,envp)
    int argc;
    char *argv[];
    void *envp;
    {
    #ifdef __TURBOC__
        /* Totalfree code from PE1CHL NET -- sm6rpz */
        unsigned long totalfree;

        _AX = 0x4800;
        _BX = 0xffff;
        geninterrupt(0x21);
        totalfree = 16L * (long) _BX;

        tputs("KA9Q 1993 alloc code.\n");
        tprintf("heap size %lu, used %lu, avail %lu (%lu%%), morecores %lu, coreleft %lu\n",
            Heapsize, Heapsize - Availmem * ABLKSIZE, Availmem * ABLKSIZE,
            100L * Availmem * ABLKSIZE / Heapsize, Morecores, totalfree);
    #else
        /* the first line of the old code */
        [...]
    }

(I must admit that I don't understand PE1CHL's code but it works fine on
my machine)  I've also added "used", which makes it easier to find memory
leaks.

When I started using the mem debug facility my NOS began to crash now and
then, especially when ax25 connects were disconnected.  These crashes were
due to use of newly freed memory in lapb_incoming() (all versions of NOS).
After the switch-statement axp was used for piggybacking.  The problem was
that we had freed axp by calling lapbstate(axp,LAPB_DISCONNECTED) in
several places.  lapbstate() calls s_ascall() that calls del_ax25() that
frees axp.
I corrected this bug by changing break, after the call to lapbstate(), to:

    free_p(bp);
    return 0;

in all places where we call lapbstate(axp,LAPB_DISCONNECTED).  This is
also needed after the call to del_ax25() (not in KA9Q NOS).

The next bug is in tcpuser.c, in reset_all(), which is called by the exit
command (corrected in KA9Q NOS).  reset_all() will call reset_tcp() that
calls close_self() that calls set_state() that calls s_tscall() that calls
del_tcp() that frees tcb.  To correct this bug we have to save the next
pointer before freeing the memory:

    /* Clear all TCP connections */
    void
    reset_all()
    {
        register struct tcb *tcb, *tcb1;

        for(tcb = Tcbs; tcb != NULLTCB; tcb = tcb1) {
            tcb1 = tcb->next;    /* Save the next pointer */
            reset_tcp(tcb);      /* tcb is freed by this call */
        }
        pwait(NULL);    /* Let the RSTs go forth */
    }

The next two bug fixes are from my own code and have to do with arp and
ax25 routes.  I have added an interface to these, and this is also
incorporated in JNOS.  The bug is the same as above: I used a next pointer
from memory that had been freed.  This bug is in if_detach() in iface.c
(this part was not added to JNOS but should be).  Here is the corrected
code:

    /* Detach a specified interface */
    int
    if_detach(ifp)
    register struct iface *ifp;
    {
        struct iface *iftmp;
        struct route *rp,*rptmp;
        int i,j;
        struct ax_route *axr, *axr1;
        struct arp_tab *ap, *ap1;

        [...]

    #ifdef AX25
        /* Drop all ax25 routes that points to this interface */
        for(axr = Ax_routes; axr != NULLAXR; axr = axr1) {
            axr1 = axr->next;    /* Save the next pointer */
            if(axr->iface == ifp)
                ax_drop(axr->target, ifp);
            /* axr will be undefined after ax_drop() */
        }
    #endif

        /* Drop all ARP's that point to this interface */
        for(i = 0; i < HASHMOD; ++i)
            for(ap = Arp_tab[i]; ap != NULLARP; ap = ap1) {
                ap1 = ap->next;    /* Save the next pointer */
                if(ap->iface == ifp)
                    arp_drop(ap);
                /* ap will be undefined after arp_drop() */
            }

        /* Drop all routes that point to this interface */
        if(R_default.iface == ifp)
        [...]
    }

The last bug I found was probably THE bug in the RSPF code.
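
The fix shared by the loops above is to read the next pointer before
calling anything that may free the current node.  A minimal portable
sketch of that pattern, assuming a plain singly linked list; struct node,
push() and drop_doomed() are hypothetical names used only for illustration
and do not appear in NOS:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node type standing in for struct tcb, ax_route, etc. */
struct node {
    struct node *next;
    int doomed;             /* nonzero: entry should be deleted */
};

/* Push a new node on the front of the list and return the new head */
struct node *push(struct node *head, int doomed)
{
    struct node *np = malloc(sizeof *np);

    assert(np != NULL);
    np->next = head;
    np->doomed = doomed;
    return np;
}

/* Unlink and free every doomed node; returns the number freed */
int drop_doomed(struct node **headp)
{
    struct node *np, *next;
    struct node **link = headp;   /* pointer that currently points at np */
    int freed = 0;

    for (np = *headp; np != NULL; np = next) {
        next = np->next;          /* save it: np may be freed below */
        if (np->doomed) {
            *link = next;         /* unlink np from the list */
            free(np);             /* np is undefined from here on */
            freed++;
        } else {
            link = &np->next;
        }
    }
    return freed;
}
```

Writing the loop step as np = np->next instead would read from a node that
free() has already released, which is the failure described above.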
The problem was the same as above: a next pointer was read from memory
that had already been freed.  This was in makeroutes() in rspf.c.  Change
it to:

    /* The shortest path first algorithm */
    static void
    makeroutes()
    {
        register struct route *rp, *rp2, *saved[HASHMOD];
        register struct rspfadj *adj, *lowadj, *gateway;
        register struct rspflinkh linkh;
        register struct rspfrouter *rr, *rrprev;
        register int i, adjcnt;
        register char *lowp, *r;
        struct mbuf *bp;
        int32 lowaddr;
        int bits, low;

        if(Keeprouter)  /* If false, purge unreachable router entries. */
            --Keeprouter;
        /* Remove all non-manual routes. */
        for(bits = 1; bits <= 32; bits++)
            for(i = 0; i < HASHMOD; i++)
                for(rp = Routes[bits-1][i]; rp != NULLROUTE; rp = rp2) {
                    rp2 = rp->next;    /* BIG BUG FIX -- sm6rpz */
                    if(dur_timer(&rp->timer))
                        rt_drop(rp->target,bits);
                    /* rp is undefined here if rt_drop() was called */
                }
        if((rp = rt_blookup(0L,0)) != NULLROUTE && dur_timer(&rp->timer))
            rt_drop(0L,0);    /* Delete non-manual default route. */

        [...]
    }

rt_drop() will free rp above.  This correction has made the RSPF code a
lot better and we have not yet seen any crashes.  I will therefore upload
my latest RSPF code later this week.  It works for multiport routers.  The
biggest problem is one-way routes.  I don't know how to handle these and
can't find any answers in the RSPF 2.2 protocol.  So if you have any ideas
on this, please write.  I will start implementing 2.2 as soon as possible.

While you're correcting lapb_input(), also change

    axp->flags.remotebusy = (control == RNR) ? YES : NO;

to

    axp->flags.remotebusy = (type == RNR) ? YES : NO; /* Scot McIntosh */

This will make NOS understand RNRs (Scot McIntosh in tcp-group, digest
number 80).  It's nice to see NOS handle these correctly :-)

73 de Lars, sm6rpz

--
Lars Pettersson
Chalmers University of Technology

------------------------------

End of TCP-Group Digest V93 #110
******************************
******************************