Date: Tue, 27 Apr 93 04:30:08 PDT
From: Advanced Amateur Radio Networking Group <tcp-group@ucsd.edu>
Errors-To: TCP-Group-Errors@UCSD.Edu
Reply-To: TCP-Group@UCSD.Edu
Precedence: Bulk
Subject: TCP-Group Digest V93 #110
To: tcp-group-digest


TCP-Group Digest            Tue, 27 Apr 93       Volume 93 : Issue  110

Today's Topics:
                    386 version of GRINOS or JNOS?
                  Ethernet board going to sleep....
                            Still there ?
                 Thoughts about alloc, bugs and RSPF.

Send Replies or notes for publication to: <TCP-Group@UCSD.Edu>.
Subscription requests to <TCP-Group-REQUEST@UCSD.Edu>.
Problems you can't solve otherwise to brian@ucsd.edu.

Archives of past issues of the TCP-Group Digest are available
(by FTP only) from UCSD.Edu in directory "mailarchives".

We trust that readers are intelligent enough to realize that all text
herein consists of personal comments and does not represent the official
policies or positions of any party.  Your mileage may vary.  So there.
----------------------------------------------------------------------

Date: Mon, 26 Apr 1993 11:19:08 -0400 (EDT)
From: MIKEBW@ids.net (Mike Bilow, <MIKEBW@ids.net>)
Subject: 386 version of GRINOS or JNOS?
To: pascoe@rocky.tntn.gtegsc.com, tcp-group@ucsd.edu

Phil figured out the changes necessary to the ASM code to allow NOS to
be built safely with the "-3" (generate 386-specific code) switch under
BC++ 3.1, and I have been working on porting it into GRINOS.  Since the
low-level code is essentially the same in GRINOS and in JNOS, the port
should convert directly.  I just have been very busy lately, and I am
currently trying to get a new hard drive installed here.  When that is
done, I will finally have the space to work on this easily, and it will
be a priority.

-- Mike
-- Mike Bilow, <mikebw@ids.net>  (Internet)
   N1BEE @ WA1PHY.#EMA.MA.USA.NA (AX.25)

------------------------------

Date: Mon, 26 Apr 1993 11:41:27 -0400 (EDT)
From: MIKEBW@ids.net (Mike Bilow, <MIKEBW@ids.net>)
Subject: Ethernet board going to sleep....
To: kf5mg@vnet.IBM.COM, tcp-group@ucsd.edu

I'm not sure I can see why "tcp irtt" and friends would have an effect on
the Ethernet LAN unless things are awfully messed with noisy cable and
the like.  What does NOS show for the computed rtt on the dead circuits?
Slow machines running in XT mode ("isat 0") will tend to miscompute the
rtt for a very fast circuit, showing zero or negative/huge positive numbers,
but I thought this was fixed a couple of years ago.  If the machine is not an
XT, try manually issuing the "isat 1" command.  In XT mode, NOS can only
time things to 55ms resolution, which may lead to anomalous results with
Ethernet.  I have seen the problem you describe, but it has always been
memory related here.
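
To illustrate the resolution point, here is a minimal sketch, assuming
the clock simply counts whole 55 ms ticks. The helper name and the tick
arithmetic are hypothetical and are not taken from NOS. A round trip
shorter than one tick measures as zero unless a tick boundary happens
to fall inside it, in which case it measures as a full 55 ms.

```c
#include <assert.h>

#define TICK_MS 55L   /* approximate XT-mode timer resolution, per the text */

/* Hypothetical helper: round-trip time as seen by a clock that only
 * advances in whole ticks. start_ms and end_ms are "true" times in ms.
 */
long
measured_rtt_ms(long start_ms, long end_ms)
{
    long start_tick, end_tick;

    assert(start_ms >= 0 && end_ms >= start_ms);
    start_tick = start_ms / TICK_MS;    /* tick count when the segment left */
    end_tick = end_ms / TICK_MS;        /* tick count when the ACK arrived */
    return (end_tick - start_tick) * TICK_MS;
}
```

A 2 ms Ethernet round trip is therefore seen as either 0 or 55 ms,
never as 2 ms.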

-- Mike Bilow, <mikebw@ids.net>  (Internet)
   N1BEE @ WA1PHY.#EMA.MA.USA.NA (AX.25)

------------------------------

Date: 26 Apr 1993 13:54:50 -0500 (EST)
From: wilson%inf.UFRGS.BR@UICVM.UIC.EDU (Wilson Roberto Afonso)
Subject: Still there ?
To: tcpgroup@ucsd.edu

Does this list still exist ?  If so, how can I subscribe to it ?  And
how much traffic is there ?

Thank you very much.

-Wilson  (wilson@inf.ufrgs.br)

------------------------------

Date: Mon, 26 Apr 1993 11:04:45 +0200 (MET DST)
From: Lars Petterson <d8lars@dtek.chalmers.se>
Subject: Thoughts about alloc, bugs and RSPF.
To: tcp-group@ucsd.edu

This all started after reading Doug Crompton's letter about KA9Q's
alloc code. I took a look at it and put it into my NOS and it
seemed to work fine. I had to add the ibufs to mbuf.c again as my
old PC couldn't keep up with the speed, but that was all (4.77
MHz, 8088 and a DRSI card, I'm a poor student :-)

In append(), mbuf.c, the interrupt was turned off (psignal was
also called) in the PA0GRI NOS but not in KA9Q's. Is this not
needed in append()?

I also noted that mem efficient in JNOS and PA0GRI NOS had no
meaning at all. The efficient code was set up to reset the Allocp
pointer to point at the Base element, but Allocp was never
changed so it was always pointing at the Base element.

KA9Q NOS, when circular is not defined, is also working in the
efficient mode (some sort of "best fit" strategy).

I took a look at g1emm's last NOS and there I saw the reason
for the mem efficient command. His code was actually using a
circular heap and therefore had to reset the Allocp pointer.
Later the circular part was removed, but nobody seems to have
noticed that Allocp then never got changed.

I.e. one can remove the mem efficient command.

There's a way to make the morecore() calls from malloc() a bit
faster. The problem here is that when we call morecore() in
malloc() we will have to go through the heap once again when the
morecore() call returns. This could be changed by setting Allocp
to point at q in free() (this is done by removing the #ifdef
circular in free()). As morecore() returns Allocp we will now
have our new block as the next block on the heap and we don't
have to go through the whole heap again.

This is not true if we concatenate the last block on the heap
with the new block; we then have to go through the whole heap
just as before. I don't know whether this is common, but when
it doesn't happen the morecore() path will be a bit faster.

In malloc() we then have to reset the Allocp pointer to point at
the Base element. This is done by inserting q = Allocp = &Base;
just before the comment /* Search heap list */ (the same place as
the mem efficient code had, it's the same code).

I have also changed realloc() so that area will not be freed when
we can't get memory for the new area. In K&R "The C Programming
Language" they write "realloc returns a pointer to the new space,
or NULL if the request cannot be satisfied, in which case *p is
unchanged".

/* Move existing block to new area */
void *
realloc(area,size)
void *area;
unsigned size;
{
 HEADER HUGE *hp = ((HEADER *)area) - 1;
 unsigned osize = (hp->s.size - 1) * ABLKSIZE;
 void *new = malloc(size);

 Reallocs++;

 /* We must copy the block since freeing it may cause the heap
  * debugging code to scribble over it.
  */
 if(new) {
  memcpy(new, area, (size > osize) ? osize : size);
  free(area);
 }
 return new;
}

On the other hand, nothing seems to call realloc(). I have put a
counter on it and after 3 days it's still 0.
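
For any caller that does use realloc(), the K&R contract quoted above
means the old pointer must be kept until the call has succeeded. A
minimal sketch in standard C follows; grow_buffer() is a hypothetical
helper for illustration and is not part of NOS.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: grow *bufp to newsize bytes.
 * Returns 0 on success, -1 on failure. On failure *bufp is left
 * untouched and still valid, relying on realloc() not freeing the
 * old block when it cannot satisfy the request.
 */
int
grow_buffer(char **bufp, size_t newsize)
{
    char *tmp;

    assert(bufp != NULL && newsize > 0);
    tmp = realloc(*bufp, newsize);
    if(tmp == NULL)
        return -1;      /* old block is still allocated; do not lose it */
    *bufp = tmp;
    return 0;
}
```

Assigning the result straight back to the original pointer would leak
the old block whenever realloc() fails.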

As coreleft() doesn't work when we have allocated extra blocks we
need something else instead (I like to know how much memory I
have left). I found a piece of code in PE1CHL's NET code that
seems to work OK. I have put it into dostat():

/* Print heap stats */
static int
dostat(argc,argv,envp)
int argc;
char *argv[];
void *envp;
{
#ifdef __TURBOC__
 /* Totalfree code from PE1CHL NET -- sm6rpz */
 unsigned long totalfree;

 _AX = 0x4800;
 _BX = 0xffff;
 geninterrupt(0x21);
 totalfree = 16L * (long) _BX;

 tputs("KA9Q 1993 alloc code.\n");
 tprintf("heap size %lu, used %lu, avail %lu (%lu%%), morecores %lu, coreleft %lu\n",
  Heapsize,
  Heapsize - Availmem * ABLKSIZE,
  Availmem * ABLKSIZE,
  100L * Availmem * ABLKSIZE / Heapsize,
  Morecores,
  totalfree);
#else
/* the first line of the old code */
[...]
}

(I must admit that I don't understand PE1CHL's code but it works
fine on my machine)
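
For what it is worth, the fragment appears to rely on documented DOS
behaviour: INT 21h function 48h (allocate memory) asked for 0xFFFF
paragraphs cannot succeed, and on failure DOS returns in BX the size of
the largest free block, counted in 16-byte paragraphs. So the figure is
the largest free DOS block rather than a true total. The arithmetic
alone, as a portable sketch (the function name is hypothetical):

```c
#include <assert.h>

#define PARAGRAPH_BYTES 16UL   /* one DOS paragraph is 16 bytes */

/* Hypothetical helper: convert the paragraph count DOS leaves in BX
 * into bytes. BX is a 16-bit register, so larger values are rejected.
 */
unsigned long
paragraphs_to_bytes(unsigned long bx)
{
    assert(bx <= 0xFFFFUL);
    return PARAGRAPH_BYTES * bx;
}
```

The multiplication is done in long arithmetic, as in the 16L in the
code above, because the result does not fit in 16 bits.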

I've also added "used" that makes it easier to find memory leaks.

When I started using the mem debug facility my NOS began to crash
now and then, especially when ax25 connections were disconnected.

These crashes were due to use of newly freed memory in
lapb_incoming() (all versions of NOS). After the switch-statement
axp was used for piggybacking. The problem was that we had freed
axp by calling lapbstate(axp,LAPB_DISCONNECTED) in several
places. lapbstate() calls s_ascall() that calls del_ax25() that
frees axp. I corrected this bug by changing break, after the call
to lapbstate(), to:

 free_p(bp);
 return 0;

in all places where we call lapbstate(axp,LAPB_DISCONNECTED).
This is also needed after the call to del_ax25() (not in KA9Q
NOS).
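
The shape of this fix can be sketched outside NOS as follows. All
names here (struct conn, conn_disconnect, conn_input, Piggybacks) are
hypothetical stand-ins, and the sketch only assumes what is described
above: the state change frees the control block, so the caller must
return instead of falling through to code that still uses it.

```c
#include <assert.h>
#include <stdlib.h>

struct conn {               /* hypothetical stand-in for the ax25 control block */
    int state;
    int acks_pending;
};

int Piggybacks;             /* counts how often the tail code ran */

/* Stand-in for lapbstate(axp,LAPB_DISCONNECTED): frees the block. */
void
conn_disconnect(struct conn *cp)
{
    free(cp);
}

/* Takes ownership of cp when disconnect is nonzero. Always returns 0. */
int
conn_input(struct conn *cp, int disconnect)
{
    assert(cp != NULL);
    if(disconnect) {
        conn_disconnect(cp);
        return 0;           /* cp is gone: leave now instead of "break" */
    }
    /* Tail code that touches cp, like the piggybacking after the switch */
    cp->acks_pending = 0;
    Piggybacks++;
    return 0;
}
```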

The next bug is in tcpuser.c, reset_all() that is called by the
exit command (corrected in KA9Q NOS).

Reset_all() will call reset_tcp() that calls close_self() that
calls set_state() that calls s_tscall() that calls del_tcp() that
frees tcb.

To correct this bug we have to save the next pointer before
freeing the memory:

/* Clear all TCP connections */
void
reset_all()
{
 register struct tcb *tcb, *tcb1;

 for(tcb = Tcbs; tcb != NULLTCB; tcb = tcb1) {
  tcb1 = tcb->next;
  reset_tcp(tcb);
 }

 pwait(NULL); /* Let the RSTs go forth */
}
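
The same save-the-next-pointer idiom, reduced to a self-contained
sketch in ANSI C. The list type and function names are hypothetical;
free() stands in for whatever ends up releasing the element.

```c
#include <assert.h>
#include <stdlib.h>

struct node {               /* hypothetical stand-in for struct tcb */
    struct node *next;
    int id;
};

/* Push a new node on the front of the list. Returns the new head,
 * or the old head unchanged if allocation fails.
 */
struct node *
node_push(struct node *head, int id)
{
    struct node *np = malloc(sizeof *np);

    if(np == NULL)
        return head;
    np->next = head;
    np->id = id;
    return np;
}

/* Free every node. Returns the number of nodes freed. */
int
node_free_all(struct node *head)
{
    struct node *np, *next;
    int n = 0;

    for(np = head; np != NULL; np = next) {
        next = np->next;    /* save before np is freed */
        free(np);
        n++;
    }
    return n;
}
```

Writing the loop step as np = np->next would read from memory that
free() has already released.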

The next two bug fixes are from my own code and have to do with
arp and ax25 routes. I have added interface to these and this is
also incorporated in JNOS. The bug is the same as above. I used a
next pointer that was freed. This bug is in if_detach() in
iface.c (this part was not added to JNOS but should be). Here is
the corrected code:

/* Detach a specified interface */
int
if_detach(ifp)
register struct iface *ifp;
{
 struct iface *iftmp;
 struct route *rp,*rptmp;
 int i,j;
 struct ax_route *axr, *axr1;
 struct arp_tab *ap, *ap1;
[...]
#ifdef AX25
 /* Drop all ax25 routes that point to this interface */
 for(axr = Ax_routes; axr != NULLAXR; axr = axr1) {
  axr1 = axr->next; /* Save the next pointer */
  if(axr->iface == ifp)
   ax_drop(axr->target, ifp);
  /* axr will be undefined after ax_drop() */
 }
#endif

 /* Drop all ARP's that point to this interface */
 for(i = 0; i < HASHMOD; ++i)
  for(ap = Arp_tab[i]; ap != NULLARP; ap = ap1) {
   ap1 = ap->next; /* Save the next pointer */
   if(ap->iface == ifp)
    arp_drop(ap);
   /* ap will be undefined after arp_drop() */
  }

 /* Drop all routes that point to this interface */
 if(R_default.iface == ifp)
[...]
}

The last bug I found was probably THE bug in the RSPF code. The
problem was the same as above, a next pointer was used in already
freed memory. This was in makeroutes() in rspf.c. Change it to:

/* The shortest path first algorithm */
static void
makeroutes()
{
    register struct route *rp, *rp2, *saved[HASHMOD];
    register struct rspfadj *adj, *lowadj, *gateway;
    register struct rspflinkh linkh;
    register struct rspfrouter *rr, *rrprev;
    register int i, adjcnt;
    register char *lowp, *r;
    struct mbuf *bp;
    int32 lowaddr;
    int bits, low;

    if(Keeprouter) /* If false, purge unreachable router entries. */
        --Keeprouter;

    /* Remove all non-manual routes. */
    for(bits = 1; bits <= 32; bits++)
        for(i = 0; i < HASHMOD; i++)
            for(rp = Routes[bits-1][i]; rp != NULLROUTE; rp = rp2) {
                rp2 = rp->next;  /* BIG BUG FIX -- sm6rpz */
                if(dur_timer(&rp->timer))
                    rt_drop(rp->target,bits);
                /* rp is undefined here if rt_drop() was called */
            }

    if((rp = rt_blookup(0L,0)) != NULLROUTE && dur_timer(&rp->timer))
        rt_drop(0L,0);   /* Delete non-manual default route. */
[...]
}

rt_drop() will free rp above.

This correction has made the RSPF code a lot better and we have
not yet seen any crashes. I will therefore upload my latest RSPF
code later this week. It works for multiport routers. The biggest
problem is one way routes. I don't know how to handle these and
can't find any answers in the RSPF 2.2 protocol. So if you have
any ideas on this, please write. I will start implementing 2.2 as
soon as possible.

While you're about correcting lapb_input(), also change

axp->flags.remotebusy = (control == RNR) ? YES : NO;

to

axp->flags.remotebusy = (type == RNR) ? YES : NO; /* Scot McIntosh */

This will make NOS understand RNRs (Scot McIntosh in tcp-group,
digest number 80).
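
Why the raw control byte fails: in an AX.25 supervisory frame the
upper three bits carry N(R) and bit 4 is the poll/final bit, so an RNR
only equals 0x05 when both happen to be zero. A minimal sketch of the
decoding follows. The names are illustrative and are not the NOS ones
(frame_type() and the *_TYPE constants are assumptions); only the bit
layout comes from the AX.25 frame format.

```c
#include <assert.h>

#define PF_BIT   0x10   /* poll/final bit */
#define I_TYPE   0x00   /* information frame */
#define RR_TYPE  0x01   /* receive ready */
#define RNR_TYPE 0x05   /* receive not ready */
#define REJ_TYPE 0x09   /* reject */

/* Hypothetical decoder: strip the sequence numbers and P/F bit from a
 * one-byte control field so the result can be compared with RNR_TYPE.
 */
int
frame_type(unsigned control)
{
    assert(control <= 0xFF);
    if((control & 1) == 0)
        return I_TYPE;                          /* I frame */
    if(control & 2)
        return (int)(control & ~PF_BIT & 0xFF); /* U frame: all but P/F */
    return (int)(control & 0x0F);               /* S frame: low four bits */
}
```

For example, an RNR with N(R)=5 and the P/F bit set arrives as 0xB5,
which is not equal to 0x05 until it has been decoded.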

It's nice to see NOS handle these correctly :-)

73 de Lars, sm6rpz

-- 
Lars Pettersson             Chalmers University of Technology

------------------------------

End of TCP-Group Digest V93 #110
******************************