Network Working Group                                      Scott W. Brim
Internet Draft                                        Cornell University
                                                             August 1991

                   Multicast Communications Using BGP

Status of this Memo

This Internet Draft is for information purposes, for use in the future. Based on initial explorations, the BGP Working Group of the IETF has decided not to make a permanent decision yet on how to route multicast packets, but to use Reverse Path Forwarding (described below) for the time being. This document describes the analysis done up to this point. It explores the most likely possibilities in detail and lays out the issues for examination, so that when and if the time comes that a decision is made to move away from Reverse Path Forwarding, the information needed to support the decision will be available. It does not attempt to compare the possible approaches exhaustively or to reach a conclusion about which is better.

1. Introduction

Most communication in the Internet today is unicasting, where there is a single specific destination for every packet. Broadcasting, in which the destination of a packet is every node on the network, is common on local area networks. Multicasting is like broadcasting in that it supports multiple recipients for a single packet, but packets are intended for a specific group.

As examples, in a local area network environment multicasting is currently often used for communication between processors of loosely coupled systems, or for communication between routers or bridges. In these cases the sender wants to reach the members of a special group, but not every node on the network. Broadcasting is in fact a special case of multicasting in which the special group is all nodes.

Multicasting over wide areas, as opposed to just on local area networks, is an important capability whose development has lagged far behind the need for it.
Until recently there has only been one IP multicasting implementation which could be used on more than just a local area network [6,13], and it required multicast packets to be encapsulated in order to send them across autonomous system boundaries. Work is now in progress to develop support for wide-area multicasting using standard protocols and to make that support an integrated part of the Internet protocol suite. Part of that work is being done under the auspices of the IETF BGP Working Group, including this document, which explores how multicasting can be supported between autonomous systems using the Border Gateway Protocol (BGP) [1,3]. Another part is the work of the IETF Multicast OSPF Working Group [10], which explores how multicasting can be supported within autonomous systems by OSPF.

The best introductory reference paper on multicast forwarding is by Deering and Cheriton [7]. It is highly recommended that that paper (plus some of its references) and the BGP RFCs be read before this one, since this one assumes a high degree of understanding of BGP and only summarizes some parts of Deering and Cheriton's presentation.

We speak in terms of multicast groups. In IP multicasting, multicast groups are known by their addresses, which are in the range from 224.0.0.0 to 239.255.255.255. Each group has a unique address. Each member of a group is known by one or more multicast addresses in addition to one or more unicast addresses. RFC 1112 [6] defines methods for mapping between IP multicast group addresses and the level 2 address spaces of Ethernet, 802.3, all point-to-point protocols, and protocols with broadcast but no multicast capability (e.g. LocalTalk). Mappings are also defined for FDDI [2] and SMDS [12].

The general issues in wide-area multicasting are:

o Discovering where packets should be sent. The destination address for a multicast packet refers to a set of entities whose locations change over time.
  On a local area network all destinations hear the same packet, and there is no need for forwarding. When a multicast packet must be forwarded to the members of a group not on the same local area network, we need mechanisms by which the members can make themselves known to the routers and the routers can ensure that every member of a group receives at least one copy of a packet addressed to that group, and preferably only one.

o Establishing efficient routing paths for multicast packets. A packet with multiple recipients must be replicated and sent on multiple links, but copies of the packet should travel over as few links as possible. We need routing protocols defined for use both within and between autonomous systems.

This document assumes that hosts will inform routers of their membership in multicast groups, probably via the Internet Group Management Protocol [6], and that autonomous systems use an intra-autonomous system routing protocol which supports multicast forwarding. It attempts to explore three remaining problems -- communication between routers of where multicast packets should be propagated, efficient propagation of packets between autonomous systems, and interactions between intra-autonomous system and inter-autonomous system routing protocols in the support of multicasting.

In this document we first explore unconstrained reverse path forwarding, essentially building on the model used in [13] for the BGP environment. We then extend this model and offer new models for supporting multicasting in BGP, in order to support policy-based controls on the propagation of multicast packets.

Throughout this document the term "autonomous system" ("AS") is used in the same sense as in the BGP documents, for example in RFC 1164. An "interior gateway protocol" (IGP) refers to a routing protocol used within an autonomous system.

2. Reverse Path Forwarding Along Unicast Sink Trees

The only approach to wide-area routing of multicast packets that has been implemented so far uses "reverse path forwarding" and is described in RFC 1075 and in [6]. This approach would fit well in the BGP environment, offering low overhead and excellent interaction with IGPs. Also, the implemented method is almost directly applicable to BGP already, and very little design work would have to be done to adapt it to an inter-AS routing environment. However, it does not allow the level of administrative control of routing paths to which many network administrators have become accustomed.

2.1. Mechanism

In every approach to forwarding multicast packets the problem faced by a particular router is to determine its position in the paths by which multicast packets from a particular source should be forwarded. A router needs to (1) determine whether to accept a particular multicast packet or to discard it, based on its originating source and immediate previous hop, and (2) once a packet has been accepted, decide which of its peers to forward it to, if any.

If a "source tree" defines the paths by which a network node sends unicast packets to all other nodes, then a "sink tree" defines how a node is reached by unicast packets from all others. The goal in unicast routing is to make a destination reachable by packets from all sources; the goal in multicast routing is to ensure that a packet from a single source reaches multiple destinations. Obviously a set of paths that solves the first problem can be used to solve the second if we use it in the opposite direction. The basic reverse path forwarding approach uses the fact that the propagation of unicast routing information for a particular IP network already causes the formation of a sink tree, or at least a directed acyclic graph -- the graph of how unicast packets should flow to that IP network from all other networks.
Thus when multicast packets need to be routed from that network to multiple destinations, a directed acyclic graph with that network as the root has already been formed, and this approach simply arranges for the multicast packets to flow along that tree, but in the opposite direction of the unicast packets.

Functionally, if a border router which receives a multicast packet receives it on the link by which it would send a unicast packet to the originator of that multicast packet, then it will propagate that multicast packet to its other BGP peers. The flow of multicast packets is determined by how the other border routers' unicast packets reach the multicast packet originator.

This procedure establishes paths for efficient broadcast, but network bandwidth is still wasted by sending multicast packets along branches of the sink tree even when there are no nodes on those branches interested in receiving them. Further mechanisms can be defined to dynamically ensure that multicast packets are only sent to those peers which are on paths leading to members of the destination multicast group, for example via the prune and graft messages of RFC 1075. A prune message is sent to tell a BGP peer not to send packets from a particular source network addressed to a particular multicast group. A graft message is sent to cancel that directive. See RFC 1075 for details.

In all current implementations of the BGP protocol, a border router has an implicit confirmation of whether its peers are using routes that it has offered to them through the "echo" inherent in the BGP update messages (as recommended in Section 10 of RFC 1163). A border router can use it to detect if there are peers which are using paths through it, and forward multicast packets onto the shared network when there are such peers.
In order for reverse path forwarding to be effective the recommendation on "echoes," including filling in the next hop field on echoes, will have to be made a requirement.

Taking advantage of the echoes, a prune message can apply not only to messages from a particular source network, but to all messages for a particular group, regardless of source. Prune messages can be cached and timed out by the receiver, and repeated as necessary by the sender. A border router can maintain a table of which interfaces packets from a particular source to a particular target multicast group should be forwarded on, depending on memory constraints and multicast activity in the Internet.

2.2. Advantages

This approach uses very little additional bandwidth to support wide-area multicasting. The only new BGP messages which need to be added are for prune and graft, and in fact these messages might be added to IGMP instead, which is independent of any particular routing protocol. A border router will only send a packet to peers which use it to reach the source of that packet, and so will be able to avoid a good deal of needless traffic. Many of those peers which do receive multicast packets will need to send prune messages and an occasional graft message, at a frequency depending on the multicast forwarding cache time-out of the receiving border router, but the overhead in this approach is still significantly lower than in any of the others considered here.

Often people wonder about the overhead of RPF's not propagating multicast group member locations in the first place, and essentially discovering them by sending data packets everywhere and using prune responses to clean up the forwarding tree after the fact. Under RPF, every multicast group with global scope reaches every part of the world with at least one packet periodically, regardless of whether there are group members there or not.
Since it is not necessary for a node to be a member of a group in order to send messages to that group, the alternative of propagating membership information a priori would require propagating membership information for a group everywhere, to any node that might want to send to that group.

Propagating knowledge of group membership would require at least one packet for each network containing a member to be sent to every constituent of the Internet, each time that network transitioned between having zero and at least one member. On the other hand using data packets and prune messages would require one packet to be sent to every constituent of the Internet for the entire group, as opposed to each network containing a member of the group. Another data packet would be sent periodically, but the frequency would depend on the times specified in the prune messages. It can be expected that these times would be long and that routers would use graft messages as necessary. Since a graft message would only be sent if data packets for a particular group were desired, they are only incrementally more traffic than the data itself will be and are not significant as overhead. Thus, independent of the topology of the Internet, it is always cheaper to use the prune/graft approach than it is to propagate membership information.

Since RPF works with whatever the units of routing are at any node (currently a network, but perhaps in the future an autonomous system or an address and a mask), and keeps only cached information about active multicast sources and groups, it is most likely that it will continue to scale well as the Internet grows. This will depend on the nature of multicast traffic at a particular router, since that will determine how effective caching can be.

This approach potentially works better than any other with the expected behavior of multicast-forwarding IGPs. This is discussed in the section on multicasting and IGPs.
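
As an illustration of the mechanism of Section 2.1, the following Python sketch shows one way a border router might combine the reverse path check with a cache of timed prunes. It is a minimal sketch and not part of any specification: the class name BorderRouter, the tables unicast_next_hop and peers_using_us, and the choice to key prunes by (peer, group) are all assumptions made for the example. The peers_using_us table stands in for what a router would learn from the BGP update "echo".

```python
from dataclasses import dataclass, field
import time


@dataclass
class BorderRouter:
    # Hypothetical table: source network -> peer this router uses to
    # send unicast packets toward that network.
    unicast_next_hop: dict
    # Hypothetical table: source network -> peers that reach that network
    # through this router (as learned from the BGP update "echo").
    peers_using_us: dict
    # Prune cache: (peer, group) -> time at which the prune expires.
    pruned: dict = field(default_factory=dict)

    def prune(self, peer, group, lifetime, now=None):
        """Record that `peer` does not want traffic for `group` for a while."""
        now = time.time() if now is None else now
        self.pruned[(peer, group)] = now + lifetime

    def graft(self, peer, group):
        """Cancel an earlier prune from `peer` for `group`."""
        self.pruned.pop((peer, group), None)

    def forward_to(self, source_net, group, arrived_from, now=None):
        """Return the peers a multicast packet should be copied to."""
        now = time.time() if now is None else now
        # Reverse path check: accept only if the packet arrived from the
        # peer we would use to reach its source with unicast packets.
        if self.unicast_next_hop.get(source_net) != arrived_from:
            return []
        targets = []
        for peer in self.peers_using_us.get(source_net, []):
            expiry = self.pruned.get((peer, group))
            if expiry is not None and expiry > now:
                continue  # prune still in force for this peer and group
            targets.append(peer)
        return targets
```

In this sketch a prune simply times out at the receiver, so a sender that still has no members must repeat it, which mirrors the caching behavior described above.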
2.3. Disadvantages

The only significant potential problem with using the basic reverse path forwarding approach emerges when routes are asymmetric. As long as the path by which one node reaches another is the exact reverse of how the other node reaches the first (symmetric routes), unicast and multicast packets will flow along the same paths. However, the Internet supports, and frequently has, asymmetric routes between ASs. Network administrators currently set policies for how they want their networks to reach others, but, since in reverse path forwarding multicast packets flow according to how a node is reached, not according to how it reaches others, if routes are not symmetrical the behavior of the multicast packets will be controlled in the opposite way of what the network managers intended. A related problem is that if a node uses multicast probe packets to locate a replicated resource, and then unicast packets to interact with it, asymmetric paths may lead to that node choosing the wrong server.

Discussions in IETF meetings suggest that while most network managers would not mind if multicast packets flowed from their ASs along the paths which others use to send unicast packets to them, there are some who would like to retain more control of how multicast packets flow through the Internet.

There are ways to add control which do not involve protocol enhancements, but rather changes in how the Internet is administered. For example, we could enforce a strict hierarchy among ASs. This would restrict the topology in such a way that there was only one path from any AS to any other. We could also consider changing BGP to allow only symmetrical paths, or at least to implement multicast routing in such a way that routers would refuse to carry multicast traffic along asymmetric paths.
None of these approaches appears to be very supportable -- besides the fact that it is unreasonable to require all network administrators to reach agreements with all other network administrators on hierarchical relationships, most network administrators would rather use a routing protocol even if it required them to change their control capability or increased their overhead, rather than hamper their connectivity choices in these ways.

Since in Reverse Path Forwarding multicast routing depends directly on unicast routing, incremental implementation in the Internet might be awkward. There is no way to detect which routers support multicast routing, and thus no way to know if multicast packets can get between any two points directly from the network itself. Tunnels may be set up, as described in RFC 1075, to reach between islands of multicast-supporting routers, but again with RPF there is no way of knowing when these (relatively inefficient) tunnels should be in place without frequent dialog between network operators.

3. Multicasting on Independent Trees

The next approach is more complex than basic reverse path forwarding and has higher overhead, but it offers more administrative control.

This approach relaxes the requirement that multicast packets must follow the unicast sink tree. However, before a BGP peer connection is used to carry multicast packets between two BGP peers, potential producers or carriers of multicast traffic "offer" the traffic to potential recipients, and the recipient routers decide if they wish to accept it or not before the packets are allowed to flow. A border router may "accept" an offer of multicast packet availability from a peer even if that peer is not its next hop for sending unicast packets to the multicast source.
This allows the autonomous system receiving the multicasts to enjoy much more control over how they flow; the receiving autonomous system can even cause them to flow along the same paths as unicast packets despite asymmetric routes. However, this approach may add greatly to routing protocol overhead.

3.1. Mechanism

For this approach to work each autonomous system must set up multicast forwarding trees for packets it will originate before sending them. If a border router's AS wishes to send multicast packets to a particular group, the border router asks its peers if they are willing to accept those packets using a new BGP multicast availability message, which will include at least the AS-level path by which the packets will travel and the target multicast group address. An availability message remains outstanding, without repetition, for the life of the BGP peer connection. All availability announcements will be assumed lost when a BGP peer connection is reset, and will be sent again when the connection is re-established. An availability message may be responded to with a multicast acceptance message, and an acceptance may later be revoked with a multicast rejection message.

A border router receiving an availability message may respond with an acceptance message for the traffic immediately or delay responding, perhaps forever (in which case no packets from that source AS will be forwarded by the peer that sent the availability message). A border router receiving an availability message could respond with an acceptance immediately if its AS contained members in the target group and it wished for them to receive outside messages. If a border router is not part of a transit AS it can defer responding at all until its AS does have members (if ever).
If a border router is part of a transit AS it can pass the availability message on to its other BGP peers, and they in turn can defer response and pass the availability message on to their other peers. As soon as a peer accepts the availability offer, the border router which forwarded the availability message to that peer will return an acceptance to the peer which forwarded the availability message to it, and so on back along the chain of deferred acceptances. By this process a tree can be built for the flow of multicast packets from a particular AS for a particular group. A border router may not propagate a deferred availability message to another peer without accepting it, unless it would be willing to accept the message itself if its offer to that peer were accepted.

A border router may propagate availability messages not only to those peers which were using it to reach the source AS for the availability message (as in Reverse Path Forwarding), but to every peer it has which it is willing to send multicast packets to. Acceptance and rejection are not coupled to unicast reachability information in any way, but are based purely on policy.

Since this approach decouples the flow of multicast packets from the unicast sink trees, a border router receiving an availability message cannot necessarily be sure of knowing which networks are associated with the multicast source AS listed in the availability message. Since multicast forwarding depends on both source and target group addresses, without this information routers will not know how to forward multicast packets at all.

There are three ways to make sure the border routers have the information they need:

o Include the networks associated with an AS in that AS's availability messages, or at least in the first availability message sent over a BGP peer connection, with incremental addition and deletion.
  This would produce very high overhead, requiring every border router to send a list of all of its AS's networks to every other AS on its path to every member of every multicast group it wants to reach.

o Maintain static tables mapping between ASs and networks in all border routers, kept up to date by all network administrators.

o Impose a restriction that a border router may not accept or propagate an availability message for any AS unless it has received a BGP unicast reachability update specifying that AS as the last in an ASpath, and assume that almost all border routers will receive such an update for every AS they would carry multicast traffic for, by some path or other. This is a much better approach, especially since it would be extremely rare for an AS not to hear unicast routing information from some other AS which is sending multicast packets to or through it. On the other hand that rare possibility might happen, for example with parallel non-interacting backbones or separated commercial and research networks, so the protocol needs to take the possibility into account. The consequence would be that some multicast paths might be cut off. This appears to be worth the risk, and this is the method we will adopt.

The multicast availability message includes the entire ASpath the multicast packets would follow, not just the originating AS, in order to support administrative choices by the receiving border routers, for the same reason that reachability update messages contain ASpaths. A border router which is propagating an availability message will add its AS to the ASpath in the message using the same techniques as it would in propagating reachability updates.
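
The offer and acceptance handling described so far can be sketched in Python as follows. This is only an illustration under stated assumptions: the names Availability, propagate and TransitState are hypothetical, the message is reduced to an ASpath and a group address, and the loop check is borrowed from ordinary unicast ASpath handling rather than taken from this document. The sketch covers two rules from the text: an offer is propagated only if the source AS (the last AS in the ASpath) is already known from a unicast reachability update, and the first acceptance from a downstream peer is relayed back to the peer that made the offer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Availability:
    as_path: tuple  # ASs the packets would traverse; originating AS is last
    group: str      # target multicast group address


def propagate(msg, local_as, known_source_ases):
    """Return the offer to pass on to other peers, or None to drop it."""
    source_as = msg.as_path[-1]
    # Restriction adopted in the text: the source AS must have appeared as
    # the last AS in some unicast reachability update we have received.
    if source_as not in known_source_ases:
        return None
    # Assumption: discard offers that already contain our own AS.
    if local_as in msg.as_path:
        return None
    # Add our AS to the front of the path, as for reachability updates.
    return Availability((local_as,) + msg.as_path, msg.group)


class TransitState:
    """Deferred-acceptance bookkeeping for one transit border router."""

    def __init__(self):
        self.upstream = {}  # (source_as, group) -> peer that made the offer
        self.accepted = {}  # (source_as, group) -> downstream peers accepting

    def on_offer(self, key, from_peer):
        self.upstream[key] = from_peer
        self.accepted.setdefault(key, set())

    def on_accept(self, key, from_peer):
        """Return the peer to send an acceptance to, or None if already sent."""
        peers = self.accepted[key]
        is_first = not peers
        peers.add(from_peer)
        return self.upstream[key] if is_first else None
```

Only the first downstream acceptance triggers an acceptance upstream; later ones join a branch of the tree that already exists.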
When an AS no longer has members in a particular group, or in case a border router receives a better availability offer via a different peer after it has already accepted an availability offer, it can send a rejection message to the former offerer saying it no longer wishes to receive packets from that source AS to that target multicast group address. Similarly a border router in a transit AS which receives rejection messages from all peers which had previously sent acceptance messages for a particular source AS and group should send a rejection message to its predecessor as well.

In BGP unicast reachability update exchanges, when a border router has no way of reaching a destination network it will send its peers a message containing that network with an unreachable attribute; however, if it is simply changing paths to that destination it will send a reachability message containing the new path, implying that it is no longer using the old path. Similarly, when a border router discovers it is no longer receiving or capable of receiving multicast messages from a particular multicast source AS, it will send a multicast cancellation message to the peers which had accepted its previous availability messages. If, on the other hand, the path by which it receives multicast messages from a source is changing, it will simply send a new offer to its peers specifying the new path. (A border router would discover it was no longer receiving multicast messages from a particular source AS either by a peer connection being reset or by receiving a cancellation message itself.) The change from the old path is implicit. The flow of multicast packets to peers which had accepted the previous availability message will not be halted; it will continue unless and until the peers send a multicast rejection message. An acceptance of the new offer is assumed, in order to keep multicast packets flowing when they have been flowing already.

3.2. Advantages

The main advantage to using this approach as opposed to the previous one is that multicast packets can be caused to flow on administratively acceptable paths if the reverse of the unicast path is administratively unacceptable.

Another possible advantage is in how much overhead there is in packet forwarding. In the basic Reverse Path Forwarding approach, the only way a router can (for example) protect a group within it from accidental intrusion by outsiders is by sending prune messages after such a packet is received, so in situations where such protection is required, every packet must be examined and the per-packet processing overhead is higher. In this approach the decision whether to accept traffic for a particular multicast group from a particular set of source networks can be done as part of the routing protocol, not on a per-packet basis. Border routers can decide which multicast packets to accept as a background task, not during packet forwarding, and create multicast-specific entries in their packet forwarding tables.

This approach also lends itself somewhat better to implementing multicast scope, discussed further in a later section on scope issues in general.

3.3. Disadvantages

3.3.1. Administrative Overhead

While this approach can be used to avoid problems with asymmetric routes, and to force multicast packets to flow along administratively acceptable paths, careful coordination must be done between network administrators to avoid, for example, situations where an AS is refusing multicast packets at one border router because it expects to receive them through another while, in fact, that is the only border router to which they will ever be sent. When unicast routing is poorly engineered packets usually manage to get to their destinations anyway. If this approach is used, mistakes in hand-crafting multicast routing could easily be more severe.
There is a possibility that a border router might be willing to carry unicast packets to a particular destination but not to carry multicast packets from that destination, in which case other border routers which use it to reach that destination will never receive multicast packets from it. This is a problem with basic reverse path forwarding as well, but here it is more possible to make a serious mistake since there is more administrative power.

3.3.2. Increased Routing Traffic

Every sender of multicast packets must set up its paths for any group before it can send to that group. The additional overhead with this approach in the worst case is messages on every inter-AS link for every ASpath actively being used on that link -- the availability offering and the corresponding acceptance. There would be some additional messages due to connectivity changes, but most acceptance messages would not be returned, which more than balances them out in the usual case.

The main problem comes when an AS has to set up its paths in anticipation that one of its members might decide to use the group. If no member ever does, the bandwidth used for the group setup messages will have been wasted. On the other hand, if the AS does not do this, then it must send its setup messages when it first encounters one of its members sending a packet to that group, and all such first packets will probably have to be discarded.

A few things help to keep the overhead down. This sort of setup is only necessary for multicast groups which need to be accessed by non-members across AS boundaries or for groups with members in more than one AS. The great majority of the groups in the first category will be "well-known" groups, for example domain name servers, and any AS that uses these at all will be using them frequently enough that the setup overhead will be small compared to the actual traffic.
In the second category, since the AS will be exchanging traffic with that group the traffic in setup packets will once again be small compared to the actual data traffic.

3.3.3. Router Capacity Requirements

Finally there would be some increase in router memory requirements, depending on implementation, since the status of each {source, target group, next hop} combination must be maintained, and entries cannot be discarded to conserve memory.

4. Single Spanning Tree

One way to manage inter-AS multicasting is to create an AS-level spanning tree covering all ASs, and have all multicasts flow along that tree, using the same algorithm as is used for building spanning trees in bridged LAN environments. For more information on this algorithm see Perlman [11]. We could add prune and graft messages for specific groups to cut down on unnecessary traffic.

The apparent advantages are that this would be simple to do using a proven algorithm as the base, and that it would scale well with the growth of the Internet. After the initial period of tree building, there would be little routing traffic. Intra-AS ("internal") BGP connections would preserve the information passed to them in the same way they handle unicast routing information now.

This approach is good if we expect Internet multicast traffic to:

o be of mostly broad interest, so groups will be generally distributed;

o require few network resources;

o have no need of administrative control; and

o not be sensitive to delay or other optimal routing considerations.

There are some proposed uses of multicasting which meet these criteria, for example resource location or distribution of large public mailing lists, but the majority of the proposed applications are along the lines of real-time conferences, distributed simulations, and others which can similarly be expected to have clusters of participants and be sensitive to delay.
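
To make the delay criterion above concrete, the following Python sketch builds one shared tree over a small AS-level graph and shows the path multicast traffic would take between two ASs. It is a centralized simplification and not the distributed bridge algorithm described by Perlman [11]: the root is simply taken to be the lowest-numbered AS, the tree is found by breadth-first search, and the function names and the five-AS topology are invented for the example.

```python
from collections import deque


def build_spanning_tree(links):
    """links: iterable of (as_a, as_b). Returns {as: parent}, root -> None."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    root = min(neighbors)  # assumption: lowest AS number acts as the root
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in sorted(neighbors[node]):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return parent


def path_to_root(parent, node):
    """List of ASs from `node` up to the root, inclusive."""
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def tree_path(parent, src, dst):
    """ASs traversed between src and dst when only tree links are used."""
    up_src = path_to_root(parent, src)
    up_dst = path_to_root(parent, dst)
    on_dst_path = set(up_dst)
    # Climb from src until we meet an AS that is also above dst.
    meet = next(i for i, n in enumerate(up_src) if n in on_dst_path)
    down = up_dst[:up_dst.index(up_src[meet])]
    return up_src[:meet + 1] + down[::-1]


# Hypothetical topology: AS 4 and AS 5 are directly connected.
links = [(1, 2), (1, 3), (2, 4), (3, 5), (4, 5)]
tree = build_spanning_tree(links)
print(tree_path(tree, 4, 5))
```

In this topology the direct link between AS 4 and AS 5 is not part of the tree, so traffic between them crosses ASs 2, 1 and 3, which is acceptable only for traffic that is not sensitive to delay.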
Also, however the common spanning tree was formed, multicast packets would regularly have to flow unnecessarily through non-participating ASs to reach the ASs that wish to receive them, and the backbone of the spanning tree would be a bottleneck unless bandwidth requirements were low. There are apparently many users who are eager to use wide-area multicasting, so we can expect multicast traffic to blossom, and multi- cast bandwidth requirements will not be low for very long, and reducing traffic on backbone routers is a desirable goal. Implementing multicast scope rules would be difficult, as discussed later. There is essentially no chance for separate administrative control of the flow of multicast packets, either the ones an AS originates or the ones that flow to and through it. If the spanning tree is allowed to form dynamically, it will not be pos- sible to predict the resulting topology beforehand. There are two prob- lems with this. First, either administrative control should not be an issue, or a method needs to be contrived for ASs to cooperate in stati- cally configuring the spanning tree. Second, as discussed in the sec- tion in interactions with IGPs, some effort will have to be put into ensuring that multicast packets are injected into ASs at the border routers where the IGPs expect them. The only thing that really makes the single spanning tree approach attractive is that it scales well. While we have not worked out the details, it apparently has the least routing overhead of any of the approaches discussed here. However, that low overhead is purchased through using network bandwidth inefficiently for the data itself. In all other ways it is weak. 5. Multicasting Along Unicast Source Trees This section describes a method for giving network administrators the capability of controlling the paths taken by multicast packets using the same sorts of routing policy controls that they are accustomed to for controlling the paths of unicast packets, i.e. 
by controlling how multicast packets that their ASs originate will flow through the Internet. This is done by adding a new message type to BGP, to be used in response to regular reachability update messages, with which an AS declares which of the ASpaths offered to it in update messages it wishes to use for multicast traffic.

This alternative will be explored in more detail for several reasons: it is fundamentally different from the ones explored so far, in that member locations are propagated and the originator of the multicast packets chooses the path they will take; it offers network administrators a familiar mode of control; and it is the approach which is most similar to that conceived of by the Inter-Domain Policy-Based Routing group, which is trying to build protocols for the next stage of the Internet's evolution.

The major drawback of this approach is that it places much more routing data on the network, and requires more state information to be kept in intermediate routers, than any of the other approaches.

5.1. Propagating Group Membership Information

In this approach, since the originator of the multicast packets decides what paths its packets will take, knowledge of where group members are must be propagated between border routers in order for originators to make their choices.

When an AS discovers a multicast group member in one of the networks for which it has routing responsibility (via IGMP), it should simply list the multicast group address in a BGP reachability update message, just as it would for any IP network which it discovers to be newly reachable.

All of the rules for handling IP network reachability information can apply to multicast group address reachability information as well -- for example multicast member reachability information can be administratively filtered and passed on to other ASs or not, based on the administrative configuration of the border router.
A multicast group address needs to be recognized as such and treated differently from unicast addresses. While with unicast addresses a transit border router would propagate only one path to a particular address and suppress any others, with multicast addresses a border router needs to keep and propagate a path for every unique {multicast group address, ultimate AS} pair which it is willing to generate traffic for or carry transit traffic to. A border router can, in fact, send update messages with a particular multicast group address listed several times, sometimes reachable and sometimes unreachable, by different ASpaths, as long as the final AS is unique.

When an AS is no longer responsible for any members in a particular group, it should send out a reachability update message listing that multicast group address as unreachable, just as it would for a unicast network which is no longer internally reachable.

5.2. The Multicast Path Usage Message

Since we are keeping the routing of multicast packets under administrative control by the source of the packets, propagation of information about where multicast members are does not imply anything about how they will or should be reached. To establish how multicast packets should be forwarded through the Internet, BGP needs a new multicast path usage message which will inform intermediate ASs of a multicast packet issuer's use of the paths that have been offered to it in the reachability updates it has received. How this is done is described in the next two sections.

As with the approach using independent paths, we cannot be sure that an intermediate border router can map between the originating AS listed in multicast path usage messages and the potential source networks it refers to. In that approach we were able to solve most of the problem by not allowing a border router to accept or propagate an availability message unless it had received such information.
In this approach that restriction won't work, as will be discussed later. For the moment we will assume that a multicast path usage message must contain a list of networks in it, and look for ways to reduce this overhead later.

The multicast path usage message may contain one or more of two types of entries: ADD and DELETE. In both of these a multicast path usage message should include:

o The AS which issued the message. This is the issuing AS.

o The AS which will be sending the multicast packets (this may be different from the issuing AS). This is the originating AS.

o The list of networks, within the originating AS, which might be found in the multicast packets' source address field.

o The target multicast group address.

o The list of ASpaths which should be used to carry the multicast packets, derived from reachability updates which the issuing AS has received, together with whether each particular ASpath should be ADDed or DELETEd from the receiving border router's multicast forwarding database.

The ASpaths in the multicast path usage message are taken from reachability updates the issuing AS has received, and are the originating AS's intended paths by which its multicast messages will reach members of the target multicast groups. The multicast path usage message conveys this information to the ASs listed in these ASpaths, so that when they receive multicast packets from that AS destined for the target multicast group address they will know how to forward them. Conceptually, a multicast path usage message tells the receiving border router where it fits in the originator AS's multicast forwarding tree.

More than one of the ASs in an ASpath may contain members in the target multicast group, not just the ASs at the ends of the lists.

Multicast path usage messages, like reachability update messages, add incremental information. Once a piece of information is transmitted, it is not transmitted again unless an intervening BGP peer connection is reset.
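As an illustration only, the fields listed above could be held in a structure like the following Python sketch. The draft defines no wire format, so every class, field, and method name here is hypothetical, ASs are written as letters in the style of Figure 1, and the source network is a documentation prefix. The well-formedness check combines two statements from the draft: group addresses lie in the IP multicast range, and (as section 5.4 states) one message must not both ADD and DELETE the same ASpath.

```python
import ipaddress
from dataclasses import dataclass
from enum import Enum
from typing import List, Set, Tuple

# Hypothetical representation: ASs beyond the receiving peer, next hop first.
ASPath = Tuple[str, ...]


class Action(Enum):
    ADD = "ADD"
    DELETE = "DELETE"


@dataclass(frozen=True)
class PathEntry:
    action: Action
    as_path: ASPath


@dataclass
class PathUsageMessage:
    issuing_as: str             # AS that issued the message
    originating_as: str         # AS that will send the multicast packets
    source_networks: List[str]  # networks that may appear as packet sources
    group: str                  # target multicast group address
    entries: List[PathEntry]    # ASpaths to ADD or DELETE

    def conflicting_paths(self) -> Set[ASPath]:
        """ASpaths that this message both ADDs and DELETEs."""
        adds = {e.as_path for e in self.entries if e.action is Action.ADD}
        dels = {e.as_path for e in self.entries if e.action is Action.DELETE}
        return adds & dels

    def is_well_formed(self) -> bool:
        """True if the group is a multicast address and no path conflicts."""
        group_ok = ipaddress.ip_address(self.group).is_multicast
        return group_ok and not self.conflicting_paths()


# B tells peer C that it will use the path beyond C (E, then G).
example = PathUsageMessage(
    issuing_as="B",
    originating_as="B",
    source_networks=["192.0.2.0"],
    group="224.2.0.1",
    entries=[PathEntry(Action.ADD, ("E", "G"))],
)
```

Keeping the issuing and originating ASs as separate fields matters later, when a transit AS issues DELETEs on behalf of another AS.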
As with reachability messages, there are no explicit multicast path usage acknowledgments, positive or negative. If the contents of a multicast path usage message imply that multicast forwarding databases are not in accord, a new reachability update can be issued to attempt to correct the situation (as described below), but a multicast path usage message is never in error unless one of the fields in it is incomprehensible or unreasonable, in which case a notification message is returned and the BGP peer connection is reset.

The structure of the multicast path usage message could be made more complex in order to save on protocol overhead.

5.3. Propagating New Multicast Path Usage Information

As routing information is propagated in reachability update messages, a border router receiving an update should go through its usual process of examining each entry for validity of the destination addresses it contains before deciding to propagate the information to its other BGP peers. When one of the addresses mentioned is a multicast group address, if the border router is ever going to send or forward packets to that multicast group in the final destination AS of the ASpath, the border router should compare the advertised path with others to that same AS and multicast group address and possibly add it to its multicast forwarding database.

If the border router decides to use the new path to that AS's members in this multicast group, then in addition to any reachability update messages it might send out, it should also return a multicast path usage message to the BGP peer which sent it the path. The "originator" and "issuing" AS fields in the multicast path usage message will be the same AS. The ASpaths that are sent do not contain the AS they are being sent to, but rather specify the paths beyond the AS they are being sent to.
A border router which receives multicast path usage ADD entries should make sure they specify valid ASpaths (ASpaths currently in the receiving border router's routing database) and source networks; store the information in its own multicast forwarding database (possibly cleaning up any old, invalid multicast path usage information it might find as described below); and immediately propagate the multicast path usage information down the ASpaths listed in the ADD entry, toward their ultimate ASs, first modifying the ASpaths in the following manner:

o For each BGP peer to whom a multicast path usage message is being sent, the receiving AS should be removed from the path before the message is sent so that the ASpath specifies the path beyond the AS it is being sent to.

o If the ASpaths do not all have the same BGP peer for the next hop, the message should be replicated and multiple multicast path usage messages should be sent as necessary. The list sent to each BGP peer should contain only those ASpaths reached through it.

When a multicast path usage message reaches the next-to-last AS on a path, the message need not be propagated to the final AS[s]. The ASpaths would be empty if the above rules are followed; the final ASs do not need to know they should route multicast packets to themselves.

It may be possible for a multicast path usage message ADD entry to be received which contains information already in a transit AS's multicast routing database. The transit AS should simply ignore this information.

A border router may receive a multicast path usage ADD listing an ASpath which the receiving border router is not currently using, for example during routing changes.
The receiving border router should respond with a reachability update message listing that multicast group as "unreachable" by the requested ASpath, to be sure its peer's routing database, and those of all ASs beyond that peer, are up to date, but otherwise it should ignore that particular ADD entry.

5.4. Deleting Multicast Path Usage

When a border router loses a BGP ASpath for any reason, including the loss of a peer connection, or receives a reachability update message listing a multicast group address as "unreachable," in addition to propagating knowledge of the loss in update messages to its other peers, it must also delete all multicast path usage information relating to multicast group members reached along that path, just as it deletes all unicast routing information depending on that path. Although a border router depends on the originator AS to send it multicast forwarding information (in ADD entries), it must delete that information itself when the information no longer matches the routes it actually uses. The fact that this multicast forwarding information is being deleted is implicit in the reachability update messages that are sent between border routers.

In addition, a border router may need to send out a multicast path usage message containing DELETEs in order to have multicast path usage information removed from the multicast forwarding databases of border routers closer to the target ASs, for three reasons:

1. When a border router decides to use a different path to an AS participating in a multicast group, it must send DELETEs along the old path and ADDs along the new, preferred path. The messages may be sent in any order.

2.
As described previously, when a border router loses a peer connection, it must delete any multicast path usage information which referenced that peer as the next hop; in addition it must issue multicast path usage DELETEs for all current multicast path usage information it received from that peer. These DELETEs must be issued on behalf of the originator ASs which sent the original ADDs; they should be propagated down the ASpaths which the originator AS originally requested, just as ADDs are propagated. In each DELETE entry, it will list itself as the "issuing" AS and the AS whose information should be deleted as the "originator" AS. In the meantime, on the other side of the lost connection BGP update messages will be notifying the originator AS about the loss of the path, so the originator AS will know that its multicast path usage information has been deleted and can take action accordingly.

3. If a border router which changes its preferred path to a target AS is also a transit for some of its BGP peers, it not only has to send them a regular reachability update message containing the routing change, it also has to pass multicast path usage DELETEs down the old path on behalf of ASs who were using it as a transit to reach the multicast members on the old path. In each DELETE entry, it will list itself as the "issuing" AS and the AS whose information should be deleted as the "originator" AS. It must not wait for the originator ASs to issue their own DELETEs -- by the time those DELETEs would arrive, routing to ASs along the old path may have changed and there may be no way for the messages to be delivered to those ASs. The transit border router should not issue ADDs for the new path on behalf of the originator ASs. The originator ASs will need to issue their own ADDs and DELETEs, depending on the new reachability information offered to them, as described in case 1.
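The second case can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a proposed implementation: the forwarding database is modeled as a set of hypothetical `Usage` records that remember which peer each entry was learned from, ASs are written as letters, and a DELETE is represented as a plain dictionary. The helper `split_by_next_hop` applies the path-rewriting rules of section 5.3 (strip the next-hop AS, group by peer, send nothing when the remaining path is empty).

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set, Tuple

ASPath = Tuple[str, ...]


class Usage(NamedTuple):
    """One forwarding-database entry (hypothetical layout)."""
    originator: str    # AS whose multicast packets follow this path
    group: str         # target multicast group address
    learned_from: str  # peer (or this AS itself) that supplied the ADD
    as_path: ASPath    # non-empty path beyond this AS, next hop first


def split_by_next_hop(paths: List[ASPath]) -> Dict[str, List[ASPath]]:
    """Group paths by next-hop peer and strip that peer from each path."""
    per_peer: Dict[str, List[ASPath]] = defaultdict(list)
    for path in paths:
        next_hop, rest = path[0], path[1:]
        if rest:  # an empty remainder means the peer is the final AS
            per_peer[next_hop].append(rest)
    return dict(per_peer)


def handle_peer_loss(db: Set[Usage], me: str, lost_peer: str):
    """Return (remaining entries, DELETEs to send) after losing a peer."""
    kept: Set[Usage] = set()
    deletes: List[dict] = []
    for u in sorted(db):
        if u.as_path[0] == lost_peer:
            continue  # next hop is gone: drop the entry locally
        if u.learned_from == lost_peer:
            # Issue DELETEs downstream on behalf of the originator AS.
            for peer, rests in split_by_next_hop([u.as_path]).items():
                for rest in rests:
                    deletes.append({"to": peer, "issuing": me,
                                    "originator": u.originator,
                                    "group": u.group, "delete": rest})
            continue
        kept.add(u)
    return kept, deletes


# B holds A's usage of C-E-G plus its own; then the A-B connection is lost.
db = {Usage("A", "224.2.0.1", "A", ("C", "E", "G")),
      Usage("B", "224.2.0.1", "B", ("C", "E", "G"))}
kept, deletes = handle_peer_loss(db, me="B", lost_peer="A")
```

In this run B keeps only its own entry and sends C a single DELETE naming A as the originator and B as the issuer, with the path beyond C (E, then G).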
As an example of the first case, in Figure 1, suppose B, E, and G are participating in a multicast group, but AS C is not available, so B is using path D-F-G-E to reach G and E. When C becomes available to B, B stops using D-F-G-E and starts using C-E-G instead. At this point it must send a multicast path usage message to D DELETEing the first path and a message to C ADDing the second. It will list itself as both the "originator" and "issuing" ASs in the multicast path usage messages.

As an example of the second case, assume A is also participating in this multicast group, and A is using path B-C-E-G to reach B, E, and G. Then suppose the connection between A and B is lost. B will propagate a multicast path usage DELETE entry for the path C-E-G, specifying A as the originator AS and itself as the issuing AS. C and E will remove all their multicast forwarding information concerning A's participation in this group, and no new information will be added until they hear (somehow) from A again.

As an example of the third case, assume A, B, E and G have members as before, but that the path through C is unavailable, so all routing for the group's multicast traffic is along A-B-D-F-G-E. B will list in its multicast forwarding database that traffic for this group from A should be forwarded on path D-F-G-E. When C becomes available B chooses to use the C-E-G path for its own traffic as above, unicast and multicast alike. B must announce this path change to A using reachability updates, and it must issue a multicast path usage DELETE for D-F-G-E for its own traffic as described above. It must also propagate a multicast path usage DELETE along the path D-F-G-E, specifying A as the originator and itself as the issuer. Since A did not initiate the path change, A will not send B a DELETE for its use of the D-F-G-E path, but A does have the responsibility of sending an ADD to B for the C-E-G path (which B will propagate as described in the previous section).
                    A
                    |
                    B
                   / \
                  /   \
                 C     D
                 |     |
                 |     |
                 E     F
                  \   /
                   \ /
                    G

      Figure 1: ASs and Multicast Path Usage DELETEs

When a transit border router receives a multicast path usage DELETE being propagated from an originator AS toward target ASs, it should treat it the same way it treats multicast path usage ADDs -- it should modify the information in its multicast forwarding database as necessary and propagate the message toward the target ASs in the message, pruning the AS lists as described above.

It is possible for a multicast path usage DELETE to be received for target ASs that are not in a transit AS's multicast routing table. In this case a regular reachability update message with an "unreachable" entry for that multicast group should be sent in order to ensure synchronization of all border routers' routing tables.

It is an error for an AS to receive a multicast path usage message which both ADDs and DELETEs the same ASpath.

5.5. Discussion

5.5.1. ASpaths versus AS Lists

If the multicast path usage messages were to specify just the target ASs which an originator AS wished to reach via the receiving border router, and not entire AS paths, the messages could be somewhat smaller. For example, using ASpaths, if ASs Y and Z were both attached to AS X, and AS A wished to reach them through a long ASpath B-C-D-..., then A would have to send multicast path usage ADDs for both B-C-D-...-Y and B-C-D-...-Z. If AS lists were used then A would only have to send a message to B specifying Y and Z as targets, and would not have to bother including the paths to reach them.

Generally AS lists would work, since at the time multicast path usage messages are issued paths are already established -- the multicast path usage messages are responses to ASpath reachability update messages, and a receiving AS could know how the originator AS wanted its multicast packets forwarded simply by examining its own routing database.
One could think of the list of target ASs in a multicast path usage message as a shorthand method of listing the entire ASpaths that were in the update messages the multicast path usage messages were a response to.

However, the disadvantage to just using a target list is that it assumes that, at the time the multicast path usage messages are received by an AS, that AS's routing is exactly as it was when the original reachability update messages were sent by that AS. Consider Figure 1 again, but assume that A is not willing to use the path B-D-.... Here a problem can occur because the exchange of reachability update messages and multicast path usage messages is not lock-step. Suppose that at first B is routing through D-F-G-E.

o A sends B no multicast path usage information, since it does not want to use that path to E or G.

o B changes to routing through C-E-G and announces the reachability change to A. It has no multicast path usage information from A to delete.

o A receives the reachability update message from B and sends B a multicast path usage ADD for targets E and G, only specifying the targets, not path(s).

o However, while this is happening B changes its mind for some reason and goes back to routing through D-F-G-E. It still has no multicast path usage information from A to delete.

o B receives A's ADD for E and G. Since we are only using target AS lists, B believes A wants to reach E and G via the path B is currently using, the D-F-G-E path.

o A receives B's last reachability update saying B is now using D-F-G-E. Since A does not want to use this path, it says nothing, assuming that B deleted A's multicast path usage information when it changed from the C-E-G path.
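The difference between the two encodings in this scenario can be shown with a small Python sketch. It is a toy model under stated assumptions: B's routing database is reduced to a dictionary from target AS to the ASpath B currently uses (beyond B itself), and both function names are hypothetical. The second function follows the section 5.3 rule that an ADD naming an ASpath the receiver is not currently using is ignored.

```python
from typing import Dict, List, Tuple

ASPath = Tuple[str, ...]

# B's two possible routing states in Figure 1 (paths beyond B).
VIA_C: Dict[str, ASPath] = {"E": ("C", "E"), "G": ("C", "E", "G")}
VIA_D: Dict[str, ASPath] = {"G": ("D", "F", "G"), "E": ("D", "F", "G", "E")}


def resolve_target_list(routing: Dict[str, ASPath],
                        targets: List[str]) -> Dict[str, ASPath]:
    """Target-only ADD: the receiver fills in whatever path it uses now."""
    return {t: routing[t] for t in targets}


def resolve_as_paths(routing: Dict[str, ASPath], as_paths: List[ASPath]):
    """ASpath ADD: accept a path only if the receiver currently uses it."""
    accepted, rejected = [], []
    for path in as_paths:
        in_use = path in routing.values()
        (accepted if in_use else rejected).append(path)
    return accepted, rejected


# A composed its ADD while B routed via C, but B has since reverted to D.
stale_install = resolve_target_list(VIA_D, ["E", "G"])
accepted, rejected = resolve_as_paths(VIA_D, [("C", "E", "G")])
```

With target lists, B silently installs A's traffic on the D-F-G-E path that A never agreed to use. With explicit ASpaths, B sees that C-E-G is no longer in use and can reject the entry instead.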
This problem could be remedied to some extent, to make using just target AS lists somewhat more reasonable, by having the originator AS send "confirming" multicast path usage DELETEs even when a remote AS should have deleted the originator AS's multicast path usage information on its own. However, it is impossible to guarantee that this information will arrive at all the ASs it should.

The above scenario involves an intermediate AS switching from an unacceptable path to an acceptable one and then back again, leading to multicast path usage information lingering when it shouldn't. A parallel situation can be set up to cause forwarding information not to be there when it should.

5.5.2. Routing Protocol Overhead

As with the independent tree approach, since multicast routing is decoupled from the flow of BGP network reachability messages, we cannot be sure that an intermediate border router can map between a multicast source network and the ASs listed in multicast path usage messages. In that approach we applied a restriction that a border router could not accept or propagate an availability message unless it had received reachability information about the AS originating the multicasts by some means. This restriction worked because the multicast forwarding trees were being set up hop by hop, by the border routers which were going to be receiving the multicast packets. In the current multicast path usage approach, the paths for multicast forwarding are being set up by their originator -- and the originating border router simply cannot know if the intermediate routers it wishes to use have received a list of the networks its AS is responsible for. Therefore the multicast path usage messages must apparently include the networks which can be expected to send multicast packets along the included ASpaths because otherwise the approach would only work with symmetric routes.
Unfortunately the protocol as described above requires a multicast path usage message from every originator, across every path it would use, and for every group it wants to reach.

There are a number of ways to reduce this overhead.

o Add a new IP option to multicast packets which would include the number of the AS the packet originated in.

o Make the multicast path usage message hierarchical, so that it could include more than one set of {target multicast group address, list of ASpaths} tuples in each message. In this way the list of networks could be applied to more than one set.

o Force the interval between multicast path usage messages to be long, approximately five minutes or more. In this way the likelihood would be increased that multiple groups could be included in one message.

o Allow "proxy" specification for multicast group target addresses, on a per hop basis.

o Send the list of networks which will be using the multicast paths only once, and change it only incrementally after that (assuming a link stays up).

o Build one tree for forwarding all multicast packets from a particular AS. Do not discriminate between groups.

o Place static configuration tables in each border gateway mapping between ASs and networks.

An IP option would be controversial, since it not only adds an IP option but involves having a border router (which knows what AS it is in) modifying the IP options on a packet generated by a host. This particular IP option would essentially be extending the IP address to include the autonomous system.

The success of the second and third points would depend on how multicast groups are used in the Internet. If most inter-AS groups are long-lasting with stable AS-level membership distributions, then the main event which would trigger multicast path usage messages would be losses of BGP peer-to-peer connections.
In this case a border router would receive all of the membership information in a short time when the connection came up, and the savings from using these techniques would be significant -- in the best case the overhead from multicast path usage messages would only be about as much as a full reachability update on each of the paths used by multicast packets to reach the newly available connection. If, on the other hand, the Internet's AS-level participation distribution is more volatile than the inter-AS connectivity, the main event triggering a multicast path usage message will be a change in membership, and we would only see a small overhead savings, if any.

A proxy AS capability would help for stub ASs. It would allow for a stub AS to tell a more central AS to be its proxy, to do whatever the more central AS thinks is right, so the stub AS wouldn't have to send or receive any other multicast path usage messages at all as long as its link to the more central AS stayed up. Backbone ASs have a different set of problems, caused by the requirement for full connectivity between border routers in a common AS, which this extension would not address.

Incremental advertisement of mappings between ASs and networks might reduce the amount of information sent, but still each incremental change would have to be sent per source AS and per group address, and all addresses would have to be sent again over any BGP peer connection which was reset.

Building only one tree reduces the overhead so that a multicast path usage packet only needs to be sent whenever a remote (and reachable) AS changes from having no multicast members at all to having at least one. The list of networks would still have to be sent, but only once for all groups (with incremental changes after that).
The protocol overhead for BGP would be approximately doubled, since a list of networks would radiate out from each AS twice, once for unicast reachability and once to construct the (generalized) multicast tree. There would probably be conditions where the list of networks would have to be sent again, when a completely new AS was placed in an intermediate position on one of the ASpaths used, but this would be infrequent. Since intermediate routers would know not only their position in the "generalized" multicast forwarding tree for the source, but also the locations of the members of each specific multicast group that the source may be using them to reach, they could mask one set of information with the other to determine which peers further downstream should receive a particular multicast packet. There are two significant problems with this approach, though. First, if a potential multicast source hears of group members in a remote AS by two different paths, it can be guaranteed that it will only be able to reach a member of one group or the other, but not both. Second, there is no possibility for an AS to communicate with one set of ASs for one group and another set for another group -- any multicast scope characteristics have to be carried in the packets themselves, not in the paths set up by the routing protocol.

Statically configuring the border gateways with tables mapping between ASs and networks would remove the need to have any such information in the routing protocol, and would immediately reduce the overhead to a reasonable level. However, it is questionable whether this system can be kept up accurately. We are assuming there will be no problem with overlapping ASs.

5.5.3. Summary

This approach has a few advantages -- it allows the most administrative control in a mode familiar to network administrators, and more importantly it is closest to the framework being considered by the IDPR group.
However, the overhead is higher than with any of the other approaches.

6. Considerations for the Real Internet

6.1. Multicasting, BGP, and IGPs

Except in trivial cases, an AS with members in a multicast group needs to have an IGP which supports multicasting to make final delivery of packets addressed to that group. A transit AS needs to have either an IGP which supports multicasting or the capability of encapsulating multicast packets for sending multiple copies of the packet to the other border routers in the AS. It is important that the multicast extensions to BGP not make any assumptions about what sort of IGP is running in an AS -- it may be that the multicasting support provided by an AS is extremely rudimentary but satisfactory for its needs.

Also, ASs should not be required to have their border routers inject all external routing information into their IGPs in order for the AS to support inter-AS multicasting. Many ASs use low-bandwidth connections to a number of internal sites. Especially because the advent of BGP makes it much easier to divide the Internet up into ASs, in designing the extensions to BGP for multicasting we must not force ASs to carry inter-AS routing information internally.

Therefore, since we cannot require intra-AS routers to know the details of inter-AS multicast routing, we have to assume that they know none of it. Intra-AS routers can be expected to assume symmetrical routes in all cases, and it will be up to BGP to make sure multicast packets arrive or at least appear to arrive at the external routers they would arrive at if inter-AS routes were symmetrical. Also, where an AS uses multiple border routers to reach a remote network (e.g. where internal networks all use the border routers closest to them to reach a backbone the AS is multiply attached to), the AS's internal routers may expect a multicast packet from that remote network to arrive at all of those border routers.
The border routers must be in agreement with the intra-AS routers on this.

The other approaches described here, which allow multicast traffic not to flow on the multicast source's unicast sink tree, may lead to packets arriving at the "wrong" border routers. In any of these methods, if the multicast packets would not appear where expected by the intra-AS routers, the packets will probably have to be encapsulated and sent to where they are expected; otherwise they simply won't be forwarded where they need to go, because the intra-AS routers will assume they have been forwarded there by some other path. Other means of ensuring that packets are delivered correctly may be possible, even if they do not arrive on the expected interfaces, for example through extensive configuring of the intra-AS routers, but if such an approach is possible at all it will be highly specific to a particular AS's needs. Assuming, then, that the only general way to deal with having packets arrive at an unexpected set of border routers is encapsulation, multicast packets could have to cross an entire intermediate AS twice. In the worst case, for example, a multicast packet might cross the continent once on one backbone, be passed to a second backbone, then be encapsulated on that second backbone and sent back to a "more correct" entry router on the other coast, and finally cross the continent a third time after being de-encapsulated.

6.2. Scope

The extent of participation of a particular member of a multicast group should have a scope, or limits on the topological extent over which its packets should range and from which external packets addressed to that group should reach it.
It should be possible to limit the extent to which packets which are either of no interest to the outside world or should not be seen by the outside world are propagated, and it should also be possible for a host to avoid receiving packets addressed to a particular group from the outside when it only wants to communicate with hosts within some sort of affinity group. Scoping is desirable both to save on wasted network bandwidth and for security. Various levels of scope have been proposed, for example local wire, site, organization, autonomous system, confederation, and/or world.

This section concerns itself with how the different multicast routing approaches would support scoping at the AS boundary. How easy does each approach make it to implement scope for incoming and outgoing multicast packets?

In Reverse Path Forwarding there is no explicit means for identifying the intended scope of packets. It is straightforward for a border gateway of an AS to refuse packets coming into it by means of prune messages, even when the AS has internal participants in that group, but it has to watch the addresses of incoming packets in order to know when to send the prune messages, and if it truly wants to protect internal nodes it has to check incoming packets even after it has sent them. Since information about group member locations is not carried in BGP (in this approach), and neither is information about the originators of packets, there are only a few ways for Reverse Path Forwarding to implement scope limitations: for outgoing packets, by using Time To Live and by depending on the IGP to carry scope information which border routers can use to control packet propagation; and for incoming packets perhaps by configuring tables in the border routers.

There has been a proposal to split up the multicast group address space to make scope implicit in the address given to a group.
This should be avoided, if possible, in order to preserve flexibility and keep the multicast addressing scheme capable of matching the requirements of the Internet as they change. We really don't know what the Internet's scope requirements are now, let alone what they will be in five years, but allocation of address space is very difficult to undo.

If we use the approach of multicasting on independent trees, we can place a scope attribute in the multicast availability announcements, which would take care of communicating the scope of a multicast group member's participation when it sends messages, and limit how far its packets would travel. The border router could fill this field in with information from the AS's IGP if the IGP is capable of carrying scope information. Since in this approach multicast packets are not sent to a border router unless it has agreed to accept them, a border router could take the scope field in a multicast availability message into account when deciding whether to accept packets. As in all approaches, the border routers can also use preconfigured tables.

In the common spanning tree approach scope cannot be implemented in the routing protocol itself. It would have to be implemented in the packets, just as in Reverse Path Forwarding.

In the final approach, where multicast packets flow along unicast source trees, the source AS has complete control over outgoing packets, even if scope attributes are not implemented, although if it is decided to reduce the overhead of this approach by having one general multicast propagation tree this capability is much more limited. However, under this approach an AS has very little control over incoming packets, besides dropping them, unless the distribution list attributes proposed for IDRP are adopted for BGP.

6.3. Parallel Paths on Shared Networks

It is conceivable that two apparently independent paths from a source to members of a multicast group might actually share a network. Indeed, because of the way they were configured, two routers on a shared broadcast network might not even be aware of each other's existence, and both might receive a packet from one source and each place a copy of it on the network using the link-level multicast address. All group members down the forwarding tree would receive two copies.

Three possible solutions to this problem come to mind: practicing good planning and engineering for inter-AS routing; having packet recipients note the link-level address of the previous hop of a packet; and placing IP multicast packets in unicast link-level frames, sending multiple copies where necessary. None of these is particularly satisfying, but the best approach appears to be to have a configuration option for each router interface specifying whether to use link-level multicast or unicast addresses, and coordinating the setting of this option for all routers sharing the network. It may be that some routers will check link-level addresses anyway to ensure the packets are coming from a router they are exchanging routing information with, for extra security.

6.4. Overlapping ASs

On occasion a network may be advertised by more than one AS. This does not present a problem for Reverse Path Forwarding, but may present a problem for the approach of multicast and unicast paths. This is a problem area which hasn't been explored thoroughly, but situations can be envisioned where a multicast group member may not get any copies of packets from a particular source, or may get multiple copies.
In the approach where the source decides the path its multicast packets will take, if an intermediate border router receives multicast path usage messages from multiple ASs which include the same network, it should forward packets from that network on the union of the paths requested by all multicast path usage messages.

6.5. ASs with Multiple Border Routers

ASs with multiple connections to the outside world may receive more than one copy of a multicast packet at their various entry points when more than one border router is a useful path between the source of the packets and the AS. Some of the approaches outlined above tolerate this; in fact, Reverse Path Forwarding requires any packet arriving at any border router to be forwarded into the AS if the usual rules are met (that is, administrative boundaries have no meaning in Reverse Path Forwarding). While it is probably outside the scope of the BGP protocol, implementations of the other approaches will need to be sure the border routers coordinate among themselves and with the IGP routers.

6.6. Selective Multicasting

While, in some of these designs, autonomous systems can have a large amount of control over the flow of multicast packets to or from them, if they are going to be selective in which ASs they communicate with they should be very careful to maintain consistency in how they set up their communications. Any non-trivial multi-party multicast interaction can be disrupted by having members transmitting to other members selectively but not consistently, in that members could receive different sets of messages. Suppose, for example, that AS A is willing to send to members of a group in B but not in C, and B is willing to send to both. C's members will not see messages from A, but will see B's replies to A, and so forth.
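The inconsistency in this example can be made concrete with a small sketch. The Python below is purely illustrative and is not part of any BGP mechanism: the `WILLING_TO_SEND` table and the `delivered` helper are hypothetical names, each AS is assumed to contain group members that both send and receive, and C's own sending policy is an assumption since the example does not state it.

```python
# Hypothetical per-AS sending policy for one multicast group.
# Key: sending AS; value: set of ASs whose members it is willing to reach.
WILLING_TO_SEND = {
    "A": {"B"},        # A sends to members in B but not in C
    "B": {"A", "C"},   # B sends to both
    "C": {"A", "B"},   # assumed: the example does not state C's policy
}


def delivered(policy):
    """Return, for each AS, the set of source ASs whose packets reach it."""
    received = {asn: set() for asn in policy}
    for source, targets in policy.items():
        for target in targets:
            received[target].add(source)
    return received


views = delivered(WILLING_TO_SEND)
for asn in sorted(views):
    print(asn, "hears from", sorted(views[asn]))

# The conversation is consistent only if every AS hears from all of the
# other participating ASs (its own traffic is ignored).
all_ases = set(WILLING_TO_SEND)
consistent = all(views[asn] == all_ases - {asn} for asn in all_ases)
print("consistent:", consistent)
```

Under these assumptions C hears from B only, while B hears from both A and C, so members in B and C end up with different views of the same exchange.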
It is strongly recommended that if an AS is going to refuse to exchange multicast packets with some members outside of its own domain, it either refuse to communicate with any such external members at all, or else make very clear agreements with other ASs about how such interactions will be managed. Multicast groups like this should probably have a priori restricted membership.

6.7. Incremental Deployment

Whichever approach is used should be easily capable of incremental deployment, including being able to connect temporary "islands" of multicast routing capability in a sea of only unicast routing.

In Reverse Path Forwarding the only way of positively knowing whether a peer supports multicasting will be indirectly from BGP version negotiation, or perhaps from its response when it receives multicast packets, but even then there is no way of knowing how far your packets might go and whether they will reach a particular group member. Deering's implementation of Reverse Path Forwarding includes a "tunneling" capability via "virtual interfaces." These tunnels need to be configured by hand, and the network administrator must keep track of when and where they are necessary. During the transition time when not all ASs can forward multicast packets, it will be important to be able to get rid of tunnels promptly when they are not needed, but to notice when they are necessary because of an intermediate router losing its multicast forwarding capability. Also, since in some implementations packet forwarding is decoupled from routing, and Reverse Path Forwarding does not depend directly on the routing function, it is possible that a router will receive multicast traffic even if it doesn't support it. Ideally the receiving router should just discard these packets with an ICMP "net unreachable," but this should be tested.
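The tunnel bookkeeping described above (removing tunnels that are no longer needed, and noticing when one becomes necessary) can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of how Deering's implementation works: the function `tunnels_needed`, the router names, and the idea of knowing each router's multicast capability in advance are all hypothetical.

```python
def tunnels_needed(path, multicast_capable):
    """Return (entry, exit) pairs of multicast-capable routers that must be
    joined by a tunnel because every router between them is unicast-only."""
    tunnels = []
    last_capable = None   # most recent multicast-capable router on the path
    gap = False           # True if unicast-only routers have followed it
    for router in path:
        if router in multicast_capable:
            if last_capable is not None and gap:
                tunnels.append((last_capable, router))
            last_capable = router
            gap = False
        else:
            gap = True
    return tunnels


# Hypothetical path between two multicast "islands".
path = ["R1", "R2", "R3", "R4", "R5"]

before = tunnels_needed(path, {"R1", "R2", "R5"})
# R2 loses its multicast forwarding capability: the tunnel must now
# start at R1 instead.
after = tunnels_needed(path, {"R1", "R5"})
# Every router supports multicast: no tunnel should remain configured.
none_left = tunnels_needed(path, set(path))

print(before, after, none_left)
```

Comparing the result before and after a capability change shows which hand-configured tunnels would have to be added or withdrawn.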
In the Independent Tree and Spanning Tree approaches, the availability messages you receive will give you some idea of the extent of the connected multicast forwarding topology. However, you will still not have complete knowledge of where your packets will go, and once again tunnel connectivity will need to be maintained by hand.

In the source tree approach, where multicast group memberships are propagated before packets are forwarded, a border gateway always knows where the packets it originates might travel and the multicast connectivity is always apparent. With this approach "tunnels" could be created and destroyed automatically depending on the current state of real multicast connectivity.

7. Acknowledgments

The development of the ideas presented in this document has been supported by the Defense Advanced Research Projects Agency through grant NAG 2-593 from the NASA Ames Research Center. This work would not have been possible without the help of the IETF BGP Working Group, Dennis Ferguson, Jeffrey C. Honig, Yakov Rekhter, John Moy, and Steve Deering.

8. References

[1] Honig, J., Katz, D., Mathis, M., Rekhter, Y., and Yu, J. Application of the Border Gateway Protocol in the Internet. RFC 1164, June 1990.

[2] Katz, D. A Proposed Standard for the Transmission of IP Datagrams over FDDI Networks. RFC 1188, October 1990.

[3] Lougheed, K., and Rekhter, Y. A Border Gateway Protocol (BGP). RFC 1163, June 1990.

[4] Lougheed, K., and Rekhter, Y. A Border Gateway Protocol 3 (BGP-3). Internet Draft, January 1991.

[5] Deering, S. Multicast Routing in Internetworks and Extended LANs. Proc. ACM SIGCOMM 1988, August 1988.

[6] Deering, S. Host Extensions for IP Multicasting. RFC 1112, August 1989.

[7] Deering, S., and Cheriton, D. Multicast Routing in Datagram Internetworks and Extended LANs. ACM Trans. on Comp. Sys. 8(2), May 1990, pp. 85-110.

[8] Moy, J. The OSPF Specification. RFC 1131, October 1989.

[9] Moy, J. The OSPF Specification, Version 2. Internet Draft, January 1991.

[10] Moy, J. Multicast Extensions to OSPF. Internet Draft, March 1991.

[11] Perlman, R. An algorithm for distributed computation of a spanning tree in an extended LAN. Proc. 9th Data Communications Symposium, pp. 44-53. ACM/IEEE, September 1985.

[12] Piscitello, D., and Lawrence, J. A Specification of the Transmission of IP Datagrams over SMDS. RFC 1209, March 1991.

[13] Waitzman, D., Partridge, C., and Deering, S. Distance Vector Multicast Routing Protocol. RFC 1075, November 1988.

9. Author's Address

Scott W. Brim
Cornell Information Technologies
143 Caldwell Hall
Cornell University
Ithaca, NY 14853 USA

Electronic mail: swb@nr-tech.cit.cornell.edu
Phone: +1-607-255-5510