You should first read the ppp(8) man page and the PPP section of the handbook. Enable logging with the command
set log Phase Chat Connect Carrier lcp ipcp ccp command
This command may be typed at the ppp(8) command prompt or it may be entered in the /etc/ppp/ppp.conf configuration file (the start of the default section is the best place to put it). Make sure that /etc/syslog.conf (see syslog.conf(5)) contains the lines
!ppp *.* /var/log/ppp.log
and that the file /var/log/ppp.log exists. You can now find out a lot about what is going on from the log file. Do not worry if it does not all make sense. If you need to get help from someone, it may make sense to them.
If your version of ppp(8) does not understand the set log command, you should download the latest version. It will build on FreeBSD version 2.1.5 and higher.
This is usually because your hostname will not resolve. The best way to fix this is to make sure that /etc/hosts is consulted by your resolver first by editing /etc/host.conf and putting the hosts line first. Then, simply put an entry in /etc/hosts for your local machine. If you have no local network, change your localhost line:
127.0.0.1 foo.bar.com foo localhost
Otherwise, simply add another entry for your host. Consult the relevant man pages for more details.
You should be able to successfully ping -c1 `hostname` when you are done.
First, check that you have got a default route. By running netstat -rn (see netstat(1)), you should see two entries like this:
Destination Gateway Flags Refs Use Netif Expire default 10.0.0.2 UGSc 0 0 tun0 10.0.0.2 10.0.0.1 UH 0 0 tun0
This is assuming that you have used the addresses from the handbook, the man page or from the ppp.conf.sample file. If you do not have a default route, it may be because you are running an old version of ppp(8) that does not understand the word HISADDR in the ppp.conf file. If your version of ppp(8) is from before FreeBSD 2.2.5, change the
add 0 0 HISADDR
line to one saying
add 0 0 10.0.0.2
Another reason for the default route line being missing is that you have mistakenly set up a default router in your /etc/rc.conf (see rc.conf(5)) file (this file was called /etc/sysconfig prior to release 2.2.2), and you have omitted the line saying
delete ALL
from ppp.conf. If this is the case, go back to the Final system configuration section of the handbook.
This error is usually due to a missing
MYADDR: delete ALL add 0 0 HISADDR
section in your /etc/ppp/ppp.linkup file. This is only necessary if you have a dynamic IP address or do not know the address of your gateway. If you are using interactive mode, you can type the following after entering packet mode (packet mode is indicated by the capitalized PPP in the prompt):
delete ALL add 0 0 HISADDR
Refer to the PPP and Dynamic IP addresses section of the handbook for further details.
The default PPP timeout is 3 minutes. This can be adjusted with the line
set timeout NNN
where NNN is the number of seconds of inactivity before the connection is closed. If NNN is zero, the connection is never closed due to a timeout. It is possible to put this command in the ppp.conf file, or to type it at the prompt in interactive mode. It is also possible to adjust it on the fly while the line is active by connecting to ppps server socket using telnet(1) or pppctl(8). Refer to the ppp(8) man page for further details.
If you have Link Quality Reporting (LQR) configured, it is possible that too many LQR packets are lost between your machine and the peer. Ppp deduces that the line must therefore be bad, and disconnects. Prior to FreeBSD version 2.2.5, LQR was enabled by default. It is now disabled by default. LQR can be disabled with the line
disable lqr
Sometimes, on a noisy phone line or even on a line with call waiting enabled, your modem may hang up because it thinks (incorrectly) that it lost carrier.
There is a setting on most modems for determining how tolerant it should be to temporary losses of carrier. On a USR Sportster for example, this is measured by the S10 register in tenths of a second. To make your modem more forgiving, you could add the following send-expect sequence to your dial string:
set dial "...... ATS10=10 OK ......"
Refer to your modem manual for details.
Many people experience hung connections with no apparent explanation. The first thing to establish is which side of the link is hung.
If you are using an external modem, you can simply try using ping(8) to see if the TD light is flashing when you transmit data. If it flashes (and the RD light does not), the problem is with the remote end. If TD does not flash, the problem is local. With an internal modem, you will need to use the set server command in your ppp.conf file. When the hang occurs, connect to ppp(8) using pppctl(8). If your network connection suddenly revives (PPP was revived due to the activity on the diagnostic socket) or if you cannot connect (assuming the set socket command succeeded at startup time), the problem is local. If you can connect and things are still hung, enable local async logging with set log local async and use ping(8) from another window or terminal to make use of the link. The async logging will show you the data being transmitted and received on the link. If data is going out and not coming back, the problem is remote.
Having established whether the problem is local or remote, you now have two possibilities:
There is very little you can do about this. Most ISPs will refuse to help if you are not running a Microsoft OS. You can enable lqr in your ppp.conf file, allowing ppp(8) to detect the remote failure and hang up, but this detection is relatively slow and therefore not that useful. You may want to avoid telling your ISP that you are running user-PPP...
First, try disabling all local compression by adding the following to your configuration:
disable pred1 deflate deflate24 protocomp acfcomp shortseq vj deny pred1 deflate deflate24 protocomp acfcomp shortseq vj
Then reconnect to ensure that this makes no difference. If things improve or if the problem is solved completely, determine which setting makes the difference through trial and error. This will provide good ammunition when you contact your ISP (although it may make it apparent that you are not running a Microsoft product).
Before contacting your ISP, enable async logging locally and wait until the connection hangs again. This may use up quite a bit of disk space. The last data read from the port may be of interest. It is usually ascii data, and may even describe the problem ("Memory fault, core dumped"?).
If your ISP is helpful, they should be able to enable logging on their end, then when the next link drop occurs, they may be able to tell you why their side is having a problem. Feel free to send the details to Brian Somers <brian@FreeBSD.org>, or even to ask your ISP to contact me directly.
Your best bet here is to rebuild ppp(8) by adding CFLAGS+=-g and STRIP= to the end of the Makefile, then doing a make clean && make && make install. When ppp(8) hangs, find the ppp(8) process id with ps ajxww | fgrep ppp and run gdb ppp PID. From the gdb prompt, you can then use bt to get a stack trace.
Send the results to <brian@Awfulhak.org>.
Prior to FreeBSD version 2.2.5, once the link was established, ppp(8) would wait for the peer to initiate the Line Control Protocol (LCP). Many ISPs will not initiate negotiations and expect the client to do so. To force ppp(8) to initiate the LCP, use the following line:
set openmode active
Note: It usually does no harm if both sides initiate negotiation, so openmode is now active by default. However, the next section explains when it does do some harm.
Occasionally, just after connecting, you may see messages in the log that say "magic is the same". Sometimes, these messages are harmless, and sometimes one side or the other exits. Most PPP implementations cannot survive this problem, and even if the link seems to come up, you will see repeated configure requests and configure acknowledgments in the log file until ppp(8) eventually gives up and closes the connection.
This normally happens on server machines with slow disks that are spawning a getty on the port, and executing ppp(8) from a login script or program after login. I have also heard reports of it happening consistently when using slirp. The reason is that in the time taken between getty(8) exiting and ppp(8) starting, the client-side ppp(8) starts sending Line Control Protocol (LCP) packets. Because ECHO is still switched on for the port on the server, the client ppp(8) sees these packets "reflect" back.
One part of the LCP negotiation is to establish a magic number for each side of the link so that "reflections" can be detected. The protocol says that when the peer tries to negotiate the same magic number, a NAK should be sent and a new magic number should be chosen. During the period that the server port has ECHO turned on, the client ppp(8) sends LCP packets, sees the same magic in the reflected packet and NAKs it. It also sees the NAK reflect (which also means ppp(8) must change its magic). This produces a potentially enormous number of magic number changes, all of which are happily piling into the server's tty buffer. As soon as ppp(8) starts on the server, it is flooded with magic number changes and almost immediately decides it has tried enough to negotiate LCP and gives up. Meanwhile, the client, who no longer sees the reflections, becomes happy just in time to see a hangup from the server.
This can be avoided by allowing the peer to start negotiating with the following line in your ppp.conf file:
set openmode passive
This tells ppp(8) to wait for the server to initiate LCP negotiations. Some servers however may never initiate negotiations. If this is the case, you can do something like:
set openmode active 3
This tells ppp(8) to be passive for 3 seconds, and then to start sending LCP requests. If the peer starts sending requests during this period, ppp(8) will immediately respond rather than waiting for the full 3 second period.
There is currently an implementation mis-feature in ppp(8) where it does not associate LCP, CCP & IPCP responses with their original requests. As a result, if one PPP implementation is more than 6 seconds slower than the other side, the other side will send two additional LCP configuration requests. This is fatal.
Consider two implementations, A and B. A starts sending LCP requests immediately after connecting and B takes 7 seconds to start. When B starts, A has sent 3 LCP REQs. We are assuming the line has ECHO switched off, otherwise we would see magic number problems as described in the previous section. B sends a REQ, then an ACK to the first of A's REQs. This results in A entering the OPENED state and sending and ACK (the first) back to B. In the meantime, B sends back two more ACKs in response to the two additional REQs sent by A before B started up. B then receives the first ACK from A and enters the OPENED state. A receives the second ACK from B and goes back to the REQ-SENT state, sending another (forth) REQ as per the RFC. It then receives the third ACK and enters the OPENED state. In the meantime, B receives the forth REQ from A, resulting in it reverting to the ACK-SENT state and sending another (second) REQ and (forth) ACK as per the RFC. A gets the REQ, goes into REQ-SENT and sends another REQ. It immediately receives the following ACK and enters OPENED.
This goes on until one side figures out that they are getting nowhere and gives up.
The best way to avoid this is to configure one side to be passive - that is, make one side wait for the other to start negotiating. This can be done with the
set openmode passive
command. Care should be taken with this option. You should also use the
set stopped N
command to limit the amount of time that ppp(8) waits for the peer to begin negotiations. Alternatively, the
set openmode active N
command (where N is the number of seconds to wait before starting negotiations) can be used. Check the manual page for details.
Prior to version 2.2.5 of FreeBSD, it was possible that your link was disabled shortly after connection due to ppp(8) mis-handling Predictor1 compression negotiation. This would only happen if both sides tried to negotiate different Compression Control Protocols (CCP). This problem is now corrected, but if you are still running an old version of ppp(8) the problem can be circumvented with the line
disable pred1
When you execute the shell or ! command, ppp(8) executes a shell (or if you have passed any arguments, ppp(8) will execute those arguments). Ppp will wait for the command to complete before continuing. If you attempt to use the PPP link while running the command, the link will appear to have frozen. This is because ppp(8) is waiting for the command to complete.
If you wish to execute commands like this, use the !bg command instead. This will execute the given command in the background, and ppp(8) can continue to service the link.
There is no way for ppp(8) to automatically determine that a direct connection has been dropped. This is due to the lines that are used in a null-modem serial cable. When using this sort of connection, LQR should always be enabled with the line
enable lqr
LQR is accepted by default if negotiated by the peer.
If ppp(8) is dialing unexpectedly, you must determine the cause, and set up Dial filters (dfilters) to prevent such dialing.
To determine the cause, use the following line:
set log +tcp/ip
This will log all traffic through the connection. The next time the line comes up unexpectedly, you will see the reason logged with a convenient timestamp next to it.
You can now disable dialing under these circumstances. Usually, this sort of problem arises due to DNS lookups. To prevent DNS lookups from establishing a connection (this will not prevent ppp(8) from passing the packets through an established connection), use the following:
set dfilter 1 deny udp src eq 53 set dfilter 2 deny udp dst eq 53 set dfilter 3 permit 0/0 0/0
This is not always suitable, as it will effectively break your demand-dial capabilities - most programs will need a DNS lookup before doing any other network related things.
In the DNS case, you should try to determine what is actually trying to resolve a host name. A lot of the time, sendmail(8) is the culprit. You should make sure that you tell sendmail not to do any DNS lookups in its configuration file. See the section on Mail Configuration for details on how to create your own configuration file and what should go into it. You may also want to add the following line to your .mc file:
define(`confDELIVERY_MODE', `d')dnl
This will make sendmail queue everything until the queue is run (usually, sendmail is invoked with -bd -q30m, telling it to run the queue every 30 minutes) or until a sendmail -q is done (perhaps from your ppp.linkup file).
I keep seeing the following errors in my log file:
CCP: CcpSendConfigReq CCP: Received Terminate Ack (1) state = Req-Sent (6)
This is because ppp(8) is trying to negotiate Predictor1 compression, and the peer does not want to negotiate any compression at all. The messages are harmless, but if you wish to remove them, you can disable Predictor1 compression locally too:
disable pred1
Under FreeBSD 2.2.2 and before, there was a bug in the tun driver that prevents incoming packets of a size larger than the tun interface's MTU size. Receipt of a packet greater than the MTU size results in an IO error being logged via syslogd.
The PPP specification says that an MRU of 1500 should always be accepted as a minimum, despite any LCP negotiations, therefore it is possible that should you decrease the MTU to less than 1500, your ISP will transmit packets of 1500 regardless, and you will tickle this non-feature - locking up your link.
The problem can be circumvented by never setting an MTU of less than 1500 under FreeBSD 2.2.2 or before.
In order to log all lines of your modem "conversation", you must enable the following:
set log +connect
This will make ppp(8) log everything up until the last requested "expect" string.
If you wish to see your connect speed and are using PAP or CHAP (and therefore do not have anything to "chat" after the CONNECT in the dial script - no set login script), you must make sure that you instruct ppp(8) to "expect" the whole CONNECT line, something like this:
set dial "ABORT BUSY ABORT NO\\sCARRIER TIMEOUT 4 \ \"\" ATZ OK-ATZ-OK ATDT\\T TIMEOUT 60 CONNECT \\c \\n"
Here, we get our CONNECT, send nothing, then expect a line-feed, forcing ppp(8) to read the whole CONNECT response.
Ppp parses each line in your config files so that it can interpret strings such as set phone "123 456 789" correctly (and realize that the number is actually only one argument. In order to specify a " character, you must escape it using a backslash (\).
When the chat interpreter parses each argument, it re-interprets the argument in order to find any special escape sequences such as \P or \T (see the man page). As a result of this double-parsing, you must remember to use the correct number of escapes.
If you wish to actually send a \ character to (say) your modem, you would need something like:
set dial "\"\" ATZ OK-ATZ-OK AT\\\\X OK"
resulting in the following sequence:
ATZ OK AT\X OK
or
set phone 1234567 set dial "\"\" ATZ OK ATDT\\T"
resulting in the following sequence:
ATZ OK ATDT1234567
Ppp (or any other program for that matter) should never dump core. Because ppp(8) runs with an effective user id of 0, the operating system will not write ppp(8)'s core image to disk before terminating it. If, however ppp(8) is actually terminating due to a segmentation violation or some other signal that normally causes core to be dumped, and you are sure you are using the latest version (see the start of this section), then you should do the following:
% tar xfz ppp-*.src.tar.gz % cd ppp*/ppp % echo STRIP= >>Makefile % echo CFLAGS+=-g >>Makefile % make clean all % su # make install # chmod 555 /usr/sbin/ppp
You will now have a debuggable version of ppp(8) installed. You will have to be root to run ppp(8) as all of its privileges have been revoked. When you start ppp(8), take a careful note of what your current directory was at the time.
Now, if and when ppp(8) receives the segmentation violation, it will dump a core file called ppp.core. You should then do the following:
% su # gdb /usr/sbin/ppp ppp.core (gdb) bt ..... (gdb) f 0 .... (gdb) i args .... (gdb) l .....
All of this information should be given alongside your question, making it possible to diagnose the problem.
If you are familiar with gdb, you may wish to find out some other bits and pieces such as what actually caused the dump and the addresses & values of the relevant variables.
This was a known problem with ppp(8) set up to negotiate a dynamic local IP number with the peer in auto mode. It is fixed in the latest version - search the man page for iface.
The problem was that when that initial program calls connect(2), the IP number of the tun interface is assigned to the socket endpoint. The kernel creates the first outgoing packet and writes it to the tun device. ppp(8) then reads the packet and establishes a connection. If, as a result of ppp(8)'s dynamic IP assignment, the interface address is changed, the original socket endpoint will be invalid. Any subsequent packets sent to the peer will usually be dropped. Even if they are not, any responses will not route back to the originating machine as the IP number is no longer owned by that machine.
There are several theoretical ways to approach this problem. It would be nicest if the peer would re-assign the same IP number if possible :-) The current version of ppp(8) does this, but most other implementations do not.
The easiest method from our side would be to never change the tun interface IP number, but instead to change all outgoing packets so that the source IP number is changed from the interface IP to the negotiated IP on the fly. This is essentially what the iface-alias option in the latest version of ppp(8) is doing (with the help of libalias(3) and ppp(8)'s -nat switch) - it is maintaining all previous interface addresses and NATing them to the last negotiated address.
Another alternative (and probably the most reliable) would be to implement a system call that changes all bound sockets from one IP to another. ppp(8) would use this call to modify the sockets of all existing programs when a new IP number is negotiated. The same system call could be used by dhcp clients when they are forced to re-bind() their sockets.
Yet another possibility is to allow an interface to be brought up without an IP number. Outgoing packets would be given an IP number of 255.255.255.255 up until the first SIOCAIFADDR ioctl is done. This would result in fully binding the socket. It would be up to ppp(8) to change the source IP number, but only if it is set to 255.255.255.255, and only the IP number and IP checksum would need to change. This, however is a bit of a hack as the kernel would be sending bad packets to an improperly configured interface, on the assumption that some other mechanism is capable of fixing things retrospectively.
The reason games and the like do not work when libalias is in use is that the machine on the outside will try to open a connection or send (unsolicited) UDP packets to the machine on the inside. The NAT software does not know that it should send these packets to the interior machine.
To make things work, make sure that the only thing running is the software that you are having problems with, then either run tcpdump on the tun interface of the gateway or enable ppp(8) tcp/ip logging (set log +tcp/ip) on the gateway.
When you start the offending software, you should see packets passing through the gateway machine. When something comes back from the outside, it will be dropped (that is the problem). Note the port number of these packets then shut down the offending software. Do this a few times to see if the port numbers are consistent. If they are, then the following line in the relevant section of /etc/ppp/ppp.conf will make the software functional:
nat port proto internalmachine:port port
where proto is either tcp or udp, internalmachine is the machine that you want the packets to be sent to and port is the destination port number of the packets.
You will not be able to use the software on other machines without changing the above command, and running the software on two internal machines at the same time is out of the question - after all, the outside world is seeing your entire internal network as being just a single machine.
If the port numbers are not consistent, there are three more options:
Submit support in libalias. Examples of "special cases" can be found in /usr/src/lib/libalias/alias_*.c (alias_ftp.c is a good prototype). This usually involves reading certain recognised outgoing packets, identifying the instruction that tells the outside machine to initiate a connection back to the internal machine on a specific (random) port and setting up a "route" in the alias table so that the subsequent packets know where to go.
This is the most difficult solution, but it is the best and will make the software work with multiple machines.
Use a proxy. The application may support socks5 for example, or (as in the "cvsup" case) may have a "passive" option that avoids ever requesting that the peer open connections back to the local machine.
Redirect everything to the internal machine using nat addr. This is the sledge-hammer approach.
Not yet, but this is intended to grow into such a list (if any interest is shown). In each example, internal should be replaced with the IP number of the machine playing the game.
Asheron's Call
nat port udp internal :65000 65000
Manually change the port number within the game to 65000. If you have got a number of machines that you wish to play on assign a unique port number for each (i.e. 65001, 65002, etc) and add a nat port line for each one.
Half Life
nat port udp internal:27005 27015
PCAnywhere 8.0
nat port udp internal:5632 5632
nat port tcp internal:5631 5631
Quake
nat port udp internal:6112 6112
Alternatively, you may want to take a look at www.battle.net for Quake proxy support.
Quake 2
nat port udp internal:27901 27910
nat port udp internal:60021 60021
nat port udp internal:60040 60040
Red Alert
nat port udp internal:8675 8675
nat port udp internal:5009 5009
FCS stands for Frame Check Sequence. Each PPP packet has a checksum attached to ensure that the data being received is the data being sent. If the FCS of an incoming packet is incorrect, the packet is dropped and the HDLC FCS count is increased. The HDLC error values can be displayed using the show hdlc command.
If your link is bad (or if your serial driver is dropping packets), you will see the occasional FCS error. This is not usually worth worrying about although it does slow down the compression protocols substantially. If you have an external modem, make sure your cable is properly shielded from interference - this may eradicate the problem.
If your link freezes as soon as you have connected and you see a large number of FCS errors, this may be because your link is not 8 bit clean. Make sure your modem is not using software flow control (XON/XOFF). If your datalink must use software flow control, use the command set accmap 0x000a0000 to tell ppp(8) to escape the ^Q and ^S characters.
Another reason for seeing too many FCS errors may be that the remote end has stopped talking PPP. You may want to enable async logging at this point to determine if the incoming data is actually a login or shell prompt. If you have a shell prompt at the remote end, it is possible to terminate ppp(8) without dropping the line by using the close lcp command (a following term command will reconnect you to the shell on the remote machine.
If nothing in your log file indicates why the link might have been terminated, you should ask the remote administrator (your ISP?) why the session was terminated.
Thanks to Michael Wozniak <mwozniak@netcom.ca> for figuring this out and Dan Flemming <danflemming@mac.com> for the Mac solution:
This is due to what is called a "Black Hole" router. MacOS and Windows 98 (and maybe other Microsoft OSs) send TCP packets with a requested segment size too big to fit into a PPPoE frame (MTU is 1500 by default for Ethernet) and have the "do not fragment" bit set (default of TCP) and the Telco router is not sending ICMP "must fragment" back to the www site you are trying to load. (Alternatively, the router is sending the ICMP packet correctly, but the firewall at the www site is dropping it.) When the www server is sending you frames that do not fit into the PPPoE pipe the Telco router drops them on the floor and your page does not load (some pages/graphics do as they are smaller than a MSS.) This seems to be the default of most Telco PPPoE configurations (if only they knew how to program a router... sigh...)
One fix is to use regedit on your 95/98 boxes to add the following registry entry...
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Class\NetTrans\0000\MaxMTU
It should be a string with a value "1436", as some ADSL routers are reported to be unable to deal with packets larger than this. This registry key has been changed to Tcpip\Parameters\Interfaces\ID for adapter\MTU in Windows 2000 and becomes a DWORD.
Refer to the Microsoft Knowledge Base documents Q158474 - Windows TCPIP Registry Entries and Q120642 - TCPIP & NBT Configuration Parameters for Windows NT for more information on changing Windows MTU to work with a NAT router.
Another regedit possibility under Windows 2000 is to set the Tcpip\Parameters\Interfaces\ID for adapter\EnablePMTUBHDetect DWORD to 1 as mentioned in the Microsoft document 120642 mentioned above.
Unfortunately, MacOS does not provide an interface for changing TCP/IP settings. However, there is commercial software available, such as OTAdvancedTuner (OT for OpenTransport, the MacOS TCP/IP stack) by Sustainable Softworks, that will allow users to customize TCP/IP settings. MacOS NAT users should select ip_interface_MTU from the drop-down menu, enter 1450 instead of 1500 in the box, click the box next to Save as Auto Configure, and click Make Active.
The latest version of ppp(8) (2.3 or greater) has an enable tcpmssfixup command that will automatically adjust the MSS to an appropriate value. This facility is enabled by default. If you are stuck with an older version of ppp(8), you may want to look at the tcpmssd port.
If all else fails, send as much information as you can, including your config files, how you are starting ppp(8), the relevant parts of your log file and the output of the netstat -rn command (before and after connecting) to the or the comp.unix.bsd.freebsd.misc news group, and someone should point you in the right direction.