8 Replies Latest reply on Jun 29, 2016 11:03 AM by apenichet

    Troubleshooting PPP Interface errors

    tetu04 New Member

      Hello all,

       

      One of our customer sites reported some poor-quality calls where clipping was occurring. The call issues persisted after the site rebooted the router in the morning, but a few hours later the issues went away. I engaged the circuit provider to test the circuits and see what they could find. The site runs on two bonded T1s and an Adtran 908e. The ISP got back to us saying that they see no issues. Here are the current outputs of the T 0/1 and PPP interfaces:

       

      redland.windsor-kf.ta908e-1#$aces t 0/1 performance-statistics Total-24-hour

        24 Hour Performance Statistics:

          0 Errored Seconds, 0 Bursty Errored Seconds

          0 Severely Errored Seconds, 0 Severely Errored Frame Seconds

          0 Unavailable Seconds, 0 Path Code Violations

          0 Line Code Violations, 0 Controlled Slip Seconds

          0 Line Errored Seconds, 0 Degraded Minutes

       

      T 0/1

      TDM group 1, line protocol is UP

        Encapsulation PPP (ppp 1)

          423910 packets input, 116661246 bytes, 0 no buffer

          0 runts, 0 giants, 0 throttles

          714 input errors, 318 CRC, 376 frame

          20 abort, 0 discards, 0 overruns

          553967 packets output, 108136194 bytes, 0 underruns

       

      T 0/2

      TDM group 1, line protocol is UP

        Encapsulation PPP (ppp 1)

          423086 packets input, 116680142 bytes, 0 no buffer

          0 runts, 0 giants, 0 throttles

          597 input errors, 282 CRC, 281 frame

          34 abort, 0 discards, 0 overruns

          553882 packets output, 108156042 bytes, 0 underruns

       

      The errors increment constantly. What troubleshooting steps should I take to find the root cause, or the link between these errors and the poor-quality voice calls?

        • Re: Troubleshooting PPP Interface errors
          jayh Hall_of_Fame

          How are the T1 interfaces clocked?

          What is the cable length and type between the smartjack and the router?

           

          Please run the following command and look at the output:

           

          show system

            We're looking for the primary and secondary clock source.

            In 99.9% of cases this should be t1 0/1 and t1 0/2 respectively.

            If not, configure the following:

            timing-source t1 0/1

            timing-source t1 0/2 secondary


          If the clocking is OK, then post the results of:

           

          show int t1 0/1 performance-statistics

          show int t1 0/2 performance-statistics


          Could other traffic be congesting the link?  Is the poor audio inbound or outbound or both, and is QoS properly configured?  Is your carrier applying QoS inbound?

           

          Call a milliwatt test tone line if you know of a local one and perform a speed test while the call is up.  Does the tone break up during the download test? 

            • Re: Troubleshooting PPP Interface errors
              tetu04 New Member

              The clocking settings are OK. When issuing the performance-statistics command I see some sporadic occurrences of "1 errored seconds", "1 severely errored frame seconds", "1 controlled slip seconds" or "1 path code violations". However, the number never exceeds 1. The poor audio is only on the inbound side, whether on internal VoIP calls or on calls from external callers on their cellphones. The external caller hears the conversation fine; the clipping occurred only on the internal VoIP phone. The site has a 3Mb pipe, with 12 VoIP phones and 4 analog. Overutilization was never an issue, and these quality problems only appeared a few days ago. I am hoping to correlate the incrementing layer 2 errors on the PPP interfaces with the clipped calls the customer described.

                • Re: Troubleshooting PPP Interface errors
                  jayh Hall_of_Fame

                  The slip seconds point to a clocking issue.  A handful of errors per day isn't particularly unusual on T1 circuits that traverse outside telephone plant.

                   

                  Check the LAN side for duplex issues on switch ports as well.  If only on the inbound, it could be overutilization, which can be bursty.  Turn on VQM and look at MOS scores and reason for degradation. 

                    • Re: Troubleshooting PPP Interface errors
                      tetu04 New Member

                      I engaged the site's LAN team to verify their equipment. Their PPP interfaces are still incrementing input errors. Right now they are sitting at "5273 input errors, 1751 CRC, 3499 frame". Is the cause for this still internal if the carrier says their circuit is clean and "sh int t 0/1 performance-statistics total" shows no errors?

                        • Re: Troubleshooting PPP Interface errors
                          jayh Hall_of_Fame

                          A slight incrementing of input errors on long-haul WAN circuits is somewhat to be expected.  T1 circuits pass a little over 1.5 million bits every second.  If one per billion bits has an error, that's a bit error rate of 10^-9, which is pretty good, but it's still one error about every eleven minutes, so they will rack up over time.  That's why the 15-minute performance display is useful: you can see whether the errors came in a burst.
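The arithmetic above can be checked with a short sketch. It assumes the nominal T1 line rate of 1.544 Mbit/s and a constant bit error rate; the function names are made up for illustration and are not part of any ADTRAN tool.

```python
# Nominal T1 line rate in bits per second (an assumption; the post only says
# "a little over 1.5 million bits every second").
T1_LINE_RATE_BPS = 1_544_000


def seconds_between_errors(bit_error_rate: float,
                           rate_bps: int = T1_LINE_RATE_BPS) -> float:
    """Mean time between bit errors for a constant bit error rate."""
    return 1.0 / (bit_error_rate * rate_bps)


def errors_per_day(bit_error_rate: float,
                   rate_bps: int = T1_LINE_RATE_BPS) -> float:
    """Expected number of bit errors in 24 hours on a fully clocked line."""
    return bit_error_rate * rate_bps * 86_400


if __name__ == "__main__":
    ber = 1e-9
    print(f"{seconds_between_errors(ber) / 60:.1f} minutes between errors")
    print(f"{errors_per_day(ber):.0f} errors per day")
```

At 10^-9 this works out to a little under eleven minutes between errors, in line with the estimate in the post. On two bonded T1s the figure applies to each circuit separately.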

                           

                          Some other things to check:

                          • Triple-check your clocking.  Normally you'll set your system-timing to T1 0/1 with a secondary of T1 0/2.
                          • When the carrier says it's clean, are they just looking at the errors on the CO side, or did they take the loop out of service and run a full loopback test to the smartjack?  In order to be sure they'll need to loop to the smartjack and run stress pattern tests which can mean several minutes of downtime.
                          • Triple-check the wiring between the smartjack and the TA900.  Good quality cable, no splices, twisted pair.
                          • Is the TA900 grounded?  Ideally it should be grounded to the same point as telco's smartjack.
                          • Re: Troubleshooting PPP Interface errors
                            david Employee

                            Tetu04,

                             

                            The key to tdm-group errors is to monitor whether they are currently incrementing while the T1 stays error free.  This appears to be the case in your situation, so this usually means that a frame is coming in that is not a valid HDLC frame.  It could be that part of the HDLC frame that was sent from the far end got distorted in the middle.  One way this can happen is if RBS is enabled on the T1 circuit somewhere over the span.  This could explain why there are zero T1 errors, but the layer two frames (PPP or HDLC) could get mangled traversing the path. You can attempt to run a QRSS pattern head to head (from both sites) and see if errors are reported in one direction, though this is typically only possible if you control/manage both ends of the layer two link.
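One way to watch whether the tdm-group counters are still incrementing is to capture the interface output twice and compare the numbers. The sketch below is only an illustration: it parses the counter lines in the format posted in this thread, the helper names are hypothetical, and the second snapshot reuses the figures quoted in a later reply, on the assumption that they come from the same interface and that the counters were not cleared in between.

```python
import re

# Counter labels as they appear in the tdm-group section of the output.
COUNTER_NAMES = (
    "packets input", "packets output", "input errors", "CRC", "frame",
    "abort", "no buffer", "runts", "giants", "throttles", "discards",
    "overruns", "underruns",
)
COUNTER_RE = re.compile(r"(\d+)\s+(" + "|".join(COUNTER_NAMES) + r")\b")


def parse_counters(text: str) -> dict[str, int]:
    """Extract '<number> <label>' pairs from pasted interface output."""
    return {name: int(value) for value, name in COUNTER_RE.findall(text)}


def counter_deltas(earlier: dict[str, int], later: dict[str, int]) -> dict[str, int]:
    """Growth of every counter that is present in both snapshots."""
    return {name: later[name] - earlier[name] for name in later if name in earlier}


def errors_without_own_counter(counters: dict[str, int]) -> int:
    """Input errors not accounted for by CRC, frame, abort, no buffer, overruns."""
    explained = sum(counters.get(name, 0)
                    for name in ("CRC", "frame", "abort", "no buffer", "overruns"))
    return counters["input errors"] - explained


# First snapshot: T 0/1 as posted at the top of the thread.
EARLIER = """
423910 packets input, 116661246 bytes, 0 no buffer
0 runts, 0 giants, 0 throttles
714 input errors, 318 CRC, 376 frame
20 abort, 0 discards, 0 overruns
553967 packets output, 108136194 bytes, 0 underruns
"""
# Second snapshot: figures quoted in a later reply (assumed comparable).
LATER = "5273 input errors, 1751 CRC, 3499 frame"

if __name__ == "__main__":
    first = parse_counters(EARLIER)
    print(counter_deltas(first, parse_counters(LATER)))
    print(errors_without_own_counter(first))
```

In the first snapshot the CRC, frame and abort counts add up exactly to the input error total, so the errors there are all framing-level damage rather than buffer exhaustion. If the deltas keep growing while the T1 performance statistics stay at zero, that matches the pattern described above.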

                             

                            The following is the example text from the output of the show interface command for a T-1 interface cross-connected to a frame relay interface. A cross connected PPP interface would show similar output. The output serves as a guide for the following explanations.

                            TDM group 1, line protocol is DOWN

                              Encapsulation FRAME-RELAY IETF (fr 1)

                                0 packets input, 0 bytes, 0 no buffer

                                0 runts, 0 giants, 0 throttles

                                0 input errors, 0 CRC, 0 frame

                                0 abort, 0 discards, 0 overruns

                                0 packets output, 0 bytes, 0 underruns

                             

                            Explanatory note: There is a frame-based communications link from one router to the far-end router that is not the same as the encapsulation protocol. All of the following errors, with the exception of discards and runts, pertain to that protocol.

                             

                            • no buffer - a packet arrived from the framer but there was insufficient buffer space to store it so it was discarded. This is different than "discards" in that it pertains to the communications channel from the NIM to the CPU while "discards" occur after a frame has been received successfully. Input errors are also incremented in this case.
                            • runts - Applicable to Ethernet only. A received Ethernet frame is too small. This error category will not be incremented on non-Ethernet interfaces.
                            • giants - A received frame was too large. Frame length violation.
                            • throttles - The throttle counter is incremented for each packet received and dropped while the unit is congested. A congested unit is one which consistently cannot transmit packets from internal storage as rapidly as packets are received.
                            • CRC (cyclic redundancy check) - A received frame's error detection 32-bit polynomial does not match the payload contents.
                            • frame - A non-octet aligned frame was received. The internal communications link is an octet-based communications protocol. A violation of octet alignment is a serious problem.
                            • abort - Seven consecutive ones were received during frame reception. The link between the framer and the CPU uses a special format that prevents the sending of seven consecutive ones within a frame. This error indicates that either the framer incorrectly encoded the frame or a bit was corrupted from its original value.
                            • discards - A frame was successfully received however the CPU was unable to allocate internal storage to hold the packet and so discards it and increments the counter.
                            • overruns - The receiver overruns, insufficient storage to handle all data received.
                            • underruns - The transmitter underruns, insufficient data present to transmit a packet.
                            • input errors - There are several causes for an input error:
                              • no buffer error on the receive side.
                              • received a fragmented packet from framer. Fragments are rejected.
                              • The PLL detected a missing transition. Without a locked PLL, bits cannot be clocked correctly and the packet is declared in error.
                              • Non-octet aligned frame was received but the protocol requires packets to be on octet boundaries.
                              • A special in-band abort sequence was received on the NIM-to-CPU communications channel. This was either intentionally the far end aborting for a catastrophic reason or a corruption on the link.
                              • CRC error. CRC32 error detection is applied to all data sent over the protocol. This error indicates that the CRC check failed.
                              • Overrun. There was more packet data than buffer to hold that data.
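To illustrate the abort condition in the list above, here is a minimal sketch of HDLC-style bit stuffing. It assumes the "special format" on the framer-to-CPU link behaves like standard HDLC zero-bit insertion, which the post does not state explicitly; the function names are hypothetical.

```python
def bit_stuff(bits: str) -> str:
    """Insert a 0 after every run of five 1s, as HDLC transmitters do."""
    out = []
    ones = 0
    for bit in bits:
        out.append(bit)
        if bit == "1":
            ones += 1
            if ones == 5:
                out.append("0")  # stuffed zero breaks up the run of ones
                ones = 0
        else:
            ones = 0
    return "".join(out)


def contains_abort(bits: str) -> bool:
    """An abort is seven or more consecutive 1s seen inside a frame."""
    return "1111111" in bits


def flip_bit(bits: str, index: int) -> str:
    """Simulate a single bit error on the line."""
    flipped = "0" if bits[index] == "1" else "1"
    return bits[:index] + flipped + bits[index + 1:]


if __name__ == "__main__":
    payload = "1" * 20                 # worst case: all ones
    stuffed = bit_stuff(payload)
    print(contains_abort(stuffed))     # stuffing keeps runs of ones short
    corrupted = flip_bit(stuffed, 5)   # index 5 is the first stuffed zero
    print(contains_abort(corrupted))   # one flipped bit now looks like an abort
```

Correctly stuffed data never carries more than five ones in a row, so a run of seven can only come from a deliberate abort by the sender or from corruption on the path, for example a stuffed zero turned into a one.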


                            Hope this helps,

                            David


                    • Re: Troubleshooting PPP Interface errors
                      david Employee

                      Tetu04,

                       

                      I went ahead and flagged the "Correct Answer" on this post to make it more visible and help other members of the community find solutions more easily. If you don't feel like the answer I marked was correct, feel free to come back to this post and unmark it and select another in its place with the applicable buttons.  If you still need assistance, we would be more than happy to continue working with you on this - just let us know in a reply.

                       

                      Thanks,

                      David


                      • Re: Troubleshooting PPP Interface errors
                        apenichet New Member

                        Hello,

                         

                        I too am having this issue. I have a TA5000 T1 32-port card on one side and a 924e on the customer's side.

                         

                        1. The T1 itself is not taking errors, and when running QRSS patterns from one end to the other it's all clean.

                         

                        2. Here is a sample of the stats:

                          Line Status: -- No Alarms --

                         

                         

                          5 minute input rate 120792 bits/sec, 68 packets/sec

                          5 minute output rate 36080 bits/sec, 20 packets/sec

                          Current Performance Statistics:

                            0 Errored Seconds, 0 Bursty Errored Seconds

                            0 Severely Errored Seconds, 0 Severely Errored Frame Seconds

                            0 Unavailable Seconds, 0 Path Code Violations

                            0 Line Code Violations, 0 Controlled Slip Seconds

                            0 Line Errored Seconds, 0 Degraded Minutes

                         

                         

                          TDM group 1, line protocol is UP

                          Encapsulation PPP (ppp 1)

                            73900221 packets input, 3841744415 bytes, 0 no buffer

                            0 runts, 0 giants, 0 throttles

                            13240 input errors, 3985 CRC, 5835 frame

                            3420 abort, 0 discards, 0 overruns

                            18452346 packets output, 2010944582 bytes, 0 underruns

                         

                        What can we do to make these errors go away?