tetu04
New Contributor III

Troubleshooting PPP Interface errors

Hello all,

One of our customer sites reported some bad-quality calls where clipping was occurring. The call issues persisted after the site rebooted the router in the morning, but a few hours later the issues went away. I engaged the circuit provider to test the circuits and see what they find. The site operates on two bonded T1s and an Adtran 908e. The ISP got back to us saying that they see no issues. Here are the current outputs of the T 0/1 and PPP interfaces:

redland.windsor-kf.ta908e-1#$aces t 0/1 performance-statistics Total-24-hour

  24 Hour Performance Statistics:

    0 Errored Seconds, 0 Bursty Errored Seconds

    0 Severely Errored Seconds, 0 Severely Errored Frame Seconds

    0 Unavailable Seconds, 0 Path Code Violations

    0 Line Code Violations, 0 Controlled Slip Seconds

    0 Line Errored Seconds, 0 Degraded Minutes

T 0/1

TDM group 1, line protocol is UP

  Encapsulation PPP (ppp 1)

    423910 packets input, 116661246 bytes, 0 no buffer

    0 runts, 0 giants, 0 throttles

    714 input errors, 318 CRC, 376 frame

    20 abort, 0 discards, 0 overruns

    553967 packets output, 108136194 bytes, 0 underruns

T 0/2

TDM group 1, line protocol is UP

  Encapsulation PPP (ppp 1)

    423086 packets input, 116680142 bytes, 0 no buffer

    0 runts, 0 giants, 0 throttles

    597 input errors, 282 CRC, 281 frame

    34 abort, 0 discards, 0 overruns

    553882 packets output, 108156042 bytes, 0 underruns

Errors increment constantly. What troubleshooting steps should I take to find the root cause, or the link between these errors and the bad voice quality calls?

Accepted Solutions
Anonymous
Not applicable

Re: Troubleshooting PPP Interface errors

Tetu04,

The key with tdm-group errors is to check whether they are currently incrementing while the T1 itself stays error free.  This appears to be the case in your situation, so it usually means that a frame is coming in that is not a valid HDLC frame.  It could be that part of the HDLC frame sent from the far end got distorted somewhere in the middle.  One way this can happen is if robbed-bit signaling (RBS) is enabled on the T1 circuit somewhere over the span.  That would explain why there are zero T-1 errors while the layer two frames (PPP or HDLC) get mangled traversing the path. You can attempt to run a QRSS pattern head to head (from both sites) and see if errors are reported in one direction, though this is typically only possible if you control/manage both ends of the layer two link.
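
As a rough illustration of that check, the sketch below compares two snapshots of the counters and reports whether the tdm-group errors grew while the T1 performance statistics stayed flat. The first tdm-group snapshot is the T 0/1 output from the original post; the second uses the figures the original poster reported later in the thread, and treating both as the same interface is an assumption. The function and dictionary names are hypothetical helpers, not anything provided by AOS.

```python
def counter_delta(earlier, later):
    """Per-counter increase between two snapshots with matching keys."""
    return {name: later[name] - earlier[name] for name in earlier if name in later}


def errors_only_at_layer2(t1_delta, tdm_delta):
    """True when tdm-group counters grew but every T1 counter stayed flat."""
    tdm_growing = any(change > 0 for change in tdm_delta.values())
    t1_flat = all(change == 0 for change in t1_delta.values())
    return tdm_growing and t1_flat


# tdm-group counters: original post, then the later follow-up
# (assumed here to describe the same interface).
tdm_first = {"input errors": 714, "CRC": 318, "frame": 376}
tdm_later = {"input errors": 5273, "CRC": 1751, "frame": 3499}

# T1 performance statistics: reported as zero both times.
t1_first = {"errored seconds": 0, "path code violations": 0, "line code violations": 0}
t1_later = dict(t1_first)

tdm_delta = counter_delta(tdm_first, tdm_later)
t1_delta = counter_delta(t1_first, t1_later)

print(tdm_delta)
print("layer 2 only:", errors_only_at_layer2(t1_delta, tdm_delta))
```

If the result is true, the physical T1 is probably not where the damage is being detected, which matches the situation described above.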

The following is the example text from the output of the show interface command for a T-1 interface cross-connected to a frame relay interface. A cross connected PPP interface would show similar output. The output serves as a guide for the following explanations.

TDM group 1, line protocol is DOWN

  Encapsulation FRAME-RELAY IETF (fr 1)

    0 packets input, 0 bytes, 0 no buffer

    0 runts, 0 giants, 0 throttles

    0 input errors, 0 CRC, 0 frame

    0 abort, 0 discards, 0 overruns

    0 packets output, 0 bytes, 0 underruns

Explanatory note: There is a frame-based communications link from one router to the far-end router that is not the same as the encapsulation protocol. All of the following errors, with the exception of discards and runts, pertain to that protocol.

  • no buffer - a packet arrived from the framer but there was insufficient buffer space to store it so it was discarded. This is different than "discards" in that it pertains to the communications channel from the NIM to the CPU while "discards" occur after a frame has been received successfully. Input errors are also incremented in this case.
  • runts - Applicable to Ethernet only. A received Ethernet frame is too small. This error category will not be incremented on non-Ethernet interfaces.
  • giants - A received frame was too large. Frame length violation.
  • throttles - The throttle counter is incremented for each packet received and dropped while the unit is congested. A congested unit is one which consistently cannot transmit packets from internal storage as rapidly as packets are received.
  • CRC (cyclic redundancy check) - A received frame's error detection 32-bit polynomial does not match the payload contents.
  • frame - A non-octet aligned frame was received. The internal communications link is an octet-based communications protocol, so a violation of octet alignment is a serious problem.
  • abort - Seven consecutive ones were received during frame reception. The link between the framer and the CPU uses a special format that prevents the sending of seven consecutive ones within a frame. This error indicates that either the framer incorrectly encoded the frame or a bit was changed from its original value in transit.
  • discards - A frame was received successfully, but the CPU was unable to allocate internal storage to hold the packet, so it discarded the packet and incremented the counter.
  • overruns - The receiver overruns, insufficient storage to handle all data received.
  • underruns - The transmitter underruns, insufficient data present to transmit a packet.
  • input errors - There are several causes for an input error:
    • no buffer error on the receive side.
    • received a fragmented packet from framer. Fragments are rejected.
    • The PLL detected a missing transition. Without a locked PLL, bits cannot be clocked correctly and the packet is declared in error.
    • Non-octet aligned frame was received but the protocol requires packets to be on octet boundaries.
    • A special in-band abort sequence was received on the NIM-to-CPU communications channel. This was either intentionally the far end aborting for a catastrophic reason or a corruption on the link.
    • CRC error. CRC32 error detection is applied to all data sent over the protocol. This error indicates that the CRC check failed.
    • Overrun. There was more packet data than buffer to hold that data.


Hope this helps,

David


8 Replies
jayh
Honored Contributor

Re: Troubleshooting PPP Interface errors

How are the T1 interfaces clocked?

What is the cable length and type between the smartjack and the router?

Please run the following command and look at the output:

show system

  We're looking for the primary and secondary clock source.

  In 99.9% of cases this should be t1 0/1 and t1 0/2 respectively.

  If not, configure the following:

  timing-source t1 0/1

  timing-source t1 0/2 secondary


If the clocking is OK, then post the results of:

show int t1 0/1 performance-statistics

show int t1 0/2 performance-statistics


Could other traffic be congesting the link?  Is the poor audio inbound or outbound or both, and is QoS properly configured?  Is your carrier applying QoS inbound?

Call a milliwatt test tone line if you know of a local one and perform a speed test while the call is up.  Does the tone break up during the download test? 

tetu04
New Contributor III

Re: Troubleshooting PPP Interface errors

The clocking settings are OK. When issuing the performance-statistics command I see some sporadic occurrences of "1 errored seconds", "1 severely errored frame seconds", "1 controlled slip seconds" or "1 path code violations". However, the number never exceeds 1. The poor audio is only on the inbound, whether on internal VoIP calls or calls from external callers on their cellphones. The external caller hears the conversation fine; the clipping occurred only on the internal VoIP phone. The site has a 3Mb pipe, with 12 VoIP phones and 4 analog. Overutilization was never an issue, and these quality problems only appeared a few days ago. I'm hoping to correlate the incrementing layer 2 errors on the PPP interfaces with the clipped calls the customer described.

jayh
Honored Contributor

Re: Troubleshooting PPP Interface errors

The slip seconds point to a clocking issue.  A handful of errors per day isn't particularly unusual on T1 circuits that traverse outside telephone plant.

Check the LAN side for duplex issues on switch ports as well.  If only on the inbound, it could be overutilization, which can be bursty.  Turn on VQM and look at MOS scores and reason for degradation. 

tetu04
New Contributor III

Re: Troubleshooting PPP Interface errors

I engaged the site's LAN team to verify their equipment. Their PPP interfaces are still incrementing input errors. Right now they are sitting at "5273 input errors, 1751 CRC, 3499 frame". Is the cause for this still internal if the carrier says their circuit is clean and "sh int t 0/1 performance-statistics total" shows no errors?

jayh
Honored Contributor

Re: Troubleshooting PPP Interface errors

A slight incrementing of input errors on long-haul WAN circuits is somewhat to be expected.  T1 circuits pass a little over 1.5 million bits every second.  If one per billion bits has an error, that's a bit error rate of 10^-9, pretty good but it's one error about every eleven minutes so they will rack up over time.  That's why the 15-minute performance display is useful, you can see if it was a burst of errors.
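
The arithmetic behind that estimate can be checked with a few lines. This is only a sketch: it uses the 1.544 Mb/s DS1 line rate and assumes bit errors are spread evenly over time, which real circuits rarely do. The function name is a hypothetical helper.

```python
T1_LINE_RATE_BPS = 1_544_000  # DS1 line rate, framing bits included


def seconds_between_errors(ber, bit_rate_bps=T1_LINE_RATE_BPS):
    """Average spacing of bit errors, assuming they are evenly distributed."""
    return 1 / (ber * bit_rate_bps)


spacing_s = seconds_between_errors(1e-9)
minutes = spacing_s / 60
errors_per_day = 86_400 / spacing_s

print("one bit error roughly every %.1f minutes" % minutes)
print("roughly %.0f bit errors per day" % errors_per_day)
```

At a bit error rate of 10^-9 this works out to a little under eleven minutes between errors, or somewhat over a hundred per day on each T1, so counters in the thousands over a short period point to something worse than background noise.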

Some other things to check:

  • Triple-check your clocking.  Normally you'll set your system-timing to T1 0/1 with a secondary of T1 0/2.
  • When the carrier says it's clean, are they just looking at the errors on the CO side, or did they take the loop out of service and run a full loopback test to the smartjack?  In order to be sure they'll need to loop to the smartjack and run stress pattern tests which can mean several minutes of downtime.
  • Triple-check the wiring between the smartjack and the TA900.  Good quality cable, no splices, twisted pair.
  • Is the TA900 grounded?  Ideally it should be grounded to the same point as telco's smartjack.
Anonymous
Not applicable

Re: Troubleshooting PPP Interface errors

Tetu04,

I went ahead and flagged the "Correct Answer" on this post to make it more visible and help other members of the community find solutions more easily. If you don't feel like the answer I marked was correct, feel free to come back to this post and unmark it and select another in its place with the applicable buttons.  If you still need assistance, we would be more than happy to continue working with you on this - just let us know in a reply.

Thanks,

David


apenichet
New Contributor III

Re: Troubleshooting PPP Interface errors

Hello,

I too am having this issue. I have a TA5000 T1 32 port card on one side and a 924e at the customer's side.

1. The T1 itself is not taking errors, and when running QRSS patterns from one end to the other it's all clean.

2. Here is a sample of the stats:

  Line Status: -- No Alarms --

  5 minute input rate 120792 bits/sec, 68 packets/sec

  5 minute output rate 36080 bits/sec, 20 packets/sec

  Current Performance Statistics:

    0 Errored Seconds, 0 Bursty Errored Seconds

    0 Severely Errored Seconds, 0 Severely Errored Frame Seconds

    0 Unavailable Seconds, 0 Path Code Violations

    0 Line Code Violations, 0 Controlled Slip Seconds

    0 Line Errored Seconds, 0 Degraded Minutes

  TDM group 1, line protocol is UP

  Encapsulation PPP (ppp 1)

    73900221 packets input, 3841744415 bytes, 0 no buffer

    0 runts, 0 giants, 0 throttles

    13240 input errors, 3985 CRC, 5835 frame

    3420 abort, 0 discards, 0 overruns

    18452346 packets output, 2010944582 bytes, 0 underruns

What can we do to make these errors go away?