Silicon Exposed: TDR updates

A few months ago, I wrote about a project I had been thinking of for a while but not had time to work on: a time-domain reflectometer (TDR) for testing twisted pair Ethernet cables.

TDR background

The basic theory of operation is simple: Send a pulse down a transmission line and measure the reflected voltage over time to get a plot of impedance discontinuities over time. Unfortunately, doing this with sufficient temporal resolution (sub-nanosecond) requires extremely high analog sampling rates, and GHz A/D converters are (to say the least) not cheap: the least expensive 1 GSA/s ADC on Digi-Key is the HMCAD1520, which sells for $120 each at the time of this writing. Higher sampling rates cost even more, the 1.5 GSa/s ADC081500CIYB is listed at $347.

One possible architecture would consist of a pre-amplifier for each channel, a 4:1 RF mux, and a single high-speed ADC sampled by an FPGA. This would work, but seemed quite expensive and I wanted to explore lower-cost options.

ADC architecture

After thinking about the problem for a while, I realized that the single most expensive component in a classical TDR was probably the ADC - but there was no easy way to make it cheaper. What if I could eliminate the ADC entirely?

I drew inspiration from the successive-approximation-register (SAR) ADC architecture, which essentially converts a DAC into an ADC by binary searching. The basic operating algorithm is as follows:

For each point T in time

vstart = 0
vend = Vref
Set DAC to (Vstart + Vend) / 2
Compare Vin against Vdac
If Vin > Vdac

Set current bit of sample to 1
Set Vstart to Vdac, update ADC, repeat

else

Set current bit of sample to 0
Set Vend to Vdac, update ADC, repeat

The problem with SAR for high speeds is that N-bit analog resolution at M samples per second requires a DAC that can run at O(M log N) samples per second - hardly an improvement!

In order to work around this problem, I began to think about ways to represent the data generated by a SAR ADC. I ended up modeling a simplified SAR ADC which performed a linear, rather than binary, search. We can represent the intermediate data as a matrix of 2^N rows by M columns, one for each of M data points.

The sampling algorithm for this simplified ADC works as follows:

For each point T in time

Set DAC to 0
Compare Vin against Vdac

If Vin > Vdac

Set column[Vdac] to 1
Increment Vdac
Repeat comparison

Otherwise stop and capture the next sample

Once we have this matrix, we can simply sum the number of 1s in each column to calculate the corresponding sample value.

While this approach will clearly work, it is exponentially slower than the conventional SAR ADC since it requires 2^N samples instead, instead of N, for N-bit precision. So why is it useful?

Now consider what happens if we acquire the data from a transposed version of the same matrix:

For each Vdac from 0 to Vref

For each point T in time

Compare Vin against Vdac

If Vin > Vdac

Set row Vin of column T to 1

Increment Vref
Go back in time and loop over the signal again

This version clearly captures the same data, since matrix[T][V] is set to true iff sample T is less than V. We simply switch the inner and outer loops.

It also has a very interesting property for cost optimization: Since it only updates the DAC after sampling the entire signal, we can now use a much slower (and cheaper) DAC than with a conventional SAR. In addition, the comparator can now update at the sampling frequency instead of 2^N times the sampling frequency.

There's just one problem: It requires time travel! Why are we wasting our time analyzing a circuit that can't actually be built?

Well, as it turns out we can solve this problem too - with "parallel universes". Since the impedance of the cable is (hopefully) fairly constant over time, if we send multiple pulses they should return identical reflection waveforms. We can thus send out a pulse, test one candidate Vdac value against this waveform, then increment Vdac and repeat.

The end result is that with a cheap SPI DAC, a high-speed comparator, and an FPGA with a high-speed digital input we can digitize a repetitive signal to arbitrary analog precision, with sampling rate limited only by comparator bandwidth and FPGA input buffer performance!

Pulse generation

The first step in any TDR, of course, is to generate the pulse.

I spent a while looking over FPGAs and ended up deciding on the Xilinx Kintex-7 series, specifically the XC7K70T. The -1 speed grade can do 1.25 Gbps LVDS in the high-performance I/O banks (matching the -2 and -3 speed Artix-7 devices) and the higher speed grades can go up to 1.6 Gbps.

The pulse is generated by using the OSERDES of the FPGA to produce a single-cycle LVDS 1 followed by a long series of 0s. The resulting LVDS pulse is fed into a Micrel SY58603U LVDS-to-CML buffer. This slightly increases the amplitude of the output pulse and sharpens the rise time up to 80ps.

The resulting pulse is then sent through the RJ45 connector onto the cable being tested.

Output buffer

Input preamplifier

The reflected signal coming off the differential pair is AC coupled with a pair of capacitors to prevent bus fights between the (unequal) common-mode voltages of the output buffer and the preamplifier. It is then fed into a LMH6881 programmable-gain preamplifier.

This is by far the most pricey analog component I've used in a design: nearly $10 for a single amplifier. But it's a very nice amplifier (made on a SiGe BiCMOS process)- very high linearity, 2.4 GHz bandwidth, and gain from 6 to 26 dB programmable over SPI in 0.25 dB steps.

Input preamplifier

The optional external terminator (R25) is intended to damp out any reflections coming off of the preamplifier if they present a problem; during the initial assembly I plan to leave it unpopulated. Since this is my first high-speed mixed signal design I'm trying to make it easy to tweak if I screwed up somehow :)

The output of the preamplifier is a differential signal with 2.5V common mode offset.

Differential to single-ended conversion

The next step is to convert the differential voltage into a single-ended voltage that we can feed into the comparator. I use an AD8045 unity gain voltage feedback amplifier for this, configured to compute CH1_VDIFF = (CH1_BUF_P - CH1_BUF_N) + 2.5V.

Digitization

The single-ended voltage is compared against the DAC output (AFE_VREF) using one half of an LMH7322 dual comparator.

The output supply of the comparator is driven by a 2.5V supply to produce LVDS-compatible differential output voltage levels.

PCB layout

The board was laid out in KiCAD using the new push-and-shove router. All of the differential pairs were manually length-matched to 0.1mm or better.

The upper left corner of the board contains four copies of the AFE. The AD8045s are on the underside of the board because the pinout made routing easier this way. Hopefully the impedance discontinuities from the vias won't matter at these signal speeds...

AFE layout, front side

AFE layout, back side

The rest of the board isn't nearly as complex: the lower left has a second RJ45 connector and a RGMII PHY for interfacing to the host PC, the power supply is at the upper right, and the FPGA is bottom center.

The power supply is divided into two regions, digital and analog. The digital supply is on the far right side of the board, safely isolated from the AFE. It uses an LTC3374 to generate an 1.0V 4A rail for the FPGA core, a 1.2V 2A rail for the FPGA transceivers and Ethernet PHY, a 1.8V 1A rail for digital I/O, and a 2.5V rail for the CML buffers and Ethernet analog logic.

The analog supply was fairly close to the AFE so I put a guard ring around it just to be safe. It consists of a LTC3122 boost converter to push the 5V nominal input voltage up to 6V, followed by a 5V LDO to give a nice clean analog rail. I ran the output of the LDO through a pi filter just to be extra safe.

The TDR subsystem didn't use any of the four 6.6 Gbps GTX serial transceivers on the FPGA because they are designed to recover their clock from the incoming signal and don't seem to support use of an external reference clock. It seemed a shame to waste them, though, so I broke them (as well as 20 0.95 Gbps LVDS channels) out to a Samtec Q-strip header for use as high-speed GPIO.

Without further ado, here's the full layout. I could have made the board quite a bit smaller in the vertical axis but I needed to keep a constant 100mm high so it would fit in the card guides on my Eurocard rack.

The board is at fabs for quotes now and I'll make another post once the boards come back.