Tuning the Gateware from Python
This section describes how to configure the gateware once the FPGA has been programmed. Configuration consists of writing to a sequence of registers and BRAMs to:
- Synchronize the logic
- Select requantization depth (4-bit / 1-bit)
- Set FFT parameters
- Set the cross correlation parameters
- Select frequency channels
- Autotune the digital gain coefficients (4-bit mode only)
The Design Overview section documents how the gateware is designed in Simulink. If you are unfamiliar with gateware design, or find this section confusing, you may want to read that section first.
Reading and Writing to Programmable Registers
In Python, the CasperFpga class lets you interface with named FPGA registers. The gateware needs to be tuned and configured according to the user's needs, such as which channels to pick and which bit mode to select. This configuration happens in five steps.
- Setup. TODO
- Set the channel order. Re-order the frequency channels so that the UDP payload packetizer selects the correct channels. For example, to select only channels 120:136 you must re-order the channels so that 120:136 occur at the beginning of each frame. For a deeper explanation of why this is needed, see the packetizer section.
- Optionally set the 4-bit coefficients.
- Tune. TODO
- Optionally update the 4-bit coefficients based on on-board correlator (OBC) data. TODO
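As a minimal sketch of the register interface described above: the methods documented in this section call write_int and read_uint on attributes of cfpga.registers. The helper write_and_readback below is hypothetical (it is not part of sparrow_albatros), and FakeRegister is only a stand-in so that the snippet runs without hardware.

```python
from types import SimpleNamespace


class FakeRegister:
    """Stand-in for one named register; real ones live on cfpga.registers."""

    def __init__(self):
        self._value = 0

    def write_int(self, value):
        self._value = int(value)

    def read_uint(self):
        return self._value & 0xFFFFFFFF


def write_and_readback(cfpga, name, value):
    """Hypothetical helper: write `value` to register `name`, then verify it."""
    reg = getattr(cfpga.registers, name)
    reg.write_int(value)
    readback = reg.read_uint()
    if readback != value:
        raise RuntimeError(f"{name}: wrote {value}, read back {readback}")
    return readback


# Fake FPGA object with the same attribute layout that tune() relies on
fake_cfpga = SimpleNamespace(registers=SimpleNamespace(acc_len=FakeRegister()))
result = write_and_readback(fake_cfpga, "acc_len", 1000)
```

With real hardware, the same helper would presumably be passed the digitizer's cfpga attribute instead of the fake object.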
Setup
TODO: write this section...
sparrow_albatros.AlbatrosDigitizer.setup()
Source code in software/sparrow_albatros.py
def setup(self):
    self.logger.info("Programming FPGA")
    self.program_fpga()
    fpga_clock_mhz = self.cfpga.estimate_fpga_clock()
    self.logger.info(f"Estimated FPGA clock is {fpga_clock_mhz:.2f}")
    self.logger.info("Initializing ADCs")
    self.initialize_adc()
Set channel order
TODO: write this section...
sparrow_albatros.AlbatrosDigitizer.set_channel_order(channels, bits)
Sets the firmware channels
Source code in software/sparrow_albatros.py
def set_channel_order(self, channels, bits):
    """Sets the firmware channels"""
    # hard coded names of brams
    if bits == 1:
        channel_map = "one_bit_reorder_map1"
    elif bits == 4:
        channel_map = "four_bit_reorder_map1"
    else:
        raise ValueError(f"Bits must be 1 or 4, not {bits}")
    self.cfpga.write(channel_map, channels.astype(">H").tobytes(), offset=0)
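The following sketch shows how a channel array could be prepared for set_channel_order. The byte packing (big-endian unsigned 16-bit) is taken from the method above. The helper selected_first is hypothetical: it assumes the reorder map expects a full permutation of 2048 channels with the selected ones first, which this section does not confirm.

```python
import numpy as np


def pack_channels(channels):
    """Pack channel indices as big-endian unsigned 16-bit integers."""
    return np.asarray(channels).astype(">H").tobytes()


def selected_first(selected, nchan=2048):
    """Hypothetical helper: permutation of range(nchan) with `selected` first."""
    selected = np.asarray(selected)
    rest = np.setdiff1d(np.arange(nchan), selected)  # remaining channels, sorted
    return np.concatenate([selected, rest])


chans = np.arange(120, 136)      # the 16 channels 120..135
order = selected_first(chans)    # 120..135 first, then every other channel
payload = pack_channels(order)   # same packing as in set_channel_order
```

Each channel index occupies two bytes, so the payload for 2048 channels is 4096 bytes long.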
Tuning registers
TODO: write this section...
sparrow_albatros.AlbatrosDigitizer.tune(ref_clock, fftshift, acc_len, dest_ip, dest_prt, spectra_per_packet, bytes_per_spectrum, bits, dest_mac=0)
This method "tunes" the FPGA Gateware's input registers to suit the
user's needs.
- Assumes fpga has been programmed and cfpga is running.
- Sets values in FPGA's programmable Registers and BRAMs.
- Basic sanity checks of FPGA output values, e.g. FFT overflows.
Parameters:
- ref_clock (float): Reference clock in MHz.
- fftshift (int): FFT shift schedule. This int is reinterpreted as a sequence of bits; the 12 LSBs define the shift schedule, with 1/0 for on/off.*
- acc_len (int): Number of spectra accumulated to integrate correlations.*
- dest_ip (str): IP address to send packets to. The input is an IPv4 string following the convention "x.x.x.x", e.g. "192.168.0.1". This IP address is reinterpreted as an int and that value is written to a register on the FPGA.
- dest_prt (int): Destination port number.*
- spectra_per_packet (int): Number of spectra to include in each UDP packet.*
- bytes_per_spectrum (int): Number of bytes in one quantized spectrum.*
- bits (int): Number of bits per real/imaginary component after requantization. Takes values 1 or 4.
- dest_mac (int, default: 0): Not yet implemented. Configure the destination MAC address.
* This parameter's value gets written to a register on the FPGA.
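To illustrate how dest_ip and fftshift end up as integers, here is a minimal sketch. The tune method calls a helper named str2ip whose source is not shown in this section; ip_to_int below is a hypothetical equivalent that assumes network (big-endian) byte order.

```python
import socket
import struct


def ip_to_int(ip):
    """Convert a dotted-quad IPv4 string to its 32-bit integer value."""
    return struct.unpack("!I", socket.inet_aton(ip))[0]


dest_ip_int = ip_to_int("192.168.0.1")   # 0xC0A80001

# Only the 12 LSBs of fftshift are used; here all 12 schedule bits are on
fftshift = 0b111111111111
fftshift_bits = format(fftshift & 0xFFF, "012b")
```

Masking with 0xFFF before printing makes it easy to see exactly which bits the gateware would use.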
Source code in software/sparrow_albatros.py
def tune(self, ref_clock, fftshift, acc_len, dest_ip,
         dest_prt, spectra_per_packet, bytes_per_spectrum,
         bits, dest_mac:int=0):
    """
    This method "tunes" the FPGA Gateware's input registers to suit the
    user's needs.
    - Assumes fpga has been programmed and cfpga is running.
    - Sets values in FPGA's programmable Registers and BRAMs.
    - Basic sanity checks of FPGA output values, e.g. FFT overflows.
    Args:
        ref_clock (float): Reference clock in MHz
        fftshift (int): FFT shift schedule. This int is re-interpreted as a
            sequence of bits, the 12 LSBs are used to define the shift
            schedule. Do 1/0 for on/off.\*
        acc_len (int): Number of spectra accumulated to integrate correlations.\*
        dest_ip (str): IP address to send packets to. The input is an IPV4
            string following the convention "x.x.x.x", e.g. "192.168.0.1".
            This IP address is reinterpreted as an int and that value is
            written to a register on the FPGA.
        dest_prt (int): Destination port number.\*
        spectra_per_packet (int): Number of spectra to include in each UDP
            packet.\*
        bytes_per_spectrum (int): Number of bytes in one quantized spectrum.\*
        bits (int): Number of bits per real/imaginary component after
            requantization. Takes values 1 or 4.
        dest_mac (int): *Not yet implemented.* Configure the destination MAC
            address.
    \* This parameter's value gets written to a register on the FPGA.
    """
    MTU=1500 # max number of bytes in a packet
    assert spectra_per_packet < (1<<5), "spec-per-pack too large for slice, aborting"
    assert spectra_per_packet * bytes_per_spectrum <= MTU-8, "Packets too large, will cause fragmentation"
    assert bits in (1,4), f"Baseband requantization mode must be 1 or 4, not {bits}"
    # Assume bitstream already uploaded, data in self.cfpga
    # Assume ADCs already initialized including that get_system_information...
    # Inherit adc's logger level
    self.adc.ref = ref_clock # Set reference clock for ADC
    # ADC calibration assumed already aligned (?)
    # Need to set the ADC gain?
    # Get info from and set registers
    self.logger.info(f"FPGA clock: {self.cfpga.estimate_fpga_clock():.2f}")
    self.logger.info(f"Set FFT shift schedule to {fftshift:b}")
    self.cfpga.registers.pfb_fft_shift.write_int(fftshift)
    self.logger.info(f"Set correlator accumulation length to {acc_len}")
    self.cfpga.registers.acc_len.write_int(acc_len)
    self.logger.info("Reset GBE (UDP packetizer)")
    self.cfpga.registers.gbe_rst.write_int(1)
    self.cfpga.registers.gbe_en.write_int(0)
    self.logger.info(f"Set spectra-per-packet to {spectra_per_packet}")
    self.cfpga.registers.packetiser_spectra_per_packet.write_int(spectra_per_packet)
    self.logger.info(f"Set bytes-per-spectrum to {bytes_per_spectrum}")
    self.cfpga.registers.packetiser_bytes_per_spectrum.write_int(bytes_per_spectrum)
    self.logger.info(f"Set quantization bit mode to {bits}-bits")
    if bits==1: self.cfpga.registers.sel.write_int(0)
    elif bits==4: self.cfpga.registers.sel.write_int(1)
    self.logger.info(f"NOT YET IMPLEMENTED: Setting destination MAC address to {dest_mac}")
    # TODO: set destination MAC address
    self.logger.info(f"Set destination IP address and port to {dest_ip}:{dest_prt}")
    self.cfpga.registers.dest_ip.write_int(str2ip(dest_ip))
    self.cfpga.registers.dest_prt.write_int(dest_prt)
    # Do we need to set mac address?
    #time.sleep(1.1) # dogmatically wait for regs to set before sending sync pulse
    self.sync_pulse()
    fft_of_count = self.cfpga.registers.fft_of_count.read_uint()
    if fft_of_count != 0:
        self.logger.warning(f"FFT overflowing: count={fft_of_count}")
    else:
        self.logger.info(f"No FFT overflows detected")
    self.logger.info("Enabling 1 GbE output")
    self.cfpga.registers.gbe_en.write_int(1)
    #self.logger.info("Leaving GBE reset high; pull it down manually once you think the negotiation has happened well!")
    self.cfpga.registers.gbe_rst.write_int(0)
    gbe_overflow = self.cfpga.registers.tx_of_cnt.read_uint()
    if gbe_overflow:
        self.logger.warning(f"GbE transmit overflowing: count={gbe_overflow}")
    else:
        self.logger.info("No GbE overflows detected")
    self.logger.info("Setup and tuning complete")
    return
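The first two assertions in tune constrain spectra_per_packet: it must be smaller than 2^5, and the payload must fit within the MTU minus 8 bytes. As a sketch, the hypothetical helper below (not part of sparrow_albatros) returns the largest value that passes both checks for a given bytes_per_spectrum.

```python
MTU = 1500          # max number of bytes in a packet, as in tune()
MARGIN_BYTES = 8    # the margin that tune() subtracts from the MTU


def max_spectra_per_packet(bytes_per_spectrum, mtu=MTU):
    """Largest spectra_per_packet that passes both size assertions in tune()."""
    if bytes_per_spectrum <= 0:
        raise ValueError("bytes_per_spectrum must be positive")
    by_size = (mtu - MARGIN_BYTES) // bytes_per_spectrum  # payload must fit
    by_field = (1 << 5) - 1                               # must be < 2^5
    return min(by_size, by_field)


spp_64 = max_spectra_per_packet(64)   # limited by the packet size
spp_32 = max_spectra_per_packet(32)   # limited by the 5-bit field
```

A return value of 0 means that not even one spectrum of that size fits in a packet.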
Tuning the 4-bit digital gain coefficients (special case)
TODO: write this section... here's an outline:
- To avoid an iterative procedure of capturing 4-bit data and then increasing or decreasing the digital gain coefficients on each channel (which may take very long to converge, or never converge, in the presence of intermittent RFI), we set the 4-bit gain coefficients in one shot from a power reading taken by the on-board correlator.
- Tune the gateware so that it collects data as you would like it to.
- Wait for the on-board correlator's accumulator to fill up.
- Read the auto-correlation power in each channel to estimate the optimal digital gain coefficient.
- Write the 4-bit digital gain coefficients.
Implementing this requires some bookkeeping.
Bookkeeping with bit-growth
See also the release notes of the bit-growth FFT implementation.
Fix_24_23 values come out of the FFT; the real and imaginary parts are bussed together and interpreted as a UFix_48. The power block outputs a UFix_42_40, which is a shifted squared value. We drop the six LSBs to make room for MSBs in the 64-bit accumulator.

The number of spectra accumulated in the auto-correlator is denoted acc_len or \(L\). The best estimate for the STD of each real and imaginary component is
\[
\sigma = \sqrt{\frac{P}{2 \cdot 2^{40} \cdot L}}
\]
where \(P\) is the power read from the autocorr BRAM, interpreted as a UInt64. The four-bit digital gains must be set so that the quantization interval \(\Delta\) equals 0.293 times the STD (see the Optimal Digital Gains section). In this implementation the quantization interval stays fixed at 1/8, and the digital gains scale the signals instead.

We must therefore multiply the signals by a gain factor, \(g\), such that
\[
g \cdot \sigma = \frac{1/8}{0.293} \Rightarrow g = \frac{1/8}{0.293 \cdot \sigma}.
\]
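As a numerical sanity check of the two formulas above, the sketch below simulates Gaussian real and imaginary components, builds an accumulated power value carrying 40 fractional bits, and recovers \(\sigma\) and \(g\). The values of sigma_true and L are arbitrary assumptions, and the simulation ignores any truncation that happens in the gateware.

```python
import numpy as np

QUANT_DELTA = 1 / 8      # fixed quantization interval
OPTIMAL_RATIO = 0.293    # optimal interval in units of the STD

rng = np.random.default_rng(0)
sigma_true = 0.05        # assumed STD of each real/imaginary component
L = 4096                 # assumed number of accumulated spectra

re = rng.normal(0.0, sigma_true, L)
im = rng.normal(0.0, sigma_true, L)

# Accumulated power carrying 40 fractional bits, read as an unsigned integer
P = int(np.sum(re**2 + im**2) * (1 << 40))

sigma_est = np.sqrt(P / (2 * (1 << 40) * L))    # STD estimate from P
g = QUANT_DELTA / (OPTIMAL_RATIO * sigma_est)   # gain factor

# After the gain is applied, the interval should be about 0.293 scaled STDs
ratio = QUANT_DELTA / (g * sigma_true)
```

With a few thousand accumulated spectra, the estimate typically lands within about a percent of the true STD.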
The gain factor \(g\) is written to BRAM as a UInt_32 but is reinterpreted as a Fix_32_17, so we need to multiply \(g\) by \(2^{17}\) before writing it. The following is a code snippet for computing the optimal gain.
pol00 = read_pols(['pol00','pol11'])['pol00'] / (1<<40) # Each freq channel has different power
stds0 = np.sqrt(pol00 / (2 * acc_len)) # ...and therefore a different STD
gs = (1/8) / (0.293 * stds0) # Calculate the digital gains
gs *= (1<<17) # Multiply by 2^17 for packaging
gs[np.where(gs > (1<<31)-1)] = (1<<31)-1 # Clip the gains so that they fit
# Now the `gs` array is ready for writing to BRAM
...as implemented in
sparrow_albatros.AlbatrosDigitizer.get_optimal_coeffs_from_acc(chans)
Reads accumulator to set 4-bit digital gain coefficients
Assumes fpga is setup and well tuned.
self : AlbatrosDigitizer object for reading the acc
chans : numpy integer array can be used to index accumulator pols
Source code in software/sparrow_albatros.py
def get_optimal_coeffs_from_acc(self, chans):
    """Reads accumulator to set 4-bit digital gain coefficients
    Assumes fpga is setup and well tuned.
    self : AlbatrosDigitizer object for reading the acc
    chans : numpy integer array can be used to index accumulator pols"""
    # for the same channel, we want to apply same digital gain to each pol
    quant4_delta = 1/8 # 0.125 is the quantization delta for 4-bit signed 4_3 as on fpga
    # clips at plus/minus 0.875
    quant4_optimal = 0.293 # optimal 15-level quantization delta for gaussian with std=1
    _pols = self.read_pols(['pol00','pol11'])
    acc_len = self.cfpga.registers.acc_len.read_uint() # number of spectra to accumulate
    OPT_COEFF_MODE="bitgrowth"
    if OPT_COEFF_MODE=="legacy":
        # these are read as int64 but they are in fact 64_35 for autocorr and 64_34 for xcorr
        pol00,pol11 = _pols['pol00'] / (1<<35), _pols['pol11'] / (1<<35)
        pol00_stds = np.sqrt(pol00 / (2*acc_len)) # stds of re or imaginary parts
        pol11_stds = np.sqrt(pol11 / (2*acc_len))
        coeffs_pol0 = np.zeros(2048) # hard coded num of chans as 2048
        coeffs_pol1 = np.zeros(2048) # hard coded num of chans as 2048
        coeffs_pol0[chans] = quant4_delta / (pol00_stds[chans] * quant4_optimal)
        coeffs_pol1[chans] = quant4_delta / (pol11_stds[chans] * quant4_optimal)
        coeffs_pol0[chans] *= (1<<17) # bram is re-interpreted as ufix 32_17
        coeffs_pol1[chans] *= (1<<17) # bram is re-interpreted as ufix 32_17
    elif OPT_COEFF_MODE=="bitgrowth":
        pol00,pol11 = _pols['pol00'] / (1<<40), _pols['pol11'] / (1<<40)
        pol00_stds = np.sqrt(pol00 / (2*acc_len))
        pol11_stds = np.sqrt(pol11 / (2*acc_len))
        g0 = (1/8) / (0.293 * pol00_stds) # optimal gain coefficients
        g1 = (1/8) / (0.293 * pol11_stds) # optimal gain coefficients
        coeffs_pol0 = np.zeros(2048) # hard coded num of chans as 2048
        coeffs_pol1 = np.zeros(2048) # hard coded num of chans as 2048
        coeffs_pol0[chans] = g0[chans] * (1<<15) * 1.414 # fudge factor sqrt 2
        coeffs_pol1[chans] = g1[chans] * (1<<15) * 1.414 # fudge factor sqrt 2
        # not sure where missing factor of two comes from
        # (sets stds to roughly 2.83 [plus-minus systematic 0.05])
    coeffs_pol0[coeffs_pol0 > (1<<31)-1] = (1<<31)-1 # clip coeffs at max signed-int value
    coeffs_pol1[coeffs_pol1 > (1<<31)-1] = (1<<31)-1 # clip coeffs at max signed-int value
    coeffs_pol0 = np.array(coeffs_pol0 + 0.5, dtype='>I')
    coeffs_pol1 = np.array(coeffs_pol1 + 0.5, dtype='>I')
    return coeffs_pol0,coeffs_pol1
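The final packaging step (scale, clip, round, store as big-endian unsigned 32-bit) can be checked in isolation. The sketch below follows the \(2^{17}\) scaling described in the text; note that the bitgrowth branch of the method above uses a different scale factor. The helpers pack_gains and unpack_gains are hypothetical names, not part of sparrow_albatros.

```python
import numpy as np

FRAC_BITS = 17               # fractional bits of Fix_32_17, per the text
MAX_COEFF = (1 << 31) - 1    # clipping value used by the method above


def pack_gains(gains, frac_bits=FRAC_BITS):
    """Scale, clip and round gains to big-endian unsigned 32-bit integers."""
    scaled = np.asarray(gains, dtype=float) * (1 << frac_bits)
    scaled = np.clip(scaled, 0, MAX_COEFF)
    return np.array(scaled + 0.5, dtype=">I")  # add 0.5, then truncate


def unpack_gains(coeffs, frac_bits=FRAC_BITS):
    """Interpret the stored integers as fixed-point values again."""
    return coeffs.astype(float) / (1 << frac_bits)


gains = np.array([0.0, 0.5, 8.53, 1e6])   # the last value is out of range
coeffs = pack_gains(gains)
recovered = unpack_gains(coeffs)
```

Gains that fit are recovered to within half an LSB, that is \(2^{-18}\), while out-of-range gains saturate at the clipping value.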
The communication stack
You may be wondering how Python is able to read from and write to FPGA registers.