Baseband
- What is baseband?
- Specify the header format?
- Refer to how it's packetized (in the gateware design overview) and talk about the limitations this imposes on 1-bit the data.
- Also data is read in multiples of four bytes over spi interface which limits flexibility in one-bit channels selectable.
- How to read and plot the data, pointers to Albatros analysis repository.
- Point out/document specific c-code that writes out the business.
Introduction to Baseband Data
Baseband data refers to the raw, digitized signal output from the ADCs after it has been processed through the PFB (Polyphase Filter Bank) and FFT (Fast Fourier Transform) stages, but before any correlation is performed. This data represents the complex frequency-domain representation of the original input signals from both polarizations.
In the Sparrow Albatros system, baseband data can be requantized to either 1-bit or 4-bit resolution before being packetized and transmitted over the network. This allows for flexible bandwidth usage depending on the scientific requirements of the observation.
Data Collection Process
The baseband data flows through the following processing chain:
- Analog signals are digitized by the ADCs at 250 MSPS
- The digital samples are processed through the PFB/FFT pipeline
- Data is requantized to either 1-bit or 4-bit resolution
- Selected frequency channels are reordered and packetized
- Packets are transmitted over UDP to a receiving computer
- The
dump_baseband
utility captures these packets and writes them to disk
For details on the baseband data collection process see the Embedded Software/Dumping Baseband page.
Baseband Data Format
Packetization and Limitations
As described in the gateware design overview, the baseband data is packetized in the FPGA before transmission. The packetization process imposes some limitations on the data, particularly for 1-bit mode:
- Data is bussified onto an 8-bit bus before packetization
- For 4-bit data (4 bits real + 4 bits imaginary for each polarization), this fundamentally limits us to 1024 of 2048 channels
- For 1-bit data, channels must be selected in multiples of four due to the SPI interface reading only in multiples of four bytes. Channels come in pairs starting with an even indexed channels because the gateware's 1-bit-data bussifier employs a barrel-switcher that pairs consecutive channels, further limiting the flexibility in channel selection.
UDP Packet Format
Each UDP packet contains: - A 4-byte spectrum number (specno) header - Multiple spectra of baseband data (the number of spectra per packet is configurable)
The number of spectra per packet is calculated to optimize data transfer, with a maximum of 30 spectra per packet. This is determined by the get_nspec
function in the dump_baseband
utility:
uint64_t get_nspec(uint64_t bytes_per_spec, uint64_t max_nbyte) {
uint64_t nspec = max_nbyte / bytes_per_spec;
if (nspec > 30) {
nspec = 30;
} else if (nspec < 1) {
log_message("WARNING: nspec<1, packets may be fragmented.\n");
nspec = 1;
}
return nspec;
}
Raw File Header Format
When baseband data is captured to disk, a comprehensive header is written to the start of each file. The header includes:
- Header size information
- Version information
- Packet and spectral metadata
- GPS timing and location data (if available)
- Channel indices
- Digital gain coefficients for both polarizations
The header is written in big-endian format to ensure cross-platform compatibility:
uint64_t file_header0[FH0SIZE] = {
to_big_endian(header_bytes), // The number of bytes in the header
to_big_endian(escape_seq_zero), // Escape sequence to distinguish from old format
to_big_endian(version_major), // Major version number
to_big_endian(version_minor), // Minor version number
to_big_endian(bytes_per_packet), // Number of bytes in each UDP packet payload
to_big_endian(lenchans), // Number of frequency channels
to_big_endian(spec_per_packet), // Number of spectra per packet
to_big_endian(bits), // Quantization bits (1 or 4)
to_big_endian(adc_clk), // ADC sampling rate in MHz
to_big_endian(fft_framelen), // FFT frame size
to_big_endian(station_id), // Station ID
to_big_endian(have_gps), // Flag indicating if GPS data is available
to_big_endian(time_s), // Unix timestamp (seconds)
to_big_endian(time_ns), // Nanosecond precision
to_big_endian_double(lattitude), // Latitude in degrees
to_big_endian_double(longitude), // Longitude in degrees
to_big_endian_double(elevation), // Altitude in meters
};
Following this initial header section, the file contains: - Channel indices for each selected channel - Digital gain coefficients for polarization 0 - Digital gain coefficients for polarization 1 - The actual baseband data packets
Capturing Baseband Data
The Sparrow Albatros system provides a dedicated C utility, dump_baseband
, to capture the UDP packets containing baseband data and write them to disk. This utility is optimized for high-throughput data capture with minimal packet loss.
Configuration
The dump_baseband
utility reads its configuration from a standard INI file (typically config.ini
). Important configuration parameters include:
[baseband]
channels=190:194 220:222 # Format: range1 range2 ...
file_size=1 # Size in GB
bits=4 # 1 for 1-bit mode, 4 for 4-bit mode
version_major=1 # Header version
version_minor=0 # Header version
[paths]
dump_baseband_output_directory=/path/to/output
coeffs_pol0_binary_path=/path/to/coeffs0
coeffs_pol1_binary_path=/path/to/coeffs1
[networking]
max_bytes_per_packet=8192 # Maximum UDP payload size
Implementation Details
The utility uses the libpcap library to capture network packets matching specific filter criteria:
char filter_exp[] = "udp and dst port 7417 and dst host 10.10.11.99 and src host 192.168.41.10";
Files are organized into directories named by timestamp slice to facilitate data management. Each raw file contains: 1. The header as described above 2. A sequence of UDP packet payloads containing the baseband data
The utility monitors for packet loss both within individual files and between consecutive files, logging statistics to help identify data quality issues.
Reading and Analyzing Baseband Data
To read and analyze the baseband data, refer to the Albatros Analysis repository, which provides Python utilities for:
- Reading the raw file format
- Extracting and interpreting the header information
- Processing the baseband data
- Plotting and visualizing the data
The header information can be loaded and parsed to understand the data parameters, and the actual data can be processed for scientific analysis.
C Code Implementation
The implementation of the dump_baseband
utility is found in software/dump_baseband.c
. Key functions include:
write_header
: Writes the comprehensive file headerget_config_from_ini
: Parses the INI configuration fileget_packets_per_file
: Calculates how many packets to write per fileparse_chans
: Parses channel specifications from the configurationset_coeffs_from_serialized_binary_files
: Loads digital gain coefficients
The main capture loop in the utility captures packets, extracts the payload, and writes them sequentially to disk, organizing files by timestamp.