David
Sun, Dec 15, 2024 9:48 PM
Hello all,
I apologize in advance for data dumping. I have made a 2 input/1 output
RFNoC block that requires repeatable synchronized DDC starts. My current
method of starting the DDC is not working as desired.
*Question:* How can I correctly start both DDCs so that samples are
available on the same clock cycle, as with rx_samples_to_file, while still
using my 2-in/1-out RFNoC block?
I would like to focus the conversation on my C++ implementation for now.
All my simulations have convinced me my block is consuming AXI-Stream data
correctly.
Problem
When I start the two DDCs with timed commands sent to each DDC from my C++
application, I do not get the same result as rx_samples_to_file.cpp:
rx_samples_to_file has repeatable alignment, while mine has random
alignment. This has led me to believe the problem is in my application and
not my block. My Vivado simulations show my block is able to consume the
AXI-Stream transactions in parallel as I expect.
Considering sampling noise from a sig gen that is split to both inputs, I
see the following behavior:
rx_samples_to_file (base image)       davids_samples_to_file (custom image)
DDC A samples  ... X_1 Y_1 Z_1 ...    DDC A samples  ... X_1 Y_1 Z_1 ...
DDC B samples  ... X_2 Y_2 Z_2 ...    DDC B samples  X_2 Y_2 Z_2 ... ...
*sample_1 is not equal to sample_2, but over a large number of samples they
will correlate well.
In the custom-image example above, the noise correlates as expected, but
the two channels are offset by 1 sample. With my application, I have seen
runs with no offset (desired) and runs with offsets in the range of 5
samples.
C++ Implementation
[image: image.png]
I am using *uhd::rfnoc::ddc_block_control* objects to issue the stream
command because I was having issues with my block propagating it. Issuing
it to the DDCs directly lets the data flow from the 2 inputs to the 1
output, where the output is either a file or a loopback to transmit.
The base image with rx_samples_to_file uses a multi_usrp type, which
propagates the stream command from the rx_streamer.
RFNoC laydown
[image: image.png]
Data flows in both Tx loopback configuration and Rx to file configuration.
Methods and Symptoms
I have two methods of measuring the synchronization, with data collected by
ILA cores at either the output of DDC or input of custom block:
- *Math:* When receiving correlated noise, I can measure the cross
correlation and show that the correlation peaks as expected, and show the
delay between channels in samples.
- *Vivado Waveform Viewer:* When the ILA cores are collecting DDC
channel data, I can see that the base image samples are available on the
same clock. My image does not have that behavior.
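For readers who want to reproduce the "Math" measurement, here is a minimal sketch of a brute-force cross-correlation lag search in plain C++. It is not the code used for the measurements above; the function name `estimate_lag` and the `max_lag` search window are assumptions made for illustration, and the two captures are assumed to be available as complex float vectors.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from UHD): returns the lag d in [-max_lag, max_lag]
// that maximizes |sum_n a[n] * conj(b[n + d])|. A positive result means the
// content of `a` shows up d samples later in `b`.
int estimate_lag(const std::vector<std::complex<float>>& a,
                 const std::vector<std::complex<float>>& b,
                 int max_lag)
{
    assert(max_lag >= 0);
    const std::ptrdiff_t na = static_cast<std::ptrdiff_t>(a.size());
    const std::ptrdiff_t nb = static_cast<std::ptrdiff_t>(b.size());
    int best_lag = 0;
    double best_mag = -1.0;
    for (int d = -max_lag; d <= max_lag; ++d) {
        std::complex<double> acc(0.0, 0.0);
        for (std::ptrdiff_t n = 0; n < na; ++n) {
            const std::ptrdiff_t m = n + d;
            if (m < 0 || m >= nb) {
                continue; // skip samples that fall outside the overlap
            }
            acc += std::complex<double>(a[n])
                   * std::conj(std::complex<double>(b[m]));
        }
        const double mag = std::abs(acc);
        if (mag > best_mag) {
            best_mag = mag;
            best_lag = d;
        }
    }
    return best_lag;
}
```

Calling `estimate_lag(ddc_a, ddc_b, 8)` on two captures of the same split noise source should return 0 when the channels are aligned and a small nonzero value otherwise. The search is O(N * max_lag), which is fine for the few-sample offsets discussed here.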
Thanks,
David
Martin Braun
Thu, Dec 19, 2024 11:23 AM
Hey David,
this looks like you've gotten pretty far on a sophisticated project! I
have a few observations:
- At first glance, your C++ looks correct.
- I would expect samples to arrive at your block synchronously based on
that. However, maybe I'm forgetting something that would cause the outputs
of the DDCs to misalign data by a few clock cycles. Which makes me wonder:
Are you sure your input packets are misaligned? In RFNoC, we make no
guarantee that the output of the DDC (or any other) block is aligned to the
clock cycle, because we encode the timestamp with the data. Meaning that
the first, actual sample that arrives at your block on each channel is in
fact time-aligned, they just arrive a few clock cycles apart. This is the
same logic that applies when packets arrive at the host computer, where we
make no assumptions that they arrive at the exact same time.
- If this is the issue, I think we have some modules you can use to
actually align samples within your block, or you just do some AXI alignment
yourself by combining the tready and tvalid signals of two streams.
- Side note, although it's not important: I would consider it a best
practice to have your block call
set_action_forwarding_policy(forwarding_policy_t::ONE_TO_FAN, "stream_cmd")
so it would properly forward stream commands, and then you can plop the
stream command into the streamer.
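To illustrate the AXI alignment idea from the third point above, here is a minimal cycle-level sketch in plain C++ (a software model, not HDL and not a module from the RFNoC library). The names `JoinSignals`, `join_signals`, and `run_join` are hypothetical. The model assumes the first sample on each stream carries the same timestamp, so pairing samples by arrival order is enough, and it uses unbounded FIFOs to stand in for upstream buffering.

```cpp
#include <cassert>
#include <deque>
#include <utility>
#include <vector>

// Handshake signals of a two-input join for one clock cycle.
struct JoinSignals {
    bool out_tvalid; // output is valid only when both inputs are valid
    bool a_tready;   // A is consumed only if B is valid and output is ready
    bool b_tready;   // B is consumed only if A is valid and output is ready
};

inline JoinSignals join_signals(bool a_tvalid, bool b_tvalid, bool out_tready)
{
    JoinSignals s;
    s.out_tvalid = a_tvalid && b_tvalid;
    s.a_tready = out_tready && b_tvalid;
    s.b_tready = out_tready && a_tvalid;
    return s;
}

// Simulates two streams that start on different cycles and each deliver one
// sample per cycle (sample value == sample index). Returns the pairs that
// leave the join; the downstream side is assumed to be always ready.
std::vector<std::pair<int, int>> run_join(
    int start_a, int start_b, int num_samples, int cycles)
{
    assert(num_samples >= 0 && cycles >= 0);
    std::deque<int> fifo_a, fifo_b; // stand-ins for upstream buffering
    int next_a = 0, next_b = 0;
    std::vector<std::pair<int, int>> out;
    for (int c = 0; c < cycles; ++c) {
        if (c >= start_a && next_a < num_samples) fifo_a.push_back(next_a++);
        if (c >= start_b && next_b < num_samples) fifo_b.push_back(next_b++);
        const bool a_tvalid = !fifo_a.empty();
        const bool b_tvalid = !fifo_b.empty();
        const JoinSignals s = join_signals(a_tvalid, b_tvalid, true);
        const bool xfer_a = a_tvalid && s.a_tready;
        const bool xfer_b = b_tvalid && s.b_tready;
        assert(xfer_a == xfer_b); // both inputs always transfer together
        if (xfer_a) {
            out.emplace_back(fifo_a.front(), fifo_b.front());
            fifo_a.pop_front();
            fifo_b.pop_front();
        }
    }
    return out;
}
```

With `run_join(0, 5, 10, 40)`, stream B starts five cycles late, yet every output pair holds the same sample index from both streams, because stream A is held off until B has data. In hardware, the same gating would backpressure the early stream rather than grow a FIFO without bound.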
--M
USRP-users mailing list -- usrp-users@lists.ettus.com
To unsubscribe send an email to usrp-users-leave@lists.ettus.com
David
Fri, Dec 20, 2024 2:27 AM
Martin,
Thanks for the reply. I will take any modules you suggest for AXI
alignment; even if they do not "fix" my issue, they are good for me to
look at.
- Thanks for the comment, this block has been a long time coming.
- We captured some screenshots of the ILA core recording both the base
image and my image. I was also able to add a dummy port on my image and run
*rx_samples_to_file* on that (because it was statically connected), which
confirmed that the multi_usrp method produces the expected results with or
without my block in line.
Below are some screenshots of the behavior, where the ILA is capturing the
output of both DDCs *before* packetization. *What is not shown is the
multi_usrp method running with my block, but it has the same behavior as
the base image:*
Base Image, with rx_samples_to_file (multi_usrp)
Example 1: zoomed in run
[image: base_image_zoomed.PNG]
Example 2: different run, zoomed out. both DDCs perform as expected:
[image: base_image_zoomed_out.PNG]
Custom Image, with davids_rx_to_file (ddc_block_controller)
Example 1: random distance between samples on both DDCs, clear on DDC1. The
last 4 valids have a big change in cycle distance.
[image: random_dist.PNG]
Example 2: a different run, same behavior as above and time tags.
[image: time_tags.PNG]
Example 3: A run where it "almost" worked, and my block also "almost
worked". You can see the alignment slips at the end:
[image: Timing_mostly_aligned.PNG]
- Right now in the YAML I am using two named inputs with one port each:
  data:
    fpga_iface: axis_data
    clk_domain: rfnoc_chdr
    inputs:
      in_1:
        num_ports: 1
        ...
      in_2:
        num_ports: 1
        ...
I have done some experiments with one named input with 2 ports, and I see
that the AXI handshake is one packet with two parallel streams. I will try
to "AXI align" as you suggested with this first:
  data:
    fpga_iface: axis_data
    clk_domain: rfnoc_chdr
    inputs:
      in:
        num_ports: 2
        ...
- Right now, since I want to issue the stream command for both *record to
file* and *transmit loopback*, I will start with the forwarding policy as
you suggested and also try to have my block issue its own stream command.
It is not trivial for me since I am not a C++ person, so I won't be able to
provide much feedback on that effort.
Thanks,
David
Martin Braun
Fri, Dec 20, 2024 12:11 PM
David,
is it possible that your block is deasserting tready on one of its inputs,
thus delaying the DDC?
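One way to check this from an ILA capture, sketched below in plain C++, is to count the cycles where tvalid is high while tready is low on each input. This is only an illustration: it assumes the capture includes both tvalid and tready for the port and has been exported as one boolean per clock cycle, and the names `StallReport` and `analyze_handshake` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Summary of one AXI-Stream port over a captured window.
struct StallReport {
    std::size_t transfers;    // cycles with tvalid && tready
    std::size_t stall_cycles; // cycles with tvalid && !tready (backpressure)
    long first_transfer;      // cycle index of the first transfer, -1 if none
};

// Walks the per-cycle tvalid/tready samples of a single port.
StallReport analyze_handshake(const std::vector<bool>& tvalid,
                              const std::vector<bool>& tready)
{
    assert(tvalid.size() == tready.size());
    StallReport r{0, 0, -1};
    for (std::size_t c = 0; c < tvalid.size(); ++c) {
        if (tvalid[c] && tready[c]) {
            if (r.first_transfer < 0) {
                r.first_transfer = static_cast<long>(c);
            }
            ++r.transfers;
        } else if (tvalid[c]) {
            ++r.stall_cycles; // data was offered but not accepted
        }
    }
    return r;
}
```

Running this on both inputs from the same capture and comparing `stall_cycles` and `first_transfer` would show whether one input is being held off more than the other.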
--M
On Fri, Dec 20, 2024 at 3:27 AM David vitishlsfan21@gmail.com wrote:
Martin,
Thanks for the reply. I will take any modules you suggest for AXI
alignment, even if they do not "fix" my issue, it is good for me to look at.
-
thanks for the comment, this block is a long time coming.
-
We captured some screen shots of the ILA core recording both the base
image and my image. I also was able to add a dummy port on my image and run
the *rx_samples_to_file *on that (because it was statically connected),
which confirmed that the multi_usrp method producing the expected results,
with/without my block in line:
below I present some screenshots of the behavior, where the ILA is
capturing the output of both DDCs before packetization.* What is not
shown is the multi_usrp method running with my block, but it has the same
behavior as the base image**:*
Base Image, with rx_samples_to_file (multi_usrp)
Example 1: zoomed in run
[image: base_image_zoomed.PNG]
Example 2: different run, zoomed out. both DDCs perform as expected:
[image: base_image_zoomed_out.PNG]
Custom Image, with davids_rx_to_file (ddc_block_controller)
Example 1: random distance between samples on both DDCs, clear on DDC1.
The last 4 valids have a big change in cycle distance.
[image: random_dist.PNG]
Example 2: a different run, same behavior as above and time tags.
[image: time_tags.PNG]
Example 3: A run where it "almost" worked, and my block also "almost
worked". You can see the alignment slips at the end:
[image: Timing_mostly_aligned.PNG]
- right now in the yaml I am using the named inputs with one port each:
data:
fpga_iface: axis_data
clk_domain: rfnoc_chdr
inputs:
in_1:
num_ports: 1
...
in_2:
num_ports: 1
...
I have done some experiments with one named input with 2 port, and I see
that the AXI handshake is one packet with two parallel streams. I will try
to "AXI align" as you suggested with this first:
data:
fpga_iface: axis_data
clk_domain: rfnoc_chdr
inputs:
in:
num_ports: 2
...
- right now, since I want to issue the streaming command while doing record
to file and transmit loopback, I will start with the forwarding policy
as you suggested and also try to add my own issue stream command to my
block. It is not trivial for me since I am not a C++ person, so I won't be
able to provide much feedback on that effort.
Thanks,
David
On Thu, Dec 19, 2024 at 3:24 AM Martin Braun martin.braun@ettus.com
wrote:
Hey David,
this looks like you've gotten pretty far on a sophisticated project! I
have a few observations:
- At first glance, your C++ looks correct.
- I would expect samples to arrive at your block synchronously based on
that. However, maybe I'm forgetting something that would cause the outputs
of the DDCs to misalign data by a few clock cycles. Which makes me wonder:
Are you sure your input packets are misaligned? In RFNoC, we make no
guarantee that the output of the DDC (or any other) block is aligned to the
clock cycle, because we encode the timestamp with the data. Meaning that
the first, actual sample that arrives at your block on each channel is in
fact time-aligned, they just arrive a few clock cycles apart. This is the
same logic that applies when packets arrive at the host computer, where we
make no assumptions that they arrive at the exact same time.
- If this is the issue, I think we have some modules you can use to
actually align samples within your block, or you just do some AXI alignment
yourself by combining the tready and tvalid signals of two streams.
- Side note, although it's not important: I would consider it a best
practice to have your block call
set_action_forwarding_policy(forwarding_policy_t::ONE_TO_FAN, "stream_cmd")
so it would properly forward stream commands, and then you can plop the
stream command into the streamer.
--M
David,
is it possible that your block is deasserting tready on one of its inputs,
thus delaying the DDC?
--M
On Fri, Dec 20, 2024 at 3:27 AM David <vitishlsfan21@gmail.com> wrote:
> Martin,
>
> Thanks for the reply. I will take any modules you suggest for AXI
> alignment, even if they do not "fix" my issue, it is good for me to look at.
>
> 1. thanks for the comment, this block is a long time coming.
>
> 2. We captured some screenshots of the ILA core recording both the base
> image and my image. I was also able to add a dummy port on my image and run
> *rx_samples_to_file* on that (because it was statically connected),
> which confirmed that the multi_usrp method produces the expected results,
> with or without my block in line.
>
> Below are some screenshots of the behavior, where the ILA is
> capturing the output of both DDCs *before* packetization. *What is not
> shown is the multi_usrp method running with my block, but it has the same
> behavior as the base image:*
>
> *Base Image, with rx_samples_to_file (multi_usrp)*
> Example 1: zoomed in run
> [image: base_image_zoomed.PNG]
> Example 2: different run, zoomed out. both DDCs perform as expected:
> [image: base_image_zoomed_out.PNG]
>
>
> *Custom Image, with davids_rx_to_file (ddc_block_controller)*
> Example 1: random distance between samples on both DDCs, clear on DDC1.
> The last 4 valids have a big change in cycle distance.
> [image: random_dist.PNG]
> Example 2: a different run, same behavior as above and time tags.
> [image: time_tags.PNG]
> Example 3: A run where it "almost" worked, and my block also "almost
> worked". You can see the alignment slips at the end:
> [image: Timing_mostly_aligned.PNG]
>
>
> 3. Right now in the YAML I am using named inputs with one port each:
>
> data:
>   fpga_iface: axis_data
>   clk_domain: rfnoc_chdr
>   inputs:
>     in_1:
>       num_ports: 1
>       ...
>     in_2:
>       num_ports: 1
>       ...
>
> I have done some experiments with one named input with 2 ports, and I see
> that the AXI handshake is one packet with two parallel streams. I will try
> to "AXI align" as you suggested with this first:
>
> data:
>   fpga_iface: axis_data
>   clk_domain: rfnoc_chdr
>   inputs:
>     in:
>       num_ports: 2
>       ...
>
> 4. right now, since I want to issue the streaming command while doing *record
> to file* and *transmit loopback*, I will start with the forwarding policy
> as you suggested and also try to add my own issue stream command to my
> block. It is not trivial for me since I am not a C++ person, so I won't be
> able to provide much feedback on that effort.
>
> Thanks,
>
> David
>
D
David
Fri, Dec 20, 2024 10:08 PM
Martin,
I don't have waveform viewer screenshots yet of the inputs (working on it),
but I have run the simulation with a packet delayed 500 clock cycles on one
of my block's channels. I can see that my block "waits" for the second
channel, which aligns the axi transaction. This is because my block is an
HLS block that is data driven, and won't be ready unless it has both
inputs. I verified the output data from the delayed packet simulation.
Because of these factors, I think it is unlikely my block is deasserting
tready in my FPGA images.
Simulation output with a delayed packet on channel 1:
[image: delayed_port1_packet.png]
I also know the maximum sample rate we can run with on my block, and have
done many tests to ensure that my block is consuming data fast enough so
there are no overflows upstream.
My understanding of how RFNoC packets work is that the output of the
DDC fills a packet formed in the NoC shell, which is then released
once its 64 samples are filled. You can see that the DDC0 and DDC1 tready
is always asserted in all my debug screenshots, even in the non-working
cases. Likewise, on my block's input, tvalid from the NoC shell is always
asserted, while my block's tready drives the transaction.
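Since each packet carries a timestamp, one way to check alignment independently of arrival time is to compare the timestamps of the packets on the two channels. The sketch below is an illustration under assumptions, not UHD code: it assumes timestamps count radio clock ticks, that one DDC output sample spans `decim` ticks, and that `spp` is the number of samples per packet (64 in the description above). The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Result of comparing the timestamps of two packets.
struct offset_result {
    bool whole_samples;   // false if the tick difference is not a multiple of decim
    std::int64_t samples; // positive: packet B starts later than packet A
};

// Offset between two packets, expressed in DDC output samples.
inline offset_result packet_offset_in_samples(
    std::uint64_t ts_a, std::uint64_t ts_b, std::uint32_t decim)
{
    assert(decim > 0);
    const std::int64_t diff_ticks =
        static_cast<std::int64_t>(ts_b) - static_cast<std::int64_t>(ts_a);
    offset_result r;
    r.whole_samples = (diff_ticks % static_cast<std::int64_t>(decim)) == 0;
    r.samples       = diff_ticks / static_cast<std::int64_t>(decim);
    return r;
}

// Timestamp expected on the k-th packet of a gap-free stream that started at
// first_ts, under the tick assumptions stated above.
inline std::uint64_t expected_packet_timestamp(
    std::uint64_t first_ts, std::uint64_t k, std::uint32_t spp, std::uint32_t decim)
{
    return first_ts + k * static_cast<std::uint64_t>(spp) * decim;
}
```

If the first packets on both channels report the same timestamp, the streams are time-aligned even when the packets reach the block a few clock cycles apart.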
Where we are now is that, using the usrp and multi_usrp APIs, my block
works as expected. When using the RFNoC API, which sets the rate on the DDC and
starts streaming, we get the problem behavior. Is it possible that DDC0 and
DDC1 are not sampling correctly when I use the RFNoC API to set the rate
and start streaming? I have seen a difference between the APIs before,
where multi_usrp was able to set the center frequency on the base
image, and the RFNoC API kept the center frequency at 0 MHz.
I don't understand why the clock distance between the tvalids on DDC0 and
DDC1 would change in my previous images, which only happens with the RFNoC
API application. I would expect the DDC output strobes to be equidistant based on the
output sample rate. These are the signals marked for debug in the DDC blocks
(uhd/fpga/usrp3/lib/rfnoc/ddc.v):
//! RFNoC specific digital down-conversion chain
module ddc #(
  parameter SR_FREQ_ADDR = 0,
  parameter SR_SCALE_IQ_ADDR = 1,
  parameter SR_DECIM_ADDR = 2,
  parameter SR_MUX_ADDR = 3,
  parameter SR_COEFFS_ADDR = 4,
  parameter PRELOAD_HBS = 1, // Preload half band filter state with 0s
  parameter NUM_HB = 3,
  parameter CIC_MAX_DECIM = 255,
  parameter SAMPLE_WIDTH = 16,
  parameter WIDTH = 24
)(
  input clk, input reset,
  input clear, // Resets everything except the timed phase inc FIFO and phase inc
  input set_stb, input [7:0] set_addr, input [31:0] set_data,
  input timed_set_stb, input [7:0] timed_set_addr, input [31:0] timed_set_data,
  input [31:0] sample_in_tdata,
  input sample_in_tvalid,
  input sample_in_tlast,
  (* dont_touch="true",mark_debug="true"*) output sample_in_tready,
  input sample_in_tuser,
  input sample_in_eob,
  (* dont_touch="true",mark_debug="true"*) output [31:0] sample_out_tdata,
  (* dont_touch="true",mark_debug="true"*) output sample_out_tvalid,
  input sample_out_tready,
  output sample_out_tlast
);
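The expectation that the sample_out_tvalid strobes are equidistant can be checked numerically on an exported ILA capture instead of by eye. The following is a rough sketch, assuming the capture has already been reduced to a list of clock-cycle indices at which tvalid was high; `find_spacing_breaks` and `channel_skew` are hypothetical helper names, and the first observed gap is simply taken as the reference spacing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the positions i in valid_cycles where the gap to the previous
// strobe differs from the first observed gap. Empty means evenly spaced.
std::vector<std::size_t> find_spacing_breaks(const std::vector<std::uint64_t>& valid_cycles)
{
    std::vector<std::size_t> breaks;
    if (valid_cycles.size() < 3) {
        return breaks; // not enough strobes to compare gaps
    }
    assert(valid_cycles[1] > valid_cycles[0]);
    const std::uint64_t ref_gap = valid_cycles[1] - valid_cycles[0];
    for (std::size_t i = 2; i < valid_cycles.size(); ++i) {
        assert(valid_cycles[i] > valid_cycles[i - 1]); // cycles must increase
        if (valid_cycles[i] - valid_cycles[i - 1] != ref_gap) {
            breaks.push_back(i);
        }
    }
    return breaks;
}

// Per-sample skew in clock cycles between the n-th strobe of each channel
// (positive: channel 1 is later than channel 0).
std::vector<std::int64_t> channel_skew(
    const std::vector<std::uint64_t>& ch0, const std::vector<std::uint64_t>& ch1)
{
    const std::size_t n = std::min(ch0.size(), ch1.size());
    std::vector<std::int64_t> skew;
    skew.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        skew.push_back(static_cast<std::int64_t>(ch1[i])
                       - static_cast<std::int64_t>(ch0[i]));
    }
    return skew;
}
```

A constant skew across a capture would point to a fixed start offset between the DDCs, while a skew that changes partway through would match the "alignment slips at the end" behavior seen in the screenshots.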
Thanks,
David
RK
Rob Kossler
Mon, Dec 23, 2024 4:28 PM
Hi David,
Your email distinguishes between the multi_usrp API and the rfnoc API. But,
under the hood, the multi_usrp API
https://github.com/EttusResearch/uhd/blob/master/host/lib/usrp/multi_usrp_rfnoc.cpp
implements all of its functionality with the rfnoc API. So, it seems that
the multi_usrp implementation (using rfnoc API commands) is doing something
different than your own implementation (using rfnoc API commands). I
realize that this is not a very helpful comment but perhaps if you take a
closer look at the multi_usrp_rfnoc class, you might find something
different in the underlying commands.
Rob
D
David
Mon, Dec 23, 2024 10:55 PM
Rob,
Thank you for your response. I was actually unaware of the multi_usrp_rfnoc
class, and I see how it calls the same command. I now have an extra tool I
can fiddle with after the holidays, plus the new FPGA debug images...
I have been using the multi_usrp.cpp class as my working case, which came
from the examples. It looks like it sets the stream property on the ddc
directly, whereas the RFNoC methods call a method post_action(dst_edge,
new_action). Still looking into it, which will take some time.
// multi_usrp
void issue_stream_cmd(const stream_cmd_t& stream_cmd, size_t chan) override
{
    if (chan != ALL_CHANS) {
        _tree->access<stream_cmd_t>(rx_dsp_root(chan) / "stream_cmd").set(stream_cmd);
        return;
    }
    for (size_t c = 0; c < get_rx_num_channels(); c++) {
        issue_stream_cmd(stream_cmd, c);
    }
}

// multi_usrp_rfnoc
void issue_stream_cmd(const stream_cmd_t& stream_cmd, size_t chan = ALL_CHANS) override
{
    MUX_RX_API_CALL(issue_stream_cmd, stream_cmd);
    auto& rx_chain = _get_rx_chan(chan);
    if (rx_chain.ddc) {
        rx_chain.ddc->issue_stream_cmd(stream_cmd, rx_chain.block_chan);
    } else {
        rx_chain.radio->issue_stream_cmd(stream_cmd, rx_chain.block_chan);
    }
}

// ddc block controller
void issue_stream_cmd(const uhd::stream_cmd_t& stream_cmd, const size_t port) override
{
    RFNOC_LOG_TRACE("issue_stream_cmd(stream_mode=" << char(stream_cmd.stream_mode)
                    << ", port=" << port);
    res_source_info dst_edge{res_source_info::OUTPUT_EDGE, port};
    auto new_action = stream_cmd_action_info::make(stream_cmd.stream_mode);
    new_action->stream_cmd = stream_cmd;
    issue_stream_cmd_action_handler(dst_edge, new_action);
}
Thanks,
David
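For readers following the snippets above, the dispatch in the multi_usrp_rfnoc
version can be sketched as a small standalone model. This is only an
illustration and does not use UHD: StreamCmd, Block, RxChain and start_all are
hypothetical stand-ins invented here, not UHD types. It shows one thing, namely
that the same timed command goes to the DDC of each channel when a DDC is
present, and to the radio otherwise.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for uhd::stream_cmd_t (only the fields used here).
struct StreamCmd {
    bool stream_now;   // false means "start at time_secs"
    double time_secs;  // requested start time
};

// One record per command a block has received.
struct Received {
    std::size_t port;
    double time_secs;
};

// Hypothetical stand-in for a block controller (DDC or radio).
struct Block {
    std::string name;
    std::vector<Received> received;

    explicit Block(std::string n) : name(std::move(n)) {}

    void issue_stream_cmd(const StreamCmd& cmd, std::size_t port) {
        received.push_back({port, cmd.time_secs});
    }
};

// Hypothetical stand-in for one RX channel: a radio, an optional DDC,
// and the port on that block serving this channel.
struct RxChain {
    std::shared_ptr<Block> radio;
    std::shared_ptr<Block> ddc;  // may be null when no DDC is in the chain
    std::size_t block_chan;
};

// Mirrors the if/else in the multi_usrp_rfnoc snippet: prefer the DDC.
void issue_stream_cmd(RxChain& chain, const StreamCmd& cmd) {
    if (chain.ddc) {
        chain.ddc->issue_stream_cmd(cmd, chain.block_chan);
    } else {
        assert(chain.radio && "a chain without a DDC needs a radio");
        chain.radio->issue_stream_cmd(cmd, chain.block_chan);
    }
}

// Issue one timed command, with an identical start time, to every chain.
void start_all(std::vector<RxChain>& chains, double start_time_secs) {
    const StreamCmd cmd{false, start_time_secs};
    for (auto& chain : chains) {
        issue_stream_cmd(chain, cmd);
    }
}
```

Calling start_all on two chains that share one DDC block (ports 0 and 1)
records two commands on that DDC with the same start time. Whether the real
blocks then emit samples on the same clock cycle is exactly the open question
of this thread, and the model says nothing about it.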
Rob,
Thank you for your response. I was actually unaware of the multi_usrp_rfnoc
class, and I see how it calls the same command. I now have an extra tool I
can fiddle with after the holidays, plus the new FPGA debug images...
I have been using the multi_usrp.cpp class as my working case, which came
from the examples. It looks like it sets the stream property on the DDC
directly, whereas the RFNoC methods call a method post_action(dst_edge,
new_action). Still looking into it, which will take some time.
// multi_usrp
void issue_stream_cmd(const stream_cmd_t& stream_cmd, size_t chan) override
{
    if (chan != ALL_CHANS) {
        _tree->access<stream_cmd_t>(rx_dsp_root(chan) / "stream_cmd").set(stream_cmd);
        return;
    }
    for (size_t c = 0; c < get_rx_num_channels(); c++) {
        issue_stream_cmd(stream_cmd, c);
    }
}

// multi_usrp_rfnoc
void issue_stream_cmd(const stream_cmd_t& stream_cmd, size_t chan = ALL_CHANS) override
{
    MUX_RX_API_CALL(issue_stream_cmd, stream_cmd);
    auto& rx_chain = _get_rx_chan(chan);
    if (rx_chain.ddc) {
        rx_chain.ddc->issue_stream_cmd(stream_cmd, rx_chain.block_chan);
    } else {
        rx_chain.radio->issue_stream_cmd(stream_cmd, rx_chain.block_chan);
    }
}

// ddc block controller
void issue_stream_cmd(const uhd::stream_cmd_t& stream_cmd, const size_t port) override
{
    RFNOC_LOG_TRACE("issue_stream_cmd(stream_mode=" << char(stream_cmd.stream_mode)
                    << ", port=" << port);
    res_source_info dst_edge{res_source_info::OUTPUT_EDGE, port};
    auto new_action = stream_cmd_action_info::make(stream_cmd.stream_mode);
    new_action->stream_cmd = stream_cmd;
    issue_stream_cmd_action_handler(dst_edge, new_action);
}
Thanks,
David
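Whichever wrapper ends up delivering the stream command, the start time it carries matters for channel alignment. The toy model below is only a sketch under stated assumptions: toy_stream_cmd, start_tick and start_skew are hypothetical names, not UHD types, and the model simply assumes that a "start now" command takes effect when it arrives, while a timed command takes effect at its requested tick provided it arrives early enough.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for a stream command; not uhd::stream_cmd_t.
struct toy_stream_cmd {
    bool stream_now;          // true: start as soon as the command arrives
    std::uint64_t time_ticks; // requested start tick, used when stream_now is false
};

struct start_result {
    bool on_time;       // false if a timed command arrived after its start tick
    std::uint64_t tick; // tick at which streaming starts (valid if on_time)
};

struct skew_result {
    bool valid;         // false if either channel received the command late
    std::int64_t ticks; // start tick of channel B minus start tick of channel A
};

// Tick at which one channel starts, given the tick at which the command reaches it.
start_result start_tick(const toy_stream_cmd& cmd, std::uint64_t arrival_tick)
{
    if (cmd.stream_now) {
        return {true, arrival_tick};
    }
    if (arrival_tick > cmd.time_ticks) {
        return {false, 0}; // late command in this model
    }
    return {true, cmd.time_ticks};
}

// Start-time skew between two channels that receive the same command
// at different arrival ticks.
skew_result start_skew(const toy_stream_cmd& cmd,
                       std::uint64_t arrival_a,
                       std::uint64_t arrival_b)
{
    const start_result a = start_tick(cmd, arrival_a);
    const start_result b = start_tick(cmd, arrival_b);
    if (!a.on_time || !b.on_time) {
        return {false, 0};
    }
    return {true,
            static_cast<std::int64_t>(b.tick) - static_cast<std::int64_t>(a.tick)};
}
```

In this model, start_skew({true, 0}, 100, 105) gives a skew of 5 ticks, while start_skew({false, 1000}, 100, 105) gives 0 because both channels wait for tick 1000. This says nothing about where the misalignment in the real image comes from; it only shows why the same timed command needs to reach both DDCs ahead of its start time.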
On Mon, Dec 23, 2024 at 8:28 AM Rob Kossler <rkossler@nd.edu> wrote:
> Hi David,
> Your email distinguishes between the multi_usrp API and the rfnoc API.
> But, under the hood, the multi_usrp API
> <https://github.com/EttusResearch/uhd/blob/master/host/lib/usrp/multi_usrp_rfnoc.cpp>
> implements all of its functionality with the rfnoc API. So, it seems that
> the multi_usrp implementation (using rfnoc API commands) is doing something
> different than your own implementation (using rfnoc API commands). I
> realize that this is not a very helpful comment but perhaps if you take a
> closer look at the multi_usrp_rfnoc class, you might find something
> different in the underlying commands.
> Rob
>
> On Fri, Dec 20, 2024 at 5:10 PM David <vitishlsfan21@gmail.com> wrote:
>
>> Martin,
>>
>> I don't have waveform viewer screenshots yet of the inputs (working on
>> it), but I have run the simulation with a packet delayed 500 clock cycles
>> on one of my block's channels. I can see that my block "waits" for the
>> second channel, which aligns the axi transaction. This is because my
>> block is an HLS block that is data driven, and won't be ready unless it has
>> both inputs. I verified the output data from the delayed packet simulation.
>> Because of these factors, I think it is unlikely my block is deasserting
>> tready in my FPGA images.
>>
>> Simulation output with a delayed packet on channel 1:
>> [image: delayed_port1_packet.png]
>>
>>
>> I also know the maximum sample rate we can run with on my block, and have
>> done many tests to ensure that my block is consuming data fast enough so
>> there are no overflows upstream.
>>
>> My understanding of how the RFNoC packets work is that the output of the
>> DDC fills a packet formed in the NoC shell, which is then released
>> once the 64 samples are filled. You can see that the DDC0 and DDC1
>> *tready* in all my debug screenshots is always asserted, even in the
>> non-working cases. Likewise, on my block's input, tvalid from the NoC shell
>> is always asserted, while my block's tready drives the transaction.
>>
>> Where we are now is that, using the usrp and multi_usrp APIs, my block
>> works as expected. When using the RFNoC API, which sets the rate on the DDC
>> and starts streaming, we get the problem behavior. Is it possible that DDC0
>> and DDC1 are not sampling correctly when I am using the RFNoC API to set the
>> rate and start streaming? I have seen a difference between the APIs before,
>> where multi_usrp was able to set the center frequency on the base
>> image, and the RFNoC API kept the center frequency at 0 MHz.
>>
>> I don't understand why the clock distance between the tvalids on DDC0 and
>> DDC1 would change in my previous images, which only happens with the RFNoC
>> API application. I would expect the DDC output to be equidistant based on the
>> output sample rate. This is where the debugging is in the DDC blocks
>> (uhd/fpga/usrp3/lib/rfnoc/ddc.v):
>>
>> //! RFNoC specific digital down-conversion chain
>>
>> module ddc #(
>>   parameter SR_FREQ_ADDR = 0,
>>   parameter SR_SCALE_IQ_ADDR = 1,
>>   parameter SR_DECIM_ADDR = 2,
>>   parameter SR_MUX_ADDR = 3,
>>   parameter SR_COEFFS_ADDR = 4,
>>   parameter PRELOAD_HBS = 1, // Preload half band filter state with 0s
>>   parameter NUM_HB = 3,
>>   parameter CIC_MAX_DECIM = 255,
>>   parameter SAMPLE_WIDTH = 16,
>>   parameter WIDTH = 24
>> )(
>>   input clk, input reset,
>>   input clear, // Resets everything except the timed phase inc FIFO and phase inc
>>   input set_stb, input [7:0] set_addr, input [31:0] set_data,
>>   input timed_set_stb, input [7:0] timed_set_addr, input [31:0] timed_set_data,
>>   input [31:0] sample_in_tdata,
>>   input sample_in_tvalid,
>>   input sample_in_tlast,
>>   (* dont_touch="true",mark_debug="true"*) output sample_in_tready,
>>   input sample_in_tuser,
>>   input sample_in_eob,
>>   (* dont_touch="true",mark_debug="true"*) output [31:0] sample_out_tdata,
>>   (* dont_touch="true",mark_debug="true"*) output sample_out_tvalid,
>>   input sample_out_tready,
>>   output sample_out_tlast
>> );
>>
>> Thanks,
>>
>> David
>>
>> On Fri, Dec 20, 2024 at 4:11 AM Martin Braun <martin.braun@ettus.com>
>> wrote:
>>
>>> David,
>>>
>>> is it possible that your block is deasserting tready on one of its
>>> inputs, thus delaying the DDC?
>>>
>>> --M
>>>
>>> On Fri, Dec 20, 2024 at 3:27 AM David <vitishlsfan21@gmail.com> wrote:
>>>
>>>> Martin,
>>>>
>>>> Thanks for the reply. I will take any modules you suggest for AXI
>>>> alignment, even if they do not "fix" my issue, it is good for me to look at.
>>>>
>>>> 1. thanks for the comment, this block is a long time coming.
>>>>
>>>> 2. We captured some screenshots of the ILA core recording both the
>>>> base image and my image. I was also able to add a dummy port on my image
>>>> and run *rx_samples_to_file* on that (because it was statically
>>>> connected), which confirmed that the multi_usrp method produces the
>>>> expected results, with/without my block in line.
>>>>
>>>> Below I present some screenshots of the behavior, where the ILA is
>>>> capturing the output of both DDCs *before* packetization. *What is not
>>>> shown is the multi_usrp method running with my block, but it has the same
>>>> behavior as the base image:*
>>>>
>>>> *Base Image, with rx_samples_to_file (multi_usrp)*
>>>> Example 1: zoomed in run
>>>> [image: base_image_zoomed.PNG]
>>>> Example 2: different run, zoomed out. both DDCs perform as expected:
>>>> [image: base_image_zoomed_out.PNG]
>>>>
>>>>
>>>> *Custom Image, with davids_rx_to_file (ddc_block_controller)*
>>>> Example 1: random distance between samples on both DDCs, clear on DDC1.
>>>> The last 4 valids have a big change in cycle distance.
>>>> [image: random_dist.PNG]
>>>> Example 2: a different run, same behavior as above and time tags.
>>>> [image: time_tags.PNG]
>>>> Example 3: A run where it "almost" worked, and my block also "almost
>>>> worked". You can see the alignment slips at the end:
>>>> [image: Timing_mostly_aligned.PNG]
>>>>
>>>>
>>>> 3. right now in the yaml I am using the named inputs with one port each:
>>>>
>>>> data:
>>>>   fpga_iface: axis_data
>>>>   clk_domain: rfnoc_chdr
>>>>   inputs:
>>>>     in_1:
>>>>       num_ports: 1
>>>>       ...
>>>>     in_2:
>>>>       num_ports: 1
>>>>       ...
>>>>
>>>> I have done some experiments with one named input with 2 ports, and I
>>>> see that the AXI handshake is one packet with two parallel streams. I will
>>>> try to "AXI align" as you suggested with this first:
>>>>
>>>> data:
>>>>   fpga_iface: axis_data
>>>>   clk_domain: rfnoc_chdr
>>>>   inputs:
>>>>     in:
>>>>       num_ports: 2
>>>>       ...
>>>>
>>>> 4. right now, since I want to issue the streaming command while doing
>>>> *record to file* and *transmit loopback*, I will start with the forwarding
>>>> policy as you suggested and also try to add my own issue stream command to
>>>> my block. It is not trivial for me since I am not a C++ person, so I won't
>>>> be able to provide much feedback on that effort.
>>>>
>>>> Thanks,
>>>>
>>>> David
>>>>
>>>> On Thu, Dec 19, 2024 at 3:24 AM Martin Braun <martin.braun@ettus.com>
>>>> wrote:
>>>>
>>>>> Hey David,
>>>>>
>>>>> this looks like you've gotten pretty far on a sophisticated project!
>>>>> I have a few observations:
>>>>>
>>>>> - At first glance, your C++ looks correct.
>>>>> - I would expect samples to arrive at your block synchronously based
>>>>> on that. However, maybe I'm forgetting something that would cause the
>>>>> outputs of the DDCs to misalign data by a few clock cycles. Which makes me
>>>>> wonder: Are you sure your input packets are misaligned? In RFNoC, we make
>>>>> no guarantee that the output of the DDC (or any other) block is aligned to
>>>>> the clock cycle, because we encode the timestamp with the data. Meaning
>>>>> that the first, actual sample that arrives at your block on each channel is
>>>>> in fact time-aligned, they just arrive a few clock cycles apart. This is
>>>>> the same logic that applies when packets arrive at the host computer, where
>>>>> we make no assumptions that they arrive at the exact same time.
>>>>> - If this is the issue, I think we have some modules you can use to
>>>>> actually align samples within your block, or you just do some AXI alignment
>>>>> yourself by combining the tready and tvalid signals of two streams.
>>>>> - Side note, although it's not important: I would consider it a best
>>>>> practice to have your block call
>>>>> set_action_forwarding_policy(forwarding_policy_t::ONE_TO_FAN, "stream_cmd")
>>>>> so it would properly forward stream commands, and then you can plop the
>>>>> stream command into the streamer.
>>>>>
>>>>> --M
>>>>>
RK
Rob Kossler
Tue, Dec 24, 2024 3:40 PM
Hi David,
Just to clarify, functions in the file multi_usrp.cpp are only used for
devices that don't support rfnoc. For any device that supports rfnoc, the
"make" function at the bottom of multi_usrp.cpp simply makes a
multi_usrp_rfnoc object when the user instantiates a multi_usrp object. So,
when you are using it with your device, it is using the functions from
multi_usrp_rfnoc rather than multi_usrp. If you change the UHD logging
level to trace (which may require a re-build), you will see that the rfnoc
api functions are being called (such as the ddc block controller
"issue_stream_cmd" shown below).
Rob
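The dispatch described above can be pictured with a small factory sketch. This is a simplified model under stated assumptions, not the code in multi_usrp.cpp: every toy_ class name, the boolean device_supports_rfnoc flag and the returned strings are hypothetical and exist only to show that one "make" entry point can hand back different implementations behind the same interface.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-ins; the real classes live in UHD and look different.
class toy_multi_usrp
{
public:
    using sptr = std::shared_ptr<toy_multi_usrp>;
    virtual ~toy_multi_usrp() = default;

    // Describes which path a stream command would take in this implementation.
    virtual std::string stream_cmd_path() const = 0;

    // Single entry point that picks the implementation for the device.
    static sptr make(bool device_supports_rfnoc);
};

// Models the implementation used for devices without RFNoC support.
class toy_multi_usrp_legacy : public toy_multi_usrp
{
public:
    std::string stream_cmd_path() const override
    {
        return "property tree";
    }
};

// Models the implementation used for devices with RFNoC support.
class toy_multi_usrp_rfnoc : public toy_multi_usrp
{
public:
    std::string stream_cmd_path() const override
    {
        return "ddc block controller";
    }
};

toy_multi_usrp::sptr toy_multi_usrp::make(bool device_supports_rfnoc)
{
    if (device_supports_rfnoc) {
        return std::make_shared<toy_multi_usrp_rfnoc>();
    }
    return std::make_shared<toy_multi_usrp_legacy>();
}
```

The caller only ever sees the toy_multi_usrp interface, so reading the non-RFNoC implementation tells you nothing about what runs on an RFNoC device; toy_multi_usrp::make(true)->stream_cmd_path() returns "ddc block controller" in this model.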
On Mon, Dec 23, 2024 at 5:55 PM David <vitishlsfan21@gmail.com> wrote:
> Rob,
>
> Thank you for your response. I was actually unaware of the
> multi_usrp_rfnoc class, and I see how it calls the same command. I now have
> an extra tool I can fiddle with after the holidays, plus the new FPGA debug
> images...
>
> I have been using the multi_usrp.cpp class as my working case, which came
> from the examples. It looks like it sets the stream property on the DDC
> directly, whereas the RFNoC methods call a method post_action(dst_edge,
> new_action). Still looking into it, which will take some time.
>
> // multi_usrp
> void issue_stream_cmd(const stream_cmd_t& stream_cmd, size_t chan) override
> {
>     if (chan != ALL_CHANS) {
>         _tree->access<stream_cmd_t>(rx_dsp_root(chan) / "stream_cmd").set(stream_cmd);
>         return;
>     }
>     for (size_t c = 0; c < get_rx_num_channels(); c++) {
>         issue_stream_cmd(stream_cmd, c);
>     }
> }
>
> // multi_usrp_rfnoc
> void issue_stream_cmd(const stream_cmd_t& stream_cmd, size_t chan = ALL_CHANS) override
> {
>     MUX_RX_API_CALL(issue_stream_cmd, stream_cmd);
>     auto& rx_chain = _get_rx_chan(chan);
>     if (rx_chain.ddc) {
>         rx_chain.ddc->issue_stream_cmd(stream_cmd, rx_chain.block_chan);
>     } else {
>         rx_chain.radio->issue_stream_cmd(stream_cmd, rx_chain.block_chan);
>     }
> }
>
> // ddc block controller
> void issue_stream_cmd(const uhd::stream_cmd_t& stream_cmd, const size_t port) override
> {
>     RFNOC_LOG_TRACE("issue_stream_cmd(stream_mode=" << char(stream_cmd.stream_mode)
>                     << ", port=" << port);
>     res_source_info dst_edge{res_source_info::OUTPUT_EDGE, port};
>     auto new_action = stream_cmd_action_info::make(stream_cmd.stream_mode);
>     new_action->stream_cmd = stream_cmd;
>     issue_stream_cmd_action_handler(dst_edge, new_action);
> }
>
> Thanks,
>
> David
>
> On Mon, Dec 23, 2024 at 8:28 AM Rob Kossler <rkossler@nd.edu> wrote:
>
>> Hi David,
>> Your email distinguishes between the multi_usrp API and the rfnoc API.
>> But, under the hood, the multi_usrp API
>> <https://github.com/EttusResearch/uhd/blob/master/host/lib/usrp/multi_usrp_rfnoc.cpp>
>> implements all of its functionality with the rfnoc API. So, it seems that
>> the multi_usrp implementation (using rfnoc API commands) is doing something
>> different than your own implementation (using rfnoc API commands). I
>> realize that this is not a very helpful comment but perhaps if you take a
>> closer look at the multi_usrp_rfnoc class, you might find something
>> different in the underlying commands.
>> Rob
>>
>> On Fri, Dec 20, 2024 at 5:10 PM David <vitishlsfan21@gmail.com> wrote:
>>
>>> Martin,
>>>
>>> I don't have waveform viewer screenshots yet of the inputs (working on
>>> it), but I have run the simulation with a packet delayed 500 clock cycles
>>> on one of my block's channels. I can see that my block "waits" for the
>>> second channel, which aligns the axi transaction. This is because my
>>> block is an HLS block that is data driven, and won't be ready unless it has
>>> both inputs. I verified the output data from the delayed packet simulation.
>>> Because of these factors, I think it is unlikely my block is deasserting
>>> tready in my FPGA images.
>>>
>>> Simulation output with a delayed packet on channel 1:
>>> [image: delayed_port1_packet.png]
>>>
>>>
>>> I also know the maximum sample rate we can run with on my block, and
>>> have done many tests to ensure that my block is consuming data fast enough
>>> so there are no overflows upstream.
>>>
>>> My understanding of how the RFNoC packets work is that the output of the
>>> DDC is filling a packet formed in the NoC shell, which is then released
>>> once the 64 samples are filled. You can see that the DDC0 and DDC1
>>> *tready* in all my debug screenshots is always asserted, even in the
>>> non-working cases. Likewise, on my block's input, tvalid from the noc shell
>>> is always asserted, while my block's tready drives the transaction.
>>>
>>> Where we are now, is that using the usrp and multi_usrp APIs, my block
>>> works as expected. When using RFNoC API, which sets the rate on the DDC and
>>> starts streaming, we get the problem behavior. Is it possible that DDC0 and
>>> DDC1 are not sampling correctly when I am using RFNoC API to set the rate
>>> and start streaming? I have seen a difference before between the APIs,
>>> where the multi_usrp was able to set the center frequency on the base
>>> image, and the RFNoC API kept the center frequency at 0 MHz.
>>>
>>> I don't understand why the clock distance between the tvalids on DDC0
>>> and DDC1 would change in my previous images, which only happens on the
>>> RFNoC API application. I would expect a ddc output to be equidistant based
>>> on the output sample rate. This is where the debugging is in the DDC blocks
>>> (uhd/fpga/usrp3/lib/rfnoc/ddc.v):
>>>
>>> //! RFNoC specific digital down-conversion chain
>>>
>>> module ddc #(
>>> parameter SR_FREQ_ADDR = 0,
>>> parameter SR_SCALE_IQ_ADDR = 1,
>>> parameter SR_DECIM_ADDR = 2,
>>> parameter SR_MUX_ADDR = 3,
>>> parameter SR_COEFFS_ADDR = 4,
>>> parameter PRELOAD_HBS = 1, // Preload half band filter state with 0s
>>> parameter NUM_HB = 3,
>>> parameter CIC_MAX_DECIM = 255,
>>> parameter SAMPLE_WIDTH = 16,
>>> parameter WIDTH = 24
>>> )(
>>> input clk, input reset,
>>> input clear, // Resets everything except the timed phase inc FIFO and
>>> phase inc
>>> input set_stb, input [7:0] set_addr, input [31:0] set_data,
>>> input timed_set_stb, input [7:0] timed_set_addr, input [31:0]
>>> timed_set_data,
>>> input [31:0] sample_in_tdata,
>>> input sample_in_tvalid,
>>> input sample_in_tlast,
>>> (* dont_touch="true",mark_debug="true"*) output sample_in_tready,
>>> input sample_in_tuser,
>>> input sample_in_eob,
>>> (* dont_touch="true",mark_debug="true"*) output [31:0] sample_out_tdata,
>>> (* dont_touch="true",mark_debug="true"*) output sample_out_tvalid,
>>> input sample_out_tready,
>>> output sample_out_tlast
>>> );
>>>
>>> Thanks,
>>>
>>> David
>>>
>>> On Fri, Dec 20, 2024 at 4:11 AM Martin Braun <martin.braun@ettus.com>
>>> wrote:
>>>
>>>> David,
>>>>
>>>> is it possible that your block is deasserting tready on one of its
>>>> inputs, thus delaying the DDC?
>>>>
>>>> --M
>>>>
>>>> On Fri, Dec 20, 2024 at 3:27 AM David <vitishlsfan21@gmail.com> wrote:
>>>>
>>>>> Martin,
>>>>>
>>>>> Thanks for the reply. I will take any modules you suggest for AXI
>>>>> alignment, even if they do not "fix" my issue, it is good for me to look at.
>>>>>
>>>>> 1. thanks for the comment, this block is a long time coming.
>>>>>
>>>>> 2. We captured some screen shots of the ILA core recording both the
>>>>> base image and my image. I also was able to add a dummy port on my image
>>>>> and run the *rx_samples_to_file *on that (because it was statically
>>>>> connected), which confirmed that the multi_usrp method produces the
>>>>> expected results, with/without my block in line:
>>>>>
>>>>> below I present some screenshots of the behavior, where the ILA is
>>>>> capturing the output of both DDCs *before* packetization.* What is
>>>>> not shown is the multi_usrp method running with my block, but it has the
>>>>> same behavior as the base image**:*
>>>>>
>>>>> *Base Image, with rx_samples_to_file (multi_usrp)*
>>>>> Example 1: zoomed in run
>>>>> [image: base_image_zoomed.PNG]
>>>>> Example 2: different run, zoomed out. both DDCs perform as expected:
>>>>> [image: base_image_zoomed_out.PNG]
>>>>>
>>>>>
>>>>> *Custom Image, with davids_rx_to_file (ddc_block_controller)*
>>>>> Example 1: random distance between samples on both DDCs, clear on
>>>>> DDC1. The last 4 valids have a big change in cycle distance.
>>>>> [image: random_dist.PNG]
>>>>> Example 2: a different run, same behavior as above and time tags.
>>>>> [image: time_tags.PNG]
>>>>> Example 3: A run where it "almost" worked, and my block also "almost
>>>>> worked". You can see the alignment slips at the end:
>>>>> [image: Timing_mostly_aligned.PNG]
>>>>>
>>>>>
>>>>> 3. right now in the yaml I am using the named inputs with one port
>>>>> each:
>>>>>
>>>>> data:
>>>>> fpga_iface: axis_data
>>>>> clk_domain: rfnoc_chdr
>>>>> inputs:
>>>>> in_1:
>>>>> num_ports: 1
>>>>> ...
>>>>> in_2:
>>>>> num_ports: 1
>>>>> ...
>>>>>
>>>>> I have done some experiments with one named input with 2 ports, and I
>>>>> see that the AXI handshake is one packet with two parallel streams. I will
>>>>> try to "AXI align" as you suggested with this first:
>>>>> data:
>>>>> fpga_iface: axis_data
>>>>> clk_domain: rfnoc_chdr
>>>>> inputs:
>>>>> in:
>>>>> num_ports: 2
>>>>> ...
>>>>>
>>>>> 4. right now, since I want to issue the streaming command while doing *record
>>>>> to file* and *transmit loopback*, I will start with the forwarding
>>>>> policy as you suggested and also try to add my own issue stream command to
>>>>> my block. It is not trivial for me since I am not a C++ person, so I won't
>>>>> be able to provide much feedback on that effort.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> David
>>>>>
>>>>> On Thu, Dec 19, 2024 at 3:24 AM Martin Braun <martin.braun@ettus.com>
>>>>> wrote:
>>>>>
>>>>>> Hey David,
>>>>>>
>>>>>> this looks like you've gotten pretty far on a sophisticated project!
>>>>>> I have a few observations:
>>>>>>
>>>>>> - At first glance, your C++ looks correct.
>>>>>> - I would expect samples to arrive at your block synchronously based
>>>>>> on that. However, maybe I'm forgetting something that would cause the
>>>>>> outputs of the DDCs to misalign data by a few clock cycles. Which makes me
>>>>>> wonder: Are you sure your input packets are misaligned? In RFNoC, we make
>>>>>> no guarantee that the output of the DDC (or any other) block is aligned to
>>>>>> the clock cycle, because we encode the timestamp with the data. Meaning
>>>>>> that the first, actual sample that arrives at your block on each channel is
>>>>>> in fact time-aligned, they just arrive a few clock cycles apart. This is
>>>>>> the same logic that applies when packets arrive at the host computer, where
>>>>>> we make no assumptions that they arrive at the exact same time.
>>>>>> - If this is the issue, I think we have some modules you can use to
>>>>>> actually align samples within your block, or you just do some AXI alignment
>>>>>> yourself by combining the tready and tvalid signals of two streams.
>>>>>> - Side note, although it's not important: I would consider it a best
>>>>>> practice to have your block call
>>>>>> set_action_forwarding_policy(forwarding_policy_t::ONE_TO_FAN, "stream_cmd")
>>>>>> so it would properly forward stream commands, and then you can plop the
>>>>>> stream command into the streamer.
>>>>>>
>>>>>> --M
>>>>>>
>>>>>> On Sun, Dec 15, 2024 at 10:49 PM David <vitishlsfan21@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hello all,
>>>>>>>
>>>>>>> I apologize in advance for data dumping. I have made a 2 input/1
>>>>>>> output RFNoC block that requires repeatable synchronized DDC starts. My
>>>>>>> current method of starting the DDC is not working as desired.
>>>>>>>
>>>>>>> *Question - **How can I correctly start both DDC's so samples are
>>>>>>> available on the same clock cycle, similar to the rx_samples_to_file, while
>>>>>>> still using my 2 in/1 out RFNoC block? *
>>>>>>> I would like to focus the conversation on my C++ implementation for
>>>>>>> now. All my simulations have convinced me my block is consuming AXI-Stream
>>>>>>> data correctly.
>>>>>>>
>>>>>>> *Problem*
>>>>>>> When starting two DDCs with timed commands sent to DDC in my C++
>>>>>>> application, I am not getting the same result as the
>>>>>>> rx_samples_to_file.cpp... rx_samples_to_file has repeatable alignment, and
>>>>>>> mine has random. This has led me to believe the problem is in my
>>>>>>> application and not my block. My Vivado simulations show my block is able
>>>>>>> to consume the AXI-Stream transactions in parallel as I expect.
>>>>>>>
>>>>>>> Considering sampling noise from a sig gen that is split to both
>>>>>>> inputs, I see the following behavior:
>>>>>>> rx_samples_to_file (base image) davids_samples_to_file (custom
>>>>>>> image)
>>>>>>> DDC A samples ... X_1 Y_1 Z_1 ... DDC A samples ... X_1 Y_1 Z_1 ...
>>>>>>> DDC B samples ... X_2 Y_2 Z_2 ... DDC B samples X_2 Y_2 Z_2 ... ...
>>>>>>>
>>>>>>> *sample_1 is not equal to sample_2, but over a large number of
>>>>>>> samples they will correlate well.
>>>>>>>
>>>>>>> In the above example, the noise correlates as expected, but it is
>>>>>>> delayed by 1 sample. When using my application, I have seen no delay
>>>>>>> (desired), and also delay in the range of 5 samples.
>>>>>>>
>>>>>>> *C++ Implementation*
>>>>>>> [image: image.png]
>>>>>>>
>>>>>>> I am using* uhd::rfnoc::ddc_block_control* types to issue the
>>>>>>> stream command because I was having issues with my block propagating.
>>>>>>> Issuing to the DDCs lets the data flow from 2 inputs to the 1 output, where
>>>>>>> the output is either a file or loopback to transmit.
>>>>>>>
>>>>>>> The base image with rx_samples_to_file uses a multi_usrp type, which
>>>>>>> propagates the stream command from the rx_streamer.
>>>>>>>
>>>>>>> *RFNoC laydown*
>>>>>>>
>>>>>>> [image: image.png]
>>>>>>>
>>>>>>> Data flows in both Tx loopback configuration and Rx to file
>>>>>>> configuration.
>>>>>>>
>>>>>>> *Methods and Symptoms*
>>>>>>> I have two methods of measuring the synchronization, with data
>>>>>>> collected by ILA cores at either the output of DDC or input of custom
>>>>>>> block:
>>>>>>>
>>>>>>> 1. *Math: *When receiving correlated noise, I can measure the
>>>>>>> cross correlation and show that the correlation peaks as expected, and show
>>>>>>> the delay between channels in samples.
>>>>>>> 2. *Vivado Waveform Viewer*: When the ILA cores are collecting
>>>>>>> DDC channel data, I can see that the base image samples are available on
>>>>>>> the same clock. My image does not have that behavior.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> USRP-users mailing list -- usrp-users@lists.ettus.com
>>>>>>> To unsubscribe send an email to usrp-users-leave@lists.ettus.com
>>>>>>>
D
David
Wed, Dec 25, 2024 3:10 AM
Rob,
Excellent information! I see the distinction now, thank you. I also was
able to see the trace logs of issue_stream_cmd calls after recompiling UHD
with the -DUHD_LOG_MIN_LEVEL=0. Nothing obviously illuminating yet, but I
will be able to use it in the future.
Thanks,
David
On Tue, Dec 24, 2024 at 7:40 AM Rob Kossler rkossler@nd.edu wrote:
Hi David,
Just to clarify, functions in the file multi_usrp.cpp are only used for
devices that don't support rfnoc. For any device that supports rfnoc, the
"make" function at the bottom of multi_usrp.cpp simply makes a
multi_usrp_rfnoc object when the user instantiates a multi_usrp object. So,
when you are using it with your device, it is using the functions from
multi_usrp_rfnoc rather than multi_usrp. If you change the UHD logging
level to trace (which may require a re-build), you will see that the rfnoc
api functions are being called (such as the ddc block controller
"issue_stream_cmd" shown below).
Rob
On Mon, Dec 23, 2024 at 5:55 PM David vitishlsfan21@gmail.com wrote:
Rob,
Thank you for your response, I was actually unaware of the
multi_usrp_rfnoc class, and I see how it calls the same command. I now have
an extra tool I can fiddle with after the holidays, plus the new FPGA debug
images...
I have been using the multi_usrp.cpp class as my working case, which came
from the examples. It looks like it sets the stream property on the ddc
directly, whereas the RFNoC methods call a method post_action(dst_edge,
new_action). Still looking into it, which will take some time.
// multi_usrp
void issue_stream_cmd(const stream_cmd_t& stream_cmd, size_t chan) override
{
    if (chan != ALL_CHANS) {
        _tree->access<stream_cmd_t>(rx_dsp_root(chan) / "stream_cmd").set(stream_cmd);
        return;
    }
    for (size_t c = 0; c < get_rx_num_channels(); c++) {
        issue_stream_cmd(stream_cmd, c);
    }
}
// multi_usrp_rfnoc
void issue_stream_cmd(const stream_cmd_t& stream_cmd, size_t chan = ALL_CHANS) override
{
    MUX_RX_API_CALL(issue_stream_cmd, stream_cmd);
    auto& rx_chain = _get_rx_chan(chan);
    if (rx_chain.ddc) {
        rx_chain.ddc->issue_stream_cmd(stream_cmd, rx_chain.block_chan);
    } else {
        rx_chain.radio->issue_stream_cmd(stream_cmd, rx_chain.block_chan);
    }
}
// ddc block controller
void issue_stream_cmd(const uhd::stream_cmd_t& stream_cmd, const size_t port) override
{
    RFNOC_LOG_TRACE("issue_stream_cmd(stream_mode=" << char(stream_cmd.stream_mode)
                                                    << ", port=" << port);
    res_source_info dst_edge{res_source_info::OUTPUT_EDGE, port};
    auto new_action = stream_cmd_action_info::make(stream_cmd.stream_mode);
    new_action->stream_cmd = stream_cmd;
    issue_stream_cmd_action_handler(dst_edge, new_action);
}
Thanks,
David
On Mon, Dec 23, 2024 at 8:28 AM Rob Kossler rkossler@nd.edu wrote:
Hi David,
Your email distinguishes between the multi_usrp API and the rfnoc API.
But, under the hood, the multi_usrp API
https://github.com/EttusResearch/uhd/blob/master/host/lib/usrp/multi_usrp_rfnoc.cpp
implements all of its functionality with the rfnoc API. So, it seems that
the multi_usrp implementation (using rfnoc API commands) is doing something
different than your own implementation (using rfnoc API commands). I
realize that this is not a very helpful comment but perhaps if you take a
closer look at the multi_usrp_rfnoc class, you might find something
different in the underlying commands.
Rob
On Fri, Dec 20, 2024 at 5:10 PM David vitishlsfan21@gmail.com wrote:
Martin,
I don't have waveform viewer screenshots yet of the inputs (working on
it), but I have run the simulation with a packet delayed 500 clock cycles
on one of my block's channels. I can see that my block "waits" for the
second channel, which aligns the axi transaction. This is because my
block is an HLS block that is data driven, and won't be ready unless it has
both inputs. I verified the output data from the delayed packet simulation.
Because of these factors, I think it is unlikely my block is deasserting
tready in my FPGA images.
Simulation output with a delayed packet on channel 1:
[image: delayed_port1_packet.png]
I also know the maximum sample rate we can run with on my block, and
have done many tests to ensure that my block is consuming data fast enough
so there are no overflows upstream.
My understanding of how the RFNoC packets work is that the output of
the DDC is filling a packet formed in the NoC shell, which is then released
once the 64 samples are filled. You can see that the DDC0 and DDC1
tready in all my debug screenshots is always asserted, even in the
non-working cases. Likewise, on my block's input, tvalid from the noc shell
is always asserted, while my block's tready drives the transaction.
Where we are now, is that using the usrp and multi_usrp APIs, my block
works as expected. When using RFNoC API, which sets the rate on the DDC and
starts streaming, we get the problem behavior. Is it possible that DDC0 and
DDC1 are not sampling correctly when I am using RFNoC API to set the rate
and start streaming? I have seen a difference before between the APIs,
where the multi_usrp was able to set the center frequency on the base
image, and the RFNoC API kept the center frequency at 0 MHz.
I don't understand why the clock distance between the tvalids on DDC0
and DDC1 would change in my previous images, which only happens on the
RFNoC API application. I would expect a ddc output to be equidistant based
on the output sample rate. This is where the debugging is in the DDC blocks
(uhd/fpga/usrp3/lib/rfnoc/ddc.v):
//! RFNoC specific digital down-conversion chain
module ddc #(
  parameter SR_FREQ_ADDR = 0,
  parameter SR_SCALE_IQ_ADDR = 1,
  parameter SR_DECIM_ADDR = 2,
  parameter SR_MUX_ADDR = 3,
  parameter SR_COEFFS_ADDR = 4,
  parameter PRELOAD_HBS = 1, // Preload half band filter state with 0s
  parameter NUM_HB = 3,
  parameter CIC_MAX_DECIM = 255,
  parameter SAMPLE_WIDTH = 16,
  parameter WIDTH = 24
)(
  input clk, input reset,
  input clear, // Resets everything except the timed phase inc FIFO and phase inc
  input set_stb, input [7:0] set_addr, input [31:0] set_data,
  input timed_set_stb, input [7:0] timed_set_addr, input [31:0] timed_set_data,
  input [31:0] sample_in_tdata,
  input sample_in_tvalid,
  input sample_in_tlast,
  (* dont_touch="true",mark_debug="true"*) output sample_in_tready,
  input sample_in_tuser,
  input sample_in_eob,
  (* dont_touch="true",mark_debug="true"*) output [31:0] sample_out_tdata,
  (* dont_touch="true",mark_debug="true"*) output sample_out_tvalid,
  input sample_out_tready,
  output sample_out_tlast
);
Thanks,
David
On Fri, Dec 20, 2024 at 4:11 AM Martin Braun martin.braun@ettus.com
wrote:
David,
is it possible that your block is deasserting tready on one of its
inputs, thus delaying the DDC?
--M
On Fri, Dec 20, 2024 at 3:27 AM David vitishlsfan21@gmail.com wrote:
Martin,
Thanks for the reply. I will take any modules you suggest for AXI
alignment, even if they do not "fix" my issue, it is good for me to look at.
- thanks for the comment, this block is a long time coming.
- We captured some screenshots of the ILA core recording both the
base image and my image. I was also able to add a dummy port on my image
and run rx_samples_to_file on that (because it was statically
connected), which confirmed that the multi_usrp method produces the
expected results, with/without my block in line:
Below I present some screenshots of the behavior, where the ILA is
capturing the output of both DDCs before packetization. What is
not shown is the multi_usrp method running with my block, but it has the
same behavior as the base image:
Base Image, with rx_samples_to_file (multi_usrp)
Example 1: zoomed in run
[image: base_image_zoomed.PNG]
Example 2: different run, zoomed out. both DDCs perform as expected:
[image: base_image_zoomed_out.PNG]
Custom Image, with davids_rx_to_file (ddc_block_controller)
Example 1: random distance between samples on both DDCs, clear on
DDC1. The last 4 valids have a big change in cycle distance.
[image: random_dist.PNG]
Example 2: a different run, same behavior as above and time tags.
[image: time_tags.PNG]
Example 3: A run where it "almost" worked, and my block also "almost
worked". You can see the alignment slips at the end:
[image: Timing_mostly_aligned.PNG]
- right now in the yaml I am using the named inputs with one port
each:
data:
  fpga_iface: axis_data
  clk_domain: rfnoc_chdr
  inputs:
    in_1:
      num_ports: 1
      ...
    in_2:
      num_ports: 1
      ...
I have done some experiments with one named input with 2 ports, and I
see that the AXI handshake is one packet with two parallel streams. I will
try to "AXI align" as you suggested with this first:
data:
  fpga_iface: axis_data
  clk_domain: rfnoc_chdr
  inputs:
    in:
      num_ports: 2
      ...
- right now, since I want to issue the streaming command while doing record
to file and transmit loopback, I will start with the forwarding
policy as you suggested and also try to add my own issue stream command to
my block. It is not trivial for me since I am not a C++ person, so I won't
be able to provide much feedback on that effort.
Thanks,
David
On Thu, Dec 19, 2024 at 3:24 AM Martin Braun martin.braun@ettus.com
wrote:
Hey David,
this looks like you've gotten pretty far on a sophisticated
project! I have a few observations:
- At first glance, your C++ looks correct.
- I would expect samples to arrive at your block synchronously based
on that. However, maybe I'm forgetting something that would cause the
outputs of the DDCs to misalign data by a few clock cycles. Which makes me
wonder: Are you sure your input packets are misaligned? In RFNoC, we make
no guarantee that the output of the DDC (or any other) block is aligned to
the clock cycle, because we encode the timestamp with the data. Meaning
that the first, actual sample that arrives at your block on each channel is
in fact time-aligned, they just arrive a few clock cycles apart. This is
the same logic that applies when packets arrive at the host computer, where
we make no assumptions that they arrive at the exact same time.
- If this is the issue, I think we have some modules you can use to
actually align samples within your block, or you just do some AXI alignment
yourself by combining the tready and tvalid signals of two streams.
- Side note, although it's not important: I would consider it a best
practice to have your block call
set_action_forwarding_policy(forwarding_policy_t::ONE_TO_FAN, "stream_cmd")
so it would properly forward stream commands, and then you can plop the
stream command into the streamer.
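
A rough software sketch of the "AXI alignment" idea in the third bullet, in case it helps picture it. This is only a cycle-by-cycle model in plain C++, not RFNoC or UHD code: cycle_in, make_skewed and join_streams are made-up names, and the FIFOs are unbounded here, whereas real logic would use bounded FIFOs and deassert tready when they fill. It assumes both streams deliver their samples in order, so pairing the heads of the two FIFOs pairs samples with the same index.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>
#include <vector>

// One clock cycle of arrivals. has_a / has_b model the upstream tvalid of
// each input on that cycle; a / b are the sample values being offered.
struct cycle_in {
    bool has_a;
    int a;
    bool has_b;
    int b;
};

// Hypothetical stimulus: n samples per stream, with stream B arriving
// `skew` cycles after stream A. A carries 0..n-1, B carries 100..100+n-1.
std::vector<cycle_in> make_skewed(int n, int skew)
{
    assert(n >= 0 && skew >= 0);
    std::vector<cycle_in> cycles;
    for (int t = 0; t < n + skew; ++t) {
        const bool has_a = t < n;
        const bool has_b = t >= skew;
        cycles.push_back({has_a, has_a ? t : 0, has_b, has_b ? 100 + (t - skew) : 0});
    }
    return cycles;
}

// Model of a two-input join: a pair is consumed only on cycles where both
// FIFOs hold data, i.e. each input's tready is gated by the other's tvalid.
std::vector<std::pair<int, int>> join_streams(const std::vector<cycle_in>& cycles)
{
    std::deque<int> fifo_a, fifo_b;
    std::vector<std::pair<int, int>> out;
    for (const auto& c : cycles) {
        if (c.has_a) fifo_a.push_back(c.a);
        if (c.has_b) fifo_b.push_back(c.b);
        if (!fifo_a.empty() && !fifo_b.empty()) {
            out.emplace_back(fifo_a.front(), fifo_b.front());
            fifo_a.pop_front();
            fifo_b.pop_front();
        }
    }
    return out;
}
```

Calling join_streams(make_skewed(8, 5)) pairs sample k of stream A with sample k of stream B even though B shows up five cycles late, which is the point of the bullet: arrival skew between the two inputs does not have to turn into sample skew inside the block.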
--M
On Sun, Dec 15, 2024 at 10:49 PM David vitishlsfan21@gmail.com
wrote:
Hello all,
I apologize in advance for data dumping. I have made a 2 input/1
output RFNoC block that requires repeatable synchronized DDC starts. My
current method of starting the DDC is not working as desired.
*Question* - How can I correctly start both DDCs so samples are
available on the same clock cycle, similar to the rx_samples_to_file, while
still using my 2 in/1 out RFNoC block?
I would like to focus the conversation on my C++ implementation for
now. All my simulations have convinced me my block is consuming AXI-Stream
data correctly.
Problem
When starting two DDCs with timed commands sent to DDC in my C++
application, I am not getting the same result as the
rx_samples_to_file.cpp... rx_samples_to_file has repeatable alignment, and
mine has random. This has led me to believe the problem is in my
application and not my block. My Vivado simulations show my block is able
to consume the AXI-Stream transactions in parallel as I expect.
Considering sampling noise from a sig gen that is split to both
inputs, I see the following behavior:
rx_samples_to_file (base image)        davids_samples_to_file (custom image)
DDC A samples ... X_1 Y_1 Z_1 ...      DDC A samples ... X_1 Y_1 Z_1 ...
DDC B samples ... X_2 Y_2 Z_2 ...      DDC B samples X_2 Y_2 Z_2 ... ...
*sample_1 is not equal to sample_2, but over a large number of
samples they will correlate well.
In the above example, the noise correlates as expected, but it is
delayed by 1 sample. When using my application, I have seen no delay
(desired), and also delay in the range of 5 samples.
C++ Implementation
[image: image.png]
I am using uhd::rfnoc::ddc_block_control types to issue the
stream command because I was having issues with my block propagating.
Issuing to the DDCs lets the data flow from 2 inputs to the 1 output, where
the output is either a file or loopback to transmit.
The base image with rx_samples_to_file uses a multi_usrp type,
which propagates the stream command from the rx_streamer.
RFNoC laydown
[image: image.png]
Data flows in both Tx loopback configuration and Rx to file
configuration.
Methods and Symptoms
I have two methods of measuring the synchronization, with data
collected by ILA cores at either the output of DDC or input of custom
block:
1. *Math: *When receiving correlated noise, I can measure the
cross correlation and show that the correlation peaks as expected, and show
the delay between channels in samples.
2. *Vivado Waveform Viewer*: When the ILA cores are collecting
DDC channel data, I can see that the base image samples are available on
the same clock. My image does not have that behavior.
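
For reference, a minimal sketch of the kind of measurement meant by method 1, assuming both channels are captured as equal-length complex float buffers. The function and helper names (estimate_lag, make_noise, delayed) are illustrative only and are not part of UHD. It is a brute-force search over a small lag window rather than an FFT-based correlation, which is enough for delays of a few samples.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <cstdint>
#include <vector>

using cfloat = std::complex<float>;

// Returns the lag in [-max_lag, +max_lag] at which |sum a[i] * conj(b[i + lag])|
// is largest. A positive result means b is delayed relative to a.
int estimate_lag(const std::vector<cfloat>& a, const std::vector<cfloat>& b, int max_lag)
{
    assert(a.size() == b.size());
    assert(max_lag >= 0 && static_cast<std::size_t>(max_lag) < a.size());
    const int n = static_cast<int>(a.size());
    int best_lag = 0;
    float best_power = -1.0f;
    for (int lag = -max_lag; lag <= max_lag; ++lag) {
        cfloat acc(0.0f, 0.0f);
        for (int i = 0; i < n; ++i) {
            const int j = i + lag;
            if (j < 0 || j >= n) continue; // skip samples outside the capture
            acc += a[i] * std::conj(b[j]);
        }
        const float power = std::norm(acc); // squared magnitude of the sum
        if (power > best_power) {
            best_power = power;
            best_lag = lag;
        }
    }
    return best_lag;
}

// Hypothetical stimulus: deterministic pseudo-random complex noise from a
// simple linear congruential generator, roughly uniform in [-1, 1).
std::vector<cfloat> make_noise(std::size_t n, std::uint32_t seed)
{
    std::vector<cfloat> x(n);
    auto next = [&seed]() {
        seed = seed * 1664525u + 1013904223u;
        return static_cast<float>((seed >> 16) & 0xFFFFu) / 32768.0f - 1.0f;
    };
    for (auto& v : x) {
        const float re = next();
        const float im = next();
        v = cfloat(re, im);
    }
    return x;
}

// Returns x delayed by d samples, zero-filled at the start, same length.
std::vector<cfloat> delayed(const std::vector<cfloat>& x, std::size_t d)
{
    std::vector<cfloat> y(x.size(), cfloat(0.0f, 0.0f));
    for (std::size_t i = d; i < x.size(); ++i) y[i] = x[i - d];
    return y;
}
```

With real captures, a and b would be the DDC A and DDC B buffers; a result of 0 corresponds to the aligned (base image) case, and a nonzero result is the per-run channel delay in samples.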
Thanks,
David
USRP-users mailing list -- usrp-users@lists.ettus.com
To unsubscribe send an email to usrp-users-leave@lists.ettus.com
Rob,
Excellent information! I see the distinction now, thank you. I also was
able to see the trace logs of issue_stream_cmd calls after recompiling UHD
with the -DUHD_LOG_MIN_LEVEL=0. Nothing obviously illuminating yet, but I
will be able to use it in the future.
Thanks,
David
On Tue, Dec 24, 2024 at 7:40 AM Rob Kossler <rkossler@nd.edu> wrote:
> Hi David,
> Just to clarify, functions in the file multi_usrp.cpp are only used for
> devices that don't support rfnoc. For any device that supports rfnoc, the
> "make" function at the bottom of multi_usrp.cpp simply makes a
> multi_usrp_rfnoc object when the user instantiates a multi_usrp object. So,
> when you are using it with your device, it is using the functions from
> multi_usrp_rfnoc rather than multi_usrp. If you change the UHD logging
> level to trace (which may require a re-build), you will see that the rfnoc
> api functions are being called (such as the ddc block controller
> "issue_stream_cmd" shown below).
> Rob
>
> On Mon, Dec 23, 2024 at 5:55 PM David <vitishlsfan21@gmail.com> wrote:
>
>> Rob,
>>
>> Thank you for your response, I was actually unaware of the
>> multi_usrp_rfnoc class, and I see how it calls the same command. I now have
>> an extra tool I can fiddle with after the holidays, plus the new FPGA debug
>> images...
>>
>> I have been using the multi_usrp.cpp class as my working case, which came
>> from the examples. It looks like it sets the stream property on the ddc
>> directly, whereas the RFNoC methods call a method post_action(dst_edge,
>> new_action). Still looking into it, which will take some time.
>>
>> // multi_usrp
>> void issue_stream_cmd(const stream_cmd_t& stream_cmd, size_t chan) override
>> {
>>     if (chan != ALL_CHANS) {
>>         _tree->access<stream_cmd_t>(rx_dsp_root(chan) / "stream_cmd").set(stream_cmd);
>>         return;
>>     }
>>     for (size_t c = 0; c < get_rx_num_channels(); c++) {
>>         issue_stream_cmd(stream_cmd, c);
>>     }
>> }
>>
>> // multi_usrp_rfnoc
>> void issue_stream_cmd(const stream_cmd_t& stream_cmd, size_t chan = ALL_CHANS) override
>> {
>>     MUX_RX_API_CALL(issue_stream_cmd, stream_cmd);
>>     auto& rx_chain = _get_rx_chan(chan);
>>     if (rx_chain.ddc) {
>>         rx_chain.ddc->issue_stream_cmd(stream_cmd, rx_chain.block_chan);
>>     } else {
>>         rx_chain.radio->issue_stream_cmd(stream_cmd, rx_chain.block_chan);
>>     }
>> }
>>
>>
>> // ddc block controller
>> void issue_stream_cmd(const uhd::stream_cmd_t& stream_cmd, const size_t port) override
>> {
>>     RFNOC_LOG_TRACE("issue_stream_cmd(stream_mode=" << char(stream_cmd.stream_mode)
>>                                                     << ", port=" << port);
>>     res_source_info dst_edge{res_source_info::OUTPUT_EDGE, port};
>>     auto new_action = stream_cmd_action_info::make(stream_cmd.stream_mode);
>>     new_action->stream_cmd = stream_cmd;
>>     issue_stream_cmd_action_handler(dst_edge, new_action);
>> }
>>
>> Thanks,
>>
>> David
>>
>> On Mon, Dec 23, 2024 at 8:28 AM Rob Kossler <rkossler@nd.edu> wrote:
>>
>>> Hi David,
>>> Your email distinguishes between the multi_usrp API and the rfnoc API.
>>> But, under the hood, the multi_usrp API
>>> <https://github.com/EttusResearch/uhd/blob/master/host/lib/usrp/multi_usrp_rfnoc.cpp>
>>> implements all of its functionality with the rfnoc API. So, it seems that
>>> the multi_usrp implementation (using rfnoc API commands) is doing something
>>> different than your own implementation (using rfnoc API commands). I
>>> realize that this is not a very helpful comment but perhaps if you take a
>>> closer look at the multi_usrp_rfnoc class, you might find something
>>> different in the underlying commands.
>>> Rob
>>>
>>> On Fri, Dec 20, 2024 at 5:10 PM David <vitishlsfan21@gmail.com> wrote:
>>>
>>>> Martin,
>>>>
>>>> I don't have waveform viewer screenshots yet of the inputs (working on
>>>> it), but I have run the simulation with a packet delayed 500 clock cycles
>>>> on one of my block's channels. I can see that my block "waits" for the
>>>> second channel, which aligns the axi transaction. This is because my
>>>> block is an HLS block that is data driven, and won't be ready unless it has
>>>> both inputs. I verified the output data from the delayed packet simulation.
>>>> Because of these factors, I think it is unlikely my block is deasserting
>>>> tready in my FPGA images.
>>>>
>>>> Simulation output with a delayed packet on channel 1:
>>>> [image: delayed_port1_packet.png]
>>>>
>>>>
>>>> I also know the maximum sample rate we can run with on my block, and
>>>> have done many tests to ensure that my block is consuming data fast enough
>>>> so there are no overflows upstream.
>>>>
>>>> My understanding of how the RFNoC packets work is that the output of
>>>> the DDC is filling a packet formed in the NoC shell, which is then released
>>>> once the 64 samples are filled. You can see that the DDC0 and DDC1
>>>> *tready* in all my debug screenshots is always asserted, even in the
>>>> non-working cases. Likewise, on my block's input, tvalid from the noc shell
>>>> is always asserted, while my block's tready drives the transaction.
>>>>
>>>> Where we are now, is that using the usrp and multi_usrp APIs, my block
>>>> works as expected. When using RFNoC API, which sets the rate on the DDC and
>>>> starts streaming, we get the problem behavior. Is it possible that DDC0 and
>>>> DDC1 are not sampling correctly when I am using RFNoC API to set the rate
>>>> and start streaming? I have seen a difference before between the APIs,
>>>> where the multi_usrp was able to set the center frequency on the base
>>>> image, and the RFNoC API kept the center frequency at 0 MHz.
>>>>
>>>> I don't understand why the clock distance between the tvalids on DDC0
>>>> and DDC1 would change in my previous images, which only happens on the
>>>> RFNoC API application. I would expect a ddc output to be equidistant based
>>>> on the output sample rate. This is where the debugging is in the DDC blocks
>>>> (uhd/fpga/usrp3/lib/rfnoc/ddc.v):
>>>>
>>>> //! RFNoC specific digital down-conversion chain
>>>>
>>>> module ddc #(
>>>> parameter SR_FREQ_ADDR = 0,
>>>> parameter SR_SCALE_IQ_ADDR = 1,
>>>> parameter SR_DECIM_ADDR = 2,
>>>> parameter SR_MUX_ADDR = 3,
>>>> parameter SR_COEFFS_ADDR = 4,
>>>> parameter PRELOAD_HBS = 1, // Preload half band filter state with 0s
>>>> parameter NUM_HB = 3,
>>>> parameter CIC_MAX_DECIM = 255,
>>>> parameter SAMPLE_WIDTH = 16,
>>>> parameter WIDTH = 24
>>>> )(
>>>> input clk, input reset,
>>>> input clear, // Resets everything except the timed phase inc FIFO and
>>>> phase inc
>>>> input set_stb, input [7:0] set_addr, input [31:0] set_data,
>>>> input timed_set_stb, input [7:0] timed_set_addr, input [31:0]
>>>> timed_set_data,
>>>> input [31:0] sample_in_tdata,
>>>> input sample_in_tvalid,
>>>> input sample_in_tlast,
>>>> (* dont_touch="true",mark_debug="true"*) output sample_in_tready,
>>>> input sample_in_tuser,
>>>> input sample_in_eob,
>>>> (* dont_touch="true",mark_debug="true"*) output [31:0]
>>>> sample_out_tdata,
>>>> (* dont_touch="true",mark_debug="true"*) output sample_out_tvalid,
>>>> input sample_out_tready,
>>>> output sample_out_tlast
>>>> );
>>>>
>>>> Thanks,
>>>>
>>>> David
>>>>
>>>> On Fri, Dec 20, 2024 at 4:11 AM Martin Braun <martin.braun@ettus.com>
>>>> wrote:
>>>>
>>>>> David,
>>>>>
>>>>> is it possible that your block is deasserting tready on one of its
>>>>> inputs, thus delaying the DDC?
>>>>>
>>>>> --M
>>>>>
>>>>> On Fri, Dec 20, 2024 at 3:27 AM David <vitishlsfan21@gmail.com> wrote:
>>>>>
>>>>>> Martin,
>>>>>>
>>>>>> Thanks for the reply. I will take any modules you suggest for AXI
>>>>>> alignment, even if they do not "fix" my issue, it is good for me to look at.
>>>>>>
>>>>>> 1. thanks for the comment, this block is a long time coming.
>>>>>>
>>>>>> 2. We captured some screenshots of the ILA core recording both the
>>>>>> base image and my image. I was also able to add a dummy port on my image
>>>>>> and run *rx_samples_to_file* on that (because it was statically
>>>>>> connected), which confirmed that the multi_usrp method produces the
>>>>>> expected results, with/without my block in line:
>>>>>>
>>>>>> Below I present some screenshots of the behavior, where the ILA is
>>>>>> capturing the output of both DDCs *before* packetization. What is
>>>>>> not shown is the multi_usrp method running with my block, but it has the
>>>>>> same behavior as the base image:
>>>>>>
>>>>>> *Base Image, with rx_samples_to_file (multi_usrp)*
>>>>>> Example 1: zoomed in run
>>>>>> [image: base_image_zoomed.PNG]
>>>>>> Example 2: different run, zoomed out. both DDCs perform as expected:
>>>>>> [image: base_image_zoomed_out.PNG]
>>>>>>
>>>>>>
>>>>>> *Custom Image, with davids_rx_to_file (ddc_block_controller)*
>>>>>> Example 1: random distance between samples on both DDCs, clear on
>>>>>> DDC1. The last 4 valids have a big change in cycle distance.
>>>>>> [image: random_dist.PNG]
>>>>>> Example 2: a different run, same behavior as above and time tags.
>>>>>> [image: time_tags.PNG]
>>>>>> Example 3: A run where it "almost" worked, and my block also "almost
>>>>>> worked". You can see the alignment slips at the end:
>>>>>> [image: Timing_mostly_aligned.PNG]
>>>>>>
>>>>>>
>>>>>> 3. right now in the yaml I am using the named inputs with one port
>>>>>> each:
>>>>>>
>>>>>> data:
>>>>>>   fpga_iface: axis_data
>>>>>>   clk_domain: rfnoc_chdr
>>>>>>   inputs:
>>>>>>     in_1:
>>>>>>       num_ports: 1
>>>>>>       ...
>>>>>>     in_2:
>>>>>>       num_ports: 1
>>>>>>       ...
>>>>>>
>>>>>> I have done some experiments with one named input with 2 ports, and I
>>>>>> see that the AXI handshake is one packet with two parallel streams. I will
>>>>>> try to "AXI align" as you suggested with this first:
>>>>>>
>>>>>> data:
>>>>>>   fpga_iface: axis_data
>>>>>>   clk_domain: rfnoc_chdr
>>>>>>   inputs:
>>>>>>     in:
>>>>>>       num_ports: 2
>>>>>>       ...
>>>>>>
>>>>>> 4. right now, since I want to issue the streaming command while doing *record
>>>>>> to file* and *transmit loopback*, I will start with the forwarding
>>>>>> policy as you suggested and also try to add my own issue stream command to
>>>>>> my block. It is not trivial for me since I am not a C++ person, so I won't
>>>>>> be able to provide much feedback on that effort.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> David
>>>>>>
>>>>>> On Thu, Dec 19, 2024 at 3:24 AM Martin Braun <martin.braun@ettus.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hey David,
>>>>>>>
>>>>>>> this looks like you've gotten pretty far on a sophisticated
>>>>>>> project! I have a few observations:
>>>>>>>
>>>>>>> - At first glance, your C++ looks correct.
>>>>>>> - I would expect samples to arrive at your block synchronously based
>>>>>>> on that. However, maybe I'm forgetting something that would cause the
>>>>>>> outputs of the DDCs to misalign data by a few clock cycles. Which makes me
>>>>>>> wonder: Are you sure your input packets are misaligned? In RFNoC, we make
>>>>>>> no guarantee that the output of the DDC (or any other) block is aligned to
>>>>>>> the clock cycle, because we encode the timestamp with the data. Meaning
>>>>>>> that the first, actual sample that arrives at your block on each channel is
>>>>>>> in fact time-aligned, they just arrive a few clock cycles apart. This is
>>>>>>> the same logic that applies when packets arrive at the host computer, where
>>>>>>> we make no assumptions that they arrive at the exact same time.
>>>>>>> - If this is the issue, I think we have some modules you can use to
>>>>>>> actually align samples within your block, or you just do some AXI alignment
>>>>>>> yourself by combining the tready and tvalid signals of two streams.
>>>>>>> - Side note, although it's not important: I would consider it a best
>>>>>>> practice to have your block call
>>>>>>> set_action_forwarding_policy(forwarding_policy_t::ONE_TO_FAN, "stream_cmd")
>>>>>>> so it would properly forward stream commands, and then you can plop the
>>>>>>> stream command into the streamer.
>>>>>>>
>>>>>>> --M
>>>>>>>
>>>>>>> On Sun, Dec 15, 2024 at 10:49 PM David <vitishlsfan21@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hello all,
>>>>>>>>
>>>>>>>> I apologize in advance for data dumping. I have made a 2 input/1
>>>>>>>> output RFNoC block that requires repeatable synchronized DDC starts. My
>>>>>>>> current method of starting the DDC is not working as desired.
>>>>>>>>
>>>>>>>> *Question - **How can I correctly start both DDC's so samples are
>>>>>>>> available on the same clock cycle, similar to the rx_samples_to_file, while
>>>>>>>> still using my 2 in/1 out RFNoC block? *
>>>>>>>> I would like to focus the conversation on my C++ implementation for
>>>>>>>> now. All my simulations have convinced me my block is consuming AXI-Stream
>>>>>>>> data correctly.
>>>>>>>>
>>>>>>>> *Problem*
>>>>>>>> When starting two DDCs with timed commands sent to DDC in my C++
>>>>>>>> application, I am not getting the same result as the
>>>>>>>> rx_samples_to_file.cpp... rx_samples_to_file has repeatable alignment, and
>>>>>>>> mine has random. This has led me to believe the problem is in my
>>>>>>>> application and not my block. My Vivado simulations show my block is able
>>>>>>>> to consume the AXI-Stream transactions in parallel as I expect.
>>>>>>>>
>>>>>>>> Considering sampling noise from a sig gen that is split to both
>>>>>>>> inputs, I see the following behavior:
>>>>>>>> rx_samples_to_file (base image) davids_samples_to_file (custom
>>>>>>>> image)
>>>>>>>> DDC A samples ... X_1 Y_1 Z_1 ... DDC A samples ... X_1 Y_1 Z_1 ...
>>>>>>>> DDC B samples ... X_2 Y_2 Z_2 ... DDC B samples X_2 Y_2 Z_2 ... ...
>>>>>>>>
>>>>>>>> *sample_1 is not equal to sample_2, but over a large number of
>>>>>>>> samples they will correlate well.
>>>>>>>>
>>>>>>>> In the above example, the noise correlates as expected, but it is
>>>>>>>> delayed by 1 sample. When using my application, I have seen no delay
>>>>>>>> (desired), and also delay in the range of 5 samples.
>>>>>>>>
>>>>>>>> *C++ Implementation*
>>>>>>>> [image: image.png]
>>>>>>>>
>>>>>>>> I am using* uhd::rfnoc::ddc_block_control* types to issue the
>>>>>>>> stream command because I was having issues with my block propagating.
>>>>>>>> Issuing to the DDCs lets the data flow from 2 inputs to the 1 output, where
>>>>>>>> the output is either a file or loopback to transmit.
>>>>>>>>
>>>>>>>> The base image with rx_samples_to_file uses a multi_usrp type,
>>>>>>>> which propagates the stream command from the rx_streamer.
>>>>>>>>
>>>>>>>> *RFNoC laydown*
>>>>>>>>
>>>>>>>> [image: image.png]
>>>>>>>>
>>>>>>>> Data flows in both Tx loopback configuration and Rx to file
>>>>>>>> configuration.
>>>>>>>>
>>>>>>>> *Methods and Symptoms*
>>>>>>>> I have two methods of measuring the synchronization, with data
>>>>>>>> collected by ILA cores at either the output of DDC or input of custom
>>>>>>>> block:
>>>>>>>>
>>>>>>>> 1. *Math: *When receiving correlated noise, I can measure the
>>>>>>>> cross correlation and show that the correlation peaks as expected, and show
>>>>>>>> the delay between channels in samples.
>>>>>>>> 2. *Vivado Waveform Viewer*: When the ILA cores are collecting
>>>>>>>> DDC channel data, I can see that the base image samples are available on
>>>>>>>> the same clock. My image does not have that behavior.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> David
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> USRP-users mailing list -- usrp-users@lists.ettus.com
>>>>>>>> To unsubscribe send an email to usrp-users-leave@lists.ettus.com
>>>>>>>>
D
David
Thu, Jan 9, 2025 3:41 AM
Rob and Martin,
I ended up resolving the problem, thank you for the help. It was Occam's
razor... When I originally copied most of rx_samples_to_file.cpp into my
custom davids_samples_to_file.cpp, I left in the line:
"stream_cmd.stream_now = rx_stream->get_num_channels() == 1;"
(rx_samples_to_file.cpp line 203)
This expression always evaluated to *true* in my 2 input/1 output case,
since the rx_streamer only sees the single output channel, and it needed to
be *false* so the timed command would work. Reading the spec and looking at
the ddc block HDL, I can see that this caused my issues.
For completeness, here are the final ILA waveform viewers of before and
after, when the DDCs start:
Before (stream_now = true)
[image: image.png]
After (stream_now = false)
[image: image.png]
Thanks,
David
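The fix described above can be sketched without UHD. In this sketch, StartCmd is a hypothetical stand-in for the two uhd::stream_cmd_t fields under discussion (the stream_now flag and a start time), and make_start_cmd is an assumed helper, not a UHD function. It only illustrates why the copied expression misfires: the streamer reports one channel, while two DDCs upstream need a common start time.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the stream command fields discussed above.
struct StartCmd {
    bool stream_now;   // true: start as soon as the command arrives
    double start_time; // seconds on the device clock; used only when !stream_now
};

// The expression copied from rx_samples_to_file.cpp, keyed on streamer channels.
inline bool stream_now_from_streamer(std::size_t streamer_channels) {
    return streamer_channels == 1;
}

// Assumed helper: decide based on how many sources must start together.
inline StartCmd make_start_cmd(std::size_t num_sources,
                               double device_time_now,
                               double lead_time) {
    assert(num_sources >= 1);
    assert(lead_time > 0.0);
    StartCmd cmd{};
    cmd.stream_now = (num_sources == 1);
    // A timed start gives every source the same future timestamp.
    cmd.start_time = cmd.stream_now ? 0.0 : device_time_now + lead_time;
    return cmd;
}
```

With a 2 input/1 output block, stream_now_from_streamer(1) is true even though two DDCs are involved, whereas make_start_cmd(2, now, lead) yields a timed command that both DDCs can share.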
On Tue, Dec 24, 2024 at 7:10 PM David vitishlsfan21@gmail.com wrote:
Rob,
Excellent information! I see the distinction now, thank you. I also was
able to see the trace logs of issue_stream_cmd calls after recompiling UHD
with the -DUHD_LOG_MIN_LEVEL=0. Nothing obviously illuminating yet, but I
will be able to use it in the future.
Thanks,
David
On Tue, Dec 24, 2024 at 7:40 AM Rob Kossler rkossler@nd.edu wrote:
Hi David,
Just to clarify, functions in the file multi_usrp.cpp are only used for
devices that don't support rfnoc. For any device that supports rfnoc, the
"make" function at the bottom of multi_usrp.cpp simply makes a
multi_usrp_rfnoc object when the user instantiates a multi_usrp object. So,
when you are using it with your device, it is using the functions from
multi_usrp_rfnoc rather than multi_usrp. If you change the UHD logging
level to trace (which may require a re-build), you will see that the rfnoc
api functions are being called (such as the ddc block controller
"issue_stream_cmd" shown below).
Rob
On Mon, Dec 23, 2024 at 5:55 PM David vitishlsfan21@gmail.com wrote:
Rob,
Thank you for your response, I was actually unaware of the
multi_usrp_rfnoc class, and I see how it calls the same command. I now have
an extra tool I can fiddle with after the holidays, plus the new FPGA debug
images...
I have been using the multi_usrp.cpp class as my working case, which
came from the examples. It looks like it sets the stream property on the
ddc directly, whereas the RFNoC methods call a method post_action(dst_edge,
new_action). Still looking into it, which will take some time.
// multi_usrp
void issue_stream_cmd(const stream_cmd_t& stream_cmd, size_t chan) override
{
    if (chan != ALL_CHANS) {
        _tree->access<stream_cmd_t>(rx_dsp_root(chan) / "stream_cmd").set(stream_cmd);
        return;
    }
    for (size_t c = 0; c < get_rx_num_channels(); c++) {
        issue_stream_cmd(stream_cmd, c);
    }
}

// multi_usrp_rfnoc
void issue_stream_cmd(const stream_cmd_t& stream_cmd, size_t chan = ALL_CHANS) override
{
    MUX_RX_API_CALL(issue_stream_cmd, stream_cmd);
    auto& rx_chain = _get_rx_chan(chan);
    if (rx_chain.ddc) {
        rx_chain.ddc->issue_stream_cmd(stream_cmd, rx_chain.block_chan);
    } else {
        rx_chain.radio->issue_stream_cmd(stream_cmd, rx_chain.block_chan);
    }
}

// ddc block controller
void issue_stream_cmd(const uhd::stream_cmd_t& stream_cmd, const size_t port) override
{
    RFNOC_LOG_TRACE("issue_stream_cmd(stream_mode=" << char(stream_cmd.stream_mode)
                    << ", port=" << port);
    res_source_info dst_edge{res_source_info::OUTPUT_EDGE, port};
    auto new_action = stream_cmd_action_info::make(stream_cmd.stream_mode);
    new_action->stream_cmd = stream_cmd;
    issue_stream_cmd_action_handler(dst_edge, new_action);
}
Thanks,
David
On Mon, Dec 23, 2024 at 8:28 AM Rob Kossler rkossler@nd.edu wrote:
Hi David,
Your email distinguishes between the multi_usrp API and the rfnoc API.
But, under the hood, the multi_usrp API
https://github.com/EttusResearch/uhd/blob/master/host/lib/usrp/multi_usrp_rfnoc.cpp
implements all of its functionality with the rfnoc API. So, it seems that
the multi_usrp implementation (using rfnoc API commands) is doing something
different than your own implementation (using rfnoc API commands). I
realize that this is not a very helpful comment but perhaps if you take a
closer look at the multi_usrp_rfnoc class, you might find something
different in the underlying commands.
Rob
On Fri, Dec 20, 2024 at 5:10 PM David vitishlsfan21@gmail.com wrote:
Martin,
I don't have waveform viewer screenshots yet of the inputs (working on
it), but I have run the simulation with a packet delayed 500 clock cycles
on one of my block's channels. I can see that my block "waits" for the
second channel, which aligns the axi transaction. This is because my
block is an HLS block that is data driven, and won't be ready unless it has
both inputs. I verified the output data from the delayed packet simulation.
Because of these factors, I think it is unlikely my block is deasserting
tready in my FPGA images.
Simulation output with a delayed packet on channel 1:
[image: delayed_port1_packet.png]
I also know the maximum sample rate we can run with on my block, and
have done many tests to ensure that my block is consuming data fast enough
so there are no overflows upstream.
My understanding of how the RFNoC packets work is that the output of
the DDC is filling a packet formed in the NoC shell, which is then released
once the 64 samples are filled. You can see that the DDC0 and DDC1
tready in all my debug screenshots is always asserted, even in the
non-working cases. Likewise, on my block's input, tvalid from the NoC shell
is always asserted, while my block's tready drives the transaction.
Where we are now, is that using the usrp and multi_usrp APIs, my block
works as expected. When using RFNoC API, which sets the rate on the DDC and
starts streaming, we get the problem behavior. Is it possible that DDC0 and
DDC1 are not sampling correctly when I am using RFNoC API to set the rate
and start streaming? I have seen a difference before between the APIs,
where the multi_usrp was able to set the center frequency on the base
image, and the RFNoC API kept the center frequency at 0 MHz.
I don't understand why the clock distance between the tvalids on DDC0
and DDC1 would change in my previous images, which only happens on the
RFNoC API application. I would expect a ddc output to be equidistant based
on the output sample rate. This is where the debugging is in the DDC blocks
(uhd/fpga/usrp3/lib/rfnoc/ddc.v):
//! RFNoC specific digital down-conversion chain
module ddc #(
parameter SR_FREQ_ADDR = 0,
parameter SR_SCALE_IQ_ADDR = 1,
parameter SR_DECIM_ADDR = 2,
parameter SR_MUX_ADDR = 3,
parameter SR_COEFFS_ADDR = 4,
parameter PRELOAD_HBS = 1, // Preload half band filter state with 0s
parameter NUM_HB = 3,
parameter CIC_MAX_DECIM = 255,
parameter SAMPLE_WIDTH = 16,
parameter WIDTH = 24
)(
input clk, input reset,
input clear, // Resets everything except the timed phase inc FIFO and phase inc
input set_stb, input [7:0] set_addr, input [31:0] set_data,
input timed_set_stb, input [7:0] timed_set_addr, input [31:0] timed_set_data,
input [31:0] sample_in_tdata,
input sample_in_tvalid,
input sample_in_tlast,
(* dont_touch="true",mark_debug="true"*) output sample_in_tready,
input sample_in_tuser,
input sample_in_eob,
(* dont_touch="true",mark_debug="true"*) output [31:0] sample_out_tdata,
(* dont_touch="true",mark_debug="true"*) output sample_out_tvalid,
input sample_out_tready,
output sample_out_tlast
);
Thanks,
David
On Fri, Dec 20, 2024 at 4:11 AM Martin Braun martin.braun@ettus.com
wrote:
David,
is it possible that your block is deasserting tready on one of its
inputs, thus delaying the DDC?
--M
On Fri, Dec 20, 2024 at 3:27 AM David vitishlsfan21@gmail.com
wrote:
Martin,
Thanks for the reply. I will take any modules you suggest for AXI
alignment, even if they do not "fix" my issue, it is good for me to look at.
- thanks for the comment, this block is a long time coming.
- We captured some screenshots of the ILA core recording both the
base image and my image. I was also able to add a dummy port on my image
and run rx_samples_to_file on that (because it was statically
connected), which confirmed that the multi_usrp method produces the
expected results, with/without my block in line:
Below I present some screenshots of the behavior, where the ILA is
capturing the output of both DDCs before packetization. What is
not shown is the multi_usrp method running with my block, but it has the
same behavior as the base image:
Base Image, with rx_samples_to_file (multi_usrp)
Example 1: zoomed in run
[image: base_image_zoomed.PNG]
Example 2: different run, zoomed out. both DDCs perform as expected:
[image: base_image_zoomed_out.PNG]
Custom Image, with davids_rx_to_file (ddc_block_controller)
Example 1: random distance between samples on both DDCs, clear on
DDC1. The last 4 valids have a big change in cycle distance.
[image: random_dist.PNG]
Example 2: a different run, same behavior as above and time tags.
[image: time_tags.PNG]
Example 3: A run where it "almost" worked, and my block also "almost
worked". You can see the alignment slips at the end:
[image: Timing_mostly_aligned.PNG]
- right now in the yaml I am using the named inputs with one port
each:
data:
  fpga_iface: axis_data
  clk_domain: rfnoc_chdr
  inputs:
    in_1:
      num_ports: 1
      ...
    in_2:
      num_ports: 1
      ...
I have done some experiments with one named input with 2 ports, and I
see that the AXI handshake is one packet with two parallel streams. I will
try to "AXI align" as you suggested with this first:
data:
  fpga_iface: axis_data
  clk_domain: rfnoc_chdr
  inputs:
    in:
      num_ports: 2
      ...
- right now, since I want to issue the streaming command while
doing record to file and transmit loopback, I will start with
the forwarding policy as you suggested and also try to add my own issue
stream command to my block. It is not trivial for me since I am not a C++
person, so I won't be able to provide much feedback on that effort.
Thanks,
David
On Thu, Dec 19, 2024 at 3:24 AM Martin Braun martin.braun@ettus.com
wrote:
Hey David,
this looks like you've gotten pretty far on a sophisticated
project! I have a few observations:
- At first glance, your C++ looks correct.
- I would expect samples to arrive at your block synchronously
based on that. However, maybe I'm forgetting something that would cause the
outputs of the DDCs to misalign data by a few clock cycles. Which makes me
wonder: Are you sure your input packets are misaligned? In RFNoC, we make
no guarantee that the output of the DDC (or any other) block is aligned to
the clock cycle, because we encode the timestamp with the data. Meaning
that the first, actual sample that arrives at your block on each channel is
in fact time-aligned, they just arrive a few clock cycles apart. This is
the same logic that applies when packets arrive at the host computer, where
we make no assumptions that they arrive at the exact same time.
- If this is the issue, I think we have some modules you can use to
actually align samples within your block, or you just do some AXI alignment
yourself by combining the tready and tvalid signals of two streams.
- Side note, although it's not important: I would consider it a
best practice to have your block call
set_action_forwarding_policy(forwarding_policy_t::ONE_TO_FAN, "stream_cmd")
so it would properly forward stream commands, and then you can plop the
stream command into the streamer.
--M
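The "combine the tready and tvalid signals of two streams" suggestion above can be modelled in a few lines. This is a cycle-level C++ sketch of a generic two-input AXI-Stream join, not one of the Ettus modules Martin mentions; the struct and function names are made up for illustration. Each input is only accepted when the other input is valid and downstream is ready, so both streams advance in lock step.

```cpp
#include <cassert>

// Handshake inputs seen by the join during one clock cycle.
struct JoinInputs {
    bool a_tvalid;   // upstream A presents a beat
    bool b_tvalid;   // upstream B presents a beat
    bool out_tready; // downstream can accept the combined beat
};

// Handshake outputs driven by the join during the same cycle.
struct JoinOutputs {
    bool a_tready;   // accept the beat from A
    bool b_tready;   // accept the beat from B
    bool out_tvalid; // combined beat is valid
};

// Combinational join: output is valid only when both inputs are valid,
// and each input is accepted only when the other one can move as well.
inline JoinOutputs axis_join(const JoinInputs& in) {
    JoinOutputs out{};
    out.out_tvalid = in.a_tvalid && in.b_tvalid;
    out.a_tready   = in.out_tready && in.b_tvalid;
    out.b_tready   = in.out_tready && in.a_tvalid;
    return out;
}

// Exhaustive check over all 8 input combinations: A, B and the output
// either all transfer a beat in a given cycle or none of them does.
inline bool join_is_lockstep() {
    for (int bits = 0; bits < 8; ++bits) {
        const JoinInputs in{(bits & 1) != 0, (bits & 2) != 0, (bits & 4) != 0};
        const JoinOutputs o = axis_join(in);
        const bool a_fire   = in.a_tvalid && o.a_tready;
        const bool b_fire   = in.b_tvalid && o.b_tready;
        const bool out_fire = o.out_tvalid && in.out_tready;
        if (a_fire != b_fire || a_fire != out_fire) {
            return false;
        }
    }
    return true;
}
```

A join like this removes arrival skew between the two inputs, but it does not fix a start-time mismatch: if the two DDCs begin at different timestamps, the paired samples are still offset in time.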
On Sun, Dec 15, 2024 at 10:49 PM David vitishlsfan21@gmail.com
wrote:
Hello all,
I apologize in advance for data dumping. I have made a 2 input/1
output RFNoC block that requires repeatable synchronized DDC starts. My
current method of starting the DDC is not working as desired.
*Question - **How can I correctly start both DDC's so samples are
available on the same clock cycle, similar to the rx_samples_to_file, while
still using my 2 in/1 out RFNoC block? *
I would like to focus the conversation on my C++ implementation
for now. All my simulations have convinced me my block is consuming
AXI-Stream data correctly.
Problem
When starting two DDCs with timed commands sent to DDC in my C++
application, I am not getting the same result as the
rx_samples_to_file.cpp... rx_samples_to_file has repeatable alignment, and
mine has random. This has led me to believe the problem is in my
application and not my block. My Vivado simulations show my block is able
to consume the AXI-Stream transactions in parallel as I expect.
Considering sampling noise from a sig gen that is split to both
inputs, I see the following behavior:
rx_samples_to_file (base image) davids_samples_to_file (custom
image)
DDC A samples ... X_1 Y_1 Z_1 ... DDC A samples ... X_1 Y_1 Z_1
...
DDC B samples ... X_2 Y_2 Z_2 ... DDC B samples X_2 Y_2 Z_2 ...
...
*sample_1 is not equal to sample_2, but over a large number of
samples they will correlate well.
In the above example, the noise correlates as expected, but it is
delayed by 1 sample. When using my application, I have seen no delay
(desired), and also delay in the range of 5 samples.
C++ Implementation
[image: image.png]
I am using* uhd::rfnoc::ddc_block_control* types to issue the
stream command because I was having issues with my block propagating.
Issuing to the DDCs lets the data flow from 2 inputs to the 1 output, where
the output is either a file or loopback to transmit.
The base image with rx_samples_to_file uses a multi_usrp type,
which propagates the stream command from the rx_streamer.
RFNoC laydown
[image: image.png]
Data flows in both Tx loopback configuration and Rx to file
configuration.
Methods and Symptoms
I have two methods of measuring the synchronization, with data
collected by ILA cores at either the output of DDC or input of custom
block:
1. *Math: *When receiving correlated noise, I can measure the
cross correlation and show that the correlation peaks as expected, and show
the delay between channels in samples.
2. *Vivado Waveform Viewer*: When the ILA cores are collecting
DDC channel data, I can see that the base image samples are available on
the same clock. My image does not have that behavior.
Thanks,
David
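The "Math" method above (locating the cross-correlation peak to read off the inter-channel delay) can be sketched as follows. This is a minimal illustration, not the code used in the thread: peak_lag is a hypothetical name, and it works on real-valued samples for brevity, whereas DDC output is complex IQ and would need a conjugate product.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the lag (in samples) at which the cross-correlation of a and b
// peaks, searching lags in [-max_lag, max_lag]. A positive result means
// b is delayed relative to a by that many samples.
inline int peak_lag(const std::vector<double>& a,
                    const std::vector<double>& b,
                    int max_lag) {
    assert(a.size() == b.size());
    assert(max_lag >= 0 && static_cast<std::size_t>(max_lag) < a.size());
    const int n = static_cast<int>(a.size());
    int best_lag = 0;
    double best_sum = 0.0;
    bool first = true;
    for (int lag = -max_lag; lag <= max_lag; ++lag) {
        double sum = 0.0;
        for (int i = 0; i < n; ++i) {
            const int j = i + lag; // index into b for this lag
            if (j >= 0 && j < n) {
                sum += a[i] * b[j];
            }
        }
        if (first || sum > best_sum) {
            best_sum = sum;
            best_lag = lag;
            first = false;
        }
    }
    return best_lag;
}
```

For two captures of the same split noise source, a result of 0 corresponds to the aligned case, and a nonzero result gives the delay between channels in samples.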
On Tue, Dec 24, 2024 at 7:10 PM David <vitishlsfan21@gmail.com> wrote:
> Rob,
>
> Excellent information! I see the distinction now, thank you. I also was
> able to see the trace logs of issue_stream_cmd calls after recompiling UHD
> with the -DUHD_LOG_MIN_LEVEL=0. Nothing obviously illuminating yet, but I
> will be able to use it in the future.
>
> Thanks,
>
> David
>
> On Tue, Dec 24, 2024 at 7:40 AM Rob Kossler <rkossler@nd.edu> wrote:
>
>> Hi David,
>> Just to clarify, functions in the file multi_usrp.cpp are only used for
>> devices that don't support rfnoc. For any device that supports rfnoc, the
>> "make" function at the bottom of multi_usrp.cpp simply makes a
>> multi_usrp_rfnoc object when the user instantiates a multi_usrp object. So,
>> when you are using it with your device, it is using the functions from
>> multi_usrp_rfnoc rather than multi_usrp. If you change the UHD logging
>> level to trace (which may require a re-build), you will see that the rfnoc
>> api functions are being called (such as the ddc block controller
>> "issue_stream_cmd" shown below).
>> Rob
>>
>> On Mon, Dec 23, 2024 at 5:55 PM David <vitishlsfan21@gmail.com> wrote:
>>
>>> Rob,
>>>
>>> Thank you for your response, I was actually unaware of the
>>> mutli_usrp_rfnoc class, and I see how it calls the same command. I now have
>>> an extra tool I can fiddle with after the holidays, plus the new FPGA debug
>>> images...
>>>
>>> I have been using the multi_usrp.cpp class as my working case, which
>>> came from the examples. It looks like it sets the stream property on the
>>> ddc directly, whereas the RFNoC methods call a method post_action(dst_edge,
>>> new_action). Still looking into it, which will take some time.
>>>
>>> // multi_usrp
>>> void issue_stream_cmd(const stream_cmd_t& stream_cmd, size_t chan)
>>> override
>>> {
>>> if (chan != ALL_CHANS) {
>>> _tree->access<stream_cmd_t>(rx_dsp_root(chan) / "stream_cmd").set
>>> (stream_cmd);
>>> return;
>>> }
>>> for (size_t c = 0; c < get_rx_num_channels(); c++) {
>>> issue_stream_cmd(stream_cmd, c);
>>> }
>>> }
>>>
>>> // multi_usrp_rfnoc
>>> void issue_stream_cmd(const stream_cmd_t& stream_cmd, size_t chan =
>>> ALL_CHANS) override
>>> {
>>> MUX_RX_API_CALL(issue_stream_cmd, stream_cmd);
>>> auto& rx_chain = _get_rx_chan(chan);
>>> if (rx_chain.ddc) {
>>> rx_chain.ddc->issue_stream_cmd(stream_cmd, rx_chain.block_chan);
>>> } else {
>>> rx_chain.radio->issue_stream_cmd(stream_cmd, rx_chain.block_chan);
>>> }
>>> }
>>>
>>>
>>> // ddc block controller
>>> void issue_stream_cmd(const uhd::stream_cmd_t& stream_cmd, const size_t
>>> port) override
>>> {
>>> RFNOC_LOG_TRACE("issue_stream_cmd(stream_mode=" << char(stream_cmd.
>>> stream_mode)
>>> << ", port=" << port);
>>> res_source_info dst_edge{res_source_info::OUTPUT_EDGE, port};
>>> auto new_action = stream_cmd_action_info::make(stream_cmd.stream_mode);
>>> new_action->stream_cmd = stream_cmd;
>>> issue_stream_cmd_action_handler(dst_edge, new_action);
>>> }
>>>
>>> Thanks,
>>>
>>> David
>>>
>>> On Mon, Dec 23, 2024 at 8:28 AM Rob Kossler <rkossler@nd.edu> wrote:
>>>
>>>> Hi David,
>>>> Your email distinguishes between the multi_usrp API and the rfnoc API.
>>>> But, under the hood, the multi_usrp API
>>>> <https://github.com/EttusResearch/uhd/blob/master/host/lib/usrp/multi_usrp_rfnoc.cpp>
>>>> implements all of its functionality with the rfnoc API. So, it seems that
>>>> the multi_usrp implementation (using rfnoc API commands) is doing something
>>>> different than your own implementation (using rfnoc API commands). I
>>>> realize that this is not a very helpful comment but perhaps if you take a
>>>> closer look at the multi_usrp_rfnoc class, you might find something
>>>> different in the underlying commands.
>>>> Rob
>>>>
>>>> On Fri, Dec 20, 2024 at 5:10 PM David <vitishlsfan21@gmail.com> wrote:
>>>>
>>>>> Martin,
>>>>>
>>>>> I don't have waveform viewer screenshots yet of the inputs (working on
>>>>> it), but I have run the simulation with a packet delayed 500 clock cycles
>>>>> on one of my block's channels. I can see that my block "waits" for the
>>>>> second channel, which aligns the axi transaction. This is because my
>>>>> block is an HLS block that is data driven, and won't be ready unless it has
>>>>> both inputs. I verified the output data from the delayed packet simulation.
>>>>> Because of these factors, I think it is unlikely my block is deasserting
>>>>> tready in my FPGA images.
>>>>>
>>>>> Simulation output with a delayed packet on channel 1:
>>>>> [image: delayed_port1_packet.png]
>>>>>
>>>>>
>>>>> I also know the maximum sample rate we can run with on my block, and
>>>>> have done many tests to ensure that my block is consuming data fast enough
>>>>> so there are no overflows upstream.
>>>>>
>>>>> My understanding of how the RFNoC packets work is that the output of
>>>>> the DDC is filling a packet formed in the NoC shell, which is then released
>>>>> once the 64 samples are filled. You can see that the DDC0 and DDC1
>>>>> *tready* in all my debug screenshots is always asserted, even in the
>>>>> non-working cases. Likewise, on my blocks input, tvalid from the noc shell
>>>>> is always asserted, while my blocks tready drives the transaction.
>>>>>
>>>>> Where we are now, is that using the usrp and multi_usrp APIs, my block
>>>>> works as expected. When using RFNoC API, which sets the rate on the DDC and
>>>>> starts streaming, we get the problem behavior. Is it possible that DDC0 and
>>>>> DDC1 are not sampling correctly when I am using RFNoC API to set the rate
>>>>> and start streaming? I have seen a difference before between the APIs,
>>>>> where the multi_usrp was able to set the center frequency on the base
>>>>> image, and the RFNoC API kept the center frequency at 0 MHz.
>>>>>
>>>>> I don't understand why the clock distance between the tvalids on DDC0
>>>>> and DDC1 would change in my previous images, which only happens on the
>>>>> RFNoC API application. I would expect a ddc output to be equidistant based
>>>>> on the output sample rate. This is where the debugging is in the DDC blocks
>>>>> (uhd/fpga/usrp3/lib/rfnoc/ddc.v):
>>>>>
>>>>> //! RFNoC specific digital down-conversion chain
>>>>>
>>>>> module ddc #(
>>>>> parameter SR_FREQ_ADDR = 0,
>>>>> parameter SR_SCALE_IQ_ADDR = 1,
>>>>> parameter SR_DECIM_ADDR = 2,
>>>>> parameter SR_MUX_ADDR = 3,
>>>>> parameter SR_COEFFS_ADDR = 4,
>>>>> parameter PRELOAD_HBS = 1, // Preload half band filter state with 0s
>>>>> parameter NUM_HB = 3,
>>>>> parameter CIC_MAX_DECIM = 255,
>>>>> parameter SAMPLE_WIDTH = 16,
>>>>> parameter WIDTH = 24
>>>>> )(
>>>>> input clk, input reset,
>>>>> input clear, // Resets everything except the timed phase inc FIFO and
>>>>> phase inc
>>>>> input set_stb, input [7:0] set_addr, input [31:0] set_data,
>>>>> input timed_set_stb, input [7:0] timed_set_addr, input [31:0]
>>>>> timed_set_data,
>>>>> input [31:0] sample_in_tdata,
>>>>> input sample_in_tvalid,
>>>>> input sample_in_tlast,
>>>>> (* dont_touch="true",mark_debug="true"*) output sample_in_tready,
>>>>> input sample_in_tuser,
>>>>> input sample_in_eob,
>>>>> (* dont_touch="true",mark_debug="true"*) output [31:0]
>>>>> sample_out_tdata,
>>>>> (* dont_touch="true",mark_debug="true"*) output sample_out_tvalid,
>>>>> input sample_out_tready,
>>>>> output sample_out_tlast
>>>>> );
>>>>>
>>>>> Thanks,
>>>>>
>>>>> David
>>>>>
>>>>> On Fri, Dec 20, 2024 at 4:11 AM Martin Braun <martin.braun@ettus.com>
>>>>> wrote:
>>>>>
>>>>>> David,
>>>>>>
>>>>>> is it possible that your block is deasserting tready on one of its
>>>>>> inputs, thus delaying the DDC?
>>>>>>
>>>>>> --M
>>>>>>
>>>>>> On Fri, Dec 20, 2024 at 3:27 AM David <vitishlsfan21@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Martin,
>>>>>>>
>>>>>>> Thanks for the reply. I will take any modules you suggest for AXI
>>>>>>> alignment, even if they do not "fix" my issue, it is good for me to look at.
>>>>>>>
>>>>>>> 1. thanks for the comment, this block is a long time coming.
>>>>>>>
>>>>>>> 2. We captured some screen shots of the ILA core recording both the
>>>>>>> base image and my image. I also was able to add a dummy port on my image
>>>>>>> and run the *rx_samples_to_file *on that (because it was statically
>>>>>>> connected), which confirmed that the multi_usrp method producing the
>>>>>>> expected results, with/without my block in line:
>>>>>>>
>>>>>>> below I present some screenshots of the behavior, where the ILA is
>>>>>>> capturing the output of both DDCs *before* packetization.* What is
>>>>>>> not shown is the multi_usrp method running with my block, but it has the
>>>>>>> same behavior as the base image**:*
>>>>>>>
>>>>>>> *Base Image, with rx_samples_to_file (multi_usrp)*
>>>>>>> Example 1: zoomed in run
>>>>>>> [image: base_image_zoomed.PNG]
>>>>>>> Example 2: different run, zoomed out. both DDCs perform as expected:
>>>>>>> [image: base_image_zoomed_out.PNG]
>>>>>>>
>>>>>>>
>>>>>>> *Custom Image, with davids_rx_to_file (ddc_block_controller)*
>>>>>>> Example 1: random distance between samples on both DDCs, clear on
>>>>>>> DDC1. The last 4 valids have a big change in cycle distance.
>>>>>>> [image: random_dist.PNG]
>>>>>>> Example 2: a different run, same behavior as above and time tags.
>>>>>>> [image: time_tags.PNG]
>>>>>>> Example 3: A run where it "almost" worked, and my block also "almost
>>>>>>> worked". You can see the alignment slips at the end:
>>>>>>> [image: Timing_mostly_aligned.PNG]
>>>>>>>
>>>>>>>
>>>>>>> 3. right now in the yaml I am using the named inputs with one port
>>>>>>> each:
>>>>>>>
>>>>>>> data:
>>>>>>>   fpga_iface: axis_data
>>>>>>>   clk_domain: rfnoc_chdr
>>>>>>>   inputs:
>>>>>>>     in_1:
>>>>>>>       num_ports: 1
>>>>>>>       ...
>>>>>>>     in_2:
>>>>>>>       num_ports: 1
>>>>>>>       ...
>>>>>>>
>>>>>>> I have done some experiments with one named input with 2 ports, and I
>>>>>>> see that the AXI handshake is one packet with two parallel streams. I will
>>>>>>> try to "AXI align" as you suggested with this first:
>>>>>>>
>>>>>>> data:
>>>>>>>   fpga_iface: axis_data
>>>>>>>   clk_domain: rfnoc_chdr
>>>>>>>   inputs:
>>>>>>>     in:
>>>>>>>       num_ports: 2
>>>>>>>       ...
>>>>>>>
>>>>>>> 4. right now, since I want to issue the streaming command while
>>>>>>> doing *record to file* and *transmit loopback*, I will start with
>>>>>>> the forwarding policy as you suggested and also try to add my own issue
>>>>>>> stream command to my block. It is not trivial for me since I am not a C++
>>>>>>> person, so I won't be able to provide much feedback on that effort.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>> On Thu, Dec 19, 2024 at 3:24 AM Martin Braun <martin.braun@ettus.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hey David,
>>>>>>>>
>>>>>>>> this looks like you've gotten pretty far on a sophisticated
>>>>>>>> project! I have a few observations:
>>>>>>>>
>>>>>>>> - At first glance, your C++ looks correct.
>>>>>>>> - I would expect samples to arrive at your block synchronously
>>>>>>>> based on that. However, maybe I'm forgetting something that would cause the
>>>>>>>> outputs of the DDCs to misalign data by a few clock cycles. Which makes me
>>>>>>>> wonder: Are you sure your input packets are misaligned? In RFNoC, we make
>>>>>>>> no guarantee that the output of the DDC (or any other) block is aligned to
>>>>>>>> the clock cycle, because we encode the timestamp with the data. Meaning
>>>>>>>> that the first, actual sample that arrives at your block on each channel is
>>>>>>>> in fact time-aligned, they just arrive a few clock cycles apart. This is
>>>>>>>> the same logic that applies when packets arrive at the host computer, where
>>>>>>>> we make no assumptions that they arrive at the exact same time.
>>>>>>>> - If this is the issue, I think we have some modules you can use to
>>>>>>>> actually align samples within your block, or you just do some AXI alignment
>>>>>>>> yourself by combining the tready and tvalid signals of two streams.
>>>>>>>> - Side note, although it's not important: I would consider it a
>>>>>>>> best practice to have your block call
>>>>>>>> set_action_forwarding_policy(forwarding_policy_t::ONE_TO_FAN, "stream_cmd")
>>>>>>>> so it would properly forward stream commands, and then you can plop the
>>>>>>>> stream command into the streamer.
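
The "combine the tready and tvalid signals of two streams" suggestion above can be illustrated with a small cycle-level model. This is only a sketch in C++ rather than HDL, and every name in it (stream_model, join_signals, join, simulate) is hypothetical and not part of UHD or RFNoC. It assumes each input is buffered by an unbounded FIFO, which real hardware does not have; in an FPGA the deasserted tready would instead stall the upstream block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical model of one AXI-Stream input: a FIFO whose tvalid is high
// whenever it holds at least one sample.
struct stream_model {
    std::deque<uint32_t> fifo;
    bool tvalid() const { return !fifo.empty(); }
};

// Handshake signals of a two-input join.
struct join_signals {
    bool out_tvalid; // output carries a pair this cycle
    bool a_tready;   // input A is acknowledged this cycle
    bool b_tready;   // input B is acknowledged this cycle
};

// Each input's tready depends on the *other* input's tvalid, so neither
// stream can advance until both have a sample waiting.
join_signals join(const stream_model& a, const stream_model& b, bool out_tready)
{
    return {a.tvalid() && b.tvalid(),
            out_tready && b.tvalid(),
            out_tready && a.tvalid()};
}

// Stream A starts at cycle 0, stream B starts b_delay cycles later.
// Sample values are simply the sample index within each stream.
// Returns the (a, b) pairs observed at the joined output.
std::vector<std::pair<uint32_t, uint32_t>> simulate(uint32_t num_samples, uint32_t b_delay)
{
    stream_model a, b;
    std::vector<std::pair<uint32_t, uint32_t>> out;
    const bool out_tready = true; // assumption: the sink never back-pressures
    for (uint32_t cycle = 0; out.size() < num_samples; ++cycle) {
        // Upstream producers deliver one sample per cycle once started.
        if (cycle < num_samples) {
            a.fifo.push_back(cycle);
        }
        if (cycle >= b_delay && cycle - b_delay < num_samples) {
            b.fifo.push_back(cycle - b_delay);
        }
        const join_signals s = join(a, b, out_tready);
        if (s.out_tvalid && out_tready) {
            // On a transfer, both inputs are acknowledged in the same cycle.
            assert(s.a_tready && s.b_tready);
            out.emplace_back(a.fifo.front(), b.fifo.front());
            a.fifo.pop_front();
            b.fifo.pop_front();
        }
    }
    return out;
}
```

Calling simulate(4, 3) models stream B arriving three cycles late; the output pairs still carry matching sample indices, because stream A is held off until B has data. In this model the arrival skew only costs latency, not alignment.
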
>>>>>>>>
>>>>>>>> --M
>>>>>>>>
>>>>>>>> On Sun, Dec 15, 2024 at 10:49 PM David <vitishlsfan21@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hello all,
>>>>>>>>>
>>>>>>>>> I apologize in advance for data dumping. I have made a 2 input/1
>>>>>>>>> output RFNoC block that requires repeatable synchronized DDC starts. My
>>>>>>>>> current method of starting the DDC is not working as desired.
>>>>>>>>>
>>>>>>>>> *Question - How can I correctly start both DDCs so samples are
>>>>>>>>> available on the same clock cycle, similar to rx_samples_to_file, while
>>>>>>>>> still using my 2 in/1 out RFNoC block?*
>>>>>>>>> I would like to focus the conversation on my C++ implementation
>>>>>>>>> for now. All my simulations have convinced me my block is consuming
>>>>>>>>> AXI-Stream data correctly.
>>>>>>>>>
>>>>>>>>> *Problem*
>>>>>>>>> When starting two DDCs with timed commands sent to DDC in my C++
>>>>>>>>> application, I am not getting the same result as the
>>>>>>>>> rx_samples_to_file.cpp... rx_samples_to_file has repeatable alignment, and
>>>>>>>>> mine has random. This has led me to believe the problem is in my
>>>>>>>>> application and not my block. My Vivado simulations show my block is able
>>>>>>>>> to consume the AXI-Stream transactions in parallel as I expect.
>>>>>>>>>
>>>>>>>>> Considering sampling noise from a sig gen that is split to both
>>>>>>>>> inputs, I see the following behavior:
>>>>>>>>> rx_samples_to_file (base image)      davids_samples_to_file (custom image)
>>>>>>>>> DDC A samples ... X_1 Y_1 Z_1 ...    DDC A samples ... X_1 Y_1 Z_1 ...
>>>>>>>>> DDC B samples ... X_2 Y_2 Z_2 ...    DDC B samples X_2 Y_2 Z_2 ... ...
>>>>>>>>>
>>>>>>>>> *sample_1 is not equal to sample_2, but over a large number of
>>>>>>>>> samples they will correlate well.
>>>>>>>>>
>>>>>>>>> In the above example, the noise correlates as expected, but it is
>>>>>>>>> delayed by 1 sample. When using my application, I have seen no delay
>>>>>>>>> (desired), and also delay in the range of 5 samples.
>>>>>>>>>
>>>>>>>>> *C++ Implementation*
>>>>>>>>> [image: image.png]
>>>>>>>>>
>>>>>>>>> I am using *uhd::rfnoc::ddc_block_control* types to issue the
>>>>>>>>> stream command because I was having issues with my block propagating.
>>>>>>>>> Issuing to the DDCs lets the data flow from 2 inputs to the 1 output, where
>>>>>>>>> the output is either a file or loopback to transmit.
>>>>>>>>>
>>>>>>>>> The base image with rx_samples_to_file uses a multi_usrp type,
>>>>>>>>> which propagates the stream command from the rx_streamer.
>>>>>>>>>
>>>>>>>>> *RFNoC laydown*
>>>>>>>>>
>>>>>>>>> [image: image.png]
>>>>>>>>>
>>>>>>>>> Data flows in both Tx loopback configuration and Rx to file
>>>>>>>>> configuration.
>>>>>>>>>
>>>>>>>>> *Methods and Symptoms*
>>>>>>>>> I have two methods of measuring the synchronization, with data
>>>>>>>>> collected by ILA cores at either the output of DDC or input of custom
>>>>>>>>> block:
>>>>>>>>>
>>>>>>>>> 1. *Math:* When receiving correlated noise, I can measure the
>>>>>>>>> cross correlation and show that the correlation peaks as expected, and show
>>>>>>>>> the delay between channels in samples.
>>>>>>>>> 2. *Vivado Waveform Viewer*: When the ILA cores are collecting
>>>>>>>>> DDC channel data, I can see that the base image samples are available on
>>>>>>>>> the same clock. My image does not have that behavior.
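
The cross-correlation measurement in method 1 can be sketched as follows. This is not the code used in the thread; the function name estimate_lag, the sample_t alias, and the max_lag parameter are assumptions made for illustration. It searches a small window of lags for the peak of |sum_n a[n] * conj(b[n + lag])|, so a positive result means channel B lags channel A by that many samples.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using sample_t = std::complex<float>;

// Returns the lag in [-max_lag, max_lag] at which the magnitude of the
// cross-correlation between a and b peaks. Both captures must have the
// same length. Samples shifted outside the capture are skipped.
int estimate_lag(const std::vector<sample_t>& a,
                 const std::vector<sample_t>& b,
                 int max_lag)
{
    assert(a.size() == b.size());
    const int n = static_cast<int>(a.size());
    int best_lag = 0;
    float best_mag = -1.0f;
    for (int lag = -max_lag; lag <= max_lag; ++lag) {
        sample_t acc(0.0f, 0.0f);
        for (int i = 0; i < n; ++i) {
            const int j = i + lag;
            if (j < 0 || j >= n) {
                continue; // no overlap at this index
            }
            acc += a[i] * std::conj(b[j]);
        }
        const float mag = std::abs(acc);
        if (mag > best_mag) {
            best_mag = mag;
            best_lag = lag;
        }
    }
    return best_lag;
}
```

With two captures of the same split noise source, a result of 0 corresponds to the aligned case described for the base image, and a nonzero result gives the inter-channel delay in samples. The direct sum is O(N * max_lag), which is fine for a few-sample search window; an FFT-based correlation would be the usual choice for wide windows.
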
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> USRP-users mailing list -- usrp-users@lists.ettus.com
>>>>>>>>> To unsubscribe send an email to usrp-users-leave@lists.ettus.com
>>>>>>>>>
>>>>>
>>>>