Empathy List Archives

Leo Bodnar

Tue, Oct 31, 2017 9:03 PM

> From: Attila Kinali <attila@kinali.ch>
> True. An NTP server does not need to measure time better than 100ns or so.
> 10ns is probably more than good enough. But then, this raises the question
> what your performance metric is that you optimize for?

The goal was maximum throughput with minimum time offset.
Maximum throughput eventually ended up as "fully saturated full-duplex 100BASE-TX" and minimum time offset as "below 1 microsecond".
There was nothing on the market below £2-3k that could do that. I think Microsemi has recently made a server that can do 100kpps+ but I don't know its price.

> If it is hold-over, then this will be limited by the TCXO and how well
> you can measure its frequency, which in turn depends on how well you
> can measure the PPS pulse. You say that your device is typically within
> 4-5ms in 24h of hold-over. That translates to frequency uncertainty
> of approximately 5e-8. That's not that good.
> To put this into perspective have a look at the two attached plots.
> These are the PPM values that ntp reports for a standard server (HP DL380G7).
> The first plot shows the long term variation of all the data I currently have.
> The three jumps coincide with days when we restarted ntpd. As you can see,
> the long term variation of the crystal frequency is well below 0.5ppm. The
> second plot zooms in on one of the days with large variations. The worst
> of these being about 10ppb. Let's assume for simplicity, the 10ppb step happens
> instantaneously, then this would result in a hold over performance of ~0.9ms
> in 24h. Yes, this is not a fair comparison. The server is in a room where
> temperature is pretty much constant (sorry, I don't have any data on that,
> but assume less than 5°C variation within a day). And it's not true hold
> over performance, but a guesstimate from the ntp provided loop data. But
> even if we add a factor of 10, this simple, unstabilized, unsophisticated
> PC comes pretty close to the performance your device claims. And that's not
> even a PC with a good crystal (I have measurements of others, that showed
> variation of less than 2ppb over months in rooms without air conditioning).
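
As a quick check of the arithmetic quoted above: a constant fractional frequency offset y accumulates a time error of y·T over a hold-over interval T. The sketch below reproduces the two figures (4-5ms in 24h corresponding to roughly 5e-8, and a 10ppb step giving about 0.9ms in 24h). The function names are illustrative only and do not come from the thread.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400 s


def frequency_offset(time_error_s, interval_s=SECONDS_PER_DAY):
    """Fractional frequency offset that yields time_error_s over interval_s."""
    return time_error_s / interval_s


def holdover_error(freq_offset, interval_s=SECONDS_PER_DAY):
    """Time error in seconds from a constant fractional frequency offset."""
    return freq_offset * interval_s


if __name__ == "__main__":
    # 4-5 ms of drift over 24 h of hold-over
    for err_ms in (4.0, 5.0):
        print(f"{err_ms} ms / 24 h -> {frequency_offset(err_ms * 1e-3):.2e}")
    # an instantaneous 10 ppb frequency step held for 24 h
    print(f"10 ppb for 24 h -> {holdover_error(10e-9) * 1e3:.3f} ms")
```

Both results are simple ratios, so they say nothing about how the offset arises, only how large it must be to explain the observed drift.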

> Or to put it differently: If I'd get a Minnow Turbot, add a GPS receiver,
> put everything in a metal box and just use normal ntpd, I'd expect to
> have a hold over performance of better than 100ms/24h (assuming 1ppm
> stability of the crystal), probably on the order of 10ms/24h and it would
> have no problems handling a humongous number of clients, thanks to the
> fast CPU (1.4GHz) and the Gbit/s ethernet interface.

> So, why does a simple PC with ntp do such a good job? The secret
> lies in the measurement: Very much simplified, ntp measures the
> frequency in 1000s intervals. Measurement uncertainty is reported to be
> better than 100us per reference server. I.e., the uncertainty is
> better than 1e-7 (compare with the estimated 5e-8 from above).
> Add to that averaging over multiple reference servers (4 in this case)
> and a sophisticated clock parameter estimation and the uncertainty
> goes down quite a bit.
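
A rough sketch of the numbers in the quoted paragraph: a time uncertainty of 100us over a 1000s measurement interval gives a frequency uncertainty of 1e-7. The averaging step below assumes the four reference servers have independent errors of equal size, so the uncertainty shrinks by the square root of the server count. That independence is an assumption made here for illustration, not something the thread states, and ntp's actual clock filtering is more involved.

```python
import math


def frequency_uncertainty(time_uncertainty_s, interval_s):
    """Fractional frequency uncertainty from one time comparison per interval."""
    return time_uncertainty_s / interval_s


def averaged_uncertainty(single_uncertainty, n_sources):
    """Uncertainty of a plain mean over n_sources.

    Assumes independent, equally uncertain sources (illustrative only).
    """
    return single_uncertainty / math.sqrt(n_sources)


if __name__ == "__main__":
    per_server = frequency_uncertainty(100e-6, 1000.0)
    print(f"per server:      {per_server:.1e}")
    print(f"mean of 4 (iid): {averaged_uncertainty(per_server, 4):.1e}")
```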

> To summarize: If you want to improve your ntp device's hold over performance
> you have to improve the frequency measurement and use better clock modeling.
> I.e., use a timing GPS receiver and its sawtooth correction, and model the
> clock's frequency change over time.

I do want to improve my NTP devices but I do not understand what you are suggesting.
Why would sawtooth correction matter when there is no GPS signal available at all?
I am not measuring any frequencies - the whole device runs synchronously hard-locked to GPS time when it is available and freewheeling when not.
This is a reference stratum 1 clock, it does not connect to any servers, it only serves clients.
If you chop off its antenna and disconnect the local LAN segment from the internet and other NTP servers, local network time will drift off by 4-5ms in 24 hours and then the server will fall back to stratum 16 and will tell clients that it cannot provide accurate time anymore.

> But even if we add a factor of 10, this simple, unstabilized, unsophisticated PC comes pretty close to the performance your device claims.

Are you saying that if you deprive any PC of any connectivity it will drift by 4-5ms in 24 hours?

Leo


Attila Kinali

Wed, Nov 1, 2017 12:11 AM

On Tue, 31 Oct 2017 21:03:05 +0000
Leo Bodnar <leo@leobodnar.com> wrote:

> The goal was maximum throughput with minimum time offset.
> Maximum throughput eventually ended up as "fully saturated full-duplex
> 100BASE-TX" and minimum time offset as "below 1 microsecond"
> There was nothing on the market below £2-3k that could do that. I think
> Microsemi has recently made a server that can do 100kpps+ but I don't know
> its price.

Hmm? There are at least a dozen how-tos out there that explain how to
make an NTP server out of an SBC. And I have seen one or another
being sold as a complete box with batteries included.

Basically, all you have to do is use an SBC that runs linux and has
a GPIO with an interrupt to act as a PPS input. Attach a GPS receiver
and you are almost done. The cheapest options are probably the i.MX233
based ones (going as low as €20). Probably the most often mentioned option
is the Beaglebone Black.

If I had to build something like this today, I would probably go
for an OSD3358, which is an AM3358 packaged with memory and power
management and allows using a simple 4 layer board. Add a few
bits for ethernet and the GPS and you are almost done.

> I do want to improve my NTP devices but I do not understand what you are
> suggesting.
> Why would sawtooth correction matter when there is no GPS signal available at
> all?

It matters while you have signal.

> I am not measuring any frequencies - the whole device runs synchronously
> hard-locked to GPS time when it is available and freewheeling when not.

You should have a control loop somewhere, which explicitly or implicitly
estimates the frequency of the TCXO.

The time-nuts archives are full of discussions on how to do such
control loops and improve hold over performance. Though there
weren't many in the last 2-3 years. John Vig's tutorial is also
a good start.

> Are you saying that if you deprive any PC of any connectivity it will drift
> by 4-5ms in 24 hours?

Almost. It has to have ntp running and ntp must have had time to
discipline the local oscillator. If the PC is then in an environment
that will not cause its oscillator to drift more than 10-100ppb per
day, then it will stay below 10ms. There are a few ifs there, but
it's nothing out of the ordinary. Even ordinary crystal oscillators
can be quite stable if they have been running for a while.
Just for comparison: decent wrist watches drift less than 1min in
half a year. Good ones less than 10s.
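
To put numbers on the paragraph above, here is a small sketch under two stated assumptions that are not spelled out in the thread: either the frequency ramps linearly from zero offset to the quoted value over the day (time error ½·D·T²), or the full offset applies from the start (time error y·T). Both readings of "10-100ppb per day" stay below 10ms. The last helper turns the wrist-watch figures into an average rate in ppm. All names are illustrative.

```python
SECONDS_PER_DAY = 86400.0


def error_from_linear_drift(drift_ppb_per_day, days=1.0):
    """Time error (s) when frequency ramps linearly from zero offset."""
    drift_per_second = drift_ppb_per_day * 1e-9 / SECONDS_PER_DAY
    t = days * SECONDS_PER_DAY
    return 0.5 * drift_per_second * t * t


def error_from_constant_offset(offset_ppb, days=1.0):
    """Time error (s) when the whole offset is present from the start."""
    return offset_ppb * 1e-9 * days * SECONDS_PER_DAY


def mean_rate_ppm(error_s, days):
    """Average rate error in ppm for a clock that is off by error_s after days."""
    return error_s / (days * SECONDS_PER_DAY) * 1e6


if __name__ == "__main__":
    for ppb in (10.0, 100.0):
        ramp_ms = error_from_linear_drift(ppb) * 1e3
        step_ms = error_from_constant_offset(ppb) * 1e3
        print(f"{ppb:5.0f} ppb/day: ramp {ramp_ms:.3f} ms, step {step_ms:.3f} ms")
    print(f"watch, 1 min per half year: {mean_rate_ppm(60.0, 182.5):.2f} ppm")
    print(f"watch, 10 s per half year:  {mean_rate_ppm(10.0, 182.5):.2f} ppm")
```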

		Attila Kinali

--
It is upon moral qualities that a society is ultimately founded. All
the prosperity and technological sophistication in the world is of no
use without that foundation.
-- Miss Matheson, The Diamond Age, Neil Stephenson


Denny Page

Wed, Nov 1, 2017 5:13 AM

Depends upon the results you are trying to achieve. Using Linux pretty much guarantees that your server clock will be off by 6-10us, with substantial variance. Even with a good nic that supports hardware timestamping, the variance will increase substantially as you go off box (spread spectrum is a big annoyance!). If you don’t have hardware timestamping, the base error will increase by another 10-100us, and the variance will simply go through the roof. Any load on the system whatsoever will quickly drive further degradation throughout. This is why people generally talk about NTP having a “typical” accuracy of 1ms and a standard deviation over 100us. For casual use, this is fine for most people.

The LeoNTP units operate in a completely different world. Leo advertises accuracy of under 1us, which matches the general performance of PTP. In my testing, the units actually do a bit better than that. Using hardware timestamps on the client, I generally see less than ±100ns, with a standard deviation of around 35ns. And the performance remains constant under load. I am not able to do heavy load testing, but Leo has described the heavy load performance earlier in the thread. Basically the units are capable of operating at 100Mb wire speed. As I said, a completely different world.

Your mileage may vary.

Denny

> On Oct 31, 2017, at 17:11, Attila Kinali <attila@kinali.ch> wrote:
>
> Basically, all you have to do is use an SBC that runs linux and has
> a GPIO with an interrupt to act as a PPS input. Attach a GPS receiver
> and you are almost done.


Attila Kinali

Wed, Nov 1, 2017 12:39 PM

On Tue, 31 Oct 2017 22:13:01 -0700
Denny Page <denny@cococafe.com> wrote:

> Depends upon the results you are trying to achieve. Using Linux pretty
> much guarantees that your server clock will be off by 6-10us, with
> substantial variance. Even with a good nic that supports hardware
> timestamping, the variance will increase substantially as you go off box
> (spread spectrum is a big annoyance!).

6-10µs is the interrupt latency of linux on ARM SoC. I guess, to get
below that you'd have to tweak the kernel a bit. Which should not
be that difficult. Definitely simpler than writing your own IP and NTP
stack from scratch.

Spread spectrum can usually be switched off, though that requires at least a
custom DTB or even patching of the kernel. There are a few boards, though,
that do not allow spread spectrum to be switched off.

		Attila Kinali

--
It is upon moral qualities that a society is ultimately founded. All
the prosperity and technological sophistication in the world is of no
use without that foundation.
-- Miss Matheson, The Diamond Age, Neil Stephenson


Denny Page

Wed, Nov 1, 2017 5:15 PM

> On Nov 01, 2017, at 05:39, Attila Kinali <attila@kinali.ch> wrote:
>
> 6-10µs is the interrupt latency of linux on ARM SoC. I guess, to get
> below that you'd have to tweak the kernel a bit. Which should not
> be that difficult. Definitely simpler than writing your own IP and NTP
> stack from scratch.

Just tweak the Linux kernel a bit? No. You would have to rewrite substantial chunks of it. A tremendous effort. Low latency and accurate timing is not what Linux is designed for. This has been discussed extensively for years.

Writing your own IPv4 datagram stack for ICMP and NTP is rather trivial. It’s all static state, you don’t have to deal with fragments, you don’t have to deal with options, etc. There really is very little you actually have to do. IPv6 is more work because you have to maintain some dynamic state (routing), process options, etc., but it’s still nothing like trying to turn Linux into a real-time system.
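
To illustrate how little state the NTP payload itself needs, here is a minimal sketch that turns a 48-byte client request into a stratum 1 server reply, following the RFC 5905 header layout. It is not the LeoNTP firmware and covers only the NTP payload, not the Ethernet/IP/UDP headers or checksums. The function names, the `GPS` reference ID choice and the precision value of -20 are assumptions made for this example.

```python
import struct

NTP_UNIX_OFFSET = 2208988800  # seconds from 1900-01-01 to 1970-01-01
# LI/VN/Mode, stratum, poll, precision, root delay, root dispersion,
# reference ID, then reference/origin/receive/transmit timestamps.
HEADER = struct.Struct("!BBbbII4sQQQQ")  # 48 bytes, no extension fields


def to_ntp(unix_seconds):
    """Convert Unix seconds to a 64-bit NTP timestamp (32.32 fixed point).

    A float input limits resolution to roughly a microsecond; a real
    device would fill this from an integer hardware counter instead.
    """
    ntp = unix_seconds + NTP_UNIX_OFFSET
    seconds = int(ntp)
    fraction = int((ntp - seconds) * 2**32)
    return ((seconds & 0xFFFFFFFF) << 32) | (fraction & 0xFFFFFFFF)


def build_reply(request, rx_time, tx_time, ref_time, precision=-20):
    """Build a server-mode reply for one client request (hypothetical helper)."""
    if len(request) < HEADER.size:
        raise ValueError("short NTP packet")
    fields = HEADER.unpack(request[:HEADER.size])
    li_vn_mode, poll, client_tx = fields[0], fields[2], fields[10]
    version = (li_vn_mode >> 3) & 0x7
    first_byte = (0 << 6) | (version << 3) | 4  # LI=0, echo version, mode 4
    return HEADER.pack(
        first_byte,
        1,                 # stratum 1
        poll,              # echo the client's poll interval
        precision,         # log2 seconds, assumed value
        0,                 # root delay
        0,                 # root dispersion
        b"GPS\x00",        # reference ID for a GPS-disciplined clock
        to_ntp(ref_time),  # last time the clock was set
        client_tx,         # origin = client's transmit timestamp
        to_ntp(rx_time),   # when the request arrived
        to_ntp(tx_time),   # when this reply leaves
    )


if __name__ == "__main__":
    # Version 4 client request (mode 3) with an arbitrary transmit timestamp.
    req = HEADER.pack(0x23, 0, 6, 0, 0, 0, b"\x00" * 4, 0, 0, 0, 0x1122334455667788)
    reply = build_reply(req, 1509483785.0, 1509483785.0, 1509483784.0)
    print(len(reply), hex(reply[0]))
```

Everything in the reply is either a constant, a copy of a request field, or a clock reading, which is the sense in which the state is static.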

> Spread spectrum can usually be switched off, though that requires at least a
> custom DTB or even patching of the kernel. There are a few boards, though,
> that do not allow spread spectrum to be switched off.

Unfortunately it usually has to be done in the bios. The control space that would allow spread spectrum to be turned off is usually disabled prior to kernel load. For compliance, most bios implementations have removed spread spectrum as an option so you have to build a custom one. I can’t begin to tell you what kind of a chill comes over the conversation with a vendor (even a maker oriented one) when you tell them you want to create a custom bios to disable spread spectrum.

Denny


Attila Kinali

Sun, Nov 12, 2017 3:23 PM

On Wed, 1 Nov 2017 10:15:43 -0700
Denny Page <denny@cococafe.com> wrote:

> > 6-10µs is the interrupt latency of linux on ARM SoC. I guess, to get
> > below that you'd have to tweak the kernel a bit. Which should not
> > be that difficult. Definitely simpler than writing your own IP and NTP
> > stack from scratch.
>
> Just tweak the Linux kernel a bit? No. You would have to rewrite
> substantial chunks of it. A tremendous effort. Low latency and accurate
> timing is not what Linux is designed for. This has been discussed
> extensively for years.

Not really. The kernel has already quite a few low-latency network paths.
You just need to enable them and then cut out the biggest timing uncertainty:
the user-space to kernel context switch. If you write a small stub driver
that takes from user space the required data to build an NTP packet, you
can cut out quite a bit of the latency. You could even get a decent estimate
on when the packet will be sent out in case of congestion, if you check
the buffer fill marks. My guess, for someone who knows his way around
the kernel network stack, that would be 1-4 weeks of effort.

> > Spread spectrum can usually be switched off, though that requires at least a
> > custom DTB or even patching of the kernel. There are a few boards, though,
> > that do not allow spread spectrum to be switched off.
>
> Unfortunately it usually has to be done in the bios. The control space that
> would allow spread spectrum to be turned off is usually disabled prior to
> kernel load. For compliance, most bios implementations have removed spread
> spectrum as an option so you have to build a custom one. I can’t begin to
> tell you what kind of a chill comes over the conversation with a vendor
> (even a maker oriented one) when you tell them you want to create a custom
> bios to disable spread spectrum.

Ah.. sorry for the confusion. I was specifically talking about embedded
systems. On a PC all bets are off, while on an embedded system you have
quite a high level of control of what's going on. At most you have to
tweak the DT file a bit, or set some initialization values a bit differently.

			Attila Kinali

--
You know, the very powerful and the very stupid have one thing in common.
They don't alter their views to fit the facts, they alter the facts to
fit the views, which can be uncomfortable if you happen to be one of the
facts that needs altering. -- The Doctor


Denny Page

Sun, Nov 12, 2017 6:46 PM

The 6-7us of latency in this discussion does not involve the network path. In this regard network latency is fairly well addressed with hardware timestamping, although trying to get readings across the clock domains loses dozens of nanos of precision. In this discussion, the 6-7us of latency originates from servicing the serial device interrupt in order to timestamp the TOS pulse. This is entirely in the kernel, and there is no user space context switch involved. If you have a way to dramatically reduce the 6-7us interrupt latency with the Linux kernel, say to 100 nanos, please do share.

> On Nov 12, 2017, at 07:23, Attila Kinali <attila@kinali.ch> wrote:
>
> The kernel has already quite a few low-latency network paths.
> You just need to enable them and then cut out the biggest timing uncertainty:
> the user-space to kernel context switch.


time-nuts@lists.febo.com

Re: [time-nuts] Designing an embedded precision GPS time