hi,
I want to set up a time server. I saw in many configurations that a Clock-Block
was used; do I need it too? What are the benefits?
thanks,
Dan
The Clock-Block is a clock generator that is useful if you want to replace the computer's onboard crystal clock with an external high-stability source. For example, you can configure it to take a 10 MHz input from a GPSDO and create a 14.318182 MHz output to replace the crystal in a PC that uses that frequency. The external clock improves the NTP system's short term stability because the typical PC crystal is quite temperature sensitive and overall just isn't very good.
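To make the 10 MHz to 14.318182 MHz example concrete, here is a minimal sketch of the arithmetic only. It does not describe how the Clock-Block is actually programmed; the function name and the denominator limit are made up for illustration. It relies on the fact that 14.318182 MHz is the rounded value of 315/22 MHz.

```python
from fractions import Fraction

def synthesis_ratio(f_in_hz: int, f_out_hz: int, max_den: int = 1000) -> Fraction:
    """Closest small-integer ratio f_out/f_in (hypothetical helper, arithmetic only)."""
    return Fraction(f_out_hz, f_in_hz).limit_denominator(max_den)

# 10 MHz reference in, nominal 14.318182 MHz PC clock out
ratio = synthesis_ratio(10_000_000, 14_318_182)
print(ratio)                      # 63/44

# Output frequency actually produced by that exact ratio
exact_out_hz = 10_000_000 * ratio
print(float(exact_out_hz))
```

So a synthesizer locked to the GPSDO only has to realise a 63/44 ratio, and the output inherits the stability of the 10 MHz reference.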
You do not need to use something like the Clock-Block to build a very good NTP server, but if you want to build the ultimate server it is part of the mix.
John
On Dec 27, 2012, at 8:43 AM, Dan Nica timi@crystal.rdstm.ro wrote:
hi,
I want to set up a time server. I saw in many configurations that a Clock-Block
was used; do I need it too? What are the benefits?
thanks,
Dan
time-nuts mailing list -- time-nuts@febo.com
To unsubscribe, go to https://www.febo.com/cgi-bin/mailman/listinfo/time-nuts
and follow the instructions there.
You do not need to use something like the Clock-Block to build a very good NTP server, but if you want to build the ultimate server it is part of the mix.
Yes, this is true. The server can be "very good", meaning that if it
were better the clients that it serves could not "know" the
difference. A simple analogy: if a wall clock moved its hands with
millisecond precision, it would not serve its clients (human eyeballs)
any better by moving them with nanosecond precision, because human
perception is measured in ms, not ns. Same with the time server: it
communicates with its clients over a network that has some uncertainty
in the delay, so ultra-precision is lost. So nanosecond-level
timekeeping in the server is not required. You can do microsecond-level
timekeeping with the standard TTL oscillator can on most motherboards.
However, this list is for "nuts" and you might think it is fun to try
to do 1000 times better timekeeping than is needed; in that case you
will need some kind of specialized clock hardware.
Chris Albertson
Redondo Beach, California
On 27 Dec, 2012, at 08:05 , Chris Albertson albertson.chris@gmail.com wrote:
You do not need to use something like the Clock-Block to build a very good NTP server, but if you want to build the ultimate server it is part of the mix.
Yes, this is true. The server can be "very good", meaning that if it
were better the clients that it serves could not "know" the
difference. A simple analogy: if a wall clock moved its hands with
millisecond precision, it would not serve its clients (human eyeballs)
any better by moving them with nanosecond precision, because human
perception is measured in ms, not ns. Same with the time server: it
communicates with its clients over a network that has some uncertainty
in the delay, so ultra-precision is lost. So nanosecond-level
timekeeping in the server is not required. You can do microsecond-level
timekeeping with the standard TTL oscillator can on most motherboards.
However, this list is for "nuts" and you might think it is fun to try
to do 1000 times better timekeeping than is needed; in that case you
will need some kind of specialized clock hardware.
I don't think I buy this. It takes 70 milliseconds for a signal
transmitted from a GPS satellite to be received on the ground, but
we don't use this fact to argue that sub-70 ms timing from GPS is
not possible. If you have a network of high-bandwidth routers and
switches doing forwarding in hardware, and carrying no traffic other
than the packets you are timing (I have access to lab setups that
can meet this description) you can observe packet delivery times that
are stable at well under the microsecond level even though the total
time required to deliver a packet is much larger. If you add competing
traffic, like real life networks, the packet-to-packet variability
becomes much worse, but this is sample noise that can be addressed
by taking larger numbers of samples and filtering based on the expected
statistics of that noise. That is, the level of noise affecting
each individual sample entering the filter does not alone predict
the noise level of the result coming out, the latter also depends on the
number of samples and the quality of the model of the noise employed by
the filter. Note that I often see claims of time synchronization with
PTP at the 10 ns level or better. As this level of synchronization is
usually achieved by the brute force method of measuring transit times
across every network device on the path from source to destination I
have no doubt that what NTP can do will necessarily be worse than this,
but I don't know of a basis that would predict whether NTP's "worse"
is necessarily going to be 10,000x worse or can be just 10x worse.
Knowing that would require actually trying it to measure what can be
done.
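A toy illustration of the point that the noise on one sample does not by itself predict the noise of the filtered result. The model is an assumption, not a measurement: each sample is a fixed path delay plus an exponentially distributed queueing delay, and the "filter" is simply the minimum over N samples. The delay figures are invented.

```python
import random
import statistics

def min_filter_error(n_samples, rng, base_delay_us=100.0, mean_queue_us=50.0):
    """Error (us) when the fixed delay is estimated as the minimum of n samples.

    Assumed toy model: sample = fixed delay + exponential queueing delay.
    """
    samples = [base_delay_us + rng.expovariate(1.0 / mean_queue_us)
               for _ in range(n_samples)]
    return min(samples) - base_delay_us

def typical_error(n_samples, trials=2000, seed=1):
    """Mean estimation error over many repeated experiments."""
    rng = random.Random(seed)
    return statistics.mean(min_filter_error(n_samples, rng) for _ in range(trials))

for n in (1, 10, 100, 1000):
    print(f"{n:5d} samples -> mean error {typical_error(n):8.3f} us")
```

Under this particular noise model the error of the estimate shrinks roughly in proportion to 1/N, even though every individual sample is just as noisy as before. Real network noise is not this well behaved, which is exactly what would need measuring.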
What is certain, however, is that if you want to measure this at the levels
that might be possible you probably want nanosecond-level clock hardware
in both the server, so that it can produce time of this quality, and in
the clients, so that you can measure how well they do directly rather
than attempting to have the NTP implementation grade its own homework. I
don't think this is a waste of time at all. The best case is that the
ability to measure at this level would lead to an understanding of what
it would take to transfer time with NTP at this level, but even the worst
case would be that one would be able to support one's assertions about what
can't usefully be done with data, and that's not bad either. If no one
tries then no one will ever know.
Dennis Ferguson
On Thu, Dec 27, 2012 at 10:55 AM, Dennis Ferguson
dennis.c.ferguson@gmail.com wrote:
I don't think I buy this. It takes 70 milliseconds for a signal
transmitted from a GPS satellite to be received on the ground, but
we don't use this fact to argue that sub-70 ms timing from GPS is
not possible.
We don't care how long it takes, we care about the uncertainty in the
length of time. The speed of light through space is very, very
certain and
.
If you have a network of high-bandwidth routers and
switches doing forwarding in hardware, and carrying no traffic other
than the packets you are timing (I have access to lab setups that
can meet this description) you can observe packet delivery times that
are stable at well under the microsecond level even though the total
time required to deliver a packet is much larger.
If you add competing
traffic, like real life networks, the packet-to-packet variability
becomes much worse, but this is sample noise that can be addressed
by taking larger numbers of samples and filtering based on the expected
statistics of that noise.
This is what NTP does. It looks at the clocks over a long period of time.
That is, the level of noise affecting
each individual sample entering the filter does not alone predict
the noise level of the result coming out, the latter also depends on the
number of samples and the quality of the model of the noise employed by
the filter. Note that I often see claims of time synchronization with
PTP at the 10 ns level or better. As this level of synchronization is
usually achieved by the brute force method of measuring transit times
across every network device on the path from source to destination I
have no doubt that what NTP can do will necessarily be worse than this,
but I don't know of a basis that would predict whether NTP's "worse"
is necessarily going to be 10,000x worse or can be just 10x worse.
Knowing that would require actually trying it to measure what can be
done.
We know the numbers; many people have done this. I could be off by 10X
because maybe my memory is wrong or terminology does not match yours
but in principle we know these numbers.
We know what the best NTP servers using their built-in TTL oscillators
can do. BSD-based PCs connected to a good timing-mode GPS get to 2
microsecond offsets from "true". This seems to be the limit unless one
resorts to heroic efforts involving special-built hardware. But even
the best lab setups using Ethernet are not this good. So my
conclusion was that if accurate client timing is the goal then it will
NOT help. The Ethernet-based NTP clients will still have ms-level
timing (1000x worse than the GPS-connected server) no matter how good
the server is. Hence my advice to NOT bother with a special-purpose
"clock block".
That was the bottom line: a purpose-built clock is not needed.
If good client timing is desired using NTP you are going to have to
distribute the PPS from GPS using some side channel; making the server
better is of no use. Or use PTP and special-purpose network equipment.
(But I bet PPS distribution would be cheaper if you simply used the
extra unused twisted pairs inside most Cat-5 cable.)
The reason you can't distribute ns-level time over a network to normal
NTP clients is the random queuing that happens inside the
client's Ethernet interfaces. The normal installed base of Ethernet
cards does not do time stamping. So even microsecond-level timing is lost in
the typical client.
--
Chris Albertson
Redondo Beach, California
On Thu, 27 Dec 2012 10:55:12 -0800
Dennis Ferguson dennis.c.ferguson@gmail.com wrote:
I don't think I buy this. It takes 70 milliseconds for a signal
transmitted from a GPS satellite to be received on the ground, but
we don't use this fact to argue that sub-70 ms timing from GPS is
not possible. If you have a network of high-bandwidth routers and
switches doing forwarding in hardware, and carrying no traffic other
than the packets you are timing (I have access to lab setups that
can meet this description) you can observe packet delivery times that
are stable at well under the microsecond level even though the total
time required to deliver a packet is much larger.
I'm not sure about this. Knowing about how switches work internally,
I'd guess they have "jitter" of something in the range of 1-10us, but
I've never done any measurements. Have you any hard numbers?
If you add competing
traffic, like real life networks, the packet-to-packet variability
becomes much worse, but this is sample noise that can be addressed
by taking larger numbers of samples and filtering based on the expected
statistics of that noise.
Here lies the big problem. While with GPS we pretty much know what
the time is that the signal takes to reach earth, we have no clue
with network packets in a loaded network. We don't even have an
idea what the packet transmit distribution is in the moment we are
doing our measurements. Neither the queue length in the router/switch
nor anything else. And the loading of a switch changes momentarily
and this in turn changes the delay of our packets. You can of course
apply math and try to get rid of quite a bit of noise, but you will
never get rid of it down to ns levels.
If I'm not mistaken, IEEE1588v1 had exactly that problem. They tried to
use "normal" switches and hoped the jitter would be predictable enough to
be compensated for. This didn't work out, so v2 now demands that all
switches act as boundary clocks.
As this level of synchronization is
usually achieved by the brute force method of measuring transit times
across every network device on the path from source to destination I
have no doubt that what NTP can do will necessarily be worse than this,
but I don't know of a basis that would predict whether NTP's "worse"
is necessarily going to be 10,000x worse or can be just 10x worse.
Knowing that would require actually trying it to measure what can be
done.
You can guesstimate that getting below 200us is not easy in a normal
network, but sub-1ms should be possible unless the network is very loaded.
Attila Kinali
--
There is no secret ingredient
-- Po, Kung Fu Panda
On 12/27/12 11:17 AM, Chris Albertson wrote:
On Thu, Dec 27, 2012 at 10:55 AM, Dennis Ferguson
dennis.c.ferguson@gmail.com wrote:
The reason you can't distribute ns level time over a network to normal
NTP clients is because of the random queuing that happens inside the
client's ethernet interfaces. The normal installed base of ethernet
cards does not do time stamping. So even uSec level timing is lost in
the typical client.
I've been playing with Arduinos and Ethernet over the past couple weeks
(all those home automation kind of chores.. not like I need to time the
smoker temperature ramps to nanosecond precision), but this brings up an
interesting issue..
A lot of the uncertainty is because of the "smart" Ethernet interface
that tries to do stuff to offload the processor (buffering, etc.). This
is one of the cases where old, less capable interfaces might do better.
So, what about the USB-Ethernet dongles? (I use them a lot at work to
add a second interface for a laptop in test equipment setups, talking to
a Prologix, for instance)
Or, the Wiznet "Ethernet/IP stack" on a chip devices that talk via SPI,
or basically, a serial port.
https://www.sparkfun.com/products/9473 for a widget
https://www.sparkfun.com/products/9471? for the part
http://www.saelig.com/product/BRD002.htm a Ethernet to serial port board
Some of these might have deterministic enough timing that it would be
useful. They sure are cheap.
Actually, those "smart" interfaces and the low-cost interfaces on dongles
use very dumb hardware and depend on software. So they are not as
deterministic as you'd like. The very best network hardware is
purpose-built and expensive and does the timing in hardware.
I still think the lowest-cost method is to distribute PPS to every
computer. Then the only special hardware the PC needs is a serial
port.
--
Chris Albertson
Redondo Beach, California
On Thu, 27 Dec 2012 11:30:37 -0800
Jim Lux jimlux@earthlink.net wrote:
So, what about the USB-Ethernet dongles? (I use them a lot at work to
add a second interface for a laptop in test equipment setups, talking to
a Prologix, for instance)
A lot worse! One thing is that USB has a polled protocol (i.e. the host
has to ask whether the device has any data) and the slot in which a USB
device can signal that a packet came in repeats itself at most a couple
of times per 1 ms frame for FS (I don't know off the top of my head for HS).
The bigger issue though is that drivers for Ethernet dongles are usually
of quite bad quality (i.e. it takes them a long time to do anything), and
they try to queue as much as possible to increase throughput (which adds
unpredictable delay).
Attila Kinali
--
There is no secret ingredient
-- Po, Kung Fu Panda
On 12/27/2012 08:28 PM, Attila Kinali wrote:
On Thu, 27 Dec 2012 10:55:12 -0800
Dennis Ferguson dennis.c.ferguson@gmail.com wrote:
I don't think I buy this. It takes 70 milliseconds for a signal
transmitted from a GPS satellite to be received on the ground, but
we don't use this fact to argue that sub-70 ms timing from GPS is
not possible. If you have a network of high-bandwidth routers and
switches doing forwarding in hardware, and carrying no traffic other
than the packets you are timing (I have access to lab setups that
can meet this description) you can observe packet delivery times that
are stable at well under the microsecond level even though the total
time required to deliver a packet is much larger.
I'm not sure about this. Knowing about how switches work internally,
I'd guess they have "jitter" of something in the range of 1-10us, but
I've never done any measurements. Have you any hard numbers?
Switches and routers can contribute a significant amount of time.
On GE, a full-length packet takes about 12 us to transmit, so a single packet's
head-of-line blocking can be anything up to that amount; with multiple
packets... well, it keeps adding up. Knowing how switches work doesn't
really help, as packets arrive at a myriad of rates; they interact and
cross-modulate and create strange patterns and dance in interesting ways
that are ever changing in unpredictable fashion.
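The 12 us figure is just the serialization time of a 1500-byte packet at 1 Gbit/s. A small sketch of that arithmetic, assuming a 1500-byte payload and ignoring Ethernet header, preamble and inter-frame gap (counting those gives a little over 12 us); the function name is made up:

```python
def serialization_delay_us(frame_bytes: int, link_bps: float) -> float:
    """Time to clock a frame onto the wire, in microseconds."""
    return frame_bytes * 8 / link_bps * 1e6

GIGABIT = 1e9
one_frame_us = serialization_delay_us(1500, GIGABIT)
print(f"1500-byte packet on GE: {one_frame_us:.1f} us")

# Worst-case wait if our packet queues behind several full-length packets
for queued in (1, 2, 5, 10):
    print(f"{queued:2d} packets ahead -> up to {queued * one_frame_us:.0f} us of waiting")
```

Each full-length packet already in the output queue can add up to one such unit of delay, which is why the variation grows so quickly once there is competing traffic.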
The GPS situation with ionosphere and troposphere is benign in
comparison, yet challenging in their own right.
If you add competing
traffic, like real life networks, the packet-to-packet variability
becomes much worse, but this is sample noise that can be addressed
by taking larger numbers of samples and filtering based on the expected
statistics of that noise.
Here lies the big problem. While with GPS we pretty much know what
the time is that the signal takes to reach earth, we have no clue
with network packets in a loaded network. We don't even have an
idea what the packet transmit distribution is in the moment we are
doing our measurements. Neither the queue length in the router/switch
nor anything else. And the loading of a switch changes momentarily
and this in turn changes the delay of our packets. You can of course
apply math and try to get rid of quite a bit of noise, but you will
never get rid of it down to ns levels.
A challenging thing, indeed.
If I'm not mistaken, IEEE1588v1 had exactly that problem. They tried to
use "normal" switches and hoped the jitter would be predictable enough to
be compensated for. This didn't work out, so v2 now demands that all
switches act as boundary clocks.
The irony of it is that 1588v2-aware switches can work worse than
dirt-cheap switches, since the extra packet handling takes a slow and
painful route through the switch.
As this level of synchronization is
usually achieved by the brute force method of measuring transit times
across every network device on the path from source to destination I
have no doubt that what NTP can do will necessarily be worse than this,
but I don't know of a basis that would predict whether NTP's "worse"
is necessarily going to be 10,000x worse or can be just 10x worse.
Knowing that would require actually trying it to measure what can be
done.
You can guesstimate that getting below 200us is not easy in a normal
network, but sub-1ms should be possible unless the network is very loaded.
The trouble is that your mileage will vary on a typical network: these
delays keep shifting, and you can get a good idea by measuring for a
few weeks, but you can't reliably predict it, just the overall shape of it.
Doing ~200 us for a non-trivial network with real data on it sounds about
right.
What kills many assumptions is that the noise-forms fail most of the
normal assumptions about "noise". It's not zero mean, it does not have a
static mean, it does not have a static variance, it is not symmetric, it
is not independent of other traffic, it is not "just another service".
Cheers,
Magnus
Magnus,
Doing ~200 us for a non-trivial network with real data on it sounds about
right.
What kills many assumptions is that the noise-forms fail most of the
normal assumptions about "noise". It's not zero mean, it does not have a
static mean, it does not have a static variance, it is not symmetric, it
is not independent of other traffic, it is not "just another service".
Cheers,
Magnus
But NTP is very much aware of the measurements coming from a complicated
statistical source. Look for 'wedge scattergram'.
http://www.eecis.udel.edu/~mills/database/brief/algor/algor.pdf
http://www.eecis.udel.edu/~mills/database/brief/distlec/distlec.ppt
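For anyone who has not met the wedge scattergram: plot each sample's measured offset against its measured round-trip delay and the points fall inside a wedge whose apex sits at the minimum delay, because the offset error can be at most half the excess delay. A rough simulation of that bound follows; the queueing model (independent exponential delays each way) and all the numbers are assumptions for illustration.

```python
import random

def ntp_sample(true_offset, base_oneway, q_out, q_back):
    """Return (offset, delay) computed from one simulated NTP exchange.

    true_offset: server clock minus client clock, in seconds
    base_oneway: fixed one-way delay, assumed symmetric
    q_out, q_back: extra queueing delay in each direction
    """
    t1 = 0.0                                      # client transmit, client clock
    t2 = t1 + base_oneway + q_out + true_offset   # server receive, server clock
    t3 = t2                                       # server replies immediately
    t4 = t1 + 2 * base_oneway + q_out + q_back    # client receive, client clock
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay

rng = random.Random(42)
TRUE_OFFSET = 0.0005   # 500 us, hypothetical
BASE = 0.0002          # 200 us each way, hypothetical
MEAN_QUEUE = 0.0001    # 100 us mean queueing delay, hypothetical

samples = [ntp_sample(TRUE_OFFSET, BASE,
                      rng.expovariate(1 / MEAN_QUEUE),
                      rng.expovariate(1 / MEAN_QUEUE))
           for _ in range(500)]

# Every point lies inside the wedge: |offset error| <= excess delay / 2
inside_wedge = all(abs(o - TRUE_OFFSET) <= (d - 2 * BASE) / 2 + 1e-12
                   for o, d in samples)

# The sample nearest the apex (minimum delay) has the tightest bound
best_offset, best_delay = min(samples, key=lambda s: s[1])
print(inside_wedge, best_offset, best_delay)
```

This shows why picking low-delay samples helps, but it says nothing about a path whose minimum delay is itself asymmetric or drifting.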
--
Björn
Björn,
On 12/28/2012 12:31 AM, bg@lysator.liu.se wrote:
Magnus,
Doing ~200 us for a non-trivial network with real data on it sounds about
right.
What kills many assumptions is that the noise-forms fail most of the
normal assumptions about "noise". It's not zero mean, it does not have a
static mean, it does not have a static variance, it is not symmetric, it
is not independent of other traffic, it is not "just another service".
Cheers,
Magnus
But NTP is very much aware of the measurements coming from a complicated
statistical source. Look for 'wedge scattergram'.
http://www.eecis.udel.edu/~mills/database/brief/algor/algor.pdf
http://www.eecis.udel.edu/~mills/database/brief/distlec/distlec.ppt
I am fully aware of it.
NTP applies many treatments to the measurements to combat these issues, but
it can't cut the noise down as far as you would like, to 1
us or so, over large-scale networks.
My comment above was to point out that you do need to look hard at the
topic to grasp all the details of it. There are many subtleties and the
wedge scattergram is one of several methods to approach it.
NTP can only go so far, given some of the design constraints it has. It
does a good job within those limits.
Cheers,
Magnus
On 12/27/12 11:48 AM, Attila Kinali wrote:
On Thu, 27 Dec 2012 11:30:37 -0800
Jim Lux jimlux@earthlink.net wrote:
So, what about the USB-Ethernet dongles? (I use them a lot at work to
add a second interface for a laptop in test equipment setups, talking to
a Prologix, for instance)
A lot worse! One thing is that USB has a polled protocol (i.e. the host
has to ask whether the device has any data) and the slot in which a USB
device can signal that a packet came in repeats itself at most a couple
of times per 1 ms frame for FS (I don't know off the top of my head for HS).
The bigger issue though is that drivers for Ethernet dongles are usually
of quite bad quality (i.e. it takes them a long time to do anything), and
they try to queue as much as possible to increase throughput (which adds
unpredictable delay).
OK, so throw out the USB version..
Just thinking out loud here: are these $20 wonders potentially a way to
get "better" time transfer via Ethernet?
I can see that a PC interface version might not be very good (because of
the need to build a low jitter driver for the PC for SPI or RS232).
But what about connecting one to some purpose-built widget? That is, rather
than invest in an IEEE 1588 interface, could one cobble together
something that falls somewhere in between the vanilla mobo Ethernet &
network driver that came with the OS on the "high jitter" end and some
IEEE 1588 rev2 on the "low jitter" end?
After all, there's no real "driver" when you're hooking it up to an
Arduino.
There might be some significant uncertainty in the microcontroller
inside the interface chip: it does quite a lot. But maybe not.. I
haven't looked at it, and I was just curious.
Hi
Everything I've seen on the Arduino are "single chip wonders". They encapsulate the entire TCP/IP stack on the external chip. That will make getting at anything pretty tough.
Bob
On 12/28/12 7:56 AM, Bob Camp wrote:
Hi
Everything I've seen on the Arduino are "single chip wonders". They encapsulate the entire TCP/IP stack on the external chip. That will make getting at anything pretty tough.
Well, they also do UDP..(at least the Wiznet 5100 does)..
I was wondering if the timing from "UDP datagram received" to "some
signal that Arduino sees" was deterministic at a useful level...
The Arduino code I've looked at seems to be at a "sockets" kind of
interface and there is an interrupt from the hardware when something
arrives.
On Fri, 28 Dec 2012 07:21:26 -0800
Jim Lux jimlux@earthlink.net wrote:
OK, so throw out the USB version..
Just thinking out loud here: are these $20 wonders potentially a way to
get "better" time transfer via Ethernet?
Depends on what your goal is.
I can see that a PC interface version might not be very good (because of
the need to build a low jitter driver for the PC for SPI or RS232).
How about a uC that has built-in IEEE1588 support? Quite a few of
the Cortex-M3 class do that already (e.g. LM3S9xxx, STM32F*, K62*).
They can do that time stamping with "a couple" lines of code
and feed it to the PC some way or other.
You don't have to run full IEEE1588; just use the timestamping that
the Ethernet MAC provides.
Attila Kinali
--
There is no secret ingredient
-- Po, Kung Fu Panda
Hi
It's still an encapsulated stack that just coughs up the information after it's done this and that to it. How long the this and that takes is dependent on how busy the network is. It's going to watch packets on the net first and push data to the Arduino second.
I'm beating on a similar part between posts. It really lags out as the traffic goes from near nothing to just a bit more than nothing. I'm not far enough into it to be able to put numbers on either measure. Looks like the units are in the thousands of bits per second range though.
Bob
On 27 Dec, 2012, at 11:28 , Attila Kinali attila@kinali.ch wrote:
On Thu, 27 Dec 2012 10:55:12 -0800
Dennis Ferguson dennis.c.ferguson@gmail.com wrote:
I don't think I buy this. It takes 70 milliseconds for a signal
transmitted from a GPS satellite to be received on the ground, but
we don't use this fact to argue that sub-70 ms timing from GPS is
not possible. If you have a network of high-bandwidth routers and
switches doing forwarding in hardware, and carrying no traffic other
than the packets you are timing (I have access to lab setups that
can meet this description) you can observe packet delivery times that
are stable at well under the microsecond level even though the total
time required to deliver a packet is much larger.
I'm not sure about this. Knowing about how switches work internally,
I'd guess they have "jitter" of something in the range of 1-10us, but
I've never done any measurements. Have you any hard numbers?
I've measured it for large routers, but the numbers are not mine. In
a former life I helped design forwarding path ASICs.
I'm interested in what that guess is based on, however, since I can't
imagine where 1-10us of self-generated jitter from an ethernet switch
would come from, if not from competing traffic. A well-spec'd piece of
silicon to handle 20 Gbps of full-duplex bandwidth needs to be capable
of processing about 40 million packet arrivals per second, or about
one packet every 25 ns. That's pretty much what is needed to build
a good ~$200, 24 port gigabit ethernet switch. The cheapest hardware
forwarding path to implement, which is generally what you'll find in
there, is a fixed processing pipeline (or pipelines) that takes packets
in at the required rate and spits out the results at that rate delayed by
N chip clock cycles; N might be large (but not too large; N tells you
how many packets it needs to be able to have in process simultaneously
and it is cheaper in logic if you can minimize that number) but it is a
constant. Your jitter estimate implies that such a switch, even when
not occupied with other traffic, will either sometimes leave a packet
sitting around for between 40 and 400 packet arrival times before getting
around to doing something with it, or else will sometimes do between 40
and 400 packet arrival times worth of extra work to forward the thing.
My experience with this suggests that it is actually easier to build if
it doesn't work like that. The switch I recently bought for my house,
this one
http://www.netgear.com/business/products/switches/prosafe-plus-switches/JGS524E.aspx#
specifies the total latency (that's total time, not jitter) through the
switch at 4.1 us for 64 byte packets, a precision I expect they
arrived at by just adding up the store-and-forward and fixed pipeline
delays. Nearly all of the variation in delay is from competing traffic.
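For anyone who wants to check the arithmetic above, a short sketch using only the figures quoted in this message (40 million arrivals per second, 1-10 us of supposed jitter, 4.1 us latency for 64 byte packets on gigabit ports). Splitting the 4.1 us into a store-and-forward part and a fixed pipeline part is my guess at how such a spec is put together, not something the datasheet states.

```python
ARRIVALS_PER_SECOND = 40e6      # packet arrivals the silicon must handle
packet_time_ns = 1e9 / ARRIVALS_PER_SECOND
print(f"one packet arrival time: {packet_time_ns:.0f} ns")

def in_packet_times(jitter_us: float) -> float:
    """Express a jitter figure as a count of packet arrival times."""
    return jitter_us * 1000 / packet_time_ns

low, high = in_packet_times(1), in_packet_times(10)
print(f"1-10 us of jitter = {low:.0f} to {high:.0f} packet arrival times")

# 64-byte packet must be fully received before forwarding (store-and-forward)
store_and_forward_us = 64 * 8 / 1e9 * 1e6
pipeline_us = 4.1 - store_and_forward_us   # assumed remainder: fixed pipeline delay
print(f"store-and-forward {store_and_forward_us:.3f} us, "
      f"remaining fixed delay about {pipeline_us:.1f} us")
```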
Even if 1-10us was observed for individual samples, however, that is
still missing the point. The interesting number is not the variability
of individual samples; it is the stability of the measure of central
tendency derived from many such samples (e.g. the average, if the
variation were gaussian).
If you add competing
traffic, like real life networks, the packet-to-packet variability
becomes much worse, but this is sample noise that can be addressed
by taking larger numbers of samples and filtering based on the expected
statistics of that noise.
Here lies the big problem. While with GPS we pretty much know what
the time is that the signal takes to reach earth, we have no clue
with network packets in a loaded network. We don't even have an
idea what the packet transmit distribution is in the moment we are
doing our measurements. Neither the queue length in the router/switch
nor anything else. And the loading of a switch changes momentarily
and this in turn changes the delay of our packets. You can of course
apply math and try to get rid of quite a bit of noise, but you will
never get rid of it down to ns levels.
?? NTP is a two-way time transfer. We directly measure how long the
cumulative queue lengths are for the round trip for each sample, and we
hence directly measure how this changes from sample to sample. There are
also good statistical models for the average behaviour of such queues when
operating at traffic levels where packet losses are rare and where the
bandwidth is not being significantly consumed by a small number of large,
correlated, flows, which is the usual operating state for both local
networks and Internet backbones (it is usually access circuits that are
the problem) and there are heuristics one can use to determine when the
statistics are not likely to be so nice; these are of use when designing
the thing which has the queues. What we haven't had is hosts and servers
capable of making precise measurements either of packet arrivals and
departures (why is a ping round trip reported to be 200 us or 400 us
when the packet spends less than 50 us in the network between the machines?),
nor of external reference time sources like GPS nor, really, any good
way to measure, and hence improve, the quality of the end result we
want, which is the time on the client's clock.
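To make the "two-way time transfer" point concrete, here is a minimal sketch of the standard NTP on-wire calculation from the four timestamps of one exchange. The example numbers (a 5 ms clock offset, 100 us one-way delays, 300 us of extra outbound queueing) are made up for illustration.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Standard NTP on-wire calculation.

    t1: client transmit time (client clock)
    t2: server receive time  (server clock)
    t3: server transmit time (server clock)
    t4: client receive time  (client clock)
    Returns (offset of server relative to client, round-trip delay).
    """
    delay = (t4 - t1) - (t3 - t2)
    offset = ((t2 - t1) + (t3 - t4)) / 2
    return offset, delay


# Hypothetical exchange: server is 5 ms ahead of the client, 100 us each
# way on the wire, 20 us of server processing, empty queues.
quiet = ntp_offset_and_delay(0.0, 0.0051, 0.00512, 0.00022)

# Same exchange, but the request sat in an output queue for an extra 300 us.
queued = ntp_offset_and_delay(0.0, 0.0054, 0.00542, 0.00052)

print("quiet :", quiet)    # offset ~5.000 ms, delay ~200 us
print("queued:", queued)   # offset ~5.150 ms, delay ~500 us
```

The extra queueing shows up directly as a 300 us increase in the measured round-trip delay, and the offset error it causes (150 us here) is at most half of that increase.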
Since we're now starting to see computers with peripherals which address
some of these measurement problems really well (hardware time stamping
for packets, hardware PPS timestamp capture) at the small 10's of
nanoseconds level, what bothers me is the argument that there is no
use trying to make use of this, other than for timenut bragging purposes,
since NTP can't operate at anywhere near that level. To me this argument
is near perfect in its circularity.
If I'm not mistaken, IEEE 1588v1 had exactly that problem. They tried to
use "normal" switches and hoped the jitter would be predictable enough to
be compensated for. This didn't work out, so v2 now demands that all
switches act as boundary clocks.
Yes, NTP will never match a properly implemented PTP, but then again the
claims for what a properly implemented PTP can do still leave a lot of
room between there and a microsecond.
While PTP was originally conceived as a consumer networking thing, note that
the major use of PTP, and one driving its design, has turned out to be in
telecommunications networks where the replacement of traditional, finely-clocked,
carrier circuits with ethernet for backhaul has deprived the thing at the far
end of the backhaul circuit (say, a GSM/UMTS base station) of the frequency
reference it formerly relied on. The requirements for this application are
stringent enough that the failure of 1588v1 to meet them cannot be construed
as saying anything of practical importance about the ability of something
that works like 1588v1 to set your computer's clock, other than it won't do
as well as a well done 1588v2.
As this level of synchronization is
usually achieved by the brute force method of measuring transit times
across every network device on the path from source to destination I
have no doubt that what NTP can do will necessarily be worse than this,
but I don't know of a basis that would predict whether NTP's "worse"
is necessarily going to be 10,000x worse or can be just 10x worse.
Knowing that would require actually trying it to measure what can be
done.
You can guesstimate that getting below 200us is not easy in a normal
network, but sub-1ms should be possible unless the network is very loaded.
So how did you compute the 200 us guess? I know of no basis for that
prediction.
If you look in Dr. Mills's NTP book, towards the end, you'll find a
plot of the Allan deviation of several apparently perfectly vanilla
computer clocks against an NTP reference (i.e. across a network). This
is a quite old result (the better machine is a DEC Alpha) so the NTP
timestamps are certainly being taken in software using 1990's computer
technology. The minimum Allan deviation is about 10^-8 at about 1000
seconds, not numbers that are going to impress anyone, but numbers that
are still the raw material for an average 10 us clock maintained with an
NTP time reference, with an old system and a nothing-special clock (I think
the machines must have been kept in an air conditioned room to eliminate
systematic oscillator variations well enough to produce such a pretty
plot, though). And, in fact, the 10 us might well be in part reflecting
the stability of the NTP server clock at the state of the art then,
rather than the network, so the number with a more precise server and
the same network might have been better still.
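For readers who want to see where the 10 us figure comes from, here is a rough sketch. It uses the simple non-overlapping Allan deviation estimator computed from phase (time error) samples, and the rule of thumb that the time wander over an interval tau is on the order of sigma_y(tau) * tau. The function name and the test data are illustrative only and are not taken from the book.

```python
import math


def allan_deviation(phase, tau0, m=1):
    """Non-overlapping Allan deviation at tau = m * tau0.

    phase: time-error samples in seconds, taken every tau0 seconds.
    """
    x = phase[::m]
    tau = m * tau0
    n = len(x)
    if n < 3:
        raise ValueError("need at least three phase samples at this tau")
    # Sum of squared second differences of the phase data.
    total = sum((x[i + 2] - 2 * x[i + 1] + x[i]) ** 2 for i in range(n - 2))
    return math.sqrt(total / (2 * (n - 2) * tau ** 2))


# A clock with a pure frequency offset (linear phase) has ~zero Allan deviation.
linear = [1e-6 * i for i in range(100)]
print(allan_deviation(linear, tau0=1.0))

# Phase alternating by +/-1 us every second.
alternating = [1e-6 if i % 2 == 0 else -1e-6 for i in range(50)]
print(allan_deviation(alternating, tau0=1.0))

# Rule of thumb applied to the numbers quoted above:
# 1e-8 at 1000 s corresponds to roughly 10 us of time wander.
time_wander_s = 1e-8 * 1000
print(f"{time_wander_s * 1e6:.0f} us")
```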
So the logical question might be why these measurements indicate that he
had the raw material for a 10 us, NTP synchronized clock, but one seldom
seems to see anything that good when running ntpd? I guess I'd just
point out that the difference between ntpd and the Allan deviation
measurements he shows is that ntpd wasn't running when the latter were
made; the difference is ntpd. What this suggests to me is that either
the things ntpd shows you about what it is doing do not reflect the
actual quality of its end product (the synchronization of the computer's
clock), or that ntpd does not make good use of the raw material available
to it. In either case, if you are making your prediction by looking
at what ntpd says and trying to extrapolate from that to what is possible
(or even what is currently happening) you may be fooling yourself.
There are still many things to learn here.
Dennis Ferguson
I have used a commercial stack for 8-bit microcontrollers (Micronet CMX) and, as is, it would be a bad choice for timing because of the time it spends processing packets before deciding whether they are for itself or not. It could be modified, but you need a developer license to get the source code, and it starts at $15,000, I believe.
Didier
-----Original Message-----
From: Bob Camp lists@rtty.us
To: Discussion of precise time and frequency measurement time-nuts@febo.com
Sent: Fri, 28 Dec 2012 11:36 AM
Subject: Re: [time-nuts] clock-block any need ?
Hi
It's still an encapsulated stack that just coughs up the information after it's done this and that to it. How long the this and that takes is dependent on how busy the network is. It's going to watch packets on the net first and push data to the Arduino second.
I'm beating on a similar part between posts. It really lags out as the traffic goes from near nothing to just a bit more than nothing. I'm not far enough into it to be able to put numbers on either measure. Looks like the units are in the thousands of bits per second range though.
Bob
On Dec 28, 2012, at 11:18 AM, Jim Lux jimlux@earthlink.net wrote:
On 12/28/12 7:56 AM, Bob Camp wrote:
Hi
Everything I've seen for the Arduino is a "single chip wonder". These parts encapsulate the entire TCP/IP stack on the external chip. That will make getting at anything pretty tough.
Well, they also do UDP..(at least the Wiznet 5100 does)..
I was wondering if the timing from "UDP datagram received" to "some signal that Arduino sees" was deterministic at a useful level...
The Arduino code I've looked at seems to be at a "sockets" kind of interface and there is an interrupt from the hardware when something arrives.
STMicroelectronics has a 37-page paper
on the Ethernet hardware time-stamping MAC of the STM32F107. The National
DP83640 hardware time-stamping PHY is another example (and was already
pointed out here some time ago).
On Fri, 28 Dec 2012 11:54:53 -0800
Dennis Ferguson dennis.c.ferguson@gmail.com wrote:
I'm not sure about this. Knowing about how switches work internally,
I'd guess they have "jitter" of something in the range of 1-10us, but
I've never done any measurements. Have you any hard numbers?
I've measured it for large routers, but the numbers are not mine. In
a former life I helped design forwarding path ASICs.
I'm interested in what that guess is based on, however, since I can't
imagine where 1-10us of self-generated jitter from an ethernet switch
would come from, if not from competing traffic.
Routing table lookups. These things are getting more and more optimized
towards low cost and throughput. The lookups can and often will be (partially)
offloaded to an ARM or MIPS core running right next to the switch matrix
itself. The switch matrix itself is pretty fast. I wouldn't expect more than
a couple of hundred ns of jitter from that, if it even gets as high as 100 ns.
Even if 1-10us was observed for individual samples, however, that is
still missing the point. The interesting number is not the variability
of individual samples; it is the stability of the measure of central
tendency derived from many such samples (e.g. the average, if the
variation were Gaussian).
Yes, and that's where the statistics of the packet arrival time distribution
enter the game.
?? NTP is a two-way time transfer. We directly measure how long the
cumulative queue lengths are for the round trip for each sample, and we
hence directly measure how this changes from sample to sample. There are
also good statistical models for the average behaviour of such queues when
operating at traffic levels where packet losses are rare and where the
bandwidth is not being significantly consumed by a small number of large,
correlated, flows, which is the usual operating state for both local
networks and Internet backbones (it is usually access circuits that are
the problem) and there are heuristics one can use to determine when the
statistics are not likely to be so nice; these are of use when designing
the thing which has the queues.
Yes, but these statistics are based on steady-state conditions, which
you don't have in real networks. Something is changing every second.
And if you say that these changes are part of the statistics, then
you must measure over a long period of time, minutes
if not hours, and then derive your statistics from that.
What we haven't had is hosts and servers
capable of making precise measurements either of packet arrivals and
departures (why is a ping round trip reported to be 200 us or 400 us
when the packet spends less than 50 us in the network between the machines?),
nor of external reference time sources like GPS nor, really, any good
way to measure, and hence improve, the quality of the end result we
want, which is the time on the client's clock.
That's the other big issue. But IIRC PHK mentioned a GBit or 10GBit card
a couple of months ago, that contains time stamping hardware for 1588.
So we might end up with consumer grade cards that support this soonish.
(As you have written.)
Since we're now starting to see computers with peripherals which address
some of these measurement problems really well (hardware time stamping
for packets, hardware PPS timestamp capture) at the small 10's of
nanoseconds level, what bothers me is the argument that there is no
use trying to make use of this, other than for timenut bragging purposes,
since NTP can't operate at anywhere near that level. To me this argument
is near perfect in its circularity.
I don't really get what you are getting at here.
If I'm not mistaken, IEEE 1588v1 had exactly that problem. They tried to
use "normal" switches and hoped the jitter would be predictable enough to
be compensated for. This didn't work out, so v2 now demands that all
switches act as boundary clocks.
Yes, NTP will never match a properly implemented PTP, but then again the
claims for what a properly implemented PTP can do still leave a lot of
room between there and a microsecond.
While PTP was originally conceived as a consumer networking thing,
Oh..kay... And I thought it was for measurement instruments being
synced up through the ubiquitous Ethernet links instead of running
an additional coax.
note that
the major use of PTP, and one driving its design, has turned out to be in
telecommunications networks where the replacement of traditional, finely-clocked,
carrier circuits with ethernet for backhaul has deprived the thing at the far
end of the backhaul circuit (say, a GSM/UMTS base station) of the frequency
reference it formerly relied on. The requirements for this application are
stringent enough that the failure of 1588v1 to meet them cannot be construed
as saying anything of practical importance about the ability of something
that works like 1588v1 to set your computer's clock, other than it won't do
as well as a well done 1588v2.
Interesting piece of information. Do you have links to material I could
read up on this?
As this level of synchronization is
usually achieved by the brute force method of measuring transit times
across every network device on the path from source to destination I
have no doubt that what NTP can do will necessarily be worse than this,
but I don't know of a basis that would predict whether NTP's "worse"
is necessarily going to be 10,000x worse or can be just 10x worse.
Knowing that would require actually trying it to measure what can be
done.
You can guesstimate that getting below 200us is not easy in a normal
network, but sub-1ms should be possible unless the network is very loaded.
So how did you compute the 200 us guess? I know of no basis for that
prediction.
From the data ntpd gives me in the networks I manage.
I hardly get any jitter number below 1ms, even with an unloaded network
and unloaded hosts. The 200us comes from the "usual" RTT measurements
on PCs.
There are still many things to learn here.
Aren't we here for exactly this reason? :-)
Attila Kinali
--
There is no secret ingredient
-- Po, Kung Fu Panda
On 27 Dec, 2012, at 15:13 , Magnus Danielson magnus@rubidium.dyndns.org wrote:
On GE, a full-length packet is about 12 us, so a single packet's head-of-line blocking can be anything up to that amount, multiple packets... well, it keeps adding. Knowing how switches work doesn't really help as packets arrive in a myriad of rates, they interact and cross-modulate and create strange patterns and dance in interesting ways that are ever changing in unpredictable fashion.
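The 12 us figure quoted above can be checked with a couple of lines. This sketch assumes a 1518-byte maximum untagged frame plus 20 bytes of preamble and inter-frame gap; the helper names are illustrative.

```python
LINE_RATE_BPS = 1_000_000_000   # gigabit Ethernet


def frame_time_us(frame_bytes=1518, wire_overhead_bytes=20):
    """Time one frame occupies a gigabit link, in microseconds."""
    return (frame_bytes + wire_overhead_bytes) * 8 / LINE_RATE_BPS * 1e6


def head_of_line_delay_us(frames_ahead, frame_bytes=1518):
    """Worst-case wait behind a number of full-length frames already queued."""
    return frames_ahead * frame_time_us(frame_bytes)


print(f"one full-length frame: {frame_time_us():.3f} us")
for frames_ahead in (1, 3, 10):
    print(f"{frames_ahead:2d} frames ahead: {head_of_line_delay_us(frames_ahead):.1f} us")
```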
I wanted to address this bit because it seems like most
people base their expectations for NTP on this complexity,
as does the argument being made above, but the holiday
intervened. While I suspect many people are thoroughly
bored of this topic by now I can't resist completing the
thought.
Yes, the delay of a sample packet through an output queue
will be proportional to the number of untransmitted bits in
the queue ahead of it, yes, the magnitude of that delay can
be very large and largely variable and, even, yes, the
statistics governing that delay may often be unpredictable and
non-gaussian, exhibiting dangerously heavy tails. The thing is,
though, that this doesn't necessarily have to matter so much. A
better approach might avoid relying on the things you can't know.
To see how, consider a different question: what is the
probability that any two samples sent through that queue
will experience precisely the same delay (i.e. find precisely
the same number of bits queued in front of it when it
gets there)? I think it is fairly conservative to predict
that the probability that two samples will arrive at a non-empty
output queue with exactly the same number of bits in front of
them will be fairly small; the number of bits in the queue will
be continuously changing, so the delay through a non-empty queue
should have a near-continuous (and unpredictable) probability
distribution, as you point out, and if the sampling is uncorrelated
with the competing traffic it is unlikely that any pair of
samples will find exactly the same point on that distribution.
The exception to this, of course, is a queue length of
precisely 0 bits (which is precisely why the behaviour
of a switch with no competing traffic is interesting). The
vast majority of queues in the vast majority of network
devices in real networks are nowhere near continuously
occupied for long periods. The time-averaged fractional load
on the circuit a queue is feeding is also the probability of
finding the queue not-empty. If the average load on the
output circuit is less than 100% then multiple samples are
probably going to find that queue precisely empty; if the
average load on the output circuit is 50% (and that would be
an unusually high number in a LAN, though maybe less
unusual in other contexts) then 50% of the samples that pass
through that queue are going to find it empty. Since samples
that found the queue empty will have experienced pretty much
identical delays, the "results" (for some value of "result")
from those samples will cluster closely together. The
results from samples which experienced a delay will
differ from that cluster but, as discussed above, will also
differ from each other and generally won't form a cluster
somewhere else. The cluster marks the good spot independent
of the precise (and precisely unknowable) nature of the statistics
governing the distribution of samples outside the cluster. If
we can find the cluster we have a result which does not depend
on understanding the precise behaviour of samples outside the
cluster.
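A toy simulation may make the cluster argument easier to see. Everything here is an assumption chosen for illustration: a single queue at 50% load, a 4.1 us empty-queue delay, exponentially distributed queueing delay with a 50 us mean, 10 ns of timestamp noise, and a simple histogram to locate the cluster. It is a sketch of the idea, not a proposed algorithm.

```python
import random


def simulate_delays(n, load, base_us=4.1, mean_queue_us=50.0,
                    noise_us=0.01, seed=1):
    """One-way delays through a single output queue (toy model)."""
    rng = random.Random(seed)
    delays = []
    for _ in range(n):
        d = base_us + rng.gauss(0.0, noise_us)   # empty-queue delay + timestamp noise
        if rng.random() < load:                  # queue found non-empty
            d += rng.expovariate(1.0 / mean_queue_us)
        delays.append(d)
    return delays


def find_cluster(delays, bin_us=0.2):
    """Return (mean of the most populated bin, fraction of samples in it)."""
    bins = {}
    for d in delays:
        bins.setdefault(int(d // bin_us), []).append(d)
    members = max(bins.values(), key=len)
    return sum(members) / len(members), len(members) / len(delays)


delays = simulate_delays(10_000, load=0.5)
cluster_us, fraction = find_cluster(delays)
mean_us = sum(delays) / len(delays)

print(f"cluster at {cluster_us:.3f} us holding {fraction:.0%} of samples")
print(f"plain average of all samples: {mean_us:.1f} us")
```

In this model about half of the samples land in the cluster at the empty-queue delay, as the text predicts for 50% load, while the plain average is pulled far away by the queued samples.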
Given this it is also worthwhile to consider "jitter", which
intuition based on a normal distribution assumption might suggest
should be predictive of the quality of the result derived from a
collection of samples. In the situation above, however, the
dominant contributors to "jitter", however measured, are going
to be the samples outside the cluster since they are the ones
that are "jittering" (it is that property we are relying on to
define the cluster). If jitter mostly measures information
about the samples the estimate doesn't rely on then it tells you
little about the samples the estimate does rely on, and hence
can provide no prediction about the quality of an estimate
derived from those samples alone. In fact, in a true perversion
of normal intuition, high jitter and heavy-tailed probability
distributions might even make it easier to get a good result
by making it easier to identify the cluster. Saying "I see
a lot of jitter" doesn't necessarily tell you anything about
what is possible.
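The point about jitter can be illustrated with the same kind of toy model. The sketch below compares two made-up scenarios at 50% load, one with short queues (0.3 us mean queueing delay) and one with long queues (500 us mean). Exponential queueing delay is used as a simple stand-in and is not truly heavy-tailed; all parameters are assumptions.

```python
import random
import statistics

BASE_US = 4.1   # assumed empty-queue delay


def sample_delays(n, mean_queue_us, load=0.5, noise_us=0.01, seed=2):
    """Toy single-queue delay samples, in microseconds."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        d = BASE_US + rng.gauss(0.0, noise_us)
        if rng.random() < load:
            d += rng.expovariate(1.0 / mean_queue_us)
        out.append(d)
    return out


def cluster_centre(delays, bin_us=0.2):
    """Mean of the samples in the most populated histogram bin."""
    bins = {}
    for d in delays:
        bins.setdefault(int(d // bin_us), []).append(d)
    members = max(bins.values(), key=len)
    return sum(members) / len(members)


results = {}
for label, mean_queue_us in (("short queues", 0.3), ("long queues", 500.0)):
    delays = sample_delays(10_000, mean_queue_us)
    jitter = statistics.pstdev(delays)               # "jitter" over all samples
    error = abs(cluster_centre(delays) - BASE_US)    # error of the cluster estimate
    results[label] = (jitter, error)
    print(f"{label}: jitter {jitter:9.3f} us, cluster error {error:.4f} us")
```

In this model the long-queue case has jitter more than a hundred times larger, yet its cluster estimate is the better of the two, because queued samples rarely land close enough to the cluster to contaminate it.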
While the argument gets a lot more complex in a hurry, and
too much to attempt here (the above is too much already), I
believe this general approach can scale to a whole large network
of devices with queues (though even the single-switch case has real
life relevance too). That is, I think it is possible to find a
sample "result" for which there is a strong tendency for "good"
samples to cluster together while "bad" samples are unlikely to do
so, with the quality of the result depending on the population and
nature of variability of the cluster but hardly at all on the
outliers, and with the lack of a measurable cluster telling you
when you might be better off relying on your local clock rather
than the network. The approach relies on the things we do know
about networks and networking equipment while avoiding reliance on
things we can't know: it mostly avoids making gaussian statistical
assumptions about distributions that may not be gaussian. The field
of robust statistics provides tools addressing this which might
be of use.
I guess it is worth completing this by mentioning what it
says about ntpd. First, ntpd knows all of the above, probably
much, much better than I do, though it might not put it in
quite the same terms. If you make the assumption that the
stochastic delays experienced by samples are evenly distributed
between the outbound and inbound paths (this is not a good match
for the real world, by the way, but there are constraints...) then
round trip delay becomes a stand-in measure of "cluster", and ntpd
does what it can with this. The fundamental constraint that limits
what ntpd can do, in a couple of ways, is the fact that the final
stage of its filter is a PLL. The integrator in a PLL assumes
that the errors in the samples it is being fed are zero-mean and
normally distributed, and will fail to arrive at a correct answer if
this is not the case, so if you want to filter samples for which
this is unlikely to be the case you need to do it before they get
to the PLL. The problem with doing this well, however, is that a
PLL is also destabilised by adding delays to its feedback path,
causing errors of a different nature, so anything done before the
PLL is severely limited in the amount of time it can spend doing
that, and hence the number of samples it can look at to do that.
Doing better probably requires replacing the PLL; the "replace
it with what?" question is truly interesting.
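As an illustration of using round-trip delay as a stand-in for "cluster", here is a minimal sketch in the spirit of ntpd's clock filter, which keeps the most recent eight samples and prefers the one with the lowest delay. The class below is a simplification for illustration and is not ntpd's actual code; in particular it ignores sample age and dispersion.

```python
from collections import deque


class MinDelayFilter:
    """Keep the last few (offset, delay) samples; trust the lowest-delay one."""

    def __init__(self, stages=8):
        # Old samples fall off the end automatically once the window is full.
        self.samples = deque(maxlen=stages)

    def add(self, offset, delay):
        """Add a sample and return (offset, delay) of the current best sample."""
        self.samples.append((delay, offset))
        best_delay, best_offset = min(self.samples)
        return best_offset, best_delay


f = MinDelayFilter()
print(f.add(offset=0.010, delay=0.0050))   # only sample so far
print(f.add(offset=0.002, delay=0.0004))   # lower delay, so it wins
print(f.add(offset=0.030, delay=0.0200))   # queued sample is ignored
```

The short window is the limitation described above: the filter can only look at a handful of samples before handing one to the loop, so a long run of queued samples will eventually push the good one out.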
I suspect I've gone well off topic for this list, however, and for
that I apologize. I just wanted to make sure it was understood that
there is an argument for the view that we do not yet know of any
fundamental limits on the precision that NTP, or a network time
protocol like NTP, might achieve, so any effort to build NTP servers
and clients which can make their measurements more precisely is not
a waste of time. It instead is what is required to make progress
in understanding how to do this better.
Dennis Ferguson
Hi
The problem with your approach is that you can depart from "normal" for very long periods of time. Consider my home network, running NTP to external sources. Around 4 in the afternoon all the kids get home and start streaming video. At 7 I get home and start doing the same thing. We each keep this up for 5 hours. After midnight, BitTorrent fires up and runs through 5 AM. At midday, there's a video conference that runs from home for an hour or two.
Each of these things creates a non-zero load ahead of the NTP packets at some point. Given network congestion and retransmission, the load will really pile up at various times. Given the high level of transmit / receive asymmetry in my cable modem, it will be pretty hard for me to figure out what's going on.
The net result will be that my NTP hops around a bit during the day.
Bob
On 02/01/13 02:57, Dennis Ferguson wrote:
On 27 Dec, 2012, at 15:13, Magnus Danielson magnus@rubidium.dyndns.org wrote:
On GE, a full-length packet is about 12 us, so a single packet's head-of-line blocking can be anything up to that amount; with multiple packets, well, it keeps adding. Knowing how switches work doesn't really help, as packets arrive at a myriad of rates, and they interact, cross-modulate, create strange patterns and dance in interesting ways that are ever-changing in unpredictable fashion.
I wanted to address this bit because it seems like most
people base their expectations for NTP on this complexity,
as does the argument being made above, but the holiday
intervened. While I suspect many people are thoroughly
bored of this topic by now I can't resist completing the
thought.
Be advised that it was a short description of a much lengthier discussion.
Yes, the delay of a sample packet through an output queue
will be proportional to the number of untransmitted bits in
the queue ahead of it, yes, the magnitude of that delay can
be very large and largely variable and, even, yes, the
statistics governing that delay may often be unpredictable and
non-gaussian, exhibiting dangerously heavy tails. The thing is,
though, that this doesn't necessarily have to matter so much. A
better approach might avoid relying on the things you can't know.
Hard to avoid fundamental properties of transmission, at least when they
have been made fundamental properties.
Recall that the queue length is quantized in steps, and that various
"padding" (preamble sequence, header, trailer, postamble sequence)
occurs. 8-bit quantization is safe to assume as the minimum step for GE, due
to its 8B/10B encoding format on the optical channel. For optical GE,
event resolution is therefore 8 ns.
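A rough sketch of the arithmetic behind the "about 12 us" and "8 ns" figures, assuming the standard Ethernet framing overheads (the constants and function names below are illustrative and are not taken from the thread):

```python
# Assumed standard Ethernet framing sizes, in bytes (not stated in the thread).
PREAMBLE_SFD = 8      # preamble plus start-of-frame delimiter
HEADER = 14           # destination MAC, source MAC, EtherType
MAX_PAYLOAD = 1500    # largest standard payload
FCS = 4               # frame check sequence
INTERFRAME_GAP = 12   # minimum idle time between frames, in byte times

GE_BIT_RATE = 1_000_000_000  # user data bits per second on gigabit Ethernet


def byte_time_ns(bit_rate=GE_BIT_RATE):
    """Time to serialise one byte, in nanoseconds."""
    return 8 * 1_000_000_000 / bit_rate


def wire_time_us(payload=MAX_PAYLOAD, bit_rate=GE_BIT_RATE):
    """Time one frame occupies the link, overheads included, in microseconds."""
    total_bytes = PREAMBLE_SFD + HEADER + payload + FCS + INTERFRAME_GAP
    return total_bytes * byte_time_ns(bit_rate) / 1000


print(byte_time_ns())   # one byte step on GE
print(wire_time_us())   # one full-length frame on GE
```

Under these assumptions a full-length frame takes 1538 byte times, which is where a head-of-line blocking bound of roughly 12 us per competing packet comes from.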
To see how, consider a different question: what is the
probability that any two samples sent through that queue
will experience precisely the same delay (i.e. find precisely
the same number of bits queued in front of it when it
gets there)? I think it is fairly conservative to predict
that the probability that two samples will arrive at a non-empty
output queue with exactly the same number of bits in front of
them will be fairly small; the number of bits in the queue will
be continuously changing, so the delay through a non-empty queue
should have a near-continuous (and unpredictable) probability
distribution, as you point out, and if the sampling is uncorrelated
with the competing traffic it is unlikely that any pair of
samples will find exactly the same point on that distribution.
Yes and no. It is hard to do with a low asking rate, but some properties
can improve with a high asking rate.
The exception to this, of course, is a queue length of
precisely 0 bits (which is precisely why the behaviour
of a switch with no competing traffic is interesting). The
vast majority of queues in the vast majority of network
devices in real networks are nowhere near continuously
occupied for long periods. The time-averaged fractional load
on the circuit a queue is feeding is also the probability of
finding the queue not-empty. If the average load on the
output circuit is less than 100% then multiple samples are
probably going to find that queue precisely empty; if the
average load on the output circuit is 50% (and that would be
an unusually high number in a LAN, though maybe less
unusual in other contexts) then 50% of the samples that pass
through that queue are going to find it empty. Since samples
that found the queue empty will have experienced pretty much
identical delays, the "results" (for some value of "result")
from those samples will cluster closely together. The
results from samples which experienced a delay will
differ from that cluster but, as discussed above, will also
differ from each other and generally won't form a cluster
somewhere else. The cluster marks the good spot independent
of the precise (and precisely unknowable) nature of the statistics
governing the distribution of samples outside the cluster. If
we can find the cluster we have a result which does not depend
on understanding the precise behaviour of samples outside the
cluster.
Whenever you want to do this, you need to measure the network more
furiously, thus the asking rate goes up.
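The empty-queue argument can be illustrated with a small simulation. This is only a sketch under stated assumptions: a single queue, a uniformly distributed queueing delay for samples that find it busy, a small timestamping noise, and a "cluster" defined as samples within a fixed width of the smallest delay seen. None of these modelling choices or names come from the thread.

```python
import random


def simulate_delays(n, load, base_us=50.0, max_queue_us=120.0, seed=1):
    """Return n one-way delays (microseconds) through a single output queue.

    With probability `load` a sample finds the queue busy and picks up an
    extra, effectively continuous, queueing delay; otherwise it only sees
    the fixed base delay plus a little timestamping noise.
    """
    rng = random.Random(seed)
    delays = []
    for _ in range(n):
        delay = base_us + rng.uniform(-0.004, 0.004)  # assumed timestamp noise
        if rng.random() < load:
            delay += rng.uniform(0.0, max_queue_us)   # assumed queueing model
        delays.append(delay)
    return delays


def cluster_fraction(delays, width_us=0.01):
    """Fraction of samples lying within width_us of the smallest delay."""
    floor = min(delays)
    return sum(1 for d in delays if d - floor <= width_us) / len(delays)


for load in (0.1, 0.5, 0.9):
    fraction = cluster_fraction(simulate_delays(10_000, load))
    print(f"load {load:.0%}: {fraction:.1%} of samples in the cluster")
```

In this toy model the share of samples in the cluster tracks one minus the load, so a heavily loaded link needs many more probes to collect the same number of cluster samples, which is the asking-rate point.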
Given this it is also worth while to consider "jitter", which
intuition based on a normal distribution assumption might suggest
should be predictive of the quality of the result derived from a
collection of samples. In the situation above, however, the
dominant contributors to "jitter", however measured, are going
to be the samples outside the cluster since they are the ones
that are "jittering" (it is that property we are relying on to
define the cluster). If jitter mostly measures information
about the samples the estimate doesn't rely on then it tells you
little about the samples the estimate does rely on, and hence
can provide no prediction about the quality of an estimate
derived from those samples alone. In fact, in a true perversion
of normal intuition, high jitter and heavy-tailed probability
distributions might even make it easier to get a good result
by making it easier to identify the cluster. Saying "I see
a lot of jitter" doesn't necessarily tell you anything about
what is possible.
I think one has to realize that what queues and scheduling do to
packet delays defies the normal "jitter" statistics quite a bit.
The delay varies, and the properties of the delay vary. It is an
ever-shifting property. There are, however, a few known properties of this
"jitter". For one thing, it always increases the delay (assuming that we
do not change path in the network).
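A small numerical illustration of why overall jitter says little about the samples an estimate would actually use. The delay values are made up for the example, and picking the cluster as "everything within a fixed width of the minimum" is an assumption that leans on queueing only ever adding delay:

```python
import statistics

# Hypothetical one-way delays in microseconds: five samples that met an empty
# queue and five that were held up behind other traffic.
delays_us = [100.00, 137.2, 100.10, 512.9, 99.90, 260.4, 100.00, 181.7, 100.05, 903.3]


def split_cluster(delays, width_us=0.5):
    """Split samples into those near the minimum delay and the rest."""
    floor = min(delays)
    cluster = [d for d in delays if d - floor <= width_us]
    outliers = [d for d in delays if d - floor > width_us]
    return cluster, outliers


cluster, outliers = split_cluster(delays_us)
jitter_all = statistics.pstdev(delays_us)     # dominated by the outliers
jitter_cluster = statistics.pstdev(cluster)   # spread of the samples actually used

print(f"jitter over all samples:  {jitter_all:.1f} us")
print(f"jitter inside the cluster: {jitter_cluster:.3f} us")
```

With these numbers the spread over all samples is several orders of magnitude larger than the spread inside the cluster, so the headline jitter figure is almost entirely a statement about samples that would be discarded.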
While the argument gets a lot more complex in a hurry, and
too much to attempt here (the above is too much already), I
believe this general approach can scale to a whole large network
of devices with queues (though even the single-switch case has real
life relevance too). That is, I think it is possible to find a
sample "result" for which there is a strong tendency for "good"
samples to cluster together while "bad" samples are unlikely to do
so, with the quality of the result depending on the population and
nature of variability of the cluster but hardly at all on the
outliers, and with the lack of a measurable cluster telling you
when you might be better off relying on your local clock rather
than the network. The approach relies on the things we do know
about networks and networking equipment while avoiding reliance on
things we can't know: it mostly avoids making gaussian statistical
assumptions about distributions that may not be gaussian. The field
of robust statistics provides tools addressing this which might
be of use.
It just isn't a good set of tools. This is why lots of effort has been
put into research. A few search terms for you: min-TDEV and MAFE.
min-TDEV is one of a number of algorithms in which a block-min
pre-filter is applied prior to the TDEV measure. As the number of samples
in each block increases, the TDEV measure drops.
There are also cluster and percentile approaches being looked
at, but the common trend here is that the asking rate becomes higher,
much higher.
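A minimal sketch of the block-min pre-filter idea, assuming non-overlapping blocks of a fixed number of samples. The TDEV computation that would follow is not shown, and the function name and sample values are illustrative only:

```python
def block_min(samples, block_size):
    """Reduce a delay series to the minimum of each consecutive block.

    Trailing samples that do not fill a whole block are dropped.
    """
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    usable = len(samples) - len(samples) % block_size
    return [min(samples[i:i + block_size]) for i in range(0, usable, block_size)]


# Hypothetical delays in microseconds; the floor is near 50 us.
delays = [50.0, 83.1, 50.1, 61.7, 140.2, 50.0, 50.2, 97.5, 55.3]
print(block_min(delays, 3))  # [50.0, 50.0, 50.2]
```

Larger blocks make it more likely that each block contains a sample that saw an empty queue, but they also need proportionally more packets to produce filtered values at the same rate.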
I guess it is worth completing this by mentioning what it
says about ntpd. First, ntpd knows all of the above, probably
much, much better than I do, though it might not put it in
quite the same terms.
Yes and no. NTPD implements impressive filtering. However, it sends far
too few packets to probe the network delays for the filters
to eat down far enough through the jitter. PTP allows for higher asking
rates, and that is one of the things it has going for it compared to NTP.
If you make the assumption that the
stochastic delays experienced by samples are evenly distributed
between the outbound and inbound paths (this is not a good match
for the real world, by the way, but there are constraints...) then
round trip delay becomes a stand-in measure of "cluster", and ntpd
does what it can with this.
The wedge dispersion plots are nice. The top and bottom parts of the wedge
hold the min samples of one-way delay in either the in-bound or out-bound
direction. It's not a bad solution, but it needs more samples to chew on.
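The wedge shape follows from the standard NTP on-wire calculation, which splits the round-trip delay evenly between the two directions. A sketch with hypothetical delays in microseconds (the `exchange` helper and its parameters are invented for the illustration):

```python
def ntp_offset_delay(t1, t2, t3, t4):
    """Standard NTP on-wire calculation from the four exchange timestamps.

    t1: client transmit, t2: server receive, t3: server transmit,
    t4: client receive. Returns (offset, round_trip_delay).
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def exchange(true_offset, out_delay, back_delay, server_hold=0.0, t1=0.0):
    """Build the four timestamps for one exchange (all values hypothetical)."""
    t2 = t1 + out_delay + true_offset    # server clock = client clock + offset
    t3 = t2 + server_hold
    t4 = t3 - true_offset + back_delay   # back on the client clock
    return t1, t2, t3, t4


# Clocks agree (true offset 0); 100 us each way when both queues are empty.
print(ntp_offset_delay(*exchange(0.0, 100.0, 100.0)))  # (0.0, 200.0)
# 40 us of extra queueing on the outbound path only.
print(ntp_offset_delay(*exchange(0.0, 140.0, 100.0)))  # (20.0, 240.0)
# 40 us of extra queueing on the inbound path only.
print(ntp_offset_delay(*exchange(0.0, 100.0, 140.0)))  # (-20.0, 240.0)
```

Extra delay in one direction shifts the computed offset by half that delay, with the sign set by the direction, so the edges of the wedge are the samples whose excess delay was entirely one-way and the apex is the minimum-delay cluster.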
The fundamental constraint that limits
what ntpd can do, in a couple of ways, is the fact that the final
stage of its filter is a PLL.
That is the traditional view, yes.
The integrator in a PLL assumes
that the errors in the samples it is being fed are zero-mean and
normally distributed, and will fail to arrive at a correct answer if
this is not the case, so if you want to filter samples for which
this is unlikely to be the case you need to do it before they get
to the PLL. The problem with doing this well, however, is that a
PLL is also destabilised by adding delays to its feedback path,
causing errors of a different nature, so anything done before the
PLL is severely limited in the amount of time it can spend doing
that, and hence the number of samples it can look at to do that.
Doing better probably requires replacing the PLL; the "replace
it with what?" question is truly interesting.
The integrator does not expect zero-mean samples. Its infinite gain at
DC drives the detector to produce zero-mean samples. If a set of samples
with a non-zero average comes in, the DC component of those steers the
integrator state such that the frequency shifts, the phase ramp
chases it in, and the phase detector starts producing zero-mean
samples again. This is the property of the PI-style PLL being used.
It's how it should be.
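A minimal discrete-time sketch of that behaviour, assuming a noiseless phase detector, a constant frequency error, and arbitrarily chosen gains. It is an illustration of a generic PI loop, not a model of ntpd's actual loop filter:

```python
def run_pi_loop(freq_error, kp=0.2, ki=0.01, steps=500):
    """Simulate a discrete PI loop steering a clock with a constant frequency error.

    freq_error is the phase the uncorrected clock gains per step. Returns the
    final phase-detector output and the final integrator state.
    """
    phase_error = 0.0   # what the phase detector reports
    integrator = 0.0    # accumulated frequency correction
    for _ in range(steps):
        correction = kp * phase_error + integrator
        integrator += ki * phase_error
        phase_error += freq_error - correction
    return phase_error, integrator


final_error, final_integrator = run_pi_loop(freq_error=3.0)
print(final_error)       # settles at (numerically) zero
print(final_integrator)  # settles at the frequency error, 3.0
```

The detector initially sees a run of samples with a non-zero average; the integrator absorbs that average as a frequency correction, after which the detector output returns to zero.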
To your point, the unstable delay as measured by NTP causes the
phase to wobble around. Long-term frequency is actually safe; the length
of the time-stamps ensures that. Phase and frequency stability, however, are
affected. It's not the PI-loop that is the culprit, but instability of
the measure. A Kalman filter for timing turns out to be quite near a
self-tuned PI-loop BTW. If you want to combat this noise, you need to do
it with some model of it and a means to create a quieter product. Some of
that is public, some of it isn't.
As for delay in the feedback path, this has been systematically
investigated and there is a lovely paper that shows that to maintain the
same damping, the bandwidth needs to go down as the delay goes up. If you
have low enough bandwidth, you need to trim your damping coefficient
instead. It's not a flaw in the traditional PI PLL, it's just that the
property was not taken into account, and hence the wrong model
and stability analysis were applied to the situation. Do the homework and you get
back to safe ground.
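The trade-off between loop delay and loop gain can be seen in a toy simulation. To keep it short, this sketch uses a proportional-only loop rather than a full PI loop, with a measurement that is a fixed number of steps old. The gains and delay are arbitrary illustrative values and are not taken from the paper mentioned above:

```python
from collections import deque


def peak_error_with_delay(kp, delay_steps, steps=600, tail=50):
    """Proportional-only phase loop whose measurement is delay_steps old.

    Starts from a phase error of 1.0 and returns the largest absolute error
    seen in the last `tail` steps: near zero if the loop settled, huge if it
    went unstable.
    """
    history = deque([1.0] * (delay_steps + 1), maxlen=delay_steps + 1)
    trace = []
    for _ in range(steps):
        stale = history[0]      # oldest entry: the delayed measurement
        current = history[-1]   # the actual present phase error
        history.append(current - kp * stale)
        trace.append(history[-1])
    return max(abs(x) for x in trace[-tail:])


print(peak_error_with_delay(kp=0.5, delay_steps=0))  # no delay: settles
print(peak_error_with_delay(kp=0.5, delay_steps=5))  # same gain, delayed: diverges
print(peak_error_with_delay(kp=0.1, delay_steps=5))  # lower gain, delayed: settles
```

The gain that is comfortable without delay makes the delayed loop oscillate with growing amplitude, while reducing the gain (and with it the bandwidth) restores stability.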
I suspect I've gone well off topic for this list, however, and for
that I apologize. I just wanted to make sure it was understood that
there is an argument for the view that we do not yet know of any
fundamental limits on the precision that NTP, or a network time
protocol like NTP, might achieve, so any effort to build NTP servers
and clients which can make their measurements more precisely is not
a waste of time. It instead is what is required to make progress
in understanding how to do this better.
I think you have misunderstood my intentions here. NTP isn't a bad
build, it's quite impressive. There are a number of things to improve in
it. PTP has taken the lead in some fields, but is lagging behind NTP in
others. NTP is just not operating in touch with what I believe is the
state of the art in packet delay measurements for timing. There are several
things that would need to be changed in NTP for it to compete well;
some of them are in what the standard says, others lie in how the
standard or the system is being used. You can do a lot
more within the realm of NTP, but some of the design decisions
previously made would have to be scrapped.
Cheers,
Magnus