Vested interest on my part: I'm a Meinberg agent :-)
Yes, NTP can achieve an end-to-end precision down to approx. 50 us (that's microseconds) of jitter, if you sync a Linux "client" on bare metal, running Chrony or ntpd, to a Linux-based NTP server disciplined by a GPS, a local atomic clock or some such source.
On the machine that has a local GPS (with a PPS interconnect), you will probably see 0-2 microseconds of offset, between the ntpd instance running in the OS, and its PPS refclock driver's input.
Those residual 50 us "end to end over a LAN" are a result of several stages of buffering, variable IRQ latency, other traffic interfering on the LAN and on the computer busses involved and whatnot. 50 us means a LAN with very little traffic. Even just a switch can add some microseconds of jitter - and higher-end switches with complex features add more latency and jitter.
In other words, it can be pretty difficult to achieve those 50 microseconds in your real-world conditions on some practical LAN.
Similarly, those roughly <2 us of PPS offset result from just the IRQ latency uncertainty and general bus latency jitter on well-behaved PC hardware.
Note that NTP and its implementations ntpd and Chrony measure the round-trip time of each NTP transaction and subtract (add, actually) half of that round trip, as a measure to filter away the systematic one-way transport latency. They also perform outlier rejection, quorum consensus and syspeer election, and any NTP daemon filters the responses it gets to its upstream queries. So as others have said, the milliseconds that you see in ping and traceroute do not directly offset your local clock. What matters is the variability of the transaction round-trip time, i.e. other traffic on the path to your upstream NTP server. ntpq -p is your friend.
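To make the round-trip arithmetic concrete, here is a minimal sketch of the standard NTP on-wire calculation from the four timestamps of one transaction. The function name and the sample numbers are made up for illustration; real daemons do a lot of filtering on top of this.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """One NTP transaction.

    t1: client transmit time (client clock)
    t2: server receive time  (server clock)
    t3: server transmit time (server clock)
    t4: client receive time  (client clock)
    All four values must use the same unit.
    """
    # Round trip on the wire, with the server's processing time taken out.
    delay = (t4 - t1) - (t3 - t2)
    # Server clock minus client clock, assuming a symmetric path.
    offset = ((t2 - t1) + (t3 - t4)) / 2
    return offset, delay


# Hypothetical exchange, all values in microseconds: the server clock is
# 500 us ahead of the client, the path takes 1000 us each way and the
# server needs 100 us to answer.
symmetric = ntp_offset_and_delay(t1=0, t2=1500, t3=1600, t4=2100)

# Same clocks and same total round trip, but 1200 us out and 800 us back.
asymmetric = ntp_offset_and_delay(t1=0, t2=1700, t3=1800, t4=2100)

print(symmetric)   # (500.0, 2000)
print(asymmetric)  # (700.0, 2000)
```

In the symmetric case the constant 2 ms of round trip drops out completely. In the asymmetric case the measured offset is wrong by half of the asymmetry (200 us here), and the protocol cannot see that from the timestamps alone.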
A basic GPS receiver for timing use, with a TCXO, can have maybe 100-200 ns of residual jitter + wander on its PPS output. Plenty good enough for NTP, as long as the GPS stays locked. (Holdover performance is not very good with TCXOs.) A quality timing GPS with an OCXO can be well within 100 ns, maybe more like 10-30 ns of residual error (offset from the global UTC).
Note that actual satellites flying overhead and beaming at you through an atmosphere may be a slightly tougher game for the receiver, than benchmarking in a lab with a GPS generator.
PTP is a hammer. You need HW support in the grandmaster, and in the slaves, and in any switches - but if you get all that, residual offsets down to low double digits of nanoseconds are possible. I have personally seen this in ptp4l running with an i210 NIC, which has HW support (timestamping with a nanosecond resolution).
The i210 chip is a wonder. It has 4 general-purpose pins that can be used to input or output a PPS signal. The reference Intel add-on NIC board with the i210 (and its OEM versions from several big vendors) comes equipped with a pin header that gives you access to at least 2 of those GPIO pins (Intel calls them SDPs). Apart from implementing a PTP grandmaster port, the PPS input can be leveraged for precise timestamping in packet capture. You need a precise source of PPS and a custom piece of software to run a servo loop, fine-tuning the i210's PHC to the external PPS. On my test rig, this resulted in single-digit ns (per 1 s iteration) of residual offset. This is the precision that you then get in your capture timestamps, if you run a recent tcpdump or Wireshark on a modern Linux kernel (all the software needs support for nanosecond-level resolution). Better yet: I went all the way and built a simple PLL synth to produce 25 MHz for the NIC clocks, locked to a precise upstream 10 MHz reference. After that, the residual offset in the servo loop of my packet capture rig dropped to a clean 0 (proof that my 10 MHz reference is phase-synchronous with the PPS from that same GPS box).
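For a feel of what such a servo loop does, here is a toy simulation of a PI controller steering a clock's frequency once per second. This is not my actual software: the function name, the gains and the starting values are arbitrary assumptions for illustration. A real implementation would take the offset from the NIC's PPS timestamps and apply the correction to the PHC frequency through the kernel's PTP clock interface, which is left out here.

```python
def simulate_pi_servo(initial_offset_ns, freq_error_ppb,
                      kp=0.7, ki=0.3, steps=60):
    """Simulate a PI servo that runs once per PPS edge (1 s iterations).

    initial_offset_ns: local clock minus reference at the first edge
    freq_error_ppb:    uncorrected frequency error of the local oscillator
    kp, ki:            illustrative gains (assumed, not tuned for any HW)
    Returns the list of offsets seen at each PPS edge.
    """
    offset_ns = float(initial_offset_ns)
    integral_ppb = 0.0
    history = []
    for _ in range(steps):
        history.append(offset_ns)
        # The integral term learns the oscillator's steady frequency error.
        integral_ppb += ki * offset_ns
        # Positive offset = clock is ahead, so we slow it down by this much.
        correction_ppb = kp * offset_ns + integral_ppb
        # Over 1 s, 1 ppb of net frequency error moves the phase by 1 ns.
        offset_ns += freq_error_ppb - correction_ppb
    return history


# Hypothetical start: 1000 ns off, oscillator running 500 ppb fast.
offsets = simulate_pi_servo(initial_offset_ns=1000, freq_error_ppb=500)
print(offsets[:5])
print(abs(offsets[-1]) < 1e-3)  # True: the loop has settled
```

In this noise-free toy the offset settles to zero. On real hardware the measured offset keeps wobbling by the timestamping noise, which is where the residual single-digit ns comes from.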
Note that PTP grandmasters may be specified to provide timestamps with an actual granularity of 8 ns (in a data type with 1 ns resolution). This makes sense: gigabit Ethernet tends to use a 125 MHz clock. It serves as a byte clock in the internals of the MAC, it is probably also used in the GMII, and it's also the symbol clock in copper 1000BASE-T (four pairs in parallel, 2 bits per symbol per pair). So unless you're using fiber-optic 1000BASE-X with SERDES and an extremist implementation of the HW timestamping unit in the PHY that works down to individual SERDES bits, those 8 ns are all you can ever realistically hope for on gigabit Ethernet. Some datasheets of chips with PTP support even claim that the MII data path is not free of buffering, and that some jitter can come from there.
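The 8 ns figure is just the period of that 125 MHz clock. A small sketch of the arithmetic follows; the names are made up, and the flooring is only an assumed model of a timestamping unit that latches on clock edges.

```python
GMII_CLOCK_HZ = 125_000_000                   # byte clock of gigabit Ethernet
TICK_NS = 1_000_000_000 // GMII_CLOCK_HZ      # one clock period in ns

# 125 Mbaud x 4 pairs x 2 bits per symbol gives the gigabit line rate.
LINE_RATE_BPS = GMII_CLOCK_HZ * 4 * 2


def quantize_to_tick(timestamp_ns, tick_ns=TICK_NS):
    """Floor a timestamp to the clock edge at or before it."""
    return (timestamp_ns // tick_ns) * tick_ns


print(TICK_NS)                          # 8
print(LINE_RATE_BPS)                    # 1000000000
print(quantize_to_tick(1_000_000_013))  # 1000000008
```

The timestamp field can still express 1_000_000_013 ns, but a unit clocked this way can only ever report multiples of 8 ns.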
The PTP packets actually contain timestamps stored in a data type that allows for deep sub-nanosecond resolution. But the "sub-nanosecond fractional field" is nowadays typically unused. AFAIR only the White Rabbit project (related to CERN, the Swiss research center) has implemented sub-ns precision so far.
PTP is also available in pure software, without HW acceleration. In that case, for a SW-based GM and a SW-based client, expect residual jitter similar to NTP - i.e. about 50 us on a dedicated but PTP-unaware LAN. I recall getting sub-microsecond precision from a HW grandmaster on a direct interconnect (no switch in-between) and a SW-only client (on a PTP-unaware PC NIC). Compared to NTP, PTP's servo converges much faster.
While doing some "homework", it recently occurred to me that transporting PPS or similar "discrete" timing signals over wide-area fiber optic routes may be susceptible to temperature-dependent propagation time "wander". And although I have no way to test this experimentally, some sources in the interwebs quote figures between 40 and 76 picoseconds per km and Kelvin. Note that while this kind of "thermal wander" is impossible to mitigate "in band" in simplex PPS transmission, PTP would post-compensate this inherently, based on its standard path delay measurements (which depend on full duplex transmission).
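To put those coefficients into perspective, here is a back-of-the-envelope calculation. The route length and the temperature swing are made-up example values, not measurements, and the function name is mine.

```python
def fiber_thermal_wander_ps(length_km, delta_t_kelvin, coeff_ps_per_km_k):
    """Change of one-way propagation delay for a given temperature change."""
    return length_km * delta_t_kelvin * coeff_ps_per_km_k


# Hypothetical route: 100 km of fiber seeing a 20 K temperature swing.
low_ps = fiber_thermal_wander_ps(100, 20, 40)    # lower quoted coefficient
high_ps = fiber_thermal_wander_ps(100, 20, 76)   # upper quoted coefficient

print(low_ps / 1000, "ns")    # 80.0 ns
print(high_ps / 1000, "ns")   # 152.0 ns
```

So under these assumptions the wander would land in the same ballpark as the residual error of a decent timing GPS, or above it.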
So much for an overview of what the "precisions" look like with different timing technologies / interfaces. What level of precision is good enough for you depends on your application and your actual needs.
---- Update in 2020: ----
A colleague has recently demonstrated to me that Chrony can be made to run NTP on well-behaved PC hardware (with no special treatment to its bus clock oscillators) with a residual end-to-end jitter of a microsecond or less. This can be observed under the following conditions:
There's a Gentoo wiki page claiming that the hardware timestamping is dependent on co-existence with ptp4l, which I doubt (not suggested by Chrony's own documentation) - though it does make sense to me that the PHC in a NIC should get somehow disciplined for HW timestamping to make good sense... not sure if Chrony can benefit from a PHC with a free-running clock. Apart from ptp4l, I can hack an i210 NIC to have its 25 MHz clock PLL'ed to a precise frequency reference and its PPS bolted to a precise PPS reference. I haven't yet tried running Chrony on top of that though :-)
Another interesting observation: standard PTP with the G.8275.2 Telecom Profile (i.e. not White Rabbit) can achieve low double-digit nanoseconds of "wander" over a modern MPLS VPN that's not congested and gives priority to PTP traffic. That, with the MPLS switches unaware of PTP (no on-path support) and over ~1000 km of distance... Measured as MTIE between PPS signals = the ultimate net result, and the oscillator in the slave was a high-end double-oven OCXO. Yes, this practically means ideal conditions, and the real world tends to be a cruel place. Those figures are not something you should take for granted. Just demonstrating the potential. Also note that the immediate protocol jitter reported by the PTP slave at runtime is much worse than the net deviation measured at the oscillator's 1PPS output. And, over those distances / hop counts, you will get a hefty constant offset (path asymmetry) that needs to get calibrated / subtracted for the extracted PPS signal to be of any use. And the asymmetry will change upon topology changes (path backup routes kicking into action)...
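For readers who have not met MTIE: it is the worst peak-to-peak time error found in any observation window of a given length. A naive sketch follows, assuming one time-error sample per PPS edge; the function name and the sample series are invented for illustration and are not my measurement data.

```python
def mtie(time_error_ns, window_intervals):
    """Maximum Time Interval Error for one window length.

    time_error_ns:    time error samples, evenly spaced (e.g. one per PPS)
    window_intervals: window length in sample intervals, so each window
                      covers window_intervals + 1 consecutive samples
    """
    n = window_intervals
    if n < 1 or n >= len(time_error_ns):
        raise ValueError("window must span 1..len-1 sample intervals")
    worst = 0
    # Slide the window over the series, keep the largest peak-to-peak value.
    for k in range(len(time_error_ns) - n):
        window = time_error_ns[k:k + n + 1]
        worst = max(worst, max(window) - min(window))
    return worst


# Invented time error series in ns, one sample per second.
te = [0, 5, -3, 12, 8, -6, 2]

print(mtie(te, 1))  # 15
print(mtie(te, 2))  # 18
print(mtie(te, 6))  # 18
```

MTIE can only grow or stay flat as the window gets longer, and a constant offset (such as an uncalibrated path asymmetry) does not show up in it at all, which is why the asymmetry has to be dealt with separately.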