Timekeeping in virtual machines
This page is under construction! This page or section is a work in progress and may thus be incomplete. Its content may be changed in the near future.
The conventional ways to keep track of time also work in a VM, but they are either very slow (e.g. HPET) or do not work correctly if the VM is migrated (e.g. TSC).
To work around this, hypervisors such as QEMU/KVM provide several ways to keep track of time while sacrificing little performance.
KVM_HC_CLOCK_PAIRING
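This hypercall returns a snapshot of a host clock together with the TSC value it was paired with, which the guest can use to calculate the host's clock. The clock type KVM_CLOCK_PAIRING_WALLCLOCK selects the host's CLOCK_REALTIME.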
The host copies the following structure to a physical address given by the guest:
    struct kvm_clock_pairing {
        s64 sec;
        s64 nsec;
        u64 tsc;
        u32 flags;
        u32 pad[9];
    };
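For use outside the Linux tree, the structure can be written with fixed-width C types. The following is a minimal sketch, assuming that `s64`, `u64` and `u32` map to the `<stdint.h>` types and that natural alignment adds no padding. The helper `kvm_clock_pairing_to_ns` is a hypothetical name introduced here for illustration and is not part of any KVM header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the structure above, using <stdint.h> types. */
struct kvm_clock_pairing {
    int64_t  sec;     /* host clock, seconds part */
    int64_t  nsec;    /* host clock, nanoseconds part */
    uint64_t tsc;     /* TSC value paired with sec/nsec */
    uint32_t flags;
    uint32_t pad[9];
};

/* 8 + 8 + 8 + 4 + 9 * 4 = 64 bytes, with no implicit padding. */
static_assert(sizeof(struct kvm_clock_pairing) == 64, "unexpected layout");
static_assert(offsetof(struct kvm_clock_pairing, tsc) == 16, "unexpected layout");

/* Hypothetical helper: fold sec/nsec into a single nanosecond count. */
static inline int64_t kvm_clock_pairing_to_ns(const struct kvm_clock_pairing *p)
{
    return p->sec * INT64_C(1000000000) + p->nsec;
}
```

The static assertions catch a compiler or ABI that would lay the structure out differently from what the host writes.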
A hypercall is performed with the `vmcall` instruction (`vmmcall` on AMD CPUs). On KVM, RAX holds the hypercall number on entry and the return value on exit, and RBX, RCX, RDX and RSI hold the arguments. No other registers are clobbered (unless explicitly noted).
For example, calling KVM_HC_CLOCK_PAIRING can be done as follows on x86_64:
    ; rdi: physical address to copy structure to
    ; rsi: clock type (KVM_CLOCK_PAIRING_WALLCLOCK = 0)
    ; returns: status in rax
    kvm_hc_clock_pairing:
        push rbx            ; rbx is callee-saved in the System V ABI
        mov eax, 9          ; KVM_HC_CLOCK_PAIRING
        mov rbx, rdi
        mov rcx, rsi
        vmcall
        pop rbx
        ret
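The same call can be made from C with inline assembly. The following is a minimal sketch, assuming GCC or Clang on x86_64 and code running in the guest kernel. The wrapper names and `kvm_clock_pairing_unsupported` are illustrative definitions written for this page, and the error numbers are the values used by Linux's `kvm_para.h` as far as the author of this sketch is aware, so verify them against your headers.

```c
#include <assert.h>
#include <stdint.h>

#define KVM_HC_CLOCK_PAIRING        9
#define KVM_CLOCK_PAIRING_WALLCLOCK 0

/* Error numbers (assumed from Linux's kvm_para.h); returned negated in RAX. */
#define KVM_ENOSYS     1000
#define KVM_EOPNOTSUPP 95

#if defined(__x86_64__) && defined(__GNUC__)
/* Two-argument hypercall: RAX = number, RBX and RCX = arguments.
 * The register constraints let the compiler save and restore RBX. */
static inline long kvm_hypercall2(unsigned long nr, unsigned long p1,
                                  unsigned long p2)
{
    long ret;
    __asm__ volatile("vmcall"
                     : "=a"(ret)
                     : "a"(nr), "b"(p1), "c"(p2)
                     : "memory");
    return ret;
}

/* Ask the host to fill the kvm_clock_pairing structure at phys_addr. */
static inline long kvm_clock_pairing(uint64_t phys_addr)
{
    return kvm_hypercall2(KVM_HC_CLOCK_PAIRING, phys_addr,
                          KVM_CLOCK_PAIRING_WALLCLOCK);
}
#endif

/* Hypothetical helper: nonzero if the host does not offer this hypercall
 * or this clock type, so the guest should use another time source. */
static inline int kvm_clock_pairing_unsupported(long ret)
{
    return ret == -KVM_ENOSYS || ret == -KVM_EOPNOTSUPP;
}
```

Declaring RBX as an input operand, rather than moving into it by hand, leaves the save and restore of that callee-saved register to the compiler.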
pvclock
pvclock is a simple protocol and the fastest way to properly track system time in a VM.
To use it, write a 64-bit value consisting of a 4-byte-aligned physical address with bit 0 set to 1 (the enable bit) to MSR_KVM_SYSTEM_TIME_NEW (0x4b564d01).
The presence of this MSR is indicated by bit 3 in EAX from leaf 0x4000_0001 of CPUID.
The host will write the following structure to this address:
    struct pvclock_vcpu_time_info {
        u32 version;
        u32 pad0;
        u64 tsc_timestamp;
        u64 system_time;
        u32 tsc_to_system_mul;
        s8 tsc_shift;
        u8 flags;
        u8 pad[2];
    };
The host will automatically update this structure when necessary (e.g. when finishing a migration).
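The enable step described above comes down to one bit test and one OR. The following is a minimal sketch of those two pieces as pure functions. Executing CPUID and writing the MSR are left to whatever primitives your kernel already has. `KVM_CPUID_FEATURES` and `KVM_FEATURE_CLOCKSOURCE2` follow the names used by Linux, while the two function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MSR_KVM_SYSTEM_TIME_NEW  0x4b564d01u
#define KVM_CPUID_FEATURES       0x40000001u  /* CPUID leaf to query */
#define KVM_FEATURE_CLOCKSOURCE2 3            /* bit index in EAX */

/* True if EAX of the feature leaf advertises MSR_KVM_SYSTEM_TIME_NEW. */
static inline bool kvm_has_system_time_new(uint32_t cpuid_eax)
{
    return (cpuid_eax >> KVM_FEATURE_CLOCKSOURCE2) & 1u;
}

/* Build the value to write to the MSR: the physical address of the
 * structure with bit 0 (enable) set. Fails if the address is not
 * 4-byte aligned, leaving *value untouched. */
static inline bool pvclock_msr_value(uint64_t phys_addr, uint64_t *value)
{
    if (phys_addr & 3u)
        return false;
    *value = phys_addr | 1u;
    return true;
}
```

A guest would query the leaf, check the bit, then write the computed value to `MSR_KVM_SYSTEM_TIME_NEW` with its own `wrmsr` routine.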
The system time in nanoseconds is calculated as follows:
    time = rdtsc() - tsc_timestamp
    if (tsc_shift >= 0)
        time <<= tsc_shift
    else
        time >>= -tsc_shift
    time = (time * tsc_to_system_mul) >> 32    // the product can exceed 64 bits
    time = time + system_time
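The version field is used to detect when the structure has been or is being updated. If the version is odd, an update is in progress and the guest must not read the other fields yet. The guest should therefore read the version before and after reading the other fields, and retry unless both values are equal and even.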
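The calculation and the version check can be combined into a single read routine. The following is a minimal sketch, assuming GCC or Clang on a 64-bit target: `unsigned __int128` and `__builtin_ia32_rdtsc` are compiler extensions, and the function names `pvclock_ns` and `pvclock_read` are hypothetical. Ordering of `rdtsc` relative to the surrounding loads is ignored here; a production guest would likely use a serialized TSC read.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

struct pvclock_vcpu_time_info {
    uint32_t version;
    uint32_t pad0;
    uint64_t tsc_timestamp;
    uint64_t system_time;
    uint32_t tsc_to_system_mul;
    int8_t   tsc_shift;
    uint8_t  flags;
    uint8_t  pad[2];
};
static_assert(sizeof(struct pvclock_vcpu_time_info) == 32, "unexpected layout");

/* Nanoseconds for a given TSC reading, from a consistent copy of the fields. */
static inline uint64_t pvclock_ns(const struct pvclock_vcpu_time_info *s,
                                  uint64_t tsc)
{
    uint64_t delta = tsc - s->tsc_timestamp;
    if (s->tsc_shift >= 0)
        delta <<= s->tsc_shift;
    else
        delta >>= -s->tsc_shift;
    /* A 64 x 32 bit product can exceed 64 bits, so widen before shifting. */
    uint64_t scaled =
        (uint64_t)(((unsigned __int128)delta * s->tsc_to_system_mul) >> 32);
    return scaled + s->system_time;
}

#if defined(__x86_64__) && defined(__GNUC__)
/* Retry until the fields were read under an even, unchanged version. */
static inline uint64_t pvclock_read(const volatile struct pvclock_vcpu_time_info *info)
{
    struct pvclock_vcpu_time_info snap = {0};
    uint32_t before, after;
    uint64_t tsc;
    do {
        before = info->version;
        atomic_thread_fence(memory_order_acquire);
        snap.tsc_timestamp     = info->tsc_timestamp;
        snap.system_time       = info->system_time;
        snap.tsc_to_system_mul = info->tsc_to_system_mul;
        snap.tsc_shift         = info->tsc_shift;
        tsc = __builtin_ia32_rdtsc();   /* read inside the protected window */
        atomic_thread_fence(memory_order_acquire);
        after = info->version;
    } while ((before & 1u) || before != after);
    return pvclock_ns(&snap, tsc);
}
#endif
```

Reading the TSC inside the loop matters: if the host rewrites the structure during a migration, a TSC value sampled earlier would be combined with parameters that no longer match it.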
Hyper-V TSC page
    struct ms_hyperv_tsc_page {
        volatile u32 tsc_sequence;
        u32 reserved1;
        volatile u64 tsc_scale;
        volatile s64 tsc_offset;
        u64 reserved2[509];
    };
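This page does not yet describe how the structure is used, so the following is only a sketch based on the author's understanding of Microsoft's Hyper-V specification and of how Linux reads the page: the reference time, in 100 ns units, is `((tsc * tsc_scale) >> 64) + tsc_offset`, and a `tsc_sequence` of 0 means the page is not valid and another time source must be used. Treat those details as assumptions to verify. The function names are hypothetical, and `unsigned __int128` and `__builtin_ia32_rdtsc` are GCC/Clang extensions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

struct ms_hyperv_tsc_page {
    volatile uint32_t tsc_sequence;
    uint32_t          reserved1;
    volatile uint64_t tsc_scale;
    volatile int64_t  tsc_offset;
    uint64_t          reserved2[509];
};
/* 4 + 4 + 8 + 8 + 509 * 8 = 4096: the structure fills one 4 KiB page. */
static_assert(sizeof(struct ms_hyperv_tsc_page) == 4096, "unexpected layout");

/* Assumed formula: ((tsc * scale) >> 64) + offset, in 100 ns units. */
static inline uint64_t hv_scale_tsc(uint64_t tsc, uint64_t scale, int64_t offset)
{
    uint64_t scaled = (uint64_t)(((unsigned __int128)tsc * scale) >> 64);
    return scaled + (uint64_t)offset;
}

#if defined(__x86_64__) && defined(__GNUC__)
/* Returns false if the page is marked invalid (sequence 0, assumed). */
static inline bool hv_read_tsc_page(const struct ms_hyperv_tsc_page *page,
                                    uint64_t *ref_time)
{
    uint32_t seq;
    uint64_t scale, tsc;
    int64_t offset;
    do {
        seq = page->tsc_sequence;
        if (seq == 0)
            return false;
        atomic_thread_fence(memory_order_acquire);
        scale  = page->tsc_scale;
        offset = page->tsc_offset;
        tsc    = __builtin_ia32_rdtsc();
        atomic_thread_fence(memory_order_acquire);
    } while (page->tsc_sequence != seq);   /* retry if the host updated it */
    *ref_time = hv_scale_tsc(tsc, scale, offset);
    return true;
}
#endif
```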