First published: Wed Aug 24 2011
Description of Problem: The call trace is as follows:

    crash> bt
    PID: 16963  TASK: f7415aa0  CPU: 0  COMMAND: "1-2.run-test"
     #0 [eb1c4e20] crash_kexec at c04434bd
     #1 [eb1c4e64] die at c04064d3
     #2 [eb1c4e94] do_divide_error at c0406ac5
     #3 [eb1c4f44] error_code (via divide_error) at c0405abb
        EAX: 5e3c58c2  EBX: 3b9aca00  ECX: fffffe4c  EDX: fffffe4c  EBP: eb1c4000
        DS:  007b      ESI: eb1c4fac  ES:  007b      EDI: eb1c4fac
        CS:  0060      EIP: c04374cd  ERR: ffffffff  EFLAGS: 00210246
     #4 [eb1c4f78] sample_to_timespec at c04374cd
     #5 [eb1c4f8c] posix_cpu_clock_get at c0438744
     #6 [eb1c4fa8] sys_clock_gettime at c04367f3
     #7 [eb1c4fb8] system_call at c0404f44
        EAX: ffffffda  EBX: fffffff2  ECX: bfe85f78  EDX: 00967ff4
        DS:  007b      ESI: fffffff2  ES:  007b      EDI: 00000000
        SS:  007b      ESP: bfe85f3c  EBP: bfe85f58
        CS:  0073      EIP: 00963e75  ERR: 00000109  EFLAGS: 00200246

Here is [customer's] analysis of the problem. Processing of the clock_gettime system call reached a Divide Error Fault as described below:

1) The clock_gettime system call is called with 0xfffffff2, which is the clock ID of the init process (process ID 1). The clock ID is obtained from clock_getcpuclockid(1, &clock_id).
2) posix_cpu_clock_get() sets cpu_time_count->sched to task_struct->sched_time of PID#1 and calls sample_to_timespec().
3) sample_to_timespec() divides cpu_time_count->sched by NSEC_PER_SEC using div_long_long_rem().
4) The quotient of the division exceeds 0xffffffff, so it does not fit in 32 bits.
5) A Divide Error Fault occurs.

The Divide Error Fault occurs because task_struct->sched_time of PID#1 is huge. When sys_clock_gettime() was called, task_struct->sched_time of PID#1 was 0xfffffe4c5e3c58c2. task_struct->sched_time is increased by update_cpu_clock() while handling local timer interrupts as follows.

---

    static inline void update_cpu_clock(struct task_struct *p, struct rq *rq, unsigned long long now)
    {
        p->sched_time += now - max(p->timestamp, rq->timestamp_last_tick);
    }

---

The 'now' argument is obtained from the TSC. If it is nearly zero, the unsigned subtraction wraps around and p->sched_time becomes very large. This can happen during system boot, when the TSC is initialized to zero as follows.

---

    init() "init/main.c"
     -> smp_prepare_cpus(max_cpus)
      -> synchronize_tsc_bp()
       -> write_tsc() => TSC is initialized to 0

---

So the summary of the problem is as follows:

1) The TSC is initialized to zero during system boot.
2) update_cpu_clock() is called just after 1), and task_struct->sched_time of PID#1 becomes very large.
3) sys_clock_gettime() is called for the clock ID of PID#1.
4) A Divide Error Fault occurs.
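
The arithmetic behind the summary above can be checked in user space. The following is a minimal sketch, not kernel code: the helper names accumulate_sched_time() and divl_would_fault() are hypothetical, and it assumes that div_long_long_rem() ends up as a single i386 `divl` with the dividend in EDX:EAX and the divisor in EBX, which is what the register dump in frame #3 suggests (EDX:EAX = fffffe4c:5e3c58c2, EBX = 3b9aca00 = 1,000,000,000).

```c
#include <assert.h>
#include <stdint.h>

/* 0x3b9aca00, the value seen in EBX in the trace */
#define NSEC_PER_SEC 1000000000u

/*
 * Hypothetical stand-in for the accumulation in update_cpu_clock().
 * All operands are unsigned 64-bit, so when `now` is smaller than
 * `last` the difference wraps around to a huge positive value
 * instead of becoming negative.
 */
static uint64_t accumulate_sched_time(uint64_t sched_time,
                                      uint64_t now, uint64_t last)
{
    return sched_time + (now - last);
}

/*
 * Hypothetical check modelling i386 `divl`: it divides a 64-bit
 * dividend by a 32-bit divisor and raises a divide error when the
 * quotient does not fit in 32 bits. That is the case exactly when
 * the high 32 bits of the dividend are >= the divisor.
 */
static int divl_would_fault(uint64_t dividend, uint32_t divisor)
{
    assert(divisor != 0); /* division by zero faults as well */
    uint32_t high = (uint32_t)(dividend >> 32);
    return high >= divisor;
}
```

With the sched_time value from the report, divl_would_fault(0xfffffe4c5e3c58c2ULL, NSEC_PER_SEC) returns 1, because the high word 0xfffffe4c is larger than 0x3b9aca00. A sched_time of a few seconds' worth of nanoseconds has a high word far below the divisor and divides without faulting.
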
Credit: secalert@redhat.com
Affected Software | Affected Version | How to fix |
---|---|---|
Linux kernel | <=2.6.25.20 | |
debian/linux-2.6 | | |