CWE
476
Advisory Published
Updated

CVE-2024-43855: md: fix deadlock between mddev_suspend and flush bio

First published: Sat Aug 17 2024(Updated: )

In the Linux kernel, the following vulnerability has been resolved: md: fix deadlock between mddev_suspend and flush bio Deadlock occurs when mddev is being suspended while some flush bio is in progress. It is a complex issue. T1. the first flush is at the ending stage, it clears 'mddev->flush_bio' and tries to submit data, but is blocked because mddev is suspended by T4. T2. the second flush sets 'mddev->flush_bio', and attempts to queue md_submit_flush_data(), which is already running (T1) and won't execute again if on the same CPU as T1. T3. the third flush inc active_io and tries to flush, but is blocked because 'mddev->flush_bio' is not NULL (set by T2). T4. mddev_suspend() is called and waits for active_io dec to 0 which is inc by T3. T1 T2 T3 T4 (flush 1) (flush 2) (third 3) (suspend) md_submit_flush_data mddev->flush_bio = NULL; . . md_flush_request . mddev->flush_bio = bio . queue submit_flushes . . . . md_handle_request . . active_io + 1 . . md_flush_request . . wait !mddev->flush_bio . . . . mddev_suspend . . wait !active_io . . . submit_flushes . queue_work md_submit_flush_data . //md_submit_flush_data is already running (T1) . md_handle_request wait resume The root issue is non-atomic inc/dec of active_io during flush process. active_io is dec before md_submit_flush_data is queued, and inc soon after md_submit_flush_data() run. md_flush_request active_io + 1 submit_flushes active_io - 1 md_submit_flush_data md_handle_request active_io + 1 make_request active_io - 1 If active_io is dec after md_handle_request() instead of within submit_flushes(), make_request() can be called directly intead of md_handle_request() in md_submit_flush_data(), and active_io will only inc and dec once in the whole flush process. Deadlock will be fixed. Additionally, the only difference between fixing the issue and before is that there is no return error handling of make_request(). But after previous patch cleaned md_write_start(), make_requst() only return error in raid5_make_request() by dm-raid, see commit 41425f96d7aa ("dm-raid456, md/raid456: fix a deadlock for dm-raid456 while io concurrent with reshape)". Since dm always splits data and flush operation into two separate io, io size of flush submitted by dm always is 0, make_request() will not be called in md_submit_flush_data(). To prevent future modifications from introducing issues, add WARN_ON to ensure make_request() no error is returned in this context.

Credit: 416baaa9-dc9f-4396-8d5f-8c081fb06d67 416baaa9-dc9f-4396-8d5f-8c081fb06d67

Affected SoftwareAffected VersionHow to fix
Linux Linux kernel<6.1.103
Linux Linux kernel>=6.2<6.6.44
Linux Linux kernel>=6.7<6.10.3
debian/linux
5.10.223-1
5.10.226-1
6.1.115-1
6.1.112-1
6.11.7-1

Never miss a vulnerability like this again

Sign up to SecAlerts for real-time vulnerability data matched to your software, aggregated from hundreds of sources.

Contact

SecAlerts Pty Ltd.
132 Wickham Terrace
Fortitude Valley,
QLD 4006, Australia
info@secalerts.co
By using SecAlerts services, you agree to our services end-user license agreement. This website is safeguarded by reCAPTCHA and governed by the Google Privacy Policy and Terms of Service. All names, logos, and brands of products are owned by their respective owners, and any usage of these names, logos, and brands for identification purposes only does not imply endorsement. If you possess any content that requires removal, please get in touch with us.
© 2024 SecAlerts Pty Ltd.
ABN: 70 645 966 203, ACN: 645 966 203