CWE
921
EPSS
0.043%
Advisory Published
Advisory Published
Updated

CVE-2024-5206: Sensitive Data Leakage in sklearn.feature_extraction.text.TfidfVectorizer in scikit-learn/scikit-learn

First published: Thu Jun 06 2024(Updated: )

A sensitive data leakage vulnerability was identified in scikit-learn's TfidfVectorizer, specifically in versions up to and including 1.4.1.post1, which was fixed in version 1.5.0. The vulnerability arises from the unexpected storage of all tokens present in the training data within the `stop_words_` attribute, rather than only storing the subset of tokens required for the TF-IDF technique to function. This behavior leads to the potential leakage of sensitive information, as the `stop_words_` attribute could contain tokens that were meant to be discarded and not stored, such as passwords or keys. The impact of this vulnerability varies based on the nature of the data being processed by the vectorizer.

Credit: security@huntr.dev security@huntr.dev

Affected SoftwareAffected VersionHow to fix
pip/scikit-learn<1.5.0
1.5.0
IBM Cloud Pak for Security<=1.10.0.0 - 1.10.11.0
IBM QRadar Suite Software<=1.10.12.0 - 1.10.22.0

Never miss a vulnerability like this again

Sign up to SecAlerts for real-time vulnerability data matched to your software, aggregated from hundreds of sources.

Parent vulnerabilities

(Appears in the following advisories)

Contact

SecAlerts Pty Ltd.
132 Wickham Terrace
Fortitude Valley,
QLD 4006, Australia
info@secalerts.co
By using SecAlerts services, you agree to our services end-user license agreement. This website is safeguarded by reCAPTCHA and governed by the Google Privacy Policy and Terms of Service. All names, logos, and brands of products are owned by their respective owners, and any usage of these names, logos, and brands for identification purposes only does not imply endorsement. If you possess any content that requires removal, please get in touch with us.
© 2024 SecAlerts Pty Ltd.
ABN: 70 645 966 203, ACN: 645 966 203