!100 [sync] PR-99: Fix CVE-2023-47038

From: @openeuler-sync-bot Reviewed-by: @openeuler-basic Signed-off-by: @openeuler-basic
2023-11-27 12:36:51 +00:00
commit 5a93e1d593
parent bb3d3ef271 24d33495b8
2 changed files with 123 additions and 1 deletions
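The mechanism described in the patch's commit message can be sketched in a few lines of C. This is a simplified model, not the code in regcomp.c: the helper names `bytes_written_on_reparse` and `room_after_strip` are hypothetical, and the only "canonicalization" modeled is dropping spaces. It counts the bytes a reparse would write instead of performing the out-of-bounds write.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PKG     "utf8::"
#define PKG_LEN (sizeof(PKG) - 1)   /* 6 bytes */

/* Hypothetical helper: number of bytes a reparse would store into
 * lookup_name when the read index over 'name' restarts at i_zero.
 * Simplification: the only canonicalization modeled is dropping spaces. */
static size_t bytes_written_on_reparse(const char *name, size_t name_len,
                                       size_t i_zero)
{
    size_t i, j = 0;

    assert(i_zero <= name_len);
    for (i = i_zero; i < name_len; i++) {
        if (name[i] == ' ')
            continue;               /* whitespace is not copied */
        j++;                        /* models lookup_name[j++] = name[i] */
    }
    return j;
}

/* Hypothetical helper: bytes still available behind lookup_name.
 * The buffer is allocated with name_len bytes; advancing the pointer
 * past "utf8::" leaves PKG_LEN fewer bytes of room. */
static size_t room_after_strip(size_t name_len, int stripped_utf8_pkg)
{
    return stripped_utf8_pkg ? name_len - PKG_LEN : name_len;
}

/* Nonzero when the reparse stays inside the allocation. */
static int reparse_fits(const char *name, size_t i_zero)
{
    size_t name_len = strlen(name);
    int stripped = name_len >= PKG_LEN && memcmp(name, PKG, PKG_LEN) == 0;

    return bytes_written_on_reparse(name, name_len, i_zero)
           <= room_after_strip(name_len, stripped);
}
```

For the test input `utf8::perl x` (12 bytes), restarting the read index at 0 would store 11 bytes into the 6 bytes left behind the advanced pointer, so `reparse_fits("utf8::perl x", 0)` is 0. Restarting at `i_zero` = 6, as the patch does, stores 5 bytes and fits.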
--- a/backport-CVE-2023-47038.patch
+++ b/backport-CVE-2023-47038.patch
@@ -0,0 +1,118 @@
+From 12c313ce49b36160a7ca2e9b07ad5bd92ee4a010 Mon Sep 17 00:00:00 2001
+From: Karl Williamson <khw@cpan.org>
+Date: Sat, 9 Sep 2023 11:59:09 -0600
+Subject: [PATCH] Fix read/write past buffer end: perl-security#140
+
+A package name may be specified in a \p{...} regular expression
+construct.  If unspecified, "utf8::" is assumed, which is the package
+all official Unicode properties are in.  By specifying a different
+package, one can create a user-defined property with the same
+unqualified name as a Unicode one.  Such a property is defined by a sub 
+whose name begins with "Is" or "In", and if the sub wishes to refer to
+an official Unicode property, it must explicitly specify the "utf8::".
+S_parse_uniprop_string() is used to parse the interior of both \p{} and 
+the user-defined sub lines.
+
+In S_parse_uniprop_string(), it parses the input "name" parameter,
+creating a modified copy, "lookup_name", malloc'ed with the same size as
+"name".  The modifications are essentially to create a canonicalized
+version of the input, with such things as extraneous white-space
+stripped off.  I found it convenient to strip off the package specifier
+"utf8::".  To do so, the code simply pretends "lookup_name" begins just
+after the "utf8::", and adjusts various other values to compensate.
+However, it missed one required adjustment.
+
+This is only a problem when the property name begins with "perl" and 
+isn't "perlspace" nor "perlword".  All such ones are undocumented
+internal properties.
+
+What happens in this case is that the input is reparsed with slightly
+different rules in effect as to what is legal versus illegal.  The 
+problem is that "lookup_name" no longer is pointing to its initial
+value, but "name" is.  Thus the space allocated for filling "lookup_name"
+is now shorter than "name", and as this shortened "lookup_name" is
+filled by copying suitable portions of "name", the write can be to
+unallocated space.
+
+The solution is to skip the "utf8::" when reparsing "name".  Then both
+"lookup_name" and "name" are effectively shortened by the same amount,
+and there is no going off the end.
+
+This commit also does white-space adjustment so that things align
+vertically for readability.
+
+This can be easily backported to earlier Perl releases.
+---
+ regcomp.c           | 17 +++++++++++------
+ t/re/pat_advanced.t |  7 +++++++
+ 2 files changed, 18 insertions(+), 6 deletions(-)
+
+diff --git a/regcomp.c b/regcomp.c
+index f5e5f58..0d3e9a9 100644
+--- a/regcomp.c
+++ b/regcomp.c
+@@ -23815,7 +23815,7 @@ S_parse_uniprop_string(pTHX_
+      * compile perl to know about them) */
+     bool is_nv_type = FALSE;
+ 
+-    unsigned int i, j = 0;
++    unsigned int i = 0, i_zero = 0, j = 0;
+     int equals_pos = -1;    /* Where the '=' is found, or negative if none */
+     int slash_pos  = -1;    /* Where the '/' is found, or negative if none */
+     int table_index = 0;    /* The entry number for this property in the table
+@@ -23949,9 +23949,13 @@ S_parse_uniprop_string(pTHX_
+      * all of them are considered to be for that package.  For the purposes of
+      * parsing the rest of the property, strip it off */
+     if (non_pkg_begin == STRLENs("utf8::") && memBEGINPs(name, name_len, "utf8::")) {
+-        lookup_name +=  STRLENs("utf8::");
+-        j -=  STRLENs("utf8::");
+-        equals_pos -=  STRLENs("utf8::");
++        lookup_name += STRLENs("utf8::");
++        j           -= STRLENs("utf8::");
++        equals_pos  -= STRLENs("utf8::");
++        i_zero       = STRLENs("utf8::");   /* When resetting 'i' to reparse
++                                               from the beginning, it has to be
++                                               set past what we're stripping
++                                               off */
+         stripped_utf8_pkg = TRUE;
+     }
+ 
+@@ -24356,7 +24360,8 @@ S_parse_uniprop_string(pTHX_
+ 
+             /* We set the inputs back to 0 and the code below will reparse,
+              * using strict */
+-            i = j = 0;
++            i = i_zero;
++            j = 0;
+         }
+     }
+ 
+@@ -24377,7 +24382,7 @@ S_parse_uniprop_string(pTHX_
+          * separates two digits */
+         if (cur == '_') {
+             if (    stricter
+-                && (     i == 0 || (int) i == equals_pos || i == name_len- 1
++                && ( i == i_zero || (int) i == equals_pos || i == name_len- 1
+                     || ! isDIGIT_A(name[i-1]) || ! isDIGIT_A(name[i+1])))
+             {
+                 lookup_name[j++] = '_';
+diff --git a/t/re/pat_advanced.t b/t/re/pat_advanced.t
+index d679870..3b79eec 100644
+--- a/t/re/pat_advanced.t
+++ b/t/re/pat_advanced.t
+@@ -2565,6 +2565,13 @@ EOF
+                        {}, "GH #17278");
+     }
+ 
++    {   # perl-security#140, read/write past buffer end
++        fresh_perl_like('qr/\p{utf8::perl x}/',
++                        qr/Illegal user-defined property name "utf8::perl x" in regex/,
++                        {}, "perl-security#140");
++        fresh_perl_is('qr/\p{utf8::_perl_surrogate}/', "",
++                        {}, "perl-security#140");
++    }
+ 
+     # !!! NOTE that tests that aren't at all likely to crash perl should go
+     # a ways above, above these last ones.  There's a comment there that, like
+-- 
+2.33.0
--- a/perl.spec
+++ b/perl.spec
@@ -22,7 +22,7 @@ Name:           perl
 License:        (GPL+ or Artistic) and (GPLv2+ or Artistic) and MIT and UCD and Public Domain and BSD
 Epoch:          4
 Version:        %{perl_version}
-Release:        10
+Release:        11
 Summary:        A highly capable, feature-rich programming language
 Url:            https://www.perl.org/
 Source0:        https://www.cpan.org/src/5.0/%{name}-%{version}.tar.xz
@@ -41,6 +41,7 @@ Patch6000: backport-CVE-2021-36770.patch
 Patch6001: backport-CVE-2023-31484.patch
 Patch6002: backport-CVE-2023-31486.patch
 Patch6003: backport-CVE-2022-48522.patch
+Patch6004: backport-CVE-2023-47038.patch

 BuildRequires:  gcc bash findutils coreutils make tar procps bzip2-devel gdbm-devel perl-File-Compare perl-File-Find 
 BuildRequires:  zlib-devel systemtap-sdt-devel perl-interpreter perl-generators
@@ -491,6 +492,9 @@ make test_harness
 %{_mandir}/man3/*

 %changelog
+* Mon Nov 27 2023 hongjinghao <hongjinghao@huawei.com> - 4:5.34.0-11
+- Fix CVE-2023-47038
+
 * Fri Sep 8 2023 zhangzikang <zhangzikang@kylinos.cn> - 4:5.34.0-10
 - Type:bugfix
 - ID:NA