Regex: Update PCRE to v8.35.
I was über lazy at first, so took libs from SM. But actually it's quite easy to compile, so let's update to latest version \o/.
@@ -9,8 +9,10 @@ from:
   ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-xxx.zip

 There is a mailing list for discussion about the development of PCRE at
+pcre-dev@exim.org. You can access the archives and subscribe or manage your
+subscription here:

-  pcre-dev@exim.org
+  https://lists.exim.org/mailman/listinfo/pcre-dev

 Please read the NEWS file if you are upgrading from a previous release.
 The contents of this README file are:
@@ -25,6 +27,8 @@ The contents of this README file are:
   Shared libraries
   Cross-compiling using autotools
   Using HP's ANSI C++ compiler (aCC)
+  Compiling in Tru64 using native compilers
   Using Sun's compilers for Solaris
+  Using PCRE from MySQL
   Making new tarballs
   Testing PCRE
@@ -35,10 +39,10 @@ The contents of this README file are:
 The PCRE APIs
 -------------

-PCRE is written in C, and it has its own API. There are three sets of functions,
-one for the 8-bit library, which processes strings of bytes, one for the
-16-bit library, which processes strings of 16-bit values, and one for the 32-bit
-library, which processes strings of 32-bit values. The distribution also
+PCRE is written in C, and it has its own API. There are three sets of
+functions, one for the 8-bit library, which processes strings of bytes, one for
+the 16-bit library, which processes strings of 16-bit values, and one for the
+32-bit library, which processes strings of 32-bit values. The distribution also
 includes a set of C++ wrapper functions (see the pcrecpp man page for details),
 courtesy of Google Inc., which can be used to call the 8-bit PCRE library from
 C++.
@@ -81,11 +85,12 @@ documentation is supplied in two other forms:
   1. There are files called doc/pcre.txt, doc/pcregrep.txt, and
      doc/pcretest.txt in the source distribution. The first of these is a
      concatenation of the text forms of all the section 3 man pages except
-     those that summarize individual functions. The other two are the text
-     forms of the section 1 man pages for the pcregrep and pcretest commands.
-     These text forms are provided for ease of scanning with text editors or
-     similar tools. They are installed in <prefix>/share/doc/pcre, where
-     <prefix> is the installation prefix (defaulting to /usr/local).
+     the listing of pcredemo.c and those that summarize individual functions.
+     The other two are the text forms of the section 1 man pages for the
+     pcregrep and pcretest commands. These text forms are provided for ease of
+     scanning with text editors or similar tools. They are installed in
+     <prefix>/share/doc/pcre, where <prefix> is the installation prefix
+     (defaulting to /usr/local).

   2. A set of files containing all the documentation in HTML form, hyperlinked
      in various ways, and rooted in a file called index.html, is distributed in
@@ -110,6 +115,11 @@ contributions provided support for compiling PCRE on various flavours of
 Windows (I myself do not use Windows). Nowadays there is more Windows support
 in the standard distribution, so these contibutions have been archived.

+A PCRE user maintains downloadable Windows binaries of the pcregrep and
+pcretest programs here:
+
+  http://www.rexegg.com/pcregrep-pcretest.html
+

 Building PCRE on non-Unix-like systems
 --------------------------------------
@@ -260,9 +270,17 @@ library. They are also documented in the pcrebuild man page.

   on the "configure" command.

-. PCRE has a counter that can be set to limit the amount of resources it uses.
-  If the limit is exceeded during a match, the match fails. The default is ten
-  million. You can change the default by setting, for example,
+. PCRE has a counter that limits the depth of nesting of parentheses in a
+  pattern. This limits the amount of system stack that a pattern uses when it
+  is compiled. The default is 250, but you can change it by setting, for
+  example,
+
+  --with-parens-nest-limit=500
+
+. PCRE has a counter that can be set to limit the amount of resources it uses
+  when matching a pattern. If the limit is exceeded during a match, the match
+  fails. The default is ten million. You can change the default by setting, for
+  example,

   --with-match-limit=500000

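
To make the new parenthesis nesting limit concrete, here is a minimal Python sketch that measures how deeply the parentheses in a pattern are nested and compares the result with the default of 250 quoted in the hunk above. It is an illustration only, not PCRE's compile-time check: the names max_paren_depth and within_nest_limit are hypothetical, and the scanning is deliberately simplified (it skips backslash-escaped characters and ignores parentheses inside a simple [...] class, but does not handle a literal "]" at the start of a class, comments, or \Q...\E quoting).

```python
def max_paren_depth(pattern):
    """Return the deepest nesting of unescaped parentheses in a pattern.

    Simplified scan (an assumption of this sketch, not PCRE's parser):
    backslash escapes are skipped and parentheses inside [...] are ignored.
    """
    depth = 0
    deepest = 0
    in_class = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2          # skip the escaped character as well
            continue
        if in_class:
            if ch == "]":
                in_class = False
        elif ch == "[":
            in_class = True
        elif ch == "(":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == ")":
            depth = max(depth - 1, 0)
        i += 1
    return deepest


def within_nest_limit(pattern, limit=250):
    """True if the pattern's nesting depth does not exceed the limit."""
    return max_paren_depth(pattern) <= limit
```

With the default limit, a pattern made of 251 nested groups would be rejected by this check, whereas building with --with-parens-nest-limit=500 corresponds to passing limit=500.
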
@@ -342,7 +360,8 @@ library. They are also documented in the pcrebuild man page.
   report is generated by running "make coverage". If ccache is installed on
   your system, it must be disabled when building PCRE for coverage reporting.
   You can do this by setting the environment variable CCACHE_DISABLE=1 before
-  running "make" to build PCRE.
+  running "make" to build PCRE. There is more information about coverage
+  reporting in the "pcrebuild" documentation.

 . The pcregrep program currently supports only 8-bit data files, and so
   requires the 8-bit PCRE library. It is possible to compile pcregrep to use
@@ -354,12 +373,12 @@ library. They are also documented in the pcrebuild man page.

   Of course, the relevant libraries must be installed on your system.

-. The default size of internal buffer used by pcregrep can be set by, for
-  example:
+. The default size (in bytes) of the internal buffer used by pcregrep can be
+  set by, for example:

-  --with-pcregrep-bufsize=50K
+  --with-pcregrep-bufsize=51200

-  The default value is 20K.
+  The value must be a plain integer. The default is 20480.

 . It is possible to compile pcretest so that it links with the libreadline
   or libedit libraries, by specifying, respectively,
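
Because the updated README requires --with-pcregrep-bufsize to be a plain integer number of bytes, build scripts that still carry the old "50K" style need converting. The Python sketch below shows the arithmetic, assuming that "K" meant 1024 bytes (which is consistent with 50K becoming 51200 and 20K becoming 20480 in the hunk above). The helper name bufsize_to_bytes is hypothetical and is not part of PCRE.

```python
def bufsize_to_bytes(value):
    """Convert an old-style size such as "50K" to a plain byte count.

    Assumes a trailing "K" means 1024 bytes; plain integers pass through.
    """
    text = value.strip().upper()
    if text.endswith("K"):
        return int(text[:-1]) * 1024
    return int(text)


# Build the configure option in the form the new README expects.
option = "--with-pcregrep-bufsize=%d" % bufsize_to_bytes("50K")
```
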
@@ -575,6 +594,27 @@ running the "configure" script:
   CXXLDFLAGS="-lstd_v2 -lCsup_v2"


+Compiling in Tru64 using native compilers
+-----------------------------------------
+
+The following error may occur when compiling with native compilers in the Tru64
+operating system:
+
+  CXX    libpcrecpp_la-pcrecpp.lo
+cxx: Error: /usr/lib/cmplrs/cxx/V7.1-006/include/cxx/iosfwd, line 58: #error
+          directive: "cannot include iosfwd -- define __USE_STD_IOSTREAM to
+          override default - see section 7.1.2 of the C++ Using Guide"
+#error "cannot include iosfwd -- define __USE_STD_IOSTREAM to override default
+- see section 7.1.2 of the C++ Using Guide"
+
+This may be followed by other errors, complaining that 'namespace "std" has no
+member'. The solution to this is to add the line
+
+#define __USE_STD_IOSTREAM 1
+
+to the config.h file.
+
+
 Using Sun's compilers for Solaris
 ---------------------------------

@@ -624,27 +664,40 @@ NON-AUTOTOOLS-BUILD.
 The RunTest script runs the pcretest test program (which is documented in its
 own man page) on each of the relevant testinput files in the testdata
 directory, and compares the output with the contents of the corresponding
-testoutput files. Some tests are relevant only when certain build-time options
-were selected. For example, the tests for UTF-8/16/32 support are run only if
---enable-utf was used. RunTest outputs a comment when it skips a test.
+testoutput files. RunTest uses a file called testtry to hold the main output
+from pcretest. Other files whose names begin with "test" are used as working
+files in some tests.
+
+Some tests are relevant only when certain build-time options were selected. For
+example, the tests for UTF-8/16/32 support are run only if --enable-utf was
+used. RunTest outputs a comment when it skips a test.

 Many of the tests that are not skipped are run up to three times. The second
 run forces pcre_study() to be called for all patterns except for a few in some
 tests that are marked "never study" (see the pcretest program for how this is
 done). If JIT support is available, the non-DFA tests are run a third time,
 this time with a forced pcre_study() with the PCRE_STUDY_JIT_COMPILE option.
 This testing can be suppressed by putting "nojit" on the RunTest command line.

 The entire set of tests is run once for each of the 8-bit, 16-bit and 32-bit
 libraries that are enabled. If you want to run just one set of tests, call
 RunTest with either the -8, -16 or -32 option.

-RunTest uses a file called testtry to hold the main output from pcretest.
-Other files whose names begin with "test" are used as working files in some
-tests. To run pcretest on just one or more specific test files, give their
-numbers as arguments to RunTest, for example:
+If valgrind is installed, you can run the tests under it by putting "valgrind"
+on the RunTest command line. To run pcretest on just one or more specific test
+files, give their numbers as arguments to RunTest, for example:

   RunTest 2 7 11

+You can also specify ranges of tests such as 3-6 or 3- (meaning 3 to the
+end), or a number preceded by ~ to exclude a test. For example:
+
+  Runtest 3-15 ~10
+
+This runs tests 3 to 15, excluding test 10, and just ~13 runs all the tests
+except test 13. Whatever order the arguments are in, the tests are always run
+in numerical order.
+
 You can also call RunTest with the single argument "list" to cause it to output
 a list of tests.

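
The test-selection rules added to the README in the hunk above (single numbers, ranges such as 3-6 or 3-, exclusions written as ~N, and numerical ordering) can be modelled in a few lines of Python. This is only a sketch of the documented behaviour, not the RunTest script itself: the function name select_tests is hypothetical, and the highest test number, 26, is an assumption taken from the "twenty-sixth" test mentioned later in this README.

```python
def select_tests(args, last_test=26):
    """Return the sorted test numbers chosen by RunTest-style arguments.

    last_test=26 is an assumption of this sketch, not read from RunTest.
    """
    selected = set()
    excluded = set()
    for arg in args:
        if arg.startswith("~"):
            excluded.add(int(arg[1:]))          # ~10 excludes test 10
        elif "-" in arg:
            start, _, end = arg.partition("-")
            stop = int(end) if end else last_test  # "3-" means 3 to the end
            selected.update(range(int(start), stop + 1))
        else:
            selected.add(int(arg))
    if not selected:
        # Only exclusions (or nothing) given: start from every test.
        selected = set(range(1, last_test + 1))
    return sorted(selected - excluded)
```

For example, select_tests(["3-15", "~10"]) yields tests 3 to 15 without test 10, and select_tests(["~13"]) yields every test except 13, matching the two cases the README describes.
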
@@ -704,21 +757,24 @@ test is run only when JIT support is not available. They test some JIT-specific
 features such as information output from pcretest about JIT compilation.

 The fourteenth, fifteenth, and sixteenth tests are run only in 8-bit mode, and
-the seventeenth, eighteenth, and nineteenth tests are run only in 16/32-bit mode.
-These are tests that generate different output in the two modes. They are for
-general cases, UTF-8/16/32 support, and Unicode property support, respectively.
+the seventeenth, eighteenth, and nineteenth tests are run only in 16/32-bit
+mode. These are tests that generate different output in the two modes. They are
+for general cases, UTF-8/16/32 support, and Unicode property support,
+respectively.

 The twentieth test is run only in 16/32-bit mode. It tests some specific
 16/32-bit features of the DFA matching engine.

-The twenty-first and twenty-second tests are run only in 16/32-bit mode, when the
-link size is set to 2 for the 16-bit library. They test reloading pre-compiled patterns.
+The twenty-first and twenty-second tests are run only in 16/32-bit mode, when
+the link size is set to 2 for the 16-bit library. They test reloading
+pre-compiled patterns.

-The twenty-third and twenty-fourth tests are run only in 16-bit mode. They are for
-general cases, and UTF-16 support, respectively.
+The twenty-third and twenty-fourth tests are run only in 16-bit mode. They are
+for general cases, and UTF-16 support, respectively.

+The twenty-fifth and twenty-sixth tests are run only in 32-bit mode. They are
+for general cases, and UTF-32 support, respectively.
+
-The twenty-fifth and twenty-sixth tests are run only in 32-bit mode. They are for
-general cases, and UTF-32 support, respectively.

 Character tables
 ----------------
@@ -784,11 +840,11 @@ pcre_xxx, one with the name pcre16_xx, and a third with the name pcre32_xxx.
 (A) Source files of the PCRE library functions and their headers:

   dftables.c              auxiliary program for building pcre_chartables.c
-                          when --enable-rebuild-chartables is specified
+                          when --enable-rebuild-chartables is specified

   pcre_chartables.c.dist  a default set of character tables that assume ASCII
-                          coding; used, unless --enable-rebuild-chartables is
-                          specified, by copying to pcre[16]_chartables.c
+                          coding; used, unless --enable-rebuild-chartables is
+                          specified, by copying to pcre[16]_chartables.c

   pcreposix.c                )
   pcre[16|32]_byte_order.c   )
@@ -932,4 +988,4 @@ pcre_xxx, one with the name pcre16_xx, and a third with the name pcre32_xxx.
 Philip Hazel
 Email local part: ph10
 Email domain: cam.ac.uk
-Last updated: 27 October 2012
+Last updated: 17 January 2014