[TIP] Tips on Position and v11.2
Stephen Orth (6/29/08 7:37PM)
Jeffrey Kain (6/29/08 8:22PM)
Stephen Orth (6/29/08 7:37 PM)
Jeff,
Thank you very much for taking the time to study this and report the
results.
I for one will be taking your advice!
Steve :-)
-----Original Message-----
From: 4d_tech-bounces@... [mailto:4d_tech-bounces@...] On Behalf Of Jeffrey Kain
Sent: Sunday, June 29, 2008 7:22 PM
Jeffrey Kain (6/29/08 8:22 PM)
Tip 1:
4D v11's Unicode roots significantly change the behavior of the Position
function when the string you are looking for contains low-ASCII
characters. For example, Position(Char(2);$text) no longer works the way
it did in all earlier versions of 4D, because Char(2) is now considered
an ignorable character.
4D added an optional * parameter that makes Position (and Replace string)
case-sensitive and diacritic-sensitive, and with it these commands can
also find the ignorable characters. A wrapper for Position that is
compatible with previous versions of 4D (except for Char(0), which is no
longer allowed in strings) is as follows:
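` Wrapper for 4D function Position
` $1 = text to find, $2 = text to search, $0 = position (0 if not found)
C_LONGINT($0)
C_TEXT($1;$2)
` * compares by character code; Uppercase keeps the search case-insensitive
$0:=Position(Uppercase($1);Uppercase($2);*)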
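A minimal sketch of calling the wrapper above, assuming it has been saved as a project method named UTIL_Position (a hypothetical name, not from the original post). The sample text is made up for illustration:

```4d
  ` Hypothetical usage of the wrapper; UTIL_Position is an assumed method name
C_TEXT($text)
C_LONGINT($pos)
  ` Sample text with a low-ASCII control character embedded in it
$text:="abc"+Char(2)+"def"
$pos:=UTIL_Position(Char(2);$text)
  ` $pos is 0 if Char(2) was not found
If ($pos>0)
	ALERT("Found Char(2) at position "+String($pos))
End if
```

The Uppercase calls in the wrapper are presumably there because the * comparison is case-sensitive, so uppercasing both arguments keeps the search case-insensitive as in earlier versions.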
Tip 2:
Make sure you run your database in Unicode mode if you do a lot of work
with strings. I wrote a simple benchmark method that creates a string of
25,000 random characters and then tries to find each character in the
string. In compiled mode, 4D 2004 executes it in 1 tick, while 4D v11 in
non-Unicode mode takes about 474 ticks on the same hardware. In
interpreted non-Unicode mode, v11 takes about 560 ticks while 4D 2004
takes about 58 ticks.
Flip the Unicode switch, however, and 4D v11 is just as fast as 4D 2004
(and faster in interpreted mode: the interpreter in v11 seems quite a bit
faster than the one in 2004, which helps in large loops and
code-intensive operations).
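For anyone who wants to try a similar measurement, here is a rough sketch of a benchmark along the lines described above. It is not the original method: the choice of random uppercase letters, the variable names, and the use of ALERT to report the result are all assumptions made for illustration.

```4d
  ` Hypothetical benchmark sketch, not the original method from this post
C_TEXT($text)
C_LONGINT($i;$pos;$start;$elapsed)
  ` Build a string of 25000 random characters (assumed here: letters A to Z)
$text:=""
For ($i;1;25000)
	$text:=$text+Char(65+(Random%26))
End for
  ` Time how long it takes to look up every character of the string
$start:=Tickcount
For ($i;1;Length($text))
	$pos:=Position($text[[$i]];$text)
End for
$elapsed:=Tickcount-$start
ALERT("Elapsed ticks: "+String($elapsed))
```

Run it compiled and interpreted, with the Unicode setting on and off, to compare timings on your own hardware.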
Jeff
Summary created 6/30/08 at 2:06AM by Intellex Corporation
Comments welcome at: feedback@intellexcorp.com