[TIP] Tips on Position and v11.2

Stephen Orth (6/29/08 7:37PM)
Jeffrey Kain (6/29/08 8:22PM)


Stephen Orth (6/29/08 7:37 PM)

Jeff,

Thank you very much for taking the time to study this and report the results.

I for one will be taking your advice!

Steve  :-)

Jeffrey Kain (6/29/08 8:22 PM)

Tip 1:

4D v11's Unicode roots significantly change the behavior of the Position function if you are looking for strings that contain low-ASCII characters. For example, Position(Char(2);$text) doesn't work the same as in all earlier versions of 4D, since Char(2) is considered an ignorable character.

4D added an optional * parameter to make Position (and Replace string) case-sensitive and diacritic-sensitive, and with it these ignorable characters can be found as well. A wrapper for Position that is compatible with previous versions of 4D (except for Char(0), which is no longer allowed in strings) is as follows:

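 ` Wrapper for 4D function Position
 ` $1 = string to find, $2 = string to search, $0 = position found (0 if none)
 C_LONGINT($0)
 C_TEXT($1;$2)
 ` Uppercase both arguments to keep the case-insensitive matching of earlier
 ` versions; * lets Position find ignorable characters such as Char(2)
 $0:=Position(Uppercase($1);Uppercase($2);*)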
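
To see the difference for yourself, something like the following rough sketch may help. It assumes v11 with the new * parameter available; the sample value in $text and the variable names are made up for illustration, and I have not listed expected results since they depend on your version and Unicode setting.

```4d
 ` Illustration only: $text is a made-up sample value
 C_TEXT($text)
 C_LONGINT($plain;$strict)
 $text:="abc"+Char(2)+"def"
 ` Without *: in v11, Char(2) is treated as an ignorable character,
 ` so this call does not behave as it did in earlier versions
 $plain:=Position(Char(2);$text)
 ` With *: the comparison is strict, so the ignorable character can be found
 $strict:=Position(Char(2);$text;*)
 ALERT("Without *: "+String($plain)+"   With *: "+String($strict))
```

Comparing the two numbers in the alert on v11 and on an earlier version (without the third line, since * does not exist there) should show whether your own code is affected.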

Tip 2:

Make sure you run your database in Unicode mode if you do a lot of work with strings. I wrote a simple benchmark method which creates a string of 25000 random characters and then tries to find each character in the string. In compiled mode, 4D 2004 executes it in 1 tick; 4D v11 in non-Unicode mode takes about 474 ticks on the same hardware. In interpreted non-Unicode mode, v11 takes about 560 ticks while 4D 2004 takes about 58 ticks.

Flip the Unicode switch, however, and 4D v11 is just as fast as v2004 (faster in interpreted mode: the interpreter in v11 seems quite a bit faster than v2004, which helps in large loops and code-intensive operations).
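
For anyone who wants to repeat the comparison, a benchmark along these lines could be sketched as below. This is not the original method, only a guess at its shape from the description: the A..Z character range, the variable names, and reporting through ALERT are all assumptions.

```4d
 ` Hypothetical benchmark sketch (not the original method)
 C_TEXT($text;$char)
 C_LONGINT($i;$pos;$start;$elapsed)
 ` Build a string of 25000 random characters (assumed range: A..Z)
 $text:=""
 For ($i;1;25000)
 	$text:=$text+Char((Random%26)+65)
 End for
 ` Time how long it takes to find each character in the string
 $start:=Tickcount
 For ($i;1;Length($text))
 	$char:=Substring($text;$i;1)
 	$pos:=Position($char;$text)
 End for
 $elapsed:=Tickcount-$start
 ALERT("Elapsed ticks: "+String($elapsed))
```

Run it compiled and interpreted, with the Unicode setting on and off, to get the four numbers to compare. Timings will of course differ from the figures above depending on hardware.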

Jeff

Summary created 6/30/08 at 2:06AM by Intellex Corporation

Comments welcome at: feedback@intellexcorp.com