
I am trying to model a request for a column that will be declared VARCHAR(120) but will actually store up to 480 bytes.

The length of 480 bytes comes from VARCHAR(120) = 120 x 4 = 480 bytes. This is due to the string units specification CODEUNITS32, under which each character can consume up to 4 bytes.

I am not certain of the syntax for the DDL. My research indicates that there should be a clause during table creation that sets the character set to UTF-32, but I am uncertain. This is for Db2 mid-tier (non-mainframe), i.e., Db2 LUW.

Here is something I found, but I am not sure it is valid, nor how COLLATE works.

ALTER TABLE table_name   
MODIFY column_name VARCHAR(255) 
CHARACTER SET utf8 
COLLATE utf8_unicode_ci;
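
That looks to me like MySQL syntax rather than Db2, though. My unverified guess at the Db2 form, using ALTER COLUMN ... SET DATA TYPE, would be something like this (table_name and column_name are placeholders):

-- my unverified guess at the Db2 LUW equivalent
ALTER TABLE table_name
  ALTER COLUMN column_name SET DATA TYPE VARCHAR(120 CODEUNITS32);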

Any help would be appreciated!

zx485

2 Answers


There is no way to declare columns of type UTF-32.

You will need to define the column as VARCHAR(480), then specify CODEUNITS32 whenever you access the column with string functions, so that lengths and positions are computed in UTF-32 code units.

Examples:

SELECT CHARACTER_LENGTH(NAME,CODEUNITS32) FROM T1 WHERE NAME = 'Jürgen';
SET :POSITION = LOCATE_IN_STRING('Jürgen lives on Hegelstraße','ß',-1,CODEUNITS32);
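
For example, the whole approach might look like this (the DDL is my assumption, reusing the T1/NAME names from the examples above):

-- assumed DDL for this approach: size the column in bytes
CREATE TABLE T1 (NAME VARCHAR(480));

-- LENGTH counts bytes by default; specifying CODEUNITS32 makes the
-- length functions count characters (UTF-32 code units) instead
SELECT LENGTH(NAME) AS BYTE_LEN,
       CHARACTER_LENGTH(NAME, CODEUNITS32) AS CHAR_LEN
FROM T1;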


harrymc

For Db2 LUW, simply declare the column as VARCHAR(120 CODEUNITS32). Db2 will store the data as UTF-8, allow at most 120 characters, and use up to 480 bytes of storage.

create table T(C VARCHAR(4 CODEUNITS32))
-- three characters: € (3 bytes) + £ (2 bytes) + 😃 (4 bytes) = 9 bytes in UTF-8
insert into t values '€£😃'
select c, hex(c), lengthb(c), length(c) from t

C    2                  3 4
---- ------------------ - -
€£😃 E282ACC2A3F09F9883 9 3
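
To confirm how Db2 catalogued the column, you can query SYSCAT.COLUMNS; this sketch assumes Db2 11.1 or later, where the catalog records string units:

-- inspect the catalog entry for the example table T above
select colname, typename, length, stringunits, stringunitslength
from syscat.columns where tabname = 'T'

-- expected: LENGTH = 16 (bytes), STRINGUNITS = 'CODEUNITS32',
-- STRINGUNITSLENGTH = 4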

https://www.ibm.com/support/knowledgecenter/SSEPGG_11.1.0/com.ibm.db2.luw.sql.ref.doc/doc/r0008470.html