1

Albeit there is some documentation about dataset filtering, the syntax details are only outlined. In my application I want to filter person names with a dataset filter. Normally this works really fast, but I've stumbled over a minor problem filtering for example a TClientDataset. How can I add a like filter for an umlaut? The expression

[X] LIKE 'Ö%'

(for a given field X) does not work (in contrast to the expression [X] LIKE 'A%'). Is this just a bug or do I need to set a charset / encoding somewhere?

Minimal example:

procedure TForm1.FormCreate(Sender: TObject);
var
  LField: TFieldDef;
  LCDs: TClientDataSet;
const
  SAMPLE_CHAR: string = 'Ö';
begin
  LCds := TClientDataSet.Create(Self);
  LField := LCds.FieldDefs.AddFieldDef();
  LField.DataType := ftString;
  LField.Size := 10;
  LField.Name := 'X';
  LCDs.CreateDataSet;
  LCDs.Append;
  LCDs.FieldByName('X').AsString := SAMPLE_CHAR;
  LCDs.Post;

  ShowMessage(LCds.FieldByName('X').AsString);
  LCds.Filter := '[X] LIKE ' + QuotedStr(SAMPLE_CHAR + '%');
  LCds.Filtered := true;
  ShowMessage(LCds.FieldByName('X').AsString);
end;

The first message box shows Ö, whereas the second message box is empty. If you change SAMPLE_CHAR from Ö to A, both message boxes show A.

ventiseis
  • 3,029
  • 11
  • 32
  • 49
  • 1
    What database are you using? What charset is the `X` field actually using? Does the driver you are using to access the database support Unicode characters in SQL statements? SQLServer, for instance, has an `N` prefix for Unicode string literals, eg `[X] LIKE N'Ö%'` – Remy Lebeau Jul 27 '17 at 00:17
  • 1
    @RemyLebeau: There is no database. It's a `TClientDataset`, being used as an in-memory dataset. – Ken White Jul 27 '17 at 01:47
  • 2
    @KenWhite: good point, I missed that. In which case, try using `ftWideString` instead of `ftString`, and possibly also `AsWideString` instead of `AsString`. – Remy Lebeau Jul 27 '17 at 03:00
  • @RemyLebeau. I just tried your ftWideString solution (without the AsWideString) and it works on my machine. – Dsm Jul 27 '17 at 08:38
  • @Dsm Works for me too! But is there another way round? Can I write a filter in such a way without changing the whole dataset? Obviously, the string `Ö` itself is somehow stored in the dataset. [This page](http://docwiki.embarcadero.com/Libraries/Berlin/en/Data.DB.TDataSet.Translate) mentions that the internal string charset is `utf-8`, but I don't know if that applies here. If `AsString`can convert `Ö` why can't the `like`operator in the filter expression do that conversion as well? – ventiseis Jul 27 '17 at 18:00

1 Answers1

2

Use ftWideString data type to create a TWideStringField field instead of ftString, which internally creates a TStringField field. TStringField is for ANSI strings whilst TWideStringField for Unicode ones. Do that, otherwise you lose data.

To access TWideStringField value use AsWideString property. I've made a quick test in D 2009, and when I tried to filter the dataset I got this:

First chance exception at $7594845D. Exception class EAccessViolation with message 'Access violation at address 4DB1E8D1 in module 'midas.dll'. Read of address 00FC0298'.

Tested code:

procedure TForm1.FormCreate(Sender: TObject);
var
  S: string;
  FieldDef: TFieldDef;
  MemTable: TClientDataSet;
begin
  S := 'Ŧĥε qùíçķ ƀřǭŵņ fôx ǰűmpεď ōvêŗ ţħě łáƶÿ ďơǥ';

  MemTable := TClientDataSet.Create(nil);
  try
    FieldDef := MemTable.FieldDefs.AddFieldDef;
    FieldDef.DataType := ftWideString;
    FieldDef.Size := 255;
    FieldDef.Name := 'MyField';

    MemTable.CreateDataSet;
    MemTable.Append;
    MemTable.FieldByName('MyField').AsWideString := S;
    MemTable.Post;

    ShowMessage(MemTable.FieldByName('MyField').AsWideString); { ← data lost }
    MemTable.Filter := '[MyField] LIKE ' + QuotedStr('%' + 'ǰűmpεď' + '%');
    MemTable.Filtered := True; { ← access violation }
    ShowMessage(MemTable.FieldByName('MyField').AsWideString);
  finally
    MemTable.Free;
  end;
end;

I hope it's not related to your Delphi version, but still, I would prefer using FireDAC if you can. There you would do the same for Unicode strings (your code would change by replacing TClientDataSet by TFDMemTable and adding FireDAC units).

Victoria
  • 7,822
  • 2
  • 21
  • 44
  • 2
    Bravo, another +1. Just one tiny thing though, would you consider changing your var name `Field` to `FieldDef`? Having a `Field` which is actually a `TFieldDef` grates a bit. Or, you could make `Field` an actual `TField` and then avoid the `FieldByName` call. – MartynA Jul 27 '17 at 21:19
  • @MartynA, thank you! You are right, I've made the variable names more self explanatory. Still, I'm afraid more about the access violation. – Victoria Jul 27 '17 at 21:26
  • Thank your for the answer! As I wrote in the comment above I'm still looking for a way to write a working filter with the same dataset definition for two reasons: at first, I won't loose any data because I'm not using full unicode data range, I'm happy with `CP-1252` (which accepts `Ö`). Second, filtering is only one functionality I use datasets for. We have serveral hundred datasets to store and display tabular data and I'm unable to determine the consequences of changing the datatype from `ftString` to `ftWideString`.just for filtering. – ventiseis Jul 27 '17 at 21:40
  • 1
    Aha, so it's not about Unicode. Then it must be something with the filter. It behaves weird. With ANSI field type, `SAMPLE_CHAR='ÄÖÄ'` filtering by `Ä%` and `%ÖÄ` does not work, whilst `%Ä`, `ÄÖ%`, `%Ä%` and `%Ö%` does. – Victoria Jul 27 '17 at 22:12
  • @Victoria You're totally right. Just tried `§` and `[X] LIKE '§%'`, doesn't work either. Think it's time for a bug report... – ventiseis Jul 27 '17 at 22:18
  • 1
    Interesting is that if you include `foCaseInsensitive` into `FilterOptions`, it works. It works fine with `ftWideString` fields, so I keep the answer as is. But if you don't mind case insensitivity, it might be a workaround. But it's a MIDAS bug. – Victoria Jul 27 '17 at 22:49
  • 1
    @Victoria That's a very nice solution! As a matter of fact we use `UPPER([X]) LIKE ' + QuotedStr( SAMPLE_CHAR + '%')`, which doesn't work either. With `foCaseInsensitive` we have a viable workaround. – ventiseis Jul 28 '17 at 19:53
  • Glad it helped! :) But, ehm, it's still something that should not happen, I'd say. – Victoria Jul 28 '17 at 19:55