CSV Filter Options

The CSV filter accepts an option string containing five to fifteen tokens, separated by commas. Tokens 6 to 15 are optional.

Examples:

Import from UTF-8, language German, comma-separated, text delimiter ", quoted field as text. The CSV file contains columns formatted as date, number, number, number:

soffice --infilter="Text - txt - csv (StarCalc):44,34,76,1,1/5/2/1/3/1/4/1,1031,true,true" test.csv

Export to Windows-1252, field separator: comma, text delimiter: double quote (ASCII 34), save cell contents as shown:

soffice --convert-to "csv:Text - txt - csv (StarCalc):44,34,ANSI,1,,0,false,true,true" --outdir=/home/user test.ods
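The two option strings above are just comma-joined tokens in positional order. The following is a minimal sketch of assembling them in Python; the helper name build_filter_options and the argument lists are illustrative assumptions, not part of LibreOffice, and nothing is executed here.

```python
def build_filter_options(*tokens):
    """Join tokens (in positional order) into a CSV filter option string.

    Hypothetical helper: every token is converted with str();
    pass "" to leave a position empty.
    """
    return ",".join(str(token) for token in tokens)


FILTER_NAME = "Text - txt - csv (StarCalc)"

# Import example: comma (44), double quote (34), UTF-8 (76), start at line 1,
# column formats, German (1031), quoted field as text, detect special numbers.
import_options = build_filter_options(
    44, 34, 76, 1, "1/5/2/1/3/1/4/1", 1031, "true", "true"
)
import_argv = ["soffice", f"--infilter={FILTER_NAME}:{import_options}", "test.csv"]

# Export example: token 5 is left empty and token 6 is 0 (UI language).
export_options = build_filter_options(
    44, 34, "ANSI", 1, "", 0, "false", "true", "true"
)
export_argv = [
    "soffice", "--convert-to", f"csv:{FILTER_NAME}:{export_options}",
    "--outdir=/home/user", "test.ods",
]
```

Passing the arguments as a list (for example to subprocess.run) avoids the shell quoting that the command lines above need around the filter name.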

Token Position

Definition

Meaning and Example of Token

1

Field Separator

Field separator(s) as ASCII values. Multiple values are separated by the slash sign ("/"); that is, if the values are separated by semicolons and horizontal tabulators, the token would be 59/9. To treat several consecutive separators as one, append "/MRG" to the token. If the file contains fixed-width fields, use "FIX". Example: 44 (,)

2

Text Delimiter

The text delimiter as an ASCII value, for example 34 for double quotes and 39 for single quotes. Example: 34 (").

3

Character Set

Code of the character set used in the file, as described in the table below. Example: 0 (system).

4

Number of First Line

CSV Import

N: line number to start reading. Example: 3 (start from third line).

5

Cell Format Codes for Each Column

CSV Import

A sequence of column/format code pairs, where the format code is given in the table below. Example: "1/5/2/1/3/1/4/1".

If value separators are used, the form of this token is column/format[/column/format/…] where column is the number of the column, with 1 being the leftmost column. The format code is detailed below.

If the first token is FIX it has the form start/format[/start/format/…], where start is the number of the first character for this field, with 0 being the leftmost character in a line. The format is explained below.

6

Language identifier

String expressed in decimal notation. This token is the equivalent of the "Language" listbox in the user interface for CSV import. If the value is 0 or omitted, the language identifier of the user interface is used. The language identifier is based on the Microsoft language identifiers.

7

Quoted field as text

String, either false or true. Default value: false. This token is the equivalent of the check box "Quoted field as text".

8

Detect special numbers

Import: String, either false or true. Default value: false. This token is the equivalent of the check box "Detect special numbers".

Export: String, either false or true. Default value: true. This token has no UI equivalent. If true, the number cells are stored as numbers. If false, the numbers are stored as text, with text delimiters.

9

Save cell contents as shown

CSV Export

String, either false or true. Default value: true. This token is the equivalent of the check box "Save cell contents as shown".

10

Export cell formulas

CSV Export

String, either false or true. Default value: false. If true, cell formulas are exported.

11

Remove spaces

CSV Import

String, either false or true. Default value: false. If true, leading and trailing spaces are trimmed when reading the file.

12

Export sheets

CSV Export

Export the entire document to individual .csv files, one per sheet, or export a specified sheet.

  • 0 or absent: means the default behaviour, first sheet from command line, or current sheet in macro filter options, exported to sample.csv

  • -1: for all sheets, each sheet is exported to an individual file of the base file name concatenated with the sheet name, for example sample-Sheet1.csv, sample-Sheet2.csv and sample-Sheet3.csv

  • N: export the N-th sheet within the range of number of sheets. Example: to export the second sheet, set 2 here to get sample-Sheet2.csv

13

Import as formulas

CSV Import

String, either false or true. Default value: false. Determines whether expressions starting with an equal sign (=) are evaluated as formulas or imported as textual data. If true, formulas are evaluated on input. If false, formulas are imported as text. If the token is omitted (not present at all), the value is true, which keeps the behaviour of older option strings that lacked this token. If the token is present but empty, or has any value other than true, the value is false.

14

Include a byte-order-mark (BOM)

CSV Export

String, either false or true. Default value: false. If true, a byte-order-mark (BOM) is included in the export. If false, the export does not include a BOM. If the token is omitted (not present at all), the value is false, which keeps the behaviour of older option strings that lacked this token. If the token is present but empty, or has any value other than true, the value is also false. On import, a BOM is detected automatically.

15

Detect numbers in scientific notation

CSV Import

String, either false or true. Default value: true. If true, cell content containing an 'E' or 'e' is checked for being a number in scientific notation. If false, no such detection is attempted. This token can be false only if token 8 (Detect special numbers) is false. If omitted, the value is true, which keeps the behaviour of older option strings that lacked this token.
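As an illustration of the token rules in the table above, here is a small Python sketch that encodes token 1 from separator characters and reads token 13 from an option string. The function names are hypothetical, and the code only mirrors the rules as stated in the table; it is not how LibreOffice itself parses the string.

```python
def separator_token(chars, merge=False):
    """Token 1: ASCII values joined by '/', with 'MRG' appended to merge
    consecutive separators. Use the literal 'FIX' for fixed-width files."""
    parts = [str(ord(c)) for c in chars]
    if merge:
        parts.append("MRG")
    return "/".join(parts)


def split_options(option_string):
    """Map 1-based token positions to the raw token strings."""
    return {pos: tok for pos, tok in enumerate(option_string.split(","), start=1)}


def import_as_formulas(tokens):
    """Token 13: omitted means true; if present, only the literal
    'true' counts as true (empty or anything else is false)."""
    if 13 not in tokens:
        return True
    return tokens[13] == "true"


tokens = split_options("44,34,76,1,1/5/2/1/3/1/4/1,1031,true,true")
language_id = tokens[6]          # "1031"
semicolon_or_tab = separator_token(";\t")               # "59/9"
merged_spaces = separator_token(" ", merge=True)        # "32/MRG"
```

Splitting on commas is safe here because separators are given as ASCII values and token 5 uses slashes, so no token contains a literal comma.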


Special case of CSV files with separator defined in the first line

CSV import and export support a sep= and "sep=" field separator setting. When reading a CSV document, the separator is taken from the initial sep= or "sep=" single field, if that is the only line content.

When reading a CSV file, the quoted form is preserved as (unquoted) cell content. You see sep=| when | is the separator in the first line. In the unquoted form, the separator is discarded because it is a real field separator in the context. You see sep= in the first line.

When writing a CSV file, the existing single top left cell's content such as sep=| is adapted to the current separator with the quoted form of "sep=|" (if quotes / text delimiters aren't set empty and | is the separator) and always uses the ASCII " double quote character.

If the line containing the sep=| is not to be imported as data, remember to set the From row number in the dialog to 2. Note that this line will not be preserved when re-saving.

Example:


        sep=|
        "LETTER"|"ANIMAL"
        "a"|"aardvark"
        "b"|"bear"
        "c"|"cow"
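
For readers processing such files outside LibreOffice, the sep= convention can be approximated as follows. This is a sketch under the assumption that the separator is a single character; detect_sep and read_with_sep_line are hypothetical names, and dropping the first line corresponds to setting the From row number to 2.

```python
import csv
import io


def detect_sep(first_line):
    """Return the separator from a sep=X or "sep=X" line, else None."""
    line = first_line.rstrip("\r\n")
    if len(line) >= 2 and line[0] == '"' and line[-1] == '"':
        line = line[1:-1]  # accept the quoted form as well
    if line.startswith("sep=") and len(line) == 5:
        return line[4]
    return None


def read_with_sep_line(text, default=","):
    """Parse CSV text, honouring and skipping an initial sep= line."""
    lines = text.splitlines(keepends=True)
    sep = detect_sep(lines[0]) if lines else None
    body = lines[1:] if sep else lines
    reader = csv.reader(io.StringIO("".join(body)), delimiter=sep or default)
    return list(reader)


SAMPLE = 'sep=|\n"LETTER"|"ANIMAL"\n"a"|"aardvark"\n"b"|"bear"\n"c"|"cow"\n'
rows = read_with_sep_line(SAMPLE)
```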
    

Formatting Codes for Token 5

Meaning

Code

Standard

1

Text

2

MM/DD/YY

3

DD/MM/YY

4

YY/MM/DD

5

-

6

-

7

-

8

Ignore field (do not import)

9

US-English

10
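
The codes above combine with column numbers (or, for FIX, start positions) to form token 5. A minimal Python sketch, assuming a hypothetical helper column_format_token and using the table's labels as dictionary keys ("Ignore" is a shortened key for "Ignore field (do not import)"):

```python
FORMAT_CODES = {
    "Standard": 1,
    "Text": 2,
    "MM/DD/YY": 3,
    "DD/MM/YY": 4,
    "YY/MM/DD": 5,
    "Ignore": 9,
    "US-English": 10,
}


def column_format_token(formats):
    """Build token 5 from a {column_or_start: format_name} mapping.

    Keys are 1-based column numbers when value separators are used,
    or 0-based start positions when token 1 is FIX.
    """
    parts = []
    for position in sorted(formats):
        parts.append(str(position))
        parts.append(str(FORMAT_CODES[formats[position]]))
    return "/".join(parts)


# Date column followed by three standard (number) columns.
separated = column_format_token(
    {1: "YY/MM/DD", 2: "Standard", 3: "Standard", 4: "Standard"}
)
# Fixed width: text field starting at character 0, standard field at 8.
fixed = column_format_token({0: "Text", 8: "Standard"})
```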


Character Set Codes for Token 3

Character set

Index

Unknown

0

Windows-1252/WinLatin 1 (Western)

1

Apple Macintosh (Western)

2

DOS/OS2-437/US (Western)

3

DOS/OS2-850/International (Western)

4

DOS/OS2-860/Portuguese (Western)

5

DOS/OS2-861/Icelandic (Western)

6

DOS/OS2-863/Canadian-French (Western)

7

DOS/OS2-865/Nordic (Western)

8

System default

9

Symbol

10

ASCII/US (Western)

11

ISO-8859-1 (Western)

12

ISO-8859-2 (Central European)

13

ISO-8859-3 (Latin 3)

14

ISO-8859-4 (Baltic)

15

ISO-8859-5 (Cyrillic)

16

ISO-8859-6 (Arabic)

17

ISO-8859-7 (Greek)

18

ISO-8859-8 (Hebrew)

19

ISO-8859-9 (Turkish)

20

ISO-8859-14 (Western)

21

ISO-8859-15/EURO (Western)

22

DOS/OS2-737 (Greek)

23

DOS/OS2-775 (Baltic)

24

DOS/OS2-852 (Central European)

25

DOS/OS2-855 (Cyrillic)

26

DOS/OS2-857 (Turkish)

27

DOS/OS2-862 (Hebrew)

28

DOS/OS2-864 (Arabic)

29

DOS/OS2-866/Russian (Cyrillic)

30

DOS/OS2-869/Modern (Greek)

31

DOS/Windows-874 (Thai)

32

Windows-1250/WinLatin 2 (Central European)

33

Windows-1251 (Cyrillic)

34

Windows-1253 (Greek)

35

Windows-1254 (Turkish)

36

Windows-1255 (Hebrew)

37

Windows-1256 (Arabic)

38

Windows-1257 (Baltic)

39

Windows-1258 (Vietnamese)

40

Apple Macintosh (Arabic)

41

Apple Macintosh (Central European)

42

Apple Macintosh/Croatian (Central European)

43

Apple Macintosh (Cyrillic)

44

Not supported: Apple Macintosh (Devanagari)

45

Not supported: Apple Macintosh (Farsi)

46

Apple Macintosh (Greek)

47

Not supported: Apple Macintosh (Gujarati)

48

Not supported: Apple Macintosh (Gurmukhi)

49

Apple Macintosh (Hebrew)

50

Apple Macintosh/Icelandic (Western)

51

Apple Macintosh/Romanian (Central European)

52

Apple Macintosh (Thai)

53

Apple Macintosh (Turkish)

54

Apple Macintosh/Ukrainian (Cyrillic)

55

Apple Macintosh (Chinese Simplified)

56

Apple Macintosh (Chinese Traditional)

57

Apple Macintosh (Japanese)

58

Apple Macintosh (Korean)

59

Windows-932 (Japanese)

60

Windows-936 (Chinese Simplified)

61

Windows-Wansung-949 (Korean)

62

Windows-950 (Chinese Traditional)

63

Shift-JIS (Japanese)

64

GB-2312 (Chinese Simplified)

65

GBT-12345 (Chinese Traditional)

66

GBK/GB-2312-80 (Chinese Simplified)

67

BIG5 (Chinese Traditional)

68

EUC-JP (Japanese)

69

EUC-CN (Chinese Simplified)

70

EUC-TW (Chinese Traditional)

71

ISO-2022-JP (Japanese)

72

ISO-2022-CN (Chinese Simplified)

73

KOI8-R (Cyrillic)

74

Unicode (UTF-7)

75

Unicode (UTF-8)

76

ISO-8859-10 (Central European)

77

ISO-8859-13 (Central European)

78

EUC-KR (Korean)

79

ISO-2022-KR (Korean)

80

JIS 0201 (Japanese)

81

JIS 0208 (Japanese)

82

JIS 0212 (Japanese)

83

Windows-Johab-1361 (Korean)

84

GB-18030 (Chinese Simplified)

85

BIG5-HKSCS (Chinese Traditional)

86

TIS 620 (Thai)

87

KOI8-U (Cyrillic)

88

ISCII Devanagari (Indian)

89

Unicode (Java's modified UTF-8)

90

Adobe Standard

91

Adobe Symbol

92

PT 154 (Windows Cyrillic Asian codepage developed in ParaType)

93

Unicode UCS4

65534

Unicode UCS2

65535
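
When option strings are generated by a script, a lookup for the character sets actually needed keeps the numeric codes out of the calling code. The sketch below copies a few indexes from the table above; the shortened key names and the helper charset_token are assumptions for illustration.

```python
# Subset of the table above; keys are shortened names chosen for this sketch.
CHARSET_CODES = {
    "Windows-1252": 1,
    "System default": 9,
    "ISO-8859-1": 12,
    "ISO-8859-2": 13,
    "ISO-8859-15": 22,
    "Windows-1250": 33,
    "Windows-1251": 34,
    "KOI8-R": 74,
    "UTF-8": 76,
}


def charset_token(name):
    """Return token 3 as a string for a character set listed in the subset."""
    try:
        return str(CHARSET_CODES[name])
    except KeyError:
        raise ValueError(f"no code listed for character set {name!r}") from None
```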