
Character List



hcrisp
02-23-2009, 10:25 AM
It took me a while to figure out how to initialize an array of all character values. Why do these other methods not work?

This code results in a single, empty string value. Why?


WAVE> x = string(bindgen(256))
WAVE> info, x
X STRING = ''


This code causes an index out of range error. Why?


WAVE> x = strarr(256)
WAVE> i = 0b
WAVE> for j=1L, 255 do i=i+1b & x(j) = string(i)
%%%Attempt to subscript X with J is out of range.
%%%Execution halted at $MAIN$ .


This is the only code I could find that works:


WAVE> x = strarr(256)
WAVE> for i=0L, 255 do x(i) = string(byte(i))
WAVE> info, x
X STRING = Array(256)

hcrisp
02-23-2009, 12:51 PM
Solved my own problem. It appears that the 0B value passed to string() is causing the problem in the first example, but I don't know why.

This works:


WAVE> x = string(bindgen(255)+1b)
WAVE> print, x
☺☻♥♦♣
♫☼►◄↕‼??▬↨↑↓→←∟↔▲▼ !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]
^_`abcdefghijklmnopqrstuvwxyz{|}~⌂???????????????? ?????????????₧?????????⌐????
??░▒▓│┤╡╢╖╕╣║╗╝╜╛┐└┴┬├─┼╞╟╚╔╩╦╠═╬╧╨╤╥╙╘╒╓╫╪┘┌█▄▌▐▀ α?ΓπΣσ?τΦΘΩδ∞φε∩≡?≥≤⌠⌡?≈?∙?√ⁿ?



The second example doesn't work because the "&" ends the for statement, so only "i=i+1b" is the loop body. "x(j) = string(i)" gets executed once, after the for loop, which has left the j variable at value 256. Since the valid indices of x are 0 through 255, you get the error.
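The same pitfall can be sketched in C, used here only as an analogy and not as PV-WAVE code: when only the first statement belongs to the loop, anything after it runs once, with the index already one past the upper limit. The function name index_after_loop is made up for this illustration.

```c
#include <assert.h>

/* Hypothetical helper: runs a loop shaped like the failing one-liner,
   where only the counter update is the loop body, and returns the
   value the index is left at when the loop finishes. */
long index_after_loop(long first, long last)
{
    assert(first <= last);
    long j;
    unsigned char i = 0;
    for (j = first; j <= last; j++)
        i = (unsigned char)(i + 1);   /* the entire loop body */
    (void)i;                          /* counter is not needed afterwards */
    /* A statement placed here runs once, with j already past `last`. */
    return j;
}
```

Calling index_after_loop(1, 255) returns 256, which is one past the last valid index of a 256-element array.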

rwagner
02-23-2009, 01:41 PM
This is an interesting carry-over from the C programming language. If you look at an ASCII table, the entry for slot 0 is the null character. In C, a string is a character array terminated by the null character, so if your array starts with a null followed by valid characters, you still end up with an empty string.

x = string(bindgen(255)+1b) is the right way to do this.
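A minimal C sketch of the null-termination point, assuming the kernel treats the byte array like an ordinary C string (that part is a guess, not something confirmed in this thread). The helper names fill_bytes and c_string_length are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fill buf with the byte values first, first+1, ...,
   first+count-1 and append a terminating NUL.
   buf must hold at least count+1 bytes. */
void fill_bytes(unsigned char *buf, size_t count, unsigned char first)
{
    assert(buf != NULL);
    for (size_t k = 0; k < count; k++) {
        buf[k] = (unsigned char)(first + k);
    }
    buf[count] = '\0';
}

/* Length of the buffer when read as a C string:
   strlen stops at the first zero byte it meets. */
size_t c_string_length(const unsigned char *buf)
{
    assert(buf != NULL);
    return strlen((const char *)buf);
}
```

A buffer filled with the values 0 through 255 has a C string length of 0, because the very first byte is the terminator. A buffer filled with 1 through 255 has a length of 255, which matches the bindgen(255)+1b workaround.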

hcrisp
02-23-2009, 02:32 PM
So why does string(bindgen()) concatenate the values instead of keeping them as an array like all the other _indgens? Seems inconsistent to me.



WAVE> info, n_elements(string(bindgen(255)+1b))
<Expression> LONG = 1
WAVE> info, n_elements(string(indgen(255)+1))
<Expression> LONG = 255
WAVE> info, n_elements(string(lindgen(255)+1L))
<Expression> LONG = 255


To initialize an array of all characters, I would probably recommend this code:


WAVE> x = strarr(256)
WAVE> for i=0L, 255 do x(i) = string(byte(i))
WAVE> info, x
X STRING = Array(256)

brian
02-23-2009, 02:42 PM
Of course, if you need that null string in the first element you can use:



x=sindgen(256)
for i=0L, 255 do x(i) = string(byte(i))
print, '|' + x(0) + '|'
;||
print, '|' + x(55) + '|'
;|7|


Regards,

brian

brian
02-23-2009, 02:58 PM
Based on rwagner's comments I would guess the byte array is basically a C char array in the kernel. Since C treats a null-terminated char array as a "string", the kernel may not be evaluating each byte in succession but rather the null-terminated char array as a whole. I suspect this behavior is relied on in a significant number of legacy PV-WAVE applications and could be problematic to change. Though, as you point out, the inconsistency among like functions is not intuitive.

Regards,

brian

totallyunimodular
02-23-2009, 11:39 PM
hcrisp,

The code you recommended for an array of characters is, I think, the easiest way to get what you want. I was also thinking that you could shorten it a bit by making the FOR loop iterator a byte instead of a long, e.g.



WAVE> x = strarr(256)
WAVE> for i=0B, 255 do x(i) = string(i)


but this leads to an infinite loop: once the iterator reaches 255, the increment that should take it to 256 rolls i over to 0 instead, since it's typed as a byte, so the end condition is never met. So, from a style perspective, your code's intent is clear, readable, and results in a finite loop :D
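The rollover can be sketched in C with an 8-bit unsigned counter, again only as an analogy to the PV-WAVE byte type. The function name steps_until_past and its max_steps cutoff are made up for this illustration; the cutoff is there so the demonstration itself always terminates.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: count how many increments a byte-typed counter
   needs, starting from 0, before it exceeds `limit`.
   Gives up after max_steps iterations and returns -1 in that case. */
long steps_until_past(uint8_t limit, long max_steps)
{
    assert(max_steps >= 0);
    uint8_t i = 0;
    for (long steps = 0; steps < max_steps; steps++) {
        if (i > limit) {
            return steps;
        }
        i++;  /* at 255 this wraps around to 0 */
    }
    return -1;
}
```

With a limit of 254 the counter gets past the limit after 255 increments. With a limit of 255 it never does, so the function returns -1 no matter how large max_steps is.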

Note that the behavior of STRING in question is documented (e.g., Example 1 in the STRING documentation), but I agree that the behavior is not consistent with the other casting functions like INT or FLOAT. That being said, I think it is beneficial behavior. Consider the task of passing around text as arrays of bytes. Behavior such as



WAVE> myStr = 'Hello World!'
WAVE> str_as_bytes = BYTE(myStr)
WAVE> print, str_as_bytes, string(str_as_bytes)
72 101 108 108 111 32 87 111 114 108 100 33
Hello World!


is intuitive and easy.